Cookbook Recipes (runnable)

Every recipe under cookbook/recipes/ ships as a runnable example.py next to a Markdown README and is wired into CI via tests/cookbook_smoke.py, so a broken recipe blocks the merge (the two slow recipes, fine_tune and flight_sql, run in the nightly job instead of per PR). The cookbook is the OSS source of truth; this page mirrors each README below.

The recipes shipped at MVP:

| Recipe | Demonstrates |
| --- | --- |
| mutable_tables | Create/insert/select/drop on a mutable companion table |
| trigger_streams | Publish + subscribe on a topic via the in-process broker |
| eval_embeddings | recall@k, MRR, nDCG against a golden set |
| eval_inference | Accuracy + macro F1 against gold labels |
| fine_tune | LoRA fine-tune end-to-end |
| flight_sql | Query a remote jammi serve over Arrow Flight SQL |

Mutable tables

End-to-end create / insert / select / drop on a Jammi mutable table — the OSS primitive for state that needs to live alongside read-only result tables.

When to use this pattern. You need a writable table that sits in the same SQL catalog as your registered sources and embedding tables — for caching enriched rows, holding cursor state, recording user feedback, or any “small table I want to UPDATE / DELETE / INSERT from SQL” workload — without standing up an external Postgres.

What example.py does

  1. Connects to a temporary artifact dir
  2. Creates a notes mutable table with an int64 primary key + utf8 body column
  3. Inserts three rows through DataFusion DML (INSERT INTO ...)
  4. Verifies count and ordering via SELECT
  5. Drops the table, then asserts a SELECT after the drop raises
  6. Demonstrates the idempotent drop_mutable_table(..., if_exists=True)

API surface exercised

  • Database.create_mutable_table(name, *, schema, primary_key, ...)
  • Database.sql("INSERT INTO mutable.public.<name> ...")
  • Database.sql("SELECT ... FROM mutable.public.<name>")
  • Database.drop_mutable_table(name, *, if_exists=False)

The DataFusion namespace for mutable tables is always mutable.public.<name> — distinct from registered sources, which live under <source>.public.<source>.
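
The lifecycle in the steps above can be sketched without a Jammi install. The block below is a minimal stand-in that uses Python's built-in sqlite3 module instead of a Jammi Database, so the table is named notes rather than mutable.public.notes and the column types are SQLite's rather than int64/utf8. It only illustrates the create, insert, select, drop, and idempotent-drop sequence; it is not the recipe's example.py.

```python
import sqlite3

# Stand-in only: sqlite3 plays the role of the Jammi session so this runs anywhere.
conn = sqlite3.connect(":memory:")

# Step 2: a table with an integer primary key and a text body column.
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")

# Step 3: insert three rows through plain SQL DML.
conn.executemany(
    "INSERT INTO notes (id, body) VALUES (?, ?)",
    [(1, "first"), (2, "second"), (3, "third")],
)

# Step 4: verify count and ordering.
rows = conn.execute("SELECT id, body FROM notes ORDER BY id").fetchall()
assert len(rows) == 3
assert [row[0] for row in rows] == [1, 2, 3]

# Step 5: drop the table, then confirm that a SELECT now raises.
conn.execute("DROP TABLE notes")
try:
    conn.execute("SELECT * FROM notes")
    select_after_drop_raised = False
except sqlite3.OperationalError:
    select_after_drop_raised = True

# Step 6: the idempotent form succeeds even though the table is already gone.
conn.execute("DROP TABLE IF EXISTS notes")
```

In the real recipe the same sequence goes through create_mutable_table, Database.sql, and drop_mutable_table(..., if_exists=True) as listed under "API surface exercised".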

Run it

python cookbook/recipes/mutable_tables/example.py

Exits 0 on success, prints mutable_tables: OK on the last line.


Trigger streams

End-to-end publish + subscribe on a Jammi topic, plus the registration and listing surface. Uses the embedded in-process broker — no NATS or external broker needed.

When to use this pattern. You need a low-friction event bus inside your application — for fan-out to downstream consumers, fan-in from batch jobs, or replay-from-offset semantics — without bringing up Kafka or NATS in dev/test. The same surface scales out to NATS JetStream by flipping a config flag at deploy time.

What example.py does

  1. Connects to a temporary artifact dir
  2. Registers a topic events.demo with a typed schema and broker metadata
  3. Confirms list_topics() returns the new topic
  4. Publishes a 3-row batch through publish_topic — captures the broker-assigned offset
  5. Subscribes with from_offset=0 and reads the same rows back
  6. Drops the topic, confirms it’s gone from list_topics()
  7. Demonstrates idempotent drop_topic(..., if_exists=True) and strict-mode failure when dropping a missing topic

API surface exercised

  • Database.register_topic(name, *, schema, broker_metadata=None)
  • Database.list_topics()
  • Database.publish_topic(name, *, batch) — returns the assigned offset
  • Database.subscribe_collect(name, *, from_offset, max_batches)
  • Database.drop_topic(name, *, if_exists=False)

The subscribe_collect path drives the replay-from-backing-table flow when from_offset=0; the live-tail flow is exercised in the broker integration suite.
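
To make the offset and replay semantics concrete, here is a toy append-only topic log. Everything in it (the InProcessBroker class, its method names, and the choice to use the batch index as the offset) is a hypothetical illustration, not Jammi's broker or its API; the actual offset scheme may differ.

```python
class InProcessBroker:
    """Toy append-only topic log; a hypothetical illustration, not Jammi's broker."""

    def __init__(self):
        self._logs = {}  # topic name -> list of published batches

    def register_topic(self, name):
        if name in self._logs:
            raise ValueError(f"topic already registered: {name}")
        self._logs[name] = []

    def list_topics(self):
        return sorted(self._logs)

    def publish(self, name, batch):
        # Assumption: the offset is the index of the batch in the topic log.
        log = self._logs[name]
        log.append(list(batch))
        return len(log) - 1

    def subscribe_collect(self, name, from_offset=0, max_batches=None):
        # Replay stored batches starting at from_offset, up to max_batches of them.
        log = self._logs[name]
        end = len(log) if max_batches is None else from_offset + max_batches
        return log[from_offset:end]

    def drop_topic(self, name, if_exists=False):
        if name not in self._logs:
            if if_exists:
                return  # idempotent mode: a missing topic is not an error
            raise KeyError(f"no such topic: {name}")
        del self._logs[name]


broker = InProcessBroker()
broker.register_topic("events.demo")
rows = [{"id": 1}, {"id": 2}, {"id": 3}]
offset = broker.publish("events.demo", rows)
replayed = broker.subscribe_collect("events.demo", from_offset=0, max_batches=1)
broker.drop_topic("events.demo")
broker.drop_topic("events.demo", if_exists=True)  # no-op on a missing topic
```

A strict drop_topic("events.demo") at this point would raise, which mirrors the strict-mode failure the recipe checks in its last step.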

Run it

python cookbook/recipes/trigger_streams/example.py

Exits 0 on success, prints trigger_streams: OK on the last line.


Evaluate retrieval quality

Measure recall@k, precision@k, MRR, and nDCG of an embedding index against a golden relevance set.

When to use this pattern. You have a corpus and a small set of (query, expected document) judgments, and you need a number that tells you “is my new encoder better than the one I shipped last month?” The same loop powers nightly regression dashboards and A/B model comparison.

What example.py does

  1. Connects to a temporary artifact dir
  2. Registers the tiny corpus as a Parquet source
  3. Builds 32-dim embeddings over the content column with the local tiny_bert fixture
  4. Reads cookbook/fixtures/tiny_golden.json, expands it into the (query_id, query_text, relevant_id) CSV shape eval_embeddings consumes, and registers it as a golden source
  5. Calls db.eval_embeddings(source="corpus", golden_source="golden.public.golden", k=5)
  6. Asserts each aggregate metric is in [0.0, 1.0] and the per-query records carry their golden-set query_id

API surface exercised

  • Database.generate_text_embeddings(source, *, model, columns, key)
  • Database.eval_embeddings(*, source, golden_source, model=None, k=10)

The returned dict carries aggregate (mean across queries — recall_at_k, precision_at_k, mrr, ndcg) and per_query (one entry per query with query_id and a metrics sub-dict of the same four names, un-averaged).
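
For readers who want to sanity-check the numbers, the four metrics can be computed by hand. The sketch below uses the common textbook binary-relevance definitions and mirrors the aggregate / per_query layout described above; the function names are hypothetical, and Jammi's exact conventions (for example, whether MRR is cut off at k) are an assumption here rather than something this page states.

```python
import math


def retrieval_metrics(ranked_ids, relevant_ids, k):
    """Binary-relevance metrics for one query, using textbook definitions."""
    top_k = ranked_ids[:k]
    relevant = set(relevant_ids)
    hits = [1 if doc_id in relevant else 0 for doc_id in top_k]

    recall = sum(hits) / len(relevant) if relevant else 0.0
    precision = sum(hits) / k

    # Reciprocal rank of the first relevant hit inside the top k (0 if none).
    mrr = next((1.0 / rank for rank, hit in enumerate(hits, start=1) if hit), 0.0)

    # DCG discounts each hit by log2(rank + 1); IDCG is the best achievable DCG.
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    ndcg = dcg / idcg if idcg else 0.0

    return {"recall_at_k": recall, "precision_at_k": precision, "mrr": mrr, "ndcg": ndcg}


def aggregate(per_query):
    """Mean of each metric across queries."""
    names = ["recall_at_k", "precision_at_k", "mrr", "ndcg"]
    return {n: sum(q["metrics"][n] for q in per_query) / len(per_query) for n in names}


# Made-up rankings: q1 finds its two relevant docs at ranks 2 and 4, q2 at rank 1.
per_query = [
    {"query_id": "q1",
     "metrics": retrieval_metrics(["d3", "d1", "d7", "d2", "d9"], ["d1", "d2"], k=5)},
    {"query_id": "q2",
     "metrics": retrieval_metrics(["d4", "d5", "d6", "d8", "d0"], ["d4"], k=5)},
]
result = {"aggregate": aggregate(per_query), "per_query": per_query}
```

For q1 this gives recall 1.0, precision 0.4, MRR 0.5, and nDCG of about 0.651, all inside the [0.0, 1.0] range the recipe asserts.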

Golden source shape

eval_embeddings requires a registered source with these columns:

| column | type | example |
| --- | --- | --- |
| query_id | utf8 | q1 |
| query_text | utf8 | quantum computing applications |
| relevant_id | utf8 | 1 (matches corpus.id as a string) |

Image queries are supported via a query_image BLOB column instead of query_text; cross-modal eval is out of scope for this recipe.
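
As a rough illustration of the expansion step, the sketch below turns a nested judgment file into the three-column CSV shape. The JSON layout shown (a list of objects with a relevant_ids array) is a guess made for this example; the real cookbook/fixtures/tiny_golden.json may be structured differently, and expand_golden is a hypothetical helper name.

```python
import csv
import io
import json

# Hypothetical input shape; the real tiny_golden.json layout may differ.
golden_json = json.loads("""
[
  {"query_id": "q1", "query_text": "quantum computing applications", "relevant_ids": [1, 4]},
  {"query_id": "q2", "query_text": "protein folding", "relevant_ids": [7]}
]
""")


def expand_golden(entries):
    """Yield one (query_id, query_text, relevant_id) row per judgment."""
    for entry in entries:
        for relevant_id in entry["relevant_ids"]:
            # relevant_id is utf8 in the golden source, so cast ids to strings.
            yield entry["query_id"], entry["query_text"], str(relevant_id)


buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["query_id", "query_text", "relevant_id"])
writer.writerows(expand_golden(golden_json))
golden_csv = buffer.getvalue()
```

A query with several relevant documents becomes several rows that repeat query_id and query_text, which is what lets a flat CSV carry many-to-one judgments.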

Run it

python cookbook/recipes/eval_embeddings/example.py

Exits 0 on success, prints the metrics dict + eval_embeddings: OK.


Evaluate inference (classification)

Run a classifier over a registered source and score its predictions against gold labels.

When to use this pattern. You have a labelled holdout set and you want a single number — accuracy, macro F1, per-class F1 — to compare two classifiers, or to track drift over time on the same classifier.

What example.py does

  1. Connects to a temporary artifact dir
  2. Registers the tiny corpus as corpus (parquet)
  3. Registers tiny_labels.csv as golden (csv) — (id, label) rows
  4. Runs db.eval_inference with the local tiny_modernbert_classifier fixture against the content column
  5. Prints the returned aggregate accuracy, macro f1, per-class metrics, and the count of per-record predictions
  6. Asserts every reported rate is in [0.0, 1.0]

API surface exercised

  • Database.eval_inference(*, model, source, columns, task, golden_source, label_column)

The returned dict carries aggregate (tagged by "task" — currently "classification") with accuracy, f1, and per_class, plus per_record (one entry per aligned {record_id, predicted, gold}).
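
The aggregate numbers can be reproduced from the per_record entries with a few lines of plain Python. This is a sketch using the standard definitions of accuracy and macro F1; the function name, the contents of each per_class entry, and the choice to average over the union of gold and predicted labels are assumptions, not a description of Jammi's internals.

```python
def classification_metrics(per_record):
    """Accuracy, macro F1, and per-class precision/recall/F1 from aligned records."""
    labels = sorted({r["gold"] for r in per_record} | {r["predicted"] for r in per_record})
    per_class = {}
    for label in labels:
        tp = sum(r["predicted"] == label and r["gold"] == label for r in per_record)
        fp = sum(r["predicted"] == label and r["gold"] != label for r in per_record)
        fn = sum(r["predicted"] != label and r["gold"] == label for r in per_record)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = sum(r["predicted"] == r["gold"] for r in per_record) / len(per_record)
    # Macro F1: unweighted mean of the per-class F1 scores.
    macro_f1 = sum(c["f1"] for c in per_class.values()) / len(per_class)
    return {"task": "classification", "accuracy": accuracy, "f1": macro_f1,
            "per_class": per_class}


# Made-up records in the {record_id, predicted, gold} shape described above.
records = [
    {"record_id": "1", "predicted": "physics", "gold": "physics"},
    {"record_id": "2", "predicted": "biology", "gold": "physics"},
    {"record_id": "3", "predicted": "biology", "gold": "biology"},
    {"record_id": "4", "predicted": "biology", "gold": "biology"},
    {"record_id": "5", "predicted": "math", "gold": "math"},
]
metrics = classification_metrics(records)
```

Here accuracy is 0.8 while macro F1 is about 0.822, because macro averaging weights each class equally regardless of how many records it has.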

The task argument is the string form of the inference task — "classification" here. NER is recognized but not yet supported via this entrypoint (see the runner’s EvalTask::Ner branch); for token-level eval, call the jammi-numerics NER kernels directly.

Golden source shape

eval_inference requires a registered source with these columns:

| column | type | example |
| --- | --- | --- |
| id | utf8 | "1" |
| <label_column> | utf8 | physics |

label_column is the keyword argument you pass at call time (label in this recipe). Every id in the golden source must resolve to a row in the input source; input rows that have no gold label are silently dropped from the metric.
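
The alignment rule can be pictured as a join on id. The helper below is a hypothetical sketch: the name align is invented, and raising KeyError for a golden id with no input row is an assumption about how "must resolve" is enforced, since this page does not say what happens in that case.

```python
def align(predictions, gold_labels):
    """Join predictions to gold labels by id; unlabeled input rows are dropped."""
    missing = [rid for rid in gold_labels if rid not in predictions]
    if missing:
        # Assumption: an unresolved golden id is treated as an error.
        raise KeyError(f"golden ids with no input row: {missing}")
    return [
        {"record_id": rid, "predicted": predicted, "gold": gold_labels[rid]}
        for rid, predicted in predictions.items()
        if rid in gold_labels  # rows without a gold label do not reach the metric
    ]


predictions = {"1": "physics", "2": "biology", "3": "math"}
gold_labels = {"1": "physics", "2": "physics"}
aligned = align(predictions, gold_labels)
```

Row "3" has a prediction but no gold label, so it is absent from aligned and does not count toward accuracy or F1.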

Run it

python cookbook/recipes/eval_inference/example.py

Exits 0 on success, prints the metrics dict + eval_inference: OK.


Fine-tune an encoder

Run a LoRA fine-tune on top of an existing text encoder, poll the job to completion, and use the resulting checkpoint to encode a query.

When to use this pattern. Your domain (legal contracts, medical abstracts, patent claims, internal product docs) doesn’t match the distribution the base encoder was trained on, and you have a few hundred to a few thousand labelled or contrastive pairs. LoRA gets you ~80% of the lift of a full fine-tune at a fraction of the cost; the resulting adapter is small enough to ship as an attachment to the base model rather than a re-distributed full checkpoint.

What example.py does

  1. Connects to a temporary artifact dir
  2. Registers tiny_pairs.csv (30 contrastive pairs) as training
  3. Calls db.fine_tune(...) with the local tiny_bert base, a small LoRA rank, and one epoch — kept fast for CI
  4. Waits for terminal status via job.wait()
  5. Asserts the resulting model_id starts with jammi:fine-tuned:
  6. Encodes a query through the fine-tuned model to confirm it loads

API surface exercised

  • Database.fine_tune(*, source, base_model, columns, method, task=..., ...)
  • FineTuneJob.wait()
  • FineTuneJob.job_id, FineTuneJob.model_id
  • Database.encode_text_query(model_id, text)

The full keyword list on fine_tune covers LoRA rank/alpha/dropout, learning rate, epochs, batch size, max sequence length, validation fraction, early-stopping patience/metric, warmup, gradient accumulation, backbone dtype, weight decay, and gradient clipping — the recipe uses the defaults for everything except rank and epochs.
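
To see what rank and alpha control, it helps to write out the LoRA update itself: the frozen base weight W is used as W' = W + (alpha / rank) * B @ A, where A is rank x d_in and B is d_out x rank. The sketch below is a generic illustration of that formula in plain Python; the helper names and the 768-wide layer are illustrative assumptions and say nothing about how Jammi implements fine_tune or which layers it adapts.

```python
def lora_param_counts(d_in, d_out, rank):
    """Trainable parameters: full weight matrix vs. the two LoRA factors."""
    full = d_in * d_out
    lora = rank * (d_in + d_out)  # A is rank x d_in, B is d_out x rank
    return full, lora


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha, rank):
    """W' = W + (alpha / rank) * B @ A, with the base weight W left frozen."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Illustrative sizes only: one 768 x 768 projection adapted at rank 8.
full, lora = lora_param_counts(d_in=768, d_out=768, rank=8)

# Tiny worked example: 2 x 2 base weight, rank-1 adapter.
w = [[1.0, 0.0], [0.0, 1.0]]
a = [[0.5, -0.5]]   # rank x d_in
b = [[2.0], [0.0]]  # d_out x rank
w_eff = lora_effective_weight(w, a, b, alpha=2.0, rank=1)
```

For the illustrative layer, the adapter trains 12,288 parameters instead of 589,824, which is why the resulting artifact is small enough to ship alongside the base model.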

Performance note

This recipe is excluded from the per-PR smoke matrix because even at one epoch it runs ~30 seconds on CPU. The nightly cron with JAMMI_COOKBOOK_SLOW=1 includes it. Override the gate locally:

JAMMI_COOKBOOK_SLOW=1 python tests/cookbook_smoke.py
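
One plausible shape for such a gate is a filter on the recipe list keyed off the environment variable. The sketch below is hypothetical: recipes_to_run and the two constants are invented for illustration and are not taken from tests/cookbook_smoke.py. The slow set lists the two recipes this page describes as gated out of the per-PR matrix.

```python
import os

ALL_RECIPES = ["mutable_tables", "trigger_streams", "eval_embeddings",
               "eval_inference", "fine_tune", "flight_sql"]
# The two recipes this page describes as excluded from the per-PR matrix.
SLOW_RECIPES = {"fine_tune", "flight_sql"}


def recipes_to_run(env=None):
    """Return recipe names, skipping slow ones unless JAMMI_COOKBOOK_SLOW=1."""
    env = os.environ if env is None else env
    include_slow = env.get("JAMMI_COOKBOOK_SLOW") == "1"
    return [name for name in ALL_RECIPES if include_slow or name not in SLOW_RECIPES]


per_pr = recipes_to_run(env={})
nightly = recipes_to_run(env={"JAMMI_COOKBOOK_SLOW": "1"})
```

Passing env explicitly keeps the filter testable without mutating the real process environment.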

Run it

python cookbook/recipes/fine_tune/example.py

Exits 0 on success, prints job_id, model_id, and fine_tune: OK.


Connect via Flight SQL

Run a query against a remote jammi server over Arrow Flight SQL.

When to use this pattern. You’re connecting from a non-Python client (Tableau, dbt, JDBC tools, Rust binaries), or you want to expose Jammi to multiple readers without each one holding an embedded session. The same protocol is what dbt-flightsql, the official Flight SQL JDBC driver, and BI tools speak natively.

What example.py does

  1. Spawns target/release/jammi serve as a child process pointed at a temp artifact_dir
  2. Polls the health endpoint (http://127.0.0.1:8080/health) until the server is ready (5 s budget)
  3. Opens a pyarrow.flight.FlightClient against grpc://127.0.0.1:8081
  4. Submits SELECT 1 AS one over Flight SQL and confirms the response
  5. Tears down the server process cleanly
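
The readiness wait in step 2 is a poll-with-deadline loop. Below is a minimal sketch of that pattern using only the standard library; http_ok and wait_until are hypothetical helper names, and the 0.1 s poll interval and 1 s per-request timeout are assumptions (the page only states the 5 s overall budget).

```python
import http.client
import time
import urllib.error
import urllib.request


def http_ok(url, timeout_s=1.0):
    """Return True if a GET on url answers with HTTP 200, False on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False  # server not accepting connections yet, or not healthy


def wait_until(probe, budget_s=5.0, interval_s=0.1):
    """Call probe() until it returns True or the time budget is spent."""
    deadline = time.monotonic() + budget_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A caller would combine the two as wait_until(lambda: http_ok("http://127.0.0.1:8080/health")) and tear the child process down if it returns False. Using time.monotonic() rather than time.time() keeps the budget immune to wall-clock adjustments.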

This recipe is gated out of the per-PR CI matrix — it depends on the jammi binary being built (cargo build --release -p jammi-cli), and the build cost dominates the test wall-clock. The nightly cookbook job builds the binary and runs the recipe behind JAMMI_COOKBOOK_SLOW=1.

Prerequisites

  • cargo build --release -p jammi-cli — produces target/release/jammi
  • pip install pyarrow (already a jammi-ai dependency)

The script auto-detects JAMMI_BIN (env var) or falls back to the workspace’s target/release/jammi.
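
The lookup order described above amounts to "environment variable first, workspace build second". A minimal sketch, assuming a hypothetical helper named resolve_jammi_bin (the recipe's actual function name and any existence checks it performs are not shown on this page):

```python
import os
from pathlib import Path


def resolve_jammi_bin(workspace_root, env=None):
    """Prefer JAMMI_BIN when set, else fall back to target/release/jammi."""
    env = os.environ if env is None else env
    override = env.get("JAMMI_BIN")
    if override:
        return Path(override)
    return Path(workspace_root) / "target" / "release" / "jammi"


default_bin = resolve_jammi_bin("/repo", env={})
custom_bin = resolve_jammi_bin("/repo", env={"JAMMI_BIN": "/opt/jammi/bin/jammi"})
```

The /repo and /opt/jammi/bin/jammi paths are placeholders for the example only.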

API surface exercised

  • pyarrow.flight.FlightClient over the Flight SQL command dialect (get_flight_info on a command descriptor, then do_get on the returned ticket)
  • jammi serve — the OSS deployment-shape binary entrypoint

Run it

cargo build --release -p jammi-cli      # one-time build
python cookbook/recipes/flight_sql/example.py

Exits 0 on success, prints the query result + flight_sql: OK.