Cookbook Recipes (runnable)
Every recipe under cookbook/
ships as a runnable example.py next to a markdown README and is wired
into CI via tests/cookbook_smoke.py — a broken recipe blocks the
merge. The cookbook is the OSS source of truth; this page mirrors each
README below.
The recipes shipped at MVP:
| Recipe | Demonstrates |
|---|---|
| `mutable_tables` | Create/insert/select/drop on a mutable companion table |
| `trigger_streams` | Publish + subscribe on a topic via the in-process broker |
| `eval_embeddings` | recall@k, MRR, nDCG against a golden set |
| `eval_inference` | Accuracy + macro F1 against gold labels |
| `fine_tune` | LoRA fine-tune end-to-end |
| `flight_sql` | Query a remote `jammi serve` over Arrow Flight SQL |
Mutable tables
End-to-end create / insert / select / drop on a Jammi mutable table — the OSS primitive for state that needs to live alongside read-only result tables.
When to use this pattern. You need a writable table that sits in the same SQL catalog as your registered sources and embedding tables — for caching enriched rows, holding cursor state, recording user feedback, or any “small table I want to UPDATE / DELETE / INSERT from SQL” workload — without standing up an external Postgres.
What example.py does
- Connects to a temporary artifact dir
- Creates a `notes` mutable table with an `int64` primary key + `utf8` body column
- Inserts three rows through DataFusion DML (`INSERT INTO ...`)
- Verifies count and ordering via `SELECT`
- Drops the table, then asserts a `SELECT` after the drop raises
- Demonstrates the idempotent `drop_mutable_table(..., if_exists=True)`
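As a rough, engine-agnostic sketch of the same create / insert / select / drop lifecycle, the following uses Python's built-in `sqlite3` as a stand-in. It does not call Jammi: the `notes` table and its two columns mirror the steps above, but the SQL dialect, the column type names, and the exception type are SQLite's, not DataFusion's, and `run_lifecycle` is a hypothetical helper name.

```python
import sqlite3


def run_lifecycle() -> list:
    """Create, insert, select, and drop a small table; return the selected rows."""
    conn = sqlite3.connect(":memory:")  # stand-in for a temporary artifact dir
    try:
        conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
        # Insert out of order so the ORDER BY below is actually exercised.
        conn.executemany(
            "INSERT INTO notes (id, body) VALUES (?, ?)",
            [(2, "second"), (1, "first"), (3, "third")],
        )
        rows = conn.execute("SELECT id, body FROM notes ORDER BY id").fetchall()

        conn.execute("DROP TABLE notes")
        try:
            conn.execute("SELECT * FROM notes")
        except sqlite3.OperationalError:
            pass  # expected: the table no longer exists
        else:
            raise AssertionError("SELECT after DROP should fail")

        # Idempotent drop: succeeds even though the table is already gone.
        conn.execute("DROP TABLE IF EXISTS notes")
        return rows
    finally:
        conn.close()
```

The point of the sketch is the shape of the checks (ordering, failure after drop, idempotent second drop), which is what the recipe asserts against the real mutable table.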
API surface exercised
- `Database.create_mutable_table(name, *, schema, primary_key, ...)`
- `Database.sql("INSERT INTO mutable.public.<name> ...")`
- `Database.sql("SELECT ... FROM mutable.public.<name>")`
- `Database.drop_mutable_table(name, *, if_exists=False)`
The DataFusion namespace for mutable tables is always
`mutable.public.<name>`, distinct from registered sources, which live
under `<source>.public.<source>`.
Run it
`python cookbook/recipes/mutable_tables/example.py`
Exits 0 on success, prints `mutable_tables: OK` on the last line.
Trigger streams
End-to-end publish + subscribe on a Jammi topic, plus the registration and listing surface. Uses the embedded in-process broker — no NATS or external broker needed.
When to use this pattern. You need a low-friction event bus inside your application — for fan-out to downstream consumers, fan-in from batch jobs, or replay-from-offset semantics — without bringing up Kafka or NATS in dev/test. The same surface scales out to NATS JetStream by flipping a config flag at deploy time.
What example.py does
- Connects to a temporary artifact dir
- Registers a topic `events.demo` with a typed schema and broker metadata
- Confirms `list_topics()` returns the new topic
- Publishes a 3-row batch through `publish_topic` and captures the broker-assigned offset
- Subscribes from `from_offset=0` and round-trips the same rows back
- Drops the topic, confirms it's gone from `list_topics()`
- Demonstrates idempotent `drop_topic(..., if_exists=True)` and strict-mode failure when dropping a missing topic
API surface exercised
- `Database.register_topic(name, *, schema, broker_metadata=None)`
- `Database.list_topics()`
- `Database.publish_topic(name, *, batch)`: returns the assigned offset
- `Database.subscribe_collect(name, *, from_offset, max_batches)`
- `Database.drop_topic(name, *, if_exists=False)`
The `subscribe_collect` path drives the replay-from-backing-table flow
when `from_offset=0`; the live-tail flow is exercised in the broker
integration suite.
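To make the replay-from-offset idea concrete, here is a toy in-memory log. It is not the Jammi broker: `ToyTopic`, `publish`, and `collect` are hypothetical names, and treating the offset as the position of a batch in the log is an assumption made for illustration, since this page does not define what the broker-assigned offset counts.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTopic:
    """Append-only log of batches; a batch's offset is its position in the log."""

    name: str
    _log: list = field(default_factory=list)

    def publish(self, batch: list) -> int:
        """Append a batch and return the offset assigned to it."""
        offset = len(self._log)
        self._log.append(list(batch))  # copy so later caller edits don't leak in
        return offset

    def collect(self, from_offset: int, max_batches: int) -> list:
        """Replay up to max_batches stored batches, starting at from_offset."""
        window = self._log[from_offset:from_offset + max_batches]
        return [list(b) for b in window]


topic = ToyTopic("events.demo")
batch = [{"id": 1, "kind": "a"}, {"id": 2, "kind": "b"}, {"id": 3, "kind": "c"}]
offset = topic.publish(batch)

# Replaying from offset 0 returns the rows that were published.
replayed = topic.collect(from_offset=0, max_batches=1)
```

A live-tail consumer would instead wait for batches appended after it subscribes; the toy above only covers replay of already-stored batches.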
Run it
`python cookbook/recipes/trigger_streams/example.py`
Exits 0 on success, prints `trigger_streams: OK` on the last line.
Evaluate retrieval quality
Measure recall@k, precision@k, MRR, and nDCG of an embedding index against a golden relevance set.
When to use this pattern. You have a corpus and a small set of (query, expected document) judgments, and you need a number that tells you “is my new encoder better than the one I shipped last month?” The same loop powers nightly regression dashboards and A/B model comparison.
What example.py does
- Connects to a temporary artifact dir
- Registers the tiny corpus as a Parquet source
- Builds 32-dim embeddings over the `content` column with the local `tiny_bert` fixture
- Reads `cookbook/fixtures/tiny_golden.json`, expands it into the `(query_id, query_text, relevant_id)` CSV shape `eval_embeddings` consumes, and registers it as a `golden` source
- Calls `db.eval_embeddings(source="corpus", golden_source="golden.public.golden", k=5)`
- Asserts each aggregate metric is in `[0.0, 1.0]` and the per-query records carry their golden-set `query_id`
API surface exercised
- `Database.generate_text_embeddings(source, *, model, columns, key)`
- `Database.eval_embeddings(*, source, golden_source, model=None, k=10)`
The returned dict carries `aggregate` (the mean across queries of
`recall_at_k`, `precision_at_k`, `mrr`, and `ndcg`) and `per_query` (one
entry per query with `query_id` and a `metrics` sub-dict of the same four
names, un-averaged).
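For readers who want to see what the four numbers measure, the sketch below computes them for one query using the common binary-relevance definitions, then averages across queries. It is an illustration only: `retrieval_metrics` and `mean_metrics` are hypothetical helpers, and the exact formulas Jammi uses (for example, how nDCG is normalised or how MRR treats results beyond `k`) are not stated on this page and may differ.

```python
import math


def retrieval_metrics(ranked_ids: list, relevant_ids: set, k: int) -> dict:
    """Binary-relevance recall@k, precision@k, reciprocal rank, and nDCG@k."""
    top_k = ranked_ids[:k]
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in top_k]
    n_hits = sum(hits)

    recall = n_hits / len(relevant_ids) if relevant_ids else 0.0
    precision = n_hits / k

    # Reciprocal rank of the first relevant result inside the top k.
    rr = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            rr = 1.0 / rank
            break

    # DCG with gain 1 per relevant hit and a log2(rank + 1) discount.
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(k, len(relevant_ids))
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    ndcg = dcg / idcg if idcg > 0 else 0.0

    return {"recall_at_k": recall, "precision_at_k": precision, "mrr": rr, "ndcg": ndcg}


def mean_metrics(per_query: list) -> dict:
    """Average each metric across queries (the 'aggregate' view)."""
    names = ["recall_at_k", "precision_at_k", "mrr", "ndcg"]
    return {n: sum(q[n] for q in per_query) / len(per_query) for n in names}


q1 = retrieval_metrics(["3", "1", "7", "9", "2"], {"1", "2"}, k=5)
q2 = retrieval_metrics(["4", "5", "6", "8", "0"], {"4"}, k=5)
aggregate = mean_metrics([q1, q2])
```

With these definitions every metric falls in `[0.0, 1.0]`, which is the property the recipe asserts.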
Golden source shape
eval_embeddings requires a registered source with these columns:
| column | type | example |
|---|---|---|
| `query_id` | utf8 | `q1` |
| `query_text` | utf8 | `quantum computing applications` |
| `relevant_id` | utf8 | `1` (matches `corpus.id` as a string) |
Image queries are supported via a query_image BLOB column instead of
query_text; cross-modal eval is out of scope for this recipe.
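The expansion from a JSON golden set into this three-column shape might look like the sketch below. The JSON layout (a list of objects with `query_id`, `query_text`, and a `relevant_ids` list) is an assumption; the actual structure of `tiny_golden.json` is not shown on this page, and `expand_golden` is a hypothetical helper.

```python
import csv
import io
import json


def expand_golden(golden_json: str) -> str:
    """Flatten an assumed JSON golden set into (query_id, query_text, relevant_id) CSV."""
    entries = json.loads(golden_json)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["query_id", "query_text", "relevant_id"])
    for entry in entries:
        # One CSV row per (query, relevant document) judgment.
        for relevant_id in entry["relevant_ids"]:
            # relevant_id is written as a string so it can match a utf8 column.
            writer.writerow([entry["query_id"], entry["query_text"], str(relevant_id)])
    return out.getvalue()


sample = json.dumps(
    [{"query_id": "q1", "query_text": "quantum computing applications", "relevant_ids": [1, 4]}]
)
csv_text = expand_golden(sample)
```

A query with several relevant documents becomes several rows that share the same `query_id` and `query_text`.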
Run it
`python cookbook/recipes/eval_embeddings/example.py`
Exits 0 on success, prints the metrics dict + `eval_embeddings: OK`.
Evaluate inference (classification)
Run a classifier over a registered source and score its predictions against gold labels.
When to use this pattern. You have a labelled holdout set and you want a single number — accuracy, macro F1, per-class F1 — to compare two classifiers, or to track drift over time on the same classifier.
What example.py does
- Connects to a temporary artifact dir
- Registers the tiny corpus as `corpus` (parquet)
- Registers `tiny_labels.csv` as `golden` (csv), with `(id, label)` rows
- Runs `db.eval_inference` with the local `tiny_modernbert_classifier` fixture against the `content` column
- Prints the returned aggregate `accuracy`, macro `f1`, per-class metrics, and the count of per-record predictions
- Asserts every reported rate is in `[0.0, 1.0]`
API surface exercised
- `Database.eval_inference(*, model, source, columns, task, golden_source, label_column)`
The returned dict carries `aggregate` (tagged by `"task"`, currently
`"classification"`) with `accuracy`, `f1`, and `per_class`, plus
`per_record` (one entry per aligned `{record_id, predicted, gold}`).
The `task` argument is the string form of the inference task,
`"classification"` here. NER is recognized but not yet supported via this
entrypoint (see the runner's `EvalTask::Ner` branch); for token-level
eval, call the `jammi-numerics` NER kernels directly.
Golden source shape
eval_inference requires a registered source with these columns:
| column | type | example |
|---|---|---|
| `id` | utf8 | `"1"` |
| `<label_column>` | utf8 | `physics` |
`label_column` is the kwarg you pass at call time (`label` in this
recipe). Every `id` in the golden source must resolve to a row in the
input source; rows without a gold label are silently dropped from the
metric.
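The alignment and scoring described here can be sketched in plain Python. This is an illustration under assumptions, not Jammi's implementation: `classification_metrics` is a hypothetical helper, predictions and gold labels are modelled as dicts keyed by `id`, and macro F1 is averaged over every label seen in either the gold or the predicted column (the page does not say which label set the real metric averages over).

```python
def classification_metrics(predictions: dict, gold: dict) -> dict:
    """Accuracy, macro F1, and per-class scores over rows that have a gold label."""
    # Rows without a gold label are dropped before scoring.
    aligned = [
        {"record_id": rid, "predicted": pred, "gold": gold[rid]}
        for rid, pred in predictions.items()
        if rid in gold
    ]
    if not aligned:
        raise ValueError("no predictions align with the gold labels")

    accuracy = sum(r["predicted"] == r["gold"] for r in aligned) / len(aligned)

    labels = sorted({r["gold"] for r in aligned} | {r["predicted"] for r in aligned})
    per_class = {}
    for label in labels:
        tp = sum(r["predicted"] == label and r["gold"] == label for r in aligned)
        fp = sum(r["predicted"] == label and r["gold"] != label for r in aligned)
        fn = sum(r["predicted"] != label and r["gold"] == label for r in aligned)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}

    macro_f1 = sum(c["f1"] for c in per_class.values()) / len(per_class)
    return {
        "aggregate": {"accuracy": accuracy, "f1": macro_f1, "per_class": per_class},
        "per_record": aligned,
    }


result = classification_metrics(
    predictions={"1": "physics", "2": "biology", "3": "physics", "4": "biology"},
    gold={"1": "physics", "2": "physics", "3": "physics"},  # id "4" has no gold label
)
```

Row `"4"` does not appear in `per_record` and does not affect the scores, which is the silent-drop behaviour to keep in mind when a metric looks better than expected.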
Run it
`python cookbook/recipes/eval_inference/example.py`
Exits 0 on success, prints the metrics dict + `eval_inference: OK`.
Fine-tune an encoder
Run a LoRA fine-tune on top of an existing text encoder, poll the job to completion, and use the resulting checkpoint to encode a query.
When to use this pattern. Your domain (legal contracts, medical abstracts, patent claims, internal product docs) doesn’t match the distribution the base encoder was trained on, and you have a few hundred to a few thousand labelled or contrastive pairs. LoRA gets you ~80% of the lift of a full fine-tune at a fraction of the cost; the resulting adapter is small enough to ship as an attachment to the base model rather than a re-distributed full checkpoint.
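The claim about adapter size follows from simple arithmetic: for a weight matrix of shape `d_out × d_in`, a rank-`r` LoRA adapter trains two factors of shapes `d_out × r` and `r × d_in` instead of the full matrix. The sketch below works that out for one layer; the dimensions 768 and rank 8 are illustrative values, not the settings this recipe uses, and `lora_param_counts` is a hypothetical helper.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple:
    """Return (full_matrix_params, lora_adapter_params) for one weight matrix."""
    full = d_in * d_out
    adapter = rank * (d_in + d_out)  # B is d_out x rank, A is rank x d_in
    return full, adapter


full, adapter = lora_param_counts(d_in=768, d_out=768, rank=8)
fraction = adapter / full  # share of the layer's parameters the adapter trains
```

For these illustrative numbers the adapter holds 12,288 parameters against 589,824 in the full matrix, roughly 2% of the layer.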
What example.py does
- Connects to a temporary artifact dir
- Registers `tiny_pairs.csv` (30 contrastive pairs) as `training`
- Calls `db.fine_tune(...)` with the local `tiny_bert` base, a small LoRA rank, and one epoch (kept fast for CI)
- Waits for terminal status via `job.wait()`
- Asserts the resulting `model_id` starts with `jammi:fine-tuned:`
- Encodes a query through the fine-tuned model to confirm it loads
API surface exercised
- `Database.fine_tune(*, source, base_model, columns, method, task=..., ...)`
- `FineTuneJob.wait()`
- `FineTuneJob.job_id`, `FineTuneJob.model_id`
- `Database.encode_text_query(model_id, text)`
The full keyword list on fine_tune covers LoRA rank/alpha/dropout,
learning rate, epochs, batch size, max sequence length, validation
fraction, early-stopping patience/metric, warmup, gradient accumulation,
backbone dtype, weight decay, and gradient clipping — the recipe uses
the defaults for everything except rank and epochs.
Performance note
This recipe is excluded from the per-PR smoke matrix because even at one
epoch it runs ~30 seconds on CPU. The nightly cron with
`JAMMI_COOKBOOK_SLOW=1` includes it. Override the gate locally:
`JAMMI_COOKBOOK_SLOW=1 python tests/cookbook_smoke.py`
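One way such an environment-variable gate could be written is sketched below. How `tests/cookbook_smoke.py` actually implements it is not shown on this page, so `SLOW_RECIPES` and `select_recipes` are hypothetical names; only the variable name `JAMMI_COOKBOOK_SLOW` and the two gated recipes come from this document.

```python
import os

# Recipes this page describes as excluded from the per-PR matrix.
SLOW_RECIPES = {"fine_tune", "flight_sql"}


def select_recipes(all_recipes: list, env=None) -> list:
    """Return the recipes to run; slow ones only when JAMMI_COOKBOOK_SLOW=1."""
    env = os.environ if env is None else env
    run_slow = env.get("JAMMI_COOKBOOK_SLOW") == "1"
    return [r for r in all_recipes if run_slow or r not in SLOW_RECIPES]
```

Passing `env` explicitly keeps the function easy to test without mutating the real process environment.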
Run it
`python cookbook/recipes/fine_tune/example.py`
Exits 0 on success, prints `job_id`, `model_id`, and `fine_tune: OK`.
Connect via Flight SQL
Run a query against a remote jammi server over Arrow Flight SQL.
When to use this pattern. You’re connecting from a non-Python client
(Tableau, dbt, JDBC tools, Rust binaries), or you want to expose Jammi
to multiple readers without each one holding an embedded session. The
same protocol is what dbt-flightsql, the official Flight SQL JDBC
driver, and BI tools speak natively.
What example.py does
- Spawns `target/release/jammi serve` as a child process pointed at a temp `artifact_dir`
- Polls the health endpoint (`http://127.0.0.1:8080/health`) until the server is ready (5 s budget)
- Opens a `pyarrow.flight.FlightClient` against `grpc://127.0.0.1:8081`
- Submits `SELECT 1 AS one` over Flight SQL and confirms the response
- Tears down the server process cleanly
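The readiness-polling step can be sketched with the standard library alone. This is a minimal illustration, not the recipe's code: `wait_for_health` is a hypothetical helper, treating HTTP 200 as "ready" is an assumption about the health endpoint, and the retry interval is arbitrary.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(url: str, budget_s: float = 5.0, interval_s: float = 0.1) -> bool:
    """Poll url until it answers HTTP 200 or the time budget runs out."""
    # An empty ProxyHandler bypasses any proxy environment variables, which
    # would otherwise interfere with requests to a localhost server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + budget_s
    while time.monotonic() < deadline:
        try:
            with opener.open(url, timeout=1.0) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not accepting connections yet, or returned an error
        time.sleep(interval_s)
    return False
```

A caller would typically invoke `wait_for_health("http://127.0.0.1:8080/health")` right after spawning the server and terminate the child process if it returns `False`.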
This recipe is gated out of the per-PR CI matrix because it depends on the
`jammi` binary being built (`cargo build --release -p jammi-cli`), and
the build cost dominates the test wall-clock. The nightly cookbook job
builds the binary and runs the recipe behind `JAMMI_COOKBOOK_SLOW=1`.
Prerequisites
- `cargo build --release -p jammi-cli`: produces `target/release/jammi`
- `pip install pyarrow` (already a `jammi-ai` dependency)
The script uses the `JAMMI_BIN` environment variable if it is set, and
otherwise falls back to the workspace's `target/release/jammi`.
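That lookup order can be expressed in a few lines. The sketch below is an assumption about the shape of the logic, not the recipe's actual code; `resolve_jammi_bin` and its `workspace_root` parameter are hypothetical.

```python
import os
from pathlib import Path


def resolve_jammi_bin(workspace_root: Path, env=None) -> Path:
    """Prefer the JAMMI_BIN override; otherwise use the release build path."""
    env = os.environ if env is None else env
    override = env.get("JAMMI_BIN")
    if override:
        return Path(override)
    return workspace_root / "target" / "release" / "jammi"
```

Checking that the resolved path exists before spawning the server gives a clearer error than a failed `subprocess` launch when the binary has not been built.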
API surface exercised
- `pyarrow.flight.FlightClient.execute(query)` over the Flight SQL command dialect
- `jammi serve`: the OSS deployment-shape binary entrypoint
Run it
`cargo build --release -p jammi-cli  # one-time build`
`python cookbook/recipes/flight_sql/example.py`
Exits 0 on success, prints the query result + `flight_sql: OK`.