Architecture

Crate dependency graph

jammi-db (foundation)
    |
    |-- Config, Catalog, Sources, SQL execution
    |-- Parquet storage, ANN indexes, crash recovery
    |
    v
jammi-ai (intelligence)
    |
    |-- Model resolution, loading, caching
    |-- InferenceExec, AnnSearchExec operators
    |-- Embedding pipeline, result persistence
    |-- SearchBuilder, evidence provenance
    |-- Fine-tuning, evaluation
    |-- GPU scheduling
    |-- InferenceSession (wraps JammiSession)
    |
    +-------+-------+
    |               |
    v               v
jammi-server    jammi-python
    |               |
    |-- Flight SQL  |-- PyO3 bindings
    '-- Health API  '-- pyarrow interop

jammi-cli
    |
    '-- Clap CLI wrapping InferenceSession

jammi-db has no dependency on jammi-ai. The intelligence layer is optional: jammi-db can be used standalone for SQL queries over local data.

Key types and their responsibilities

Engine layer (jammi-db)

| Type | Responsibility |
|---|---|
| JammiConfig | TOML + env config loading with defaults |
| Catalog | SQLite-backed persistence for sources, models, result tables, eval runs, evidence channels |
| JammiSession | DataFusion session + source registration + SQL execution |
| SourceCatalog / JammiSchemaProvider | DataFusion catalog integration |
| ResultStore | Parquet storage coordinator: create, finalize, recover, register |
| ParquetResultWriter | ZSTD-compressed Parquet file writer (64K row groups) |
| VectorIndex / SidecarIndex | ANN index trait + USearch implementation with row_id mapping |
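
The "defaults, then overrides" behaviour described for JammiConfig can be illustrated with a small std-only sketch. It is not the actual JammiConfig: the struct name `EngineConfig`, its fields, and the `JAMMI_*` variable names are hypothetical, and TOML parsing is left out. The lookup is passed in as a closure so the logic can be exercised without touching the process environment.

```rust
use std::path::PathBuf;

/// Hypothetical settings; the real JammiConfig fields are not listed in this document.
#[derive(Debug, Clone, PartialEq)]
pub struct EngineConfig {
    pub artifact_dir: PathBuf,
    pub batch_size: usize,
}

impl Default for EngineConfig {
    fn default() -> Self {
        Self {
            artifact_dir: PathBuf::from(".jammi"),
            batch_size: 32,
        }
    }
}

impl EngineConfig {
    /// Start from the defaults, then apply any overrides the lookup returns.
    /// With the process environment, the lookup would be `|k| std::env::var(k).ok()`.
    pub fn load(lookup: impl Fn(&str) -> Option<String>) -> Result<Self, String> {
        let mut cfg = Self::default();
        // Variable names below are assumptions made for this sketch.
        if let Some(dir) = lookup("JAMMI_ARTIFACT_DIR") {
            cfg.artifact_dir = PathBuf::from(dir);
        }
        if let Some(raw) = lookup("JAMMI_BATCH_SIZE") {
            cfg.batch_size = raw
                .parse::<usize>()
                .map_err(|e| format!("JAMMI_BATCH_SIZE={raw:?}: {e}"))?;
        }
        Ok(cfg)
    }
}
```

A value that fails to parse is reported as an error rather than silently replaced by the default, which tends to make misconfiguration easier to spot.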

AI layer (jammi-ai)

| Type | Responsibility |
|---|---|
| InferenceSession | Wraps JammiSession + ModelCache + ResultStore. Entry point for all operations |
| ModelResolver | Resolves model ID to file paths + backend. Chain: catalog -> local -> HF Hub |
| ModelCache | LRU cache with single-flight loading, ref-counted guards |
| CandleBackend / OrtBackend | Model backends: Candle (safetensors, BERT + ModernBERT + DistilBERT + OpenCLIP ViT), ONNX Runtime |
| HttpBackend | Remote backend: HTTP endpoint for embeddings |
| InferenceExec | DataFusion ExecutionPlan operator for inference with backpressure |
| AnnSearchExec | DataFusion ExecutionPlan leaf node for ANN vector search |
| EmbeddingPipeline | Orchestrates embedding generation (text or image): model -> InferenceExec -> ResultSink -> index. Parameterized by ModelTask |
| ResultSink | Streams inference output to Parquet + sidecar index, filters failed rows |
| SearchBuilder | Fluent API: join, annotate, filter, sort, limit, select, run |
| EvidenceRow / RowProvenance | Evidence model types for provenance tracking |
| OutputAdapter | Trait that converts raw model output to Arrow arrays per task |
| GpuScheduler | GPU memory permit system with budget-based admission control |
| FineTuneJob | LoRA fine-tuning with contrastive loss, checkpointing, early stopping |
| EvalRunner | Retrieval and classification evaluation |
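
To make "permit system with budget-based admission control" concrete, here is a minimal std-only sketch of the idea. It is an illustration under assumptions, not GpuScheduler itself: the names `MemoryBudget`, `Permit`, and `try_acquire` are hypothetical, and this version rejects a request that does not fit instead of queueing it, which the real scheduler may well do differently.

```rust
use std::sync::{Arc, Mutex};

/// Tracks bytes reserved against a fixed budget (hypothetical stand-in for GpuScheduler).
#[derive(Clone)]
pub struct MemoryBudget {
    inner: Arc<Inner>,
}

struct Inner {
    budget_bytes: u64,
    used_bytes: Mutex<u64>,
}

/// RAII guard: the reservation returns to the budget when the permit is dropped.
pub struct Permit {
    inner: Arc<Inner>,
    bytes: u64,
}

impl MemoryBudget {
    pub fn new(budget_bytes: u64) -> Self {
        let inner = Inner { budget_bytes, used_bytes: Mutex::new(0) };
        Self { inner: Arc::new(inner) }
    }

    /// Admit the request only if it fits in the remaining budget.
    pub fn try_acquire(&self, bytes: u64) -> Option<Permit> {
        let mut used = self.inner.used_bytes.lock().unwrap();
        // `used` never exceeds the budget, so this subtraction cannot underflow.
        let remaining = self.inner.budget_bytes - *used;
        if bytes > remaining {
            return None;
        }
        *used += bytes;
        Some(Permit { inner: Arc::clone(&self.inner), bytes })
    }

    /// Bytes currently reserved by live permits.
    pub fn used_bytes(&self) -> u64 {
        *self.inner.used_bytes.lock().unwrap()
    }
}

impl Drop for Permit {
    fn drop(&mut self) {
        let mut used = self.inner.used_bytes.lock().unwrap();
        *used -= self.bytes;
    }
}
```

Tying the release to `Drop` means a permit is returned even when the holder exits early on an error path, so the budget cannot leak reservations.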

Server layer (jammi-server)

| Type | Responsibility |
|---|---|
| AppState | Shared state: Arc<InferenceSession> + ANN cache |
| FlightSqlService | Arrow Flight SQL server backed by DataFusion |
| Health endpoint | HTTP /health for container liveness probes |

Python layer (jammi-python)

| Type | Responsibility |
|---|---|
| Database | PyO3 class wrapping Arc<InferenceSession> with shared tokio runtime |
| SearchBuilder | PyO3 class with imperative-style search composition |
| FineTuneJob | PyO3 class for monitoring fine-tuning jobs |
| connect() | Module-level function to create a Database |

Data flow

SQL query path

JammiSession::sql("SELECT ...")
    -> DataFusion parses SQL
    -> Resolves table from SourceCatalog/JammiSchemaProvider
    -> Creates ListingTable scan from Parquet/CSV/JSON or federated source
    -> Executes plan
    -> Returns Vec<RecordBatch>

Embedding generation path (text and image)

InferenceSession::generate_text_embeddings(source, model, columns, key)
InferenceSession::generate_image_embeddings(source, model, image_column, key)
    -> EmbeddingPipeline::run(task = TextEmbedding | ImageEmbedding)
    -> Register result_table (status = "building")
    -> Build plan: SourceScan -> InferenceExec(task)
    -> InferenceExec dispatches to CandleModel::forward(content, task):
    |   TextEmbedding:  arrow_to_texts -> tokenize -> BERT/ModernBERT -> mean_pool -> L2_normalize
    |   ImageEmbedding: arrow_to_images -> preprocess (model-driven) -> ViT forward -> L2_normalize
    -> Stream batches through ResultSink
    |   |-- Filter _status = "ok"
    |   |-- Transform to embedding schema
    |   |-- Write to Parquet via ParquetResultWriter
    |   '-- Feed vectors to SidecarIndex::add()
    -> Close writer, build ANN index, save sidecar bundle
    -> Register as DataFusion table, update catalog to "ready"
    -> Return ResultTableRecord
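
The last two steps of the text branch above, mean_pool and L2_normalize, can be sketched on plain vectors. This is a simplified illustration, assuming per-token hidden states and a 0/1 attention mask as inputs; the function signatures are hypothetical, and the real pipeline works on model tensors, not `Vec<f32>`.

```rust
/// Average the token vectors whose attention-mask entry is non-zero (padding is skipped).
/// `tokens` is [seq_len][hidden]; `mask` is [seq_len].
pub fn mean_pool(tokens: &[Vec<f32>], mask: &[u8]) -> Vec<f32> {
    let hidden = tokens.first().map_or(0, |t| t.len());
    let mut sum = vec![0.0f32; hidden];
    let mut count = 0.0f32;
    for (tok, &m) in tokens.iter().zip(mask) {
        if m == 0 {
            continue;
        }
        for (s, v) in sum.iter_mut().zip(tok) {
            *s += *v;
        }
        count += 1.0;
    }
    // Leave the all-zero vector as is when every token was masked out.
    if count > 0.0 {
        for s in &mut sum {
            *s /= count;
        }
    }
    sum
}

/// Scale a vector to unit Euclidean length; a zero vector is returned unchanged.
pub fn l2_normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}
```

Normalizing at write time means the dot product of two stored vectors is their cosine similarity, so no further scaling is needed at search time.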

Vector search path

InferenceSession::search(source, query_vec, k)
    -> Resolve embedding table from catalog
    -> AnnSearchExec: SidecarIndex (ANN) or exact_vector_search (fallback)
    -> Hydration: join ANN results back to source table
    -> SearchBuilder: .join() .annotate() .filter() .sort() .limit() .select()
    -> .run(): execute DataFusion plan, add provenance columns
    -> Returns Vec<RecordBatch> with similarity + original columns + evidence
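
The exact-search fallback and the hydration step in this path can be sketched without DataFusion. The code below is an assumption-laden illustration: `Hit`, `exact_top_k`, and `hydrate` are hypothetical names, rows are represented as plain strings keyed by row ID, and the real implementation operates on Arrow record batches.

```rust
use std::collections::HashMap;

/// One search hit: the source row key and its similarity to the query.
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub row_id: String,
    pub similarity: f32,
}

/// Brute-force top-k by dot product.
/// For L2-normalized vectors the dot product equals cosine similarity.
pub fn exact_top_k(embeddings: &[(String, Vec<f32>)], query: &[f32], k: usize) -> Vec<Hit> {
    let mut hits: Vec<Hit> = embeddings
        .iter()
        .map(|(row_id, vec)| Hit {
            row_id: row_id.clone(),
            similarity: vec.iter().zip(query).map(|(a, b)| a * b).sum(),
        })
        .collect();
    // Highest similarity first; total_cmp gives a total order even if a NaN appears.
    hits.sort_by(|a, b| b.similarity.total_cmp(&a.similarity));
    hits.truncate(k);
    hits
}

/// Hydration: attach the original row to each hit, dropping hits whose key is missing.
pub fn hydrate<'a>(hits: &[Hit], source: &'a HashMap<String, String>) -> Vec<(f32, &'a str)> {
    hits.iter()
        .filter_map(|h| source.get(&h.row_id).map(|row| (h.similarity, row.as_str())))
        .collect()
}
```

An exact scan costs time linear in the number of stored vectors, which is acceptable as a fallback for small tables or when no sidecar index is available.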

Module layout

crates/jammi-db/src/
|-- config.rs           # Configuration loading
|-- error.rs            # Unified error type
|-- session.rs          # JammiSession (DataFusion wrapper)
|-- catalog/            # SQLite-backed catalog
|-- source/             # Source types, registry, schema provider
|-- store/              # ResultStore, Parquet writer/reader
'-- index/              # VectorIndex trait, sidecar, exact search

crates/jammi-ai/src/
|-- session.rs          # InferenceSession
|-- model/              # ModelResolver, ModelCache, backends
|-- operator/           # InferenceExec, AnnSearchExec
|-- inference/          # Runner, observer, output adapters, image preprocessing
|-- pipeline/           # EmbeddingPipeline, ResultSink
|-- evidence/           # Provenance types and columns
|-- search/             # SearchBuilder
|-- fine_tune/          # LoRA training, config, jobs
|-- eval/               # Retrieval and classification eval
'-- concurrency/        # GpuScheduler, permits

crates/jammi-server/src/
|-- lib.rs              # Health server startup, signal handling
|-- routes/health.rs    # GET /health liveness probe
|-- error.rs            # 404 fallback
'-- flight.rs           # Arrow Flight SQL service

crates/jammi-cli/src/
|-- main.rs             # Clap CLI entry point
'-- commands/           # serve, query, sources, models, explain

crates/jammi-python/src/
|-- lib.rs              # PyO3 module, connect()
|-- database.rs         # Database class
|-- search.rs           # SearchBuilder class
|-- job.rs              # FineTuneJob class
|-- convert.rs          # Arrow <-> PyArrow conversion
'-- error.rs            # Error conversion