Quickstart: Rust
This walkthrough registers a local data file, runs a SQL query, generates embeddings, and performs a semantic search — all in one program.
Full example
extern crate jammi_db;
extern crate jammi_ai;
extern crate tokio;
use std::sync::Arc;
use jammi_ai::session::InferenceSession;
use jammi_db::config::JammiConfig;
use jammi_db::source::{FileFormat, SourceConnection, SourceType};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = JammiConfig::load(None)?;
    let session = Arc::new(InferenceSession::new(config).await?);

    // 1. Register a data source
    session.add_source("patents", SourceType::File, SourceConnection {
        url: Some("file:///path/to/patents.parquet".into()),
        format: Some(FileFormat::Parquet),
        ..Default::default()
    }).await?;

    // 2. Query with SQL
    let rows = session.sql(
        "SELECT id, title, year FROM patents.public.patents WHERE year > 2020 LIMIT 5"
    ).await?;
    for batch in &rows {
        println!("{batch:?}");
    }

    // 3. Generate embeddings
    let record = session.generate_text_embeddings(
        "patents",
        "sentence-transformers/all-MiniLM-L6-v2",
        &["title".to_string()],
        "id",
    ).await?;
    println!("Embedded {} rows", record.row_count);

    // 4. Semantic search
    let query = session.encode_text_query(
        "sentence-transformers/all-MiniLM-L6-v2",
        "quantum computing applications",
    ).await?;
    let results = session.search("patents", query, 5).await?
        .sort("similarity", true)?
        .run().await?;
    for batch in &results {
        println!("{batch:?}");
    }

    Ok(())
}
The first run downloads the model from the Hugging Face Hub (~90 MB). Subsequent runs load it from the local cache.
What’s happening
- JammiConfig::load(None) loads config from jammi.toml, $JAMMI_CONFIG, or defaults
- InferenceSession wraps the query engine with model loading, caching, and GPU scheduling
- add_source registers a file in the catalog; the registration survives session restarts
- sql runs any SQL query via DataFusion and returns Vec<RecordBatch>
- generate_text_embeddings runs the model over every row and persists vectors to Parquet with a sidecar ANN index
- encode_text_query encodes a text string into the same vector space
- search finds the nearest neighbors, hydrates all source columns, and returns results with similarity scores
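To make the nearest-neighbor step concrete, here is a minimal standalone sketch of what a similarity search computes. It is not Jammi's implementation: it assumes cosine similarity as the metric (this page does not say which metric is used) and scans every vector by brute force, where Jammi uses an ANN index. The names cosine_similarity and top_k are hypothetical and exist only for this illustration.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns 0.0 if either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Brute-force top-k: score every stored vector against the query,
/// sort by descending similarity, and keep the first k (id, score) pairs.
/// This is an illustrative stand-in for an ANN index, not Jammi's API.
fn top_k(query: &[f32], rows: &[(u32, Vec<f32>)], k: usize) -> Vec<(u32, f32)> {
    let mut scored: Vec<(u32, f32)> = rows
        .iter()
        .map(|(id, v)| (*id, cosine_similarity(query, v)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}

fn main() {
    // Toy 3-dimensional "embeddings"; real model output has hundreds of dimensions.
    let rows: Vec<(u32, Vec<f32>)> = vec![
        (1, vec![1.0, 0.0, 0.0]),
        (2, vec![0.0, 1.0, 0.0]),
        (3, vec![0.9, 0.1, 0.0]),
    ];
    let query: [f32; 3] = [1.0, 0.0, 0.0];
    for (id, score) in top_k(&query, &rows, 2) {
        println!("id={id} similarity={score:.3}");
    }
}
```

The query vector and the stored vectors must come from the same model, which is why the walkthrough passes the same model name to both generate_text_embeddings and encode_text_query.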
Next steps
- Query Your Data with SQL — SQL features, joins, aggregations
- Generate Embeddings — persistence, multiple models, crash recovery
- Semantic Search — SearchBuilder API, filtering, evidence provenance