MindsDB Query Engine · Open-source
MindsDB is the open-source query engine that gives AI agents a single way to read from databases, warehouses, SaaS apps, document stores, and vector indexes — with built-in Knowledge Bases for unstructured data and Jobs & Triggers for automation.
Three primitives — three SQL statements — turn a sprawl of databases, documents, and APIs into a single queryable surface an agent can reason over.
Wire up a Postgres warehouse, a Salesforce account, an S3 bucket of PDFs, and a vector store — each as a "database" inside MindsDB. Then query across them with standard SQL: joins, aggregates, subqueries, the whole vocabulary. No data movement, no ETL pipelines to maintain. Handlers ship in the open and new ones merge from the community.
CREATE DATABASE postgres_prod
WITH ENGINE = 'postgres',
PARAMETERS = {
    "host": "db.internal",
    "user": "readonly",
    "password": "${POSTGRES_PWD}",
    "database": "analytics"
};

SELECT customer_id, total_arr
FROM postgres_prod.accounts
WHERE region = 'EU';

Point a Knowledge Base at a folder of PDFs, a Confluence space, or a stream of support tickets. MindsDB chunks the content, vectorizes it, and stores the metadata you choose: author, source URL, last-updated date, or anything else. Then query it with the same SQL: semantic search, metadata filters, joins against structured tables. The hard part of building a RAG pipeline becomes one CREATE statement.
CREATE KNOWLEDGE_BASE customer_docs
USING
    embedding_model = 'openai.text-embedding-3-small',
    content_columns = ['body'],
    metadata_columns = ['author', 'source_url', 'updated_at'];

INSERT INTO customer_docs
SELECT body, author, source_url, updated_at
FROM s3_bucket.support_pdfs;

SELECT chunk_content, source_url
FROM customer_docs
WHERE content LIKE 'invoice dispute resolution'
  AND author = 'support-team'
LIMIT 5;

Jobs run on a schedule: refresh a Knowledge Base nightly, sync a derived table every hour, recompute a feature view every five minutes. Triggers fire on data changes: when a new row lands in Postgres, run a follow-up query that vectorizes it into the right Knowledge Base. Together they turn the query engine into a self-maintaining data layer agents can rely on.
CREATE JOB refresh_docs (
    INSERT INTO customer_docs
    SELECT body, author, source_url, updated_at
    FROM s3_bucket.support_pdfs
    WHERE updated_at > (
        SELECT MAX(updated_at) FROM customer_docs
    )
)
EVERY 1 hour;

MindsDB sits between agents and the systems where data actually lives. Agents speak SQL or MCP to MindsDB; MindsDB speaks each source's native protocol to fetch, join, and return rows.
Each integration is an open-source handler in the main repo — merged in the open, with a consistent SQL interface across the whole fleet.
Postgres, MySQL, MongoDB, Snowflake, BigQuery, ClickHouse, Redshift, Databricks
Salesforce, HubSpot, Stripe, Shopify, Slack, Notion, Jira, GitHub
S3, GCS, Azure Blob, local files, PDF, HTML, Markdown
SAP, Oracle, NetSuite, ServiceNow, custom REST endpoints
Pinecone, Weaviate, Chroma, pgvector, OpenAI, Anthropic, Hugging Face
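To check which of these handlers a given MindsDB instance actually ships, and which sources are already connected, you can ask the engine itself. This is a minimal sketch, assuming a running instance and that the SHOW statements behave as described in the MindsDB SQL reference for your version.

```sql
-- List the data-source handlers available in this instance.
SHOW HANDLERS WHERE type = 'data';

-- List the sources already connected with CREATE DATABASE
-- (postgres_prod from the earlier example should appear here).
SHOW DATABASES;
```

An agent can run the same two statements at startup to discover what it is allowed to query, rather than carrying a hard-coded list of sources in its prompt.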
A traditional RAG stack means stitching together a chunker, an embedding model, a vector store, a metadata layer, and a retrieval API. A MindsDB Knowledge Base is one SQL statement: you declare the embedding model and the metadata columns, and the engine handles chunking, vectorization, storage, re-embedding on update, and hybrid retrieval.
Semantic search via LIKE, structured filters via WHERE.

| content | author | source |
|---|---|---|
| invoice dispute… | support | s3://docs |
| refund policy… | legal | confluence |
| onboarding step… | cs-team | s3://docs |

SELECT … WHERE content LIKE '…'

Most agent demos fail in production for the same reason: the data layer behind them isn't shaped for what agents actually do.
One uniform SQL interface to Postgres, S3, Salesforce, and a vector store — instead of one bespoke tool per provider. Smaller prompt, fewer tool-use failures.
Jobs and Triggers keep Knowledge Bases and derived tables in sync without a separate orchestrator. The data the agent retrieves at run time is current, not from last week's ETL.
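The CREATE JOB statement earlier on this page covers the scheduled half; the trigger half might look like the sketch below. It assumes a hypothetical postgres_prod.support_tickets table with body, author, source_url, and updated_at columns, and it relies on the TABLE_DELTA variable, which MindsDB's CREATE TRIGGER documentation describes as the set of newly inserted or updated rows. Triggers are only supported on some sources, and on Postgres they may need more than the read-only user shown above, so check the documentation for your version.

```sql
-- Sketch only: support_tickets is a hypothetical table, not one defined on this page.
-- Fires when rows are added to or updated in the source table.
CREATE TRIGGER index_new_tickets
ON postgres_prod.support_tickets
(
    -- TABLE_DELTA stands for the rows that caused the trigger to fire.
    INSERT INTO customer_docs
    SELECT body, author, source_url, updated_at
    FROM TABLE_DELTA
);
```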
A useful answer often joins the row from your warehouse with a paragraph from a PDF. MindsDB lets you write that join in one SQL statement — instead of two systems and a glue layer.
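The sketch below shows such a join, reusing postgres_prod.accounts and the customer_docs Knowledge Base from the earlier examples. It assumes the Knowledge Base was created with an extra customer_id metadata column, which the CREATE KNOWLEDGE_BASE statement on this page does not declare, so treat that column as hypothetical.

```sql
-- Sketch only: kb.customer_id is a hypothetical metadata column.
-- Pairs EU accounts with the document chunks that best match the search phrase.
SELECT a.customer_id,
       a.total_arr,
       kb.chunk_content,
       kb.source_url
FROM postgres_prod.accounts AS a
JOIN customer_docs AS kb
  ON kb.customer_id = a.customer_id
WHERE a.region = 'EU'
  AND kb.content LIKE 'invoice dispute resolution'
LIMIT 5;
```

The structured filter (region) and the semantic filter (content LIKE) sit in the same WHERE clause, so the agent issues one query instead of calling a warehouse tool and a retrieval tool and merging the results itself.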
Permissions, audit, and source-of-truth live where they should — in your databases — and MindsDB enforces them on every query. The agent inherits your existing access model.
MindsHub is a separate platform from the same team — open-source agents, model routing, a credentials vault, and tool access, ready to run.