# Databricks Documentation MCP Server

A lightweight, stateless Model Context Protocol (MCP) server that lets AI assistants read and search docs.databricks.com in real time — no local cache, no database, no crawling required.
## Features

- Live fetch — always returns current documentation content, never stale
- Full-text search — real-time, site-scoped search via DuckDuckGo's `site:` operator
- Docusaurus-aware extraction — strips navigation, sidebars, and page chrome; returns clean markdown
- Section extraction — pull specific h2 sections from long reference pages
- Pagination — `start_index` and `max_length` parameters for large pages
## Prerequisites

- Python 3.10 or later
- uv (recommended) or pip
## Installation

```shell
# run directly without installing (recommended)
uvx databricks-docs-mcp

# or install permanently
pip install databricks-docs-mcp
```
## MCP client configuration (release install)

### Claude Desktop

Edit `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS) or `%APPDATA%\Claude\claude_desktop_config.json` (Windows):

```json
{
  "mcpServers": {
    "databricks-docs": {
      "command": "uvx",
      "args": ["databricks-docs-mcp"]
    }
  }
}
```
### VS Code (GitHub Copilot)

Add to `.vscode/mcp.json` in your workspace:

```json
{
  "servers": {
    "databricks-docs": {
      "type": "stdio",
      "command": "uvx",
      "args": ["databricks-docs-mcp"]
    }
  }
}
```

To pin a specific version, use `"args": ["databricks-docs-mcp==1.2.0"]`.
## From source (development)

```shell
git clone https://gitlab.com/rokorolev/databricks-docs-mcp.git
cd databricks-docs-mcp
uv sync --extra dev
```

MCP client config for a local clone:

```json
{
  "servers": {
    "databricks-docs": {
      "type": "stdio",
      "command": "uv",
      "args": ["--directory", "/absolute/path/to/databricks-docs-mcp", "run", "databricks-docs-mcp"]
    }
  }
}
```
## Environment Variables

| Variable | Default | Description |
|---|---|---|
| `MCP_USER_AGENT` | `Mozilla/5.0 (compatible; DatabricksDocsMCP/1.0)` | HTTP User-Agent sent with every request |
| `FASTMCP_LOG_LEVEL` | `WARNING` | Log verbosity: `DEBUG`, `INFO`, `WARNING`, `ERROR` |
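To set these for a single client without touching your shell environment, most MCP client configs accept a per-server `env` map — a sketch for the Claude Desktop config shown earlier (check your client's documentation for the exact key):

```json
{
  "mcpServers": {
    "databricks-docs": {
      "command": "uvx",
      "args": ["databricks-docs-mcp"],
      "env": { "FASTMCP_LOG_LEVEL": "DEBUG" }
    }
  }
}
```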
## Tools

### search_documentation

Search docs.databricks.com using a site-scoped real-time web search.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `query` | string | — | Keywords or topic to search for |
| `limit` | integer | 10 | Maximum results to return (max 30) |

Returns a JSON array of results with URL, title, and snippet.
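An illustrative response shape (field names follow the description above; the values, and the exact key casing, are hypothetical):

```json
[
  {
    "url": "https://docs.databricks.com/aws/en/dlt/settings.html",
    "title": "Configure pipeline settings",
    "snippet": "..."
  }
]
```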
### read_documentation

Fetch a docs.databricks.com page as clean markdown.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `url` | string | — | Full docs.databricks.com URL |
| `max_length` | integer | 5000 | Maximum characters to return per call |
| `start_index` | integer | 0 | Character offset for pagination |

Returns markdown-formatted page content with a continuation hint when the page is truncated.
### read_sections

Extract specific h2 sections from a docs page by heading title.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `url` | string | — | Full docs.databricks.com URL |
| `section_titles` | string[] | — | h2 heading titles to extract (case-insensitive) |

Returns markdown of the matched sections only.
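The case-insensitive matching can be pictured with a small sketch (hypothetical, not the server's actual implementation): requested titles are compared against the page's `<h2>` heading text after lowercasing both sides.

```python
def match_h2_sections(page_headings: list[str], section_titles: list[str]) -> list[str]:
    """Return the page's h2 headings that match any requested title,
    comparing case-insensitively (illustrative sketch only)."""
    wanted = {t.strip().lower() for t in section_titles}
    return [h for h in page_headings if h.strip().lower() in wanted]

print(match_h2_sections(
    ["Pipeline mode", "Compute settings", "Notifications"],
    ["pipeline mode", "COMPUTE SETTINGS"],
))  # → ['Pipeline mode', 'Compute settings']
```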
## Basic Usage

### Recommended workflow

1. Search for a topic:

   ```
   search_documentation("Delta Live Tables pipeline settings")
   ```

2. Read the most relevant result:

   ```
   read_documentation("https://docs.databricks.com/aws/en/dlt/settings.html")
   ```

3. Extract specific sections from large pages:

   ```
   read_sections(
       "https://docs.databricks.com/aws/en/dlt/settings.html",
       ["Pipeline mode", "Compute settings"]
   )
   ```
## Tips

- Databricks docs URLs follow the pattern `https://docs.databricks.com/<cloud>/en/<topic>/...`. Use `aws` for AWS, `gcp` for GCP, `azure` for Azure.
- Use `start_index` in `read_documentation` to page through long articles.
- Section titles for `read_sections` are matched case-insensitively against `<h2>` headings on the page.
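The `start_index` paging loop can be sketched on the client side as follows. This is a sketch under stated assumptions: `fetch` stands in for whatever wrapper your MCP client uses to call `read_documentation`, a chunk shorter than `max_length` is taken to mean the page is exhausted, and the continuation hint the tool appends to truncated pages is ignored here.

```python
MAX_LENGTH = 5000  # read_documentation's default max_length


def next_start_index(start_index: int, chunk: str, max_length: int = MAX_LENGTH):
    """Return the offset for the next call given the chunk just received,
    or None when the chunk was shorter than max_length (page exhausted)."""
    if len(chunk) < max_length:
        return None
    return start_index + len(chunk)


def read_all(fetch, url: str, max_length: int = MAX_LENGTH) -> str:
    """Drive `fetch(url, start_index, max_length)` — a hypothetical wrapper
    around the read_documentation tool — until the whole page is collected."""
    parts, start = [], 0
    while start is not None:
        chunk = fetch(url, start, max_length)
        parts.append(chunk)
        start = next_start_index(start, chunk, max_length)
    return "".join(parts)
```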
## Development

```shell
uv sync --extra dev
```

### Lint

```shell
uv run ruff check src/ tests/
```

### Run tests

```shell
uv run --frozen pytest --cov --cov-branch --cov-report=term-missing
```
## Project structure

```
src/
  databricks_docs_mcp/
    server.py    # MCP server and tool definitions
    utils.py     # HTML extraction and formatting utilities
    models.py    # Pydantic models for search results
tests/
  test_server.py
  test_utils.py
```
## License

MIT — see LICENSE.