Databricks Documentation MCP Server

A lightweight, stateless Model Context Protocol (MCP) server that lets AI assistants read and search docs.databricks.com in real time — no local cache, no database, no crawling required.

Features

  • Live fetch: every call returns the documentation as currently published
  • Full-text search: real-time, site-scoped search via the DuckDuckGo site: operator
  • Docusaurus-aware extraction: strips navigation, sidebars, and page chrome; returns clean markdown
  • Section extraction: pull specific h2 sections from long reference pages
  • Pagination: start_index and max_length parameters for large pages

Prerequisites

  • Python 3.10 or later
  • uv (recommended) or pip

Installation

# run directly without installing (recommended)
uvx databricks-docs-mcp

# or install permanently
pip install databricks-docs-mcp

MCP client configuration (release install)

Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or
%APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "databricks-docs": {
      "command": "uvx",
      "args": ["databricks-docs-mcp"]
    }
  }
}

VS Code (GitHub Copilot)

Add to .vscode/mcp.json in your workspace:

{
  "servers": {
    "databricks-docs": {
      "type": "stdio",
      "command": "uvx",
      "args": ["databricks-docs-mcp"]
    }
  }
}

To pin a specific version, use "args": ["databricks-docs-mcp==1.2.0"].

From source (development)

git clone https://gitlab.com/rokorolev/databricks-docs-mcp.git
cd databricks-docs-mcp
uv sync --extra dev

MCP client config for a local clone:

{
  "servers": {
    "databricks-docs": {
      "type": "stdio",
      "command": "uv",
      "args": ["--directory", "/absolute/path/to/databricks-docs-mcp", "run", "databricks-docs-mcp"]
    }
  }
}

Environment Variables

Variable           Default                                          Description
MCP_USER_AGENT     Mozilla/5.0 (compatible; DatabricksDocsMCP/1.0)  HTTP User-Agent sent with every request
FASTMCP_LOG_LEVEL  WARNING                                          Log verbosity: DEBUG, INFO, WARNING, ERROR
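
As a rough illustration of how these two variables and their defaults fit together, the sketch below resolves them from an environment mapping. The function name load_settings and the fallback for unrecognized log levels are assumptions made for this example, not the server's actual code.

```python
import os

# Defaults copied from the table above.
DEFAULT_USER_AGENT = "Mozilla/5.0 (compatible; DatabricksDocsMCP/1.0)"
DEFAULT_LOG_LEVEL = "WARNING"
VALID_LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR"}


def load_settings(env=None):
    """Return (user_agent, log_level), using the documented defaults when unset.

    load_settings is a hypothetical helper written for this example.
    """
    env = os.environ if env is None else env
    user_agent = env.get("MCP_USER_AGENT", DEFAULT_USER_AGENT)
    log_level = env.get("FASTMCP_LOG_LEVEL", DEFAULT_LOG_LEVEL).upper()
    if log_level not in VALID_LOG_LEVELS:
        # Assumption: an unknown level falls back to the default instead of raising.
        log_level = DEFAULT_LOG_LEVEL
    return user_agent, log_level
```

Passing a plain dict instead of os.environ makes it easy to check the resolution logic without touching the real environment.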

Tools

search_documentation

Search docs.databricks.com using a site-scoped real-time web search.

Parameter  Type     Default  Description
query      string   -        Keywords or topic to search for
limit      integer  10       Maximum results to return (max 30)

Returns a JSON array of results with URL, title, and snippet.
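
To make the idea of a site-scoped search concrete, here is a minimal sketch of how a query could be combined with the site: operator and how limit could be held to the documented maximum of 30. The helper build_search_query is hypothetical and only mirrors the behavior described above; the server may prepare its query differently.

```python
MAX_LIMIT = 30  # upper bound stated in the parameter table
DOCS_HOST = "docs.databricks.com"


def build_search_query(query: str, limit: int = 10) -> tuple[str, int]:
    """Return a site-scoped query string and a limit clamped to 1..MAX_LIMIT.

    build_search_query is a hypothetical helper written for this example.
    """
    cleaned = " ".join(query.split())  # collapse repeated whitespace
    if not cleaned:
        raise ValueError("query must not be empty")
    clamped = max(1, min(limit, MAX_LIMIT))
    return f"site:{DOCS_HOST} {cleaned}", clamped
```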

read_documentation

Fetch a docs.databricks.com page as clean markdown.

Parameter    Type     Default  Description
url          string   -        Full docs.databricks.com URL
max_length   integer  5000     Maximum characters to return per call
start_index  integer  0        Character offset for pagination

Returns markdown-formatted page content with a continuation hint when the page is truncated.
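
The interplay of start_index and max_length can be sketched as plain character slicing. The example below is an assumption about the mechanics, not the server's implementation: paginate and read_all are hypothetical names, and the returned next offset stands in for the continuation hint, whose exact wording is not specified here.

```python
def paginate(content: str, start_index: int = 0, max_length: int = 5000):
    """Return (chunk, next_start_index); next_start_index is None on the last page."""
    if start_index < 0 or max_length <= 0:
        raise ValueError("start_index must be >= 0 and max_length must be > 0")
    chunk = content[start_index:start_index + max_length]
    end = start_index + len(chunk)
    # More text remains only if the slice stopped before the end of the page.
    next_index = end if end < len(content) else None
    return chunk, next_index


def read_all(content: str, max_length: int = 5000) -> str:
    """Walk every page in order and reassemble the full text."""
    parts = []
    index = 0
    while index is not None:
        chunk, index = paginate(content, index, max_length)
        parts.append(chunk)
    return "".join(parts)
```

A client follows the same loop against the real tool: call read_documentation, and if the response indicates truncation, call again with start_index advanced by the number of characters already received.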

read_sections

Extract specific h2 sections from a docs page by heading title.

Parameter       Type      Default  Description
url             string    -        Full docs.databricks.com URL
section_titles  string[]  -        h2 heading titles to extract (case-insensitive)

Returns markdown of the matched sections only.
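
Case-insensitive section matching can be illustrated on already-converted markdown, where an h2 heading appears as a line starting with "## ". The sketch below is a simplified stand-in and not the project's extraction code: extract_sections is a hypothetical name, and it assumes no "## " lines occur inside code blocks.

```python
def extract_sections(markdown: str, section_titles: list[str]) -> str:
    """Keep only the h2 sections whose titles match, ignoring case."""
    wanted = {title.strip().casefold() for title in section_titles}
    kept: list[str] = []
    keeping = False
    for line in markdown.splitlines():
        if line.startswith("## "):
            # A new h2 starts: keep it only if its title was requested.
            keeping = line[3:].strip().casefold() in wanted
        elif line.startswith("# "):
            # An h1 ends whatever section was being collected.
            keeping = False
        if keeping:
            kept.append(line)
    return "\n".join(kept).strip()
```

Deeper headings (h3 and below) stay with their parent h2 section, so a matched section is returned with its subsections intact.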

Basic Usage

1. Search for a topic:

search_documentation("Delta Live Tables pipeline settings")

2. Read the most relevant result:

read_documentation("https://docs.databricks.com/aws/en/dlt/settings.html")

3. Extract specific sections from large pages:

read_sections(
  "https://docs.databricks.com/aws/en/dlt/settings.html",
  ["Pipeline mode", "Compute settings"]
)

Tips

  • Databricks docs URLs follow the pattern https://docs.databricks.com/<cloud>/en/<topic>/...
    Use aws for AWS, gcp for GCP, azure for Azure.
  • Use start_index in read_documentation to page through long articles.
  • Section titles for read_sections are matched case-insensitively against <h2> headings on the page.
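
The URL pattern from the first tip can be captured in a pair of small helpers. This is a minimal sketch under the assumption that the pattern holds for every page; docs_url and cloud_of are hypothetical names and not part of the server.

```python
from typing import Optional
from urllib.parse import urlparse

DOCS_HOST = "docs.databricks.com"
CLOUDS = {"aws", "gcp", "azure"}


def docs_url(cloud: str, path: str) -> str:
    """Build https://docs.databricks.com/<cloud>/en/<path> for a known cloud."""
    cloud = cloud.lower()
    if cloud not in CLOUDS:
        raise ValueError(f"unknown cloud: {cloud!r}")
    return f"https://{DOCS_HOST}/{cloud}/en/{path.lstrip('/')}"


def cloud_of(url: str) -> Optional[str]:
    """Return the cloud segment of a docs URL, or None if it does not fit the pattern."""
    parsed = urlparse(url)
    if parsed.hostname != DOCS_HOST:
        return None
    parts = [part for part in parsed.path.split("/") if part]
    if len(parts) >= 2 and parts[0] in CLOUDS and parts[1] == "en":
        return parts[0]
    return None
```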

Development

uv sync --extra dev

Lint

uv run ruff check src/ tests/

Run tests

uv run --frozen pytest --cov --cov-branch --cov-report=term-missing

Project structure

src/
  databricks_docs_mcp/
    server.py   # MCP server and tool definitions
    utils.py    # HTML extraction and formatting utilities
    models.py   # Pydantic models for search results
tests/
  test_server.py
  test_utils.py

License

MIT — see LICENSE.