MCPDocSearch

0 Reviews
8 Stars
MCPDocSearch enables crawling of websites and wikis, converting content into Markdown files. It then hosts a Model Context Protocol (MCP) server that loads, chunks, and embeds this documentation for efficient semantic search, supporting tool integrations like Cursor.
Added on: Apr 18 2025
Created by: Alireza Davoodi

What is MCPDocSearch?

MCPDocSearch aggregates web documentation and makes it searchable. Its web crawler (`crawler_cli`) extracts and cleans content from websites or wikis and stores each page locally as Markdown. Its MCP server (`mcp_server`) loads those Markdown documents, splits them into chunks at headings, and generates embeddings for semantic search. The server exposes MCP tools for listing documents, retrieving headings, and running searches, which clients such as Cursor can call. Processed data is cached so that large documentation sets can be indexed once and retrieved quickly.
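
The heading-based chunking described above can be illustrated with a minimal sketch. This is not the project's actual implementation: the `Chunk` type and `chunk_by_headings` function are hypothetical names, and the sketch assumes ATX-style (`#`) Markdown headings.

```python
import re
from dataclasses import dataclass

# Matches ATX headings such as "# Title" or "### Sub-section".
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3  # marker that opens or closes a fenced code block


@dataclass
class Chunk:
    heading: str  # heading text, or "" for content before the first heading
    level: int    # 1-6 for headings, 0 for the preamble
    text: str     # body text found under the heading


def chunk_by_headings(markdown: str) -> list[Chunk]:
    """Split a Markdown document into one chunk per heading (hypothetical helper)."""
    chunks: list[Chunk] = []
    heading, level, body = "", 0, []
    in_fence = False
    for line in markdown.splitlines():
        # Lines such as "# comment" inside fenced code are not headings.
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADING_RE.match(line)
        if match:
            text = "\n".join(body).strip()
            if heading or text:
                chunks.append(Chunk(heading, level, text))
            heading, level, body = match.group(2), len(match.group(1)), []
        else:
            body.append(line)
    text = "\n".join(body).strip()
    if heading or text:
        chunks.append(Chunk(heading, level, text))
    return chunks
```

Each chunk keeps its heading alongside its body, so a search result can point back to the section it came from.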

Who will use MCPDocSearch?

  • Developers needing documentation search integration
  • Technical writers creating searchable document repositories
  • Organizations maintaining extensive online wikis
  • AI and data scientists working with knowledge bases

How to use MCPDocSearch?

  • Step 1: Run `crawl.py` (directly or through `uv run`) to crawl a website and generate Markdown documentation
  • Step 2: Keep the generated Markdown files in the `./storage/` directory so the MCP server can load them
  • Step 3: Start the MCP server with `python -m mcp_server.main` from the project root
  • Step 4: Configure a client such as Cursor to launch the MCP server with that command and its arguments
  • Step 5: Use the MCP tools (`list_documents`, `search_documentation`) to query the crawled content
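
The client configuration step can be sketched as a small script that prints a config entry for an MCP client. It assumes the `mcpServers` layout that Cursor reads from its `mcp.json` file and assumes the server is launched through `uv`. The server name `doc-search`, the project path, and the `build_cursor_config` helper are hypothetical placeholders, not values taken from the project.

```python
import json


def build_cursor_config(project_dir: str, server_name: str = "doc-search") -> dict:
    """Return an MCP client config entry that starts the server as a subprocess.

    `server_name` and `project_dir` are placeholders; adjust them to your setup.
    """
    return {
        "mcpServers": {
            server_name: {
                # Assumption: uv is installed and manages the project's environment.
                "command": "uv",
                "args": ["--directory", project_dir, "run", "python", "-m", "mcp_server.main"],
            }
        }
    }


if __name__ == "__main__":
    # Print JSON that can be pasted into the client's MCP configuration file.
    print(json.dumps(build_cursor_config("/path/to/MCPDocSearch"), indent=2))
```

If `uv` is not used, the same structure should work with `python` as the command and `-m mcp_server.main` as the arguments, provided the client starts the process from the project root.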

MCPDocSearch's Core Features & Benefits

The Core Features
  • Web crawling with configurable depth and URL patterns
  • HTML content cleaning and conversion to Markdown
  • Loading, chunking, and semantic embedding of documentation
  • Caching processed data for performance
  • Exposing search and document management MCP tools
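
As a rough illustration of how embedded chunks can be ranked against a query, the sketch below scores chunks by cosine similarity. It makes a loud simplification: `embed` is a toy word-count vector standing in for a real embedding model, so it only captures word overlap, not meaning. The function names are hypothetical and are not the project's API.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lower-cased word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, chunks: dict[str, str], top_k: int = 3) -> list[tuple[str, float]]:
    """Return the top_k chunk names with their similarity to the query, best first."""
    query_vec = embed(query)
    scored = [(name, cosine(query_vec, embed(text))) for name, text in chunks.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]
```

Swapping `embed` for a dense sentence-embedding model leaves the ranking logic unchanged, and that swap is what makes the search semantic rather than lexical.
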
The Benefits
  • Automates documentation collection from websites
  • Enables fast, semantic search over large documentation sets
  • Supports seamless integration with MCP-compatible tools
  • Reduces manual effort in maintaining searchable documentation
  • Optimizes performance with caching mechanisms
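
One way such a cache could work is sketched below: processed data is stored next to a fingerprint of the Markdown files and rebuilt only when a file is added or changed. This is an assumption about the general technique, not a description of the project's cache. `fingerprint`, `load_or_build`, and the cache file layout are hypothetical.

```python
import hashlib
import pickle
from pathlib import Path
from typing import Callable


def fingerprint(doc_dir: Path) -> str:
    """Hash the name, size, and modification time of every Markdown file."""
    digest = hashlib.sha256()
    for path in sorted(doc_dir.glob("*.md")):
        stat = path.stat()
        digest.update(f"{path.name}:{stat.st_size}:{stat.st_mtime_ns}\n".encode())
    return digest.hexdigest()


def load_or_build(doc_dir: Path, cache_file: Path, build: Callable[[Path], object]) -> object:
    """Return cached data if the documents are unchanged, else rebuild and store it."""
    current = fingerprint(doc_dir)
    if cache_file.exists():
        # Only unpickle cache files you created yourself; pickle is not safe
        # for untrusted input.
        with cache_file.open("rb") as fh:
            cached = pickle.load(fh)
        if cached.get("fingerprint") == current:
            return cached["data"]
    data = build(doc_dir)
    with cache_file.open("wb") as fh:
        pickle.dump({"fingerprint": current, "data": data}, fh)
    return data
```

Here `build` would be the expensive step (loading, chunking, and embedding), so repeated server starts over an unchanged `./storage/` directory skip it entirely.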

MCPDocSearch's Main Use Cases & Applications

  • Creating searchable internal wikis for technical teams
  • Indexing online API documentation for quick reference
  • Building knowledge bases for AI assistants
  • Archiving and searching enterprise documentation

Developer

  • alizdavoodi