llama-cpp-agent is an open-source C++ framework for running AI agents entirely offline. It uses the llama.cpp inference engine for low-latency local inference and provides a modular plugin system, configurable memory, and task execution. Developers can integrate custom tools, switch between local LLMs, and build privacy-focused conversational assistants that do not rely on external services.
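
To make the idea of integrating custom tools more concrete, here is a minimal sketch of a name-to-callback tool registry in C++17. It illustrates the general pattern only: `ToolFn`, `ToolRegistry`, `add`, `call`, and `make_demo_registry` are hypothetical names chosen for this example and are not taken from llama-cpp-agent's actual API, and the sketch makes no llama.cpp calls.

```cpp
// Hypothetical sketch: none of these names come from llama-cpp-agent itself.
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Assumption: a tool takes one string argument and returns a string result.
using ToolFn = std::function<std::string(const std::string&)>;

class ToolRegistry {
public:
    // Register a tool under a name; a later registration replaces an earlier one.
    void add(const std::string& name, ToolFn fn) {
        assert(fn && "tool callback must not be empty");  // checked in debug builds only
        tools_[name] = std::move(fn);
    }

    // Run the named tool, or return std::nullopt if no such tool is registered.
    std::optional<std::string> call(const std::string& name,
                                    const std::string& arg) const {
        auto it = tools_.find(name);
        if (it == tools_.end()) return std::nullopt;
        return it->second(arg);
    }

private:
    std::map<std::string, ToolFn> tools_;
};

// Build a registry holding two toy tools for demonstration.
ToolRegistry make_demo_registry() {
    ToolRegistry reg;
    reg.add("echo", [](const std::string& s) { return s; });
    reg.add("length", [](const std::string& s) { return std::to_string(s.size()); });
    return reg;
}
```

In a design like this, an agent loop would parse a tool name and argument from the model's output and pass them to `call`. Returning `std::optional` lets the caller tell an unknown tool apart from a tool that legitimately returns an empty string.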