Agent Framework
AnythingLLM
Chat with your documents
AnythingLLM is a local RAG (Retrieval-Augmented Generation) system. Upload your documents — PDFs, Word files, spreadsheets — and ask questions about them. The AI finds relevant passages and answers based on your actual documents, not training data.
Book a consultation to set up AnythingLLM →
What it is
Your documents. Your questions. No cloud upload required.
AnythingLLM creates a local knowledge base from your files. You upload documents into workspaces — a workspace for a client, a workspace for a project, a workspace for your SOPs — and ask questions in plain language.
The system uses a technique called RAG: it finds the most relevant passages in your documents and feeds them to the AI as context. The AI answers from your documents, not from general training data. Grounding the model in your own text makes hallucinations much less likely.
Everything runs locally. Your documents are indexed and stored on your Mac. Nothing is transmitted to a third-party AI provider.
How it works
Upload, embed, ask
When you upload a document, AnythingLLM breaks it into chunks and converts them into numerical vectors using a local embedding model. These vectors are stored in a local vector database.
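As a rough illustration of the chunk-and-embed step, here is a minimal Python sketch. It is not AnythingLLM's actual code: the word-based chunk size, the overlap, and the `toy_embed` function (a hashed bag-of-words stand-in for a real local embedding model) are all assumptions chosen to keep the example self-contained.

```python
import hashlib
import math


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping chunks of `chunk_size` words (sizes are assumptions)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the document
    return chunks


def toy_embed(text, dim=64):
    """Hypothetical stand-in for an embedding model: hashed word counts, L2-normalized."""
    vec = [0.0] * dim
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def index_document(text, chunk_size=40, overlap=10):
    """Return a tiny in-memory 'vector store': one record per chunk."""
    return [
        {"text": chunk, "vector": toy_embed(chunk)}
        for chunk in chunk_text(text, chunk_size, overlap)
    ]
```

Overlapping the chunks is a common choice so that a sentence falling on a chunk boundary still appears whole in at least one chunk. A real embedding model captures meaning rather than exact words, which this toy version does not.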
When you ask a question, the same embedding process runs on your query. The system finds document chunks whose vectors are closest to your question, then passes those chunks to the AI model as context. You get an answer grounded in your actual documents — with source citations.
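The retrieval step can be sketched in a few lines of Python. This is an illustrative assumption, not AnythingLLM's implementation: `toy_embed` uses plain word counts in place of a real embedding model, the sample chunks and file names are invented, and the prompt wording is hypothetical.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors (1.0 means same direction)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Embed the question the same way as the chunks and return the k closest."""
    q = toy_embed(question)
    # A real system would look up precomputed chunk vectors in the vector database.
    scored = [(cosine(q, toy_embed(c["text"])), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question, hits):
    """Assemble the context passed to the model, keeping sources for citation."""
    context = "\n".join(
        f"[{i}] ({c['source']}) {c['text']}"
        for i, (_, c) in enumerate(hits, start=1)
    )
    return (
        "Answer using only the context below and cite sources by number.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


# Invented sample data for illustration only.
CHUNKS = [
    {"source": "handbook.pdf", "text": "Employees accrue vacation days every month."},
    {"source": "contract.pdf", "text": "The lease term is twelve months starting in January."},
    {"source": "sop.pdf", "text": "Backups run nightly to the office server."},
]
```

Calling `retrieve("How many vacation days do employees accrue?", CHUNKS, k=1)` ranks the handbook chunk first, and `build_prompt` carries its file name along so the answer can point back to the source.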
Who it's for
Anyone who needs to work with large document sets
- ✓ Legal professionals reviewing case files and contracts
- ✓ Researchers querying literature and source documents
- ✓ Businesses with SOPs, handbooks, and internal documentation
- ✓ Financial professionals analyzing reports and statements
- ✓ Anyone who wants to "chat" with a PDF rather than read it top to bottom
Unique advantage
Your documents never leave your computer. No uploading to OpenAI or Google. For professionals who handle sensitive documents — case files, financial records, patient notes — this is the fundamental difference.
Full stack
What gets installed
| Layer | Component | Purpose |
|---|---|---|
| AI Engine | Ollama (MLX backend) | Runs models on Apple Silicon |
| Agent Framework | AnythingLLM | RAG pipeline with vector database and document workspace |
| Chat UI | Open WebUI | Browser chat, always available |
| Networking | Tailscale | Secure access from anywhere |
| Vector Store | Local vector database | Indexes your documents for semantic search |
| Security | Hardened config | Loopback binding, no document transmission |
Security
Local index, local inference
AnythingLLM runs entirely on your Mac. Your documents are indexed locally — no cloud storage, no third-party vector database. The embedding model runs locally. The LLM runs locally. Nothing leaves your hardware during document processing or question answering.
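To make "loopback binding" concrete, the sketch below shows what it means for a service to listen only on the local machine. It uses Python's standard library and is a generic illustration under our own assumptions: the function names are hypothetical, and this is not how AnythingLLM or Ollama is actually configured.

```python
import ipaddress
import socket


def is_loopback_address(host):
    """Return True if `host` is only reachable from this machine."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; treat only the conventional name as loopback.
        return host == "localhost"


def bind_loopback_listener(port=0):
    """Open a TCP listener on 127.0.0.1 only (port 0 lets the OS pick a free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", port))
    sock.listen(1)
    return sock
```

A service bound to `127.0.0.1` accepts connections only from the same computer, whereas one bound to `0.0.0.0` listens on every network interface. That difference is what a loopback-only configuration relies on.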
Recommended models
Models that pair well
Fast and light — great for quick document questions on lower-RAM Macs
Excellent reasoning — strong at understanding nuanced document content
Best comprehension — handles complex legal, financial, and technical documents
Ready to set up AnythingLLM?
Book a consultation. We'll configure AnythingLLM on your Mac, set up your document workspaces, and hand it back ready to use.