Agent Framework

AnythingLLM

Chat with your documents

AnythingLLM is a local RAG (Retrieval-Augmented Generation) system. Upload your documents — PDFs, Word files, spreadsheets — and ask questions about them. The AI finds relevant passages and answers based on your actual documents, not training data.

Book a consultation to set up AnythingLLM →

What it is

Your documents. Your questions. No cloud upload required.

AnythingLLM creates a local knowledge base from your files. You upload documents into workspaces — a workspace for a client, a workspace for a project, a workspace for your SOPs — and ask questions in plain language.

The system uses a technique called RAG: it finds the most relevant passages in your documents and feeds them to the AI as context. The AI then answers from your documents rather than from general training data alone. Because the model is grounded in your text, hallucinations become much less likely, though not impossible, so it is still worth checking an answer against the passages it cites.
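To make "feeds them to the AI as context" concrete, here is a minimal sketch of how retrieved passages might be assembled into a prompt. The function name `build_prompt`, the instruction wording, and the sample passages are illustrative assumptions, not AnythingLLM's actual template.

```python
def build_prompt(question, passages):
    """Assemble a grounded prompt from retrieved passages.

    `passages` is a list of (source_name, text) pairs. The wording of the
    instructions is an assumption made for illustration only.
    """
    # Number each passage so the model can cite it by index.
    context = "\n\n".join(
        f"[{i}] ({source}) {text}"
        for i, (source, text) in enumerate(passages, start=1)
    )
    return (
        "Answer the question using only the numbered passages below. "
        "If the passages do not contain the answer, say so.\n\n"
        f"Passages:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Made-up sample data, for illustration only.
sample_passages = [
    ("lease.pdf", "The tenant must give sixty days written notice before moving out."),
    ("lease.pdf", "The security deposit is returned within thirty days of move-out."),
]
print(build_prompt("How much notice must the tenant give?", sample_passages))
```

Numbering the passages is what makes source citations possible: the model can point to "[1]" and the interface can map that back to a file.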

Everything runs locally. Your documents are indexed and stored on your Mac. Nothing is transmitted to a third-party AI provider.

How it works

Upload, embed, ask

When you upload a document, AnythingLLM breaks it into chunks and converts them into numerical vectors using a local embedding model. These vectors are stored in a local vector database.

When you ask a question, the same embedding model converts your query into a vector. The system finds the document chunks whose vectors are closest to the query vector, then passes those chunks to the AI model as context. You get an answer grounded in your actual documents, with source citations.
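The upload, embed, ask flow described above can be sketched in a few lines. This is a toy illustration under stated assumptions: the `embed` function here is a simple word-count vector standing in for a real neural embedding model, the "vector database" is a plain Python list, and all function names and sample documents are hypothetical. It is not how AnythingLLM is implemented internally.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=40, overlap=10):
    """Split text into overlapping chunks of up to `size` words."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break
        start += size - overlap
    return chunks


def embed(text):
    """Toy embedding (assumption): sparse word counts instead of a neural model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(documents):
    """Chunk and embed every document; returns (source, chunk, vector) entries."""
    return [
        (name, chunk, embed(chunk))
        for name, text in documents.items()
        for chunk in chunk_text(text)
    ]


def retrieve(index, question, k=2):
    """Return the k chunks most similar to the question, with their sources."""
    query = embed(question)
    ranked = sorted(index, key=lambda entry: cosine(query, entry[2]), reverse=True)
    return [(name, chunk) for name, chunk, _ in ranked[:k]]


# Made-up sample documents, for illustration only.
docs = {
    "handbook.txt": "Employees accrue vacation days monthly. Unused vacation days expire in March.",
    "expenses.txt": "Submit expense reports within thirty days. Attach receipts to every report.",
}
index = build_index(docs)
print(retrieve(index, "When do vacation days expire?", k=1))
```

Keeping the source name next to each chunk is what lets the final answer cite the file it came from. A real embedding model would also match on meaning, not just shared words.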

Who it's for

Anyone who needs to work with large document sets

✓ Legal professionals reviewing case files and contracts
✓ Researchers querying literature and source documents
✓ Businesses with SOPs, handbooks, and internal documentation
✓ Financial professionals analyzing reports and statements
✓ Anyone who wants to "chat" with a PDF rather than read it top to bottom

Unique advantage

Your documents never leave your computer. No uploading to OpenAI or Google. For professionals who handle sensitive documents (case files, financial records, patient notes), this is the fundamental difference from cloud-hosted AI tools.

Full stack

What gets installed

Layer	Component	Purpose
AI Engine	Ollama (MLX backend)	Runs models on Apple Silicon
Agent Framework	AnythingLLM	RAG pipeline with vector database and document workspace
Chat UI	Open WebUI	Browser chat, always available
Networking	Tailscale	Secure access from anywhere
Vector Store	Local vector database	Indexes your documents for semantic search
Security	Hardened config	Loopback binding, no document transmission

Security

Local index, local inference

AnythingLLM runs entirely on your Mac. Your documents are indexed locally — no cloud storage, no third-party vector database. The embedding model runs locally. The LLM runs locally. Nothing leaves your hardware during document processing or question answering.
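The "loopback binding" listed in the stack means a service listens only on the machine's own loopback address (127.0.0.1 or ::1), so other devices cannot connect to it directly. As a rough sketch, a bind address can be checked like this. The helper name `is_loopback_bind` is hypothetical, and treating the name `localhost` as loopback is an assumption. The helper does not read any AnythingLLM or Ollama configuration.

```python
import ipaddress


def is_loopback_bind(host):
    """Return True if `host` is a loopback address such as 127.0.0.1 or ::1."""
    if host == "localhost":
        # Assumption: localhost resolves to a loopback address on this machine.
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not a literal IP address, so it cannot be confirmed as loopback.
        return False


for candidate in ["127.0.0.1", "::1", "0.0.0.0", "192.168.1.10"]:
    print(candidate, is_loopback_bind(candidate))
```

An address of 0.0.0.0 means "listen on every network interface", which is the setting a hardened local setup avoids.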

Recommended models

Models that pair well

Llama 3.2 3B

Fast and light — great for quick document questions on lower-RAM Macs

Phi-4 14B

Excellent reasoning — strong at understanding nuanced document content

Qwen 3 32B

Best comprehension — handles complex legal, financial, and technical documents

Ready to set up AnythingLLM?

Book a consultation. We'll configure AnythingLLM on your Mac, set up your document workspaces, and hand it back ready to use.
