Question 1

Is local AI actually private, or does it phone home?

Accepted Answer

Your prompts and documents stay on your machine. But the app running the model (Ollama, LM Studio) has normal network access, and some collect anonymous telemetry by default. Turn off the tool's telemetry and keep it off the internet for genuine privacy, then pull the cable and it still answers.

Question 2

Can a local model use my own documents?

Accepted Answer

Yes. You point it at your files and it answers from them, cited to the page, instead of guessing from its training. A grounded mid-size model reading your actual documents beats a giant one that has never seen them. This is called RAG.

Question 3

What hardware do I need to run a local LLM?

Accepted Answer

Less than you'd think to start. A modest gaming GPU (8-16GB) runs small models fine for learning. A 24GB card runs a 32B model comfortably. Big unified-memory boxes load huge models slowly. Don't overbuy for a 70B you won't use day-to-day.

Question 4

Do I need internet to run a local model?

Accepted Answer

Only once, to download the model. After that it runs fully offline. Air-gap it and it keeps working; a dropped connection doesn't take your AI down with it.

Question 5

Should I use Ollama or LM Studio?

Accepted Answer

Both run the same models. LM Studio is the friendly GUI, best for your first week. Ollama is what most people settle on when they want it scriptable: a REST API on by default and headless on a server. Start with LM Studio, move to Ollama when you want to wire it into things.

Question 6

Is a local LLM hard to set up?

Accepted Answer

The easy 90% is genuinely easy: install Ollama or LM Studio, download a model, start chatting, no terminal required. The hard 10% is squeezing specific hardware to its limit, and most people never need to touch that.

Question 7

Why does my local model forget what I told it a minute ago?

Accepted Answer

Almost never the model: it's the context window. Out of the box Ollama defaults context to 2048 tokens (num_ctx), so once your chat runs past that it drops the oldest tokens. Raise it in one line (/set parameter num_ctx 8192, or the context-length slider in LM Studio), within what your VRAM allows.

Question 8

Which model should I run, and is it as good as ChatGPT?

Accepted Answer

It's a different tool, not a worse one. The answer is always two or three names that rotate every few months; in mid-2026 most people run the Qwen 3.6 pair (27B dense for quality, 35B-A3B MoE for speed) or Mistral Small for lean. A fast 32B does real work; the cloud still wins on frontier reasoning, local wins on ownership, privacy, and your own documents.

Question 9

What is a local LLM actually good for?

Accepted Answer

Drafting and rewriting, summarizing long documents, answering from your own files, coding help, cleaning up messy data, and running quiet automations, all day for the cost of electricity.

Question 10

Why not just use ChatGPT?

Accepted Answer

Ownership and privacy. No subscription, no rate limits, no policy change breaking your setup, and your sensitive data never leaves the building. If that doesn't matter to you, use the cloud; for real business work it usually does.

Question 11

How do I start with local AI?

Accepted Answer

Install LM Studio or Ollama, pull a small model, and point it at a few of your own documents. An hour with it on your own files tells you more than any benchmark.

Local AI, straight answers

Is it actually private, or does it phone home?

Can it use my own documents?

What hardware do I need?

Do I need internet?

Which app — Ollama or LM Studio?

Is it hard to set up?

Why does it forget what I told it a minute ago?

Which model should I run? Is it as good as ChatGPT?

What's it actually good for?

Why not just use ChatGPT, then?

How do I start?