Friday, January 23, 2026
Running Claude Code Locally
Users are discussing how to run Claude Code with local, open-source models such as GLM-4.7-Flash, served through tools like Ollama and llama-server that expose Anthropic-compatible API endpoints.
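Under the hood this works because Claude Code speaks the Anthropic Messages API, so any local server exposing a compatible /v1/messages route can stand in for api.anthropic.com. A minimal sanity check, assuming the server is Ollama on its default port 11434 and that glm-4.7-flash is the locally pulled model tag (both are assumptions here, not details the posts confirm):

    # Probe the local server for an Anthropic-compatible Messages endpoint.
    # Port, API key value, and model tag are all placeholders.
    curl http://localhost:11434/v1/messages \
      -H "content-type: application/json" \
      -H "x-api-key: ollama" \
      -H "anthropic-version: 2023-06-01" \
      -d '{
        "model": "glm-4.7-flash",
        "max_tokens": 128,
        "messages": [{"role": "user", "content": "Say hello."}]
      }'

A JSON reply containing a content array means the endpoint is compatible; a 404 means the server only offers the OpenAI-style /v1/chat/completions route and would need a translating proxy in front of it.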
Quick tutorial for hosting your favorite LLM (GLM-4.7-Flash) locally and using it in Claude Code:
- Install Ollama (the easy way; vLLM is an alternative).
- Modify the Ollama .conf by adding OLLAMA_NUM_PARALLEL=4 to allow multiple sessions.
- export ANTHROPIC_AUTH_TOKEN=ollama
- …
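Filling in the glue around those steps, a sketch assuming a systemd-managed Ollama install on Linux and Claude Code's documented ANTHROPIC_BASE_URL / ANTHROPIC_AUTH_TOKEN / ANTHROPIC_MODEL overrides; the model tag is illustrative. The .conf mentioned above would be the systemd drop-in:

    # /etc/systemd/system/ollama.service.d/override.conf
    # (created via: sudo systemctl edit ollama)
    [Service]
    # Serve several requests concurrently; Claude Code issues
    # parallel tool and subagent calls.
    Environment="OLLAMA_NUM_PARALLEL=4"

Then restart the server, pull a model, and point Claude Code at it:

    sudo systemctl daemon-reload && sudo systemctl restart ollama
    ollama pull glm-4.7-flash                # illustrative model tag

    # Redirect Claude Code from api.anthropic.com to the local server.
    export ANTHROPIC_BASE_URL=http://localhost:11434
    export ANTHROPIC_AUTH_TOKEN=ollama       # any non-empty placeholder
    export ANTHROPIC_MODEL=glm-4.7-flash
    claude

The token value is a placeholder: a local Ollama server doesn't check it, but Claude Code expects one to be set.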
I don’t either, but it’s the easiest way to get local models working with Claude Code, at least until LM Studio adds Anthropic-friendly endpoints.
How to Run Claude Code Locally (100% Free & Fully Private) https://x.com/dr_cintas/status/2014380662300533180… Not sure whether gemini-cli can do this, but honestly, switching to a different model will probably make a huge difference in performance, right?