TradingAgents is a multi-agent LLM trading research framework—analyst teams, bull/bear debates, trader, risk, portfolio—built on LangGraph. Running it with Ollama on a Mac mini M4 in 2026 keeps inference local at http://localhost:11434/v1 so you avoid per-token cloud API bills ($0 marginal LLM spend after hardware). This guide covers install, llm_provider: "ollama", dual-model RAM splits, Docker vs native, and staging on a rented cloud Mac mini (~$16.9/day).
Pair with OpenClaw webhooks and Ollama for automation glue and Mac mini M4 cloud power for hardware context. Official references: TradingAgents on GitHub, Ollama docs, and Apple Mac mini specs.
Disclosure: MacHTML provides the cloud Mac mini rental service referenced in this article.
Why Ollama on M4 for TradingAgents
Each ticker analysis may trigger 4–8 LLM calls across debate rounds. Cloud APIs multiply cost; Ollama on Apple Silicon serves OpenAI-compatible chat at port 11434. TradingAgents v0.2.5+ accepts llm_provider: "ollama" with optional OLLAMA_BASE_URL for a remote host.
Mac mini M4 idle power commonly sits near 6–12 W—cheaper than leaving a laptop pegged at full fan during overnight batch runs.
Hardware and RAM tiers
| Unified RAM | deep_think_llm | quick_think_llm | Guardrails |
|---|---|---|---|
| 16 GB | llama3.1:8b | llama3.2:3b | max_debate_rounds=1 |
| 24 GB | qwen2.5:14b | llama3.2:3b | Default research tier |
| 32 GB+ | qwen2.5:32b (Q4) | llama3.1:8b | Longer analyst chains |
Apple lists up to 32 GB unified memory on the Mac mini specs page.
Install TradingAgents and Ollama
brew install ollama
ollama pull llama3.1:8b
ollama pull llama3.2:3b
git clone https://github.com/TauricResearch/TradingAgents.git
cd TradingAgents
python3.13 -m venv .venv && source .venv/bin/activate
pip install .
Ollama model split strategy
deep_think_llm handles synthesis and debates; quick_think_llm handles fast routing. Do not assign the same large quant to both slots on 16 GB—you will OOM with exit code 137.
Configure llm_provider ollama
config = DEFAULT_CONFIG.copy()
config["llm_provider"] = "ollama"
config["deep_think_llm"] = "llama3.1:8b"
config["quick_think_llm"] = "llama3.2:3b"
config["max_debate_rounds"] = 1
ta = TradingAgentsGraph(debug=True, config=config)
_, decision = ta.propagate("AAPL", "2026-05-20")
Remote Ollama on another Mac mini: export OLLAMA_BASE_URL=http://10.0.0.5:11434/v1.
CLI and first analysis run
tradingagents # pick Ollama + model IDs interactively
Expect 5–20 minutes per ticker on M4 16 GB with 8B/3B splits—latency trades off against $0 token spend.
Docker profile ollama
docker compose --profile ollama up -d --build
docker compose --profile ollama run --rm tradingagents-ollama
Native Ollama + venv is usually faster on Apple Silicon than Docker for daily research loops.
Cloud Mac mini staging
Laptops thermal-throttle during hour-long graphs. Renting an Apple Silicon Mac mini gives 24/7 Ollama uptime and SSH batch runs for about $16.9/day on published MacHTML pricing—less than buying idle metal for a two-week experiment.
For a UI-first desktop agent with 20-minute auto-fetch into an Obsidian vault (not only trading graphs), read OpenHuman on macOS and Mac mini M4 on the same cloud Mac mini node.
FAQ
Does zero API cost mean zero spend?
You avoid per-token LLM bills; electricity, hardware, or cloud rent still apply.
Can TradingAgents run without cloud LLM keys?
With ollama, LLM API keys are optional; some market data providers may still need keys—see .env.example in the repo.
Is this financial advice?
No—TradingAgents is a research framework; read the project disclaimer before acting on outputs.
Planning a hardware step-up for local LLM throughput? See our M5 Pro Fusion vs M6 2nm decision matrix for memory bandwidth tiers and buy-now vs wait guidance.
If 64 GB Mac mini lead times block your local LLM stack, compare WWDC 2026 M5 Mac mini DRAM delays and the 8-row wait-vs-rent matrix before you commit to TradingAgents hardware.
When Ollama agent turns stall on megabyte tool JSON—not API cost—route traffic through Headroom before the model. See our Headroom + Ollama local latency runbook for 60–95% context shrink on Mac mini.
Run TradingAgents 24/7 on real macOS
Rent a cloud Mac mini M4 to keep Ollama hot, batch tickers over SSH, and avoid laptop thermal throttling.