Debugging a Hermes Telegram API Timeout Caused by a Missing Proxy
The Telegram bot kept showing "typing" without ever replying, while the terminal CLI using the exact same model worked perfectly at the same time. It turned out the launchd-started gateway process doesn't inherit the system proxy environment variables, and the proxy lines in .env were commented out — so the gateway connected directly to an overseas API and timed out.
Symptom
The user sent a message to the Hermes bot via Telegram:
- The bot showed "typing..." but never replied
- After several minutes, a "No response from provider for 180s" error arrived
- At the same time, the terminal CLI using the same model worked perfectly, at 5-10s per response
Error log:
⚠️ No response from provider for 180s (model: mimo-v2.5-pro, context: ~20,614 tokens). Reconnecting...
⚠️ API call failed (attempt 1/3): APITimeoutError
🔌 Provider: custom Model: mimo-v2.5-pro
🌐 Endpoint: https://YOUR-API-ENDPOINT/v1
📝 Error: Request timed out.
Investigation
Initial hypothesis (wrong direction)
At first I suspected that the API was unstable or that the streaming timeout was too short. The log showed "Stream stale for 180s (threshold 180s) — no chunks received", which pointed toward raising the threshold with HERMES_STREAM_STALE_TIMEOUT=300. But the user pointed out that the CLI worked perfectly with the same model, so a longer timeout would only have delayed the same failure.
Lesson
When the same API behaves differently across clients, the problem is almost certainly not the API but a client-side difference.
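One way to make a client-side difference concrete is to diff the environments of the two processes. The sketch below assumes you have already saved each environment as KEY=VALUE lines in a file (for the terminal, `env > cli.env`; how you capture the gateway's environment is up to you). The function names and file names are hypothetical and not part of Hermes.

```shell
# Hypothetical helper: print the variable names found in an env dump
# (a file of KEY=VALUE lines, such as the output of `env`).
env_names() {
    sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p' "$1" | LC_ALL=C sort -u
}

# Hypothetical helper: print variable names that are set in the first
# dump but absent from the second.
missing_in_second() {
    a=$(mktemp)
    b=$(mktemp)
    env_names "$1" > "$a"
    env_names "$2" > "$b"
    # comm -23 keeps lines that appear only in the first sorted file.
    LC_ALL=C comm -23 "$a" "$b"
    rm -f "$a" "$b"
}
```

Calling `missing_in_second cli.env gateway.env` lists what the working client has and the failing one lacks. In a case like this one, the proxy variables would be expected to show up in that list.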
Check the proxy config
The system environment had a proxy:
$ env | grep -i proxy
HTTP_PROXY=http://127.0.0.1:10808
HTTPS_PROXY=http://127.0.0.1:10808
ALL_PROXY=socks5h://127.0.0.1:10808
The proxy is Xray on port 10808. Testing the API through it worked:
$ curl --proxy http://127.0.0.1:10808 https://YOUR-API-ENDPOINT/v1/models
# HTTP 200 in 0.8s
The proxy itself was fine.
Telegram session stuck in a loop
The agent.log showed the Telegram session stuck in a loop: receive message, call the API, get killed after 180s of no data, gateway restarts and resumes the session, stuck for 180s again... Deleting the stuck session and restarting the gateway didn't help; new sessions got stuck the same way.
Comparing CLI vs Telegram API calls
Key observation: the CLI session's API call completed in 5-10s; the Telegram session's API call timed out at 180s. Same endpoint, same model, completely different behavior.
Gateway blocked
The gateway.log stopped producing output after a certain point: the cron ticker stopped printing, new messages stopped being recorded, but the CLI session's agent.log kept writing normally. This meant the gateway process's asyncio event loop was fully blocked by the stuck HTTP request.
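The "one log is silent while the other keeps writing" signal can be checked mechanically. This is a minimal sketch, assuming a 5-minute threshold is a reasonable definition of "silent" and that agent.log sits next to gateway.log; `log_is_stale` is a hypothetical helper, not a Hermes command.

```shell
# Hypothetical helper: exit 0 if FILE was last modified more than
# MINUTES minutes ago, exit 1 otherwise (including when FILE is missing).
log_is_stale() {
    file=$1
    minutes=$2
    # find prints the path only if its mtime is older than the threshold.
    [ -n "$(find "$file" -mmin +"$minutes" 2>/dev/null)" ]
}
```

A possible use: `log_is_stale ~/.hermes/logs/gateway.log 5 && ! log_is_stale ~/.hermes/logs/agent.log 5 && echo "gateway loop may be blocked"`. A quiet gateway.log alone is not proof of a blocked loop. The contrast with a log that is still being written is what makes it suspicious.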
The launchd environment difference (root cause)
Checking the launchd plist:
<key>EnvironmentVariables</key>
<dict>
<key>PATH</key>
<string>~/.hermes/hermes-agent/venv/bin:...</string>
<key>VIRTUAL_ENV</key>
<string>~/.hermes/hermes-agent/venv</string>
<key>HERMES_HOME</key>
<string>~/.hermes</string>
</dict>
No HTTP_PROXY, HTTPS_PROXY, or ALL_PROXY!
And the .env:
# HTTP_PROXY=http://127.0.0.1:10808 ← commented out!
# HTTPS_PROXY=http://127.0.0.1:10808 ← commented out!
# ALL_PROXY=socks5h://127.0.0.1:10808 ← commented out!
Case closed:
| Process | Started by | Proxy | API behavior |
|---|---|---|---|
| CLI | terminal, inherits system env | ✅ has proxy | 5-10s response |
| Gateway | launchd, no system env | ❌ no proxy | direct overseas connection, 180s timeout |
The API server is overseas; a direct connection from within China is blocked or extremely slow, so it must go through a proxy.
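The .env check can be scripted so that a commented-out proxy line is hard to overlook. The sketch below assumes the env file uses plain `NAME=value` lines (optionally prefixed with `export`); `commented_proxy_vars` is a hypothetical helper name.

```shell
# Hypothetical helper: print each proxy variable that has no active
# assignment in the env file but does appear as a commented-out line.
commented_proxy_vars() {
    envfile=$1
    for name in HTTP_PROXY HTTPS_PROXY ALL_PROXY; do
        # An active assignment exists, so this variable is fine.
        if grep -Eq "^[[:space:]]*(export[[:space:]]+)?${name}=" "$envfile"; then
            continue
        fi
        # No active assignment: report it if a commented-out one exists.
        if grep -Eq "^[[:space:]]*#[[:space:]]*${name}=" "$envfile"; then
            echo "$name"
        fi
    done
}
```

Running `commented_proxy_vars ~/.hermes/.env` against the file shown above would be expected to print all three variable names.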
Fix
Edit ~/.hermes/.env and uncomment the proxy:
HTTP_PROXY=http://127.0.0.1:10808
HTTPS_PROXY=http://127.0.0.1:10808
ALL_PROXY=socks5h://127.0.0.1:10808
Make sure NO_PROXY doesn't include domains that need the proxy:
NO_PROXY=localhost,127.0.0.1,::1
Then restart the gateway:
hermes gateway restart
Takeaways
Core lessons
- launchd doesn't inherit system environment variables: a process started by macOS launchd only gets the variables defined under EnvironmentVariables in the plist. Anything exported in the terminal, or even set in System Preferences, is not passed to launchd processes.
- Apart from the plist, .env is the gateway's only environment source: the gateway loads env vars from .env. If a variable is commented out there, the gateway process doesn't have it.
- Same API, different clients, different behavior -> check the client difference: it's not an API problem; the two clients have different network paths.
- gateway.log stops updating when the gateway is stuck: if gateway.log suddenly has no new entries (but agent.log keeps writing), the gateway's main loop is blocked.
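The first lesson can be turned into a rough check on the plist itself. This sketch assumes an XML-format plist and does a plain text match for `<key>NAME</key>` anywhere in the file, so it does not verify that the key sits under EnvironmentVariables; a real plist parser would be more robust. `plist_missing_proxy_keys` is a hypothetical helper name.

```shell
# Hypothetical helper: print each proxy variable name that does not
# appear as a <key> anywhere in an XML plist file.
plist_missing_proxy_keys() {
    plist=$1
    for name in HTTP_PROXY HTTPS_PROXY ALL_PROXY; do
        # Text match only: this does not parse the plist structure.
        grep -q "<key>${name}</key>" "$plist" || echo "$name"
    done
}
```

For the plist shown earlier, `plist_missing_proxy_keys ~/Library/LaunchAgents/ai.hermes.gateway.plist` would be expected to list all three names, which means the proxy has to come from .env instead.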
Quick diagnostic checklist
When the Telegram channel is unresponsive but the CLI works:
# 1. Is the gateway stuck? (does gateway.log have new entries?)
tail -f ~/.hermes/logs/gateway.log
# 2. Check the proxy config
grep -i proxy ~/.hermes/.env
# 3. Check the system env proxy
env | grep -i proxy
# 4. Compare: direct vs proxied API test
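# (-s -o /dev/null prints only the status and timing; --max-time keeps a blocked direct connection from hanging)
curl -s -o /dev/null --max-time 30 -w "%{http_code} %{time_total}s\n" https://YOUR-API-ENDPOINT/v1/models
curl -s -o /dev/null --max-time 30 --proxy http://127.0.0.1:10808 -w "%{http_code} %{time_total}s\n" https://YOUR-API-ENDPOINT/v1/models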
# 5. Check launchd env vars
grep -A 20 EnvironmentVariables ~/Library/LaunchAgents/ai.hermes.gateway.plist
Proxy config best practice
If you need a proxy, set it explicitly in .env; don't rely on the system environment:
# ~/.hermes/.env
HTTP_PROXY=http://127.0.0.1:10808
HTTPS_PROXY=http://127.0.0.1:10808
ALL_PROXY=socks5h://127.0.0.1:10808
NO_PROXY=localhost,127.0.0.1,::1 # add domestically reachable services here
Don't comment out the proxy config unless you're sure every API is directly reachable domestically.
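Before adding entries to NO_PROXY, it can help to confirm that the API host would not be caught by the list. The sketch below assumes simple exact and domain-suffix matching plus a `*` wildcard. Real HTTP clients each implement NO_PROXY slightly differently, so treat the result as an approximation. `host_in_no_proxy` and the example.com hosts are hypothetical.

```shell
# Hypothetical helper: exit 0 if HOST matches an entry in a comma-separated
# NO_PROXY-style LIST (exact match, domain suffix, or "*"), else exit 1.
host_in_no_proxy() {
    host=$1
    # Run in an explicit subshell so `exit` only ends the check itself.
    printf '%s\n' "$2" | tr ',' '\n' | (
        while IFS= read -r entry; do
            # Trim surrounding spaces and a leading dot (".example.com").
            entry=$(printf '%s' "$entry" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//; s/^\.//')
            [ -z "$entry" ] && continue
            [ "$entry" = "*" ] && exit 0
            case $host in
                "$entry"|*."$entry") exit 0 ;;
            esac
        done
        exit 1
    )
}
```

For example, `host_in_no_proxy api.example.com "localhost,127.0.0.1,::1"` fails (the host would still use the proxy), while adding `.example.com` to the list makes it succeed, which is exactly the situation to avoid for an API that needs the proxy.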
