Daily Newsletter

#089 Claude Code
Memory tool is just-in-time retrieval — not an upfront context dump at startup
The API memory tool is designed for on-demand retrieval: store what's learned, pull back only what's relevant for the current task. Agents that load all memory at boot are using it wrong.
"This is the key primitive for just-in-time context retrieval: rather than loading all relevant information upfront, agents store what they learn in memory and pull it back on demand."
↗ Source
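The store-then-retrieve loop can be sketched as a small client-side handler: the model emits memory tool calls, and your harness executes them against local files and returns the result. Command names here (`view`, `create`) and the `/memories` path prefix follow Anthropic's documented file-based command set, but treat the exact input shapes as assumptions to verify against your SDK version — this is a sketch, not the implementation.

```python
from pathlib import Path

class FileMemory:
    """File-backed store for client-executed memory tool calls.

    Handles the two commands needed for store/retrieve; a real harness
    would also cover str_replace, insert, delete, and rename.
    """

    def __init__(self, root: str):
        self.root = Path(root).resolve()
        self.root.mkdir(parents=True, exist_ok=True)

    def _resolve(self, path: str) -> Path:
        # Strip the /memories prefix the model uses, and keep everything
        # under our root so a crafted path can't escape it.
        rel = path.removeprefix("/memories").lstrip("/")
        full = (self.root / rel).resolve()
        if not str(full).startswith(str(self.root)):
            raise ValueError(f"path escapes memory root: {path}")
        return full

    def execute(self, command: dict) -> str:
        """Dispatch one memory tool_use input; return the tool result string."""
        if command["command"] == "view":
            target = self._resolve(command["path"])
            if target.is_dir():
                return "\n".join(sorted(p.name for p in target.iterdir()))
            return target.read_text()
        if command["command"] == "create":
            target = self._resolve(command["path"])
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(command["file_text"])
            return f"created {command['path']}"
        raise ValueError(f"unsupported command: {command['command']}")
```

On each agent turn, route any memory tool_use block to `execute()` and send the returned string back as the tool result — the model then pulls back only the files relevant to the current task, which is the just-in-time part.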
#090 Claude Code
Bootstrap memory before work begins — ad hoc accumulation creates recovery failures
Run a dedicated initializer session that creates structured memory: progress log, feature checklist, startup script reference. Sessions without this spend turns re-discovering already-known facts.
"Initializer session: The first session sets up the memory artifacts before any substantive work begins. This includes a progress log, a feature checklist, and a reference to any startup or initialization script."
↗ Source
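The initializer pattern is easy to script so every project starts from the same structured layout. A sketch — the file names and templates below are illustrative conventions, not anything the tool prescribes:

```python
from pathlib import Path

# Memory artifacts an initializer session would create before any
# substantive work; names and contents are a convention, not an API.
ARTIFACTS = {
    "progress.md": "# Progress log\n\n(append one dated entry per session)\n",
    "features.md": "# Feature checklist\n\n- [ ] (populate during planning)\n",
    "startup.md": "# Startup\n\n(reference your init/startup script here)\n",
}

def bootstrap_memory(root: str) -> list[str]:
    """Create missing memory artifacts; never overwrite notes from
    earlier sessions. Returns the names of files actually created."""
    base = Path(root)
    base.mkdir(parents=True, exist_ok=True)
    created = []
    for name, template in ARTIFACTS.items():
        path = base / name
        if not path.exists():
            path.write_text(template)
            created.append(name)
    return created
```

Because it skips existing files, the same bootstrap can run at the top of every session: the first run creates the artifacts, later runs are no-ops, and no session starts without the recovery scaffolding in place.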
#091 Claude Code
Memory tool + compaction = theoretically unbounded agent sessions
When context fills, Claude writes key state to memory files. After compaction, it retrieves from memory on demand. This enables agents that run indefinitely across sessions without losing task state.
"compaction keeps the active context manageable without client-side bookkeeping, and memory persists important information across compaction boundaries so that nothing critical is lost."
↗ Source
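The control flow behind "theoretically unbounded" is a loop: watch context usage, flush durable state to memory before compacting, then keep stepping. A runnable sketch with a fake agent runtime standing in for the real SDK (all the methods and thresholds here are stand-ins, not real API surface):

```python
class FakeAgent:
    """Minimal stand-in for an agent runtime: each turn costs tokens,
    compaction replaces old turns with a cheap summary, and memory
    is a dict keyed by file path."""

    def __init__(self, context_limit=100, turns_needed=12):
        self.context_limit = context_limit
        self.turns_needed = turns_needed
        self.tokens = 0
        self.turns = 0
        self.memory = {}
        self.compactions = 0

    def done(self):
        return self.turns >= self.turns_needed

    def step(self):
        self.turns += 1
        self.tokens += 20   # pretend each turn costs 20 tokens

    def compact(self):
        self.compactions += 1
        self.tokens = 10    # summary replaces the old turns


def run_with_compaction(agent):
    """The pattern: before compacting, write key state to memory so it
    survives the compaction boundary; after, retrieve on demand."""
    while not agent.done():
        if agent.tokens > agent.context_limit * 0.8:
            agent.memory["/memories/state.md"] = f"completed {agent.turns} turns"
            agent.compact()
        agent.step()
    return agent
```

The point of the ordering is the quote's guarantee: the memory write happens before compaction, so even though compaction discards old turns, nothing critical is lost — and the loop itself never has to terminate for context reasons.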
#092 Claude Code
Run /context to see what's consuming your context window before optimizing
/context shows a breakdown by source: conversation, files, CLAUDE.md, MCP server tool definitions, skills. MCP servers can silently consume thousands of tokens in tool schema definitions.
"Run /context to see what's using space. MCP servers add tool definitions to every request, so a few servers can consume significant context before you start working. Run /mcp to check per-server costs."
↗ Source
#093 Claude Code
Memory tool is client-side โ€” Anthropic stores nothing; ZDR means zero post-response retention
The memory tool stores data in your infrastructure, not on Anthropic's servers. With Zero Data Retention arrangements, data isn't retained after the API response returns.
"The memory tool operates client-side: you control where and how the data is stored through your own infrastructure. This feature is eligible for Zero Data Retention (ZDR)."
↗ Source
#094 Claude Code
resumeSessionAt: messageId forks a session from any historical checkpoint
The SDK supports forking from any message ID in a session's history — not just the end. Try alternative approaches from a known-good point without re-running all prior work.
"Resume from a checkpoint in the conversation: resumeSessionAt: messageId // message.message.id from SDKAssistantMessage"
↗ Source
#095 Codex
stream_idle_timeout_ms and retry counts are independently configurable per provider
In [model_providers.openai], set stream_idle_timeout_ms (default 300,000ms = 5 min) independently from stream_max_retries. Complex code generation regularly hits stream idle timeouts.
"[model_providers.openai] request_max_retries = 4 stream_max_retries = 10 stream_idle_timeout_ms = 300000"
↗ Source
#096 Codex
Set model_context_window explicitly for proxies and custom deployments
With custom providers or LLM proxies, Codex may auto-detect the wrong context window and truncate unnecessarily. Set model_context_window = 128000 in the provider config block to override.
"model_context_window = 128000 # Context window size"
↗ Source
#097 Codex
model_reasoning_summary = "none" suppresses thinking output from reasoning models
Set this to suppress thinking summaries or "low" to shorten them. Pair with model_verbosity = "low" for shorter output. Only applies to Responses API providers — Chat Completions providers ignore it.
"model_reasoning_summary = 'none' # Disable summaries model_verbosity = 'low' # Shorten responses"
↗ Source
#098 Both
LLM proxy routing gives teams centralized cost tracking, logging, and budget enforcement
Claude Code: set ANTHROPIC_VERTEX_BASE_URL. Codex: configure a [model_providers.proxy] block. A central proxy gives the whole team unified logging, per-project cost attribution, and budget alerts.
Codex: "[model_providers.proxy] name = 'OpenAI using LLM proxy' base_url = 'http://proxy.example.com' env_key = 'OPENAI_API_KEY'"
↗ Source
#099 Both
Session transcripts are local JSONL files — parse them for cost analysis and custom tooling
Claude Code stores sessions at ~/.local/share/claude/sessions/. Codex stores runs at ~/.codex/sessions/. Both are machine-readable JSONL. Parse externally to build cost dashboards or custom memory tools.
"Check ~/.codex/log/codex-tui.log (or the most recent session-*.jsonl file if you enabled session logging) after a session if you need to audit which instruction files Codex loaded."
↗ Source
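Because the transcripts are one JSON object per line, a cost summary is a short script. A sketch that assumes token counts, when present, sit under a `usage` object with `input_tokens` / `output_tokens` keys — field names vary by tool and version, so inspect a few lines of your own files and adjust the keys before trusting the totals:

```python
import json
from pathlib import Path

def summarize_session(path: str) -> dict:
    """Tally events and token usage from one session transcript.

    Assumes one JSON object per line and an optional `usage` dict with
    input_tokens / output_tokens; adapt the keys to your files.
    """
    totals = {"events": 0, "input_tokens": 0, "output_tokens": 0}
    for line in Path(path).read_text().splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        event = json.loads(line)
        totals["events"] += 1
        usage = event.get("usage") or {}
        totals["input_tokens"] += usage.get("input_tokens", 0)
        totals["output_tokens"] += usage.get("output_tokens", 0)
    return totals
```

Point it at each `*.jsonl` under the session directories and multiply by your per-token rates to get a per-session cost dashboard — no vendor API required.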
#100 Both
Both platforms expose programmatic analytics APIs for org-wide usage reporting
Claude Code has the Claude Code Analytics API (in Anthropic admin docs) for programmatic org-level session usage. Codex exposes usage through the OpenAI admin API. Build cost allocation dashboards programmatically.
"Admin API overview ยท Data residency ยท Workspaces ยท Usage and Cost API ยท Claude Code Analytics API ยท Zero Data Retention"
↗ Source