Daily Newsletter

#101 Claude Code
'think' < 'think hard' < 'think harder' < 'ultrathink' — mapped to real thinking budgets
These specific phrases map directly to increasing levels of thinking budget in Claude Code. ultrathink is not a metaphor — it allocates progressively more compute for planning. Use it for architecture and hard debugging.
"We recommend using the word think to trigger extended thinking mode… 'think' < 'think hard' < 'think harder' < 'ultrathink'. Each level allocates progressively more thinking budget."
↗ Source
#102 Both
Ask Claude/Codex to 'interview you' before planning — turns fuzzy ideas into concrete specs
If you have a rough idea but aren't sure how to describe it well, ask the agent to question you first and challenge your assumptions. The interview output becomes the spec. Much better than a bad prompt.
"Ask Codex to interview you: If you have a rough idea of what you want but aren't sure how to describe it well, ask Codex to question you first. Tell it to challenge your assumptions and turn the fuzzy idea into something concrete before writing code."
↗ Source
#103 Claude Code
Include four things in every complex prompt: context, task, constraints, definition of done
A good default structure: which files/folders matter, what to do, what not to do, and what 'done' looks like. This four-part structure consistently outperforms prose descriptions.
"A good default is to include four things in your prompt: Context (which files, folders, docs matter), Task (what to do), Constraints (boundaries), and Verification (what 'good' looks like). This provides the clearest path to a useful result."
↗ Source
#104 Both
Tell the agent what 'done' looks like — it can only verify against what you define
Without a definition of done in the prompt or AGENTS.md/CLAUDE.md, the agent guesses and stops at the wrong point. Define done as: tests pass, no new deps added, no unrelated files modified.
"Don't stop at asking Codex to make a change. Ask it to create tests when needed, run the relevant checks, confirm the result, and review the work before you accept it. Codex can do this loop for you, but only if it knows what 'good' looks like."
↗ Source
#105 Both
Use Plan mode (Codex) / the 'don't code yet' pattern (Claude) before implementation
Plan mode lets the agent gather context, ask clarifying questions, and build a stronger plan before implementation. Explicitly asking the agent not to code yet prevents premature solutions and improves architectural quality.
"Use Plan mode: For most users, this is the easiest and most effective option. Plan mode lets Codex gather context, ask clarifying questions, and build a stronger plan before implementation."
↗ Source
#106 Codex
Batch all file reads together — never read files one by one unless logically unavoidable
Before any tool call, decide which files you will need and read them in one parallel batch. Sequential reads when parallel reads are possible are a common performance anti-pattern in agentic workflows.
"Batch everything. If you need multiple files (even from different places), read them together. Only make sequential calls if you truly cannot know the next file without seeing a result first."
↗ Source
#107 Claude Code
Challenge Claude to prove its solution works before accepting it
After a fix, say 'prove to me this works' and have Claude diff between main and your branch. Or: 'grill me on these changes and don't make a PR until I pass your test.' This forces real verification, not assumed success.
"challenge Claude — 'grill me on these changes and don't make a PR until I pass your test.' or 'prove to me this works' and have Claude diff between main and your branch."
↗ Source
#108 Claude Code
When Claude goes off-track, use Esc Esc or /rewind — don't try to fix it in the same context
Pressing Esc twice or using /rewind undoes the last turn. Trying to correct a wrong direction in the same context often makes it worse. Rewind to the last good point and re-approach.
"use Esc Esc or /rewind to undo when Claude goes off-track instead of trying to fix it in the same context."
↗ Source
#109 Claude Code
At 70% context usage Claude loses precision; at 85% hallucinations increase; 90%+ is erratic
Empirical thresholds from practitioner research: 0-50% = work freely; 50-70% = start thinking about /compact; 70-90% = run /compact; 90%+ = /clear is mandatory. These thresholds apply across model sizes.
"At 70% context, Claude starts losing precision. At 85%, hallucinations increase. At 90%+, responses become erratic. Strategy: 0-50% (work freely). 50-70% (attention). 70-90% (/compact). 90%+ (/clear mandatory)."
↗ Source
#110 Claude Code
Run /compact manually at 50% to avoid the agent's 'dumb zone'
Don't wait for auto-compaction — proactively compact at 50% to maintain quality. Vanilla Claude Code with properly managed context beats any elaborate workflow built on smaller tasks.
"avoid agent dumb zone, do manual /compact at max 50%. Use /clear to reset context mid-session if switching to a new task."
↗ Source
#111 Both
Start with a minimal spec and expand — don't front-load all context
Ask the agent to explore your codebase, then build the spec iteratively. Front-loading everything consumes context budget before work starts. Context is infrastructure — manage it like a resource.
"Context is infrastructure. Claude Code automatically pulls context from your environment. Rather than treating Claude as a chatbot, the core insight is: Claude Code works best when treated like a junior engineer with tools, memory, and iteration."
↗ Source
#112 Codex
Codex ran 25 hours uninterrupted on a long-horizon task using 13M tokens — context compaction is the key unlock
Codex ran a 25-hour continuous session generating 30k lines of code by using context compaction throughout. Long-horizon reliability is not just about model capability — it's about systematic context management.
"Codex ran for about 25 hours uninterrupted, used about 13M tokens, and generated about 30k lines of code. This performed well on the parts that matter for long-horizon work: following the spec, staying on task, running verification, and repairing failures."
↗ Source
#113 Claude Code
Writer/Reviewer pattern: one Claude writes code, a fresh Claude reviews it
A fresh context improves code review because Claude won't be biased toward code it just wrote. Use one session for implementation, then open a second session with only the diff as context for review.
"A fresh context improves code review since Claude won't be biased toward code it just wrote. For example, use a Writer/Reviewer pattern. You can do something similar with tests: have one Claude write tests, then another write code to pass them."
↗ Source
#114 Claude Code
TDD with agents: have one Claude write failing tests, another write code to pass them
This is a powerful quality pattern. The test-writer has no knowledge of the implementation it's about to test, so it writes genuinely independent tests. The implementer's only goal is to pass them.
"have one Claude write tests, then another write code to pass them. Loop through tasks calling claude -p for each. Use --allowedTools to scope permissions for batch operations."
↗ Source
#115 Claude Code
Large migrations: have Claude list all files, then loop a claude -p invocation per file
For large-scale migrations, have Claude list all files needing changes, then spawn a separate Claude invocation per file. Faster and more reliable than one long sequential session losing context.
"Have Claude list all files that need migrating. for file in $(cat files.txt); do claude -p "Migrate $file from React to Vue. Return OK or FAIL." --allowedTools "Edit,Bash(git commit *)"; done"
↗ Source
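The loop in the quote above, made runnable: a minimal sketch with the claude binary stubbed out so the control flow can be checked anywhere. The file names are examples; delete the stub to run the real migration.

```shell
# Stub standing in for the real claude CLI so this sketch runs anywhere;
# delete this function to invoke the actual binary.
claude() { echo "OK"; }

# Step 1: have Claude produce the file list (hard-coded example here).
printf '%s\n' src/App.jsx src/Nav.jsx > files.txt

# Step 2: one scoped headless invocation per file.
while IFS= read -r file; do
  result=$(claude -p "Migrate $file from React to Vue. Return OK or FAIL." \
    --allowedTools "Edit" "Bash(git commit *)")
  echo "$file: $result"
done < files.txt
```

Note the double quotes around the prompt: with single quotes, $file would never expand and every invocation would get the literal string.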
#116 Claude Code
Use the -p flag for headless Claude invocations in shell scripts and loops
The -p flag runs Claude in non-interactive mode, taking the prompt as an argument and printing the result to stdout. This is the foundation of all scripted and batch agentic workflows.
"Loop through tasks calling claude -p for each. Use --allowedTools to scope permissions for batch operations."
↗ Source
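Because -p reads a prompt argument and writes to stdout, it composes with ordinary pipes. A minimal sketch; the claude stub here just consumes stdin and prints a fixed verdict, so replace it with the real CLI:

```shell
# Stub: consume piped input, emit a fixed verdict. Remove for real use.
claude() { cat > /dev/null; echo "LGTM"; }

# Pipe any text (here, a fake two-line diff) into a headless review.
printf '%s\n' "- old line" "+ new line" \
  | claude -p "Review this diff. Reply LGTM or CHANGES." > verdict.txt
cat verdict.txt
```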
#117 Claude Code
Say 'use subagents' to throw more compute at a problem without changing context
Adding the phrase 'use subagents' to your prompt offloads tasks to parallel workers and keeps your main context clean and focused. Subagents run in parallel — one instruction triggers multiple parallel work streams.
"say 'use subagents' to throw more compute at a problem — offload tasks to keep your main context clean and focused."
↗ Source
#118 Claude Code
Define security-reviewer, performance-analyzer, and doc-writer as persistent subagents
Create specialized subagents in .claude/agents/ for recurring review types. Delegate with a single instruction: 'use a subagent to review this code for security issues.' Each runs in an isolated context.
"Define specialized assistants in .claude/agents/ that Claude can delegate to.
---
name: security-reviewer
description: Reviews code for security vulnerabilities
tools: Read, Grep, Glob, Bash
model: opus
---"
↗ Source
#119 Both
Multi-instance agents with tmux + git worktrees: the pattern for parallel development
Open multiple Claude Code sessions, each in its own git worktree. Each works on a separate feature independently. Tmux keeps all sessions visible. Merge each worktree when done. The foundation of AI-native team workflows.
"agent teams with tmux and git worktrees for parallel development… use test time compute — separate context windows make results better; one agent can cause bugs and another (same model) can find them."
↗ Source
#120 Codex
Codex parallel task execution is the 'killer feature' — assign 5 tasks, review all 5 when done
Assign 5 different tasks, each runs in its own isolated container, and you review all 5 when they're done. The entire UI, worktree management, and review queue in the Codex app is built for this delegation model.
"Parallel task execution. This is Codex's killer feature. Assign 5 different tasks, each runs in its own isolated container, and you review all 5 when they're done."
↗ Source
#121 Claude Code
Turn any repeated inner-loop workflow into a slash command
If you do something more than once a day, make it a /command in .claude/commands/. Commands are markdown files checked into git and available to the entire team. Build /techdebt, /pr-ready, /explain-diff.
"use slash commands for every 'inner loop' workflow you do many times a day — saves repeated prompting; commands live in .claude/commands/ and are checked into git."
↗ Source
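A slash command is just a markdown file whose name becomes the command. A sketch of a hypothetical /techdebt command (the file name and prompt body are examples):

```shell
mkdir -p .claude/commands
# The file body is the prompt the command expands to.
cat > .claude/commands/techdebt.md <<'EOF'
Scan the staged files for TODO/FIXME comments, dead code, and copy-pasted
logic. List each finding with file and line, ranked by risk. Do not edit
anything.
EOF
# Check the file into git so the whole team gets the command.
```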
#122 Both
Build a Gotchas section in every skill file — the highest-signal content across sessions
Add a ## Gotchas section to every SKILL.md documenting the model's failure points you've discovered. This is the highest-signal content in a skill — it pushes Claude/Codex out of default failure modes specific to your codebase.
"build a Gotchas section in every skill — highest-signal content, add Claude's failure points over time."
↗ Source
#123 Both
Write skill descriptions as trigger conditions, not summaries — 'when should I fire?'
The skill description field is read by the model to decide when to auto-invoke the skill. Write it from the model's perspective: 'Use this skill when the user asks to analyze database query performance', not a summary of what the skill does.
"skill description field is a trigger, not a summary — write it for the model ('when should I fire?'). don't state the obvious in skills — focus on what pushes Claude out of its default behavior."
↗ Source
#124 Both
Give goals and constraints in skills, not step-by-step instructions
Don't railroad the agent in skills — prescriptive step-by-step instructions reduce quality. Provide the goal, the constraints, and the definition of done. Let the agent figure out the steps.
"don't railroad Claude in skills — give goals and constraints, not prescriptive step-by-step instructions."
↗ Source
#125 Both
Embed !`command` in SKILL.md to inject live shell output into the prompt
Claude/Codex runs the backtick command on skill invocation and injects the result into the prompt. Use it to inject the current git status, a database schema, an API health check, or any dynamic context the skill needs.
"embed !`command` in SKILL.md to inject dynamic shell output into the prompt — Claude runs it on invocation and the model only sees the result."
↗ Source
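A sketch of what this looks like in a skill file. The skill name and contents are hypothetical; the point is that the !`...` line is replaced with the command's output at invocation time:

```shell
mkdir -p .claude/skills/db-triage
cat > .claude/skills/db-triage/SKILL.md <<'EOF'
---
name: db-triage
description: Use this skill when the user asks to debug failing database queries.
---
Current working-tree state (injected at invocation time):
!`git status --short --branch`

Triage the failing query against the schema in references/schema.sql.
EOF
```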
#126 Codex
Ask Codex to plan before coding on any complex or ambiguous task
For complex or hard-to-describe tasks, ask Codex to plan before it starts coding. Toggle with /plan or Shift+Tab. You can comment on the plan inline before implementation begins.
"If the task is complex, ambiguous, or hard to describe well, ask Codex to plan before it starts coding."
↗ Source
#127 Codex
Use speech dictation inside the Codex app to provide context faster
For complex tasks with lots of context to provide, speech dictation inside the Codex app is faster than typing. The model receives the same quality of context with dramatically less friction.
"To provide context faster, try using speech dictation inside the Codex app to dictate what you want Codex to do rather than typing it."
↗ Source
#128 Both
Scope skills to one job — start with 2–3 concrete use cases, expand later
Don't try to cover every edge case in a skill up front. Start with one representative task, get it working well, then add edge cases. Monolithic skills that try to do everything degrade quality for each specific task.
"Keep each skill scoped to one job. Start with 2 to 3 concrete use cases, define clear inputs and outputs… Don't try to cover every edge case up front."
↗ Source
#129 Claude Code
/rename important sessions and /resume them by name later
Label each instance when running multiple Claude sessions simultaneously. Named sessions are resumable and identifiable across terminal sessions. Good for long-running projects spanning multiple days.
"/rename important sessions (e.g. [TODO - refactor task]) and /resume them later — label each instance when running multiple Claudes simultaneously."
↗ Source
#130 Claude Code
Use /model to switch to Opus for planning, Sonnet for implementation in the same session
Use /model to select the right model per phase: Opus for plan-mode reasoning on complex architecture decisions, Sonnet for bulk code generation. Switch mid-session without losing context.
"use /model to select model and reasoning… use Opus for plan mode and Sonnet for code to get the best of both."
↗ Source
#131 Both
Paste the bug, say 'fix' — don't micromanage how
Claude Code fixes most bugs on its own. Paste the error or symptom, tell the agent to fix it, and let it trace the issue. Micromanaging the implementation approach often produces worse results than open delegation.
"Claude fixes most bugs by itself — paste the bug, say 'fix', don't micromanage how."
↗ Source
#132 Claude Code
Keep codebases clean and finish migrations — partially migrated frameworks confuse agents
Partially migrated frameworks (mixing old and new patterns) cause agents to pick the wrong pattern for new code. A consistent codebase produces dramatically more consistent agent output.
"keep codebases clean and finish migrations — partially migrated frameworks confuse models that might pick the wrong pattern."
↗ Source
#133 Both
Use browser automation MCPs (Claude in Chrome, Playwright) to let agents inspect console logs
Give the agent access to a browser tool so it can observe the actual runtime behavior of your frontend — console errors, network requests, visual state. Agents with perceptual feedback produce dramatically better UI fixes.
"use browser automation MCPs (Claude in Chrome, Playwright, Chrome DevTools) for Claude to inspect console logs."
↗ Source
#134 Both
Create a closed feedback loop: agent writes code, runs tests, checks output, iterates
Ask the agent to write code AND run the relevant tests AND confirm the result AND review before you accept it. This closed feedback loop, in which the agent evaluates its own output, catches most errors before you see them.
"This creates a closed feedback loop where the agent evaluates its own output."
↗ Source
#135 Claude Code
After a mediocre fix, say 'knowing everything you know now, scrap this and implement the elegant solution'
When Claude produces a working but ugly fix, this prompt makes it use everything it learned during the failed attempt to produce a cleaner solution in a fresh try. Dramatically better than iterating on a bad first attempt.
"after a mediocre fix — 'knowing everything you know now, scrap this and implement the elegant solution'."
↗ Source
#136 Both
Add tools only when they unlock a real manual loop — don't wire in everything at once
Start with one or two tools that clearly remove a manual workflow you already do often. Every tool you add consumes context budget in every session. The wrong tools actively hurt quality.
"Add tools only when they unlock a real workflow. Do not start by wiring in every tool you use. Start with one or two tools that clearly remove a manual loop you already do often."
↗ Source
#137 Both
MCP turns Claude/Codex into a tool-orchestrating agent rather than a single-model assistant
The Model Context Protocol connects Claude/Codex to your actual infrastructure: databases, monitoring tools (Sentry), issue trackers, deployment systems. This is the foundation of AI-native developer environments.
"The Model Context Protocol (MCP) turns Claude into a tool-orchestrating agent rather than a single-model assistant. Query monitoring tools (e.g. Sentry). This is a foundational step toward AI-native developer environments."
↗ Source
#138 Both
Run /plugin to browse the marketplace for pre-built skills, tools, and integrations
The plugin marketplace provides skills, tools, and integrations without manual configuration. Browse with /plugin and install with $skill-installer. Includes Linear, GitHub, Sentry, and more.
"Run /plugin to browse the marketplace. Plugins add skills, tools, and integrations without configuration."
↗ Source
#139 Codex
Use --output-schema with codex exec to get structured JSON output for downstream scripts
Pass a JSON Schema file to --output-schema and Codex will constrain its final response to match that schema. Enables stable downstream scripting — no regex parsing of prose output.
"Use --output-schema to request a final response that conforms to a JSON Schema. This is useful for automated workflows that need stable fields (for example, job summaries, risk reports, or release metadata)."
↗ Source
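A minimal sketch of the pipeline. The schema and "risk report" shape are invented for illustration, and codex is stubbed with a schema-conforming reply so the flow runs without the CLI; delete the stub to call the real binary.

```shell
# A JSON Schema for a hypothetical risk report.
cat > risk_report.schema.json <<'EOF'
{
  "type": "object",
  "properties": {
    "risk": { "type": "string", "enum": ["low", "medium", "high"] },
    "summary": { "type": "string" }
  },
  "required": ["risk", "summary"],
  "additionalProperties": false
}
EOF

# Stub emitting a schema-conforming reply; remove for real use.
codex() { echo '{"risk":"low","summary":"No schema changes in this diff."}'; }

codex exec "Assess the risk of this diff." \
  --output-schema risk_report.schema.json > report.json
cat report.json
```

Downstream scripts can then read report.json with a JSON parser instead of scraping prose.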
#140 Codex
Use codex exec --json to stream JSONL events for CI pipeline integration
With --json, stdout becomes a JSONL stream of every event Codex emits: thread.started, turn.started, item.*, turn.completed. Parse with jq. No terminal-output scraping needed in CI.
"When you enable --json, stdout becomes a JSON Lines (JSONL) stream so you can capture every event Codex emits while it's running. Event types include thread.started, turn.started, turn.completed, item.*, and error."
↗ Source
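A sketch of consuming that stream in CI. The event stream is simulated here with a captured file (payload fields beyond "type" are illustrative); in a real pipeline you would pipe codex exec --json directly into the filter.

```shell
# Simulated capture of a --json run; field shapes beyond "type" are examples.
cat > events.jsonl <<'EOF'
{"type":"thread.started","thread_id":"t_123"}
{"type":"turn.started"}
{"type":"item.completed","item":{"type":"agent_message","text":"Done."}}
{"type":"turn.completed","usage":{"input_tokens":4200}}
EOF

# Gate the CI step on the turn actually completing.
grep -q '"type":"turn.completed"' events.jsonl && echo "turn completed"
```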
#141 Both
24 CVEs have been found in the Claude Code ecosystem — audit MCP servers before trusting them
MCP servers can read/write your codebase and make network calls. A malicious skill could exfiltrate code or introduce backdoors. Treat skill installation like installing software: audit sources, check for unusual network calls.
"24 CVEs identified in Claude Code ecosystem. 655 malicious skills in supply chain. MCP servers can read/write your codebase. Strategy: Systematic audit (5-min checklist)."
↗ Source
#142 Claude Code
Skills that fetch from external URLs are high-risk — fetched content may contain injected instructions
Even trustworthy skills can be compromised if their external dependencies change. A skill fetching from a CDN or API can receive new instructions at any time. Audit all external fetches in skill files.
"Skills that fetch data from external URLs pose particular risk, as fetched content may contain malicious instructions. Even trustworthy Skills can be compromised if their external dependencies change over time."
↗ Source
#143 Both
Unexpected changes during a session: STOP IMMEDIATELY and ask the user how to proceed
If the agent notices unexpected changes it didn't make — a file modified by another process, a changed git state — it should stop and verify. Continuing may corrupt the working state. Build this rule into your AGENTS.md.
"While you are working, you might notice unexpected changes that you didn't make. If this happens, STOP IMMEDIATELY and ask the user how they would like to proceed."
↗ Source
#144 Both
Never use git reset --hard or git checkout -- unless specifically requested
These destructive commands should be prohibited in AGENTS.md/CLAUDE.md. An agent running a migration that accidentally triggers a hard reset can destroy hours of work with no undo.
"NEVER use destructive commands like git reset --hard or git checkout -- unless specifically requested or approved by the user."
↗ Source
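This guardrail and the stop-immediately rule above it can be written straight into the instructions file. A sketch of the AGENTS.md (or CLAUDE.md) wording:

```shell
# Append a safety-rules section to the repo's AGENTS.md.
cat >> AGENTS.md <<'EOF'
## Safety rules
- If you notice unexpected changes you did not make, STOP IMMEDIATELY and
  ask the user how to proceed.
- NEVER use destructive commands like git reset --hard or git checkout --
  unless the user specifically requests or approves them.
EOF
```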
#145 Both
Use the right model for the task — gpt-5.4-mini and Claude Haiku for subagents and light tasks
For most subagent work, a lighter model is sufficient. Use gpt-5.4-mini for Codex subagents and focused tasks, and Claude Haiku for Claude Code subagents. Reserve frontier models for reasoning-heavy planning.
"Use gpt-5.4-mini when you want a faster, lower-cost option for lighter coding tasks or subagents."
↗ Source
#146 Codex
GPT-5.3-Codex-Spark is available for near-instant real-time coding iteration (Pro subscribers)
The Spark model is optimized for near-instant coding iteration — effectively real-time feedback on code changes. Available in research preview for ChatGPT Pro subscribers. Switch with /model.
"The gpt-5.3-codex-spark model is available in research preview for ChatGPT Pro subscribers and is optimized for near-instant, real-time coding iteration."
↗ Source
#147 Both
Start with tight permissions, loosen only when the need is clear
If you're new to coding agents, start with the default permissions. Keep approval and sandboxing tight by default. Loosen only for trusted repos or specific workflows once you understand the exact requirement.
"If you're new to coding agents, start with the default permissions. Keep approval and sandboxing tight by default, then loosen permissions only for trusted repos or specific workflows once the need is clear."
↗ Source
#148 Claude Code
Use /usage to check plan limits and /extra-usage to configure overflow billing
Monitor your consumption in real time with /usage. If you're hitting plan limits during critical work, /extra-usage lets you configure overflow billing without stopping mid-task.
"use /usage to check plan limits, /extra-usage to configure overflow billing, /config to configure settings."
↗ Source
#149 Both
Long-horizon work requires: following the spec, staying on task, running verification, repairing failures
These are the four measurable dimensions of long-horizon task quality identified in practitioner tests. Verification and failure repair are the two that most agents fail on. Build explicit verification steps into your specs.
"This performed well on the parts that matter for long-horizon work: following the spec, staying on task, running verification, and repairing failures as it went."
↗ Source
#150 Codex
Use Extra High reasoning for long-horizon tasks — quality scales with reasoning effort
For complex multi-hour tasks, set reasoning effort to Extra High. Higher reasoning effort means the model spends more tokens exploring alternatives before committing to an approach. Quality scales non-linearly with effort.
"I gave Codex a blank repo, full access, and one job: build a design tool from scratch. Then I let it run with GPT-5.3-Codex at 'Extra High' reasoning."
↗ Source
#151 Both
Spec-driven development: research → spec → plan → implement → review — every session
All major agentic workflows converge on this pattern. The spec phase is the highest-leverage point: a well-written spec reduces total token cost and produces better output than a longer implementation session.
"All major workflows converge on the same architectural pattern: Research → Plan → Execute → Review → Ship."
↗ Source
#152 Claude Code
Agent Teams: automated coordination of multiple sessions with a team lead and shared tasks
Beyond parallelizing work, Agent Teams enable quality-focused workflows where different agents with fresh context review each other's work. The team lead delegates and coordinates without losing track of overall progress.
"Agent teams: Automated coordination of multiple sessions with shared tasks, messaging, and a team lead. Beyond parallelizing work, multiple sessions enable quality-focused workflows."
↗ Source
#153 Claude Code
Use git worktrees to run multiple parallel Claude sessions on the same repo without conflicts
Git worktrees let each parallel Claude session work on its own branch in its own directory. No conflicts, no stale index, no need to stash. The cleanest mechanism for multi-agent parallel development.
"Git worktrees make this practical by enabling parallel, isolated agent sessions on the same repo."
↗ Source
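The setup is a few git commands: one worktree (and branch) per agent session, so sessions never share a working directory. Repo, paths, and branch names below are examples.

```shell
# Fresh demo repo (the -c user settings just make the commit work anywhere).
git init -q demo && cd demo
git -c user.email=agent@example.com -c user.name=agent \
  commit -q --allow-empty -m "init"

# One worktree + branch per parallel session.
git worktree add ../demo-auth -b feature/auth
git worktree add ../demo-search -b feature/search
git worktree list

# Start one Claude session in ../demo-auth and another in ../demo-search;
# merge each feature branch back when its task lands.
```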
#154 Claude Code
context: fork in skill frontmatter runs the skill in an isolated subagent
The main context only sees the final result, not intermediate tool calls. Use this for read-heavy exploration skills or any skill that generates verbose intermediate output that shouldn't pollute the main session.
"use context: fork to run a skill in an isolated subagent — main context only sees the final result, not intermediate tool calls. The agent field lets you set the subagent type."
↗ Source
#155 Both
Skills are folders, not files — use references/, scripts/, examples/ subdirectories
A skill directory can contain supporting scripts, example inputs/outputs, reference schemas, and any other files the skill needs. This folder structure enables progressive disclosure at scale.
"skills are folders, not files — use references/, scripts/, examples/ subdirectories for progressive disclosure."
↗ Source
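A sketch of the layout; the skill name and file names are invented for illustration.

```shell
# One folder per skill, with supporting material in subdirectories.
mkdir -p .claude/skills/report-gen/references
mkdir -p .claude/skills/report-gen/scripts
mkdir -p .claude/skills/report-gen/examples
touch .claude/skills/report-gen/SKILL.md               # entry point
touch .claude/skills/report-gen/references/schema.sql  # read only when needed
touch .claude/skills/report-gen/scripts/render.py      # reused, not regenerated
touch .claude/skills/report-gen/examples/sample.md     # worked example
find .claude/skills/report-gen -type f | sort
```

Only SKILL.md is loaded up front; the subdirectories are pulled in on demand, which is what makes the disclosure progressive.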
#156 Both
Include scripts and libraries in skills so Claude composes rather than reconstructs boilerplate
Bundle the actual scripts the skill needs alongside the SKILL.md. When the skill runs, Claude composes using the existing script rather than writing the same boilerplate from scratch every time.
"include scripts and libraries in skills so Claude composes rather than reconstructs boilerplate."
↗ Source
#157 Codex
Once a workflow is repeatable, stop using long prompts — encode it as a skill
Once a workflow becomes repeatable, stop relying on long prompts or repeated back-and-forth. A skill packages the instructions, context, and supporting logic into a durable, reusable artifact.
"Once a workflow becomes repeatable, stop relying on long prompts or repeated back-and-forth. Use a Skill to package the instructions in a SKILL.md file, context, and supporting logic Codex should apply consistently."
↗ Source
#158 Both
Configure Codex/Claude for your real environment early — many quality issues are actually setup issues
Wrong working directory, missing write access, wrong model defaults, or missing tools cause quality problems that look like model failures. Configure the environment correctly before blaming the model.
"Configure Codex for your real environment early. Many quality issues are really setup issues, like the wrong working directory, missing write access, wrong model defaults, or missing tools and connectors."
↗ Source
#159 Claude Code
Use /insights commands and verify patterns through tests — agents generate 1.75× more logic errors
Claude Code can generate 1.75× more logic errors than human-written code (ACM 2025 research). Every output must be verified. Use tests as the ground truth, not visual inspection or agent assertions.
"Claude Code can generate 1.75x more logic errors than human-written code (ACM 2025). Every output must be verified. Use /insights commands and verify patterns through tests."
↗ Source
#160 Codex
Codex built OpenAI DevDay 2025 — from keynote demos to arcade machines to SDK polishing
OpenAI used Codex to build everything for DevDay 2025: programmatic camera/stage-light control for the keynote, arcade machines in the community hall, and polishing the Guardrails SDKs for Python and TypeScript. Production proof of agentic reliability.
"OpenAI used Codex to build everything for DevDay 2025 — from Romain Huet's keynote demo… to the arcade machines… Even the Guardrails SDKs for Python and TypeScript were polished using Codex."
↗ Source
#161 Codex
Codex scheduled automations: daily issue triage, weekly test coverage scans, Friday release notes
Codex can run recurring tasks on a schedule without you prompting it. This is the feature that separates it from interactive-only coding agents. Teams drowning in operational toil can offload entire categories of work.
"Scheduled automations. Codex can run recurring tasks on a schedule without you prompting it. Daily issue triage. Weekly test coverage scans. Friday release notes."
↗ Source
#162 Both
Metaprompting: at the end of a bad turn, ask the agent how to improve its own instructions
If a turn didn't perform up to expectations, ask the model directly how to improve the instructions that produced the bad output. This self-metaprompting technique is documented in the Codex prompting guide as a fix for overthinking and loggy behavior.
"It's possible to ask the model at the end of a turn that didn't perform up to expectations how to improve its own instructions. The following prompt was used to produce some of the solutions to overthinking problems."
↗ Source
#163 Both
Use thinking mode with the Explanatory output style to see Claude's decision reasoning
Enable thinking mode (true) and set Output Style to Explanatory to see detailed output with Insight boxes. This lets you understand why Claude made a decision, not just what it produced.
"always use thinking mode true (to see reasoning) and Output Style Explanatory (to see detailed output with ★ Insight boxes) in /config for better understanding of Claude's decisions."
↗ Source
#164 Both
Common failure modes in agents: overthinking, loggy updates, awkward preamble tics
Documented failure patterns: taking too long before the first useful action; unnatural status updates instead of pair-programmer collaboration; repetitive tics like 'Good catch', 'Aha', 'Got it'. All are fixable via metaprompting.
"Common failure modes… Overthinking / long time before first useful action. Loggy / unnatural status updates instead of pair programmer collaboration. Awkward preamble phrasing and repetitive tics ('Good catch', 'Aha', 'Got it')."
↗ Source
#165 Both
Hand off to a fresh agent rather than one long session for quality-critical reviews
A fresh context improves quality for any evaluative task: the reviewing agent hasn't been anchored by the implementation process. This is the key insight behind the Writer/Reviewer, Implementer/Tester, and Builder/Validator patterns.
"use test time compute — separate context windows make results better; one agent can cause bugs and another (same model) can find them."
↗ Source
#166 Codex
Use Plan mode (/plan) toggle to review plans inline in the Codex app with diff panel
Toggle the diff panel in the Codex app to directly review changes locally. Plan mode integrates the plan review with the diff view β€” you can see both what the agent plans to do and what it has already done.
"Toggle the diff panel in the Codex app to directly review changes locally."
β†— Source
#167 Claude Code
Start with the minimal working example pattern: spec β†’ minimal working impl β†’ expand
Ask Claude to build the smallest possible working version first, then expand incrementally. This produces better architecture than specifying the full system up front because it forces decisions to be made in the right order.
"Start with a minimal spec or prompt and ask Claude to interview you using AskUserQuestion tool, then make a new session to execute the spec."
↗ Source
#168 Both
Multi-instance research pattern: break into parallelizable sub-goals, run child processes, aggregate
For deep research tasks, decompose objectives into parallelizable sub-goals and run child agent processes via codex exec or Claude subagents. Aggregate results into polished reports. The pattern scales to arbitrary research depth.
"Multi-instance (multi-agent) orchestration workflow for deep research tasks. Breaks down research objectives into parallelizable sub-goals, runs child processes via codex exec, and aggregates results into polished reports."
↗ Source
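The fan-out/fan-in loop can be sketched in a few lines of Python. Here `run_agent` is a stand-in for shelling out to `codex exec` (or spawning a Claude subagent), and the objective and sub-goal strings are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(subgoal: str) -> str:
    # In practice: subprocess.run(["codex", "exec", subgoal], capture_output=True)
    # Stubbed here so the sketch is self-contained and runnable.
    return f"## {subgoal}\n(findings for {subgoal})"

def research(objective: str, subgoals: list[str]) -> str:
    # Fan out: each sub-goal runs in its own child process / context window.
    with ThreadPoolExecutor(max_workers=len(subgoals)) as pool:
        results = list(pool.map(run_agent, subgoals))
    # Fan in: aggregate the child reports into one polished document.
    return f"# {objective}\n\n" + "\n\n".join(results)

report = research("Compare vector DBs", ["pricing", "latency", "ecosystem"])
print(report.splitlines()[0])
```

Each child gets a fresh context window, so depth scales with the number of sub-goals rather than with one session's context budget.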
#169 Codex
Spec-kit / KIRO workflow: constitution-based spec-driven development from idea to tasks
Trigger with 'kiro' or references to .kiro/specs/. Creates EARS-format requirements, design documents, and implementation task lists from a single idea. Keeps the full decision trail in version control.
"Interactive feature development workflow from idea to implementation. Creates requirements (EARS format), design documents, and implementation task lists. Triggered by: 'kiro' or references to .kiro/specs/ directory."
↗ Source
#170 Both
Command → Agent → Skill: the three-layer orchestration architecture for complex workflows
A slash command invokes an agent; the agent delegates to one or more skills; skills provide focused, progressive-disclosure context. This three-layer pattern scales from simple one-off tasks to complex multi-day feature development.
"Two skill patterns: agent skills (preloaded via skills: field) vs skills (invoked via Skill tool). See orchestration-workflow for implementation details of Command → Agent → Skill pattern."
↗ Source
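A hedged sketch of the three layers as Claude Code files. The file names and agent body are illustrative; the `skills:` frontmatter field for preloading is the one the quote describes:

```markdown
<!-- .claude/commands/ship-feature.md — layer 1: the slash command -->
Delegate $ARGUMENTS to the feature-builder agent and report the result.

<!-- .claude/agents/feature-builder.md — layer 2: the agent -->
---
name: feature-builder
description: Implements a feature end to end, then hands off for review.
skills: [api-conventions, test-writing]   # layer 3: preloaded skills
---
You implement features, using the preloaded skills for focused context.
```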
#171 Both
Verify your session started with the right instruction files using audit logs
Check ~/.codex/log/codex-tui.log or Claude Code session JSONL to audit which instruction files were loaded. Stale or wrong AGENTS.md/CLAUDE.md loaded at session start is the root cause of many mysterious quality issues.
"Check ~/.codex/log/codex-tui.log (or the most recent session-*.jsonl file if you enabled session logging) after a session if you need to audit which instruction files Codex loaded."
↗ Source
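The audit step can be scripted. A sketch in Python against an inline sample line (the JSONL event and field names here are illustrative; match them to whatever your actual session log emits):

```python
import json

# One line of a hypothetical session-*.jsonl; real logs carry more fields.
sample = '{"event": "instructions_loaded", "files": ["/repo/AGENTS.md", "/repo/.codex/AGENTS.md"]}'

def loaded_instruction_files(jsonl_lines):
    """Collect every instruction file the session reported loading."""
    files = []
    for line in jsonl_lines:
        event = json.loads(line)
        if event.get("event") == "instructions_loaded":
            files.extend(event.get("files", []))
    return files

print(loaded_instruction_files([sample]))
```

Run it over the most recent log after a suspicious session; an empty or stale file list is the smoking gun.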
#172 Both
Real companies using coding agents: Duolingo pilots for dev workflows, Virgin Atlantic for data analysis
Cisco is exploring Codex to accelerate engineering, Virgin Atlantic has deployed AI agents internally for data analysis and customer engagement, and Duolingo is piloting Codex for development workflows — coding agents running inside real companies, not just demos.
"Cisco — exploring Codex for engineering teams to accelerate feature development. Virgin Atlantic — deployed AI agents internally for data analysis and customer engagement. Duolingo — piloting Codex for development workflows."
↗ Source
#173 Both
Parallel agents: Fountain 50% faster, CRED 2x speed with multi-agent coordination
Validated production metrics from real companies using multi-agent Claude Code coordination. Parallelism doesn't just save time — it enables tasks (like autonomous C compilation) that would be impossible in a single context window.
"Production metrics from real companies: Fountain: 50% faster, CRED: 2x speed. 5 validated workflows (multi-layer review, parallel debugging, large-scale refactoring)."
↗ Source
#174 Codex
Use the Slack integration to delegate tasks to Codex directly from Slack channels
Tag @Codex or @Claude with bug reports or feature requests in Slack channels for team workflows. The agent picks up the task from Slack, works in its cloud environment, and reports results back to the channel.
"Delegate tasks to Claude Code directly from Slack channels. Tag @Claude with bug reports or feature requests for team workflows."
↗ Source
#175 Both
AI-native team pattern: every PR has an AI reviewer, every sprint has AI-generated tickets
The AI-native development pattern: AI reviews every PR before humans see it, AI triages every incoming issue, AI generates Jira tickets from product specs. Humans focus on decisions and architecture — not mechanical work.
"Build AI Teams: Every major coding platform is converging on the same pattern — AI teammates that handle the mechanical work so engineers focus on the hard decisions."
↗ Source
#176 Both
Use /review command in Codex CLI for targeted pre-commit code review
The /review command in Codex CLI opens review presets: review against base branch, review uncommitted changes, or review a specific commit. Each run appears as its own transcript turn — compare feedback across iterations.
"Use /review in the CLI to open Codex's review presets. The CLI launches a dedicated reviewer that reads the diff you select and reports prioritized, actionable findings without touching your working tree."
↗ Source
#177 Claude Code
Feature-specific subagents with skills outperform generic QA or backend engineer agents
Don't create generic subagents. Create feature-specific subagents with relevant skills already loaded (progressive disclosure) and focused system prompts. Specificity dramatically improves quality for each task type.
"have feature specific sub-agents (extra context) with skills (progressive disclosure) instead of general qa, backend engineer."
↗ Source
#178 Both
The 'Beads' pattern: chain agents sequentially, each reads only the previous agent's output
Each agent in the chain receives only the previous agent's structured summary — not the full accumulated context. This prevents context rot in long chains and keeps each agent focused on its specific contribution.
"Decision framework: Teams vs Multi-Instance vs Dual-Instance vs Beads."
↗ Source
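The chain is just a fold over structured summaries: each stage sees only the previous summary, never the whole transcript. A sketch with stand-in stage functions (real stages would be separate agent sessions):

```python
def planner(summary: str) -> str:
    # A real planner agent would read the task summary and emit a plan.
    return "plan: add retry logic to the HTTP client"

def implementer(summary: str) -> str:
    return f"implemented [{summary}]"

def reviewer(summary: str) -> str:
    return f"reviewed [{summary}]"

def run_chain(stages, initial: str) -> str:
    summary = initial
    for stage in stages:
        # Each agent receives only the previous structured summary,
        # not the accumulated context of every earlier stage.
        summary = stage(summary)
    return summary

final = run_chain([planner, implementer, reviewer], "task: flaky network calls")
print(final)
```

Because each hop carries only a compact summary, a ten-stage chain costs roughly the same per stage as a two-stage one.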
#179 Both
Use @ mentions in the prompt to precisely scope agent attention to specific files
Both tools support @file syntax to include specific files as context inline in the prompt. This is more precise than letting the agent discover files — it ensures attention is focused on the right files from the start.
"Type @ in the composer to open a fuzzy file search over the workspace root; press Tab or Enter to drop the highlighted path into your message."
↗ Source
#180 Codex
Press Enter mid-run to inject instructions into the current turn; Tab to queue for next
In the Codex TUI, pressing Enter while Codex is running injects new instructions into the current turn. Tab queues a follow-up prompt for the next turn. This enables real-time course correction without interrupting the agent.
"Press Enter while Codex is running to inject new instructions into the current turn, or press Tab to queue a follow-up prompt for the next turn."
↗ Source
#181 Claude Code
Use the VS Code plan review panel to comment on plans before implementation begins
In the VS Code extension, the plan preview panel updates live as Claude refines its plan, and commenting is enabled once the plan is ready for review. Commenting before implementation is dramatically cheaper than fixing after.
"VS Code plan preview: auto-updates as Claude iterates, enables commenting only when the plan is ready for review, and keeps the preview open when rejecting so Claude can revise."
↗ Source
#182 Both
Structured agent handoffs: each role writes to its own scoped folder, not shared files
In multi-agent workflows, each role (designer, backend, frontend, tester) writes its deliverables to its own folder (/design/, /backend/, etc.). No shared file conflicts, clear ownership, easy to audit which agent produced what.
"Deliverables (write to /tests): TEST_PLAN.md – bullet list… Each role writes scoped artifacts in its own folder before handing control back to the project manager."
↗ Source
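The folder convention is simple enough to enforce in code. A sketch in Python (the role and file names are illustrative, loosely echoing the quoted /tests deliverable):

```python
from pathlib import Path
import tempfile

# Each role owns exactly one folder and one deliverable; hypothetical names.
DELIVERABLES = {
    "design": "DESIGN.md",
    "backend": "API_NOTES.md",
    "tests": "TEST_PLAN.md",
}

def write_deliverable(root: Path, role: str, content: str) -> Path:
    # No two roles share a file, so there are no write conflicts to merge.
    folder = root / role
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / DELIVERABLES[role]
    path.write_text(content)
    return path

root = Path(tempfile.mkdtemp())
p = write_deliverable(root, "tests", "- cover retry logic\n")
print(p.relative_to(root))
```

Auditing "which agent produced what" then reduces to listing one folder per role.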
#183 Both
The 'dual-instance' pattern: run two agents simultaneously, one builds, one continuously tests
Run two Claude/Codex sessions in parallel: one builds the feature incrementally, one continuously runs the test suite and reports failures back. The builder fixes immediately rather than accumulating test debt.
"Dual-Instance… use test time compute — separate context windows make results better; one agent can cause bugs and another (same model) can find them."
↗ Source
#184 Claude Code
Use the built-in /claude-api skill when building Anthropic SDK integrations
Claude Code ships a built-in /claude-api skill that provides specialized context for building with the Claude API and Anthropic SDK. Invoke it when building API integrations to get accurate, up-to-date patterns.
"Added the /claude-api skill for building applications with the Claude API and Anthropic SDK."
↗ Source
#185 Codex
Use --add-dir to coordinate changes across multiple repos in a single session
Launch with codex --cd apps/frontend --add-dir ../backend --add-dir ../shared to work across multiple project directories simultaneously. Codex can make coordinated changes across all of them.
"Expose more writable roots with --add-dir (for example, codex --cd apps/frontend --add-dir ../backend --add-dir ../shared) when you need to coordinate changes across more than one project."
↗ Source
#186 Both
Permanent worktrees for long-lived agent environments — not auto-deleted, fully reusable
Create a permanent worktree from the Codex app sidebar three-dot menu. Unlike temp worktrees, permanent worktrees aren't automatically deleted and support multiple threads. Ideal for long-running agent projects.
"If you want a long-lived environment, create a permanent worktree from the three-dot menu on a project in the sidebar. Permanent worktrees are not automatically deleted, and you can start multiple threads from the same worktree."
↗ Source
#187 Both
Handoff between local and worktree mid-session using the Handoff feature
The Codex app lets you move a thread between Local and Worktree mid-session using the Handoff feature. Start exploring locally, then move to a worktree for isolated implementation when scope is clear.
"You can also start threads on a worktree manually, and use Handoff to move a thread between Local and Worktree."
↗ Source
#188 Claude Code
Use /stats to see usage patterns, token counts, and session streaks
The /stats command opens a usage dashboard: favorite models, token consumption patterns, session counts, and streaks. Press Ctrl+S to copy for sharing. Use it to understand your actual cost distribution.
"Stats dashboard: Run /stats to see your usage patterns, favorite models, token counts, and streaks. Press Ctrl+S to copy for sharing."
↗ Source
#189 Both
Agents inherit your shell — your tool choices directly shape model behavior
Claude Code and Codex inherit your shell environment. If you have ripgrep, fd, or custom scripts in your PATH, the agent can use them. The better your shell tooling, the better your agent's ability to explore and act.
"Claude Code inherits your shell, which means your tooling choices directly shape model behavior."
↗ Source
#190 Codex
Use /permissions inside an interactive session to switch sandbox mode on-the-fly
Don't restart Codex to change permission levels. Use /permissions inside an active session to switch between auto, read-only, and full access modes as your comfort level changes mid-task.
"Use /permissions inside an interactive session to switch modes as your comfort level changes."
↗ Source
#191 Both
The 'Fountain' production metric: 50% faster development with multi-agent coordination
Fountain achieved 50% faster development using multi-agent Claude Code coordination in production. The gains come from context multiplication and parallel execution, not model capability improvements.
"Production metrics from real companies (autonomous C compiler, 500K hours saved) 5 validated workflows… Fountain: 50% faster."
↗ Source
#192 Both
Verify MCP server permissions regularly — servers can read/write your entire codebase
Review what each connected MCP server can actually access. Servers with broad filesystem permissions can read proprietary code, API keys in config files, or database credentials. Audit MCP permissions as rigorously as npm packages.
"MCP servers can read/write your codebase. Strategy: Systematic audit (5-min checklist). Community-vetted MCP Safe List. Vetting workflow documented."
↗ Source
#193 Claude Code
Use /mcp in VS Code to manage MCP servers without switching to terminal
In the VS Code extension, /mcp opens a native MCP server management dialog — enable/disable servers, reconnect, and manage OAuth authentication directly in the chat panel without editing config files.
"Added native MCP server management dialog — use /mcp in the chat panel to enable/disable servers, reconnect, and manage OAuth authentication without switching to the terminal."
↗ Source
#194 Both
The skill description is a routing signal for the model — write it for automatic invocation
Write skill descriptions as precise trigger conditions: 'Use when the user asks to analyze SQL query performance in PostgreSQL.' The model reads descriptions to decide when to auto-invoke. Vague descriptions cause wrong-skill invocations.
"Because implicit matching depends on description, write descriptions with clear scope and boundaries."
↗ Source
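Skill metadata lives in the skill file's frontmatter. A sketch of a description written as a precise trigger condition with explicit boundaries (the skill name and wording are illustrative):

```yaml
---
name: pg-query-tuning
description: >
  Use when the user asks to analyze or improve SQL query performance in
  PostgreSQL (EXPLAIN plans, indexes, slow queries). Not for schema design
  or for databases other than PostgreSQL.
---
```

The negative clause matters as much as the positive one: it is what keeps the model from routing adjacent-but-wrong requests to this skill.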
#195 Both
Track context usage live with status bars or monitoring hooks — don't guess when to compact
Use PreCompact hooks that display context percentage in a status line. Or use /context regularly. Guessing when to compact wastes quality (compact too late) or tokens (compact too early).
"[!] 25.0% free (50.0K/200K) -> .claude/backups/3-backup-26th-Jan-2026-5-45pm.md … the statusline shows the backup path."
↗ Source
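The arithmetic behind a status line like the one quoted is trivial to script. A minimal helper (the warning threshold and message format are illustrative):

```python
def context_status(used_tokens: int, window_tokens: int,
                   warn_below_free: float = 30.0) -> str:
    # Percentage of the context window still free.
    free_pct = 100.0 * (window_tokens - used_tokens) / window_tokens
    marker = "[!]" if free_pct < warn_below_free else "[ok]"
    return f"{marker} {free_pct:.1f}% free ({used_tokens / 1000:.1f}K/{window_tokens // 1000}K)"

print(context_status(150_000, 200_000))
```

Wire something like this into a status line or a PreCompact hook so the decision to compact is driven by a number, not a hunch.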
#196 Both
Use reasoning with 'high' effort for architecture decisions — medium or low for routine code gen
Different tasks warrant different reasoning effort levels. Architecture decisions and complex debugging benefit from high reasoning. Formatting, simple refactors, and test generation work fine with medium or low effort at lower cost.
"Choose a reasoning level based on how hard the task is and test what works best for your workflow. Different users and tasks work best with different settings."
↗ Source
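In Codex this maps onto the `model_reasoning_effort` setting in `~/.codex/config.toml`; a sketch using per-task profiles (the profile names are illustrative):

```toml
# ~/.codex/config.toml — hypothetical profiles for different task types
[profiles.architecture]
model_reasoning_effort = "high"   # design decisions, hard debugging

[profiles.routine]
model_reasoning_effort = "low"    # formatting, simple refactors, test gen
```

Then pick the effort level at launch, e.g. `codex --profile architecture`, instead of paying high-effort prices for routine work.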
#197 Claude Code
Use the VS Code spark icon to list all Claude Code sessions and open them as full editors
A spark icon in the VS Code activity bar lists all Claude Code sessions, and sessions open as full editors. Plans get a full markdown document view, with support for inline comments to give feedback on the plan.
"Added spark icon in VS Code activity bar that lists all Claude Code sessions, with sessions opening as full editors. Added full markdown document view for plans."
↗ Source
#198 Both
Treat coding agents as junior engineers with tools and memory — not magic code generators
The fundamental mental model shift: agents work best when given the right context, clear goals, and iteration cycles. Not when given a magic prompt and expected to produce perfect output in one shot.
"Rather than treating Claude as a chatbot, the core insight is this: Claude Code works best when treated like a junior engineer with tools, memory, and iteration — not a magic code generator."
↗ Source
#199 Both
The developers who will thrive: those who orchestrate AI, not just code alongside it
90% of traditional programming skills are becoming commoditized. The remaining 10% — architecture decisions, system design, knowing when to delegate and when to intervene — becomes worth 1000x more. Agents change what skills matter.
"The developers and teams who understand this shift — who learn to orchestrate AI rather than just code alongside it — will thrive in this new landscape."
↗ Source
#200 Both
Codex shipped a userpromptsubmit hook — prompts can be blocked or augmented before history
A recent Codex changelog addition: the userpromptsubmit hook fires before prompts enter history, enabling blocking or augmentation before execution. Claude Code has had this as UserPromptSubmit — now Codex has the equivalent.
"Added a userpromptsubmit hook so prompts can be blocked or augmented before execution and before they enter history."
↗ Source
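For comparison, the Claude Code side is configured in `.claude/settings.json`. A sketch (the guard script path is hypothetical; in Claude Code's hook semantics, a hook that exits with code 2 blocks the prompt, while stdout from a successful run is added to the context):

```json
{
  "hooks": {
    "UserPromptSubmit": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "python .claude/hooks/guard_prompt.py"
          }
        ]
      }
    ]
  }
}
```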