Rate limits in Claude Code operate at the API level, not the session level. If you launch parallel agents that all make heavy tool calls, you can hit rate limits quickly. The solution is to scope each agent tightly so it does focused work rather than broad exploration.
Cost grows fast with parallel agents. A task that costs \0.10$0.506$ agents with overlapping context. I'll be direct: before running a large agentic workflow, estimate the scope. How many agents, how much context each needs, and whether the time savings justify the cost.
You can set the CLAUDE_CODE_SUBAGENT_MODEL environment variable to specify which model handles background and utility tasks, which can reduce cost for high-volume workflows. The older ANTHROPIC_SMALL_FAST_MODEL is deprecated.