Section 1: Mental Models & Architecture

New to Claude Code? Complete Part 0: Introduction first to get hands-on experience before diving into architecture details.

Already tried Claude Code? Perfect! You’ve seen it work. Now let’s understand HOW it works under the hood so you can make informed decisions about which tools and models to use for different tasks.

Understanding Claude’s Ecosystem

Claude offers several tools and interfaces for different use cases. Understanding how they fit together helps you choose the right one for each task.

graph TD
    User[Developer] --> CLI[Claude Code CLI]
    User --> VSCode[VS Code Extension]
    User --> API[Claude API]
    User --> SDK[Agent SDK]
    
    CLI --> Models[Claude Models]
    VSCode --> Models
    API --> Models
    SDK --> Models
    
    Models --> Sonnet45[Claude Sonnet 4.5<br/>Balanced]
    Models --> Haiku45[Claude Haiku 4.5<br/>Fast]
    Models --> Opus45[Claude Opus 4.5<br/>Most Capable]
    
    CLI --> MCP[MCP Servers]
    VSCode --> MCP
    SDK --> MCP
    
    CLI --> Skills[Skills System]
    
    MCP --> External[External Data<br/>Drive, Jira, Figma]

1.1 Model Family Overview

Active Models (December 2025)

Note: “Claude 4” is the family name, not a specific model. Always specify the complete model name: Claude Opus 4.5, Claude Sonnet 4.5, or Claude Haiku 4.5.

Model	Best For	Speed	Context Window
Claude Opus 4.5	Complex reasoning, architecture	Moderate	200K
Claude Sonnet 4.5	Daily coding (recommended)	Fast	200K
Claude Haiku 4.5	Speed + cost optimization	Fastest	200K

In Claude Code, the opus alias maps to Opus 4.5 (model configuration).

All models support:

✅ Extended thinking
✅ Vision (text + image input)
✅ Tool use / function calling
✅ Batch processing (asynchronous, via the Batch API)
✅ Prompt caching
✅ Multilingual capabilities

API Model Identifiers: Use the opus alias in Claude Code, or see Model names (Claude API) / Model configuration for current Opus 4.5 model IDs.

Claude Sonnet 4.5 → claude-sonnet-4-5-20250929
Claude Haiku 4.5 → claude-haiku-4-5-20251001

Note: Model identifiers may include date suffixes. Use the latest available version or the opus / sonnet / haiku aliases in Claude Code for the current flagship models.

Model Selection Decision Tree

graph TD
    Start{What's your<br/>primary need?}
    
    Start -->|Speed + Cost| Haiku[Claude Haiku 4.5<br/>Cost-effective]
    Start -->|Balanced| Sonnet[Claude Sonnet 4.5<br/>Recommended]
    Start -->|Max Intelligence| Opus[Claude Opus 4.5<br/>Most Capable]
    
    Start -->|Complex Reasoning| Extended{Need extended<br/>thinking?}
    Extended -->|Yes| SonnetExt[Claude Sonnet 4.5<br/>with 5K-10K budget]
    Extended -->|Very Complex| OpusExt[Claude Opus 4.5<br/>with 10K-32K budget]
    
    Start -->|High Volume| Batch[Any Model<br/>with Batch API<br/>Cost optimized]
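The decision tree above can be sketched as a small lookup. This is a minimal illustration, not an official API: the function name `pick_model` and the need labels are hypothetical, and the returned strings are the Claude Code aliases mentioned earlier.

```python
# Hypothetical helper mirroring the decision tree; the labels are illustrative.
MODEL_BY_NEED = {
    "speed": "haiku",             # Speed + cost
    "balanced": "sonnet",         # Recommended default
    "max_intelligence": "opus",   # Most capable
}


def pick_model(need: str, very_complex: bool = False) -> str:
    """Return a Claude Code model alias for a primary need."""
    if need == "complex_reasoning":
        # The tree routes very complex reasoning to Opus, otherwise Sonnet.
        return "opus" if very_complex else "sonnet"
    # Unknown needs fall back to the recommended default.
    return MODEL_BY_NEED.get(need, "sonnet")
```

Falling back to `sonnet` for unrecognized input follows the document's recommendation of Sonnet as the daily default.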

When to Use Each Model

Claude Haiku 4.5 - Speed & Cost Priority

Quick code completions
Simple refactoring
Documentation generation
High-volume API calls
Real-time chat interactions
Trade-off: Significantly cheaper than Opus, at the cost of some capability on complex reasoning

Claude Sonnet 4.5 - Daily Workhorse (Recommended)

Feature development
Code review
Bug fixes and debugging
Test generation
API design
Sweet spot: Faster than Opus, nearly as capable

Claude Opus 4.5 - Complex Tasks

System architecture design
Complex algorithm implementation
Security analysis
Multi-step reasoning
Research and analysis
Use strategically: Most capable for complex reasoning

1.2 Tool Selection Matrix

Claude Code CLI vs API vs Agent SDK vs VS Code

Feature	CLI	API	Agent SDK	VS Code Ext
Interactive use	✅ Excellent	❌ No	❌ No	✅ Excellent
Automation	✅ Scriptable	✅ Programmatic	✅ Full control	⚠️ Limited
CI/CD integration	✅ Perfect	✅ Perfect	✅ Perfect	❌ No
Visual feedback	⚠️ Terminal only	❌ No	❌ No	✅ Rich UI
Custom agents	❌ No	⚠️ Limited	✅ Full control	❌ No
Piping/composition	✅ Unix style	❌ No	⚠️ Programmatic	❌ No
Learning curve	Low	Medium	High	Lowest
Flexibility	Medium	High	Highest	Low

Decision Matrix: Which Tool to Use?

Use Claude Code CLI when:

✅ Working in terminal primarily
✅ Need to pipe command outputs
✅ Automating workflows (scripts, hooks)
✅ CI/CD integration
✅ Quick interactive questions
✅ Team uses terminal-based workflows

Use Claude API when:

✅ Building applications with AI features
✅ Need programmatic control
✅ Integrating into existing systems
✅ Custom UI requirements
✅ High-volume processing (batches)
✅ Need fine-grained control over requests

Use Agent SDK when:

✅ Building custom autonomous agents
✅ Need multi-agent orchestration
✅ Require custom tool integration
✅ Complex workflow automation
✅ Advanced permission control
✅ Subagent patterns

Use VS Code Extension when:

✅ Prefer GUI over terminal
✅ Visual coding feedback needed
✅ Not comfortable with CLI
✅ Occasional AI assistance
✅ Beginner-friendly interface
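The CLI's scripting and piping strengths listed above can be sketched from Python. This assumes the `claude` binary is on your PATH and uses its print mode (`-p`) with the `--model` and `--output-format` flags. The helper names `build_claude_command` and `run_claude` are hypothetical.

```python
import subprocess
from typing import List, Optional


def build_claude_command(prompt: str, model: Optional[str] = None,
                         json_output: bool = False) -> List[str]:
    """Build the argv for a non-interactive Claude Code call."""
    cmd = ["claude", "-p", prompt]
    if model:
        cmd += ["--model", model]
    if json_output:
        cmd += ["--output-format", "json"]
    return cmd


def run_claude(prompt: str, piped_input: str = "", **kwargs) -> str:
    """Run the command, feeding piped_input on stdin (like `cat file | claude -p ...`)."""
    result = subprocess.run(
        build_claude_command(prompt, **kwargs),
        input=piped_input,
        capture_output=True,
        text=True,
        check=True,  # Raise if the CLI exits with a non-zero status
    )
    return result.stdout
```

Keeping command construction separate from execution makes the argv easy to log or unit-test in CI before anything is actually run.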

1.3 Extended Thinking Architecture

Extended thinking lets Claude work through a complex problem step by step before responding, which tends to improve answers on reasoning-heavy tasks.

Status: In Claude Code, extended thinking is enabled by default with a budget of up to 31,999 tokens. In the Claude API, it is off unless the request includes a thinking parameter.

How Extended Thinking Works

In Claude Code, Claude uses thinking automatically when beneficial (up to 31,999 tokens)
Claude reasons through the problem before answering
The reasoning is returned in thinking blocks alongside the response
The final answer is generated from that reasoning

Controlling Extended Thinking

Toggle in CLI:

# Toggle with keyboard shortcut
Option+T (Mac) or Alt+T (Windows/Linux)

Configure globally:

# Within Claude session
/config
# Set extended thinking preferences

Environment variable:

# Set maximum thinking tokens
export MAX_THINKING_TOKENS=10000

In settings.json:

{
  "alwaysThinkingEnabled": true,
  "env": {
    "MAX_THINKING_TOKENS": "10000"
  }
}

Budget Token Guide

Budget	Use Case	Example Tasks
1K-2K	Simple reasoning	Basic calculations, straightforward logic
2K-5K	Moderate complexity	Code review, refactoring planning
5K-10K	Complex analysis	Architecture design, complex debugging
10K-32K	Deep reasoning	System design, multi-step analysis
32K+	Research-level	Use Batch API (can timeout on standard)
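The budget table above can be turned into request parameters. This is a minimal sketch: the complexity labels and the helper name `thinking_params` are hypothetical, and each budget is the upper bound of its row. It assumes the API's constraints that `budget_tokens` is at least 1024 and smaller than `max_tokens`.

```python
MIN_BUDGET = 1024  # Assumed API minimum for budget_tokens

# Upper bound of each row in the budget guide (labels are illustrative)
BUDGET_BY_COMPLEXITY = {
    "simple": 2_000,
    "moderate": 5_000,
    "complex": 10_000,
    "deep": 32_000,
}


def thinking_params(complexity: str, answer_tokens: int = 4_000) -> dict:
    """Return max_tokens and thinking settings for a Messages API request."""
    budget = max(MIN_BUDGET, BUDGET_BY_COMPLEXITY[complexity])
    return {
        # Leave room for the final answer on top of the thinking budget
        "max_tokens": budget + answer_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget},
    }
```

The returned dict can be unpacked into a request, for example `client.messages.create(model=..., messages=..., **thinking_params("complex"))`.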

Extended Thinking Example (API)

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

// In the API, extended thinking is off unless the thinking parameter is set
// Note: max_tokens must be larger than budget_tokens
const message = await client.messages.create({
  model: "claude-sonnet-4-5-20250929",
  max_tokens: 16000,
  thinking: {
    type: "enabled",      // Turns extended thinking on for this request
    budget_tokens: 10000, // Required when enabled; minimum 1024
  },
  messages: [
    {
      role: "user",
      content: "Design a scalable microservices architecture for an e-commerce platform",
    },
  ],
});

// Response includes thinking blocks + final answer
console.log(message.content);
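Because the response mixes thinking blocks with the final answer, it often helps to separate them. The sketch below works on the raw JSON shape of `content` (a list of blocks with a `type` field). The helper name `split_content` and the sample data are hypothetical.

```python
from typing import Dict, List


def split_content(content: List[dict]) -> Dict[str, List[str]]:
    """Separate reasoning text from answer text in a response's content list."""
    result: Dict[str, List[str]] = {"thinking": [], "text": []}
    for block in content:
        kind = block.get("type")
        if kind == "thinking":
            result["thinking"].append(block.get("thinking", ""))
        elif kind == "text":
            result["text"].append(block.get("text", ""))
        # Other block types (for example redacted_thinking or tool_use) are skipped
    return result


# Illustrative sample, not real model output
sample = [
    {"type": "thinking", "thinking": "Compare sync vs. async messaging..."},
    {"type": "text", "text": "Use an event bus between services."},
]
parts = split_content(sample)
```

Logging `parts["thinking"]` separately keeps the user-facing answer in `parts["text"]` clean.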

Disable extended thinking (when not needed):

const message = await client.messages.create({
  model: "claude-sonnet-4-5-20250929",
  max_tokens: 1024,
  thinking: {
    type: "disabled"  // Disable for simple tasks
  },
  messages: [{ role: "user", content: "Format this code" }]
});

When Extended Thinking Helps Most

✅ Extended thinking excels at:

Architecture and system design decisions
Complex algorithmic problems
Multi-step reasoning tasks
Security analysis and threat modeling
Code refactoring strategies
Research and analysis

✅ Can disable for:

Simple code completions
Quick documentation
Straightforward refactors
High-volume batch operations (to reduce cost)
Real-time chat (to reduce latency)

Note: In Claude Code, extended thinking is applied automatically: Claude uses it when beneficial and skips it for simple tasks. API requests use it only when the thinking parameter is set.

Cost Impact

Extended thinking uses output tokens for the thinking process:

Thinking tokens billed as output tokens
Default budget: up to 31,999 tokens
Automatically scaled based on task complexity
Tip: Use MAX_THINKING_TOKENS env var to limit if needed
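Since thinking tokens are billed as output tokens, a rough cost estimate is simple arithmetic. In this sketch the price is a parameter, and the $15 per million output tokens used in the example is an assumed placeholder, so check current pricing for your model. The function name `thinking_cost_usd` is hypothetical.

```python
def thinking_cost_usd(thinking_tokens: int, output_price_per_mtok: float) -> float:
    """Cost of the thinking portion of a response, billed at the output rate."""
    return thinking_tokens / 1_000_000 * output_price_per_mtok


# Assumed price of $15 per million output tokens (placeholder, not a quote)
typical = thinking_cost_usd(10_000, 15.0)     # a 10K-token budget fully used
worst_case = thinking_cost_usd(31_999, 15.0)  # the default ceiling fully used
```

Under that assumed price, a fully used 10K budget adds about $0.15 per request, which shows why capping MAX_THINKING_TOKENS matters for high-volume workloads.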

1.4 Vision Capabilities

All current Claude models support vision (text + image input), enabling powerful code generation from visual inputs.

Vision Use Cases for Developers

Screenshot to Code
- UI mockups → React/Vue components
- Wireframes → HTML/CSS
- Design files → implementation
Diagram Analysis
- Architecture diagrams → implementation guidance
- Flow charts → code logic
- Database schemas → migrations
Form Understanding
- PDF forms → data extraction
- Document layouts → parsing logic
- UI screenshots → test automation

Vision API Example

import anthropic
import base64

client = anthropic.Anthropic()

# Read image file
with open("mockup.png", "rb") as image_file:
    image_data = base64.standard_b64encode(image_file.read()).decode("utf-8")

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Convert this UI mockup to React components with Tailwind CSS"
                }
            ],
        }
    ],
)

print(message.content)
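The image block in the example above can be produced by a small helper that also picks the media type from the file extension. This is a sketch under stated assumptions: the helper name `image_block` is hypothetical, and the extension map covers the image formats assumed to be accepted by the API (JPEG, PNG, GIF, WebP).

```python
import base64
from pathlib import Path

# Assumed set of supported formats, keyed by file extension
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_block(path: str) -> dict:
    """Build a base64 image content block for a Messages API request."""
    suffix = Path(path).suffix.lower()
    if suffix not in MEDIA_TYPES:
        raise ValueError(f"Unsupported image type: {suffix or '(none)'}")
    data = base64.standard_b64encode(Path(path).read_bytes()).decode("utf-8")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": MEDIA_TYPES[suffix],
            "data": data,
        },
    }
```

The returned dict can be used as the first item of the `content` list, followed by the text instruction.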

1.5 Model Context Protocol (MCP)

MCP enables Claude to connect to external data sources and tools, expanding its capabilities well beyond code analysis.

MCP Architecture

graph LR
    Claude[Claude Code] --> MCP[MCP Client]
    MCP --> GoogleDrive[Google Drive<br/>Design Docs]
    MCP --> Jira[Jira<br/>Tickets]
    MCP --> Figma[Figma<br/>Designs]
    MCP --> Slack[Slack<br/>Conversations]
    MCP --> Custom[Custom Tools<br/>Internal APIs]

Working with MCP

Primary Method: CLI Commands

# Add MCP servers by transport type
claude mcp add --transport http <name> <url>
claude mcp add --transport stdio <name> -- <command> [args]

# Manage servers
claude mcp list              # List all configured servers
claude mcp get <name>        # Show server configuration
claude mcp remove <name>     # Remove a server
# To check connection status, run /mcp inside a Claude Code session

Quick Example:

# Add GitHub MCP server
claude mcp add --transport http github https://api.githubcopilot.com/mcp/

# Add Jira via stdio
claude mcp add --transport stdio jira \
  --env JIRA_TOKEN="${JIRA_TOKEN}" \
  -- npx -y @modelcontextprotocol/server-jira

# Inspect the server entry
claude mcp get github

For detailed CLI usage, transport types, and scope management, see Section 2.4: MCP Server Configuration.

Common MCP Integrations

MCP Server	Use Case	Benefit
Google Drive	Access design docs, specs	Real-time context from docs
Jira	Create/update tickets	Automated issue tracking
Figma	Read design files	Design-to-code workflows
Slack	Search conversations	Team knowledge access
GitHub	Repo analysis	Deep codebase understanding
Custom	Internal tools	Extend capabilities

MCP Configuration Example

User MCP servers are configured in ~/.claude.json:

{
  "mcpServers": {
    "gdrive": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-gdrive"],
      "env": {
        "GOOGLE_OAUTH_TOKEN": "${GDRIVE_TOKEN}"
      }
    }
  }
}

Project MCP servers are configured in .mcp.json (project root):

{
  "mcpServers": {
    "jira": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-jira"],
      "env": {
        "JIRA_URL": "https://company.atlassian.net",
        "JIRA_API_TOKEN": "${JIRA_TOKEN}"
      }
    }
  }
}

Note: The JSON examples above show the configuration structure generated by CLI commands. For most use cases, use claude mcp add rather than editing JSON files directly. See Section 2.4 for complete CLI documentation.
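The `${VAR}` placeholders in the JSON above are resolved from the environment when the configuration is loaded. The sketch below illustrates that idea only; it is not Claude Code's actual implementation. The function name `expand` is hypothetical, and the `${VAR:-default}` fallback form is an assumption about the supported syntax.

```python
import os
import re

# Matches ${NAME} and ${NAME:-default}
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand(value, env=None):
    """Recursively replace ${VAR} placeholders in a parsed JSON config."""
    env = os.environ if env is None else env
    if isinstance(value, dict):
        return {key: expand(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand(item, env) for item in value]
    if isinstance(value, str):
        def substitute(match):
            name, default = match.group(1), match.group(2)
            if name in env:
                return env[name]
            if default is not None:
                return default
            raise KeyError(f"environment variable {name} is not set")
        return _VAR.sub(substitute, value)
    return value  # Numbers, booleans, and null pass through unchanged
```

Failing loudly on a missing variable is a deliberate choice in this sketch: a silently empty token tends to surface later as a confusing authentication error.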

1.6 Skills System

The Skills system provides just-in-time knowledge loading, enabling Claude to access specialized domain knowledge without bloating the context window.

Skills Architecture

Progressive Disclosure Model:

Metadata - Lightweight discoverability
SKILL.md - Core instructions loaded on-demand
Resources - Scripts, references, assets loaded if needed

Skills vs. Regular Prompts

Aspect	Regular Prompt	Skill
Reusability	One-time use	Reusable across projects
Context efficiency	Always in context	Loaded on-demand
Distribution	Copy-paste	Git repository / local files
Versioning	Manual	Git-based
Team sharing	Manual	Shared via `.claude/skills/`

Example: Security Review Skill

---
name: security-review
description: Comprehensive security vulnerability analysis
allowed-tools: Read, Grep
model: sonnet
---

# Security Review Skill

Perform comprehensive security review checking for:

**Common Vulnerabilities:**
- SQL injection risks
- Cross-site scripting (XSS)
- Authentication/authorization issues
- Insecure data handling
- Hardcoded secrets

**For each issue:**
1. File and line number
2. Severity (Critical/High/Medium/Low)
3. Description
4. Recommended fix

Prioritize by severity.

Usage:

claude "/security-review"
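The progressive disclosure model can be illustrated by splitting a SKILL.md file into its lightweight metadata and its body, so that only the metadata needs to be read up front. This is a simplified sketch, not how Claude Code parses skills: the function name `split_skill` is hypothetical and it handles only flat `key: value` frontmatter.

```python
from typing import Dict, Tuple


def split_skill(text: str) -> Tuple[Dict[str, str], str]:
    """Split SKILL.md text into (frontmatter metadata, instruction body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # No frontmatter: treat everything as body
    # Find the closing '---' delimiter after the opening one
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
        None,
    )
    if end is None:
        return {}, text
    meta: Dict[str, str] = {}
    for line in lines[1:end]:
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body
```

A loader built on this could index every skill by `meta["description"]` and read the body only when a task matches, which is the context saving the Skills table describes.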

1.7 Context Windows & Token Management

Understanding context windows helps you optimize costs and performance.

Context Window Sizes

Mode	Size	Use Case
Standard	200K tokens	Most development tasks
Batch	200K tokens per request	High-volume asynchronous processing

Token Usage Guide:

Average code file: 500-2000 tokens
Large file (1000 lines): ~3000-5000 tokens
CLAUDE.md (recommended): <5000 tokens
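The sizes above suggest a quick pre-flight check before sending many files in one request. This sketch assumes roughly 4 characters per token, which is a coarse heuristic rather than the real tokenizer, and a 200K window. The names `estimate_tokens` and `fits_in_context` are hypothetical.

```python
from typing import Iterable

CONTEXT_WINDOW = 200_000  # Standard context window in tokens
CHARS_PER_TOKEN = 4       # Assumption: rough average, not an exact tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate token count, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # Ceiling division


def fits_in_context(texts: Iterable[str], reserve_for_output: int = 16_000) -> bool:
    """Check whether the combined texts leave room for the response."""
    total = sum(estimate_tokens(text) for text in texts)
    return total + reserve_for_output <= CONTEXT_WINDOW
```

For exact numbers, count tokens with the API rather than this heuristic; the estimate is only meant to catch obvious overflows early.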

Prompt Caching (Cost Optimization)

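Save up to 90% on repeated context:

Cache common context (CLAUDE.md, dependencies)
First call: writes the cache, billed at a premium over the base input price
Cached calls: about 10% of the base input price for the cached portion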

const message = await client.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 1024,
  system: [
    {
      type: "text",
      text: "Project context...", // This gets cached
      cache_control: { type: "ephemeral" }
    }
  ],
  messages: [{ role: "user", content: "Question..." }],
});
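The savings from caching can be estimated with a few lines of arithmetic. This sketch assumes cache writes cost 1.25x and cache reads 0.10x the base input price, and that every call lands while the cache entry is still live. Both multipliers and the $3 per million tokens price are assumptions to verify against current pricing, and the function name `caching_costs` is hypothetical.

```python
from typing import Tuple


def caching_costs(context_tokens: int, calls: int, price_per_mtok: float,
                  write_mult: float = 1.25, read_mult: float = 0.10) -> Tuple[float, float]:
    """Return (cost without caching, cost with caching) for the shared context."""
    if calls < 1:
        raise ValueError("calls must be at least 1")
    base = context_tokens / 1_000_000 * price_per_mtok
    without_cache = base * calls
    # First call writes the cache; the remaining calls read from it
    with_cache = base * write_mult + base * read_mult * (calls - 1)
    return without_cache, with_cache


# 50K tokens of project context reused across 10 calls at an assumed $3/MTok
plain, cached = caching_costs(50_000, 10, 3.0)
savings = 1 - cached / plain
```

Under these assumptions the ten calls cost about $0.32 instead of $1.50. The saving approaches 90% only as the number of cached calls grows, because the first call pays the write premium.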

Having issues? Troubleshooting (MCP, extended thinking, context overflow).

1.8 Mental Model Summary

The Claude Ecosystem in One Picture

graph TB
    subgraph Users
        Dev[Developers]
    end
    
    subgraph Interfaces
        CLI[CLI<br/>Terminal First]
        VSCode[VS Code<br/>GUI]
        API[API<br/>Programmatic]
        SDK[Agent SDK<br/>Custom Agents]
    end
    
    subgraph Intelligence
        Haiku[Claude Haiku 4.5<br/>Fast/Cheap]
        Sonnet[Claude Sonnet 4.5<br/>Balanced]
        Opus[Claude Opus 4.5<br/>Max Intelligence]
        Extended[Extended Thinking<br/>Deep Reasoning]
    end
    
    subgraph Extensions
        MCP[MCP<br/>External Data]
        Skills[Skills<br/>Domain Knowledge]
        Vision[Vision<br/>Images→Code]
    end
    
    Dev --> CLI
    Dev --> VSCode
    Dev --> API
    Dev --> SDK
    
    CLI --> Sonnet
    VSCode --> Sonnet
    API --> Haiku
    API --> Sonnet
    API --> Opus
    SDK --> Sonnet
    SDK --> Extended
    
    Sonnet --> MCP
    Sonnet --> Skills
    Sonnet --> Vision

Key Principles

Choose the right tool: CLI for automation, API for integration, SDK for agents
Choose the right model: Claude Haiku 4.5 for speed, Claude Sonnet 4.5 for balance, Claude Opus 4.5 for complexity
Use extended thinking: For complex reasoning (5K-10K budget typical)
Leverage prompt caching: Save up to 90% on repeated context
Use MCP: Connect to external data and tools
Create skills: Reusable domain knowledge
Vision for UI: Screenshots → code

Quick Reference: Common Scenarios

Scenario	Tool	Model	Features
Daily feature development	CLI	Claude Sonnet 4.5	Standard context
Architecture design	CLI/API	Claude Opus 4.5	Extended thinking 10K
High-volume API calls	API	Claude Haiku 4.5	Batch processing
Screenshot to code	CLI/API	Claude Sonnet 4.5	Vision
Custom automation	Agent SDK	Claude Sonnet 4.5	Custom tools
CI/CD integration	CLI	Claude Haiku 4.5	Non-interactive
Complex debugging	CLI	Claude Sonnet 4.5	Extended thinking 5K
Full repo analysis	API	Claude Sonnet 4.5	Batch API (split across requests)
