
Startup Behavior

C3 may run jobs on capacity that is already available, or it may provision new capacity when existing machines are busy or a requested GPU type is not immediately available.

Cold Provisioning

Traditional cloud GPU workflows require spinning up a fresh VM for every job:

  1. Request VM from cloud provider
  2. Wait for allocation (1-10 min)
  3. Boot the VM (1-5 min)
  4. Initialize GPU drivers (1-10 min)
  5. Download your code (seconds to minutes)
  6. Finally run your job

This can add 5-45 minutes before your code starts, depending on provider availability and hardware type.
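As a rough check on the itemized steps above, the sketch below sums the three steps that carry a numeric range in minutes. Code download and the job itself are left out because the list gives no numeric bounds for them. The names `STEP_RANGES_MIN` and `total_range` are illustrative only and are not part of C3.

```python
# Per-step wait ranges in minutes, copied from the numbered list above.
# Steps without a numeric range (code download, the job itself) are omitted.
STEP_RANGES_MIN = {
    "allocation": (1, 10),
    "vm_boot": (1, 5),
    "gpu_driver_init": (1, 10),
}


def total_range(ranges):
    """Sum the lower bounds and the upper bounds across all steps."""
    low = sum(lo for lo, _ in ranges.values())
    high = sum(hi for _, hi in ranges.values())
    return low, high


low, high = total_range(STEP_RANGES_MIN)
print(f"Itemized steps: {low}-{high} minutes before code download")
```

The itemized steps sum to a narrower range than the overall 5-45 minute figure, which is stated separately and varies with provider availability and hardware type.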

Available Capacity

C3 can assign jobs to GPUs that are already available. On these machines:

  • The VM is already booted and initialized
  • GPU drivers are loaded and ready
  • The C3 agent is running and polling for work
  • The network is configured with fast access to our control plane

When you submit a job, the scheduler first looks for compatible available capacity, then provisions new capacity if needed:

┌──────────────────────────────────────────────────┐
│             C3 CAPACITY ARCHITECTURE             │
└──────────────────────────────────────────────────┘

                                   ┌─────────────┐
                                   │  Your Job   │
                                   │  c3 deploy  │
                                   └──────┬──────┘
                                          │
                                          ▼
┌──────────┐                     ┌─────────────────┐
│          │    Job submitted    │                 │
│   You    │ ──────────────────▶ │   C3 Control    │
│          │   allocation < 1s   │      Plane      │
└──────────┘                     └────────┬────────┘
                                          │
                                          ▼
                               ┌──────────────────────┐
                               │   GPU PROVIDER(S)    │
                               │  ┌────┐  ┌────┐      │
                               │  │GPU │  │GPU │  ... │
                               │  │ ✓  │  │ ✓  │      │
                               │  └────┘  └────┘      │
                               │  Available Capacity  │
                               └──────────┬───────────┘
                                          │
                                          ▼
                                ┌───────────────────┐
                                │    Job runs on    │
                                │  first available  │
                                │        GPU        │
                                └───────────────────┘
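The placement decision described in this section (try compatible available capacity first, otherwise provision new capacity) could be sketched roughly as follows. This is a minimal illustration, not C3's scheduler: `Machine`, `Placement`, `place_job`, and the GPU type strings are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Machine:
    machine_id: str
    gpu_type: str
    busy: bool = False


@dataclass
class Placement:
    machine_id: Optional[str]  # None when new capacity has to be provisioned
    path: str                  # "available" or "cold"


def place_job(requested_gpu: str, machines: List[Machine]) -> Placement:
    """Assign the first compatible idle machine, else fall back to cold provisioning."""
    for machine in machines:
        if machine.gpu_type == requested_gpu and not machine.busy:
            machine.busy = True  # reserve it so the next job cannot take it
            return Placement(machine.machine_id, "available")
    # No compatible idle machine: the job waits for newly provisioned capacity.
    return Placement(None, "cold")


# Illustrative pool; the GPU type names are placeholders.
pool = [
    Machine("m1", "A100", busy=True),
    Machine("m2", "A100"),
    Machine("m3", "T4"),
]
first = place_job("A100", pool)   # m2 is idle and compatible
second = place_job("A100", pool)  # every A100 is now busy
```

In this sketch the second request takes the cold path because the only idle compatible machine was reserved by the first.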

Available vs New Capacity

Metric            Available capacity                     New capacity
Allocation time   Seconds                                5-45 minutes
Total startup     Short startup after assignment         5-45 minutes
When used         Compatible machine already available   No compatible machine is available

Assignment Flow

When compatible capacity is available, assignment is fast:

  1. You submit → Job hits the C3 control plane
  2. Assignment → Scheduler finds a compatible idle GPU and assigns your job
  3. Agent pickup (~5s) → Agent picks up the job on its next heartbeat
  4. Code download → Bundle is fetched and extracted
  5. Your code runs → Execution begins, logs stream back immediately
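The agent pickup step above can be pictured as a simple polling loop: on each heartbeat the agent checks whether a job has been assigned to it and, if so, starts work. The sketch below simulates this with an in-memory queue. The 5-second interval is an assumption taken from the "~5s" figure, and `run_agent`, `handle_job`, and the job dictionary shape are hypothetical, not the C3 agent's actual interface.

```python
import queue
import time

HEARTBEAT_SECONDS = 5  # assumption, based on the "~5s" pickup figure above


def run_agent(assignments, handle_job, max_heartbeats, sleep=time.sleep):
    """Check for assigned work once per heartbeat; return how many jobs were handled."""
    handled = 0
    for _ in range(max_heartbeats):
        try:
            job = assignments.get_nowait()
        except queue.Empty:
            job = None  # nothing assigned on this heartbeat
        if job is not None:
            # Stands in for: fetch the bundle, extract it, run it, stream logs.
            handle_job(job)
            handled += 1
        sleep(HEARTBEAT_SECONDS)
    return handled


# Simulated run: one assigned job, two heartbeats, and a no-op sleep so the
# example finishes instantly.
assignments = queue.Queue()
assignments.put({"job_id": "job-1"})
ran = []
count = run_agent(assignments, ran.append, max_heartbeats=2, sleep=lambda _: None)
```

With polling, the worst-case pickup delay is about one heartbeat interval, which is consistent with the pickup time quoted in step 3.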

Capacity Scaling

C3 capacity scales with demand. The control plane tracks pending jobs and available provider inventory:

  • Low demand: Existing capacity is reused when available
  • High demand: Capacity scales up to meet job volume, subject to provider availability and spend controls
  • Burst load: Overflow goes to cold provisioning for currently available profiles (still works, just slower)
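One way to read the scaling rules above is as a bounded calculation: provision enough machines to cover the pending jobs that idle capacity cannot absorb, capped by what the provider has and what the spend controls allow. The function below is a sketch under that reading. All parameter names and the hourly-budget model of spend controls are assumptions, not C3's actual policy.

```python
def machines_to_provision(
    pending_jobs,
    idle_machines,
    provider_inventory,
    hourly_budget_left,
    hourly_cost_per_machine,
):
    """Return how many new machines to request, given demand, supply, and budget."""
    if hourly_cost_per_machine <= 0:
        raise ValueError("hourly_cost_per_machine must be positive")
    # Jobs that existing idle capacity cannot absorb.
    shortfall = max(0, pending_jobs - idle_machines)
    # Machines the remaining budget can pay for (assumed spend-control model).
    affordable = int(hourly_budget_left // hourly_cost_per_machine)
    return min(shortfall, provider_inventory, affordable)


low_demand = machines_to_provision(2, 5, 8, 100.0, 5.0)        # idle capacity suffices
budget_capped = machines_to_provision(10, 4, 8, 20.0, 5.0)     # limited by budget
inventory_capped = machines_to_provision(10, 0, 3, 1000.0, 1.0)  # limited by provider
```

Jobs left over after the cap is applied would simply wait longer, which matches the "still works, just slower" behavior described for burst load.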

When New Provisioning Happens

Sometimes jobs need new capacity:

  • Current capacity busy: Compatible GPUs are already running other jobs
  • Rare GPU type: Specialized hardware is not immediately available
  • Unusual demand patterns: Spikes exceed current available capacity

For currently available GPU profiles, cold provisioning still works: your job will run, but it takes longer to start. C3 falls back to cold provisioning automatically when needed.

Best Practices

Keep Jobs Easy to Place

  1. Use common GPU profiles: They are easier to place quickly
  2. Keep jobs small: They finish faster and free capacity for other work
  3. Use --follow: See real-time logs as your job runs

Understand the Timing

  • Allocation time: How long from submission until a GPU is assigned
  • Total startup: Time from submission until your code starts running
  • Runtime: Your actual code execution
  • Total time: Everything from submit to completion
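The four timing terms above are related by simple subtraction over a job's lifecycle timestamps. The sketch below makes those relationships explicit. `JobTimestamps` and `timing_breakdown` are hypothetical names, and the timestamps are assumed to be seconds on a common clock.

```python
from dataclasses import dataclass


@dataclass
class JobTimestamps:
    submitted: float  # job submitted
    assigned: float   # GPU assigned
    started: float    # user code begins running
    finished: float   # job completes


def timing_breakdown(t: JobTimestamps) -> dict:
    """Derive the four timing metrics from lifecycle timestamps."""
    return {
        "allocation_time": t.assigned - t.submitted,
        "total_startup": t.started - t.submitted,
        "runtime": t.finished - t.started,
        "total_time": t.finished - t.submitted,
    }


# Illustrative values only: assigned after 1s, code starts at 12s, ends at 72s.
example = timing_breakdown(JobTimestamps(0.0, 1.0, 12.0, 72.0))
```

Total time always equals total startup plus runtime, and allocation time is one component of total startup.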

The Development Experience

The normal workflow is:

┌──────────────────────────────────────────────┐
│          ITERATIVE GPU DEVELOPMENT           │
└──────────────────────────────────────────────┘

┌──────────┐      ┌──────────┐      ┌──────────┐
│   Edit   │      │  Submit  │      │   See    │
│   Code   │ ───▶ │   Job    │ ───▶ │ Results  │
│ Locally  │      │ c3 deploy│      │ quickly  │
└──────────┘      └──────────┘      └──────────┘
      ▲                                   │
      │                                   │
      └───────────────────────────────────┘
                     Iterate

This is useful for:

  • ML experimentation — Try different hyperparameters quickly
  • Debugging — Add print statements, see output fast
  • Prototyping — Test ideas without provisioning overhead
  • Education — Learn GPU programming interactively