Warm Pool
C3's warm pool is the key to achieving a "mounted-like" GPU development experience. Instead of waiting minutes for a VM to provision, your code starts running in seconds.
The Problem with Cold Provisioning
Traditional cloud GPU workflows require spinning up a fresh VM for every job:
- Request VM from cloud provider
- Wait for allocation (30-60s)
- Boot the VM (30-60s)
- Initialize GPU drivers (30-60s)
- Download your code (10-30s)
- Finally run your job
This adds up to 2-5 minutes of waiting before your code even starts. For iterative development—tuning hyperparameters, debugging training loops, testing model changes—this latency kills productivity.
How the Warm Pool Works
C3 maintains a fleet of pre-provisioned GPU VMs. These machines are:
- Already booted and initialized
- GPU drivers loaded and ready
- C3 agent running and polling for work
- Network configured with fast access to our control plane
When you submit a job, the scheduler finds an idle warm GPU and assigns it inline—allocation takes under a second, with total startup of ~5-15 seconds (agent heartbeat + code download):
┌─────────────────────────────────────────────────────────────────────────────┐
│                          C3 WARM POOL ARCHITECTURE                          │
└─────────────────────────────────────────────────────────────────────────────┘

                              ┌─────────────┐
                              │  Your Job   │
                              │  c3 deploy  │
                              └──────┬──────┘
                                     │
                                     ▼
┌──────────┐                ┌─────────────────┐
│          │  Job submitted │                 │
│   You    │ ─────────────▶ │   C3 Control    │
│          │ allocation < 1s│      Plane      │
└──────────┘                └────────┬────────┘
                                     │
                                     ▼
                          ┌──────────────────────┐
                          │   GPU PROVIDER(S)    │
                          │   ┌────┐  ┌────┐     │
                          │   │GPU │  │GPU │ ... │
                          │   │ ✓  │  │ ✓  │     │
                          │   └────┘  └────┘     │
                          │       Warm Pool      │
                          └──────────┬───────────┘
                                     │
                                     ▼
                           ┌───────────────────┐
                           │   Job runs on     │
                           │  first available  │
                           │     warm GPU      │
                           └───────────────────┘
Warm Path vs Cold Path
| Metric | Warm Pool | Cold Provision |
|---|---|---|
| Allocation time | < 1 second | ~2-5 minutes |
| Total startup | ~5-15 seconds | ~2-5 minutes |
| When used | Most jobs | Pool exhausted / rare GPU types |
Sub-Second Allocation
When a warm GPU is available, job allocation is nearly instantaneous:
- You submit → Job hits the C3 control plane
- Sub-second allocation → Scheduler finds an idle warm GPU and assigns your job
- Agent pickup (~5s) → Agent picks up the job on its next heartbeat
- Code download → Bundle is fetched and extracted
- Your code runs → Execution begins, logs stream back immediately
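The stages above can be summed into a rough startup budget. The per-stage durations below are illustrative midpoints consistent with the ranges quoted on this page, not measured values.

```python
# Rough warm-path startup budget, in seconds. Stage names and values
# are illustrative, chosen to sit inside the ~5-15 s total quoted above.
WARM_PATH_STAGES = {
    "allocation": 0.5,      # scheduler assigns an idle warm GPU
    "agent_pickup": 5.0,    # agent sees the job on its next heartbeat
    "code_download": 3.0,   # bundle fetched and extracted
}

def startup_seconds(stages: dict[str, float]) -> float:
    """Total startup: submission until your code starts running."""
    return sum(stages.values())

total = startup_seconds(WARM_PATH_STAGES)  # 8.5 s, within the 5-15 s range
```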
This transforms GPU development from a batch workflow into an interactive one. It feels like having a GPU mounted to your local machine.
Pool Scaling
The warm pool scales with demand. C3 uses a baseline target per GPU profile plus demand-based scaling:
- Low demand: Pool holds a baseline number of warm GPUs
- High demand: Pool scales up to meet job volume (with a configurable cap)
- Burst load: Overflow goes to cold provisioning (still works, just slower)
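The scaling policy described above can be modeled as a clamped target size: a per-profile baseline, plus demand, never exceeding the cap. The baseline and cap numbers, and the one-warm-GPU-per-queued-job heuristic, are hypothetical; C3's real policy may weigh demand differently.

```python
def target_pool_size(baseline: int, pending_jobs: int, cap: int) -> int:
    """Baseline warm GPUs plus demand-based scaling, clamped to a cap.

    Demand beyond the cap overflows to cold provisioning.
    """
    demand_driven = baseline + pending_jobs   # one warm GPU per queued job
    return min(demand_driven, cap)

# Low demand: pool sits at the baseline.
target_pool_size(baseline=4, pending_jobs=0, cap=16)   # -> 4
# High demand: pool scales up, but never past the configurable cap.
target_pool_size(baseline=4, pending_jobs=20, cap=16)  # -> 16
```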
When Cold Provisioning Happens
Sometimes jobs use cold provisioning instead:
- Pool exhausted: All warm GPUs are busy during high demand
- Rare GPU type: Specialized hardware not kept warm
- Unusual demand patterns: Spikes beyond the baseline pool capacity
Cold provisioning still works—your job will run, it just takes longer to start. C3 automatically falls back to cold provisioning when needed.
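The fallback rules above amount to a small decision function: cold when the requested GPU type isn't kept warm, cold when the pool is exhausted, warm otherwise. The function name and the "H100" example of a non-warm GPU type are hypothetical; only L40 is named elsewhere on this page as the warm-pool type.

```python
def choose_path(idle_warm: int, gpu_profile: str, warm_profiles: set[str]) -> str:
    """Pick the provisioning path for a job, per the fallback rules above."""
    if gpu_profile not in warm_profiles:
        return "cold"        # rare GPU type: specialized hardware not kept warm
    if idle_warm == 0:
        return "cold"        # pool exhausted during high demand
    return "warm"

choose_path(idle_warm=3, gpu_profile="L40", warm_profiles={"L40"})   # -> "warm"
choose_path(idle_warm=0, gpu_profile="L40", warm_profiles={"L40"})   # -> "cold"
choose_path(idle_warm=3, gpu_profile="H100", warm_profiles={"L40"})  # -> "cold"
```

Either way the job runs; the choice only affects how long it waits to start.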
Best Practices
Maximize Warm Pool Benefits
- Use L40 GPUs — Currently the only GPU type kept warm in the pool
- Keep jobs small — Finish faster, return GPU to pool for others
- Use --follow — See real-time logs as your job runs
- Submit during active hours — Pool is warmest when others are using it too
Understand the Timing
- Allocation time: How long from submission until a GPU is assigned (sub-second for warm, minutes for cold)
- Total startup: Time from submission until your code starts running (~5-15 seconds for warm, minutes for cold)
- Runtime: Your actual code execution
- Total time: Everything from submit to completion
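As a worked example of how these terms compose (with illustrative numbers, not benchmarks): the same 60-second run started on a warm GPU versus a cold-provisioned one.

```python
def end_to_end(total_startup: float, runtime: float) -> float:
    """Total time: everything from submit to completion, in seconds."""
    return total_startup + runtime

# Runtime is identical either way; only the startup overhead differs.
warm = end_to_end(total_startup=10, runtime=60)    # -> 70
cold = end_to_end(total_startup=180, runtime=60)   # -> 240
```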
The Development Experience
With the warm pool, your workflow becomes:
┌─────────────────────────────────────────────────────────────────┐
│                    ITERATIVE GPU DEVELOPMENT                    │
└─────────────────────────────────────────────────────────────────┘

   ┌──────────┐      ┌──────────┐      ┌──────────┐
   │   Edit   │      │  Submit  │      │   See    │
   │   Code   │ ───▶ │   Job    │ ───▶ │ Results  │
   │ Locally  │      │ c3 deploy│      │ quickly  │
   └──────────┘      └──────────┘      └──────────┘
        ▲                                    │
        │                                    │
        └────────────────────────────────────┘
                   Iterate rapidly
This is especially powerful for:
- ML experimentation — Try different hyperparameters quickly
- Debugging — Add print statements, see output fast
- Prototyping — Test ideas without provisioning overhead
- Education — Learn GPU programming interactively
The warm pool makes cloud GPUs feel local.