# Startup Behavior

C3 may run jobs on capacity that is already available, or it may provision new capacity when existing machines are busy or a requested GPU type is not immediately available.

## Cold Provisioning

Traditional cloud GPU workflows require spinning up a fresh VM for every job:

1. **Request VM** from cloud provider
2. **Wait for allocation** (1-10 min)
3. **Boot the VM** (1-5 min)
4. **Initialize GPU drivers** (1-10 min)
5. **Download your code** (seconds to minutes)
6. **Finally run your job**

This can add **5-45 minutes** before your code starts, depending on provider availability and hardware type.

## Available Capacity

C3 can assign jobs to GPUs that are already available. On these machines:

* The machine is already booted and initialized
* GPU drivers are loaded and ready
* The C3 agent is running and polling for work
* The network is configured with fast access to the C3 control plane

When you submit a job, the scheduler first looks for compatible available capacity, then provisions new capacity if needed:

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                           C3 CAPACITY ARCHITECTURE                          │
└─────────────────────────────────────────────────────────────────────────────┘

                                    ┌─────────────┐
                                    │   Your Job  │
                                    │  c3 deploy  │
                                    └──────┬──────┘
                                           │
                                           ▼
┌──────────┐                      ┌─────────────────┐
│          │    Job submitted     │                 │
│   You    │ ──────────────────▶  │  C3 Control     │
│          │   allocation < 1s    │  Plane          │
└──────────┘                      └────────┬────────┘
                                           │
                                           ▼
                              ┌──────────────────────┐
                              │   GPU PROVIDER(S)    │
                              │  ┌────┐ ┌────┐       │
                              │  │GPU │ │GPU │  ...  │
                              │  │ ✓  │ │ ✓  │       │
                              │  └────┘ └────┘       │
                              │  Available Capacity  │
                              └──────────┬───────────┘
                                         │
                                         ▼
                              ┌───────────────────┐
                              │  Job runs on      │
                              │  first available  │
                              │  GPU              │
                              └───────────────────┘
```
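The placement decision described above can be sketched in a few lines. This is a minimal illustration, not C3's actual scheduler: the `Machine` class, the `place_job` function, and the GPU labels (`a100`, `l4`, `h100`) are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Machine:
    """Hypothetical record for one already-provisioned machine."""
    machine_id: str
    gpu_type: str
    busy: bool = False


def place_job(requested_gpu: str, machines: list[Machine]) -> tuple[str, Optional[Machine]]:
    """Return ("available", machine) if an idle compatible machine exists,
    otherwise ("provision", None) to signal that new capacity is needed."""
    for machine in machines:
        if machine.gpu_type == requested_gpu and not machine.busy:
            machine.busy = True  # reserve it so a second job cannot take it
            return "available", machine
    return "provision", None


pool = [Machine("m1", "a100", busy=True), Machine("m2", "a100"), Machine("m3", "l4")]
first = place_job("a100", pool)   # an idle a100 exists, so m2 is assigned
second = place_job("a100", pool)  # every a100 is now busy, so provision
third = place_job("h100", pool)   # no h100 in the pool, so provision
```

The second and third calls correspond to the two cases where new capacity is needed: compatible machines exist but are busy, or the requested GPU type is not in the pool at all.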

### Available vs New Capacity

| Metric              | Available capacity                             | New capacity                       |
| ------------------- | ---------------------------------------------- | ---------------------------------- |
| **Allocation time** | Seconds                                        | 5-45 minutes                       |
| **Total startup**   | Allocation plus agent pickup and code download | 5-45 minutes                       |
| **When used**       | Compatible machine already available           | No compatible machine is available |

## Assignment Flow

When compatible capacity is available, assignment is fast:

1. **You submit** → Job hits the C3 control plane
2. **Assignment** → Scheduler finds a compatible idle GPU and assigns your job
3. **Agent pickup (\~5s)** → Agent picks up the job on its next heartbeat
4. **Code download** → Bundle is fetched and extracted
5. **Your code runs** → Execution begins, logs stream back immediately
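To see where the pickup delay in step 3 comes from, here is a small sketch of heartbeat-based polling. It assumes the "\~5s" figure is the agent's heartbeat interval and that heartbeats fire at fixed multiples of that interval. Both are assumptions for illustration, and `pickup_delay` is a hypothetical helper, not part of C3.

```python
import math

HEARTBEAT_SECONDS = 5.0  # assumed interval, taken from the "~5s" pickup figure


def pickup_delay(assigned_at: float, interval: float = HEARTBEAT_SECONDS) -> float:
    """Seconds a job waits between assignment and the agent's next heartbeat.

    Heartbeats are modeled as firing at t = 0, interval, 2 * interval, ...
    """
    next_beat = math.ceil(assigned_at / interval) * interval
    return next_beat - assigned_at


just_missed = pickup_delay(0.5)   # assigned right after a heartbeat
mid_cycle = pickup_delay(12.0)    # assigned partway through a cycle
on_the_beat = pickup_delay(10.0)  # assigned exactly on a heartbeat
```

Under this model the wait is never longer than one interval, so the heartbeat contributes at most about five seconds to total startup.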

## Capacity Scaling

C3 capacity scales with demand. The control plane tracks pending jobs and available provider inventory:

* **Low demand**: Existing capacity is reused when available
* **High demand**: Capacity scales up to meet job volume, subject to provider availability and spend controls
* **Burst load**: Overflow jobs fall back to cold provisioning for currently available GPU profiles; they still run, but take longer to start
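One way to picture the scale-up decision is as a shortfall capped by the two limits named above, provider availability and spend controls. The sketch below is an assumption-laden illustration rather than C3's real policy: it treats each job as needing one machine, and the function and parameter names are hypothetical.

```python
def machines_to_provision(
    pending_jobs: int,
    idle_machines: int,
    provider_inventory: int,
    budget_remaining: float,
    cost_per_machine: float,
) -> int:
    """Number of new machines to request, assuming one job per machine."""
    # Jobs that cannot be served by capacity that already exists.
    shortfall = max(0, pending_jobs - idle_machines)
    # Spend control: how many machines the remaining budget covers.
    affordable = int(budget_remaining // cost_per_machine)
    # Never request more than the provider has or the budget allows.
    return min(shortfall, provider_inventory, affordable)


low_demand = machines_to_provision(2, 5, 10, 100.0, 10.0)       # idle capacity suffices
high_demand = machines_to_provision(8, 3, 10, 100.0, 10.0)      # scale up by the shortfall
inventory_capped = machines_to_provision(20, 0, 4, 100.0, 10.0) # provider has only 4
budget_capped = machines_to_provision(20, 0, 50, 30.0, 10.0)    # budget covers only 3
```

In the capped cases, the jobs that are not covered would wait until capacity frees up or more becomes available.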

## When New Provisioning Happens

Sometimes jobs need new capacity:

* **Current capacity busy**: Compatible GPUs are already running other jobs
* **Rare GPU type**: Specialized hardware is not immediately available
* **Unusual demand patterns**: Spikes exceed current available capacity

For currently available GPU profiles, cold provisioning still works: your job will run, but it takes longer to start. C3 falls back to cold provisioning automatically when needed.

## Best Practices

### Keep Jobs Easy to Place

1. **Use common GPU profiles**: Common profiles are easier to place quickly
2. **Keep jobs small**: Small jobs finish faster and free capacity for other work
3. **Use `--follow`**: See real-time logs as your job runs

### Understand the Timing

* **Allocation time**: How long from submission until a GPU is assigned
* **Total startup**: Time from submission until your code starts running
* **Runtime**: Your actual code execution
* **Total time**: Everything from submit to completion
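The four metrics above relate to each other through four timestamps. The sketch below makes that explicit. The `JobTimestamps` fields and the example values are hypothetical and only illustrate the definitions; they are not measurements from C3.

```python
from dataclasses import dataclass


@dataclass
class JobTimestamps:
    """Hypothetical timestamps for one job, in seconds."""
    submitted: float  # job submitted
    assigned: float   # GPU assigned
    started: float    # your code begins executing
    finished: float   # your code completes


def timing_breakdown(t: JobTimestamps) -> dict[str, float]:
    """Compute the four metrics defined in the list above."""
    return {
        "allocation_time": t.assigned - t.submitted,
        "total_startup": t.started - t.submitted,
        "runtime": t.finished - t.started,
        "total_time": t.finished - t.submitted,
    }


# Illustrative values only.
example = timing_breakdown(
    JobTimestamps(submitted=0.0, assigned=1.0, started=9.0, finished=129.0)
)
```

Note that total time is always total startup plus runtime, and allocation time is included in total startup rather than added to it.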

## The Development Experience

The normal workflow is:

```
┌─────────────────────────────────────────────────────────────────┐
│                    ITERATIVE GPU DEVELOPMENT                    │
└─────────────────────────────────────────────────────────────────┘

    ┌──────────┐      ┌──────────┐      ┌──────────┐
    │  Edit    │      │  Submit  │      │   See    │
    │  Code    │ ───▶ │   Job    │ ───▶ │ Results  │
    │ Locally  │      │ c3 deploy│      │  quickly │
    └──────────┘      └──────────┘      └──────────┘
          ▲                                   │
          │                                   │
          └───────────────────────────────────┘
                    Iterate
```

This is useful for:

* **ML experimentation**: Try different hyperparameters quickly
* **Debugging**: Add print statements and see output fast
* **Prototyping**: Test ideas without provisioning overhead
* **Education**: Learn GPU programming interactively
