# C3 Docs

> C3 is a GPU compute marketplace for academics. Users configure projects with a .c3 YAML file and run c3 deploy to provision cloud GPUs, run jobs, and return results.

- [C3 Docs](https://docs.cthree.cloud/index.md)

## artifacts

- [Artifact Output](https://docs.cthree.cloud/artifacts.md): The output of a job is its artifacts. These are things like plots produced by your job, trained neural network weights, or saved checkpoints.

## dashboard

- [Web Dashboard](https://docs.cthree.cloud/dashboard.md): The C3 web dashboard at cthree.cloud/dashboard provides a visual interface for managing your GPU compute jobs, billing, and data.

## data-mounting

- [Data Mounting](https://docs.cthree.cloud/data-mounting.md): How C3 stores data

## environment

- [Environment](https://docs.cthree.cloud/environment.md): Every C3 job runs the script

## marketplace

- [Marketplace](https://docs.cthree.cloud/marketplace.md): C3 aggregates GPU capacity from multiple data centers. When you submit a job, we find available compute at competitive rates; no need to manage cloud accounts or hunt for capacity yourself.

## submission

- [Project Configuration](https://docs.cthree.cloud/submission.md): C3 projects are configured with a .c3 YAML file at the project root. Run c3 deploy from anywhere in the project to submit a job.

## warm-pool

- [Startup Behavior](https://docs.cthree.cloud/warm-pool.md): C3 may run jobs on capacity that is already available, or it may provision new capacity when existing machines are busy or a requested GPU type is not immediately available.

---

# Full Documentation Content

# Artifact Output

The output of a job is its artifacts. These are things like plots produced by your job, trained neural network weights, or saved checkpoints.

Only files written to configured output directories are collected. By default this is `$C3_ARTIFACTS_DIR` (an `artifacts/` directory pre-created for every job). You can add more directories with the `output:` field in your `.c3` config.
Files written elsewhere in the working directory are not collected.

## Configuring output

Every job automatically collects files written to `$C3_ARTIFACTS_DIR`:

```
import json
import os

output_dir = os.environ["C3_ARTIFACTS_DIR"]  # always set, pre-created
results = {"status": "ok"}  # placeholder: replace with your job's real metrics

with open(os.path.join(output_dir, "metrics.json"), "w") as f:
    json.dump(results, f)
```

You can collect files from additional directories too:

```
output:
  - results
  - checkpoints
```

Both `$C3_ARTIFACTS_DIR` contents and `output:` directories end up in the same artifact manifest, downloadable with `c3 pull`.

## Storage usage

Artifact bytes count toward your account's tracked storage usage. When the agent registers an artifact after upload, the control plane records the byte delta against your storage balance. This means your displayed storage usage may increase after artifact uploads.

Subscriptions do not include storage quotas. Uploads are not rejected because of subscription plan storage limits. You can check your current usage with `c3 balance`.

## Artifact lifecycle

1. Your script writes files to `$C3_ARTIFACTS_DIR` and/or directories listed in `output:`
2. After the script finishes, the agent collects all output files and uploads them as content-addressed blobs
3. An artifact manifest is created at `/jobs//`
4. Artifact bytes are recorded as storage usage
5. You can download, browse, or reuse artifacts in another job

**Important:** If your script exits successfully (exit code 0) but artifact upload fails, the job is marked `FAILED` with reason `UPLOAD_ERROR`. This is non-retryable: the GPU work completed, but the output could not be saved. Check your storage configuration and job logs if you encounter this status.
When the script itself fails (non-zero exit), is canceled, or times out, artifact upload is best-effort: upload errors are logged but the job status reflects the script failure, cancellation, or timeout.

## Browsing artifacts

Artifacts are stored the same way as uploaded datasets, using the same content-addressed storage. You can browse them with the same `c3 data` commands, just at a different path.

You can reach a job's artifacts in two ways:

```
c3 data ls /jobs/job_abc123/                      # By job ID (resolves project automatically)
c3 data ls /projects/my-project/jobs/job_abc123/  # By project + job ID
```

Both paths point to the same data. Having two paths does not mean the data is stored twice.

To list all jobs with artifacts in a project:

```
c3 data ls /projects/my-project/jobs/
```

## Downloading artifacts

Download artifacts to your local machine with `c3 pull`:

```
c3 pull job_abc123   # Download a specific job's artifacts
c3 pull              # Download all new completed jobs
```

Or use `c3 data cp` for more control:

```
c3 data cp /jobs/job_abc123/ ./local-output/
```

## Reusing artifacts in another job

Because artifacts live in C3's centralised storage, you can mount them directly into a new job without downloading to your local machine first. This avoids the slow local-machine bottleneck entirely:

```
datasets:
  - ref: /jobs/job_abc123
    mount: /data/prev_weights
```

This mounts the output from `job_abc123` at `/data/prev_weights` in your new job. The data transfers server-to-server at full speed.

## Job chaining

Job chaining is mounting a previous job's artifacts directly into a new job.
This lets you build multi-stage pipelines where each stage feeds into the next, with all data staying in C3's storage network:

```
# Stage 1: Preprocess data
c3 deploy -f
# Job completes, producing job_preprocess

# Stage 2: Train model (mount preprocessed data)
# Update .c3:
#   datasets:
#     - ref: /jobs/job_preprocess
#       mount: /data/preprocessed
c3 deploy -f
# Job completes, producing job_train

# Stage 3: Evaluate (mount trained model)
# Update .c3:
#   datasets:
#     - ref: /jobs/job_train
#       mount: /data/model
c3 deploy -f
```

Each stage mounts the previous stage's output directly. Data stays in C3's storage network and never needs to touch your local machine.

---

# Web Dashboard

The C3 web dashboard at [cthree.cloud/dashboard](https://cthree.cloud/dashboard) provides a visual interface for managing your GPU compute jobs, billing, and data.

**Everything available in the dashboard is also available via the CLI.** The CLI is the primary interface for C3; the dashboard is a convenience layer that mirrors CLI functionality.

## Dashboard Pages

| Dashboard | CLI Equivalent | Description |
| --- | --- | --- |
| [Jobs](https://cthree.cloud/dashboard/squeue) | `c3 squeue` | View all jobs with status, current activity, sanitized failure details, GPU, runtime. Click any job to see details. |
| [Billing](https://cthree.cloud/dashboard/billing) | `c3 balance` / `c3 topup` | View credit balance, transaction history, and top up via Stripe. |
| [Subscription](https://cthree.cloud/dashboard/subscription) | `c3 upgrade` | View current plan, upgrade/downgrade between tiers, cancel or resume your subscription. |
| [Data](https://cthree.cloud/dashboard/data) | `c3 data ls` | Browse your datasets and storage usage. |
| [Settings](https://cthree.cloud/dashboard/settings) | `c3 apikey` | Create and manage API keys for programmatic access. |

## Login

The dashboard uses the same Auth0 authentication as the CLI. Click "Login" in the header at cthree.cloud, or navigate directly to `/dashboard` to be prompted to sign in.

Once logged in, the dashboard uses the same API and credentials as `c3 login`, so you'll see the same jobs, balance, and data in both.

New authenticated users must verify their email before dashboard-backed service APIs such as jobs, billing, data, and API keys work. Until then, service routes return `EMAIL_VERIFICATION_REQUIRED` and `whoami` reports `email_unverified`. The dashboard prompts you to resend the verification email when that denial occurs; the matching CLI action is `c3 verify-email`.

If an account has been explicitly rejected or revoked, the dashboard and CLI show the same support message.

## Key Differences from CLI

* **No job submission**: Jobs are submitted via `c3 deploy` from the CLI. The dashboard shows and manages existing jobs.
* **No data upload/download**: Data transfers use `c3 data cp`. The dashboard browses data that's already uploaded.
* **Stripe redirect**: Topping up with credits, or subscribing to Pro from Free, opens the same Stripe checkout page the CLI opens. Team is shown as `c3 upgrade team` but self-service checkout is coming soon.

---

# Data Mounting

## How C3 stores data

C3 keeps all data (datasets you upload and artifacts your jobs produce) on a centralised storage server with high-bandwidth connections to GPU nodes around the world.

Think of it like a warehouse on a motorway network: the links between the warehouse and the GPUs are fast, high-bandwidth connections.
Uploading from your local machine is the bottleneck, since home and office network connections to remote servers are much slower than server-to-server transfers. The good news is you only need to upload once. After that, C3 moves data between its storage and GPUs at full speed.

## Paths

Data in C3 can be organised by project, by job, or both:

| Path | What it contains |
| --- | --- |
| `/datasets/{name}/` | Uploaded datasets |
| `/jobs/{jobId}/` | Job output artifacts |
| `/projects/{project}/data/{name}/` | Datasets scoped to a project |
| `/projects/{project}/jobs/{jobId}/` | Job artifacts scoped to a project |

You can use whichever path style suits your workflow. `/jobs/{jobId}/` resolves the project automatically.

## Upload a dataset

```
c3 data cp ./local-data/ /datasets/my-dataset/
```

This uploads your data to C3's centralised storage. You only need to do this once. After the initial upload, every `c3 deploy` that references this dataset gets rapid access to it directly from the storage network, with no re-upload needed.

C3 uses content-addressed deduplication: each file is hashed (SHA256) before upload, and if the content already exists, the upload is skipped. This means re-uploading a dataset with minor changes only transfers the files that actually changed, and overall storage usage can be lower than standard methods since identical files are never stored twice (see [How deduplication works](#how-deduplication-works) below).
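
To make the hash-then-skip idea concrete, here is a minimal local sketch in Python. It is not C3's implementation: the function names `sha256_of` and `plan_upload`, and the idea of passing in a set of already-stored hashes, are assumptions made purely for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_upload(root: Path, existing_hashes: set[str]) -> tuple[dict[str, str], list[str]]:
    """Return (manifest, to_upload) for every file under `root`.

    `existing_hashes` is a hypothetical stand-in for "content the storage
    server already holds". The manifest maps relative path -> SHA-256 hex
    digest; `to_upload` lists only the files whose content is new.
    """
    seen = set(existing_hashes)  # copy, so the caller's set is not mutated
    manifest: dict[str, str] = {}
    to_upload: list[str] = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        file_hash = sha256_of(path)
        manifest[rel] = file_hash
        if file_hash not in seen:
            to_upload.append(rel)
            seen.add(file_hash)  # identical files later in this walk are skipped
    return manifest, to_upload
```

In this sketch every file still appears in the manifest, but only content that has not been seen before is scheduled for transfer, which is why a re-upload with a few changed files moves only those files.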
## Browse data

Use `c3 data ls` to browse datasets, versions, and files:

```
c3 data ls /datasets/                          # List all datasets
c3 data ls /datasets/my-dataset/               # List versions
c3 data ls -l /datasets/my-dataset/@latest/    # List files in latest version
```

## Mount a dataset in a job

Reference the dataset in your `.c3` config:

```
datasets:
  - ref: /datasets/my-dataset
    mount: /data/my-dataset
```

Once referenced, C3 handles moving the data to whichever GPU your job lands on. From your script's perspective, the files are simply local at the mount path. You read them like any other files:

```
import numpy as np

data = np.loadtxt("/data/my-dataset/measurements.csv", delimiter=",")
```

### Mount path rules

* Mount paths must be **absolute** (start with `/`). Relative paths are rejected at submission time with a clear error.
* If `mount` is omitted, it is auto-derived as `/data/` (e.g., `/datasets/cifar10` becomes `/data/cifar10`).
* In `.c3` YAML, a relative mount like `mydata` is auto-prefixed to `/data/mydata`.

### Local directories

You can reference a local directory as a dataset. C3 auto-uploads it before submitting the job:

```
datasets:
  - ref: ./local-data
    mount: /data/train
```

This is equivalent to running `c3 data cp ./local-data/ /datasets/...` yourself, but handled automatically.

## Versioning

Every upload creates a new version. Your jobs always get exactly the data they expect:

```
c3 data log /datasets/my-dataset/
```

```
VERSION  CREATED              FILES  SIZE
v3       2024-01-15 10:00:00  1000   2.5GB
v2       2024-01-10 09:00:00  1000   2.4GB
v1       2024-01-05 08:00:00  500    1.2GB
```

Jobs reference the latest version by default, or you can pin to a specific version for reproducibility.
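
The "latest by default, or pinned" behaviour can be illustrated with a small Python sketch that parses the sample `c3 data log` output above. This is only a local illustration, not how C3 resolves versions: the names `DatasetVersion`, `parse_log`, and `select_version` are hypothetical, and the column layout is assumed to match the sample exactly.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Sample text copied from the `c3 data log` example above.
SAMPLE_LOG = """\
VERSION  CREATED              FILES  SIZE
v3       2024-01-15 10:00:00  1000   2.5GB
v2       2024-01-10 09:00:00  1000   2.4GB
v1       2024-01-05 08:00:00  500    1.2GB
"""


@dataclass(frozen=True)
class DatasetVersion:
    name: str
    created: datetime
    files: int
    size: str


def parse_log(text: str) -> list[DatasetVersion]:
    """Parse the whitespace-separated table, skipping the header row."""
    versions = []
    for line in text.splitlines()[1:]:
        if not line.strip():
            continue
        name, date, time, files, size = line.split()
        created = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
        versions.append(DatasetVersion(name, created, int(files), size))
    return versions


def select_version(versions: list[DatasetVersion], pin: Optional[str] = None) -> DatasetVersion:
    """Return the pinned version if `pin` is given, else the most recent one."""
    if not versions:
        raise ValueError("no versions available")
    if pin is None:
        return max(versions, key=lambda v: v.created)
    for version in versions:
        if version.name == pin:
            return version
    raise KeyError(f"unknown version: {pin}")
```

With the sample log, `select_version(parse_log(SAMPLE_LOG))` picks `v3`, while passing `pin="v1"` always returns the 500-file version regardless of later uploads, which is the property that makes pinned runs reproducible.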
## How deduplication works

All data in C3 (datasets, workspaces, and job artifacts) uses the same content-addressed storage. Every file is stored as a **blob** keyed by its SHA256 hash, and a **manifest** lists which blobs make up each dataset, workspace, or set of job artifacts.

This means:

* **Cross-job dedup**: If two jobs produce identical output files, the data is stored once
* **Workspace dedup**: Re-deploying the same code skips uploading unchanged files
* **Cross-dataset dedup**: Identical files shared across datasets use the same storage
* **Instant re-uploads**: `c3 data cp` only uploads files that have actually changed

Deduplication is automatic and transparent. Artifacts still appear per-job (each job has its own listing), but identical files across jobs share storage behind the scenes.

## Data commands

| Command | Description |
| --- | --- |
| `c3 data ls /path/` | List files, datasets, or job artifacts |
| `c3 data cp SRC DST` | Copy files (upload or download) |
| `c3 data rm -r /path/` | Delete a dataset (requires `-r` for datasets) |
| `c3 data du /path/` | Show disk usage |
| `c3 data log /path/` | Show version history |

---

# Environment

Every C3 job runs the `script:` from your [`.c3` project configuration](https://docs.cthree.cloud/submission.md) as a Bash script on a GPU VM. The environment mode controls what C3 prepares before that script starts:

* **Python**: C3 builds and caches a uv environment from `pyproject.toml` + `uv.lock`.
* **Docker**: C3 pulls the Docker Hub image in `docker.image` and runs your script inside it.
* **Bash**: C3 runs your script directly on the VM. This can launch anything, but any setup you do in the script runs from scratch on every job.

Use Python or Docker when dependency setup is expensive or needs to be reproducible across jobs.
Use Bash when your workload is already self-contained, has trivial setup, or you intentionally want to manage everything inside the script.

## Python

Use `python.project` for Python projects with a `pyproject.toml` and `uv.lock`:

```
# .c3
project: python-example
script: run.sh
time: "02:00:00"
python:
  project: ./
output:
  - ./results
```

```
# run.sh
#!/bin/bash
set -euo pipefail

python3 train.py --output "$C3_ARTIFACTS_DIR"
```

```
# pyproject.toml
[project]
name = "python-example"
requires-python = ">=3.11"
dependencies = [
    "jax[cuda12]",
    "numpy",
]
```

C3 uses [uv](https://github.com/astral-sh/uv) to run `uv sync` before your script starts. The resulting environment is cached from the lock file, so repeat jobs with the same `uv.lock` avoid rebuilding dependencies. If your project is in a subdirectory, set `python.project` to that path.

Generate or refresh the lock file locally with:

```
uv lock
```

**Prefer `python.project` to `pip install` in `run.sh`:** If you install Python packages inside the Bash script, those installs happen again for every job. Declaring the project with `python.project` lets C3 cache the environment between jobs.

## Docker

Use `docker:` for non-Python workloads, mixed-language projects, custom CUDA/system libraries, or Python projects that need OS-level dependencies. `docker:` and `python:` are mutually exclusive.
Reference a Docker Hub image from `.c3`:

```
# .c3
project: docker-example
script: run.sh
time: "02:00:00"
docker:
  image: rust:1.95-slim-bookworm
output:
  - ./results
```

```
# run.sh
#!/bin/bash
set -euo pipefail

cargo run --release -- --output "$C3_ARTIFACTS_DIR"
```

```
# Cargo.toml
[package]
name = "docker-example"
version = "0.1.0"
edition = "2021"
```

```
// src/main.rs
use std::{env, fs, path::PathBuf};

fn main() -> std::io::Result<()> {
    let mut output = None;
    let mut args = env::args().skip(1);
    while let Some(arg) = args.next() {
        if arg == "--output" {
            output = args.next();
            break;
        }
    }
    let output = output.unwrap_or_else(|| {
        env::var("C3_ARTIFACTS_DIR").unwrap_or_else(|_| ".".to_string())
    });
    let path = PathBuf::from(output).join("result.txt");
    fs::create_dir_all(path.parent().unwrap())?;
    fs::write(path, "Rust job ran on C3\n")
}
```

**Docker image requirements:** Use a public Docker Hub image that C3 can pull without authentication; bare names (`ubuntu:22.04`), namespaced names (`user/image:tag`), and explicit `docker.io/...` hosts are accepted, but private repositories and external registries are not. Keep secrets out of image layers: do not publish `.c3.local`, `.env`, SSH keys, cloud credentials, service tokens, or build-argument secrets.

C3 validates that `docker.image` points to Docker Hub, uploads your workspace, and pulls the image on the GPU VM before running `bash /workspace/