Artifact Output

The output of a job is its artifacts. These are things like plots produced by your job, trained neural network weights, or saved checkpoints.

Only files written to configured output directories are collected. By default this is $C3_ARTIFACTS_DIR (an artifacts/ directory pre-created for every job). You can add more directories with the output: field in your .c3 config. Files written elsewhere in the working directory are not collected.

Configuring output

Every job automatically collects files written to $C3_ARTIFACTS_DIR:

import json
import os

output_dir = os.environ["C3_ARTIFACTS_DIR"]  # always set, pre-created
with open(os.path.join(output_dir, "metrics.json"), "w") as f:
    json.dump(results, f)

You can collect files from additional directories too:

output:
- results
- checkpoints
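
Your script then writes into those directories relative to the job's working directory, just as it would write any local file. A minimal sketch, assuming the results entry above (the helper name and filename are illustrative, not part of the C3 API):

```python
import json
import os


def write_result(out_dir: str, name: str, payload: dict) -> str:
    """Write a JSON result into an output directory, creating it if needed."""
    os.makedirs(out_dir, exist_ok=True)  # output: dirs may not exist yet
    path = os.path.join(out_dir, name)
    with open(path, "w") as f:
        json.dump(payload, f)
    return path


# Inside a job, files under results/ are collected because "results"
# is listed under output: in .c3:
# write_result("results", "accuracy.json", {"accuracy": 0.93})
```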

Both $C3_ARTIFACTS_DIR contents and output: directories end up in the same artifact manifest, downloadable with c3 pull.

Artifact lifecycle

  1. Your script writes files to $C3_ARTIFACTS_DIR and/or directories listed in output:
  2. After the script finishes, the agent collects all output files and uploads them as content-addressed blobs
  3. An artifact manifest is created at /jobs/<jobId>/
  4. You can download, browse, or reuse artifacts in another job
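
"Content-addressed" means each blob is stored under a key derived from its bytes, so identical files are stored (and uploaded) only once. A rough sketch of the idea, not C3's actual implementation:

```python
import hashlib

# An in-memory stand-in for a content-addressed blob store.
store: dict[str, bytes] = {}


def blob_address(data: bytes) -> str:
    """Derive a storage key from file contents: identical bytes, identical key."""
    return hashlib.sha256(data).hexdigest()


def put_blob(data: bytes) -> str:
    """Store a blob under its content address; duplicates deduplicate for free."""
    key = blob_address(data)
    store[key] = data  # re-storing the same bytes writes identical data
    return key


# The same bytes always resolve to the same address:
a = put_blob(b"model weights")
b = put_blob(b"model weights")
assert a == b and len(store) == 1
```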

Browsing artifacts

Artifacts are stored the same way as uploaded datasets, using the same content-addressed storage. You can browse them with the same c3 data commands, just at a different path.

You can reach a job's artifacts in two ways:

c3 data ls /jobs/job_abc123/                      # By job ID (resolves project automatically)
c3 data ls /projects/my-project/jobs/job_abc123/  # By project + job ID

Both paths point to the same data. Having two paths does not mean the data is stored twice.

To list all jobs with artifacts in a project:

c3 data ls /projects/my-project/jobs/

Downloading artifacts

Download artifacts to your local machine with c3 pull:

c3 pull job_abc123   # Download a specific job's artifacts
c3 pull              # Download artifacts from all newly completed jobs

Or use c3 data cp for more control:

c3 data cp /jobs/job_abc123/ ./local-output/

Reusing artifacts in another job

Because artifacts live in C3's centralised storage, you can mount them directly into a new job without first downloading them to your local machine, skipping that slow round-trip entirely:

datasets:
- ref: /jobs/job_abc123
  mount: /data/prev_weights

This mounts the output from job_abc123 at /data/prev_weights in your new job. The data transfers server-to-server at full speed.
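
Inside the new job, the mount is just a directory of the previous job's output files. A sketch of loading from it, assuming the previous job saved a weights.json (the filename, format, and helper name are illustrative):

```python
import json
import os


def load_previous_weights(mount_dir: str) -> dict:
    """Read a weights file from a mounted artifact directory."""
    path = os.path.join(mount_dir, "weights.json")
    if not os.path.exists(path):
        raise FileNotFoundError(f"expected artifact missing: {path}")
    with open(path) as f:
        return json.load(f)


# In the job, the path comes from the mount: entry above:
# weights = load_previous_weights("/data/prev_weights")
```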

Job chaining

Job chaining means mounting a previous job's artifacts directly into a new job. This lets you build multi-stage pipelines where each stage feeds into the next, with all data staying in C3's storage network:

# Stage 1: Preprocess data
c3 deploy -f
# Job completes, producing job_preprocess

# Stage 2: Train model (mount preprocessed data)
# Update .c3:
# datasets:
# - ref: /jobs/job_preprocess
#   mount: /data/preprocessed
c3 deploy -f
# Job completes, producing job_train

# Stage 3: Evaluate (mount trained model)
# Update .c3:
# datasets:
# - ref: /jobs/job_train
#   mount: /data/model
c3 deploy -f

Each stage mounts the previous stage's output directly. Data stays in C3's storage network and never needs to touch your local machine.
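
From a stage script's point of view, chaining is just reading from its mount path and writing to $C3_ARTIFACTS_DIR. A minimal sketch, where the input filename and the "work" performed are illustrative placeholders:

```python
import json
import os


def run_stage(mount_dir: str, out_name: str) -> str:
    """Read the previous stage's output from its mount, then write this
    stage's result to $C3_ARTIFACTS_DIR so the next stage can mount it."""
    # Read whatever the upstream stage produced (filename is illustrative).
    with open(os.path.join(mount_dir, "output.json")) as f:
        upstream = json.load(f)

    # Placeholder "work" standing in for preprocessing/training/evaluation.
    result = {"count": len(upstream)}

    # Writing into $C3_ARTIFACTS_DIR is what makes the result collectible.
    out_path = os.path.join(os.environ["C3_ARTIFACTS_DIR"], out_name)
    with open(out_path, "w") as f:
        json.dump(result, f)
    return out_path
```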