
Enable packing and pushing directly to remote without storing layers locally #1026

@amisevsk


Describe the problem you're trying to solve
When packing a ModelKit using kit pack, Kit will:

  1. Determine which files need to be included in a layer
  2. Pack the files into a ModelKit layer
  3. Calculate the layer's digest
  4. Add it to the local cache
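To make the disk cost of these steps concrete, here is a minimal Python sketch of steps 2 through 4. It is not Kit's actual implementation: the function name `pack_layer_to_cache` and the `blobs/sha256/<hex>` cache layout are assumptions chosen for illustration.

```python
import hashlib
import os
import tarfile
import tempfile


def pack_layer_to_cache(paths, cache_dir):
    """Hypothetical helper: pack `paths` into a tar layer stored under its digest."""
    # Assumed cache layout (blobs/sha256/<hex>); Kit's real layout may differ.
    blob_dir = os.path.join(cache_dir, "blobs", "sha256")
    os.makedirs(blob_dir, exist_ok=True)

    # Step 2: pack the files into a layer, written to a temp file in the cache.
    fd, tmp_path = tempfile.mkstemp(dir=blob_dir)
    with os.fdopen(fd, "wb") as tmp:
        with tarfile.open(fileobj=tmp, mode="w") as tar:
            for path in paths:
                tar.add(path, arcname=os.path.basename(path))

    # Step 3: calculate the digest of the finished layer.
    sha = hashlib.sha256()
    with open(tmp_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha.update(chunk)
    digest = "sha256:" + sha.hexdigest()

    # Step 4: keep the layer in the local cache, named by its digest.
    final_path = os.path.join(blob_dir, sha.hexdigest())
    os.replace(tmp_path, final_path)
    return digest, os.path.getsize(final_path)
```

In this sketch the packed layer is a second full copy of the source files on disk, which is the cost described below.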

After a ModelKit is packed, a common next step is to push it to a remote, after which the local copy can be deleted to save space. The local cache is currently necessary as we need to compute the digests of all layers in order to compute the digest of the ModelKit manifest.

However, storing files locally introduces some inefficiency, especially in CI environments where disk storage is a limited resource. Each ModelKit effectively must be stored twice -- once in the local directory being packed, and once in the local Kit cache -- before being pushed to a remote. Because models are often very large files, this duplication can make storage requirements very high.

Instead, we should support packing a ModelKit directly to a remote (e.g. using a flag on kit pack):

  • For each layer, initiate an upload with a remote registry and stream the layer directly into the remote registry while calculating the digest of the pushed data, skipping local storage
  • Once all layers have been pushed, generate a manifest from the calculated layer digests and push that to the remote

This is possible because blob uploads in the OCI distribution spec only require the digest when the upload is completed, so we don't need to know a layer's digest before starting the push.
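The proposed flow could look roughly like the Python sketch below. It is only an illustration under stated assumptions: `InMemoryRegistry` is a hypothetical stand-in for a real registry (its three methods mirror the start, chunk, and finish-with-digest steps of an OCI blob upload), `UploadWriter`, `push_layer`, and `push_manifest` are made-up names, and the layer and config media types are placeholders rather than Kit's real ones.

```python
import hashlib
import io
import json
import os
import tarfile


class InMemoryRegistry:
    """Hypothetical registry stand-in; a real one would be reached over HTTP."""

    def __init__(self):
        self.blobs, self.manifests, self._sessions = {}, {}, {}

    def start_upload(self):
        session = len(self._sessions)
        self._sessions[session] = bytearray()
        return session

    def write_chunk(self, session, data):
        self._sessions[session] += data

    def finish_upload(self, session, digest):
        # The digest is only supplied here, when the upload is completed.
        data = bytes(self._sessions.pop(session))
        if "sha256:" + hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("digest mismatch")
        self.blobs[digest] = data


class UploadWriter:
    """Write-only sink that hashes bytes while forwarding them to the registry."""

    def __init__(self, registry):
        self.registry = registry
        self.session = registry.start_upload()
        self.sha = hashlib.sha256()
        self.size = 0

    def write(self, data):
        self.sha.update(data)
        self.size += len(data)
        self.registry.write_chunk(self.session, data)
        return len(data)

    def finish(self, media_type):
        digest = "sha256:" + self.sha.hexdigest()
        self.registry.finish_upload(self.session, digest)
        return {"mediaType": media_type, "digest": digest, "size": self.size}


def push_layer(registry, paths, media_type="application/vnd.example.layer.v1.tar"):
    writer = UploadWriter(registry)
    # "w|" is tarfile's stream mode: it only calls writer.write(), never seeks,
    # so the layer is never written to local disk.
    with tarfile.open(fileobj=writer, mode="w|") as tar:
        for path in paths:
            tar.add(path, arcname=os.path.basename(path))
    return writer.finish(media_type)


def push_manifest(registry, config_bytes, layer_descs):
    config_writer = UploadWriter(registry)
    config_writer.write(config_bytes)
    manifest = {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "config": config_writer.finish("application/vnd.example.config.v1+json"),
        "layers": layer_descs,
    }
    body = json.dumps(manifest).encode()
    digest = "sha256:" + hashlib.sha256(body).hexdigest()
    registry.manifests[digest] = body
    return digest
```

A caller would collect the descriptor returned by each `push_layer` call and pass the list to `push_manifest`. Peak local disk use then stays at the size of the source files, since only small buffers of layer data are held in memory at a time (the in-memory registry here keeps everything, but a real remote would not).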

Additional context
N/A

Labels: enhancement (New feature or request)