Description
Describe the problem you're trying to solve
While packing a ModelKit using kit pack, Kit will:
- Determine which files need to be included in a layer
- Pack the files into a ModelKit layer
- Calculate the layer's digest
- Add it to the local cache
After a ModelKit is packed, a common next step is to push it to a remote, after which the local copy can be deleted to save space. The local cache is currently necessary as we need to compute the digests of all layers in order to compute the digest of the ModelKit manifest.
However, storing files locally introduces some inefficiency, especially in CI environments where disk storage is a limited resource. Each ModelKit effectively must be stored twice -- once in the local directory being packed, and once in the local Kit cache -- before being pushed to a remote. As models are often very large files, this duplication can roughly double the disk space needed to pack and push a ModelKit.
Instead, we should support packing a ModelKit directly to a remote (e.g. using a flag on kit pack):
- For each layer, initiate an upload with a remote registry and stream the layer directly into the remote registry while calculating the digest of the pushed data, skipping local storage
- Once all layers have been pushed, generate a manifest from the calculated layer digests and push that to the remote
This is possible because layer uploads in OCI only require the file digest upon completion, meaning we don't need to know the digest prior to push.
Additional context
N/A