
Conversation

@agners
Member

@agners agners commented Dec 1, 2025

Proposed change

Refactor Docker image pull progress to use a simpler count-based approach where each layer contributes equally (100% / total_layers) regardless of size.

The core issue was that Docker limits concurrent downloads (3 at a time by default, see Docker's DefaultMaxConcurrentDownloads constant) and reports a layer's size only when its download starts. With the previous size-weighted progress calculation, large layers appearing late would cause progress to drop dramatically (e.g., 59% -> 29%) as the total size increased. We prevented the progress from going backwards, but in practice that meant the progress would stall for an extended period.
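
To make the regression concrete, here is a minimal sketch of a size-weighted calculation. The function name, layer IDs, and sizes are hypothetical and chosen only to reproduce a 59% -> 29% drop; this is not the code that was removed.

```python
def size_weighted_progress(downloaded: dict[str, int], totals: dict[str, int]) -> float:
    """Percent complete, weighting each known layer by its size."""
    total = sum(totals.values())
    return 100.0 * sum(downloaded.values()) / total if total else 0.0


# Hypothetical sizes in MB: two layers are known, 59 of 100 MB downloaded.
totals = {"layer-a": 60, "layer-b": 40}
downloaded = {"layer-a": 39, "layer-b": 20}
before = size_weighted_progress(downloaded, totals)

# A third, large layer starts downloading and only now reports its size.
totals["layer-c"] = 103
downloaded["layer-c"] = 0
after = size_weighted_progress(downloaded, totals)

print(f"{before:.0f}% -> {after:.0f}%")
```

Nothing was lost between the two calls; only the denominator grew. Clamping the reported value so it never decreases hides the drop, but the displayed progress then stays flat until the real value catches up again.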

The new approach:

  • Each layer contributes equally to overall progress
  • Per-layer progress: 70% download weight, 30% extraction weight
  • Progress only starts after the first "Downloading" event (when the full layer count is known)
  • Always caps at 99% - job completion handles final 100%

This moves progress tracking to a dedicated module (pull_progress.py) and removes the complex size-based scaling logic that tried to account for unknown layer sizes. With this, progress keeps advancing throughout the pull. Unfortunately, it also means that progress slows down towards the end, since the larger layers are typically still being downloaded/extracted at that point. But this is the best we can do currently.
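
The count-based rules listed above can be sketched as follows. This is an illustration under stated assumptions, not the contents of pull_progress.py: the class, function, and constant names are hypothetical, and only the 70/30 split, the equal per-layer share, the wait for the first "Downloading" event, and the 99% cap come from the description.

```python
from dataclasses import dataclass

# Weights and cap taken from the PR description; the names are hypothetical.
DOWNLOAD_WEIGHT = 0.7
EXTRACT_WEIGHT = 0.3
MAX_PROGRESS = 99.0


@dataclass
class LayerState:
    """Per-layer progress, each phase expressed as 0..100."""

    download: float = 0.0
    extract: float = 0.0

    def percent(self) -> float:
        """Combine both phases into a single 0..100 value for this layer."""
        return self.download * DOWNLOAD_WEIGHT + self.extract * EXTRACT_WEIGHT


def overall_progress(layers: dict[str, LayerState], download_started: bool) -> float:
    """Average all layers equally; report 0 until the first download event."""
    if not download_started or not layers:
        return 0.0
    average = sum(layer.percent() for layer in layers.values()) / len(layers)
    # Never report 100% here; job completion is responsible for that.
    return min(average, MAX_PROGRESS)


# Four layers: one finished, one half downloaded, two not started.
layers = {
    "a": LayerState(download=100.0, extract=100.0),
    "b": LayerState(download=50.0),
    "c": LayerState(),
    "d": LayerState(),
}
progress = overall_progress(layers, download_started=True)
```

In this example the finished layer contributes a full quarter and the half-downloaded layer contributes 35% of its quarter, giving roughly 33.75% overall. Because every layer has the same share no matter how large it is, a big layer that starts late cannot enlarge the denominator, which is the property the size-weighted approach lacked.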

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New feature (which adds functionality to the supervisor)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

  • This PR fixes or closes issue: fixes #
  • This PR is related to issue:
  • Link to documentation pull request:
  • Link to cli pull request:
  • Link to client library pull request:

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • The code has been formatted using Ruff (ruff format supervisor tests)
  • Tests have been added to verify that the new code works.

If API endpoints or add-on configuration are added/changed:

@agners agners added the refactor A code change that neither fixes a bug nor adds a feature label Dec 1, 2025
@agners agners force-pushed the refactor-docker-pull-progress branch 2 times, most recently from c70d3b8 to ffc3783 Compare December 1, 2025 13:47
@agners agners requested a review from mdegat01 December 1, 2025 17:13
@agners agners force-pushed the refactor-docker-pull-progress branch from 3b608f3 to 8350a24 Compare December 1, 2025 17:19
agners and others added 3 commits December 1, 2025 21:19
Refactor Docker image pull progress to use a simpler count-based approach
where each layer contributes equally (100% / total_layers) regardless of
size. This replaces the previous size-weighted calculation that was
susceptible to progress regression.

The core issue was that Docker rate-limits concurrent downloads (~3 at a
time) and reports layer sizes only when downloading starts. With size-
weighted progress, large layers appearing late would cause progress to
drop dramatically (e.g., 59% -> 29%) as the total size increased.

The new approach:
- Each layer contributes equally to overall progress
- Per-layer progress: 70% download weight, 30% extraction weight
- Progress only starts after first "Downloading" event (when layer
  count is known)
- Always caps at 99% - job completion handles final 100%

This simplifies the code by moving progress tracking to a dedicated
module (pull_progress.py) and removing complex size-based scaling logic
that tried to account for unknown layer sizes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Layers that already exist locally should not count towards download
progress since there's nothing to download for them. Only layers that
need pulling are included in the progress calculation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
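
The second commit's rule, that layers already present locally do not count, could look roughly like the sketch below. The function name is hypothetical, and the event shape (a dict with "id" and "status" keys) and the "Pulling fs layer" / "Already exists" status strings are assumptions about Docker's pull event stream rather than code taken from this PR.

```python
def layers_to_track(events: list[dict[str, str]]) -> set[str]:
    """Return IDs of layers that need pulling, skipping ones present locally."""
    tracked: set[str] = set()
    for event in events:
        layer_id = event.get("id")
        if layer_id is None:
            # Events without a layer ID (e.g. summary lines) are ignored.
            continue
        status = event.get("status")
        if status == "Already exists":
            # Nothing to download or extract, so drop it from the total.
            tracked.discard(layer_id)
        elif status == "Pulling fs layer":
            tracked.add(layer_id)
    return tracked


# Hypothetical event stream: layer "bbb" is already available locally.
events = [
    {"id": "aaa", "status": "Pulling fs layer"},
    {"id": "bbb", "status": "Already exists"},
    {"id": "ccc", "status": "Pulling fs layer"},
    {"status": "Digest: sha256:..."},
]
tracked = layers_to_track(events)
```

With this input only "aaa" and "ccc" would be tracked, so each contributes half of the overall progress instead of a third.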
@agners agners force-pushed the refactor-docker-pull-progress branch from bb9145c to 87e1e7a Compare December 1, 2025 20:19
@agners agners marked this pull request as draft December 2, 2025 14:24

Labels

cla-signed, refactor (A code change that neither fixes a bug nor adds a feature)
