Skip to content

Conversation

@rvagg
Copy link
Member

@rvagg rvagg commented Jan 9, 2026

Implement POST /pdp/piece/fetch for fetching pieces from other SPs. Includes idempotent request handling, failure tracking with retry status, and CommP verification for external sources.

New files:

  • handlers_fetch.go: fetch endpoint and database store implementation See HandleFetch() Godoc for comprehensive details about new functionality.
  • fetch_types.go: request/response types with source URL validation
  • piece_cid.go: PieceCIDv2 parsing utilities (consolidated existing PieceCID code)
  • 20251212-pdp-fetch.sql: tracking tables for fetch requests

Changes:

  • task_park_piece.go: verify CommP for external sources, reduce MaxFailures to 5
  • handlers.go: refactor to use shared PieceCID utilities

Pulls in lib/ffi/piece_funcs.go from main branch as is.

@rvagg rvagg requested a review from a team as a code owner January 9, 2026 07:12
@rvagg rvagg self-assigned this Jan 9, 2026
@rvagg rvagg added this to FilOz Jan 9, 2026
@github-project-automation github-project-automation bot moved this to 📌 Triage in FilOz Jan 9, 2026
@rvagg rvagg removed this from FilOz Jan 9, 2026
@rvagg rvagg added this to FOC Jan 9, 2026
@github-project-automation github-project-automation bot moved this to 📌 Triage in FOC Jan 9, 2026
@rvagg rvagg added the team/fs-wg Items being worked on or tracked by the "FS Working Group". See FilOzone/github-mgmt #10 label Jan 9, 2026
@rvagg rvagg moved this from 📌 Triage to 🔎 Awaiting review in FOC Jan 9, 2026
@rvagg rvagg force-pushed the rvagg/pdp-sp-fetch branch from 7f3a6f5 to 89c16b0 Compare January 9, 2026 07:14
@rvagg rvagg added this to the M4: Filecoin Service Liftoff milestone Jan 9, 2026
@rvagg rvagg changed the title feat(pdp): add SP-to-SP piece fetch endpoint with CommP verification feat(pdp): add SP-to-SP piece fetch endpoint Jan 9, 2026
@rvagg rvagg force-pushed the rvagg/pdp-sp-fetch branch from 89c16b0 to 546549f Compare January 9, 2026 07:18
@rjan90 rjan90 requested a review from LexLuthr January 9, 2026 07:21
Copy link
Contributor

@LexLuthr LexLuthr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Just need to change the ParkPiece strategy.

@snadrus
Copy link
Contributor

snadrus commented Jan 9, 2026

This PR is a concern as we have SPs with huge databases, and Support is looking for ways to reduce Curio's db impact (ex: #854 ). Streaming behavior must be debugged using logs, not databases.

@snadrus
Copy link
Contributor

snadrus commented Jan 9, 2026

@rvagg Can we have the spec for this ticket?
I would like the Curio team to implement it in a supportable way.

magik6k
magik6k previously requested changes Jan 9, 2026
@github-project-automation github-project-automation bot moved this from 🔎 Awaiting review to ⌨️ In Progress in FOC Jan 9, 2026
@rvagg
Copy link
Member Author

rvagg commented Jan 12, 2026

@snadrus sorry I should have linked to #828 as the issue behind this ticket. There's also a longer document that the description there came out of but I think that'd be noise at this stage. The godoc I put in the handler can/should also be treated as a spec with the desired client flow.


I hear you re database size, but we have a clean-up task in here that keeps it minimal, that could be tightened much further than it is now - currently to 5 days, to retain data for debugging purposes, but it could even be as small as an hour if we want. The problems with not storing state for this is:

  • Removes options for doing idempotency on requests
  • Client polling for status to understand progress becomes more difficult and less informative, Curio is more of a cross-fingers black box
  • No clean failure tracking on record

We could just rely on parked_pieces and make this a per-piece interaction - fire once for the original request then query status for individual pieces, we don't quite get the level of granularity on status provided here. Also harmony_task_history might be able to give us some of the failure information, we'd have to join with it and return whatever it has to say. But it's a potentially big table and probably not intended for this purpose.

The trade-off would be: less granularity, separate status API(s), more complex queries against harmony_task_history. The choice here was two small, short-lived tables with aggressive cleanup over complex joins against large shared tables, for a more convenient, informative, and idempotent API.

@rvagg rvagg force-pushed the rvagg/pdp-sp-fetch branch from e2287ba to f0cf233 Compare January 12, 2026 07:57
@rvagg
Copy link
Member Author

rvagg commented Jan 12, 2026

@LexLuthr in a new fixup! commit I've reworked this based on your feedback. Restored task_park_piece.go and lib/ffi/piece_funcs.go to their original state, so no changes to ParkPiece or the shared piece funcs.

Created a separate PDPv0_FetchPiece in tasks/pdp/task_fetch_piece.go that handles the fetch flow independently: downloads from source URL, computes and verifies CommP while streaming, stores to custore, then creates a parked_pieces entry for StorePiece to pick up. Tracking happens via pdp_piece_fetch_items which ties into the fetch request lifecycle, cleanup still happens via the existing CASCADE delete when fetch records age out.

The task runs up to 4 concurrent downloads, each isolated so a slow/bad source doesn't block others. Has a 1 hour total fetch timeout (these are capped at 1016 MiB total) and exponential backoff on retries (5 max). Still room for verified streaming retrieval in future but I hope this should be solid enough for a starter?

@BigLep
Copy link
Member

BigLep commented Jan 12, 2026

Curio: what are the next steps for getting this approved/merged? I want to make sure we're clear about what things need to get done now or can be improved later, particularly with the merge back to main. My understanding is that @rvagg has biased towards making the future merge back to main easier. That seems right to me. @snadrus: if Curio wants to do a different implementation, can that be handled in the pdpv0-main branch after the fact?

Context: this is the last remaining feature needed for FOC GA (as part of FilOzone/filecoin-services#357 )). There are still plenty of operational/tooling items though for FOC GA, but getting the GA durability story landed is key for us to turn focus towards bug fixing / operational improvement mode because we know more APIs won't be added to Synapse.

How do we get decisions here as quickly as person? I suggest we have a verbal sync (I can work on scheduling). Even if we end up deciding that more immediate work is required, I want to take days getting to that decision with async back and forth.

@rjan90 rjan90 moved this from ⌨️ In Progress to ✔️ Approved by reviewer in FOC Jan 13, 2026
@rvagg rvagg force-pushed the rvagg/pdp-sp-fetch branch from 5c3972e to 87dab18 Compare January 17, 2026 06:01
rvagg added a commit to FilOzone/synapse-sdk that referenced this pull request Jan 17, 2026
@rvagg rvagg force-pushed the rvagg/pdp-sp-fetch branch 2 times, most recently from 39a74a2 to aa2b089 Compare January 17, 2026 10:21
@rvagg rvagg changed the title feat(pdp): add SP-to-SP piece fetch endpoint feat(pdp): add SP-to-SP piece pull endpoint Jan 20, 2026
@rvagg
Copy link
Member Author

rvagg commented Jan 20, 2026

Renamed everything in here from "Fetch" to "Pull".

During SDK integration I found it too hard to distinguish it from fetch() and didn't want to cause confusion, so the functionality is now called "Pull", you ask an SP to pull from another SP.

rvagg added a commit to FilOzone/synapse-sdk that referenced this pull request Jan 20, 2026
@rvagg rvagg force-pushed the rvagg/pdp-sp-fetch branch from b5da59c to f7e7250 Compare January 22, 2026 07:59
@BigLep BigLep linked an issue Jan 27, 2026 that may be closed by this pull request
@rvagg
Copy link
Member Author

rvagg commented Jan 27, 2026

@rjan90 we're going to get this landed before you cut a tag this week, keeping it open while I'm still testing on it in case something comes up, but it's ready as is, works, tested, just not heavily exercised.

@rjan90
Copy link
Collaborator

rjan90 commented Jan 30, 2026

@rjan90 we're going to get this landed before you cut a tag this week, keeping it open while I'm still testing on it in case something comes up, but it's ready as is, works, tested, just not heavily exercised.

Discussed in sync today to merge this, so that we can cut a tag and get it shipped out to PDP SPs.

Implement POST /pdp/piece/pull for fetching pieces from other SPs.
Background task handles download, CommP verification, and retry with
failure tracking. Consolidates PieceCID parsing into shared utilities.

Closes: #828
@rvagg rvagg force-pushed the rvagg/pdp-sp-fetch branch from 06a71ef to 262a95b Compare January 30, 2026 12:15
@rvagg rvagg merged commit 965a0b1 into pdpv0 Jan 30, 2026
15 checks passed
@rvagg rvagg deleted the rvagg/pdp-sp-fetch branch January 30, 2026 12:46
@github-project-automation github-project-automation bot moved this from ✔️ Approved by reviewer to 🎉 Done in FOC Jan 30, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

team/fs-wg Items being worked on or tracked by the "FS Working Group". See FilOzone/github-mgmt #10

Projects

None yet

Development

Successfully merging this pull request may close these issues.

pdpv0: Add SP-to-SP piece fetch endpoint

7 participants