Skip to content

Fix inconsistent BABE fork-choice by storing block weights in shared state#7550

Open
Nathy-bajo wants to merge 8 commits intoparitytech:masterfrom
Nathy-bajo:inconsistent-fork-choice
Open

Fix inconsistent BABE fork-choice by storing block weights in shared state#7550
Nathy-bajo wants to merge 8 commits intoparitytech:masterfrom
Nathy-bajo:inconsistent-fork-choice

Conversation

@Nathy-bajo
Copy link
Contributor

@Nathy-bajo Nathy-bajo commented Feb 12, 2025

Attempts to resolve #6013

Description

Introduced SharedBlockWeightStorage - a thread-safe, in-memory cache for block weights that provides immediate visibility across all import operations.

@Nathy-bajo
Copy link
Contributor Author

I'm not done with the implementation and I am yet to test it.

@Nathy-bajo
Copy link
Contributor Author

Please review @bkchr

Copy link
Member

@bkchr bkchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good thing to start would be to try to write some unit test for the issue.

@Nathy-bajo
Copy link
Contributor Author

A good thing to start would be to try to write some unit test for the issue.

Yeah, I think I'm getting to understand the issue better now.

@Nathy-bajo Nathy-bajo force-pushed the inconsistent-fork-choice branch from d716ee9 to e78de64 Compare January 5, 2026 16:27
@Nathy-bajo Nathy-bajo requested a review from bkchr January 5, 2026 16:28
@Nathy-bajo Nathy-bajo changed the title Prioritize primary blocks over secondary blocks #6013 Fix inconsistent BABE fork-choice by storing block weights in shared state Jan 5, 2026
Copy link
Contributor

@rockbmb rockbmb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add some test(s) that verify #6013 is now fixed?

I ran revert_prunes_epoch_changes_and_removes_weights against this PR's latest commit (e025b63) and it passes; however, running it against latest master also sees it pass, which means it wasn't catching #6013 regardless.

Ideally the test(s), when run against latest master, fails in a situation described in #6013, and passes against this PR's tip.

@Nathy-bajo
Copy link
Contributor Author

@rockbmb Done

Copy link
Contributor

@rockbmb rockbmb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Nathy-bajo thanks; this change breaks some things, though; e.g. this refactor

let aux_revert = Box::new(|client: Arc<FullClient>, backend, blocks| {
sc_consensus_babe::revert(client.clone(), backend, blocks)?;
sc_consensus_grandpa::revert(client, blocks)?;
Ok(())

"Shared storage and database should have same weight"
);

// Now test that prune_finalized also removes from shared storage
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where are you testing this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Nathy-bajo my comment here is still not done? Because you just say "test xy", but you don't test it? You just finalize, but have no idea if it gets pruned.

@Nathy-bajo Nathy-bajo requested a review from bkchr January 23, 2026 16:46
@Nathy-bajo
Copy link
Contributor Author

@bkchr All done. Please review

Copy link
Member

@bkchr bkchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't helping when you just ignore my comments. You still don't have a test for the original problematic behavior.

"Shared storage and database should have same weight"
);

// Now test that prune_finalized also removes from shared storage
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Nathy-bajo my comment here is still not done? Because you just say "test xy", but you don't test it? You just finalize, but have no idea if it gets pruned.

Comment on lines 1724 to 1725
// Wait a bit for the async finality notification to be processed
tokio::time::sleep(tokio::time::Duration::from_millis(100)).await;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will for sure make this test flaky.

@Nathy-bajo
Copy link
Contributor Author

@bkchr I have removed the flaky tests and added tests for the original problematic behavior.

@Nathy-bajo Nathy-bajo requested a review from bkchr February 6, 2026 11:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Inconsistent fork-choice: Validator sometimes chooses own secondary block over others primary

3 participants