-
Notifications
You must be signed in to change notification settings - Fork 1.1k
fix(trafficrouting): ReplicaSetReferenced does not check istio DestinationRules #4560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
fix(trafficrouting): ReplicaSetReferenced does not check istio DestinationRules #4560
Conversation
a9e6e72 to
c31245a
Compare
Published E2E Test Results 4 files 4 suites 3h 24m 52s ⏱️ For more details on these failures, see this check. Results for commit 977a414. ♻️ This comment has been updated with latest results. |
Published Unit Test Results2 384 tests 2 384 ✅ 3m 8s ⏱️ Results for commit 977a414. ♻️ This comment has been updated with latest results. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #4560 +/- ##
==========================================
+ Coverage 84.40% 84.42% +0.01%
==========================================
Files 164 164
Lines 18849 18884 +35
==========================================
+ Hits 15909 15942 +33
- Misses 2077 2079 +2
Partials 863 863 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
85d10d1 to
878bcfe
Compare
271a6b0 to
638e928
Compare
|
@zachaller I'm not sure the release/review process for this repo - any thoughts on how to get it closer to being merged? We've been using this on an internal build for the past ~month and it has resolved our issue, so I'm keen to get this into a release so we can start using upstream again instead of our fork. Thanks! |
d1ac39e to
181b02a
Compare
|
@zachaller friendly ping on this one too :-) |
b37aeca to
d8e4b15
Compare
The isReplicaSetReferenced function checks: - Static references in rollout status (StableRS, CurrentPodHash, Weights.Canary.PodTemplateHash, Weights.Stable.PodTemplateHash) - Service selectors for Canary/BlueGreen services However, when using Istio with subset-level traffic splitting (DestinationRules), the function doesn't check if the ReplicaSet's pod template hash is still referenced in the Istio DestinationRule subsets. This means a ReplicaSet can be scaled down even though Istio is still routing traffic to that subset. When using Istio with DestinationRule subsets (without explicit canary/stable services), the isReplicaSetReferenced function doesn't check the DestinationRule to see if the pod template hash is still referenced in a subset. The problem is: - The rollout status (e.g., Weights.Canary.PodTemplateHash) gets updated to the new hash - But the Istio DestinationRule subsets still have the old hash because UpdateHash hasn't been called yet (or returns early due to checkReplicasAvailable) - The ReplicaSet with the old hash gets scaled down because isReplicaSetReferenced returns false - Traffic is still being routed to that subset, causing 503s Signed-off-by: Andrew Brown <[email protected]>
Signed-off-by: Andrew Brown <[email protected]>
Signed-off-by: Andrew Brown <[email protected]>
Signed-off-by: Andrew Brown <[email protected]>
Signed-off-by: Andrew Brown <[email protected]>
d8e4b15 to
c04150b
Compare
Signed-off-by: Andrew Brown <[email protected]>
|



Fixes #4390 and #4534
The
isReplicaSetReferencedfunction checks:However, when using Istio with subset-level traffic splitting (DestinationRules), the function doesn't check if the ReplicaSet's pod template hash is still referenced in the Istio DestinationRule subsets.
This means a ReplicaSet can be scaled down even though Istio is still routing traffic to that subset.
We see these logs:
At which point Istio starts attempting to send to the scaled-down replica and starts returning HTTP 503 with a
UH(no healthy upstreams) error.The issue is that the metadata is being updated, but since we do not use services (only istio subsets), trafficrouting cannot detect that the "intermediate RS" is still in use.
When using Istio with DestinationRule subsets (without explicit canary/stable services), the isReplicaSetReferenced function doesn't check the DestinationRule to see if the pod template hash is still referenced in a subset.
The problem is:
Checklist:
"fix(controller): Updates such and such. Fixes #1234".