Description
We're running integration tests on a Service Fabric dev cluster provisioned on an Azure DevOps build pipeline.
We're using an internal Windows Server 2022-based agent pool.
Everything worked until this Saturday, 2 November 2024 (02.11.2024).
Before that we were getting this image: 20240922
Since Saturday we started getting this image: 20241021
Since Saturday the dev cluster has failed to reach a healthy state because FaultAnalysisService (a non-configurable part of the Service Fabric runtime) is failing. We have no visibility into why exactly it is failing.
This repo says the image should contain this (two-year-old) version of the Service Fabric runtime: 9.1.1436.9590.
That is not the case: the Service Fabric runtime that actually appears on our agents now is this (two-month-old) one: 9.1.2718.9590. We established that by dumping FabricHost.exe from an agent.
We can neither prove nor disprove that the SF runtime version is the actual culprit (we cannot go back and try the previous one, because we always get the latest agent image and cannot control its version), but it looks highly likely.
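The mismatch between the documented and the observed runtime version can also be checked mechanically. Below is a minimal sketch, assuming the version string has already been read from FabricHost.exe by some other means. The helper names `parse_version` and `is_newer` are hypothetical and not part of any Service Fabric tooling.

```python
def parse_version(text):
    """Split a dotted version such as '9.1.2718.9590' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_newer(observed, documented):
    """Return True when the observed version is strictly newer than the documented one."""
    return parse_version(observed) > parse_version(documented)


documented = "9.1.1436.9590"  # version listed in this repo
observed = "9.1.2718.9590"    # version found in FabricHost.exe on the agent

print(is_newer(observed, documented))  # True
```

Comparing integer tuples rather than raw strings avoids lexicographic surprises when components have different digit counts.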
Question 1: is there any workaround for our failing SF cluster? E.g. is there a way to override which SF runtime version is used? (Keep in mind that the SF runtime installer requires administrator privileges, so simply running it as part of the pipeline does not work.)
Question 2: why does this repo's change history not reflect the actual picture, and can this be fixed?
Question 3: is there a chance to have the SF runtime updated on the agent image? I cannot say which exact version it needs to be updated to (since we have no way to try them out), but maybe it could just be reverted to the previous, stable one?
Platforms affected
Runner images affected
Image version and build link
20241021
Is it regression?
yes
Expected behavior
The SF dev cluster starts and reaches a healthy state on a build agent.
Actual behavior
The SF dev cluster never reaches a healthy state (we waited for up to 1 hour).
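For reference, the "wait up to 1 hour" behaviour can be sketched as a polling loop with a deadline. This is only an illustration, not our actual pipeline code: `wait_until_healthy` and the `check` callable are hypothetical, and `check` is assumed to wrap whatever cluster health query the pipeline uses and return True once the cluster is healthy.

```python
import time


def wait_until_healthy(check, timeout_s=3600.0, interval_s=30.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or timeout_s elapses.

    Returns True if the check succeeded, False on timeout.
    `check` is a hypothetical callable standing in for the real health query.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The `clock` and `sleep` parameters exist only so the loop can be exercised with a fake clock instead of waiting for real time to pass.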
Repro steps