TL;DR – How to Fix!

If your Health service is in a Failed state, you most likely cannot get your SDDC Management service to start up. You also probably cannot manage the cluster via Windows Admin Center (WAC).

It also means you cannot run the Stop-ClusterPerformanceHistory command, because it relies on these services running.
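To confirm that this is the state you are in, it can help to list every cluster resource that is not Online. This is a minimal sketch, assuming the input objects carry Name and State properties the way the output of Get-ClusterResource does. Select-NotOnline is a hypothetical helper name, not a built-in cmdlet.

```powershell
# Hypothetical helper: passes through only the objects whose State is not 'Online'.
function Select-NotOnline {
    param(
        [Parameter(Mandatory, ValueFromPipeline)] $Resource
    )
    process {
        # Convert State to a string so the check works for enum and string values alike
        if ("$($Resource.State)" -ne 'Online') { $Resource }
    }
}

# Example on a cluster node (not executed here):
# Get-ClusterResource | Select-NotOnline | Format-Table Name, State, OwnerGroup
```

If the Health resource shows up in that list as Failed, the steps in this post apply.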

If you do not care about the historical data of your cluster’s stats, you can purge it all and recreate everything by running the following commands on an S2D cluster node directly:

Before running any code you find on a random website, please read through it and run it at your own risk.
This script is for the United States English (en-US) locale. If your cluster is configured with another locale (ex: de-DE), please update the script to use the translated names of these Cluster Resources and Groups.

An example of this would be “Clustergruppe” versus “Cluster Group” or “Integrität” versus “Health”.
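One way to keep those translated names in one place is a small lookup table keyed by culture. This is only a sketch: the $LocalizedNames table and the Get-LocalizedClusterName function are hypothetical helpers, and only the two de-DE strings come from the example above. Which language your cluster's resources were actually named in is something to verify yourself, for instance by listing them with Get-ClusterResource.

```powershell
# Hypothetical lookup table. Only the de-DE strings are taken from the example above;
# add or correct entries for your own locale after checking the real resource names.
$LocalizedNames = @{
    'en-US' = @{ ClusterGroup = 'Cluster Group'; Health = 'Health' }
    'de-DE' = @{ ClusterGroup = 'Clustergruppe'; Health = 'Integrität' }
}

# Hypothetical helper: returns the recorded name for a culture, or fails loudly
# so a missing translation is noticed before anything gets removed.
function Get-LocalizedClusterName {
    param(
        [Parameter(Mandatory)] [string] $Culture,
        [Parameter(Mandatory)] [ValidateSet('ClusterGroup', 'Health')] [string] $Key
    )
    if (-not $LocalizedNames.ContainsKey($Culture)) {
        throw ("No names recorded for culture '{0}'." -f $Culture)
    }
    $LocalizedNames[$Culture][$Key]
}

# Example (not executed here):
# $healthName = Get-LocalizedClusterName -Culture 'de-DE' -Key 'Health'
```

Failing on an unknown culture is a deliberate choice here, since silently falling back to the en-US names would just make the later commands miss the resource.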
# Run this directly on any of the cluster nodes
$ClusterName = (Get-Cluster).Name
# Get the OS build number, which determines which Cluster Group name this stuff is in
$osBuildNumber = (Get-CimInstance -ClassName Win32_OperatingSystem).BuildNumber
# BuildNumber is returned as a string, so cast it to get a numeric comparison
If ([int]$osBuildNumber -le 17763) { $groupName = "Cluster Group" } Else { $groupName = "SDDC Group" }
# Confirm busted resource status
Get-ClusterGroup -Name $groupName | Get-ClusterResource
# Remove the busted service
Remove-ClusterResource -Name "Health" -Force
# Confirm the Health resource is gone by comparing output to above
Get-ClusterGroup -Name $groupName | Get-ClusterResource
# Enable it and add health providers
Get-CimInstance -Namespace root\mscluster -ComputerName $ClusterName -ClassName MScluster_ClusterService | Invoke-CimMethod -Name EnableHealth
Get-CimInstance -Namespace root\mscluster -ComputerName $ClusterName -ClassName MScluster_ClusterService | Invoke-CimMethod -Name AddHealthProviders -Arguments @{Providers="{7547c610-bd45-4e54-971b-8d319bf0a7a6}","{bb20ff2e-7127-4d86-9f52-906f891230be}","{e24b3e7e-a73a-4108-b79f-894cc1184295}","{89290bad-72c0-43c2-aa41-217818de7528}","{c06b98bf-0dbc-4259-ae0d-5801eda8e7e1}","{29d1f3ee-dbcf-44e9-b0cc-085bfa362499}"}
# Confirm providers are registered
(Get-ClusterResource -Name "Health" | Get-ClusterParameter).Value | Format-List *
# Move the entire group to another cluster node
Get-ClusterGroup -Name $groupName | Move-ClusterGroup
# Confirm all 3x resources are showing up and online
Get-ClusterGroup -Name $groupName | Get-ClusterResource
# Delete and recreate the ClusterPerformanceHistory CSV
# Note that it may take up to 5 minutes before the CSV reappears and logs data
Stop-ClusterPerformanceHistory -DeleteHistory
Start-ClusterPerformanceHistory
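
Because the CSV can take a few minutes to come back, a small polling helper can save some manual re-checking. The sketch below is not part of the original script. Wait-ForCondition is a hypothetical helper name, and the defaults are arbitrary choices (300 seconds simply matches the 5 minute note above). The commented Get-VirtualDisk check assumes the history volume is backed by a virtual disk named ClusterPerformanceHistory, which you should confirm on your own cluster.

```powershell
# Hypothetical helper: re-runs a script block until it returns something truthy
# or the timeout expires. Returns $true on success, $false on timeout.
function Wait-ForCondition {
    param(
        [Parameter(Mandatory)] [scriptblock] $Condition,
        [int] $TimeoutSeconds = 300,
        [int] $IntervalSeconds = 15
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ($true) {
        if (& $Condition) { return $true }
        if ((Get-Date) -ge $deadline) { return $false }
        Start-Sleep -Seconds $IntervalSeconds
    }
}

# Examples on a cluster node (not executed here):
# Wait-ForCondition -Condition { "$((Get-ClusterResource -Name 'Health').State)" -eq 'Online' }
# Assumption: the virtual disk is named ClusterPerformanceHistory
# Wait-ForCondition -Condition { Get-VirtualDisk -FriendlyName 'ClusterPerformanceHistory' -ErrorAction SilentlyContinue }
```

The condition is always checked at least once, so a timeout of 0 behaves like a single check.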

A Little More About SDDC Management

Since Windows Server 2019, when you deploy a Storage Spaces Direct (S2D) cluster, you get a few out-of-the-box resources for managing the S2D components via WAC.

This changed slightly in Windows Server 2022 (and Azure Stack HCI), where Microsoft moved these resources from the Core Cluster Group to their own SDDC Group.

Failover Cluster Manager sometimes does not show the SDDC Group in the GUI due to a display bug. Microsoft has made it clear that they have no interest in fixing this since you should be using PowerShell or Windows Admin Center… 🤷‍♂️

The SDDC Management resource is a grouping of “microservices” responsible for relaying information about the cluster, its member nodes, networking, and storage to whatever is querying the API. This is the main way that Windows Admin Center (WAC) and other tooling get information about the cluster.

4 thoughts on “Repairing Cluster Health and SDDC Resources in an Azure Local, Azure Stack HCI, or Storage Spaces Direct (S2D) Cluster”

  1. Is there any information out there with regards to what the Providers specifically do? We have a cluster where we cannot add all the health providers, and in fact, specifically provider 29d1f3ee-dbcf-44e9-b0cc-085bfa362499 causes the health service to fail to start. We can add all the other providers and it works OK, but we are unable to create the Cluster Performance History volume. Running the start command has no result.

    Surprisingly there seems to be very little to go off of in Event Viewer as well 🙁

    1. Sorry, I have no idea what they do. My assumption is there’s a relationship between those provider GUIDs and the various “things” the health service tracks (Storage Scale Units, etc). But I honestly do not know.
