My shop uses a lot of monitoring. In fact, just about every monitoring feature built into SQL Server we have tried, are using, or have plans to use in the near future. Some of the monitoring overlaps in purpose, and some of it steps on each others’ toes. We get useful data, but we’ve also discovered a lot of unexpected errors. One such issue was uncovered this month when we were updating servers to a new cumulative update.
A little background first
We use Collection Sets. A more obscure feature of SQL Server that you may have never even heard of. I admit, I did not remember that it existed before one of my colleagues suggested trying it. Collection sets essentially provide data audits and have helped us track down some anomalies in the past. The collection sets run nearly constant queries across all of our servers, then report back to the CMS to catalog the data. We’ve altered the jobs a bit, most importantly offsetting the collection set’s upload schedules to avoid latency and blocking issues.
Well, after installing a Cumulative Update this month, all of the collection sets broke. Failures started flowing in stating that the data uploads were not processing and our inboxes exploded. We had to figure out what the problem was and then fix over a hundred servers.
Error Message
…Component name: DFT – Upload collection snapshot, Code: -1073450901, Subcomponent: SSIS.Pipeline, Description: “RFS – Read Current Upload Data” failed validation and returned validation status…
Solution
The first step was to determine how to get the collection set jobs working correctly again. We had to stop each collection set, then stop any collection upload jobs that were still running.
That alone wasn’t enough though, we had to then clear the cache files as well. These are stored in a folder on each server running the collection sets. You can find your cache location in the Object Explorer under Management -> Data Collection -> Right-click Properties.
Deleting the cache seemed unnecessary to me at first, but when I tried skipping it, the collection still failed after restart.
We had to fix a lot of servers, and it would have taken hours to do this manually. So I scripted all the changes. I needed it done quickly because I was planning to get Korean food with a friend on the day of the break.
Scripts
You can do this all through PowerShell using Invoke-SqlCmd
, but I find that process is very slow to make an initial connection, limits feedback, and I just needed this done fast.
-- SQL -- Stop Collection Sets EXEC msdb.dbo.sp_syscollector_stop_collection_set @name = 'Query Statistics' EXEC msdb.dbo.sp_syscollector_stop_collection_set @name = 'Server Activity' -- Stop Collection Upload jobs - they keep running after collection stops -- Change job names as necessary EXEC msdb.dbo.sp_stop_job @job_name = 'collection_set_3_upload' EXEC msdb.dbo.sp_stop_job @job_name = 'collection_set_2_upload'
## POWERSHELL # Create file with computer names to connect to, or supply names with commas $Computers = (Get-Content C:\Users\$Env:UserName\Desktop\Computers.txt) # Provide path where you save your cache files $Cache = "D$\Cache\CollectionSet" $StartTime = Get-Date ForEach ($Computer in $Computers) { $BeginTime = Get-Date (Get-ChildItem "\\$Computer\$Cache" -Recurse) | Remove-Item -Force "$Computer complete in $(((Get-Date) - $BeginTime).TotalSeconds) seconds." } "Total Runtime: $(((Get-Date) - $StartTime).TotalSeconds) seconds."
-- SQL -- Start Collection Sets EXEC msdb.dbo.sp_syscollector_start_collection_set @name = 'Query Statistics' EXEC msdb.dbo.sp_syscollector_start_collection_set @name = 'Server Activity'
The PowerShell code should run quickly as it will delete files from more than one server at a time. I included the computer name and run time as feedback so you don’t sit wondering how far along the script is. I had no feedback in my original script, so I sat fretting over how long it would take and how long I’d have to wait to go to lunch.
Filed under: Administration, PowerShell, SQL Server, Troubleshooting Tagged: Administration, CMS, Multi-Server, PowerShell, SQL Server
