This blog describes the fixing of the error below:
| Date and Time: | 6-10-2011 10:25:45 | | Log Name: | Operations Manager | | Source: | Health Service Modules | | Event Number: | 31552 | | Level: | 1 | | Logging Computer: | OPSRMS01 | | User: | N/A | | | Description: | | Failed to store data in the Data Warehouse. Exception 'SqlException': Sql execution failed. Error 777971002, Level 16, State 1, Procedure StandardDatasetAggregate, Line 424, Message: Sql execution failed. Error 777971002, Level 16, State 1, Procedure PerformanceAggregate, Line 149, Message: Sql execution failed. Error 8115, Level 16, State 2, Procedure -, Line 1, Message: Arithmetic overflow error converting expression to data type float. One or more workflows were affected by this. Workflow name: Microsoft.SystemCenter.DataWarehouse.StandardDataSetMaintenance Instance name: Performance data set Instance ID: {7547DA11-6328-54C6-00D6-C0729CD41CD8} Management group: SCOM01 | | | | |
Research:
It seems the aggregation of the hourly performance tables wend wrong. But what table are we talking about?
Okay looking at the error message the stored procedure what caused the error is PerformanceAggregate . Looking at this procedure you will see the SQL code that is giving the problem below.
SET @Statement =
'INSERT ' + QUOTENAME(@SchemaName) + '.' + QUOTENAME(@InsertTableName) + ' ('
+ ' [DateTime]'
+ ' ,PerformanceRuleInstanceRowId'
+ ' ,ManagedEntityRowId'
+ ' ,SampleCount'
+ ' ,AverageValue'
+ ' ,MinValue'
+ ' ,MaxValue'
+ ' ,StandardDeviation'
+ ')'
+ ' SELECT'
+ ' CONVERT(datetime, ''' + CONVERT(varchar(50), @IntervalStartDateTime, 120) + ''', 120)'
+ ' ,PerformanceRuleInstanceRowId'
+ ' ,ManagedEntityRowId'
+ ' ,COUNT(*)'
+ ' ,AVG(SampleValue)'
+ ' ,MIN(SampleValue)'
+ ' ,MAX(SampleValue)'
+ ' ,ISNULL(STDEV(SampleValue), 0)'
+ ' FROM ' + QUOTENAME(@SchemaName) + '.' + QUOTENAME(@CoverViewName)
+ ' WHERE ([DateTime] >= CONVERT(datetime, ''' + CONVERT(varchar(50), @IntervalStartDateTime, 120) + ''', 120))'
+ ' AND ([DateTime] < CONVERT(datetime, ''' + CONVERT(varchar(50), @IntervalEndDateTime, 120) + ''', 120))'
+ ' GROUP BY PerformanceRuleInstanceRowId, ManagedEntityRowId'
Since we are investigating a performance issue the @SchemaName and @CoverViewName would be ‘Perf.vPerfRaw’. Now we have to determine the correct values for the @IntervalStartDateTime and @IntervalEndDateTime. This can be done by looking at the StandardDatasetAggregationHistory table, by running the query below. We know it’s a performance issue so we look at the performance aggregate dataset and then we look in the history table for the last good aggregation for this dataset.
declare @DataSetId as uniqueidentifier
select top 1 @DataSetId=SDS.DataSetId from dbo.StandardDatasetAggregation SDA
inner join StandardDataSet SDS on SDS.DataSetId=SDA.DataSetId
where SDA.BuildAggregationStoredprocedureName like '%PerformanceAggregate%'
select * from dbo.StandardDatasetAggregationHistory SDA
inner join dbo.StandardDataset SD on SD.DatasetId=SDA.DatasetId
where DirtyInd=1 and SDA.DataSetId=@DataSetId
order by AggregationDateTime ASC
And whala the fist record below gives me the data period caused my error:
So I change the @IntervalStartDateTime = 2011-09-28 22:00:00 and @IntervalEndDateTime = 2011-09-30 04:01:28. And the query to execute is born:
SELECT CONVERT(datetime, + CONVERT(varchar(50), '2011-09-29 22:00:00', 120), 120)
,PerformanceRuleInstanceRowId
,ManagedEntityRowId
,COUNT(*)
,AVG(SampleValue)
,MIN(SampleValue)
,MAX(SampleValue)
,ISNULL(STDEV(SampleValue), 0)
FROM Perf.vPerfRaw
WHERE ([DateTime] >= CONVERT(datetime, + CONVERT(varchar(50), '2011-09-29 21:00:00', 120), 120))
AND ([DateTime] < CONVERT(datetime, + CONVERT(varchar(50), '2011-09-29 04:01:00', 120), 120))
GROUP BY PerformanceRuleInstanceRowId, ManagedEntityRowId
Hmm but this query's gives me:
Yes this is exactly what we want. Now we are going to to change the end date to a lower period so we can isolate the record giving the overflow. Doing this I am getting the error period is '2011-09-29 21:05:45'
So next is to hunt down this bad record:
SELECT PerformanceRuleInstanceRowId
,ManagedEntityRowId
,SampleValue
FROM Perf.vPerfRaw
WHERE ([DateTime] = CONVERT(datetime, + CONVERT(varchar(50), '2011-09-29 21:05:45', 120), 120))
order by SampleValue
Wooaaw found it:
Hmm the STDEV doesn’t like showing large negative number.
Let’s look what this function does:
| Returns the statistical standard deviation of all values in the specified expression. May be followed by the OVER clause. |
Okay we could look and investigate what data value we must change it to but I am not willing to spend to much time. Since the value is soooo large I probably assume the measurement was false. So I will change it to 0.
I you wanted still to investigate you could use the query below and change the E+217 to a lower value till the query runs okay:
declare @float as Float
set @float = -1.1031304526204E+217
select @float
select STDEV(@float)
p.s E+154 is the maximum you can apply ;-))
As i said I am going to change this bad records to 0. Since we are looking at a view and this view isn’t updatable we have first to find out the root table containing this data. This isn’t so hard.
The query below gives you the performance RAW table containing the records:
The dadasetid is the same as you had got back in the first query as @DataSetId.
SELECT [StandardDatasetTableMapRowId]
,[DatasetId]
,[AggregationTypeId]
,[TableGuid]
,[TableNameSuffix]
,[InsertInd]
,[OptimizedInd]
,[StartDateTime]
,[EndDateTime]
FROM [OperationsManagerDW].[dbo].[StandardDatasetTableMap]
where datasetid = '1B1F0F44-A208-4145-8E59-9121357D78F2'
and [AggregationTypeId] = 0
and '2011-09-29 21:05:45' between [StartDateTime] and [EndDateTime]
Running this query will give you below the table we have to change:
Yes yes finally we are there. Now we are going to update the records. The table to use is : Perf.PerfRaw_E721608C35A44620AE3E0DE028C3C5A2
So the update query is:
update Perf.PerfRaw_E721608C35A44620AE3E0DE028C3C5A2
set SampleValue = 0
WHERE ([DateTime] = CONVERT(datetime, + CONVERT(varchar(50), '2011-09-29 21:05:45', 120), 120))
and SampleValue = -1.1031304526204E+217
The result is , as expected:
Lets check if its now fixed:
SELECT CONVERT(datetime, + CONVERT(varchar(50), '2011-09-29 22:00:00', 120), 120)
,PerformanceRuleInstanceRowId
,ManagedEntityRowId
,COUNT(*)
,AVG(SampleValue)
,MIN(SampleValue)
,MAX(SampleValue)
,ISNULL(STDEV(SampleValue), 0)
FROM Perf.vPerfRaw
WHERE ([DateTime] >= CONVERT(datetime, + CONVERT(varchar(50), '2011-09-29 21:05:45', 120), 120))
AND ([DateTime] < CONVERT(datetime, + CONVERT(varchar(50), '2011-09-29 21:07:50', 120), 120))
GROUP BY PerformanceRuleInstanceRowId, ManagedEntityRowId
Gives me back:
SO IT’S FIXED !!!
But what targets workflows caused this bad data. Take the ManagedEntityRowId and PerformanceRuleInstanceRowId data from the bad records.
Below the query for the guilty targets:
select * from dbo.ManagedEntity
where ManagedEntityRowId in (103425,103424,103426)
And the below the query for the related workflows:
SELECT PerformanceRule.ObjectName, PerformanceRule.CounterName, PerformanceRuleInstance.InstanceName
FROM PerformanceRule INNER JOIN
PerformanceRuleInstance ON PerformanceRule.RuleRowId = PerformanceRuleInstance.RuleRowId
WHERE (PerformanceRuleInstance.PerformanceRuleInstanceRowId = 346638)
Happy SCOMMING!
Michel Kamp