StateView blog

[HOWTO] Failed to store data in the Data Warehouse : Arithmetic overflow error converting expression to data type float.

06 okt, 2011

 

This blog describes the fixing of the error below:

 

Date and Time:

6-10-2011 10:25:45

Log Name:

Operations Manager

Source:

Health Service Modules

Event Number:

31552

Level:

1

Logging Computer:

OPSRMS01

User:

N/A

Description:

Failed to store data in the Data Warehouse. Exception 'SqlException': Sql execution failed. Error 777971002, Level 16, State 1, Procedure StandardDatasetAggregate, Line 424, Message: Sql execution failed. Error 777971002, Level 16, State 1, Procedure PerformanceAggregate, Line 149, Message: Sql execution failed. Error 8115, Level 16, State 2, Procedure -, Line 1, Message: Arithmetic overflow error converting expression to data type float. One or more workflows were affected by this. Workflow name: Microsoft.SystemCenter.DataWarehouse.StandardDataSetMaintenance Instance name: Performance data set Instance ID: {7547DA11-6328-54C6-00D6-C0729CD41CD8} Management group: SCOM01

 

 

Research:

It seems the aggregation of the hourly performance tables wend wrong. But what table are we talking about?

Okay looking at the error message the stored procedure what caused the error is PerformanceAggregate . Looking at this procedure you will see the SQL code that is giving the problem below.

 

SET @Statement =
        'INSERT ' + QUOTENAME(@SchemaName) + '.' + QUOTENAME(@InsertTableName) + ' ('
      + '  [DateTime]'
      + ' ,PerformanceRuleInstanceRowId'
      + ' ,ManagedEntityRowId'
      + ' ,SampleCount'
      + ' ,AverageValue'
      + ' ,MinValue'
      + ' ,MaxValue'
      + ' ,StandardDeviation'
      + ')'
      + ' SELECT'
      + '    CONVERT(datetime, ''' + CONVERT(varchar(50), @IntervalStartDateTime, 120) + ''', 120)'
      + '   ,PerformanceRuleInstanceRowId'
      + '   ,ManagedEntityRowId'
      + '   ,COUNT(*)'
      + '   ,AVG(SampleValue)'
      + '   ,MIN(SampleValue)'
      + '   ,MAX(SampleValue)'
      + '   ,ISNULL(STDEV(SampleValue), 0)'
      + ' FROM ' + QUOTENAME(@SchemaName) + '.' + QUOTENAME(@CoverViewName)
      + ' WHERE ([DateTime] >= CONVERT(datetime, ''' + CONVERT(varchar(50), @IntervalStartDateTime, 120) + ''', 120))'
      + '   AND ([DateTime] < CONVERT(datetime, ''' + CONVERT(varchar(50), @IntervalEndDateTime, 120) + ''', 120))'
      + ' GROUP BY PerformanceRuleInstanceRowId, ManagedEntityRowId'

 

Since we are investigating a performance issue the @SchemaName and @CoverViewName would be ‘Perf.vPerfRaw’. Now we have to determine the correct values for the @IntervalStartDateTime and @IntervalEndDateTime. This can be done by looking at the StandardDatasetAggregationHistory table, by running the query below. We know it’s a performance issue so we look at the performance aggregate dataset and then we look in the history table for the last good aggregation for this dataset.

 

declare @DataSetId as uniqueidentifier 
select top 1 @DataSetId=SDS.DataSetId from dbo.StandardDatasetAggregation SDA
inner join StandardDataSet SDS on SDS.DataSetId=SDA.DataSetId
where SDA.BuildAggregationStoredprocedureName like '%PerformanceAggregate%'
select * from dbo.StandardDatasetAggregationHistory SDA
inner join dbo.StandardDataset SD on SD.DatasetId=SDA.DatasetId
where DirtyInd=1 and SDA.DataSetId=@DataSetId
order by AggregationDateTime ASC 

 

And whala the fist record below gives me the data period caused my error:

 

clip_image001

 

So I change the @IntervalStartDateTime = 2011-09-28 22:00:00  and @IntervalEndDateTime = 2011-09-30 04:01:28. And the query to execute is born:

 

SELECT CONVERT(datetime, + CONVERT(varchar(50), '2011-09-29 22:00:00', 120), 120) 
,PerformanceRuleInstanceRowId 
,ManagedEntityRowId 
,COUNT(*) 
,AVG(SampleValue) 
,MIN(SampleValue) 
,MAX(SampleValue) 
,ISNULL(STDEV(SampleValue), 0) 
FROM Perf.vPerfRaw 
WHERE ([DateTime] >= CONVERT(datetime, + CONVERT(varchar(50), '2011-09-29 21:00:00', 120), 120)) 
AND ([DateTime] < CONVERT(datetime, + CONVERT(varchar(50), '2011-09-29 04:01:00', 120), 120)) 
GROUP BY PerformanceRuleInstanceRowId, ManagedEntityRowId 

Hmm but this query's gives me:

 

clip_image002

 

Yes this is exactly what we want. Now we are going to to change the end date to a lower period so we can isolate the record giving the overflow. Doing this I am getting the error period is '2011-09-29 21:05:45'

So next is to hunt down this bad record:

 

SELECT PerformanceRuleInstanceRowId 
,ManagedEntityRowId 
,SampleValue 
FROM Perf.vPerfRaw 
WHERE ([DateTime] = CONVERT(datetime, + CONVERT(varchar(50), '2011-09-29 21:05:45', 120), 120)) 
order by SampleValue 

 

Wooaaw found it:

 

clip_image003

 

Hmm the STDEV doesn’t like showing large negative number.

Let’s look what this function does:

Returns the statistical standard deviation of all values in the specified expression. May be followed by the OVER clause.

 

Okay we could look and investigate what data value we must change it to but I am not willing to spend to much time. Since the value is soooo large I probably assume the measurement was false. So I will change it to 0.

I you wanted still to investigate you could use the query below and change the E+217 to a lower value till the query runs okay:

 

declare @float as Float 
set @float = -1.1031304526204E+217 
select @float 
select STDEV(@float) 

p.s E+154 is the maximum you can apply ;-))

 

As i said I am going to change this bad records to 0. Since we are looking at a view and this view isn’t updatable we have first to find out the root table containing this data. This isn’t so hard.

The query below gives you the performance RAW table containing the records:

The dadasetid is the same as you had got back in the first query as @DataSetId.

 

SELECT [StandardDatasetTableMapRowId] 
,[DatasetId] 
,[AggregationTypeId] 
,[TableGuid] 
,[TableNameSuffix] 
,[InsertInd] 
,[OptimizedInd] 
,[StartDateTime] 
,[EndDateTime] 
FROM [OperationsManagerDW].[dbo].[StandardDatasetTableMap] 
where datasetid = '1B1F0F44-A208-4145-8E59-9121357D78F2' 
and [AggregationTypeId] = 0 
and '2011-09-29 21:05:45' between [StartDateTime] and [EndDateTime] 

 

Running this query will give you below the table we have to change:

 

clip_image004

 

Yes yes finally we are there. Now we are going to update the records. The table to use is : Perf.PerfRaw_E721608C35A44620AE3E0DE028C3C5A2

So the update query is:

 

update Perf.PerfRaw_E721608C35A44620AE3E0DE028C3C5A2 
set SampleValue = 0 
WHERE ([DateTime] = CONVERT(datetime, + CONVERT(varchar(50), '2011-09-29 21:05:45', 120), 120)) 
and SampleValue = -1.1031304526204E+217 

 

The result is , as expected:

 

clip_image005

 

Lets check if its now fixed:

 

SELECT CONVERT(datetime, + CONVERT(varchar(50), '2011-09-29 22:00:00', 120), 120) 
,PerformanceRuleInstanceRowId 
,ManagedEntityRowId 
,COUNT(*) 
,AVG(SampleValue) 
,MIN(SampleValue) 
,MAX(SampleValue) 
,ISNULL(STDEV(SampleValue), 0) 
FROM Perf.vPerfRaw 
WHERE ([DateTime] >= CONVERT(datetime, + CONVERT(varchar(50), '2011-09-29 21:05:45', 120), 120)) 
AND ([DateTime] < CONVERT(datetime, + CONVERT(varchar(50), '2011-09-29 21:07:50', 120), 120)) 
GROUP BY PerformanceRuleInstanceRowId, ManagedEntityRowId 

 

Gives me back:

 

clip_image006

 

SO IT’S FIXED !!!

 

But what targets workflows caused this bad data. Take the ManagedEntityRowId and PerformanceRuleInstanceRowId data from the bad records.

Below the query for the guilty targets:

select * from dbo.ManagedEntity 
where ManagedEntityRowId in (103425,103424,103426) 

clip_image007

And the below the query for the related workflows:

SELECT PerformanceRule.ObjectName, PerformanceRule.CounterName, PerformanceRuleInstance.InstanceName 
FROM PerformanceRule INNER JOIN 
PerformanceRuleInstance ON PerformanceRule.RuleRowId = PerformanceRuleInstance.RuleRowId 
WHERE (PerformanceRuleInstance.PerformanceRuleInstanceRowId = 346638) 

clip_image008

 

Happy SCOMMING!

Michel Kamp

 

You are not allowed to post comments.

Monitoring van onder andere: