I want to do cumulative aggregation similar to "period to date", but not to a specific period in the past, but to the earliest record in the fact table. The problem is that I have 20 million accounts that I want to follow up on a daily basis, but they change rarely. After a year, perhaps half a million have changed. What I have done is to create a fact table that stores the balance when the account is opened, then only keeps track of the changes, so that when anyone deposits money a row is added with the increment. After a year that would give me about 20.5 million rows in the table. As an analyst I would like to enter a specific date, say 2005-08-20 and see the balance of all the accounts at that time. To do so, I simply need to calculate the sum of all increments of all accounts up to that time.
If I use a regular time dimension in my cube, I can "fake" this behaviour, but it is tedious, since I will also need to select all the dates before the 20th of august 2005. Lots of mouse-clicks and possible user confusion. I would prefer if I only had to select the specific date and the cube did the summing up behind the scenes. Hopefully there's a simple solution, since the current workaround would be to add the balance of all accounts every day (365 * 20 million rows a year). I tried the "Add Business Intelligence" but found nothing suitable.
I'm grateful for any help you can provide,
Lars (Analysis Services rookie)
Some of the ideas in this paper may help, specifically the section on "Summing Increments":
http://www.sqlserveranalysisservices.com/OLAPPapers/InventoryManagement%20in%20AS2005v2.htm
>>
Inventory Management Calculations in SQL Server Analysis Services 2005
...
Summing Increments
For any given warehouse and product, the running delta sum is the sum of Units Delta from the beginning of time.
>>
|||Brilliant!
Create Member Test.[Running Delta Sum] as
sum(
generate
(
ascendants([Time].[Calendar].currentmember),
iif(
[Time].[Calendar].currentmember IS
[Time].[Calendar].firstsibling,
{},
[Time].[Calendar].firstsibling:
[Time].[Calendar].prevmember
)
) + [Time].[Calendar].currentmember
,
[Measures].[Account Balance Change]
)
did the trick for me. Provided that the [Time] dimension has a hierarchy named [Calendar] the performance is great as well.
Many thanks,
Lars
With that in place I've stumbled upon another problem. The account balance is correctly calculated using the already precalculated aggregations regardless of the number of deposits or withdraws having been made. Now I want to introduce another measure, the number of unique accounts. I suspect that the same approach cannot be used then, since the count of distinct accounts must be calculated over the entire time span at its lowest level to ensure that there are no duplicates.
I suppose have to do something along the lines of:
Create Member Measures.[Unique Accounts] as
distinctcount (
[Time].[Calendar].[Date].members(0):
ClosingPeriod([Time].[Calendar].[Date]),
Measures.[AccountID]
)
as suggested as a less optimized approach in the referenced article. Unfortunately distinctcount() only takes a single argument, so I need to somehow bake the two arguments into a <<set>>. Forgive me if this is covered in beginners MDX.
Any help is appreciated,
Lars
Hi Lars,
Can't you simply create a measure on the AccountID source column, using the built-in "DistinctCount" aggregation function?
http://msdn2.microsoft.com/en-us/library/ms175623.aspx
>>
Configuring Measure Properties
AggregateFunction
Determines how measures are aggregated. For more information, see Aggregation Functions.
DistinctCount
Nonadditive
Retrieves the count of all unique child members.
>>
|||Hi Deepak,
if I do that I will only see the count of distinct accounts for the selected day, and that day alone. In other words, only for accounts whose balance changed that day, which doesn't correspond to the logic behind how the running sum is calculated. I need to caculate the number of distinct accounts from the earliest record up to the selected day. That way it will correspond with the sum that is shown and I could use the two to for example calculate an average balance per account.
Using the built-in distinct count on the AccountID column doesn't work then.
Thanks,
Lars
Lars,
With AS 2005 you can now apply Aggregate() to distinct count measures as well:
http://sqljunkies.com/WebLog/mosha/archive/2005/10/11/mdx_functions_as2005.aspx
>>
New AS2005 functions
Aggregate( [, ] )
Well, this isn't new function, but there were significant changes around it. First, Aggregate works with Distinct Count measure...
>>
So you could have something like:
Create Member Measures.[Unique Accounts] as
Aggregate (
[Time].[Calendar].[Date].members(0):
ClosingPeriod([Time].[Calendar].[Date]),
Measures.[AccountIDCount]
)
where Measures.[AccountIDCount]is a distinct count measure on the AccountID column.
Deepak,
I cannot even begin to express how grateful I am for your help. Now the distinct count is working as intended as per your example above. However, it seems that everytime I think I'm done, something else comes up. Dare I ask another question:
My accounts have an account status as well, in which an account can be "Open" or "Closed". When an account is "Closed" it is represented by a row containing a negative delta evening out the balance to zero. In other words, this keeps the running sum of the account balance in order, but it is messing up my distinct count. When I look at the number of accounts I only want to count the ones that haven't been closed, so I am guessing that I have to use Exists() or Filter() to get the set of non-closed accounts. Again though, the MDX is getting the better of me. I tried to do a measure counting the number of closed accounts to begin with:
Create Member Measures.[Unique Closed Accounts] as
Aggregate (
Exists (
[Time].[Calendar].[Date].members(0):
ClosingPeriod([Time].[Calendar].[Date]),
[AccountStatus].[Status].[Closed]
),
Measures.[AccountIDCount]
)
Although, this still counts all the accounts, whether they have been closed or not. How do I filter the set of accounts to only include the closed ones?
Thanks!
Lars
|||
Lars,
You could try including [AccountStatus].[Status].[Closed] in the set that is aggregated:
Create Member Measures.[Unique Closed Accounts] as
Aggregate (
{[Time].[Calendar].[Date].members(0):
ClosingPeriod([Time].[Calendar].[Date])}
* [AccountStatus].[Status].[Closed],
Measures.[AccountIDCount]
)
No comments:
Post a Comment