Saturday, February 25, 2012

Cube from denormalized table: Several dimensions vs. one dimension with several attributes

Hi,

I want to build a cube in AS2005 from a denormalized table.

The table consists of facts from the last three years, with all interesting attributes (about 12) in a seperate column of this table.

Actulally this attributes would form three dimensions (semantically).

I thought of two different approaches to structure the cube:

1. Building a separate dimension for each attribute

or

2. Building three dimensions (or to be more radical : one dimension) that contains all attributes

In my understanding, with the new concept of attribute based hierarchies, both variants should have the same effect on aggregations, performance and so on.

Only difference would the structure shown to the user .

Is this right?

Does autoexist have any influence on the both design variants?

It seems like modeling each attribute as a separate dimension would be a lot easier to use for analysis, right?|||

That is entirely not so.

When designing your model, you should group attributes into dimensions. For instance if you have geographical data , like Country, State, City, you should build a single dimension and create hierarchy.

Analysis Services although allows you to create model from almost any type of schema, you should try to follow the relationships between entities in relational data. That is you should design multideimensional model as if data as normalized as possible.

Having 3 dimensions with multiple attributes sounds like better idea.

Edward.
--
This posting is provided "AS IS" with no warranties, and confers no rights.

|||

Thank you Edward, for your reply.

That is what I supposed to do, but:

As I mentioned, the table contains facts from several years. Within this time, there are dimenions, in which some attributes changed. (Refential integrity in the denormalized table is destroyed)

So when I'm trying to process a dimension with several attributes directly on the table, I get the "key duplicate" error from the processing engine.

To build a model with 3 dimension contailing several attributes woud lead to a reverse design of a slow changing dimemsion, with surrogate key for each attribute combination and timestamp etc......

This is what I wanted to avoid if possible! So I thought of building a dimension form every single attribute in the table. Then historical changes of attributes stay related to the facts correctly.

As I found in another whitepaper, a dimension in ass2005 is nothing more than a logical container for attribute dimensions.--> Is this right ?

So what would be the advantages grouping attributes to dimensions.

As I am quite new to OLAP and MDX I haven't an overview on the technology by now. Are there any disadvantages when later querying the cube? (Restrictions, performance)

Thanks,

Keme

|||

The problem with creating dimension per attribute would come when your users will try to analyze the data , they run into problems trying to make sence out of the numbers.

You might be saving yourself some time by treating attributes as separate, but if they are logically related, your users will tell you that numbers are wrong. You'd have to come back and build hierarchies and work out through processing errors. Analysis Services are very good pointing out inconsistencies in data.

Edward.
--
This posting is provided "AS IS" with no warranties, and confers no rights.

No comments:

Post a Comment