Friday, March 23, 2012

Mining cubes vs relational

I've not found much guidance so far about the pros and cons of mining OLAP cubes vs their underlying fact and dimension tables. Can anyone offer advice in this area, or point me to more info about this?

Thanks,
Kevin

Generally, I recommend mining OLAP cubes when the inputs you need are the kind of results OLAP is good at producing; otherwise, mine the relational tables. OLAP is not particularly good at returning large amounts of detail data. However, if you were mining stores, for example, and your inputs were total sales of products for particular months or categories, or even the sales acceleration of those totals by store, those measures are difficult and slow to obtain through relational means, and I would recommend OLAP.
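To make the store example concrete, here is a minimal Python sketch of the per-store inputs described above. It is only an illustration: the fact rows, the function names, and the reading of "sales acceleration" as the second difference of the monthly totals are all assumptions, not something the cube or any particular tool defines.

```python
from collections import defaultdict

# Hypothetical fact rows: (store, month, sales amount).
fact_rows = [
    ("Store A", "2012-01", 100.0),
    ("Store A", "2012-01", 50.0),
    ("Store A", "2012-02", 180.0),
    ("Store A", "2012-03", 240.0),
    ("Store B", "2012-01", 80.0),
    ("Store B", "2012-02", 90.0),
    ("Store B", "2012-03", 70.0),
]

def monthly_totals(rows):
    """Sum sales per store and month (the kind of aggregate a cube precomputes)."""
    totals = defaultdict(lambda: defaultdict(float))
    for store, month, amount in rows:
        totals[store][month] += amount
    return totals

def acceleration(series):
    """Assumed definition: the change in the month-over-month change."""
    growth = [b - a for a, b in zip(series, series[1:])]
    return [b - a for a, b in zip(growth, growth[1:])]

def store_cases(rows):
    """Build one mining case per store: monthly totals plus acceleration."""
    cases = {}
    for store, by_month in monthly_totals(rows).items():
        series = [by_month[month] for month in sorted(by_month)]
        cases[store] = {"totals": series, "acceleration": acceleration(series)}
    return cases

cases = store_cases(fact_rows)
```

Each entry in `cases` is one row per store, which is the shape a mining model would take as input. Against a large fact table, the relational equivalent needs a grouped aggregate plus two self-comparisons across months, which is where a cube's precomputed aggregates tend to help.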

Another issue with OLAP mining models is that there really isn't any way to split the data into training and validation sets. If that is a requirement and you need the aggregation and calculation facilities of OLAP, you can write the OLAP results to a table, split that table into training and validation sets, and then mine the resulting relational table.
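The split step can be sketched roughly as follows, assuming the OLAP results have already been exported as one row per case. The row contents, the `split_cases` helper, and the 70/30 ratio are hypothetical choices for illustration only.

```python
import random

def split_cases(cases, validation_fraction=0.3, seed=42):
    """Shuffle the cases reproducibly and return (training, validation)."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    n_validation = int(round(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]

# Hypothetical rows, standing in for OLAP results already landed in a table.
olap_rows = [
    {"store": f"Store {i}", "total_sales": 1000.0 + 10 * i} for i in range(10)
]

training, validation = split_cases(olap_rows)
```

Fixing the seed keeps the split repeatable, so the same cases land in the validation set each time the model is retrained. Each list can then be written to its own relational table and mined from there.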

