Friday, March 23, 2012

Mining Model predicts same set of products for all cases

Hi

I have developed a product

basket mining model as follows

DSV

SELECT C.CustomerId,C.CustomerIdName,P.ProductId,P.ProductIdName

FROM Customer INNER

JOIN CustomerProduct

ON C.OpportunityId

= P.OpportunityId

Mining Structure

CREATE MINING MODEL ProductBasket

(

CustomerId

TEXT KEY,

CustomerIdName

PREDICT,

ProductId

PREDICT,

ProductRecommend

TABLE PREDICT

(

ProductId TEXT KEY

ProductIdName PREDICTONLY

)

)

USING Microsoft_Association_Rules

Prediction Query

Since I want the output in the following format

Product

ID

Items (nested Table)

Product

A

product B

Product C

Product

B

Product A

Product C

I have written the prediction query as follows

SELECT

t.[ProductId],

PredictAssociation([Association].[ ProductId],3)

From

[Association]

PREDICTION JOIN

OPENQUERY([Adventure Works Cycle MSCRM],

'SELECT DISTINCT

[ProductId]

FROM

(SELECT ProductId FROM ProductBase)

as [Product]

') AS t

ON

[Association].[Product Id] = t.[ProductId]

The model is predicting the

same set of products for every case. Even changes in the algorithm parameter

value do not have any impact on the result.

What is the reason for this and

how can u rectify it?

If you're doing a simple recommendation model (i.e. a model that provides recommendations based on what's in the user's shopping basket, there is no need to have the product id at both case and nested level. Your modeling seems unneccessarily complex for this scenario - it also explains why you're seeing the same recommendations for all cases: based on the DSV you've listed, your training cases don't include the basket contents at all for each transaction.

If you use a model definition similar to the one I described in an earlier thread on this topic (http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=649759&SiteID=1), your query would look like this:


SELECT t.CustomerID, PredictAssociation(AssociationModel.Products, 3) FROM PM
PREDICTION JOIN
SHAPE {
OPENQUERY(ds, 'SELECT CustomerID FROM CustomerProducts ORDER BY CustomerID')
}
APPEND ( {
OPENQUERY(ds, 'SELECT CustomerID, ProductName FROM CustomerProducts ORDER BY CustomerID') }
RELATE CustomerID to CustomerID ) AS Products as t
ON AssociationModel.Products.ProductName = t.Products.ProductName

The core idea is that the input case (for prediction as well as training) is the set of products in the customer's basket and the prediction is the list of recommended products. Both the input and prediction map to the same nested table in the mining structure/model.

|||

The first query should be SELECT DISTINCT CustomerID ...

|||

Also, if you want simply a list of recommendations per product, and you first created a model based on customer baskets you could then create a query like this to get the complete list of product recommendations:

SELECT t.ProductName, PredictAssociation(Products, 3) FROM PM
PREDICTION JOIN
SHAPE {
OPENQUERY(ds, 'SELECT DISTINCT ProductName FROM CustomerProducts ORDER BY ProductName')
}
APPEND ( {
OPENQUERY(ds, 'SELECT DISTINCT ProductName FROM CustomerProducts ORDER BY ProductName') }
RELATE ProductName to ProductName) AS Products as t
ON PM.Products.ProductName = t.Products.ProductName

This works because you are specifying a basket containing a single item as input for the prediction on a model that was created on customer baskets. Using the productname on the top level of the input query for the prediction is just a shortcut since we know we want baskets of size 1 anyway. Doing an input query like this for the model training would cause no rules to be found since only baskets greater than size 1 imply any patterns.

No comments:

Post a Comment