Generating and Saving the Predictions for a Classification or Regression Predictive Model

Open the relevant predictive model.

Click Apply Predictive Model.

The Apply Predictive Model window opens.

In the Apply To Population section, select the application dataset you want to apply your predictive model to. This dataset must be prepared beforehand; it cannot be created at this step.

In the Generated Dataset section, select the additional columns you want in your generated dataset:

Replicated Column: select the columns from the training data source that should be replicated in the generated dataset.
Restriction

If your application dataset contains more columns than your training dataset, the additional columns are ignored by the application process.

Statistics & Predictions: This is information about your predictive model that you want to have in the generated dataset.

Information	Description	Comments
Apply Date	It's the start date of the predictive model application.	The type of the column is `TIMESTAMP`.
Train Date	It's the start date of the predictive model training.	The type of the column is `TIMESTAMP`.

Statistics: select the statistics regarding the influencers you want to save in your dataset:

Statistic	Description
Assigned Bin	When selected, individuals in the application population are assigned to reference quantiles defined on the validation population. Assigned bins explained: During training, the validation population is split into quantiles (bins), each defined by a range of scores, which serve as references (assigned bins) for an application population. When a predictive model is applied, each individual in the application population is allocated to an assigned bin based on its predicted score. As each assigned bin represents 10% of the training population, this percentage should remain stable on the application population if the population structure is unchanged. If it does not, this doesn't mean that the predictive model is no longer accurate, but rather that the structure of the population has changed. For example, there may be more or fewer potential churners now than in the past. The accuracy of the predictions should be monitored to back up the decisions. Note: The number of bins is set to 10 and isn't customizable. See the section How does Smart Predict Create Assigned Bins? for information on using assigned bins.
Outlier Indicator	For each row in the application dataset, the Outlier Indicator is 1 if the row is an outlier with respect to the target, otherwise 0. An observation is considered an outlier when the prediction error is greater than 3 times the average prediction error found on similar observations.
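The two statistics above can be illustrated with a minimal Python sketch. It is not Smart Predict's implementation: the function names are hypothetical, numbering bin 1 as the lowest scores is an arbitrary choice, and "similar observations" is assumed to mean rows sharing a caller-supplied group label.

```python
from bisect import bisect_right
from collections import Counter, defaultdict
from statistics import mean, quantiles


def bin_edges(validation_scores, n_bins=10):
    """Cut points that split the validation scores into n_bins quantiles."""
    return quantiles(validation_scores, n=n_bins, method="inclusive")


def assign_bin(score, edges):
    """1-based bin of a score; bin 1 = lowest scores (assumption of this sketch)."""
    # bisect_right counts how many cut points are <= score.
    return bisect_right(edges, score) + 1


def bin_shares(application_scores, edges):
    """Share of the application population that falls into each assigned bin."""
    counts = Counter(assign_bin(s, edges) for s in application_scores)
    total = len(application_scores)
    return {b: counts[b] / total for b in range(1, len(edges) + 2)}


def outlier_indicator(errors, groups):
    """1 if a row's absolute error exceeds 3x the mean absolute error of its group.

    `groups` stands in for "similar observations" (assumption of this sketch).
    """
    by_group = defaultdict(list)
    for err, grp in zip(errors, groups):
        by_group[grp].append(abs(err))
    avg = {g: mean(v) for g, v in by_group.items()}
    return [1 if abs(e) > 3 * avg[g] else 0 for e, g in zip(errors, groups)]
```

Comparing the values returned by `bin_shares` against the expected 10% per bin gives a quick indication of whether the structure of the application population has shifted relative to the reference population.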


Predictions: select the predictions to include in the output table:

Prediction	Description
Predicted Category (classification predictive models, nominal target with 2 values only)	For each row in the application dataset, the Predicted Category is the target category determined by the predictive model. The percentage of predicted target categories found in the application dataset corresponds to the Contacted Population percentage that is set by default when entering the Confusion Matrix. Any change made by the user in the Confusion Matrix does not affect the Predicted Category in the generated dataset. Alternatively, generate the Prediction Probability (instead of the Predicted Category) and set a decision threshold on the probability value based on your business requirements (see How is a Decision Made For a Classification Result?).
Prediction Probability (classification predictive models, nominal target with 2 values only)	For each row in the application dataset, the Prediction Probability is the probability that the Predicted Category is the target value.
Predicted Value (regression predictive models, continuous target)	For each row in the application dataset, the Predicted Value is the value predicted for the target.
Prediction Explanations (classification and regression predictive models)	For each row of the application dataset, the Prediction Explanations is a set of explanations for the prediction.
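The alternative of applying your own decision threshold to the Prediction Probability could look like the following Python sketch. The function names and the 0/1 encoding of the decision are illustrative assumptions, not part of the product.

```python
def predicted_category(probabilities, threshold):
    """Return 1 (target category) when the probability reaches the threshold, else 0."""
    return [1 if p >= threshold else 0 for p in probabilities]


def contacted_share(probabilities, threshold):
    """Fraction of the population that would be classified as the target category."""
    decisions = predicted_category(probabilities, threshold)
    return sum(decisions) / len(decisions)
```

Lowering the threshold increases the share of rows classified as the target category, so the threshold can be tuned to the business requirement without regenerating the predictions.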

Note

If you do not select any statistics or predictions, only the target and the key influencer(s) are included.

Output as: Give a name to your generated dataset.

Click Apply.

The status of your predictive model is updated to <Applied>. You can find your generated dataset with the predictions under Recent Files (from the side navigation, choose Datasets > Recent Files) or on the Files page (from the side navigation, choose Files), where you can search for the file. You can then access your results directly by opening the generated dataset or, depending on your business needs, consume the output dataset in a BI story.
