How does Smart Predict Select the Best Predictive Model?

Because the values of the target variable are known in your training data source, that data can be used to evaluate the accuracy of the predictive model's results. Using the partition strategy, Smart Predict first creates several versions of the predictive model, then cross-validates their results and keeps the version with the best performance.

How is the selection done for a classification predictive model?

For a classification model, Smart Predict bases the selection on two performance indicators: Predictive Power, which measures quality, and Prediction Confidence, which measures robustness. It selects the model that offers the best compromise between perfect quality and perfect robustness.

How is the selection done for a regression predictive model?

In the case of a regression model, Smart Predict looks at the performance indicators Root Mean Squared Error (RMSE) and Prediction Confidence. It selects the model with the best compromise between these two performance indicators.
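
As an illustration of the first indicator, here is a minimal Python sketch of how an RMSE value can be computed from actual and predicted values. The function name `rmse` and the sample numbers are hypothetical and are not part of Smart Predict. The way Smart Predict weighs RMSE against Prediction Confidence is not described here, so the sketch covers the error measure only.

```python
import math


def rmse(actuals, predictions):
    """Root Mean Squared Error between two equally long sequences of numbers."""
    if len(actuals) != len(predictions) or len(actuals) == 0:
        raise ValueError("actuals and predictions must be non-empty and of equal length")
    # Average the squared differences, then take the square root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actuals, predictions)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Hypothetical values from a validation partition.
    print(rmse([3, 5, 7], [2, 5, 9]))
```

A lower RMSE means that, on average, the predicted values are closer to the actual ones.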

How is the selection done for a time series predictive model?

The selection of the best time series predictive model is based on the horizon-wide Mean Absolute Error (MAE). The time series predictive model is applied to the past observations found in the validation set. For each period, the predictive model calculates as many forecasted values as the analyst requested. This number of forecasts is called the horizon. Each forecasted value is compared to the corresponding actual value. Then, for each possible horizon, a per-horizon MAE is calculated. The horizon-wide MAE is the mean of all per-horizon MAE values.
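
The calculation described above can be sketched in Python as follows. This is an illustrative sketch, not Smart Predict's implementation: the function name `horizon_wide_mae`, the input layout (a list of actual values plus a dictionary of forecasts keyed by the period they were produced from), and the sample numbers are all assumptions made for the example.

```python
from statistics import mean


def horizon_wide_mae(actuals, forecasts_by_origin):
    """Return (per-horizon MAE dict, horizon-wide MAE).

    actuals: actual values of the validation set, indexed by period.
    forecasts_by_origin: maps a period index t to the list of values
        forecasted from t for periods t+1, t+2, ... (hypothetical layout).
    """
    errors_by_horizon = {}
    for origin, forecasts in forecasts_by_origin.items():
        for step, forecast in enumerate(forecasts, start=1):
            target = origin + step
            # Skip forecasts that have no actual value to compare against.
            if target >= len(actuals):
                continue
            error = abs(actuals[target] - forecast)
            errors_by_horizon.setdefault(step, []).append(error)

    # One MAE per horizon step (1 period ahead, 2 periods ahead, ...).
    per_horizon = {h: mean(errs) for h, errs in sorted(errors_by_horizon.items())}
    # The horizon-wide MAE is the mean of the per-horizon MAE values.
    return per_horizon, mean(list(per_horizon.values()))


if __name__ == "__main__":
    actuals = [10.0, 12.0, 14.0, 16.0]
    forecasts = {0: [11.0, 15.0], 1: [13.0, 18.0], 2: [16.0, 20.0]}
    per_horizon, overall = horizon_wide_mae(actuals, forecasts)
    print(per_horizon, overall)
```

Note that the per-horizon values are averaged with equal weight, even when some horizons have fewer forecast and actual pairs available than others.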