Understanding the Basic Concepts Used in Smart Predict

Here are some explanations of terms that you'll encounter when you create a predictive scenario:
  • Predictive Scenario: A workspace where you create and compare predictive models to find the one that provides the best insights into a business question requiring predictions. Currently, you can choose between three types: classification, regression, and time series forecasting.
  • Predictive Model: The result found by Smart Predict after exploring relationships in your data using SAP automated machine learning. Each predictive model produces visualizations and performance indicators based on the requirements that you have set, so you can understand and evaluate the accuracy of the predictive results. You'll probably want to experiment with different predictive models, varying the input data or the training settings, until you are satisfied with the accuracy and relevance of the results.
  • Data source: The form and origin of the data that you'll use to create a predictive model. This could be a dataset in a database or a planning model in an SAP Analytics Cloud story.
  • Target: The variable that you want to explain or predict values for. Depending on your data source, this is the column or dimension that you're interested in.
    In the Smart Predict documentation, the term variable is used to mean either column or dimension. However, in the user interface and messages, you'll see the specific term for the data source being used: columns in datasets and dimensions in planning model versions.
  • Entity: Only used in time series forecasting predictive scenarios. You can split up a population into distinct sections called entities. A predictive model is created for each entity, which lets you get more accurate forecasts aligned with that entity's particular characteristics.
  • Influencers: The variables that have an influence on the target. By default, the predictive model considers all the columns or dimensions as influencers and, during training, retains only the significant ones. You can choose to exclude influencers that you consider not worth including in the training. This is useful when dealing with large data sources.
  • Training: The process that uses SAP automated machine learning to explore relationships in your data and find the best combinations. The result is a formula, your predictive model, that can be applied to new data to obtain predictions.