Understanding Predictive Goal and Training Roles for Variables
A variable corresponds to a column in a dataset or a dimension in a planning model. The observations for each variable correspond to the rows. Variables specified as a target or as an entity identifier are not treated as influencers. All other variables are treated as influencers unless you exclude them. After training, only the most significant influencers are retained in the predictive model reports for debriefing.
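
The role rules above can be sketched in a few lines of Python. This is only an illustration of the logic described here, not the product's implementation: the function name `resolve_influencers` and the column names are hypothetical.

```python
def resolve_influencers(columns, target, entity=None, excluded=()):
    """Return the columns that would be treated as influencers.

    Hypothetical helper: a column is an influencer unless it is the
    target, the entity identifier, or explicitly excluded.
    """
    reserved = {target}
    if entity is not None:
        reserved.add(entity)
    reserved.update(excluded)
    # Preserve the original column order.
    return [name for name in columns if name not in reserved]


# Hypothetical dataset columns, for illustration only.
columns = ["has_bought_product", "billing_amount", "account_number", "age", "region"]

influencers = resolve_influencers(
    columns,
    target="has_bought_product",
    excluded=["billing_amount", "account_number"],
)
print(influencers)  # ['age', 'region']
```

Training would then decide which of the remaining influencers are significant enough to keep in the reports.
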
| Role | Description | Example | 
|---|---|---|
| Target | The variable that you want to explain or predict values for. | |
| Date | The variable used for the date values. Note: This variable is mandatory for a time series predictive scenario. | The date values in your dataset must use one of the supported date formats, for example YYYY-MM-DD. In these formats, YYYY stands for the year, MM for the month, DD for the day of the month, hh for the hour, mm for the minutes, and ss for the seconds. Note: If you use the YYYY-MM-DD date format, you can create time series predictive scenarios at several date granularities. |
| Entity | Optional in a time series predictive scenario. The entity is the identifier variable used to split the predictive model into entities. Each entity produces its own predictive model, so you get distinct predictions for each entity. The predictive model can then capture behaviors that are specific to a given entity and so produce more accurate predictions. The entity can be a dimension in the data, for example Region, Store, or Product Family. | Example: You want to forecast the energy consumption by industry sector for the next 6 months. Your target is <Energy consumption> and your entity is <Industry sector>. You get predictions and performance indicators for each industry sector: commercial, industrial, residential, and transportation. |
| Influencer | Influencers are variables that describe your data and serve to explain a target. Unless excluded, all variables that aren't already selected as a target or an entity identifier are considered influencers, and only the most significant ones are retained after training for debriefing. When you create the predictive model, you can exclude influencers from the training process. Excluded influencers are not taken into account to compute the predictive model, not included in the statistics for the predictive model, not retrieved from the data source, and not needed when you apply the predictive model to an application data source. Remember: You should exclude influencers that are directly related to the target, especially variables that indirectly contain the target variable. Statisticians call these variables "leakers" or "leak variables". Including them produces a wrong predictive model, with wrong performance indicators, that is unable to produce predictions. Example: If a predictive model has the target variable <has bought the product Yes/No>, you should exclude the influencer <Billing amount> if it contains the cost of the product. Tip: If a variable influences the prediction to a very high degree, it may be a leak variable. Excluding influencers that have no influence on the target (for example <account number>) can help speed up the training process. | Example: Your company is marketing two products, A and B. You have a database that contains references to: |
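
As a rough illustration of the leak variable tip, the sketch below flags numeric columns whose correlation with the target is suspiciously close to perfect. It is a simple screening heuristic under stated assumptions, not the method the product uses to measure influence: the function names, the sample data, and the 0.95 threshold are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # A constant column carries no signal.
    return cov / sqrt(var_x * var_y)


def suspected_leakers(rows, target, threshold=0.95):
    """Return columns whose absolute correlation with the target
    reaches the (arbitrary, hypothetical) threshold."""
    target_values = [row[target] for row in rows]
    flagged = []
    for name in rows[0]:
        if name == target:
            continue
        values = [row[name] for row in rows]
        if abs(pearson(values, target_values)) >= threshold:
            flagged.append(name)
    return flagged


# Hypothetical data: billing_amount is non-zero only when the product was bought.
rows = [
    {"has_bought": 1, "billing_amount": 120.0, "age": 34},
    {"has_bought": 0, "billing_amount": 0.0, "age": 51},
    {"has_bought": 1, "billing_amount": 120.0, "age": 45},
    {"has_bought": 0, "billing_amount": 0.0, "age": 29},
    {"has_bought": 1, "billing_amount": 120.0, "age": 38},
    {"has_bought": 0, "billing_amount": 0.0, "age": 42},
]

print(suspected_leakers(rows, target="has_bought"))  # ['billing_amount']
```

A flagged column is only a candidate. Whether it really contains the target still has to be judged from what the variable means, as in the <Billing amount> example.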