Import and Prepare Fact Data for a Classic Account Model

You are working with a classic account model and want to import fact data to your model.

In this section, we describe each step required to import data acquired from an external data source into the fact table of your classic account model. There are five main steps in the import process: creating an import job, preparing the data, combining acquired data if needed, mapping the data, and running the import. The process can be run once on demand, or scheduled to be executed on a recurring basis. Use the links below to navigate straight to a specific part of the process. For a complete view of the workflow, check out the video below.

Video: How to Import Data into Models

In this video, you will import data into an existing model, map your data to the model dimensions and measures, review the available mapping options and import methods, and validate your data.

Creating an Import Job

Context

Before moving on to the data preparation step, you first have to import data using the Import Data from option. You can import data either from a file or from a data source. If you're importing data from a data source, select an existing connection or create a new one. For more information, see Import Data to Your Model.
Restriction
The following data sources are not supported by the new model type:
  • SAP BPC
  • SAP Concur
  • SAP ERP
  • SAP Fieldglass
  • Workforce Analytics
  • Dow Jones

Also, for the Salesforce data source, importing data into an existing model isn't supported.

Procedure

  1. Select Import Data from > File or Import Data from > Data Source, depending on where your data is stored.
  2. Select your file if your data is stored in a local file or connect to your data source.
  3. Select the data you want to import.
    The import job is added to the Draft Sources or Import Jobs list, depending on whether you're working with a classic account model or a model with measures (i.e., the new model type). At this point, you can save and exit the process and come back to it later. The uploaded draft data expires 7 days after the upload.
    Note
    The draft source is visible to anyone with import rights to the model.
  4. Set up the import job:
    • If you're importing data to a model with measures, click Set up import.
    • If you're importing data to a classic account model, click the draft source you've just imported listed under Draft Sources.

Results

Now that you've imported data to your model, you're ready for the data preparation step. See Preparing the Data for more information.

Preparing the Data

The data preparation step is where you can resolve data quality issues before the mapping step, but also wrangle your data and make edits such as renaming columns or creating transformations.

First, make sure to resolve data quality issues if there are any. After the initial data import, the Details panel gives you a summary of the characteristics of the model with general information about the imported data, including any data quality issues.

For tenants based on an SAP Data Center, issues can be listed and described under the Model Requirements section if they are related to the model itself, or under the Data Quality section when you click a particular dimension.

For tenants based on a non-SAP Data Center, the Details panel lists all the dimensions and measures of the model. You can click the icon to access more information. If the application has detected issues with a dimension, the exact number of issues is indicated right next to its name, and clicking that number takes you straight to the Validation tab, where you can see a description of the issue.

Select a message to see the options available to resolve any identified quality issue. Use the context-sensitive editing features and the transform bar to edit data in a cell, column, or card. When you select a column, a cell, or content in a cell, a menu appears with options for performing transforms. This menu has two parts:

  • Choose the Quick Actions option to perform actions such as duplicate column, trim whitespace, hide columns, or delete columns or rows. The list below includes all the available actions.
  • Select the (Smart Transformations) icon to list suggested transformations to apply to the column, such as replacing the value in a cell with a suggested value. You can also select Create a Transform and choose from the options listed under the transformation bar displayed above the columns. The transformation bar is used to edit and create transforms.

As you hover over a suggested transformation, the anticipated results are previewed in the grid. To apply the transformation, select the transform. You can manually enter your own transformation in the transformation bar; as the transform is built, a preview is provided in the affected column, cell, or content within the cell. The list below summarizes the available transformations you can apply to selected columns, cells, or content within cells.

Each entry below gives the name of the transformation, its description, and its transform bar format.

  • Delete Rows: Delete rows in the data, either by selecting individual members, or by specifying a range (not possible for text columns; only numerical and Date columns). Transform bar format: provided only as a quick action.
  • Trim Whitespace: Remove spaces, including non-printing characters, from the start and end of strings. Transform bar format: provided only as a quick action.
  • Duplicate Column: Create a copy of an existing column. Transform bar format: provided only as a quick action.
  • Delete Column: Delete a column. Use the Shift key to select and then delete multiple columns. Transform bar format: provided only as a quick action.
  • Remove duplicate rows: Remove duplicate rows when creating or adding data to a model. Transform bar format: provided only as a task bar icon.
  • Concatenate: Combine two or more columns into one. An optional value can be entered to separate the column values. Transform bar format: Concatenate [<Column1>], [<Column2>]… using "value"
  • Split: Split a text column on a chosen delimiter, starting from left to right. The number of splits can be chosen by the user. Transform bar format: Split [<Column>] on "delimiter" repeat "#"
  • Extract: Use this transform to extract a block of text specified as numbers, words, or targeted values within a column to a new column. Transform bar format: Extract [<what to extract>] [<where to extract>] [<which occurrence>] [""] from [<column name>] [<include value option>].

Options for what to extract:

  • number: Limits extracted text to one number from a column cell.
  • word: Limits extracted text to a word from a column cell.
  • everything: Includes all text.
Options for where to extract relative to the target value:
  • before
  • after
  • between
    Note
    You must specify two target values when using between.
  • containing: Extracts a word or number containing the target. For example, if your target is ship, both ship and shipping will be extracted.
  • equal to: Extracts the specific target value if it exists in the cell.
Options for specifying occurrence:
  • first
  • last
  • occurrence: Allows you to specify the position of the target from one to ten.
    Note
    Use occurrence when there are multiple instances of the target.

To extract the first number from all column cells, you would specify the following:

Extract number before first "" from [<column name>].

To extract all text between parentheses in column cells, you would specify:

Extract everything between "(" and ")" from [<column name>].
  • Change: Change a column to uppercase, lowercase, or title case. Transform bar format: Change [<Column>] to (<UPPERCASE>/<lowercase>/<TitleCase>)
  • Replace: Replaces either an entire cell or content that could be found in multiple different cells.
    Note
    You can optionally select the where option to add a <where> clause to the transform bar. You need to specify an associated column and value when using a <where> clause to limit the replace action in a given column.
    Transform bar format: Replace (<cell/content>) in [<Column>] matching "value" with "value"
    With a <where> clause: Replace (<cell/content>) in [<Column>] matching "value" with "value" where [<Column>] is "value"
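
For example, following the formats listed above (the column names and values are only illustrative and may need to be adapted to your own data):

Concatenate [Country], [City] using "-"
Split [Full Name] on " " repeat "1"
Change [Region] to UPPERCASE
Replace content in [Product] matching "Prodcut" with "Product"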

A history of all transforms you implement is displayed in Transform Log. There are two levels of transformation logs recorded: transforms on the entire dataset, and transformations on the currently selected column. Hover over a specific transform in the Transform Log to highlight the impacted column. You can roll back the change by either deleting the entries in the history or using the (Undo/Redo) buttons on the toolbar. You can remove transforms in the Transform Log panel out of sequential order provided that there are no dependencies.

When selecting a dimension, measure, or attribute, the Data Distribution section gives contextual information about a dimension with numerical or textual histograms:
  • Numerical histograms are vertical, and represent the range of values along the x-axis. Hover over any bar to show the count, minimum, and maximum values for the data in the bar. The number of bars can also be adjusted by using the slider above the histogram. Checking the Show Outliers (SAP Data Center) or Include Outliers (Non-SAP Data Center) option includes or removes outliers from the histogram. Below the histogram, a box and whisker plot helps you visualize the histogram's distribution of values.
  • Text histograms are horizontal, and the values are clustered by count or percent. The number of clusters can be adjusted by dragging the slider shown above the histogram. When a cluster contains more than one value, the displayed count is the average count for the cluster. The count is prefaced by a tilde symbol (~) if there are multiple different occurrences. Expand the cluster for a more detailed view of the values in the cluster along with individual counts. Use the search tool to look up specific column values, and press Enter to initiate the search. When you select a value in the histogram, the corresponding column is sorted and the value is highlighted in the grid.
    Note
    The displayed histogram is determined by the data type of the column. A column with numbers could still be considered as text if the data type is set to text.
Once you have checked and fixed all errors, if you're working with a dataset sample, click Validate Data to apply your transforms across the entire dataset and check for data quality errors.
Note
Validating data in the data preparation step is only possible for tenants based on a non-SAP Data Center. For tenants based on an SAP Data Center, you can only validate the full dataset when reviewing the import. See Reviewing and Running the Import for more information.

Selecting Dimension Types

When you create an import job, the application automatically qualifies the data. Typically, columns containing text are identified as dimensions, and columns containing numeric data are identified as measures. You can still change the qualification to another type if needed. For example, you can change a Date dimension to an Organization dimension.

If you’re unsure about dimension types, make sure to check out Learn About Dimensions and Measures.

After you've selected your dimensions, follow the best practices described in the sections below.

Maximum Number of Dimension Members

To maintain optimal performance, the application sets a limit to the number of unique members per dimension when importing data to a new model. For more information, check out System Requirements and Technical Prerequisites.

For non-planning-enabled models only, you can import dimensions with more than the maximum number of members. However, the following restrictions apply:
  • In the data integration view, these dimensions cannot have any dimension attributes added to them, such as description or property.
  • Once the dimensions are imported into the Modeler, they have only one ID column, and are read-only.
  • The dimensions can't be used as exception aggregation dimensions or required dimensions.
  • The dimensions can't be referenced in formulas.

Calculated Columns

While you're preparing data in the data integration view, you can create a calculated column based on input from another column and the application of a formula expression.

Click the icon in the menu toolbar, and use the Create Calculated Column interactive dialog to build up your calculated column. Add a name, and build the formula for your column in the Edit Formula space. You can either select an entry from Formula Functions as a starting point or type "[" to view all the available columns. Press Ctrl + Space or Cmd + Space to view all the available functions and columns.

Click Preview to view a 10-line sample of the results of the formula. Click OK to add the calculated column to the model. If necessary, you can go back and edit the calculated column's formula by clicking Edit Formula in the Designer panel.

For a listing of supported functions, see Supported Functions for Calculated Columns.
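
Conceptually, a calculated column derives a new value for each row from the values of existing columns. The following minimal pandas sketch only illustrates that idea; the column names are invented and the syntax is not the product's formula language:

  import pandas as pd

  df = pd.DataFrame({
      "Quantity": [10, 5, 8],
      "Unit_Price": [2.5, 4.0, 3.0],
  })

  # A calculated column built from two existing columns, comparable to a
  # formula that multiplies Quantity by Unit_Price in the Edit Formula space.
  df["Revenue"] = df["Quantity"] * df["Unit_Price"]
  print(df)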

Dimension Attributes

Dimension attributes hold information that is not suitable for standalone dimensions. You can use them to create charts, filters, calculations, input controls, linked analyses, and tables (with “import data” models only). For example, if you have a Customers column and a Phone Numbers column, you could set Phone Numbers to be an attribute of the Customers dimension.

There are multiple types of attributes available:
  • Description: The column can be used for descriptive labels of the dimension members when the member IDs (the unique identifiers for the dimension members) are technical and not easily understandable.

    For example, if your imported data contains a pair of related columns Product_Description and Product_ID with data descriptions and data identifiers, you can set Product_Description to be the Description attribute for the Product_ID dimension. Note that the Product_ID column would then need to contain unique identifiers for the dimension members.

  • Property: The column represents information that is related to the dimension; for example, phone numbers.
  • Parent-Child Hierarchy (Parent): The column is the parent of the parent/child hierarchy pair.

    For example, if your imported data contains the two columns Country and City, you can set the Country column to be the parent of the Country-City hierarchy (a small sketch of this layout follows this list).

    The hierarchy column is a free-format text attribute where you can enter the ID value of the parent member. By maintaining parent-child relationships in this way, you can build up a data hierarchy that is used when viewing the data to accumulate high-level values that can be analyzed at lower levels of detail.

    You can also create level-based hierarchies when your data is organized into levels, such as Product Category, Product Group, and Product. When the data is displayed in a story, hierarchies can be expanded or collapsed. In the toolbar, select (Level Based Hierarchy). For more information about hierarchies, see Learn About Hierarchies.

  • Currency: If you set a column to be an Organization dimension, the Currency attribute is available. The Organization dimension offers an organizational analysis of the account data, based, for example, on geographic entities. You can add the Currency attribute to provide currency information for the geographic entities.
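
As an illustration of the parent-child pattern described above, the following minimal sketch shows how the source data might look once the parent column is maintained; the values are invented:

  import pandas as pd

  # Each City row stores the ID of its parent member in the Country column,
  # which is what the parent-child hierarchy (parent) attribute expects.
  cities = pd.DataFrame({
      "City": ["Paris", "Lyon", "Berlin", "Munich"],
      "Country": ["France", "France", "Germany", "Germany"],
  })
  print(cities)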

Data Quality Considerations

During data import, anomalies in your data can prevent the data from being imported properly, or prevent the model from being created. If issues are found in your data, the impacted data cells are highlighted, and messages in the Details panel explain the issues. You'll need to resolve the following issues before the data can be fully imported or the model can be created:
  • In account dimensions, dimension member IDs cannot contain the following characters: , ; : ' [ ] =.
  • Numeric data cells in measures cannot contain non-numeric characters, and scientific notation is not supported.
  • When importing data to an existing model, cells in a column that is mapped to an existing dimension must match the existing dimension members. Unmatched values will result in those rows being omitted from the model.
  • For stories, if any member IDs are empty, you can type values in those cells, or select Delete empty rows in the Details panel to remove those rows.
  • When creating a new model, if member IDs are empty, they are automatically filled with the “#” value if you select the Fill applicable empty ID cells with the "#" value option. Otherwise, those rows are omitted from the model.
    Note
    This option is only available for tenants based on an SAP Data Center (Neo).
  • In dimensions and properties, a single member ID cannot correspond to different Descriptions in multiple rows (but a single Description can correspond to multiple member IDs).

    For example, if member IDs are employee numbers, and Descriptions are employee names, you can have more than one employee with the same name, but cannot have more than one employee with the same member ID.

  • In a Date dimension column, cell values must match the format specified in the Details panel.

    The following date formats are supported: dd-mm-yy, dd.mm.yy, dd/mm/yy, dd-mm-yyyy, dd.mm.yyyy, dd/mm/yyyy, dd-mmm-yyyy, dd-mmmm-yyyy, mm-dd-yy, mm.dd.yy, mm/dd/yy, mm-dd-yyyy, mm.dd.yyyy, mm/dd/yyyy, mm.yyyy, mmm yyyy, yy-mm-dd, yy.mm.dd, yy/mm/dd, yyyq, yyyy, yyyymm, yyyy-mm, yyyy.mm, yyyy/mm, yyyymmdd, yyyy-mm-dd, yyyy-mm/dd, yyyy.mm.dd, yyyy/mm-dd, yyyy/mm/dd, yyyy.mmm, yyyyqq.

    Examples:
    • mmm: JAN/Jan/jan
    • mmmm: JANUARY/January/january
    • q: 1/2/3/4
    • qq: 01/02/03/04
  • Latitude and longitude columns, from which location dimensions are created, must contain values within the valid latitude and longitude ranges.
  • For planning-enabled models, measure data on non-leaf node members of a hierarchy is not allowed.
    Note
    For analytic (non-planning-enabled) models only, non-leaf node members are allowed, but be aware of the effects of this behavior. For example: in an organizational chart that includes employee salaries, the manager has her individual salary, and her staff members have their own salaries as well. In a visualization, do you expect the manager’s data point to reflect her individual salary, or the sum of her staff members' salaries?
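
If you want to catch some of these issues before uploading a file, a rough pre-check can help. The sketch below uses pandas and assumes a CSV source with invented column names (Account_ID, Account_Description, Amount, Date); it only mirrors the checks described above and is not part of the product:

  import pandas as pd

  FORBIDDEN = set(",;:'[]=")  # characters not allowed in account member IDs

  df = pd.read_csv("source_data.csv", dtype=str)  # hypothetical source file

  # Member IDs in account dimensions must not contain forbidden characters.
  bad_ids = df[df["Account_ID"].fillna("").apply(lambda v: any(c in FORBIDDEN for c in v))]

  # A single member ID must not map to more than one description.
  conflicts = df.groupby("Account_ID")["Account_Description"].nunique()
  conflicts = conflicts[conflicts > 1]

  # Measure cells must be numeric, and scientific notation is not supported.
  non_numeric = df[pd.to_numeric(df["Amount"], errors="coerce").isna() & df["Amount"].notna()]
  sci_notation = df[df["Amount"].str.contains(r"\d[eE][+-]?\d", na=False)]

  # Normalizing dates to a supported format such as yyyy-mm-dd avoids format mismatches.
  df["Date"] = pd.to_datetime(df["Date"]).dt.strftime("%Y-%m-%d")

  print(len(bad_ids), "rows with forbidden characters in the ID column")
  print(len(conflicts), "member IDs mapped to more than one description")
  print(len(non_numeric), "non-numeric measure cells,", len(sci_notation), "in scientific notation")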

Combine Data with Your Acquired Data

After importing raw data into a new or existing model, or into a story, you may need to perform some data preparation.

You can combine data from another source with your acquired data by using up to three matching columns to join the two datasets.

Note
This workflow does not apply to a model embedded in a story.
  1. In the Actions section of the toolbar, select the (Combine Data) icon.

    The Let's add some data! dialog is displayed.

  2. Select whether you want to add data using a file, or using a data source.

    Once the new data is loaded, the Combine Data dialog is displayed. Under Combine Settings, a table representing all columns in the original data is displayed on the left, and a table listing columns from the new dataset is displayed on the right.

  3. Choose the most appropriate columns in each table to combine data. Click to select, or drag columns to the field provided under Combine Column for each table.
    Note
    The columns should be the best match and should not contain duplicates. You can use up to three columns to join the two datasets.
  4. Select how you want to combine the columns:
    • All primary data: keeps the non-intersecting data from the original dataset.
    • Intersecting data only: omits any data from the original dataset that has no match in the new data (see the sketch after this procedure).

    The results for the column combination in terms of Accepted, Duplicated, Nulls, and Omitted values are provided under Combine Preview. A sample of combined data rows is displayed under Data Sample.

  5. Select Combine when you are ready to combine the two tables.
    Note
    Currently, when you combine data:
    • The number of combined cells cannot exceed 30,000,000.
    • The number of combined rows cannot exceed 1,000,000.
    • The number of combined columns cannot exceed 100.
    • You cannot run more than ten combine data processes on a given dataset.
    • You cannot combine two datasets using a calculated column.
    • You cannot combine two datasets in a story.
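
Conceptually, the two combine settings behave like join types. The following minimal pandas sketch shows the idea with two invented datasets joined on two matching columns; it is an analogy, not the product's implementation:

  import pandas as pd

  original = pd.DataFrame({
      "Region": ["EMEA", "EMEA", "APAC"],
      "Product": ["A", "B", "C"],
      "Sales": [100, 200, 300],
  })
  new_data = pd.DataFrame({
      "Region": ["EMEA", "EMEA"],
      "Product": ["A", "B"],
      "Margin": [0.2, 0.3],
  })

  keys = ["Region", "Product"]  # you can use up to three matching columns

  # "All primary data": every row of the original dataset is kept, even when
  # it has no match in the new data (similar to a left join).
  all_primary = original.merge(new_data, on=keys, how="left")

  # "Intersecting data only": rows of the original dataset without a match
  # in the new data are omitted (similar to an inner join).
  intersecting = original.merge(new_data, on=keys, how="inner")

  print(all_primary)
  print(intersecting)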

Import Cross Table Data

When you upload acquired data containing cross tables to the data integration view, the Transpose Columns into Rows action enables you to change selected columns into rows. The new table format can then be used to either import data or create a new model.

Note
You can only perform one transpose transformation per session on a dataset. The table resulting from the transformation cannot contain more than 50 million cells.
  1. Once the acquired data (or sample) is loaded in the data integration view, select the icon from the Actions menu.

    The Transpose Columns into Rows panel appears in Details.

  2. Under Transpose Columns, select the columns you want to include in the new table format. Choose Select All to include all the listed columns.
  3. Turn on Preview on grid to preview the new format.
  4. Select OK to apply the transform and update the grid to the new format.
    Note
    To undo the transformation, open the Transformation Log and delete the associated entry.
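
Conceptually, transposing columns into rows is an unpivot (melt) operation. The following minimal pandas sketch, with invented column names, shows the before and after shapes; it is only an illustration of the idea:

  import pandas as pd

  # A cross table: one column per month, with values spread across columns.
  cross = pd.DataFrame({
      "Account": ["Revenue", "Costs"],
      "Jan": [100, 60],
      "Feb": [120, 70],
  })

  # Transposing the month columns into rows yields one row per Account and
  # month combination, which is the shape a fact table expects.
  rows = cross.melt(id_vars=["Account"], var_name="Month", value_name="Value")
  print(rows)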

Mapping the Data (SAP Data Center Tenants)

Now that the data is prepared, you can start the mapping process. The application automatically pre-maps some parts of the data, and you can map the remaining data manually.

Context

Restriction
Restrictions apply when mapping dimensions and attributes. Any resulting issues are listed in the Review Import step just before running the import. To avoid issues during the import, follow these best practices:
  • All dimensions are mapped.
  • If the Version is mapped to a column, then the Category is also mapped to a column.
  • There is only one Actuals version in the model.
  • The Actuals category is mapped to public.Actuals for planning models.
  • All measures are mapped.
  • If there are multiple hierarchies in the dimension, and some dimension members don’t belong to all the hierarchies, we recommend carrying out multiple data imports, one for each hierarchy.

    When the source column is mapped to a hierarchy of the target dimension, an empty cell in the source column means that after the import, the member will become the top node of the hierarchy, displayed as <root> in the dimension.

    That empty cell during the data import does not mean that the member is excluded from the hierarchy. To make sure that a dimension member does not belong to a hierarchy after the import, you can either filter out the value first, using the delete row function during the data preparation step, or not map a column to the target hierarchy.

  • We do not recommend adding the <root> or Not In Hierarchy members in the data source itself when importing data to a hierarchy, as these are interpreted as regular strings by the application and might cause unexpected issues in the dimension.

Procedure

  1. Switch to the Card View and determine what remains to be mapped. Cards display as either mapped or unmapped entities. A card represents mapped data if it is shaded solid and has defined borders. Cards that must be mapped appear transparent and borderless.

    Dimensions with attributes appear stacked and can be expanded. Mapping attributes is optional.

  2. Drag and drop an unmapped imported dimension, measure, or attribute onto the associated card.
    The date dimension can either be matched, or set with a default value. To set a default value, select the date dimension and click Set a default value in the Details panel. After you've set a default value, you can change it by clicking Change Default Value.
    Note
    Check Apply fiscal settings to column values in Details to map imported fiscal period data into a date dimension in a model enabled to support fiscal year. Review the format listed under Format and change it if required.
  3. Repeat step 2 until you have mapped all dimensions, at least one measure, and the attributes.
    You can check the progress of the mapping using the dedicated Dimensions, Attributes, and Measures headings. When each bar is fully colored in gray, the mapping is complete.
  4. In the Details panel, specify the version for which you're importing data by checking either Existing Version or New Version. If you have multiple versions available, select the source version dimension card and click Map Versions in the Details panel to map each version to the desired version and category.
  5. Once you have mapped all cards, check for mapping issues. If there are any, fix them before moving to the next step.
    A red dot in the top right corner of the card indicates there are mapping errors that need to be addressed. A blue dot indicates that new values have been added from the import to the existing data.

Results

Once you've fixed remaining issues, you're ready to validate the data, review and run the import. For more information, check out Reviewing and Running the Import.

Reviewing and Running the Import

Once the mapping is complete, you're ready to review the import options and method before running the import job.

Procedure

  1. Check whether there are pending data issues after the mapping.
    All issues must be solved to make sure all data is imported into your model. If you decide to create the model ignoring some of the detected errors, either entire rows or invalid cells will be omitted from the new model.
  2. Review the import options. For tenants based on an SAP Data Center, click View all options under Mapping Options in the Details panel. For tenants based on a non-SAP Data Center, open the preferences.
    • Update dimensions with new values (SAP Data Center only): Select this option if you're importing data to a model that already contains data and want to update the data with new dimensions members, attributes, and hierarchy changes.
      Note
      This option doesn't apply to public dimensions.
    • Convert value symbol by account type (SAP Data Center only): Select this option to match the value symbol, positive or negative, to each account type in the model, for when values are stored as positive regardless of whether they represent income or expense accounts.
    • Fill applicable empty ID cells with the "#" value (SAP Data Center only): Select this option to fill empty ID cells with a "#" value to preserve rows without a value in the Dimension ID column. Otherwise, empty ID cells are deleted.
    • Reverse Sign by Account Type (Non-SAP Data Center only): Select this option to import your INC and LEQ account values with reversed signs (+/-). This option is only available if your model has an account dimension. If you want your INC and LEQ values to show up as positive, import them as negative values, and vice-versa. Before importing data, make sure to check the signs of the original source values first, and then set the option accordingly.
    • Update Local Dimensions with New Members (Non-SAP Data Center only): Select this option to map your source data to dimension properties to update their members during the import.
    • Conditional Validation: Select which hierarchies to validate against. Validating against selected hierarchies will prevent data from being imported to non-leaf members. This option is only available if you have at least one parent-child hierarchy in your model. For more details about storing data in non-leaf members, see Entering Values with Multiple Hierarchies.
  3. Select an import method.
    Note
    For tenants based on an SAP Data Center, import methods are listed under Import Method in the Details panel. For tenants based on a non-SAP Data Center, import methods can be found in the preferences.
    • Update: The target model's measure values for the dimension member combinations specified by the source data are updated with the corresponding measure values in the source data. If a particular dimension member combination has no measure values in the model prior to the import, a new value is inserted.
    • Append: The corresponding measure values in the source data are added to the target model's measure values for the dimension member combinations specified by the source data (the two are summed together). If a particular dimension member combination has no measure values in the model prior to the import, a new value is inserted. For a more refined scope, use either the Clean and replace selected version data or Clean and replace subset of data update options. The sketch after this procedure illustrates the difference between Update and Append.
    • Clean and replace selected version data: Deletes the existing data and adds new entries to the target model, only for the versions that you specify in the import. You can choose to use either the existing version or specify a new version under Version. If you specify to import data for the "actual" version, only the data in the "actual" version is cleaned and replaced. Other versions, for example "planning", are not affected.
    • Clean and replace subset of data: Replaces existing data and adds new entries to the target model for a defined subset of the data, based on a scope of selected versions using either the Existing Version or New Version buttons. You can also limit the scope to specific dimensions. To define a scope based on a combination of dimensions, select + Add Scope and use the Select a dimension field to specify a dimension.

      When a Date dimension is defined in the scope, the time range in the source data (determined by the minimum time value and maximum time value of the source data, as well as the granularity of the date dimension) combined with other dimensions in the scope, will determine what existing data is cleaned and replaced.

      If, for example, Date and Region dimensions are defined as part of a scope, only entries that fall within the time range and match Region from the source data will be replaced in the target model. Existing data that does not match the scope will be kept as is. Other dimensions that are not part of the scope will be cleaned and replaced regardless of whether the dimension members are in the source data or not.

    Note
    These options affect measures and dimensions. To include both measures and dimensions, see Update and Schedule Models.
  4. For tenants based on an SAP Data Center, once you have checked and fixed all errors, if you're working with a dataset sample, click Validate Data to apply your transforms across the entire dataset and check for data quality errors.
    Note
    Validating data right before running the import is only possible for tenants based on an SAP Data Center. For tenants based on a non-SAP Data Center, you can only validate the full dataset during the data preparation step. See Preparing the Data for more information.
  5. Click Run Import (non-SAP Data Center tenant) or Finish Mapping (SAP Data Center tenant).
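
To make the difference between the Update and Append methods concrete, here is a minimal pandas sketch of the two behaviors on a single measure. The data is invented and the code is only an analogy, not the product's implementation:

  import pandas as pd

  keys = ["Version", "Account", "Date"]

  model = pd.DataFrame({
      "Version": ["Actual", "Actual"],
      "Account": ["Revenue", "Costs"],
      "Date": ["2023-01", "2023-01"],
      "Value": [100.0, 60.0],
  })
  source = pd.DataFrame({
      "Version": ["Actual", "Actual"],
      "Account": ["Revenue", "Travel"],
      "Date": ["2023-01", "2023-01"],
      "Value": [150.0, 10.0],
  })

  # Update: matching member combinations take the source value;
  # combinations that exist only in the source are inserted.
  update = (
      pd.concat([model, source])
      .drop_duplicates(subset=keys, keep="last")
      .reset_index(drop=True)
  )

  # Append: matching member combinations are summed;
  # combinations that exist only in the source are inserted.
  append = pd.concat([model, source]).groupby(keys, as_index=False)["Value"].sum()

  print(update)  # Revenue becomes 150, Costs stays 60, Travel 10 is added
  print(append)  # Revenue becomes 250, Costs stays 60, Travel 10 is added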

Scheduling an Import Job

Schedule a data import job if you want to refresh data against the original data source. You can import data from multiple queries and data sources into a model, and each of these imports can be separately scheduled.

Procedure

  1. Select one or multiple import jobs you want to schedule.
    If you select multiple jobs, you can order them, set a group name for the job, and select one of the group processing options:
    • Stop if any query fails: If any of the import jobs fails, the group processing stops. You can then cancel the remaining jobs, or try to fix the cause of the failure, and later resume execution of the grouping from the same point where execution stopped.
    • Skip any failed query: If any of the import jobs fails, the remaining jobs are still processed.
    Note
    A grouping can include jobs from public dimensions as well as the model. Running the grouping refreshes the public dimensions and model together. You can ungroup your import at any time.
  2. Define the frequency for the scheduling:
    • None: Select this option when you want to update the data manually.
    • Once: The import is performed only once, at a preselected time.
    • Recurring: The import is executed according to a recurrence pattern.
    You can update the schedule at any time.
  3. Define or update your import settings.