Data Standardization Best Practices

Follow these best practices to standardize attribute values across tenants.

Overview

Concept configuration in Visier depends on standardized attribute values across your analytic tenants. For more information, see Concepts. Several key attributes must be standardized to fully unlock the potential of Visier, and we expect you to provide both the raw data values from your source systems and the corresponding values from the predefined set that Visier specifies.

The following attributes commonly require standardization:

  • Compensation Pay Types (Base Pay, Variable Pay, Supplemental Pay, etc.)

  • Exit Reasons (Retirement, Involuntary / Voluntary, etc.)

  • Start Reasons (Rehire, Growth, etc.)

  • Performance Rating (High, Mid, Low)

  • Talent Acquisition processes, stages, and outcomes, and how these map to Visier’s model

  • Locations 

  • Gender

  • Ethnicity

  • Manager Status

  • Position Status (for Position Management data)

  • Payout Types (for Payroll data)

  • Absence Reasons (for Absence data)

  • Succession Readiness Levels (for Succession data)

  • Working Hours Types (Overtime, Late, Absence, Vacation, etc.)

  • Learning Activity Stages (for Learning data)

  • Incident Category (for Safety data)

Data standardization is important for the following reasons:

  1. Unlock the full potential of Visier: Standardized data enables the full utilization of Visier's out-of-the-box content. Many of our analyses and content features rely on the configuration of underlying data, which is only scalable with a standardized set of values that are applicable to all customers. An example of this is the standardization of performance ratings, which allows us to configure High, Mid, and Low Performer concepts. 

  2. Client comparison and benchmarking: Data standardization enables meaningful client-to-client comparisons and benchmarking. It's essential for clients to gauge their performance against industry benchmarks and peers.

  3. Long-term maintainability and scale: As the number of clients using analytics increases, maintaining a growing list of raw data values becomes essential. Standardizing data provides a structured approach to map these values to Visier's finite categories.

Failure to standardize your data has cascading consequences:

  • Concepts will not be correctly configured.

  • Metrics will not be accurately generated.

  • Analyses will be incomplete or unavailable for customers.

There are three approaches for standardizing data values:

  1. Within file mapping: Our recommended approach is to incorporate the mapping within the data file itself. This approach streamlines the data standardization process and simplifies data management.

  2. Separate mapping file: You can create a separate mapping file that defines the relationship between your source data and Visier's standardized values. This approach offers flexibility and allows for easy updates.

  3. Concept configuration per tenant: This approach involves exposing Studio and having your implementation teams perform concept configuration on a per-tenant basis. It is suited to hands-on implementations at a limited scale.

Within file mapping

Note: We recommend this approach to data standardization.

Within file mapping does not rely on a separate mapping file. Instead, both your customer values and the Visier values are provided in the same file.

For example, Contract_Type requires standardization as it has dependencies on several concepts.

The CustomerContractType column holds the customer value, and the VisierContractType column alongside it holds the standardized value.
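As a rough illustration of within file mapping, the following sketch adds a VisierContractType column next to CustomerContractType before the file is written. The mapping dictionary, the standardized values in it, the "Unknown" fallback, and the sample records are all assumptions made for illustration. They are not Visier's actual value list or a required file layout.

```python
import csv
import io

# Hypothetical mapping from raw customer contract types to standardized values.
# These standardized values are illustrative only, not Visier's actual list.
CONTRACT_TYPE_MAP = {
    "Perm": "Permanent",
    "Permanent FT": "Permanent",
    "Temp": "Temporary",
    "Fixed Term": "Temporary",
}


def add_standardized_column(rows, mapping, default="Unknown"):
    """Return copies of rows with VisierContractType set beside the raw value."""
    out = []
    for row in rows:
        new_row = dict(row)
        raw = row["CustomerContractType"].strip()
        # Values missing from the mapping fall back to an assumed default.
        new_row["VisierContractType"] = mapping.get(raw, default)
        out.append(new_row)
    return out


def to_csv(rows):
    """Serialize rows to CSV text, using the first row's keys as the header."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Sample records (hypothetical).
employees = [
    {"EmployeeID": "1001", "EffectiveDate": "2024-01-01", "CustomerContractType": "Perm"},
    {"EmployeeID": "1002", "EffectiveDate": "2024-01-01", "CustomerContractType": "Fixed Term"},
    {"EmployeeID": "1003", "EffectiveDate": "2024-01-01", "CustomerContractType": "Intern"},
]

standardized = add_standardized_column(employees, CONTRACT_TYPE_MAP)
print(to_csv(standardized))
```

Because the raw and standardized values travel in the same record, a value that falls through to the default is visible on the row where it occurs.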

Advantages

  • Resource efficiency: Eliminates the need for extra files, conserving processing and storage resources, ultimately reducing load times.

  • Streamlined process: Removes the requirement for creating dummy employee records, simplifying the overall data handling process.

  • Simplified effective dates: Bypasses the need to generate artificial, incremental effective dates, streamlining data management.

  • Efficient troubleshooting: Facilitates easier identification and troubleshooting of data issues.

  • Onboarding flexibility: The superset of values isn’t mandated at the administrative tenant level for concept configuration, providing flexibility in the setup.

Disadvantages

  • Additional records requirement: Extra records are needed within the file to accommodate raw values, potentially increasing the number of transformations required in your data pipeline.

Separate mapping file

This approach relies on a separate file containing a superset of data values for specified attributes. 

The file should include columns containing the:

  • Standardized value (the Visier value)

  • Non-standardized value (the Partner value, or raw value coming from your source system)

In this example, the column VisierExitReason contains the value we expect and the adjacent column contains the partner value. This feeds into the Employee Exit Model concept, which is configured on the Visier value. Any new customer values will automatically map to the correct bucket.

Note: Our data loading routine requires a dummy EmployeeID and an incremental EffectiveDate. In this example, EmployeeID ‘999999999-1’ is used and the EffectiveDate has a one-day increment.
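The following sketch shows one way to generate such mapping rows, with the dummy EmployeeID and an EffectiveDate that increases by one day per row. The column name PartnerExitReason, the partner values, and the start date are assumptions for illustration. The lookup helper approximates the business rule that resolves a standardized value from the mapping. It is not Visier's implementation.

```python
from datetime import date, timedelta

DUMMY_EMPLOYEE_ID = "999999999-1"  # the dummy ID used in the note above


def build_mapping_rows(pairs, start_date):
    """Build mapping rows from (visier_value, partner_value) pairs.

    Each row reuses the dummy EmployeeID and gets an EffectiveDate one day
    later than the previous row.
    """
    rows = []
    for offset, (visier_value, partner_value) in enumerate(pairs):
        rows.append({
            "EmployeeID": DUMMY_EMPLOYEE_ID,
            "EffectiveDate": (start_date + timedelta(days=offset)).isoformat(),
            "VisierExitReason": visier_value,
            "PartnerExitReason": partner_value,  # hypothetical column name
        })
    return rows


def lookup_visier_value(mapping_rows, partner_value):
    """Return the standardized value for a partner value, or None if unmapped."""
    for row in mapping_rows:
        if row["PartnerExitReason"] == partner_value:
            return row["VisierExitReason"]
    return None


# Hypothetical partner values mapped to exit reasons named in this document.
pairs = [
    ("Retirement", "Retired"),
    ("Voluntary", "Quit"),
    ("Involuntary", "Laid Off"),
]
mapping_rows = build_mapping_rows(pairs, date(2024, 1, 1))

for row in mapping_rows:
    print(row)
```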

Advantages

  • Clear superset visualization: Facilitates easy visualization of a superset of values, making it easier to see all of the possible permutations in one place.

  • Simplified data extraction: Potentially offers a straightforward process to extract all values into a single file, reducing the number of transformations that need to happen.

Disadvantages

  • Additional business rules: A business rule is required to look up the VisierExitReason from the mapping file, which can increase implementation complexity.

  • Troubleshooting: It can be more challenging to identify where certain values are coming from because the standardized value is not present in the subject source, so troubleshooting involves investigating multiple files.

  • Increased data load overhead: The partner is required to send additional files with each load, potentially impacting efficiency.

  • Dummy employee record requirement: Necessitates the generation of a dummy employee record for the assignment of values, introducing an extra step in the mapping process.

  • Incremental effective dates challenge: Requires the generation of incremental effective dates to align with Visier’s loading requirements, adding complexity to the implementation process.

  • Risk of data load failures: The absence of mapping files may lead to data load failures if fileset validation is enabled.

Concept configuration per tenant

This strategy involves providing access to Studio for your implementation teams, allowing them to conduct concept configuration tailored to each tenant. It is particularly well-suited for hands-on implementations and works best at a limited scale. It may be the only viable approach if standardization cannot be achieved and the manual effort is manageable given your scale.

The process involves the identification of key concepts to be configured, with implementation teams collaborating closely with customers to comprehend their specific requirements for mapping out each concept.

A common example is performance ratings, where each customer has a different performance scale. For instance, Customer A may use a scale of one to five, and Customer B may use a scale of one to ten. For Customer A, a 4 may be considered a “high performer”, while Customer B considers a 4 to be a “low performer”.
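The performance rating example can be sketched as a per-tenant scale definition that buckets a raw rating into High, Mid, or Low. The cut-off values below are assumptions chosen only to match the example (a 4 is High for Customer A and Low for Customer B). In practice this configuration is done in Studio, not in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RatingScale:
    """Per-tenant rating scale with assumed Low/Mid cut-offs."""
    max_rating: int
    low_max: int  # ratings up to and including this value are Low
    mid_max: int  # ratings above low_max, up to this value, are Mid

    def classify(self, rating: int) -> str:
        if not 1 <= rating <= self.max_rating:
            raise ValueError(f"rating {rating} is outside 1..{self.max_rating}")
        if rating <= self.low_max:
            return "Low"
        if rating <= self.mid_max:
            return "Mid"
        return "High"


# Hypothetical per-tenant configuration.
TENANT_SCALES = {
    "customer_a": RatingScale(max_rating=5, low_max=2, mid_max=3),
    "customer_b": RatingScale(max_rating=10, low_max=4, mid_max=7),
}

for tenant, scale in TENANT_SCALES.items():
    print(tenant, scale.classify(4))
```

The same raw value lands in different buckets per tenant, which is why this approach requires a separate configuration for every customer.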

Advantages

  • Flexible onboarding: Allows for a seamless onboarding process even when standardizing data proves challenging.

  • Reduced dependency on superset values: Eliminates the necessity for a superset of values at the administrative tenant level during concept configuration.

Disadvantages

  • Scale challenges: Delivery and maintenance can pose challenges as the scale of operations increases.

  • Training requirements: Necessitates additional training for implementation teams to effectively utilize the Visier platform.

  • Ongoing commitment: Requires ongoing effort each time a new customer is onboarded.

Recommendations

You may have certain limitations and challenges that dictate which standardization approach to adopt. Within file mapping is the recommended approach due to the reduction in implementation complexity and ability to manage standardization at scale.

Some questions that may help with your own assessment:

  1. Can your customers enter custom values in data fields or do they pick from a predefined set of values that you control?

  2. How easy/difficult is it to get a superset of values across all your customers?

  3. How would you extract the superset of values from your customer base? Can you map to Visier’s concept values?

  4. What dataset will you use to set up the administrating tenant? Does it already have a superset of data values? If not, how will you include them as part of the dataset?
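As a starting point for questions 2 and 3, the sketch below collects the superset of raw values across customers and lists those that are not yet mapped to a standardized value. The customer names, raw values, and mapping are hypothetical, and how you extract the values from your own systems will differ.

```python
def collect_superset(values_by_customer):
    """Return the set of distinct raw values across all customers."""
    superset = set()
    for values in values_by_customer.values():
        # Strip whitespace so "Quit " and "Quit" count as the same value.
        superset.update(value.strip() for value in values)
    return superset


def find_unmapped(superset, mapping):
    """Return the raw values with no standardized value, sorted."""
    return sorted(superset - set(mapping))


# Hypothetical raw exit reasons per customer.
values_by_customer = {
    "customer_a": ["Retired", "Quit", "Laid Off"],
    "customer_b": ["Quit ", "Dismissed", "Retired"],
}

# Hypothetical mapping to the exit reasons named in this document.
exit_reason_map = {
    "Retired": "Retirement",
    "Quit": "Voluntary",
    "Laid Off": "Involuntary",
}

superset = collect_superset(values_by_customer)
unmapped = find_unmapped(superset, exit_reason_map)
print(sorted(superset))
print(unmapped)
```

An empty unmapped list suggests the mapping covers every value currently in use. A non-empty list shows which raw values still need a standardized value.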