Data Validation Best Practices

Follow these best practices to validate data when implementing data sources and processing jobs.

Data validation is the process of verifying that all necessary source data has been received, transformations are applied correctly, and the analytics content is functioning as expected. Failing to thoroughly validate data can result in incorrect or missing data, leading to a loss of trust in the solution, reduced user adoption, and diminished perceived value.

Validate analytics content

Verify that all required attributes are present for analytics and user security. This ensures that analytics content is comprehensive and that user access controls work correctly. Next, verify that key metrics and dimensions display correctly to ensure the accuracy and usability of the analytics provided to your customers.

Investigate the root cause if you observe any of the following data issues:

  • Incorrect metric values
  • Missing metrics, group bys, or filters
  • Unexpected values in the Unknown category
  • Incorrect system starts or system terminations
  • Unexpected attribute values
  • Unknown values in parent-child dimensions

For more information about troubleshooting, see Data Validation.
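When investigating unknown values in a parent-child dimension, one possible cause to rule out is a member that references a parent missing from the source extract. The following Python sketch illustrates that check under stated assumptions: the record layout and the `id` and `parent_id` field names are hypothetical and are not part of any Visier API.

```python
def find_orphaned_members(members):
    """Return the IDs of members whose parent_id is set but matches no member ID."""
    known_ids = {member["id"] for member in members}
    return [
        member["id"]
        for member in members
        if member["parent_id"] is not None and member["parent_id"] not in known_ids
    ]


# Hypothetical organization hierarchy extract; field names are assumptions.
org_members = [
    {"id": "ORG-1", "parent_id": None},      # top of the hierarchy
    {"id": "ORG-2", "parent_id": "ORG-1"},
    {"id": "ORG-3", "parent_id": "ORG-9"},   # parent is not in the extract
]

orphans = find_orphaned_members(org_members)
print(orphans)
```

Any ID the check returns is worth tracing back to the source system before you look at the transformation logic.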

Build and test your data transformation logic against a wide range of scenarios. This provides robust validation coverage and reduces the risk of data inconsistencies. For more information, see Demo Data Best Practices.

For data integrated into Visier’s existing data model, use the prebuilt data validation analyses. These tools guide you through step-by-step checks to ensure proper data loading and transformation, saving time and improving accuracy.

Once the detailed validation of sample data is complete, validate the analytic tenants using customer data. This helps identify additional edge cases that may not be present in the sample data set used for configuration.

Validate custom data in the Explore room

For custom data specific to your solution, create month-over-month tables in the Explore room. Break down metrics by dimensional groups and compare the results against the expected counts from the source data. This helps ensure that custom metrics and dimensions are accurate and align with expectations.
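The comparison against source counts can also be scripted outside the application. The following Python sketch assumes you have the source records and the values read from your month-over-month table available as lists of rows; the `month` and `group` field names and the sample rows are hypothetical.

```python
from collections import Counter


def count_by_month_and_group(records, month_key="month", group_key="group"):
    """Count records for each (month, group) pair."""
    return Counter((record[month_key], record[group_key]) for record in records)


def find_count_mismatches(source_records, exported_records):
    """Return {(month, group): (expected, actual)} for every pair whose counts differ."""
    expected = count_by_month_and_group(source_records)
    actual = count_by_month_and_group(exported_records)
    mismatches = {}
    for key in sorted(set(expected) | set(actual)):
        # A Counter returns 0 for missing keys, so absent groups are reported too.
        if expected[key] != actual[key]:
            mismatches[key] = (expected[key], actual[key])
    return mismatches


# Hypothetical sample rows; field names and values are assumptions.
source_rows = [
    {"month": "2024-01", "group": "Sales"},
    {"month": "2024-01", "group": "Sales"},
    {"month": "2024-01", "group": "Engineering"},
    {"month": "2024-02", "group": "Sales"},
]
exported_rows = [
    {"month": "2024-01", "group": "Sales"},
    {"month": "2024-01", "group": "Engineering"},
    {"month": "2024-02", "group": "Sales"},
]

mismatches = find_count_mismatches(source_rows, exported_rows)
print(mismatches)
```

Each reported pair lists the expected count first and the observed count second, which narrows the investigation to a specific month and group.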

Focus on key areas during validation

Spend time validating data in these areas:

  • Aggregate trend over time: Ensure data spans the expected time periods, from the earliest records to the most recent.
  • Granular data distribution: Perform detailed comparisons between Visier’s output and the values you expect from your source database.
  • Complex transformations: Validate any metrics or attributes involving significant calculations or derivations to confirm accuracy.
  • Deltas and changes over time: Test scenarios for updates, additions, and deletions in the data to ensure historical data integrity and correct incorporation of new changes.
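As an illustration of the time-span check in the list above, the following Python sketch reports any months missing between the earliest and latest expected periods. It assumes months are represented as `YYYY-MM` strings; the function names and sample values are hypothetical.

```python
def month_range(first, last):
    """Yield 'YYYY-MM' strings from first to last, inclusive."""
    year, month = map(int, first.split("-"))
    end_year, end_month = map(int, last.split("-"))
    while (year, month) <= (end_year, end_month):
        yield f"{year:04d}-{month:02d}"
        month += 1
        if month > 12:
            year, month = year + 1, 1


def find_missing_months(observed_months, expected_first, expected_last):
    """Return the expected months that do not appear in observed_months."""
    observed = set(observed_months)
    return [m for m in month_range(expected_first, expected_last) if m not in observed]


# Hypothetical months that contain data after a load; values are assumptions.
observed = ["2023-11", "2023-12", "2024-02", "2024-03"]

missing = find_missing_months(observed, "2023-11", "2024-03")
print(missing)
```

An empty result means every expected month has data. It does not confirm that the values within each month are correct, so the granular comparisons are still needed.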

Investigate data and permission issues promptly

As you prepare for onboarding customer users, confirm that the data attributes required for specifying the analysis population are available and accurate. To ensure security permissions are configured correctly and customer users maintain consistent access levels, review the data attributes used as row filters in your current user security implementation. Ensure that there are no gaps in the data being loaded.
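One way to look for gaps in the attributes used as row filters is to scan the source extract for missing or blank values before loading. The following Python sketch assumes records are available as dictionaries; the `id`, `organization`, and `location` field names are hypothetical examples and do not represent a required schema.

```python
def find_attribute_gaps(records, required_attributes):
    """Map each required attribute to the IDs of records where it is missing or blank."""
    gaps = {attribute: [] for attribute in required_attributes}
    for record in records:
        for attribute in required_attributes:
            value = record.get(attribute)
            if value is None or str(value).strip() == "":
                gaps[attribute].append(record["id"])
    # Report only the attributes that have at least one gap.
    return {attribute: ids for attribute, ids in gaps.items() if ids}


# Hypothetical employee records; attribute names are assumptions.
employees = [
    {"id": "E1", "organization": "Sales", "location": "Vancouver"},
    {"id": "E2", "organization": "", "location": "Toronto"},
    {"id": "E3", "organization": "Engineering"},  # location is missing
]

attribute_gaps = find_attribute_gaps(employees, ["organization", "location"])
print(attribute_gaps)
```

Records flagged here may fall outside every row filter, so resolve them before you onboard customer users.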

Validate user permissions by previewing the solution as a user to ensure all permissions display the correct combination of data, content, and capabilities. For instructions, see Preview the Solution as a User.

Plan for post-release data issues

Customers commonly report minor data issues or edge cases after an analytic solution is released. Develop a plan that covers how you will triage, investigate, and address reported data issues in a timely manner. A formal, documented process that identifies the team members responsible for contacting Visier Technical Support can accelerate the path to resolution for your customers.