Data Version Export API Overview
Learn more about the Data Version Export API.
Note: To use the Data Version Export API through Visier's Python connector, see Data Version Export API on GitHub and DV Export Script on GitHub.
With the Data Version (DV) Export API, you can schedule a full export for a data version or a delta export to get the differences between two data versions.
- Full export: All the tables, columns, and data version files associated with one data version. To get a full export, when scheduling a data version export job, define dataVersionNumber as the data version to export and leave baseDataVersionNumber empty.
- Delta export: The differences between two data versions, such as new tables, updated table row values, or deleted table columns. To get a delta export, when scheduling a data version export job, define both the dataVersionNumber and the baseDataVersionNumber to compare.
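As a minimal sketch of the two export types, the following Python helper builds the request body for a full or a delta export. The function name build_export_job_body is hypothetical, and the sketch assumes that leaving baseDataVersionNumber "empty" can be expressed by omitting the field and that data version numbers are passed as strings; check the API reference for the exact schema.

```python
from typing import Optional


def build_export_job_body(
    data_version_number: str,
    base_data_version_number: Optional[str] = None,
) -> dict:
    """Build a request body for scheduling a DV export job (hypothetical helper).

    Assumption: an omitted baseDataVersionNumber is treated as "empty",
    which requests a full export.
    """
    body = {"dataVersionNumber": data_version_number}
    if base_data_version_number is not None:
        # Both fields present: request the delta between the two versions.
        body["baseDataVersionNumber"] = base_data_version_number
    return body


full_body = build_export_job_body("7000045")
delta_body = build_export_job_body("7000045", base_data_version_number="7000044")
```

The version numbers in the usage lines are placeholders, not real data versions.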
We recommend that you schedule a DV export job and then poll the export job's status until it completes. After it completes, use the returned export UUID (exportUuid) to retrieve the exported DV details. Then, download the processed data files generated by Visier, such as data tables like Employee or Applicant.
- GET /v1alpha/data/data-version-exports/data-versions Retrieve a list of all data versions: Get all the data versions that are available to export.
- POST /v1alpha/data/data-version-exports/jobs Schedule a data version export job: Define the data version to export, or define two data versions to export the delta between them. The job is scheduled immediately and begins when resources are available. Returns a jobUuid.
- GET /v1alpha/data/data-version-exports/jobs/{jobUuid} Retrieve a data version export job's status: Check whether the export job has completed successfully. When completed, returns an exportUuid.
- GET /v1alpha/data/data-version-exports/exports/{exportUuid} Retrieve the details of a data version export: After the export job completes, use the returned export UUID to retrieve the details of the data version export, such as the DV tables, columns, and files.
- GET /v1alpha/data/data-version-exports/exports/{exportUuid}/file/{fileId} Download a file from a data version export: Use the fileId returned by the previous endpoint to download any data tables in the data version, such as the Employee data table.
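The schedule, poll, and retrieve sequence can be sketched in Python as follows. This is an illustration under stated assumptions, not Visier's client: run_export, post_json, and get_json are hypothetical names, the HTTP transport and authentication are left to the caller, and the sketch assumes that the job status response contains exportUuid only once the job has completed. It does not detect failed jobs; it gives up after max_polls attempts.

```python
import time
from typing import Callable

BASE = "/v1alpha/data/data-version-exports"


def run_export(
    post_json: Callable[[str, dict], dict],
    get_json: Callable[[str], dict],
    body: dict,
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Schedule an export job, poll until it completes, return the export details.

    post_json and get_json are caller-supplied functions that send an
    authenticated request to the given path and return the parsed JSON.
    """
    job = post_json(f"{BASE}/jobs", body)
    job_uuid = job["jobUuid"]

    for _ in range(max_polls):
        status = get_json(f"{BASE}/jobs/{job_uuid}")
        # Assumption: exportUuid is absent or empty until the job completes.
        export_uuid = status.get("exportUuid")
        if export_uuid:
            return get_json(f"{BASE}/exports/{export_uuid}")
        sleep(poll_seconds)

    raise TimeoutError(f"Export job {job_uuid} did not complete in time")
```

Passing the transport functions in keeps the polling logic independent of any particular HTTP library and makes it easy to exercise with stubbed responses.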
File format
Data version export files are in CSV (RFC 4180) format and compressed with gzip. The data version export generates files based on the tables in the data version, such as a file for Employee data and a different file for Applicant data.
Note: The API may export more than one file per table if:
- The table exceeds the 2GB limit.
- The export is a delta between two data versions. For delta exports, Visier separates table files into a file for new columns in dataVersionNumber and another file for columns that exist in both data versions.
Export files may have the following filenames:
- <table_name>-newColumns_part0.csv.gz: For a full export, includes all columns in the table. For a delta export, includes only the table columns that are new in dataVersionNumber. If the table exceeds the 2GB limit, the filename for the next file is <table_name>-newColumns_part1.csv.gz, and so on.
- <table_name>_part0.csv.gz: If a delta export table has columns that exist in dataVersionNumber and baseDataVersionNumber, this file includes the common columns. Not generated for full exports. If the table exceeds the 2GB limit, the filename for the next file is <table_name>_part1.csv.gz, and so on.
Example: Let's say you exported the delta between two data versions and then downloaded the Employee table. In this example, the Employee table has new columns and common columns between data versions. The new columns data is less than 2GB, but the common columns data exceeds 5GB. The API returns the following CSVs:
- Employee-newColumns_part0.csv.gz
- Employee_part0.csv.gz
- Employee_part1.csv.gz
- Employee_part2.csv.gz
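To group downloaded files by table, the filename patterns above can be parsed with a regular expression. The following Python sketch assumes filenames follow exactly the two patterns described in this section; parse_export_filename is a hypothetical helper, not part of the API.

```python
import re
from typing import NamedTuple


class ExportFile(NamedTuple):
    table: str
    new_columns: bool  # True for "-newColumns" files
    part: int          # zero-based part index


# Matches "<table_name>-newColumns_part<N>.csv.gz" and "<table_name>_part<N>.csv.gz".
_FILENAME_RE = re.compile(
    r"(?P<table>.+?)(?P<new>-newColumns)?_part(?P<part>\d+)\.csv\.gz"
)


def parse_export_filename(filename: str) -> ExportFile:
    match = _FILENAME_RE.fullmatch(filename)
    if match is None:
        raise ValueError(f"Unrecognized export filename: {filename}")
    return ExportFile(
        table=match.group("table"),
        new_columns=match.group("new") is not None,
        part=int(match.group("part")),
    )


names = [
    "Employee-newColumns_part0.csv.gz",
    "Employee_part0.csv.gz",
    "Employee_part1.csv.gz",
    "Employee_part2.csv.gz",
]
parsed = [parse_export_filename(name) for name in names]
```

For the Employee example, this yields one new-columns file (part 0) and three common-columns files (parts 0 through 2).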
In addition to the data columns, the API generates one or both of the following columns:
- EventIndexOnDate: If the file is for event analytic object table data, this column is part of the primary key for each event instance. This creates a unique primary key if there are multiple events on the same date for the same record, such as two events on the same day for Employee-1.
- _DiffAction_: Identifies the type of change that occurred. A change is one of:
- +: The row was inserted in the data version. In a full export, all rows are marked as +.
- -: The row was deleted since the base data version.
- u: The row value was updated.
Note: In a delta export, if a column's data type changed, the table indicates that the original column with the previous data type was deleted (-) and that a new column with the updated data type was inserted (+).
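As a rough illustration of working with the _DiffAction_ column, the following Python sketch decompresses a downloaded file in memory and counts rows by change type. It assumes the file is UTF-8 encoded and that its first row is a header containing _DiffAction_; count_diff_actions and the EmployeeID column in the sample are hypothetical.

```python
import csv
import gzip
import io
from collections import Counter


def count_diff_actions(gz_bytes: bytes) -> Counter:
    """Count rows per _DiffAction_ value (+, -, u) in a gzipped CSV export file."""
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", newline="", encoding="utf-8") as fh:
        # Assumption: the first row is a header that names the columns.
        reader = csv.DictReader(fh)
        return Counter(row["_DiffAction_"] for row in reader)


# Made-up sample content standing in for a downloaded file.
sample = gzip.compress(
    b"EmployeeID,_DiffAction_\r\nE1,+\r\nE2,-\r\nE3,u\r\nE4,+\r\n"
)
counts = count_diff_actions(sample)
```

For files near the 2GB limit, stream from disk with gzip.open on the file path rather than loading the whole file into memory.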
The API exports dates as millisecond timestamps. For example, the date Jan 1, 2015 UTC is exported as the timestamp 1420070400000. In the data version export details, date columns are marked as the Date data type.
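A minimal Python sketch for converting an exported millisecond timestamp back to a UTC datetime follows. The helper name ms_to_utc is hypothetical.

```python
from datetime import datetime, timezone


def ms_to_utc(timestamp_ms: int) -> datetime:
    """Convert a millisecond timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


exported = ms_to_utc(1420070400000)
print(exported.isoformat())  # 2015-01-01T00:00:00+00:00
```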