Use Visier Data in Amazon SageMaker
Integrate Visier HR analytics data into SageMaker for advanced machine learning modeling.
Who can use this feature?
- Enterprise API User
Not sure if you have this feature or capability? Reach out to your administrator.
Overview
Use the Visier Python Connector to extract HR analytics data from Visier and use it as a data source for your machine learning models in SageMaker. This allows HRIS/HRIT leaders, CDAOs, and technical teams to apply predictive analytics to workforce management decisions.
Prerequisites
- Have an Amazon Web Services (AWS) account with access to Amazon SageMaker and S3 services.
- Have a Visier user account.
- Have access to data in Visier. You can only bring Visier data into SageMaker if your Visier user account is assigned data security access to that data. For more information about data security in Visier, see Data Security for a Permission.
- Retrieve a Visier API key. For more information, see Generate an API Key.
- Have a Python environment installed on your machine or in an AWS Cloud9 environment.
- Install the Visier Python Connector package. For more information, see Visier Python Connector.
Set up your environment
Step One: Install the Visier Python Connector
Use the following command to install the connector:
pip install -U visier-connector
Step Two: Authentication
For OAuth 2.0 or basic authentication, set up your .env file with the necessary credentials from Visier, such as VISIER_HOST, VISIER_APIKEY, and authentication specifics. For more information, see OAuth 2.0 or Basic Authentication in the README.
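Before you authenticate, it can help to confirm that the credentials were actually loaded. The following is a minimal sketch, not part of the connector: missing_credentials is a hypothetical helper, and it only checks the two keys named above (VISIER_HOST and VISIER_APIKEY). Pass any additional keys that your authentication method needs through the required parameter.

```python
# Hypothetical helper (not part of visier-connector): report which
# credential keys are absent or empty in a dict of loaded settings.
REQUIRED_KEYS = ("VISIER_HOST", "VISIER_APIKEY")  # keys named in this guide


def missing_credentials(creds, required=REQUIRED_KEYS):
    """Return the required keys whose values are missing, None, or blank."""
    missing = []
    for key in required:
        value = creds.get(key)
        if value is None or not str(value).strip():
            missing.append(key)
    return missing


# Example with placeholder values; a real dict would come from your .env file
example = {"VISIER_HOST": "https://example.invalid", "VISIER_APIKEY": ""}
print(missing_credentials(example))
```

The helper accepts any dictionary, so it should also work on the dictionary that dotenv_values(".env") returns.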
Fetch data from Visier
This section walks you through querying data with the Data Query API. The Python Connector also includes other query types, including List, Aggregate, and Snapshot API calls. For more information, see the Visier Python Connector README.
Step One: Establish a connection
Use the make_auth utility to build an authentication object from your credentials. In the next step, you pass this object to VisierSession to establish a session with Visier.
from dotenv import dotenv_values
from visier.connector import VisierSession, make_auth
from visier.api import QueryApiClient
env_creds = dotenv_values(".env") # Ensure your .env file path is correct
auth = make_auth(env_values=env_creds)
Step Two: Execute a data query
Define and execute your SQL-like query to fetch the required data.
with VisierSession(auth) as session:
    query_client = QueryApiClient(session)
    # Replace the placeholder below with your actual SQL-like query string
    query_result = query_client.sqllike("SELECT * FROM Employee WHERE ...")
    data = query_result.json()
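The SageMaker steps later in this guide read a file named visier_data.csv from S3, so the query result has to be written to a file and uploaded first. The following is a minimal sketch of one way to do that. It assumes you have already extracted a header list and row lists from the query result; this guide does not specify the shape of the result JSON, so that extraction is left to you. write_rows_to_csv and upload_csv are hypothetical helpers, and upload_csv expects a client that exposes Boto3's upload_file(Filename, Bucket, Key) method.

```python
import csv


def write_rows_to_csv(header, rows, path):
    """Write a header row and data rows to a CSV file; return the data row count."""
    count = 0
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        for row in rows:
            writer.writerow(row)
            count += 1
    return count


def upload_csv(s3_client, path, bucket, key="visier_data.csv"):
    """Upload a local file through an S3 client and return its s3:// URI."""
    s3_client.upload_file(path, bucket, key)
    return f"s3://{bucket}/{key}"


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   write_rows_to_csv(header, rows, "visier_data.csv")
#   upload_csv(boto3.client("s3"), "visier_data.csv", "your-bucket-name")
```

Passing the client in as an argument keeps the helper independent of how you create your AWS session.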
Use data in SageMaker
Step One: Create a SageMaker notebook instance
Navigate to the SageMaker dashboard in your AWS Console, and create a new notebook instance.
Step Two: Access your data from S3
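In your SageMaker notebook, use the s3fs library with pandas to read the data that you uploaded to S3.
import pandas as pd
import s3fs

bucket_name = "your-bucket-name"  # Replace with the S3 bucket that holds your Visier data
fs = s3fs.S3FileSystem(anon=False)  # anon=False uses your AWS credentials
# Ensure the path matches your S3 bucket and file name
s3_path = "s3://{}/visier_data.csv".format(bucket_name)
with fs.open(s3_path, "rb") as f:
    df = pd.read_csv(f)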
Step Three: Model training and evaluation
Proceed with your model training, validation, and evaluation using the data from Visier within SageMaker as per your project requirements.
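Whatever model you train, you will typically hold out part of the data for validation. The following is a minimal sketch of a reproducible hold-out split using only the standard library. train_validation_split is a hypothetical helper, and the 20% validation fraction and the seed are arbitrary illustrative choices, not Visier or SageMaker recommendations.

```python
import random


def train_validation_split(records, validation_fraction=0.2, seed=42):
    """Shuffle records reproducibly and return (train, validation) lists."""
    if not 0 < validation_fraction < 1:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(records)
    # A dedicated Random instance keeps the split stable across runs
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


train, validation = train_validation_split(range(10))
print(len(train), len(validation))
```

With a pandas DataFrame, you could pass df.to_dict("records") as the records argument.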
Write data back to Visier
Step One: Upload the dataset with the Direct Data Intake API
To upload the data, we use the Python connector, authenticated in the same way as described in Fetch data from Visier. The only difference is the API client that we instantiate: DirectIntakeApiClient instead of QueryApiClient.
In this sample, we assume that most Visier customers send data to Visier by other means, such as SFTP or data connectors, which are processed in Visier's data provisioning engine. Because of this assumption, this sample uses the Direct Data Intake (DDI) API to supplement existing data in Visier and extend the Employee object. Because the supplemental data intake mode is not the default mode, we must configure the data intake mode before this data upload.
Step Two: Upload transaction
The DDI API uses a transactional process that allows callers to upload multiple data files before committing them for processing in Visier. In this sample, we send one data file to Visier. To send our data file to Visier, the DDI API requires three calls:
- start_transaction to begin a transaction. The response contains the transaction ID, which we must retain for the next two calls.
- upload_file to specify a target object and provide a data file with columns that match the properties of the target object that we want to load.
- commit_transaction to close the transaction and process the data files in Visier.
from visier.api import DirectIntakeApiClient
from visier.api.direct_intake import Configuration

# result_filename is the path to the data file to upload, for example your model output
with VisierSession(auth) as s:
    intake_client = DirectIntakeApiClient(s, raise_on_error=True)

    # Configure the Direct Intake to supplement data in the tenant
    # Enable loading into Employee using extension tables
    config = Configuration(is_supplemental=True,
                           extend_objects=['Employee'])
    returned_config = intake_client.set_configuration(config)

    # Start the transaction outside the try block so that transaction_id
    # is always defined if the except block has to roll back
    tx_response = intake_client.start_transaction().json()
    print(tx_response)
    transaction_id = tx_response['transactionId']

    # Upload the file within the context of the transaction
    try:
        intake_client.upload_file(transaction_id, 'Employee', result_filename)
        intake_client.commit_transaction(transaction_id)
        print(f'Committed {transaction_id}')
    except Exception as ex:
        print(f'Rolling back {transaction_id}', ex)
        intake_client.rollback_transaction(transaction_id)
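Because upload_file expects the file's columns to match the properties of the target object, it can be worth checking the file header before you start a transaction. The following is a minimal sketch of such a check. missing_columns is a hypothetical helper, not part of the connector, and the column names in the usage comment are placeholders; use the property names that your Employee object actually defines.

```python
import csv


def missing_columns(csv_path, expected_columns):
    """Return the expected column names that are absent from the CSV header."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        # An empty file yields an empty header, so every column is reported
        header = next(csv.reader(f), [])
    present = {name.strip() for name in header}
    return [column for column in expected_columns if column not in present]


# Usage (placeholder column names):
#   problems = missing_columns(result_filename, ["EmployeeID", "Predicted_Risk"])
#   if problems:
#       raise ValueError(f"Data file is missing columns: {problems}")
```

Running the check before start_transaction avoids opening a transaction that would only be rolled back.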
You've now successfully integrated Visier HR analytics data into SageMaker for advanced machine learning modeling.
Best practices
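- Store your API keys and other sensitive information securely, for example in a .env file that is excluded from version control, and never hard-code them in notebooks or scripts.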
- Tailor the data queries to match the specific analytics needs of your project.
- Explore further model optimization and deployment within SageMaker using Visier data to enhance your HR analytics capabilities.