How it Works
Using the Batch API is a four-step process. You must follow these steps in order to successfully process your data.
- Create a Batch Job: First, you make an API request to create a new job. This initial call will return a job_id and a secure, pre-signed URL that you will use to upload your data.
- Upload Your Input File: Next, you will upload your data file to the pre-signed URL provided in Step 1. The file must be in the specified CSV format. See below for details.
- Start the Batch Job: Once your file is successfully uploaded, you must make a separate API call to start the job, referencing its job_id. The job will not begin processing until this call is made.
- Check Job Status: To monitor the progress of your job, you will need to periodically poll the ‘Get Batch Job’ endpoint using your job_id.
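The ordering of the first three steps can be sketched in Python as follows. This is only an illustration: the BatchClient interface, its method names, and the upload_url response key are hypothetical stand-ins for whatever HTTP calls your integration makes, not part of the Clavata API. Only job_id and the pre-signed URL come from the description above.

```python
from typing import Protocol


class BatchClient(Protocol):
    """Hypothetical client interface; the method names are illustrative only."""

    def create_job(self) -> dict: ...
    def upload(self, upload_url: str, data: bytes) -> None: ...
    def start_job(self, job_id: str) -> None: ...
    def get_job(self, job_id: str) -> dict: ...


def submit_batch(client: BatchClient, csv_bytes: bytes) -> str:
    """Run steps 1-3 in order and return the job_id for later polling."""
    # Step 1: create the job. The response is assumed to carry the job_id
    # and the pre-signed URL under an (assumed) "upload_url" key.
    job = client.create_job()
    job_id = job["job_id"]

    # Step 2: upload the CSV to the pre-signed URL returned in step 1.
    client.upload(job["upload_url"], csv_bytes)

    # Step 3: explicitly start the job; nothing is processed before this call.
    client.start_job(job_id)

    # Step 4 (polling via get_job) is left to the caller.
    return job_id
```

Keeping the transport behind a small interface like this makes the required call order easy to check with a fake client before wiring in real requests.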
Create a Batch Job
Creating a Batch Job is simple. The request returns a URL to upload your content to. The URL is valid for only 4 hours, so be sure to upload your content before it expires. If you miss the window, you will need to create another job.
The current upload limit for each Batch Job is 25,000 items. If you have more than that, you will need to create more than one job.
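If your dataset exceeds the 25,000-item limit, one way to split it across jobs is sketched below. The helper name split_into_jobs is hypothetical, and the sketch assumes that each item corresponds to one row of your input file.

```python
from typing import Iterator, Sequence

MAX_ITEMS_PER_JOB = 25_000  # current per-job upload limit stated above


def split_into_jobs(items: Sequence, limit: int = MAX_ITEMS_PER_JOB) -> Iterator[Sequence]:
    """Yield consecutive slices of `items`, each small enough for one job."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    for start in range(0, len(items), limit):
        yield items[start:start + limit]
```

For example, 60,000 items would be split into three jobs of 25,000, 25,000, and 10,000 items.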
Upload Your Input File
Currently, Clavata supports uploading your content via a CSV file.
Text
Text processing requires the below:
- ref_id - this could be a sequential number, a content GUID, or any other identifier, so long as it is unique to each row.
- type - this should be set to ‘text’
- content - should contain the content you wish to have evaluated
Image
Image processing requires the below:
- ref_id - this could be a sequential number, a content GUID, or any other identifier, so long as it is unique to each row.
- type - this should be set to ‘image_url’
- content - should contain the URL of the image you wish to have evaluated
Ensure your image URLs are publicly accessible!
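As an illustration, the input file could be generated with Python's standard csv module, which handles quoting of commas, quotes, and newlines inside the content field. The sketch below makes several assumptions that are not confirmed above: that the file starts with a header row, that the columns appear in the order ref_id, type, content, and that text and image rows may share one file. The build_csv helper is hypothetical.

```python
import csv
import io

FIELDS = ["ref_id", "type", "content"]
ALLOWED_TYPES = {"text", "image_url"}


def build_csv(rows: list[tuple[str, str, str]]) -> str:
    """Serialize (ref_id, type, content) rows to CSV text."""
    seen_ids = set()
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(FIELDS)  # assumption: a header row is expected
    for ref_id, kind, content in rows:
        if kind not in ALLOWED_TYPES:
            raise ValueError(f"unsupported type: {kind!r}")
        if ref_id in seen_ids:
            raise ValueError(f"duplicate ref_id: {ref_id!r}")
        seen_ids.add(ref_id)
        writer.writerow([ref_id, kind, content])
    return buf.getvalue()
```

Checking ref_id uniqueness and the type value locally catches the most obvious mistakes before you spend part of the upload window on a file that may be rejected.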
Start the Batch Job
Once your content is uploaded, you will need to make another call to start the job. You cannot upload content once a job has started.
Performance Expectations
Batch Job processing time is not guaranteed and can vary significantly based on several factors:
- Job Size: Larger jobs will naturally take longer to complete.
- Content Type: Image processing is more resource-intensive and will take longer than text processing.
- Policy Complexity: Jobs run against policies with a higher number of assertions or more complex logic will require more processing time.
- System Queue: There may be a delay before your job begins processing as it waits in our internal queue.
Check Job Status
You will need to check on your job status periodically. When the job status returns BATCH_JOB_STATE_COMPLETED, the response from the API call will also contain an output_url to download your results. This URL is valid for 4 hours. Each status check on a completed job returns a new URL, which is also valid for 4 hours. Your results are stored for much longer and can be retrieved at your convenience.
Stages
Your job will pass through several stages. You can monitor these by polling the ‘Get Batch Job’ endpoint.
- BATCH_JOB_STATE_QUEUED - Your job is in the queue waiting for processing.
- BATCH_JOB_STATE_PREPROCESSING - Your job is being prepared and validated.
- BATCH_JOB_STATE_SUBMITTED - Your job has been submitted to the AI for evaluation.
- BATCH_JOB_STATE_POSTPROCESSING - The results are being compiled.
- BATCH_JOB_STATE_COMPLETED - Your job is complete and the results are ready to be downloaded.
- BATCH_JOB_STATE_ERROR - Your job has encountered an error.
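A polling loop over these stages might look like the sketch below. It treats BATCH_JOB_STATE_COMPLETED and BATCH_JOB_STATE_ERROR as the two terminal states. The fetch_job callable stands in for your own call to the ‘Get Batch Job’ endpoint, and the "status" response key, the polling intervals, and the timeout are all assumptions chosen for illustration, not documented values.

```python
import time
from typing import Callable

TERMINAL_STATES = {"BATCH_JOB_STATE_COMPLETED", "BATCH_JOB_STATE_ERROR"}


def wait_for_job(
    fetch_job: Callable[[], dict],
    interval_s: float = 30.0,
    max_interval_s: float = 300.0,
    timeout_s: float = 24 * 3600,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal state, backing off between polls."""
    waited = 0.0
    while True:
        job = fetch_job()
        # Assumption: the state is exposed under a "status" key.
        if job["status"] in TERMINAL_STATES:
            return job
        if waited >= timeout_s:
            raise TimeoutError(f"job still in {job['status']} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
        # Double the wait after each poll, up to the cap.
        interval_s = min(interval_s * 2, max_interval_s)
```

Because processing time is not guaranteed, backing off between polls avoids hammering the endpoint during long jobs. The caller should check whether the returned state is BATCH_JOB_STATE_ERROR before reading output_url.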