For technical docs, visit the Clavata API docs.

How it Works

Using the Batch API is a four-step process. You must follow these steps in order to successfully process your data.
  1. Create a Batch Job: First, you make an API request to create a new job. This initial call will return a job_id and a secure, pre-signed URL that you will use to upload your data.
  2. Upload Your Input File: Next, you will upload your data file to the pre-signed URL provided in Step 1. The file must be in the specified CSV format. See below for details.
  3. Start the Batch Job: Once your file is successfully uploaded, you must make a separate API call to start the job, referencing its job_id. The job will not begin processing until this call is made.
  4. Check Job Status: To monitor the progress of your job, you will need to periodically poll the ‘Get Batch Job’ endpoint using your job_id.
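
The ordering of the first three steps can be sketched as follows. This is a minimal illustration, not the Clavata client: the BatchClient interface and its method names (create_job, upload, start_job) are hypothetical placeholders for whatever HTTP calls you make against the endpoints in the API docs.

```python
from typing import Protocol


class BatchClient(Protocol):
    """Hypothetical interface; these method names are NOT from the Clavata API."""

    def create_job(self) -> tuple[str, str]: ...  # returns (job_id, upload_url)
    def upload(self, upload_url: str, csv_bytes: bytes) -> None: ...
    def start_job(self, job_id: str) -> None: ...


def submit_batch(client: BatchClient, csv_bytes: bytes) -> str:
    """Run steps 1-3 in order and return the job_id for later polling."""
    job_id, upload_url = client.create_job()  # Step 1: get job_id and pre-signed URL
    client.upload(upload_url, csv_bytes)      # Step 2: send the CSV to that URL
    client.start_job(job_id)                  # Step 3: nothing is processed before this
    return job_id
```

Keeping the three calls in one function makes it harder to forget the explicit start call, which is what actually begins processing.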

Create a Batch Job

Creating a Batch Job takes a single request. The response includes a URL to upload your content to. The URL is valid for only 4 hours, so be sure to upload your content before it expires. If you miss the window, you will need to create another job.
The current upload limit for each Batch Job is 25,000 items. If you have more than that, you will need to create more than one job.
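
As a rough sketch of working within these two limits, the helpers below split a large item list into job-sized chunks and compute when an upload URL stops being usable. The function names are illustrative, and the code assumes the 4-hour window starts at the moment the job is created.

```python
from datetime import datetime, timedelta
from typing import Iterator, Sequence

MAX_ITEMS_PER_JOB = 25_000                 # per-job item limit stated above
UPLOAD_URL_LIFETIME = timedelta(hours=4)   # upload URL validity stated above


def split_into_jobs(items: Sequence, limit: int = MAX_ITEMS_PER_JOB) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `limit` items, one slice per Batch Job."""
    for start in range(0, len(items), limit):
        yield items[start:start + limit]


def upload_deadline(created_at: datetime) -> datetime:
    """Latest time the upload URL can be used (assumes the clock starts at creation)."""
    return created_at + UPLOAD_URL_LIFETIME
```

For example, 60,000 items would be split into three jobs of 25,000, 25,000, and 10,000 items.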

Upload Your Input File

Clavata currently supports uploading your content as a CSV file.
Each Batch Job accepts only one file. Uploading another CSV to the same job overwrites the previous one.

Text

Text processing requires the following columns:
  • ref_id - a row number, a content GUID, or any other identifier, so long as it is unique to each row.
  • type - this should be set to ‘text’
  • content - the text you wish to have evaluated
ref_id, type, content
1, text, hello
2, text, nice to meet you
3, text, have a great day!
Download text example
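
One way to produce a file in this shape is Python's standard csv module, sketched below. The function name is illustrative. Two points are assumptions rather than documented behavior: that the service accepts standard CSV quoting for content containing commas, and that the space after each comma in the sample above is optional (this sketch writes none).

```python
import csv
import io


def build_text_csv(rows: list[tuple[str, str]]) -> str:
    """Build a CSV with ref_id, type, content columns from (ref_id, content) pairs."""
    seen: set[str] = set()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["ref_id", "type", "content"])
    for ref_id, content in rows:
        if ref_id in seen:
            # ref_id must identify exactly one row
            raise ValueError(f"duplicate ref_id: {ref_id}")
        seen.add(ref_id)
        writer.writerow([ref_id, "text", content])
    return buffer.getvalue()
```

Using csv.writer rather than joining strings by hand means content that contains commas or quotes is escaped consistently.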

Image

Image processing requires the following columns:
  • ref_id - a row number, a content GUID, or any other identifier, so long as it is unique to each row.
  • type - this should be set to ‘image_url’
  • content - the URL of the image you wish to have evaluated
ref_id, type, content
568695, image_url, https://image.com/1234_fuzzy_bunny.png
568696, image_url, https://image.com/5678_fluffy_cat.png
568727, image_url, https://image.com/0912_funny_dog.png
Download image example
Ensure your image URLs are publicly accessible!
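
Before uploading, a cheap local check can catch rows whose content is obviously not a web URL. The helper below is a sketch with an illustrative name. It only checks the URL's shape and cannot confirm that the image is actually reachable from the public internet.

```python
from urllib.parse import urlparse


def looks_like_web_url(url: str) -> bool:
    """Return True if `url` has an http(s) scheme and a host. Shape check only."""
    parts = urlparse(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def split_valid_rows(rows: list[tuple[str, str]]) -> tuple[list, list]:
    """Partition (ref_id, url) rows into (usable, rejected) by URL shape."""
    usable = [row for row in rows if looks_like_web_url(row[1])]
    rejected = [row for row in rows if not looks_like_web_url(row[1])]
    return usable, rejected
```

Rows that fail the check, such as local file paths or URLs missing a scheme, can be fixed or dropped before the job is created.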

Start the Batch Job

Once your content is uploaded, you will need to make another call to start the job. You cannot upload content once a job has started.

Performance Expectations

Batch Job processing time is not guaranteed and can vary significantly based on several factors:
  • Job Size: Larger jobs will naturally take longer to complete.
  • Content Type: Image processing is more resource-intensive and will take longer than text processing.
  • Policy Complexity: Jobs run against policies with a higher number of assertions or more complex logic will require more processing time.
  • System Queue: There may be a delay before your job begins processing as it waits in our internal queue.

Check Job Status

You will need to check your job status periodically. When the job status is BATCH_JOB_STATE_COMPLETED, the API response also contains an output_url for downloading your results. This URL is valid for 4 hours. Each status check on a completed job returns a new URL, and each one is valid for 4 hours. Your results are stored for much longer, so you can retrieve them at your convenience.

Stages

Your job will pass through several stages. You can monitor these by polling the ‘Get Batch Job’ endpoint.
  • BATCH_JOB_STATE_QUEUED - Your job is in the queue waiting for processing.
  • BATCH_JOB_STATE_PREPROCESSING - Your job is being prepared and validated.
  • BATCH_JOB_STATE_SUBMITTED - Your job has been submitted to the AI for evaluation.
  • BATCH_JOB_STATE_POSTPROCESSING - The results are being compiled.
  • BATCH_JOB_STATE_COMPLETED - Your job is complete and the results are ready to be downloaded.
  • BATCH_JOB_STATE_ERROR - Your job has encountered an error.
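
A polling loop over these stages might look like the sketch below. It is not an official client: fetch_job stands in for your own call to the ‘Get Batch Job’ endpoint, and the "status" key is an assumed field name for the job state in the response. The sketch also assumes that BATCH_JOB_STATE_COMPLETED and BATCH_JOB_STATE_ERROR are the only final states.

```python
import time
from typing import Callable

# Assumed to be the only states after which the job no longer changes.
TERMINAL_STATES = {"BATCH_JOB_STATE_COMPLETED", "BATCH_JOB_STATE_ERROR"}


def wait_for_job(
    fetch_job: Callable[[], dict],
    interval_s: float = 30.0,
    max_polls: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal state, then return the last response."""
    for _ in range(max_polls):
        job = fetch_job()  # hypothetical wrapper around 'Get Batch Job'
        if job["status"] in TERMINAL_STATES:  # "status" is an assumed field name
            return job
        sleep(interval_s)
    raise TimeoutError(f"job not finished after {max_polls} polls")
```

Because every status check on a completed job returns a fresh output_url, it is simpler to call the endpoint again when you need the results than to store a URL that may have expired.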