API Run Page

The API Run page displays the progress of an API ingestion. You can access this page in two ways:

  1. The Data Source page: Click the hamburger menu to open the API Runs page.
  2. The Data Load page: Click the icon on any data load widget in the running state to open its API Run page.

You can monitor the progress of data ingestions by looking at the status of a data source.

For example, a data source starts in the 'Waiting to start' status. The status transitions to 'Running' while the data source is actively processing data.

A data load enters the 'Completed' status when the API run has finished running.

An API run may enter the 'Failed' status when an unexpected error occurs during the run. You should submit a support ticket if this happens.

You can also cancel an API run. The API run will enter a 'Canceling …' state until the data source has completely stopped its current processing. This state is designed to prevent multiple users from submitting duplicate cancellation requests.

Only one API run can be active for a data source at a time. A run submitted while another is still in progress enters the 'Rejected - previous load still running' status; once the existing data processing is completed, you may retry the API run.
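
The statuses described above can be sketched as a small state machine. This is a hedged illustration only: the status names come from this page, but the transition map itself is an assumption, not the product's actual implementation.

```python
# Hypothetical sketch of the API run status lifecycle described on this page.
# The status names appear in this article; the allowed transitions are assumptions.

ALLOWED_TRANSITIONS = {
    "Waiting to start": {"Running", "Rejected - previous load still running"},
    "Running": {"Completed", "Failed", "Canceling ..."},
    "Canceling ...": {"Cancelled"},
    # Terminal statuses: retrying creates a new run rather than transitioning.
    "Rejected - previous load still running": set(),
    "Completed": set(),
    "Failed": set(),
    "Cancelled": set(),
}

def can_transition(current: str, target: str) -> bool:
    """Return True if a run in `current` status may move to `target`."""
    return target in ALLOWED_TRANSITIONS.get(current, set())
```

For example, `can_transition("Running", "Failed")` is true, while a completed run cannot move back to 'Running'.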

Understanding an API run

When a connector extracts data via a single API, it populates one or more Redshift tables. This process includes Redshift Ingests, Metadata Consolidations, and Table Consolidations. This page explains how these tasks work and lets you confirm that your data has been stored. A task named 'Complete Redshift ingestion' indicates that the API extraction has finished.

At a high level, an API run typically involves several stages orchestrated by various services:

Extraction: This stage is all about pulling data from a specific API. The API Processor is the central orchestrator, initiating the data pull from various API endpoints. Think of a task named “Surveys” - this means data is being extracted from the Surveys endpoint. Once extracted, the data is temporarily stored, or “staged,” in Amazon S3. The API Processor is responsible for generating and saving delimited file inputs and creating manifest files that specify which S3 objects should be loaded into a table.
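
A manifest file of the kind described above is a small JSON document listing which staged S3 objects should be loaded into a table. Below is a minimal sketch of that format; the bucket and object keys are hypothetical, and the exact files the API Processor writes are not shown on this page.

```python
import json

# Hypothetical example of a Redshift COPY manifest listing staged S3 objects.
# The bucket name and object keys are made up for illustration.
manifest = {
    "entries": [
        {"url": "s3://example-staging/surveys/part-0000.csv", "mandatory": True},
        {"url": "s3://example-staging/surveys/part-0001.csv", "mandatory": True},
    ]
}

# "mandatory": True makes the load fail if a listed object is missing.
print(json.dumps(manifest, indent=2))
```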

Consolidation: For large data volumes, a table's data may be extracted by multiple tasks, with each task handling a portion of the total data. Metadata Consolidation combines the table structures - including schema and column definitions - from each of these tasks or API responses to create a final, comprehensive table structure that acts as a blueprint for all the incoming data. This blueprint then guides subsequent consolidation tasks, which align and merge the actual data from the individual tasks into this unified structure.
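
The schema-merging idea behind Metadata Consolidation can be sketched as follows. This is an assumption-laden illustration: the column names, types, and merge policy are invented, not the service's actual logic.

```python
# Hypothetical sketch of metadata consolidation: merge the column definitions
# reported by several extraction tasks into one comprehensive table structure.
# Column names and types below are made up for illustration.

def consolidate_schemas(batch_schemas: list[dict[str, str]]) -> dict[str, str]:
    """Union the columns from each batch, keeping the first type seen per column."""
    merged: dict[str, str] = {}
    for schema in batch_schemas:
        for column, col_type in schema.items():
            merged.setdefault(column, col_type)
    return merged

# Two batches that each saw a different subset of columns:
blueprint = consolidate_schemas([
    {"survey_id": "BIGINT", "title": "VARCHAR(256)"},
    {"survey_id": "BIGINT", "created_at": "TIMESTAMP"},
])
# `blueprint` now covers every column seen across the batches.
```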

Ingestion: The final stage is Redshift Ingestion, where the staged data is loaded from S3 into Redshift tables. The data is first loaded into temporary tables before being made live.
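
The "load into a temporary table, then make live" pattern mentioned above can be sketched as the following SQL sequence. All table, bucket, and IAM role names here are hypothetical; the statements the service actually issues are internal and not shown on this page.

```python
# Hypothetical sketch of staging data in a temp table before making it live.
# Every identifier below is made up for illustration.
statements = [
    "CREATE TEMP TABLE surveys_staging (LIKE analytics.surveys);",
    "COPY surveys_staging FROM 's3://example-staging/surveys.manifest' "
    "IAM_ROLE 'arn:aws:iam::123456789012:role/example-load' MANIFEST CSV;",
    "BEGIN;",
    # Replace overlapping rows, then append the staged data atomically.
    "DELETE FROM analytics.surveys USING surveys_staging "
    "WHERE analytics.surveys.survey_id = surveys_staging.survey_id;",
    "INSERT INTO analytics.surveys SELECT * FROM surveys_staging;",
    "COMMIT;",
]

for sql in statements:
    print(sql)
```

Wrapping the delete and insert in one transaction means readers never see a half-loaded table.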

With the new system, you’ll see the “Redshift Ingestion” tasks directly on the API Run’s page. This is a change from the old architecture, where you had to check the Data Loads page. A corresponding data load will now only be created if the API run is successful, which streamlines the process and gives you a clearer view of the run’s status.

The entries in these examples will be available when the team releases an upcoming feature called Batched Loading. 

Example of a Redshift Ingestion 

"Redshift ingestion" indicates the loading of a table into Redshift. 

Example of Metadata Consolidation

Metadata consolidation is responsible for combining the table structures from different batches to accommodate all data and for spinning up consolidation tasks.


Example of a Table Consolidation

The API Runs page details active API endpoints from which data is being extracted. The entries provide insight into the types of data being extracted from a connector.

Examples of API endpoints that a data source is processing:

The final processing stage involves Complete Redshift ingestion, where the data source prepares the connector's data for live consumption and subsequent processing.

Use Cases

The following are useful use cases that can be fulfilled using the API Runs page.

Knowing how long each API run takes

Understanding how long incremental or destructive data runs typically take is crucial. This helps identify if any process is running unusually long. The completion time of the last API run is also key when deciding whether to start another. For instance, you might choose not to run a destructive process if a similar API run historically took seven days, depending on your needs. You can use the Type filter to find this information and compare how long destructive versus incremental API runs usually last.

Retrying an API run 

You may retry an API run when it has failed or been cancelled. Only the most recent API run can be retried. We recommend that you submit a support ticket before attempting to retry an API run.

Please include the API run ID when you contact us.

The API run ID can be found in the address bar of your browser. In this example, the API run ID is 31548.

Finding Failed API runs

The "Status" column on this page displays the status of an API run. You can click the "Status" column header to sort the API runs by status.

The following is an example of a failed run.

Clicking the number in the Message column gives some indication of why the failure happened. Please submit a support ticket when this happens.

Finding out table names for an API run

You can click the number under the Tables column to retrieve a description of the table. You can also see the record counts per table in that run.

You can also use the search feature, which lets you find an entry based on:

  • acronym match on the task name (any underscore or space denotes a word boundary)
  • task name contains the search string
  • status starts with the search string (you can type just 'Waiting' instead of 'Waiting to start' to see matches)
  • a table name contains the search string
  • a message contains the search string
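
The matching rules above can be sketched in a few lines. This is a hedged illustration of the listed behavior only; the field names (`task_name`, `status`, `tables`, `message`) are assumptions about how an entry might be represented, not the product's actual data model.

```python
import re

# Hypothetical sketch of the search-matching rules listed above.
# The entry field names are assumptions made for illustration.

def acronym(name: str) -> str:
    """Acronym of a task name, treating underscores and spaces as word boundaries."""
    return "".join(word[0] for word in re.split(r"[_ ]+", name) if word).lower()

def matches(entry: dict, query: str) -> bool:
    q = query.lower()
    return (
        q == acronym(entry.get("task_name", ""))                  # acronym match
        or q in entry.get("task_name", "").lower()                # task name contains
        or entry.get("status", "").lower().startswith(q)          # status starts with
        or any(q in t.lower() for t in entry.get("tables", []))   # table name contains
        or q in entry.get("message", "").lower()                  # message contains
    )

entry = {
    "task_name": "complete_redshift_ingestion",
    "status": "Waiting to start",
    "tables": ["surveys"],
    "message": "",
}
```

Under these assumptions, searching "cri" finds the task by acronym, and typing just "waiting" matches the 'Waiting to start' status.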

