Universal Connector Configuration Guide


This article describes how to configure the Universal Connector in One Model to ingest data from an API. It is intended for Data Admins setting up a new API-based data source, or configuring new endpoints on an existing API data source. By the end you will have created a data source and configured API endpoints ready for data loading.


 

Before you start setting up a connector, ensure:

  • You have Data Admin permissions in One Model.
  • You can access Data > Sources from the One Model menu.
  • You have the API reference information of the data source you wish to connect to. Note: APIs that return data in CSV format should use the CSV Connector.

Getting Started

This is a general guide to configuring a Universal Connector for a custom data source. Universal Connector configuration guides for specific data sources may also be useful references.

The prerequisites for building a One Model custom integration with any application that has an API are:

  • The application must have a public API.
  • You know what type of authentication the API requires - Basic, Bearer Token, OAuth, or Header Token - and have the required usernames and secret keys or passwords for that authentication type.
  • You have the various URLs (base and endpoint), token URLs, and scopes (if applicable) as required for the API. This is typically found in the API documentation published by the application's owner.
  • You know what endpoints are available, how they are organised, and what data is available from them.
  • The endpoints you need are available via GET requests. GET is the standard method for retrieving data from an API without modifying it, and is the most common method for read-only data extraction.
  • You are aware of any requirements or expectations of the data source when requesting data from the API.
  • The endpoint returns data in JSON format. (Note: see Configuring the CSV Connector for APIs that return data in CSV format.)

Most data sources will have documentation describing the type of authentication used by the API, how to set up access credentials, where to access the API and its endpoints, any information that must be passed when retrieving data, and how the queried data will be returned. Having this documentation available when configuring a connector is strongly recommended.

Be selective about which endpoints are needed from the API - data retrieved from one endpoint may make another unnecessary. For example, if an endpoint provides a list of all departments and their details (/departments), it may be redundant to also include an endpoint that queries each department individually (/departments/{departmentid}). The API documentation can help with comparing what is retrieved from each endpoint.

For security reasons, any Password, Client Secret, and/or Token (depending on the method of authentication) will need to be re-entered each time the connector is saved.

 

Additional Information

Destructive Processing

Destructive processing treats the API's current response as the single source of truth, replacing everything that was there before. When running destructive processing, a full historical retrieval of all endpoint data is completed and then loaded into the target tables after all existing records are deleted. Typically an initial run of a newly built connector would be a destructive run. For subsequent runs, the size of the dataset returned in the API's response should be a consideration - the larger the response, the longer a destructive run will take.

Incremental Processing

When running incremental processing, the connector retrieves only the data that has been modified or added since the last successful run. Incremental processing is intended to have a smaller API response footprint than a destructive run, and to avoid requesting data that has already been retrieved. To successfully isolate changes, the connector requires keys and parameters to define the exact timeframe and record set to be retrieved. A key is needed to identify changed records, and a parameter (such as a date filter) is needed to limit the data returned. If either is missing, the endpoint will run in destructive mode even if the overall extraction is set to incremental.
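
The fallback rule above can be sketched as a small decision function. This is an illustration only: the function name and arguments are hypothetical and are not part of One Model.

```python
def effective_mode(run_mode: str, has_key: bool, has_incremental_parameter: bool) -> str:
    """Return the mode an endpoint would actually run in.

    run_mode is the mode chosen for the overall extraction
    ("incremental" or "destructive"). An endpoint only runs incrementally
    when it has both a key and a limiting parameter (such as a date filter).
    """
    if run_mode == "incremental" and has_key and has_incremental_parameter:
        return "incremental"
    # Either the whole run is destructive, or the endpoint is missing
    # a key or a parameter and falls back to destructive mode.
    return "destructive"


# An endpoint with a key but no date filter falls back to destructive mode.
print(effective_mode("incremental", has_key=True, has_incremental_parameter=False))
```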

 


Configuring Connector Details

Field: Description
Name: The label that will be used to identify this data source. Choose one that is meaningful and helps with easy identification of the data source.
Data Loads Should Process Data: Enable this option to trigger processing (e.g. custom SQL, processing script, and cache warming) after data has been loaded into the database.

This defaults to On; however, it may be preferable to leave this Off until the data source configuration is finished and tested.
Enable Debug Mode: Allows logging of the API request and response.

In the event that debug mode is required, One Model will organise its activation and the retrieval of the logs.
Restricted Data: Restricts the downloading of data files produced by this data source. Typically used with sensitive data (e.g. survey data) to ensure it cannot be accessed prior to processing and aggregating.

In the event that restricting data is required, One Model will organise its activation.
Message Queue Capacity: Sets the maximum number of Simple Queue Service (SQS) messages that can be processed at a time by the SQS queue of the API service.

No change to the default configuration is required. In the event that a change is needed, One Model will manage the appropriate capacity.
Schema: The label for the schema that will hold the data retrieved for this data source. Choose one that is meaningful and helps with easy identification of the data source.
Processed To: No configuration required. This will auto-populate with the most recent date the data source was run. The value shown here will be used in the Variables keyword ProcessedTo.
Endpoint Base Uri: The URL for the API's hostname for your data source. Refer to your API's documentation to find the URL - it is typically found as the "Base URL", "Root Endpoint", or "Host". Note that a production environment API hostname may have a similar URL to a test environment.
Max Concurrent Requests: Sets the maximum number of requests to the API that can be processed at one time. This setting can be left at its default of 10, unless the data source's API documentation notes that a lower number of concurrent requests should be used.
Date Time Format: The format of datetime values (e.g. yyyy-MM-ddTHH:mm:ssZ) provided by the API. May be left empty if the format is unknown.

 


Configuring Authentication Types

Basic

Key or user+password credentials

Field: Description
Username: The user identifier that has access to the API. If the data source has provided a key for authentication, that key can be entered here.
Password: The password or secret key for the user identifier. If the data source has provided a key for authentication, the key should be entered into the Username field and Password can be any string (e.g. xx).
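
For background, HTTP Basic authentication sends the username and password joined by a colon and Base64-encoded in an Authorization header. The sketch below shows that standard encoding for the key-as-username case. The key value is a made-up placeholder, and the connector builds this header itself, so nothing like this needs to be configured.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict[str, str]:
    """Build the standard HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Hypothetical API key entered as the Username, with a filler Password.
print(basic_auth_header("my-api-key", "xx"))
```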

Bearer Token

Digital key credentials

Field: Description
Bearer Token: An alphanumeric string as provided by the data source. Bearer tokens can expire, requiring a new token to be created.

OAuth Client Credential

Client ID/Secret are traded for a temporary token

Field: Description
Access Token URL: The URL where the Client ID and Client Secret are sent to obtain an access token. May also be referred to as an OAuth Endpoint, Token Request Endpoint, or Identity Provider (IdP), or found as a POST /token. The Access Token URL is almost always different to the Endpoint Base Uri.
Client ID: The user identifier that has access to the API.
Client Secret: The secret key for the user identifier.
Scopes: A list of the permissions granted to the Client that will be used by the connector. If a permission is listed here that has not been granted to the Client, the connector will be rejected by the API. Refer to your API documentation for the appropriate way to format and separate the scopes.

Scopes may be left empty; however, some APIs require that scopes are listed and will reject the connector if they are not.
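
To show what these fields are used for, the sketch below builds the form body of a typical OAuth 2.0 client credentials token request. It assumes the common convention of sending the credentials in the request body with space-separated scopes; some APIs expect the credentials in a Basic Authorization header or a different scope separator, and the exact request One Model sends is not documented here. The client values and scope names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode


def client_credentials_body(client_id: str, client_secret: str, scopes=()) -> str:
    """Build a form-encoded body for a client credentials token request."""
    fields = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }
    if scopes:
        # Assumption: scopes are space-separated. Check the API documentation.
        fields["scope"] = " ".join(scopes)
    return urlencode(fields)


body = client_credentials_body("abc", "s3cret", ["read:employees", "read:departments"])
# Decode the body again to confirm what the token endpoint would receive.
print(parse_qs(body))
```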

OAuth Static Refresh Token

A 'master key' used to generate access tokens

Field: Description
Access Token URL: The URL where the Client ID, Client Secret, and Refresh Token are sent to obtain an access token. May also be referred to as an OAuth Endpoint, Token Request Endpoint, or Identity Provider (IdP), or found as a POST /token. The Access Token URL is almost always different to the Endpoint Base Uri.
Client ID: The user identifier that has access to the API.
Client Secret: The secret key for the user identifier.
Scopes: A list of the permissions granted to the Client that will be used by the connector. If a permission is listed here that has not been granted to the Client, the connector will be rejected by the API. Refer to your API documentation for the appropriate way to format and separate the scopes.

Scopes may be left empty; however, some APIs require that scopes are listed and will reject the connector if they are not.
Refresh Token: A secret key that rarely, if ever, changes. This key is used in conjunction with the Client ID and Client Secret for access to the API.

Header Token

A key used in the API request header

Field: Description
Header Name: The name for the header token, as defined by the application that provides the API. Typically found in the application's API documentation.
Header Token: The secret key.

 


Configuring Endpoints

Endpoints represent specific resources within the API.

Field: Description
All Endpoints toggle: Toggle individual endpoints on or off to include or exclude them from an API run. A master toggle at the Endpoints header switches all endpoints on or off simultaneously.
Depends On: Choose the required endpoint from the drop-down list if this endpoint requires data retrieved from that endpoint (e.g. using a list of employee IDs retrieved in another endpoint).

Note: the nominated Depends On endpoint should have a Transformer configured to extract and transform the required data, and must be toggled on to pass data to the dependent endpoint.
Batch Output of this Task: Activate batching if this endpoint is a "parent" - that is, another endpoint has this endpoint in its Depends On field. Batch Output of this Task splits results into groups before passing them to the dependent endpoint. Each group becomes a separate task that can be processed in parallel, which can significantly reduce overall extraction time.

Consider using batching if an API run is taking more than 12 hours. See Batch Into Groups Of for guidance on evaluating how many groups may be needed.
Batch Into Groups Of: Controls how many results are included in each batch group. For example, if an endpoint returns 1,000 employee records and this is set to 100, the dependent endpoint will run as 10 parallel tasks of 100 employees each, rather than processing all 1,000 in a single task.

When choosing a value, consider:
  • Too large (e.g. all results in one batch): loses the benefit of parallel processing and can result in very long-running tasks that are slow to retry if they fail.
  • Too small (e.g. 1): generates a large number of individual tasks, which has its own overhead and can trigger API rate limits if many requests are made simultaneously.
As a general guide, aim for a total of 100–200 batches per endpoint and monitor API processing time. If any individual task is consistently taking over two hours to complete, reducing the batch size is a good first step. The API's rate limit documentation may also help to determine an appropriate value.
Name: A label for the endpoint. Choose one that is meaningful and helps with easy identification of the endpoint.
Relative URI: The path to the data that will be retrieved from this endpoint. Refer to your API's documentation - it may be labelled as a "resource path" or part of a path and method combination (e.g. GET /surveys). The Relative URI is appended to the Endpoint Base Uri, so only the trailing path is needed (e.g. /surveys rather than a full URL).

If part of the Relative URI is in curly braces (e.g. /candidates/{id}), the part within the braces is a placeholder that can be filled using data from another endpoint. Nominate the appropriate endpoint in the Depends On field.

Note: the nominated Depends On endpoint should have a Transformer configured to extract and transform the required data, and must be toggled on.
Root Data Element: For APIs that include a metadata layer in their response, this identifies where in the response to start looking for data. Use the case-sensitive name of the top-level property that contains the data list, or leave blank if the API returns a direct list.

For example, the Root Data Element in the following response is data:
{
  "data": [
    {
      "employeeId": "12345",
      "firstName": "Jane",
      "lastName": "Doe",
      "emplStatus": "Active"
    }
  ]
}
Leave blank if the API returns data at the top level or you wish to extract multiple top-level objects.
Table Name: A label for the table that will hold the data retrieved from the endpoint. Choose one that is meaningful and helps with easy identification of the data when querying via SQL Explorer or in a Processing Script.
Paging Type: Look for "page" or "pagination" in the API documentation to determine the appropriate type.

Link - The next page URL is included directly in the API's response body. Look for properties like: next, next_page, links.next, pagination_url.

Query Parameter - The more common type. Paging metadata is included in the API response body or headers, and can then be used as a query parameter in the URL to request the next page. The API documentation may refer to pagination parameters of: page, cursor, offset, next_cursor, startAt, maxResults.

RFC Link Header - The next page URL is included in the response header using the RFC 8288 standard format. The header will be named Link and will look like: <https://api.example.com/surveys?page=2>; rel="next". The API documentation will typically reference the "Link header", "RFC 8288", or keywords like rel="next" and rel="prev". This format is defined here: https://datatracker.ietf.org/doc/html/rfc8288
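
To help recognise the RFC Link Header paging type in an API's responses, the sketch below pulls the rel="next" URL out of a Link header value. It is a simplified reading of the RFC 8288 format rather than a complete parser, and the function name and example URLs are hypothetical.

```python
import re
from typing import Optional

# One link looks like: <url>; rel="next". Links are separated by commas.
LINK_RE = re.compile(r'<([^>]+)>([^,<]*)')
REL_RE = re.compile(r'\brel="?([^";,]+)"?')


def next_page_url(link_header: str) -> Optional[str]:
    """Return the URL whose relation type includes "next", or None."""
    for url, params in LINK_RE.findall(link_header):
        rel = REL_RE.search(params)
        # A rel value may list several relation types separated by spaces.
        if rel and "next" in rel.group(1).split():
            return url
    return None


header = (
    '<https://api.example.com/surveys?page=2>; rel="next", '
    '<https://api.example.com/surveys?page=9>; rel="last"'
)
print(next_page_url(header))  # https://api.example.com/surveys?page=2
```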

 


Configuring Endpoint > Pivot Paths

Pivot Paths take the JSON API response and transform it into tables by flattening objects into rows and splitting arrays into tables.

Pivot Paths are not mandatory and may not be needed in your configuration. If unsure, leave empty.

Pivot Paths are useful where the JSON contains a list of records as an object, using a record identifier (such as an employee ID) as the property name rather than as a property value. A pivot path on the endpoint will treat the nominated object as if it were an array, converting each key into a row instead. The original key is preserved in a column called om_pivot_key, so the identifier is not lost.

For example, the following JSON response for TableA shows employees 4 and 5 using their identifiers as property names. Without a Pivot Path, all objects and properties are stored with property names prepended with their parent names, resulting in employees_4_lastChanged as a column name within TableA:

{
    "table": "tableA",
    "employees": {
        "4": {"lastChanged": "..."},
        "5": {"lastChanged": "..."}
    }
}

Using Pivot Paths reconfigures the JSON into a format more easily read into tables and columns, with om_pivot_key preserving the identifier:

{
    "table": "tableA",
    "employees": [
        {
            "om_pivot_key": "4",
            "lastChanged": "..."
        },
        {
            "om_pivot_key": "5",
            "lastChanged": "..."
        }
    ]
}

Field: Description
Path: The path to the object that will be pivoted. In the example above, the Path would be employees. Include the full path where nested - e.g. data.employees. Wildcards are accepted - e.g. values[*].links.
Unpivot Child Arrays: Activate this option to unpivot any arrays found within the nominated Path (child arrays).
Unpivot to Parent Table: Rows from the unpivoted object and rows from any arrays within it are typically stored in separate tables with a linking key. Activate this option to collect all rows into a single table.
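
The reshaping shown in the tableA example can be sketched in a few lines of Python. This is an illustration of the transformation only, not the connector's implementation: the function name is hypothetical, and it handles dotted paths such as data.employees but not wildcards.

```python
import copy


def pivot(document: dict, path: str) -> dict:
    """Turn the keyed object at `path` into a list of rows.

    Each original key is kept on its row as om_pivot_key.
    """
    result = copy.deepcopy(document)  # leave the input untouched
    *parents, leaf = path.split(".")
    node = result
    for name in parents:
        node = node[name]
    keyed = node[leaf]
    node[leaf] = [{"om_pivot_key": key, **value} for key, value in keyed.items()]
    return result


response = {
    "table": "tableA",
    "employees": {
        "4": {"lastChanged": "..."},
        "5": {"lastChanged": "..."},
    },
}
print(pivot(response, "employees"))
```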

Configuring Endpoint > Headers

Headers let the API know who you are, what format you're requesting, and how to handle the delivery of data. Authorization tokens or credentials are automatically included and do not need to be configured here.

Review the API documentation for:

  • Any required custom headers - these may be in a reference table or a "headers" section, and typically accompany keys that start with X-.
  • Example curl commands that start with -H, which indicates a header requirement. Copy the key (the part before the colon) and the value (the part after) into the Header section of the connector.

Note: headers can be case sensitive - ensure any key and value are copied exactly as written in the API documentation. A common header to include where required by the API is accept: application/json.

Field: Description
Header: The header name or key. E.g. content-type, accept, user-agent.
Value: The value that will accompany this header. E.g. application/json or text/csv.

 


Configuring Endpoint > Parameters

Parameters pass filters or modifiers with the API request to ensure the returned dataset contains only what is needed. 

Review the API documentation for "filters" or "query parameters" to determine each endpoint's parameter names, types (e.g. string, date), whether they are required, and what each parameter does. If a required parameter is omitted, the API request will return an error. It is also worth reviewing optional parameters to determine whether the default response (used when the parameter is omitted) is acceptable.

Field: Description
Name: The parameter name. Must be one that the API recognises.
Value: The parameter value, either static or variable.

One Model Variables allow for values to be dynamically substituted into parameters when the connector runs:
  • Range Variables: use {VariableName_From} and {VariableName_To} to reference the From and To values configured in range variables.
  • Non-Range Variables: use {VariableName}.
Type: All - Use the parameter in both Destructive and Incremental API runs.

Incremental Only - Only use the parameter in incremental runs. If enabled, a Key must also be configured on this endpoint.

Destructive Only - Only use the parameter in destructive runs.
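
The interaction between parameter Type and variable placeholders can be pictured as follows. This is a sketch under assumptions: the parameter names (updated_after, limit), the variable name Modified, and the resolved date are all hypothetical, and the connector's own substitution logic may differ in detail.

```python
import re


def resolve_parameters(parameters: list, variables: dict, run_mode: str) -> dict:
    """Pick the parameters that apply to this run and fill in {placeholders}.

    run_mode is "incremental" or "destructive". Placeholders with no
    matching variable are left unchanged.
    """
    mode_specific = "Incremental Only" if run_mode == "incremental" else "Destructive Only"
    allowed = {"All", mode_specific}
    resolved = {}
    for param in parameters:
        if param["type"] not in allowed:
            continue
        resolved[param["name"]] = re.sub(
            r"\{(\w+)\}",
            lambda m: variables.get(m.group(1), m.group(0)),
            param["value"],
        )
    return resolved


parameters = [
    {"name": "updated_after", "value": "{Modified_From}", "type": "Incremental Only"},
    {"name": "limit", "value": "100", "type": "All"},
]
variables = {"Modified_From": "2024-01-01", "Modified_To": "2024-06-30"}

print(resolve_parameters(parameters, variables, "incremental"))
print(resolve_parameters(parameters, variables, "destructive"))
```

In the incremental run both parameters are sent; in the destructive run only limit is sent, so the API returns its full dataset.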

Configuring Endpoint > Keys

Keys nominate a unique identifier for an object in an API response. 

Keys are required when configuring Parameters with incremental filtering, to ensure that data is not duplicated when retrieving a filtered API response. The key should be the unique identifier on which the incremental date filtering will be applied. For example, when setting an incremental parameter that retrieves all added or changed employee records, the key will be the employee identifier and the parameter will be the modified date.

Keys are required on any endpoint that will run in incremental mode. Without a key set, the endpoint will fall back to destructive mode even during an incremental run.

Field: Description
Path: No configuration required.
Values > Key: The value that will act as the key (e.g. id, employeeid). Must be available within the endpoint.
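
To illustrate why a key prevents duplicates, the sketch below merges an incremental response into previously loaded rows by key. It is a conceptual model only; the function name and sample records are hypothetical, and One Model's actual load process is not described in this guide.

```python
def merge_incremental(existing: list, incoming: list, key: str) -> list:
    """Merge changed or new rows into existing rows, matching on `key`."""
    by_key = {row[key]: row for row in existing}
    for row in incoming:
        # A changed record replaces the old row; a new record is added.
        by_key[row[key]] = row
    return list(by_key.values())


loaded = [
    {"employeeid": 1, "status": "Active"},
    {"employeeid": 2, "status": "Active"},
]
changes = [
    {"employeeid": 2, "status": "Terminated"},  # modified since last run
    {"employeeid": 3, "status": "Active"},      # added since last run
]
print(merge_incremental(loaded, changes, "employeeid"))
```

Without a key there would be no way to tell that the incoming row for employee 2 is an update, so it would be stored as a second row.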

 


Configuring Endpoint > Transformers

Field: Description
Required: Enable this to fail processing if the nominated transformer cannot be found in every API response.

Transformer Type: Translating

A Translating Transformer takes a value from a response (the Selector) and inserts it into the next request.

Field: Description
Selector Path: The name of the header or property that contains the value needed for output. Include the Root Data Element if applicable.
Selector Location: Where the nominated Selector Path may be found.

Header - The nominated Selector Path is found in the response header.
Response Body - The nominated Selector Path is found in the response body.
Inserter Location: Where the value found at the Selector Path is to be output.

Header - The value is inserted into the request header.
Query String - The value is inserted into the request query.
Replace Uri - The value will be used as the absolute URI.

Transformer Type: Output

An Output Transformer takes a value from a response (the Selector) and inserts it as a parameter for dependent endpoints.

An output transformer must be created if the endpoint will be nominated as a Depends On endpoint. The transformer should nominate the appropriate identifier in Selector Path, with Response Body as Selector Location and Query String as Inserter Location.

Field: Description
Selector Path: The name of the header or property that contains the value needed for output. Include the Root Data Element if applicable.
Selector Location: Where the nominated Selector Path may be found.

Header - The Selector Path is found in the response header. E.g. if the next page URL is returned in the header, the Selector Path would be Link.
Response Body - The Selector Path is found in the response body. E.g. if the employee identifier needs to be output and it is in the root data element of data, the Selector Path would be data.employeeId.
Inserter Location: Where the value found at the Selector Path is to be output.

Header - The value is inserted into the request header.
Query String - The value is inserted into the request query.
Replace Uri - The value will be used as the absolute URI.
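
The flow from a parent endpoint to a dependent endpoint can be pictured as: select a value from each parent record, optionally split the values into batches (the Batch Into Groups Of setting described under Configuring Endpoints), and fill the dependent endpoint's placeholder with each value. The sketch below models that flow under assumptions: the helper names, the /employees/{employeeId} Relative URI, and the way the placeholder is matched to the selected value are hypothetical.

```python
def select_values(response: dict, selector_path: str) -> list:
    """Read one value per record, e.g. data.employeeId."""
    root, field = selector_path.split(".", 1)
    return [record[field] for record in response[root]]


def batch_into_groups(values: list, size: int) -> list:
    """Split values into consecutive groups of at most `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def fill_placeholder(relative_uri: str, name: str, value: str) -> str:
    """Replace {name} in the Relative URI with a selected value."""
    return relative_uri.replace("{" + name + "}", str(value))


# Simulated parent response containing 1,000 employee records.
parent_response = {"data": [{"employeeId": str(n)} for n in range(1, 1001)]}

employee_ids = select_values(parent_response, "data.employeeId")
groups = batch_into_groups(employee_ids, 100)
print(len(groups))  # 10

first_task = [fill_placeholder("/employees/{employeeId}", "employeeId", v) for v in groups[0]]
print(first_task[0])  # /employees/1
```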

Transformer Type: Incrementing

An Incrementing Transformer takes a value from a response (the Selector), increments it, and inserts it into the next request.

Field: Description
Selector Path: The name of the header or property that contains the value needed for output. Include the Root Data Element if applicable.
Selector Location: Where the nominated Selector Path may be found.

Header - The Selector Path is found in the response header. E.g. if the API requires page numbers to be tracked, the Selector Path would be X-Page-Number.
Response Body - The Selector Path is found in the response body. E.g. if the employee identifier needs to be output and it is in the root data element of data, the Selector Path would be data.employeeId.
Query String - The Selector Path is found in the query string. E.g. if using offset pagination, the Selector Path would be offset.
Increment By: The number to increment the value at the Selector Path by.
Maximum: The highest number the Selector Path value can be incremented to. Can be left at the default of 0 - processing will automatically terminate when no more results are received, the same URL is queried twice in a row, or the Terminator is triggered.
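
As an illustration of offset pagination with an Incrementing Transformer, the sketch below simulates the request loop against an in-memory list instead of a real API. The get_page function is a hypothetical stand-in for an API call, and the handling of Maximum reflects one reading of the description above (0 means no limit).

```python
def fetch_all(get_page, increment_by: int, maximum: int = 0) -> list:
    """Request pages at increasing offsets until a stop condition is met."""
    offset = 0
    rows = []
    while True:
        page = get_page(offset)
        if not page:
            break  # built-in stop: the response returned no results
        rows.extend(page)
        offset += increment_by
        if maximum and offset > maximum:
            break  # the offset may not be incremented past Maximum
    return rows


# Simulated data source with 250 records, served 100 at a time.
RECORDS = list(range(250))


def get_page(offset: int) -> list:
    return RECORDS[offset:offset + 100]


print(len(fetch_all(get_page, increment_by=100)))               # 250
print(len(fetch_all(get_page, increment_by=100, maximum=100)))  # 200
```

With Maximum left at 0 the loop requests offsets 0, 100, 200, and 300, and stops when the last request returns nothing.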

 


Configuring Endpoint > Terminator

A Terminator stops requesting data from the API when a specified condition is met. 

The connector has two built-in termination behaviours that are always active and do not require a Terminator to be configured:

  • If the API response returns no results, processing stops automatically.
  • If the same URL is queried twice in a row, processing stops automatically.

Use a Terminator to set an additional limit on requests and prevent the connector from making unnecessary API calls. When the Terminator logic evaluates to true, the connector will finish its current task (saving the data it has received) and close the connection.

Configure a Terminator when the API signals the end of available data in a way other than returning an empty result - look for a has_more: false property, a null next_cursor, or a total page count in the API documentation. When the Termination Value matches the nominated Selector value, the connector will complete its current task and close the connection.

Field: Description
Required: When enabled, this will fail the connector processing if the response does not include the nominated Selector value.
Selector Path: The name of the header or property that contains the data needed for terminating the connector processing. Include the Root Data Element if applicable.
Selector Location: Where the nominated Selector Path may be found.

Header - The Selector Path is found in the response header. E.g. if the API requires page numbers to be tracked, the Selector Path would be X-Page-Number.

Response Body - The Selector Path is found in the response body. E.g. if the API response includes a has_more property.

Query String - The Selector Path is found in the query string. E.g. if using offset pagination, the Selector Path would be offset.
Termination Value: The case-sensitive value that, when equal to the nominated Selector value, will stop further processing. E.g. false or no if using has_more as the Selector value.
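
A Terminator check on a response body can be modelled as a case-sensitive comparison between the selected value and the Termination Value. The sketch below is an approximation: the function name is hypothetical, and it assumes that non-string JSON values are compared in their JSON text form (so a JSON false matches the Termination Value false), which may not match the connector exactly.

```python
import json


def should_terminate(body: dict, selector_path: str, termination_value: str,
                     required: bool = False) -> bool:
    """Return True when the selected value equals the Termination Value."""
    node = body
    for part in selector_path.split("."):
        if not isinstance(node, dict) or part not in node:
            if required:
                # Mirrors the Required option: a missing selector is an error.
                raise KeyError(f"Selector path not found: {selector_path}")
            return False
        node = node[part]
    # Assumption: booleans, numbers, and null are compared as JSON text.
    text = node if isinstance(node, str) else json.dumps(node)
    return text == termination_value


print(should_terminate({"has_more": False}, "has_more", "false"))  # True
print(should_terminate({"has_more": False}, "has_more", "False"))  # False
print(should_terminate({"has_more": True}, "has_more", "false"))   # False
```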

 


Configuring Variables

Variables are a flexible way for your One Model connector to insert dynamic values into API requests without the maintenance of continually updating the connector. Variables are used in Parameter values and allow the API to filter its response so that only records meeting the variable condition are included.

Variables allow for the following keywords:

  • CurrentTime - uses the start date of the API run.
  • ProcessedTo - uses the Processed To date of the data source. Processed To is the date data has been extracted to for the data source; the current value can be seen in the main connector details for the data source.

A common use for variables is with incremental API runs. By creating a variable with ProcessedTo and using it in a Parameter on an updated_at API parameter, the API request will be scoped to return only records modified since the last successful run.

Note: ensure any parameters used are recognised by the API data source (as defined in the API documentation), and that any variables used in those parameters are the same data type as the parameter. How the API interprets and applies date values will depend on the API's own parameter definitions - refer to the API documentation to confirm the expected format and behaviour of each parameter.

Variable Type: Date Range or Number Range

Field: Description
Variable Name: A label for the variable. Choose one that is meaningful and helps with easy identification. Ensure the same label is used when referencing the variable in a parameter.
Type: Date Range - Uses a date range for the variable. Allows ProcessedTo and CurrentTime keywords.
Number Range - Uses a number range for the variable.
From: The date or number representing the start of the range. If Date Range type is selected, the ProcessedTo and CurrentTime keywords are also allowed.
To: The date or number representing the end of the range. If Date Range type is selected, the ProcessedTo and CurrentTime keywords are also allowed.
Period: How the date or number range will be split. E.g. year will split the date range into yearly intervals, enabling one task per interval.
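
The effect of a yearly Period on a date range can be sketched as below. The guide does not state where interval boundaries fall, so this example assumes calendar-year boundaries (1 January); the function name and dates are hypothetical.

```python
from datetime import date


def split_by_year(start: date, end: date) -> list:
    """Split [start, end) into intervals that break at each 1 January."""
    intervals = []
    cursor = start
    while cursor < end:
        next_boundary = min(date(cursor.year + 1, 1, 1), end)
        intervals.append((cursor, next_boundary))
        cursor = next_boundary
    return intervals


# Each interval would supply one {VariableName_From} / {VariableName_To} pair,
# and so one task.
for interval_from, interval_to in split_by_year(date(2022, 6, 1), date(2024, 3, 1)):
    print(interval_from.isoformat(), interval_to.isoformat())
```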

Variable Type: Date or Text

Field: Description
Variable Name: A label for the variable. Choose one that is meaningful and helps with easy identification. Ensure the same label is used when referencing the variable in a parameter.
Type: Date - Uses a date for the variable. Allows ProcessedTo and CurrentTime keywords.
Text - Uses a text value for the variable.
Value: The date or text value for the variable. If Date type is selected, the ProcessedTo and CurrentTime keywords are also allowed.

 


You have configured your Universal Connector data source. You can now:

  • Perform a destructive API run from the Data Source page.
  • Monitor your API run from Data > Sources > View API Runs. For additional information on API runs, see the guide to Understanding the API Run Page.
  • Monitor successful API runs as they transition into Data Loads. For additional information on data loads, see the guide to Data Loads.
  • For questions, contact your One Model Customer Success team.
