This article is intended for Data Admins who are configuring a Google Cloud Storage data source in One Model. It explains how to use Google Console to enable the required APIs, create a Workload Identity Pool, create a Service Account (if applicable) and grant One Model the permissions it needs to retrieve data from Google Cloud Storage. By the end you'll have a configured Workload Identity Pool and the necessary bucket permissions in place to set up your Google Cloud Storage connector.
Note: Data Admins may need to have these steps completed by their Google Workspace administrator.
Google Workload Identity Federation (WIF) is a security framework that allows One Model's AWS-hosted services to access your Google Cloud resources without requiring long-lived service account keys. Instead of managing static credentials, WIF establishes a trust relationship between One Model's AWS role and your Google Cloud environment, so access is granted securely and automatically.
Before you begin configuring Google WIF you will need to:
- Identify your bucket: confirm you have one or more Google Cloud Storage buckets containing the data you want One Model to access, and that you know their names.
- Identify your BigQuery schema: if you would like to retrieve data from BigQuery, confirm the names of the BigQuery schemas involved.
- Obtain the One Model AWS Role ARN*: ask your One Model Customer Success team for the AWS Account ID and AWS Role Name. The AWS ARN follows the format:
arn:aws:iam::AWS_ACCOUNT_ID:role/AWS_ROLE_NAME
and consists of:
AWS_ACCOUNT_ID: One Model's 12-digit AWS Account ID
AWS_ROLE_NAME: the name of the specific IAM Role our services will be using. This is om-COMPANY_ID-ext-data-intg, where COMPANY_ID is the ID of the company.
- Determine your access approach: a step in this configuration requires a choice between two approaches: Direct Resource Access and Service Account Impersonation. Direct Resource Access grants One Model's AWS role permission to read from your bucket directly through the Workload Identity Pool; it is the simpler of the two options, with fewer components to configure. Service Account Impersonation adds an intermediary Google Service Account through which access is managed, which may be preferable for organizations that already govern cloud resource permissions through Service Accounts. Review these options with your Google Workspace or GCP administrator to determine which best fits.
* An ARN is an Amazon Resource Name, a unique identifier used to specify AWS resources across an environment.
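As a minimal sketch of how the two values from Customer Success combine into the ARN, assuming the role name always follows the om-COMPANY_ID-ext-data-intg pattern described above. The helper name build_role_arn and the sample values are hypothetical and used only for illustration.

```python
import re


def build_role_arn(aws_account_id: str, company_id: str) -> str:
    """Assemble the One Model AWS Role ARN from its two parts.

    Hypothetical helper; the real values come from your
    One Model Customer Success team.
    """
    # AWS account IDs are exactly 12 digits.
    if not re.fullmatch(r"\d{12}", aws_account_id):
        raise ValueError("AWS_ACCOUNT_ID must be exactly 12 digits")
    if not company_id:
        raise ValueError("COMPANY_ID must not be empty")
    role_name = f"om-{company_id}-ext-data-intg"
    return f"arn:aws:iam::{aws_account_id}:role/{role_name}"


# Placeholder values, not real One Model identifiers.
example_arn = build_role_arn("123456789012", "acme")
print(example_arn)
```

Validating the account ID length up front catches the most common copy-and-paste mistake before the value is used in later steps.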
As you work through these steps, please retain the following values. They will need to be provided to Customer Success for the Google Cloud Storage data source configuration in One Model:
- Project Number
- Pool ID
- Provider ID
- Service Account Email (for Service Account Impersonation configurations)
Create a Project in Google Console
Google Console organises cloud resources and access controls around 'projects', which hold your Workload Identity Pool, API settings, and permissions. You may use an existing project, but creating a dedicated project for this integration keeps your One Model access configuration separate and easy to manage. Follow the steps below to create a new project, or skip to the next section if you are using an existing project.
- Go to the Google Console at https://console.cloud.google.com/apis/dashboard and select Create Project:
- Choose a Project Name,
- confirm or change the Organization,
- confirm or change the Parent Resource, and click Create:
Enable IAM and Security Token APIs
By default, Google projects can't access any APIs; they need to be explicitly enabled. The WIF integration requires three APIs to be switched on. These allow Google Cloud to verify One Model's AWS identity, issue short-lived access tokens, and manage the permissions that control what One Model can access. Without these enabled, any request One Model makes to your Google Cloud resources will be rejected.
The following APIs will need to be enabled in the GCP project:
- IAM Service Account Credentials API (iamcredentials.googleapis.com) for generating short-lived credentials on behalf of a Google Service Account
- IAM API (iam.googleapis.com) enables Google Cloud's identity and access management system, which controls who and what is permitted to access your cloud resources
- Security Token Service API (sts.googleapis.com) allows One Model's AWS identity to be exchanged for a Google Cloud token, which is the core mechanism that makes the WIF trust relationship work.
- After creating a project, you will be redirected to the APIs & Services screen. Select Library from the menu:
- In the Library screen, search for ‘IAM Service Account Credentials’ and select it from the list of results:
- The details of the IAM Service Account Credentials API are displayed. Click Enable:
- Return to the Library screen, search for ‘Identity and Access Management’ and select it from the list of results:
- The details of the Identity and Access Management API are displayed. Click Enable:
- Repeat this process for the Security Token Service API and click Enable:
- You should now see these APIs in the APIs and Services list:
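For administrators who prefer the command line, the same three APIs can, as far as we are aware, be enabled with the gcloud CLI's services enable command. The sketch below only assembles and prints that command; it assumes the gcloud CLI is installed and authenticated, and the project ID my-one-model-project is a placeholder.

```python
import shlex

# The three APIs named in this section.
REQUIRED_APIS = [
    "iamcredentials.googleapis.com",
    "iam.googleapis.com",
    "sts.googleapis.com",
]


def enable_apis_command(project_id: str) -> list[str]:
    """Return the gcloud argument list that enables the required APIs."""
    return ["gcloud", "services", "enable", *REQUIRED_APIS, f"--project={project_id}"]


# Placeholder project ID; replace with your own before running the command.
command = enable_apis_command("my-one-model-project")
print(shlex.join(command))
```

Printing the command rather than executing it lets you review it with your GCP administrator first.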
Create Workload Identity Pool
A Workload Identity Pool is the mechanism that establishes trust between One Model's AWS environment and your Google Cloud project. It works by defining a relationship between an external identity provider (in this case, AWS) and your Google Cloud resources, so that when One Model's AWS services make a request, Google Cloud can verify who is asking and decide whether to grant access. Creating a pool and adding AWS as a provider is what allows One Model's AWS role to be recognised as a trusted identity in your Google environment, without needing a static password or key.
In this section you will create a new Workload Identity Pool. If your project already has a Workload Identity Pool, you can add a new pool alongside it.
- Open the Navigation menu at the top left:
- Find IAM & Admin and select Workload Identity Federation (WIF):
- (if this is the first Workload Identity Pool in the project) You will see a Get Started option. Click this:
- (if this is not the first Workload Identity Pool in the project) Click Create Pool:
- Give the Pool a descriptive name and a description. The Pool ID will be automatically generated from the given name. Leave the default Enabled setting and click Continue:
- Add a Provider to the Pool. (a) Choose AWS from the Add a Provider dropdown (b) Enter a name for One Model as the Provider. (c) Once a name is entered, make a note of the automatically generated Provider ID as this will be required in a later step. (d) Enter the One Model AWS Account ID and click Continue:
- Configure Provider Attributes. Click the Edit Mapping dropdown to review the attribute mappings. The defaults can be kept:
(a) google.subject = assertion.arn
(b) attribute.aws_role = assertion.arn.contains('assumed-role') ? assertion.arn.extract('{account_arn}assumed-role/') + 'assumed-role/' + assertion.arn.extract('assumed-role/{role_name}/') : assertion.arn
- Click Save to continue:
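To illustrate what the default aws_role mapping does, the following Python sketch approximates it for the common case: an assumed-role ARN has its trailing session name removed, and any other ARN is passed through unchanged. This is an illustration only, not Google's implementation; the function name map_aws_role and the sample ARNs are hypothetical, and edge cases (such as an ARN with no session segment) may behave differently in the real mapping.

```python
def map_aws_role(assertion_arn: str) -> str:
    """Approximate the default attribute.aws_role mapping in plain Python."""
    marker = "assumed-role/"
    if "assumed-role" not in assertion_arn:
        # Not an assumed-role ARN: the mapping returns it unchanged.
        return assertion_arn
    # Text before "assumed-role/" plays the part of {account_arn}.
    account_arn, _, rest = assertion_arn.partition(marker)
    # The first path segment after the marker plays the part of {role_name};
    # anything after it (the session name) is dropped.
    role_name = rest.split("/", 1)[0]
    return account_arn + marker + role_name


# Placeholder ARN with a session name appended by AWS STS.
session_arn = "arn:aws:sts::123456789012:assumed-role/om-acme-ext-data-intg/session-1"
print(map_aws_role(session_arn))
```

Dropping the session name is what allows a single permission grant to cover every session that One Model's role opens.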
Gather the Identity Pool Project Number
Once your Workload Identity Pool is created, you will need to note two identifiers from its configuration: the Project Number and the Pool ID. These are used in the next step to construct the Principal Identifier, which is the string that tells Google Cloud exactly which One Model AWS role should be granted access to your bucket. The easiest way to retrieve these values is by downloading the configuration file that Google generates for your pool, and locating them within it.
- From Workload Identity Federation, click on the Pool name to bring up its configuration details:
- In the right panel, click Connected Service Accounts, then Download Config:
- Select the Provider Name you chose in Create Workload Identity Pool (step 6b) from the dropdown and click Download Config:
- You can review the downloaded config in a text editor. Look for the audience value, which should look similar to:
//iam.googleapis.com/projects/<project_number>/locations/global/workloadIdentityPools/<pool_id>/providers/<provider_id>
Note the Project Number, which immediately follows projects in the string, and the Pool ID (after workloadIdentityPools), and save them for future steps.
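If you would rather extract these values programmatically than read them by eye, a minimal sketch follows. It assumes the downloaded config is a JSON file with a top-level audience key in the format shown above; the function names and the sample identifiers are hypothetical.

```python
import json
import re

# Pattern for the audience string format shown in this section.
AUDIENCE_PATTERN = re.compile(
    r"^//iam\.googleapis\.com/projects/(?P<project_number>\d+)"
    r"/locations/global/workloadIdentityPools/(?P<pool_id>[^/]+)"
    r"/providers/(?P<provider_id>[^/]+)$"
)


def parse_audience(audience: str) -> dict[str, str]:
    """Split an audience string into project number, pool ID and provider ID."""
    match = AUDIENCE_PATTERN.match(audience)
    if match is None:
        raise ValueError(f"Unexpected audience format: {audience!r}")
    return match.groupdict()


def read_identifiers(config_text: str) -> dict[str, str]:
    """Parse the config text (assumed to be JSON) and extract the identifiers."""
    return parse_audience(json.loads(config_text)["audience"])


# Sample config with placeholder identifiers.
sample_config = json.dumps({
    "audience": "//iam.googleapis.com/projects/123456789/locations/global"
                "/workloadIdentityPools/one-model-pool/providers/one-model-aws"
})
print(read_identifiers(sample_config))
```

To use it with a real file, pass the file's contents to read_identifiers, for example via pathlib.Path("your-config.json").read_text().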
Construct a Principal Identifier
The Principal Identifier is a string that uniquely identifies One Model's AWS role to your Google Cloud environment. It combines the Project Number and Pool ID you gathered in the previous step with the AWS Account ID and AWS Role Name provided by the One Model Customer Success team. Once constructed, this string is used in the next step to grant One Model's AWS role the specific permission it needs to read files from your bucket. Take care when assembling the string as even a small error will prevent access from being granted correctly.
Before beginning, ensure you have:
- Project Number (from Gather the Identity Pool Project Number)
- Pool ID (from Gather the Identity Pool Project Number)
- AWS Account ID (provided by Customer Success)
- AWS Role Name (provided by Customer Success)
- Construct the principal identifier for the One Model AWS role. It follows this format:
principalSet://iam.googleapis.com/projects/<project_number>/locations/global/workloadIdentityPools/<pool_id>/attribute.aws_role/arn:aws:sts::AWS_ACCOUNT_ID:assumed-role/AWS_ROLE_NAME
- Replace <project_number>, <pool_id>, AWS_ACCOUNT_ID and AWS_ROLE_NAME (the last two provided by Customer Success) in the above string and save the whole string for use in a future step.
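Because a small error in this string prevents access from being granted, it can help to assemble it in code rather than by hand. The sketch below follows the format given above; the function name build_principal_identifier and all sample values are hypothetical placeholders.

```python
def build_principal_identifier(
    project_number: str,
    pool_id: str,
    aws_account_id: str,
    aws_role_name: str,
) -> str:
    """Assemble the Principal Identifier from its four parts."""
    parts = {
        "project_number": project_number,
        "pool_id": pool_id,
        "aws_account_id": aws_account_id,
        "aws_role_name": aws_role_name,
    }
    for name, value in parts.items():
        # Catch stray whitespace and unreplaced <placeholders> early.
        if not value or value != value.strip() or "<" in value or ">" in value:
            raise ValueError(f"{name} looks invalid: {value!r}")
    return (
        "principalSet://iam.googleapis.com/"
        f"projects/{project_number}/locations/global/"
        f"workloadIdentityPools/{pool_id}/"
        "attribute.aws_role/"
        f"arn:aws:sts::{aws_account_id}:assumed-role/{aws_role_name}"
    )


# Placeholder values for illustration only.
principal = build_principal_identifier(
    "123456789", "one-model-pool", "123456789012", "om-acme-ext-data-intg"
)
print(principal)
```

Note that the ARN inside the identifier uses the sts service and the assumed-role path, as in the format above, rather than the iam and role form of the AWS Role ARN.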
Grant Bucket Permissions for Principal Identifier
This step grants One Model's AWS role permission to read files from your Google Cloud Storage bucket. You will apply the Principal Identifier constructed in the previous step to your chosen bucket, assigning it the Storage Object Viewer role. This role allows One Model to read the files in the bucket without being able to modify or delete them. Once saved, One Model's AWS services will be able to retrieve data from that bucket whenever an API run for the One Model connector is triggered.
- From the top left navigation menu, find Cloud Storage and click Buckets:
- Select the bucket you would like to use, scroll to the far right, and from the menu choose Edit access:
- In the Grant Access pop-up panel, paste your entire Principal Identifier string (generated in Construct a Principal Identifier) into the New Principals box, select Storage Object Viewer from the Role dropdown and click Save:
- Repeat these steps for any additional buckets you need accessible to the One Model connector.
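When several buckets need the same grant, a command-line equivalent may be quicker than repeating the console steps. The sketch below assembles one gcloud storage buckets add-iam-policy-binding command per bucket, using roles/storage.objectViewer (the role ID behind Storage Object Viewer). It assumes the gcloud CLI is installed and authenticated; the bucket names and principal string are placeholders.

```python
import shlex


def grant_bucket_viewer_command(bucket_name: str, principal_identifier: str) -> list[str]:
    """Return the gcloud argument list granting read-only access to one bucket."""
    return [
        "gcloud", "storage", "buckets", "add-iam-policy-binding",
        f"gs://{bucket_name}",
        f"--member={principal_identifier}",
        "--role=roles/storage.objectViewer",
    ]


# Placeholder principal and bucket names for illustration only.
example_principal = (
    "principalSet://iam.googleapis.com/projects/123456789/locations/global/"
    "workloadIdentityPools/one-model-pool/attribute.aws_role/"
    "arn:aws:sts::123456789012:assumed-role/om-acme-ext-data-intg"
)
bucket_commands = [
    grant_bucket_viewer_command(name, example_principal)
    for name in ["hr-exports", "payroll-exports"]
]
for cmd in bucket_commands:
    print(shlex.join(cmd))
```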
Service Account Impersonation
* This step may be skipped if you are using Direct Resource Access.
Here we will create a Service Account and configure Service Account Impersonation. This approach adds an intermediary Google Service Account through which access is managed, which may be preferable for organizations that already govern cloud resource permissions through Service Accounts. We recommend discussing both options with your Google Workspace or GCP administrator to determine which best fits your organization's access management practices.
- From the top left navigation menu, find IAM & Admin and click Service Accounts:
- Select Create Service Account:
- Create the service account with a meaningful name and description (note that the service account ID is generated automatically), then click Create and continue:
- Search for ‘Workload Identity User’ in the Role dropdown and select it. No further configuration is needed. Click Done:
- You will now see your Service Account. Its email will follow the format:
<service_account_name>@<project_id>.iam.gserviceaccount.com
Make a note of this Service Account email as you will need it in future steps:
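As a quick sanity check on the email you noted, the sketch below splits it back into its two parts and confirms it matches the format above. The function name split_service_account_email and the sample address are hypothetical.

```python
SERVICE_ACCOUNT_SUFFIX = ".iam.gserviceaccount.com"


def split_service_account_email(email: str) -> tuple[str, str]:
    """Return (service_account_name, project_id) from a Service Account email."""
    name, at, domain = email.partition("@")
    if not name or not at or not domain.endswith(SERVICE_ACCOUNT_SUFFIX):
        raise ValueError(f"Not a Service Account email: {email!r}")
    project_id = domain[: -len(SERVICE_ACCOUNT_SUFFIX)]
    if not project_id:
        raise ValueError(f"Missing project ID in: {email!r}")
    return name, project_id


# Placeholder address for illustration only.
account_name, project_id = split_service_account_email(
    "one-model-reader@my-one-model-project.iam.gserviceaccount.com"
)
print(account_name, project_id)
```

Checking the project ID this way also helps confirm that the Service Account sits in the same project as your Workload Identity Pool.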
Grant Workload Identity Pool Permissions for Service Account
* This step may be skipped if you are using Direct Resource Access.
This step grants One Model's AWS role permission to impersonate the Service Account you identified or created in the previous step. You will apply the Principal Identifier constructed earlier to the Service Account, assigning it applicable role/s. This tells Google Cloud that One Model's AWS role is permitted to act as that Service Account when accessing your resources. Ensure your Service Account sits within the same GCP project as your Workload Identity Pool.
- Staying within IAM & Admin, go to Workload Identity Federation and click on the Pool you created earlier:
- Click Grant Access:
- Grant access to the service account. (a) Choose Grant Access using Service Account Impersonation (b) Select the Service Account that we created in the previous step (c) Choose aws_role for the Attribute Name (d) Enter the One Model AWS Role ARN into Attribute Value (e) Click Save:
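To our understanding, the console flow above corresponds to giving the Principal Identifier the Workload Identity User role (roles/iam.workloadIdentityUser) on the Service Account. The sketch below assembles the equivalent gcloud iam service-accounts add-iam-policy-binding command as an illustration. It assumes the gcloud CLI is installed and authenticated, and every value shown is a placeholder.

```python
import shlex


def grant_impersonation_command(
    service_account_email: str, principal_identifier: str
) -> list[str]:
    """Return the gcloud argument list allowing the principal to impersonate the account."""
    return [
        "gcloud", "iam", "service-accounts", "add-iam-policy-binding",
        service_account_email,
        "--role=roles/iam.workloadIdentityUser",
        f"--member={principal_identifier}",
    ]


# Placeholder values for illustration only.
impersonation_command = grant_impersonation_command(
    "one-model-reader@my-one-model-project.iam.gserviceaccount.com",
    "principalSet://iam.googleapis.com/projects/123456789/locations/global/"
    "workloadIdentityPools/one-model-pool/attribute.aws_role/"
    "arn:aws:sts::123456789012:assumed-role/om-acme-ext-data-intg",
)
print(shlex.join(impersonation_command))
```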
Grant Bucket Permissions for the Service Account
* This step may be skipped if you are using Direct Resource Access.
This step grants the Service Account permission to read files from your Google Cloud Storage bucket. It follows the same process as the earlier Grant Bucket Permissions step, but this time you will apply the Service Account's identity, rather than the Principal Identifier, to the bucket, assigning it the Storage Object Viewer role. Once saved, One Model will be able to access your bucket by impersonating this Service Account.
- Return to the Buckets menu. From the top left navigation menu, find Cloud Storage and click Buckets:
- Select the bucket you would like to use, scroll to the far right, and from the menu choose Edit access:
- In the Grant Access pop-up panel, paste your entire Service Account email (noted in Service Account Impersonation) into the New Principals box, select Storage Object Viewer from the Role dropdown and click Save:
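For reference, the grant made in the console can be pictured as a single IAM policy binding on the bucket. The sketch below builds that binding as a plain dictionary. Outside the console, IAM member strings for Service Accounts carry a serviceAccount: prefix, which the console adds for you. The function name and sample email are hypothetical.

```python
import json


def service_account_bucket_binding(service_account_email: str) -> dict:
    """Build the IAM binding that gives a Service Account read-only bucket access."""
    if "@" not in service_account_email:
        raise ValueError("Expected a Service Account email address")
    return {
        "role": "roles/storage.objectViewer",  # Storage Object Viewer
        "members": [f"serviceAccount:{service_account_email}"],
    }


# Placeholder email for illustration only.
binding = service_account_bucket_binding(
    "one-model-reader@my-one-model-project.iam.gserviceaccount.com"
)
print(json.dumps(binding, indent=2))
```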
Additional Permissions for BigQuery
* This step is only required if you are setting up a BigQuery data source and can be skipped if you are using Google Cloud Storage only.
BigQuery requires two additional roles to be granted in order for One Model to access your data: BigQuery Data Viewer, which allows One Model to read data from your BigQuery datasets, and BigQuery User, which allows One Model to run queries against them. These roles are granted whether you are using Direct Resource Access or Service Account Impersonation. The difference is whether they are applied to your Principal Identifier or your Service Account, respectively.
- From the top left navigation menu, find BigQuery and click Studio:
- Select the BigQuery Schema, click Share then Manage Permissions:
- (if configuring for Direct Resource Access) Enter the Principal Identifier, select the BigQuery Data Viewer and BigQuery User roles, then Save:
- (if configuring for Service Account Impersonation) Enter the Service Account email, select the BigQuery Data Viewer and BigQuery User roles, then Save:
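The choice described above can be summarised in a short sketch: the same two roles are always granted, and only the member they are granted to changes with the access approach. The role IDs roles/bigquery.dataViewer and roles/bigquery.user correspond to BigQuery Data Viewer and BigQuery User; the function names, the approach labels "direct" and "impersonation", and the sample values are hypothetical.

```python
from typing import Optional

# Role IDs for BigQuery Data Viewer and BigQuery User.
BIGQUERY_ROLES = ("roles/bigquery.dataViewer", "roles/bigquery.user")


def bigquery_member(
    approach: str,
    principal_identifier: Optional[str] = None,
    service_account_email: Optional[str] = None,
) -> str:
    """Pick the IAM member for the chosen access approach."""
    if approach == "direct":
        if not principal_identifier:
            raise ValueError("Direct Resource Access needs a Principal Identifier")
        return principal_identifier
    if approach == "impersonation":
        if not service_account_email:
            raise ValueError("Impersonation needs a Service Account email")
        return f"serviceAccount:{service_account_email}"
    raise ValueError(f"Unknown approach: {approach!r}")


def bigquery_bindings(member: str) -> list[dict]:
    """Return one IAM binding per required BigQuery role."""
    return [{"role": role, "members": [member]} for role in BIGQUERY_ROLES]


# Placeholder email for illustration only.
member = bigquery_member(
    "impersonation",
    service_account_email="one-model-reader@my-one-model-project.iam.gserviceaccount.com",
)
print(bigquery_bindings(member))
```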
Next Steps
You have now configured Google Workload Identity Federation for your Google Cloud Storage data source. Your Workload Identity Pool is set up, the required APIs are enabled, and One Model's AWS role has the permissions it needs to access your bucket.
To complete the connector setup, provide the following to your One Model Customer Success team:
- Project Number (retrieved in Gather the Identity Pool Project Number)
- Pool ID (retrieved in Gather the Identity Pool Project Number)
- Provider ID (retrieved in Create Workload Identity Pool)
- Service Account Email (for Service Account Impersonation configurations, retrieved in Service Account Impersonation)
Once your Customer Success team has confirmed the configuration, you can set up your Google Cloud Storage data source in One Model. If you encounter any issues, contact your One Model Customer Success team for assistance.