Google BigQuery Integration
Google BigQuery can be used as a PIP Data Source when you need data Attributes served from BigQuery to be used in Policy logic.
This can be achieved by using a JDBC Adapter in PIP, using the Google BigQuery JDBC driver.
Initial Setup
To complete the initial setup in GBQ, the following initial steps are required:
- In GBQ, create a Service Account with the following roles:
- BigQuery Connection User
- BigQuery Data Viewer
- BigQuery User
- Download the service account key to a safe location for future use (as detailed below).
Authorization Platform Setup
A PAA instance should include the GBQ JDBC driver. You can get the GBQ JDBC driver using the following link: SimbaJDBCDriverforGoogleBigQuery42_1.3.0.1001.zip
Containerized PAA
-
Place the driver in the relevant location. You need to include the driver in the
custom-values.yaml
as described in the PlainID Developer Portal here. -
Add the following lines to the
custom-values.yaml
file:
You will reference this Secret in the Data Source connection configuration as described below.
Standalone/VM PAA
- Place the driver files in the PIP adapters drivers location:
$PLAINID_HOME/pip-operator/lib/
- Unzip the
SimbaJDBCDriverforGoogleBigQuery42_1.3.0.1001.zip
file you downloaded into$PLAINID_HOME/pip-operator/lib
. - After unzipping the file, you should have the driver jar file
GoogleBigQueryJDBC42.jar
and a libs folder with many other required files. - From the lib folder, you need to delete
slf4j-api-*.jar
. - From the lib folder, you can delete the original zip file.
- Unzip the
- Place the GBQ service account key (json file) at
$PLAINID_HOME/pip-operator/conf
PIP Settings
After the initial setup, you can create a new Data Source with JDBC adapter in the PIP UI and use it as a View. Follow the instructions of Data Sources, Models, and Views creation under the JDBC adapter documentation here.
Integration Steps
To integrate Google BigQuery:
- In the Connection Adapter drop-down list, select the Simple JDBC adapter.
- For the Driver Class Name, you should refer to the newly added GBQ driver with this value:
com.simba.googlebigquery.jdbc.Driver
In the Policy Authorization Agent section, in the URL field use this value with your specific details:
jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=<BigQuery_Project_ID>;OAuthType=0;OAuthServiceAcctEmail=<Service_Account_Name>;OAuthPvtKeyPath=<GBQ_KEY_PATH>;DefaultDataset=<Default_Dataset>;QueryDialect=SQL/app/conf/plainid-presales-f3509fa53b6e.json
Make the following replacements above:
- BigQuery_Project_ID - Replace this param with your projectID
- Service_Account_Name - Replace this parameter with your Service Account details, ex. bq@example.iam.gserviceaccount.com
- GBQ_KEY_PATH - Replace this param with the Secret Key location as you configured above:
- For containerized -
/app/conf/plainid-presales-f3509fa53b6e.json
- For VM PAA -
/opt/plainid/pip-operator/conf/plainid-presales-f3509fa53b6e.json
- For containerized -
- In the Models section, make sure you have these keys:
- Key - importer.tableTypes, Value - TABLE,VIEW
- Key - importer.schemaPattern, Value - demo
- Click on the Generate DDL button or create foreign tables manually.
NOTE: If you generate DDL, all your GBQ schema will be created in the model.
NOTE: If you generate DDL, you will need to delete the option 'name in source' from each table in the DDL.
- Save the new Data Source and continue your PIP configuration as described in the PIP section of the documentation considering the GBQ source as a SQL source.