Google BigQuery Data Source
    • 29 Sep 2024
    • 2 Minutes to read
    • Dark
      Light
    • PDF

    Google BigQuery Data Source

    • Dark
      Light
    • PDF

    Article summary

    Google BigQuery Integration

    Google BigQuery can be used as a PIP Data Source when you need data Attributes served from BigQuery to be used in Policy logic.

    This can be achieved by using a JDBC Adapter in PIP, using the Google BigQuery JDBC driver.

    Initial Setup

    To complete the initial setup in GBQ, the following initial steps are required:

    1. In GBQ, create a Service Account with the following roles:
      • BigQuery Connection User
      • BigQuery Data Viewer
      • BigQuery User
    2. Download the service account key to a safe location for future use (as detailed below).

    Authorization Platform Setup

    A PAA instance should include the GBQ JDBC driver. You can get the GBQ JDBC driver using the following link: SimbaJDBCDriverforGoogleBigQuery42_1.3.0.1001.zip

    Containerized PAA

    1. Place the driver in the relevant location. You need to include the driver in the custom-values.yaml as described in the PlainID Developer Portal here.

    2. Add the following lines to the custom-values.yaml file:

    image

    You will reference this Secret in the Data Source connection configuration as described below.

    Standalone/VM PAA

    • Place the driver files in the PIP adapters drivers location: $PLAINID_HOME/pip-operator/lib/
      • Unzip the SimbaJDBCDriverforGoogleBigQuery42_1.3.0.1001.zip file you downloaded into $PLAINID_HOME/pip-operator/lib.
      • After unzipping the file, you should have the driver jar file GoogleBigQueryJDBC42.jar and a libs folder with many other required files.
      • From the lib folder, you need to delete slf4j-api-*.jar.
      • From the lib folder, you can delete the original zip file.
    • Place the GBQ service account key (json file) at $PLAINID_HOME/pip-operator/conf

    PIP Settings

    After the initial setup, you can create a new Data Source with JDBC adapter in the PIP UI and use it as a View. Follow the instructions of Data Sources, Models, and Views creation under the JDBC adapter documentation here.

    Integration Steps

    To integrate Google BigQuery:

    1. In the Connection Adapter drop-down list, select the Simple JDBC adapter.
    2. For the Driver Class Name, you should refer to the newly added GBQ driver with this value: com.simba.googlebigquery.jdbc.Driver

    image

    In the Policy Authorization Agent section, in the URL field use this value with your specific details:

    jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=<BigQuery_Project_ID>;OAuthType=0;OAuthServiceAcctEmail=<Service_Account_Name>;OAuthPvtKeyPath=<GBQ_KEY_PATH>;DefaultDataset=<Default_Dataset>;QueryDialect=SQL/app/conf/plainid-presales-f3509fa53b6e.json

    Make the following replacements above:

    • BigQuery_Project_ID - Replace this param with your projectID
    • Service_Account_Name - Replace this parameter with your Service Account details, ex. bq@example.iam.gserviceaccount.com
    • GBQ_KEY_PATH - Replace this param with the Secret Key location as you configured above:
      • For containerized - /app/conf/plainid-presales-f3509fa53b6e.json
      • For VM PAA - /opt/plainid/pip-operator/conf/plainid-presales-f3509fa53b6e.json
    1. In the Models section, make sure you have these keys:
    • Key - importer.tableTypes, Value - TABLE,VIEW
    • Key - importer.schemaPattern, Value - demo
    1. Click on the Generate DDL button or create foreign tables manually.

    NOTE: If you generate DDL, all your GBQ schema will be created in the model.

    NOTE: If you generate DDL, you will need to delete the option 'name in source' from each table in the DDL.

    K8s PAA

    VM PAA

    1. Save the new Data Source and continue your PIP configuration as described in the PIP section of the documentation considering the GBQ source as a SQL source.

    Was this article helpful?