Google BigQuery
    • 08 Jun 2023
    • 5 Minutes to read
    • Dark
      Light
    • PDF

    Google BigQuery

    • Dark
      Light
    • PDF

    Article summary

    Prerequisites

    To build Policies for data, both row and column levels should be considered. When setting up the building blocks for fine-grained data access in the Authorization Platform, the following structure must be followed:

    • Asset Type for row level access control: This Asset Type represents the local representation of the table within the Authorization Platform.
    • Asset Type for column level access control: This Asset Type defines the actual columns that access should (or should not) be provided access to.

    Asset Type for Row Level Access Control

    For each table in which a row level access decision needs to be provided, an Asset Type needs to be created.

    The Asset Type should contain the Attributes needed to enforce access for the specific table and the Asset Type is used for data filtering field should be set to Yes. Assets are managed in the Authorization Request.

    image.png

    The following naming conventions need to be followed:

    • The Asset Type name must match the format “project_ID.dataset.table_name”, based on the names set in Google BigQuery.

    Note: To support global Policies, it is possible to use an asterisk instead of specific data set or project id.

    • The Asset Attribute names in the Asset Type should match the column names of the table to be used for filtering – please note these names are case sensitive.
    • The supported types of Asset Attributes for both row and column level attributes are String and Numeric. For example, in the following table, the relevant Attributes to be used to enforce access on the data are branch (String type) and balance (Numeric type).

    image.png

    Therefore, the Asset Type for this table will contain these two Asset Attributes, as shown below:

    image.png

    The Policy decision from Asset Types that are configured to be used for data filtering (in the Asset Types is used for data filtering field) are used to build the “where” clause of the query.

    The Row level Asset Type can be created through the Authorization Platform UI.

    Asset Type for Column Level Access Control

    The Asset Type to support column level must come from either an External or PlainID Asset Source. The Asset Type represents which columns the user can access in the database. The Access Request will ask which column the user can access and the Access Response will return the names of the columns the user can access.

    Columns in the Policy are managed as actual Assets. The deciding factor on which Asset Source you select is based on where the PDP is located.

    • If the organization is using a Cloud-based PDP, then the Asset Type to support column level assets should be configured as PlainID.
    • If the organization is using a custom-hosted deployment where the PDP is hosted locally, then the Asset Type to support column level assets should be configured as External. This means that the Assets will be managed externally and you will create a query to create the table where the Assets are managed.

    The following query can be used to create the Asset table source.

    CREATE TABLE public.column_data (
    dataset_name varchar NULL,
    table_name varchar NULL,
    column_name varchar NULL,
    classification varchar NULL
    );
    CREATE INDEX newtable_dataset_name_idx ON public.column_data USING btree (dataset_name, table_name, column_name, classification);
    

    One column level Asset Type should be created for each Environment. This Asset Type supports column level access to multiple tables in Google BigQuery. There are no constraints on the name of this Asset Type, but the Asset Types is used for data filtering must be set to No, as shown below.

    image.png

    After creating the Asset Type, you need to create Asset Attributes the column level access. The following Asset Attributes are mandatory:

    • column – the name of the column where access should be enforced in Google BigQuery.
    • projectid – the unique ID of the project where the table is located in Google BigQuery.
    • dataset – the name of the dataset where the table is located in Google BigQuery).
    • table – the name of the table in Google BigQuery.

    Note: The names to be used in the above Asset Attributes are the default setting in the configuration. It is possible to change this default, however this requires a configuration change.

    In addition to these Asset Attributes, this table should include:

    • Any Asset Attribute based on which you wish the Access Decision to be calculated. These additional Asset Attributes are dictated by the customer’s use case. For example, if you wish to include “classification” (private, confidential, etc.) to filter the columns, you will need to create a classification Asset Attribute.
    • Asset ID – a unique identifier for the column, as columns across different table can share a common name.

    The Policy decision from the Asset Type that is not used for data filtering, is used to build the selectable columns of the query.

    Since the tables in Google BigQuery are in a specific dataset, and each dataset is under a specific project, each Asset Attribute in this Asset Type represents a column in a table.

    In the following example, we see the column balance in the table account, in the dataset titled ds1, in the project titled project1.

    image.png

    Policy Structure

    Policies set up to support fine-grained access to data are built as any other Policy in the Authorization Platform, with one exception. The Asset side of the Policy consists of two Asset Types, one that determines access at the row-level and one for the column-level.

    Sample Policy

    This Policy allows Account Managers to access account information that is classified as Public, for accounts that are managed in the New York branch.

    The Asset side of the Policy is comprised of two Asset Types:

    • Row level access: project1.ds.account (specified for data filtering)
    • Column level access: tables_columns (not specified for data filtering).

    Following is the Policy Map for this sample Policy.

    Policy Map

    The Rulesets to create this Policy are created in each of the two Asset Types, as detailed in the following sections.

    Row level:

    Row level

    Column level:

    Column level

    Note: The request.asset rules ("request.table", "request.dataset", and "request.projectid") must be included in Asset Types dedicated for column level access. These Rules are used for additional filtering of the response based on the table being accessed at the time of request. As there may be columns with the same name across multiple different tables, datasets, and project IDs.

    Sample Policy Resolution Response

    The resulting access for an Account Manager for the query to create the Asset table source will be the actual columns in the table. These are returned in the access section with the data filtered according to the filter returning in the privileges section, as shown in the following code samples.

    Sample Column Level Access Response

    Column Level Access Response

    Sample Row Level Access Response

    Row Level Access Response


    Was this article helpful?