Database

BigQuery



source: documentation: author: supabase tags:

  • native
  • official

BigQuery

BigQuery is a completely serverless and cost-effective enterprise data warehouse that works across clouds and scales with your data, with BI, machine learning and AI built in.

The BigQuery Wrapper allows you to read and write data from BigQuery within your Postgres database.

Supported Data Types

Postgres TypeBigQuery Type
booleanBOOL
bigintINT64
double precisionFLOAT64
numericNUMERIC
textSTRING
varcharSTRING
dateDATE
timestampDATETIME
timestampTIMESTAMP
timestamptzTIMESTAMP

Preparation

Before you get started, make sure the wrappers extension is installed on your database:


_10
create extension if not exists wrappers with schema extensions;

and then create the foreign data wrapper:


_10
create foreign data wrapper bigquery_wrapper
_10
handler big_query_fdw_handler
_10
validator big_query_fdw_validator;

Secure your credentials (optional)

By default, Postgres stores FDW credentials inide pg_catalog.pg_foreign_server in plain text. Anyone with access to this table will be able to view these credentials. Wrappers is designed to work with Vault, which provides an additional level of security for storing credentials. We recommend using Vault to store your credentials.


_15
-- Save your BigQuery service account json in Vault and retrieve the `key_id`
_15
insert into vault.secrets (name, secret)
_15
values (
_15
'bigquery',
_15
'
_15
{
_15
"type": "service_account",
_15
"project_id": "your_gcp_project_id",
_15
"private_key_id": "your_private_key_id",
_15
"private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
_15
...
_15
}
_15
'
_15
)
_15
returning key_id;

Connecting to BigQuery

We need to provide Postgres with the credentials to connect to BigQuery, and any additional options. We can do this using the create server command:


_10
create server bigquery_server
_10
foreign data wrapper bigquery_wrapper
_10
options (
_10
sa_key_id '<key_ID>', -- The Key ID from above.
_10
project_id 'your_gcp_project_id',
_10
dataset_id 'your_gcp_dataset_id'
_10
);

Creating Foreign Tables

The BigQuery Wrapper supports data reads and writes from BigQuery.

IntegrationSelectInsertUpdateDeleteTruncate
BigQuery

For example:


_10
create foreign table my_bigquery_table (
_10
id bigint,
_10
name text,
_10
ts timestamp
_10
)
_10
server bigquery_server
_10
options (
_10
table 'people',
_10
location 'EU'
_10
);

Foreign table options

The full list of foreign table options are below:

  • table - Source table or view name in BigQuery, required.

    This can also be a subquery enclosed in parentheses, for example,


    _10
    table '(select * except(props), to_json_string(props) as props from `my_project.my_dataset.my_table`)'

    Note: When using subquery in this option, full qualitified table name must be used.

  • location - Source table location, optional. Default is 'US'.

  • timeout - Query request timeout in milliseconds, optional. Default is '30000' (30 seconds).

  • rowid_column - Primary key column name, optional for data scan, required for data modify

Query Pushdown Support

This FDW supports where, order by and limit clause pushdown.

Inserting Rows & the Streaming Buffer

This foreign data wrapper uses BigQuery’s insertAll API method to create a streamingBuffer with an associated partition time. Within that partition time, the data cannot be updated, deleted, or fully exported. Only after the time has elapsed (up to 90 minutes according to BigQuery’s documentation); can you perform operations.

If you attempt an UPDATE or DELETE statement on rows while in the streamingBuffer, you will get an error of UPDATE or DELETE statement over table datasetName - note that tableName would affect rows in the streaming buffer, which is not supported.

Examples

Some examples on how to use BigQuery foreign tables.

Let's prepare the source table in BigQuery first:


_12
-- Run below SQLs on BigQuery to create source table
_12
create table your_project_id.your_dataset_id.people (
_12
id int64,
_12
name string,
_12
ts timestamp
_12
);
_12
_12
-- Add some test data
_12
insert into your_project_id.your_dataset_id.people values
_12
(1, 'Luke Skywalker', current_timestamp()),
_12
(2, 'Leia Organa', current_timestamp()),
_12
(3, 'Han Solo', current_timestamp());

Basic example

This example will create a "foreign table" inside your Postgres database called people and query its data:


_12
create foreign table people (
_12
id bigint,
_12
name text,
_12
ts timestamp
_12
)
_12
server bigquery_server
_12
options (
_12
table 'people',
_12
location 'EU'
_12
);
_12
_12
select * from people;

Data modify example

This example will modify data in a "foreign table" inside your Postgres database called people, note that rowid_column option is mandatory:


_24
create foreign table people (
_24
id bigint,
_24
name text,
_24
ts timestamp
_24
)
_24
server bigquery_server
_24
options (
_24
table 'people',
_24
location 'EU',
_24
rowid_column 'id'
_24
);
_24
_24
-- insert new data
_24
insert into people(id, name, ts)
_24
values (4, 'Yoda', '2023-01-01 12:34:56');
_24
_24
-- update existing data
_24
update people
_24
set name = 'Anakin Skywalker'
_24
where id = 1;
_24
_24
-- delete data
_24
delete from people
_24
where id = 2;