This is an Embulk input plugin that reads data from BigQuery.
Install it yourself as:
$ embulk gem install embulk-input-bigquery
in:
  type: bigquery
  project: 'project-name'
  keyfile: '/home/hogehoge/bigquery-keyfile.json'
  sql: 'SELECT price,category_id FROM [ecsite.products] GROUP BY category_id'
  columns:
    - {name: price, type: long}
    - {name: category_id, type: string}
  max: 2000
out:
  type: stdout
If the table name changes between runs (for example, it contains a date), use sql_erb and erb_params instead of sql:
in:
  type: bigquery
  project: 'project-name'
  keyfile: '/home/hogehoge/bigquery-keyfile.json'
  sql_erb: 'SELECT price,category_id FROM [ecsite.products_<%= params["date"].strftime("%Y%m") %>] GROUP BY category_id'
  erb_params:
    date: "require 'date'; (Date.today - 1)"
  columns:
    - {name: price, type: long}
    - {name: category_id, type: long}
    - {name: month, type: timestamp, format: '%Y-%m', eval: 'require "time"; Time.parse(params["date"]).to_i'}
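
To make the sql_erb and erb_params mechanism concrete, here is a minimal Ruby sketch of how such a template could be rendered. It is not the plugin's actual code: the helper name `render_sql` is hypothetical, and the sketch assumes that each erb_params value is evaluated as a Ruby expression and exposed to the template as `params`, which is what the configuration above suggests.

```ruby
require 'erb'
require 'date'

# Hypothetical helper (an assumption, not the plugin's implementation):
# evaluates every erb_params value as a Ruby expression, then renders the
# sql_erb template with the results available as `params`.
def render_sql(sql_erb, erb_params)
  # eval runs arbitrary Ruby, so only use it on trusted configuration files.
  params = erb_params.transform_values { |expr| eval(expr) }
  # The template sees the local variable `params` through this binding.
  ERB.new(sql_erb).result(binding)
end

sql_erb = 'SELECT price,category_id FROM [ecsite.products_<%= params["date"].strftime("%Y%m") %>] GROUP BY category_id'
erb_params = { 'date' => "require 'date'; (Date.today - 1)" }

# Prints the query with the table suffix set to yesterday's year and month.
puts render_sql(sql_erb, erb_params)
```

Under these assumptions, the table suffix is recomputed on every run, so a scheduled job always queries the table for the month that contains yesterday's date.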
Example without the columns option:
in:
  type: bigquery
  project: 'project-name'
  keyfile: '/home/hogehoge/bigquery-keyfile.json'
  sql: 'SELECT price,category_id FROM [ecsite.products] GROUP BY category_id'
out:
  type: stdout
This plugin uses the google-cloud gem (Google Cloud Client Library for Ruby)
and queries data using the synchronous method.
Some optional configuration items therefore follow the behavior of the Google Cloud Client Library.
- max:
  - default value: null. A null value is interpreted as no maximum row count by the Google Cloud Client Library.
- cache:
  - default value: null. A null value is interpreted as true by the Google Cloud Client Library.
- standard_sql:
  - default value: null. A null value is interpreted as true by the Google Cloud Client Library.
- legacy_sql:
  - default value: null. A null value is interpreted as false by the Google Cloud Client Library.
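
The defaults listed above can be summarized in a short Ruby sketch. It only restates the documented interpretation of null values; the constant `DOCUMENTED_DEFAULTS` and the helper `effective_options` are hypothetical names introduced for illustration, and the sketch does not call the Google Cloud Client Library.

```ruby
# Effective values for options left as null, as described in the list above.
DOCUMENTED_DEFAULTS = {
  max: nil,            # nil means no maximum row count
  cache: true,
  standard_sql: true,
  legacy_sql: false
}.freeze

# Hypothetical helper: returns the value each option effectively takes,
# falling back to the documented default when the config value is nil.
def effective_options(config)
  DOCUMENTED_DEFAULTS.each_with_object({}) do |(key, default), result|
    value = config[key]
    # Use nil? rather than ||, so an explicit false is preserved.
    result[key] = value.nil? ? default : value
  end
end

# Only max is set; the other options take their documented defaults.
p effective_options(max: 2000)

# Explicit values override the defaults, including false.
p effective_options(legacy_sql: true, standard_sql: false)
```

Checking with `nil?` matters here because cache and standard_sql can legitimately be set to false, which a plain `||` fallback would silently replace with the default.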