How to get BigQuery storage size for a single table

I want to estimate the storage cost of a single table in a large Google BigQuery data store, but I don't know how to view the storage size of each table individually.

google-bigquery
3 answers

Alternatively, from the GUI you can query the internal __TABLES__ metadata table. For example, this gives you the size in GB:

select sum(size_bytes)/pow(10,9) as size from <your_dataset>.__TABLES__ where table_id = '<your_table>' 

There are several ways to do this, but be aware that the table's size-in-bytes property is not available for tables that are actively receiving streaming inserts.

A. Using the bq command-line tool together with jq, a Linux command-line tool for parsing JSON:

 bq --format=json show publicdata:samples.gsod | jq '.numBytes | tonumber' 

This outputs:

 17290009238 
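Since the goal is a cost estimate, the byte count above still has to be converted into a billable unit. The following is only a minimal sketch in Python: the function names are made up for illustration, and the price passed in is a hypothetical placeholder, not a real BigQuery rate. Check the current pricing page for the actual rate and for whether it is quoted per decimal GB or per binary GiB.

```python
def bytes_to_gb(num_bytes: int) -> float:
    """Convert bytes to decimal gigabytes (10**9 bytes)."""
    return num_bytes / 10**9


def bytes_to_gib(num_bytes: int) -> float:
    """Convert bytes to binary gibibytes (2**30 bytes)."""
    return num_bytes / 2**30


def estimate_monthly_cost(num_bytes: int, price_per_unit: float,
                          unit_bytes: int = 10**9) -> float:
    """Rough monthly storage cost.

    price_per_unit is an assumption supplied by the caller (taken from the
    pricing page); unit_bytes is the size of the unit the price refers to.
    """
    return (num_bytes / unit_bytes) * price_per_unit


if __name__ == "__main__":
    size = 17290009238  # numBytes reported for publicdata:samples.gsod
    print(f"{bytes_to_gb(size):.2f} GB")    # prints "17.29 GB"
    print(f"{bytes_to_gib(size):.2f} GiB")
    # 0.02 is a placeholder price, not an actual BigQuery rate.
    print(f"{estimate_monthly_cost(size, price_per_unit=0.02):.4f} per month")
```

Keeping the price and unit as parameters means the same helper still works if the rate changes or if long-term storage is priced differently.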

B. Using the REST API, make a Tables: get call:

 GET https://www.googleapis.com/bigquery/v2/projects/projectId/datasets/datasetId/tables/tableId 

This returns the full JSON for the table, which you can parse to get numBytes:

 {
   "kind": "bigquery#table",
   "description": "This dataset contains weather information collected by NOAA, such a…",
   "creationTime": "1335916040125",
   "tableReference": { "projectId": "publicdata", "tableId": "gsod", "datasetId": "samples" },
   "numRows": "114420316",
   "numBytes": "17290009238",
   "etag": "\"Gn3Hpo5WaKnpFuT457VBDNMgZBw/MTQxMzkzNzk4Nzg0Ng\"",
   "location": "US",
   "lastModifiedTime": "1413937987846",
   "type": "TABLE",
   "id": "publicdata:samples.gsod",
   "selfLink": "https://www.googleapis.com/bigquery/v2/projects/publicdata/datasets…",
   "schema": {
     "fields": [
       { "description": "The World Meteorological Organization (WMO) / DATSAV3 station numbe…", "type": "INTEGER", "name": "station_number", "mode": "REQUIRED" },
       { "description": "The Weather-Bureau-Army-Navy (WBAN) station number where the data w…", "type": "INTEGER", "name": "wban_number", "mode": "NULLABLE" },
       { "description": "The year the data was collected in", "type": "INTEGER", "name": "year", "mode": "REQUIRED" },
       { "description": "The month the data was collected in", "type": "INTEGER", "name": "month", "mode": "REQUIRED" },
       { "description": "The day the data was collected in.", "type": "INTEGER", "name": "day", "mode": "REQUIRED" },
       { "description": "The mean temperature of the day in degrees Fahrenheit, accurate to …", "type": "FLOAT", "name": "mean_temp", "mode": "NULLABLE" },
       { "description": "The number of observations used to calculate mean_temp.", "type": "INTEGER", "name": "num_mean_temp_samples", "mode": "NULLABLE" },
       { "description": "The mean dew point of the day in degrees Fahrenheit, accurate to on…", "type": "FLOAT", "name": "mean_dew_point", "mode": "NULLABLE" },
       { "description": "The number of observations used to calculate mean_dew_point.", "type": "INTEGER", "name": "num_mean_dew_point_samples", "mode": "NULLABLE" },
       { "description": "The mean sea level pressure of the day in millibars, accurate to on…", "type": "FLOAT", "name": "mean_sealevel_pressure", "mode": "NULLABLE" },
       { "description": "The number of observations used to calculate mean_sealevel_pressure…", "type": "INTEGER", "name": "num_mean_sealevel_pressure_samples", "mode": "NULLABLE" },
       { "description": "The mean station pressure of the day in millibars, accurate to one …", "type": "FLOAT", "name": "mean_station_pressure", "mode": "NULLABLE" },
       { "description": "The number of observations used to calculate mean_station_pressure.…", "type": "INTEGER", "name": "num_mean_station_pressure_samples", "mode": "NULLABLE" },
       { "description": "The mean visibility of the day in miles, accurate to one tenth of a…", "type": "FLOAT", "name": "mean_visibility", "mode": "NULLABLE" },
       { "description": "The number of observations used to calculate mean_visibility.", "type": "INTEGER", "name": "num_mean_visibility_samples", "mode": "NULLABLE" },
       { "description": "The mean wind speed of the day in knots, accurate to one tenth of a…", "type": "FLOAT", "name": "mean_wind_speed", "mode": "NULLABLE" },
       { "description": "The number of observations used to calculate mean_wind_speed.", "type": "INTEGER", "name": "num_mean_wind_speed_samples", "mode": "NULLABLE" },
       { "description": "The maximum sustained wind speed reported on the day in knots, accu…", "type": "FLOAT", "name": "max_sustained_wind_speed", "mode": "NULLABLE" },
       { "description": "The maximum wind gust speed reported on the day in knots, accurate …", "type": "FLOAT", "name": "max_gust_wind_speed", "mode": "NULLABLE" },
       { "description": "The maximum temperature of the day in degrees Fahrenheit, accurate …", "type": "FLOAT", "name": "max_temperature", "mode": "NULLABLE" },
       { "description": "Indicates the source of max_temperature.", "type": "BOOLEAN", "name": "max_temperature_explicit", "mode": "NULLABLE" },
       { "description": "The minimum temperature of the day in degrees Fahrenheit, accurate …", "type": "FLOAT", "name": "min_temperature", "mode": "NULLABLE" },
       { "description": "Indicates the source of min_temperature.", "type": "BOOLEAN", "name": "min_temperature_explicit", "mode": "NULLABLE" },
       { "description": "The total precipitation of the day in inches, accurate to one hundr…", "type": "FLOAT", "name": "total_precipitation", "mode": "NULLABLE" },
       { "description": "The snow depth of the day in inches, accurate to one tenth of an in…", "type": "FLOAT", "name": "snow_depth", "mode": "NULLABLE" },
       { "description": "Indicates if fog was reported on this day.", "type": "BOOLEAN", "name": "fog", "mode": "NULLABLE" },
       { "description": "Indicates if rain was reported on this day.", "type": "BOOLEAN", "name": "rain", "mode": "NULLABLE" },
       { "description": "Indicates if snow was reported on this day.", "type": "BOOLEAN", "name": "snow", "mode": "NULLABLE" },
       { "description": "Indicates if hail was reported on this day.", "type": "BOOLEAN", "name": "hail", "mode": "NULLABLE" },
       { "description": "Indicates if thunder was reported on this day.", "type": "BOOLEAN", "name": "thunder", "mode": "NULLABLE" },
       { "description": "Indicates if a tornado was reported on this day.", "type": "BOOLEAN", "name": "tornado", "mode": "NULLABLE" }
     ]
   }
 }

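Note that numBytes and numRows come back as JSON strings, not numbers, so they need converting before any arithmetic. A minimal Python sketch of pulling them out of a Tables: get response might look like this. The helper name table_stats and the trimmed sample string are assumptions for illustration; fetching the response (authentication, HTTP client) is deliberately left out.

```python
import json

# Trimmed copy of the response above; only the fields used here are kept.
SAMPLE_RESPONSE = """
{
  "kind": "bigquery#table",
  "tableReference": {"projectId": "publicdata", "tableId": "gsod", "datasetId": "samples"},
  "numRows": "114420316",
  "numBytes": "17290009238",
  "id": "publicdata:samples.gsod"
}
"""


def table_stats(response_text: str) -> dict:
    """Extract the table id, row count and size in bytes from the JSON text.

    The counts are returned as ints, or None when the field is missing
    (for example, while a table is receiving streaming inserts).
    """
    table = json.loads(response_text)
    num_bytes = table.get("numBytes")
    num_rows = table.get("numRows")
    return {
        "id": table.get("id"),
        "num_rows": int(num_rows) if num_rows is not None else None,
        "num_bytes": int(num_bytes) if num_bytes is not None else None,
    }


if __name__ == "__main__":
    print(table_stats(SAMPLE_RESPONSE))
```

Returning None instead of raising keeps the caller in control of how to treat tables whose size is not yet reported.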
C. There are meta-tables called __TABLES__ and __TABLES_SUMMARY__.

You can run a query, for example:

 SELECT size_bytes FROM <dataset>.__TABLES__ WHERE table_id='mytablename' 

The __TABLES__ part of this query may look unfamiliar. __TABLES_SUMMARY__ is a meta-table containing information about the tables in a dataset, and you can query it yourself. For example, SELECT * FROM publicdata:samples.__TABLES_SUMMARY__ returns metadata about the tables in the publicdata:samples dataset. You can also run SELECT * FROM publicdata:samples.__TABLES__

Available fields:

The fields of the __TABLES_SUMMARY__ meta-table (all of which are available in a TABLE_QUERY() query) include:

  • table_id : table name.
  • creation_time : time, in milliseconds since 01/01/1970 UTC, at which the table was created. This is the same as the creation_time field on the table.
  • type : whether it is a view (2) or a regular table (1).

The following fields are not available in TABLE_QUERY() because they are members of __TABLES__ but not of __TABLES_SUMMARY__ . They are kept here for historical interest and to partially document the __TABLES__ meta-table:

  • last_modified_time : time, in milliseconds since 01/01/1970 UTC, at which the table was last updated (either its metadata or its contents). Please note: if you use tabledata.insertAll() to stream records to your table, this may be a few minutes out of date.
  • row_count : the number of rows in the table.
  • size_bytes : total size in bytes of the table.
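The timestamp fields above are raw millisecond counts, which are hard to read at a glance. As a small illustration (the helper name is hypothetical), this Python sketch converts such a value into a timezone-aware UTC datetime, using the creationTime value from the Tables: get response shown earlier as sample input:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def millis_to_datetime(millis) -> datetime:
    """Convert milliseconds since 01/01/1970 UTC into an aware datetime.

    Accepts an int or a numeric string, since the REST API returns these
    values as strings. timedelta keeps the arithmetic exact.
    """
    return EPOCH + timedelta(milliseconds=int(millis))


if __name__ == "__main__":
    # creationTime of publicdata:samples.gsod from the JSON response.
    print(millis_to_datetime("1335916040125").isoformat())
```

The same helper applies to creation_time and last_modified_time values read from __TABLES__.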

You can do this with the bq command-line tool:

bq show ds_name.table_name

It displays some information about the table, including "Total Bytes". Reference: https://cloud.google.com/bigquery/bq-command-line-tool
