Decoding URLs in BigQuery

Is there an easy way to decode URLs in the BigQuery query language? I am working with a table that has a column in which some values contain URL-encoded strings. For instance:

http://xyz.com/example.php?url=http%3A%2F%2Fwww.example.com%2Fhello%3Fv%3D12345&foo=bar&abc=xyz 

I retrieve the url parameter as follows:

 SELECT REGEXP_EXTRACT(column_name, "url=([^&]+)") as url from [mydataset.mytable] 

which gives me:

 http%3A%2F%2Fwww.example.com%2Fhello%3Fv%3D12345 

What I would like to do is something like:

 SELECT URL_DECODE(REGEXP_EXTRACT(column_name, "url=([^&]+)")) as url from [mydataset.mytable] 

thereby returning:

 http://www.example.com/hello?v=12345 

I would like to avoid using multiple REGEXP_REPLACE() calls (replacing %20, %3A, etc.) if possible.

Ideas?

2 answers

This is a good feature request, but there is currently no BigQuery built-in function that provides URL decoding.


Another workaround is to define a JavaScript user-defined function (UDF) in standard SQL.

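 #standardSQL
 CREATE TEMPORARY FUNCTION URL_DECODE(enc STRING)
 RETURNS STRING
 LANGUAGE js AS """
   // decodeURIComponent also decodes reserved characters such as %3A, %2F and %3F,
   // which decodeURI leaves untouched.
   try {
     return decodeURIComponent(enc);
   } catch (e) {
     // Malformed escape sequences yield NULL instead of failing the query.
     return null;
   }
 """;

 SELECT
   ven_session,
   -- [^&]* captures the whole kw value up to the next '&'.
   URL_DECODE(REGEXP_EXTRACT(para, r'&kw=([^&]*)')) AS q
 FROM raas_system.weblog_20170327
 WHERE para LIKE '%&kw=%'
 LIMIT 10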
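
The body of such a UDF is ordinary JavaScript, so it can be tried outside BigQuery first. The sketch below is only an illustration: `urlDecode` is a hypothetical helper name, and the sample string is the one from the question. It shows that `decodeURI` leaves reserved characters such as `%3A` and `%2F` encoded, whereas `decodeURIComponent` decodes them.

```javascript
// Hypothetical helper mirroring what a URL_DECODE UDF body could do.
function urlDecode(enc) {
  // Pass missing input through as null rather than decoding the string "null".
  if (enc === null || enc === undefined) return null;
  try {
    return decodeURIComponent(enc);
  } catch (e) {
    // decodeURIComponent throws URIError on malformed escapes such as "%".
    return null;
  }
}

const sample = "http%3A%2F%2Fwww.example.com%2Fhello%3Fv%3D12345";

console.log(decodeURI(sample)); // http%3A%2F%2Fwww.example.com%2Fhello%3Fv%3D12345
console.log(urlDecode(sample)); // http://www.example.com/hello?v=12345
console.log(urlDecode("%"));    // null
```

Note that neither function turns `+` into a space, so form-encoded query strings may need an extra replacement step before decoding.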