How to set variables in HIVE scripts

I am looking for the Hive QL equivalent of SQL's SET varname = value.

I know I can do something like this:

 SET CURRENT_DATE = '2012-09-16';
 SELECT * FROM foo WHERE day >= @CURRENT_DATE

But then I get this error:

character '@' not supported here

+93
variables set constants hadoop hive hiveql
Sep 17
8 answers

For variable substitution you need to use the special hiveconf namespace, e.g.:

 hive> set CURRENT_DATE='2012-09-16';
 hive> select * from foo where day >= '${hiveconf:CURRENT_DATE}'

Similarly, you can pass the value on the command line:

 % hive -hiveconf CURRENT_DATE='2012-09-16' -f test.hql 
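As a minimal sketch of the command-line form, a wrapper script could compute the value in the shell and hand it to Hive. This assumes test.hql exists in the current directory and references '${hiveconf:CURRENT_DATE}'; the fallback branch is only an illustration so the script still does something on a machine without Hive.

```shell
#!/bin/sh
# Compute today's date in the shell (YYYY-MM-DD).
CURRENT_DATE=$(date +%Y-%m-%d)

if command -v hive >/dev/null 2>&1; then
    # Assumption: test.hql exists and uses '${hiveconf:CURRENT_DATE}'.
    hive -hiveconf CURRENT_DATE="$CURRENT_DATE" -f test.hql
else
    # Hive is not installed here; just show the command that would be run.
    echo "hive not found; would run: hive -hiveconf CURRENT_DATE=$CURRENT_DATE -f test.hql"
fi
```

Note that the shell strips its own quotes before Hive sees the argument, so the value arrives as a bare 2012-09-16 style string and the quotes in the query supply the string literal.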

Note that there are env and system variables, so you can reference ${env:USER} for example.

To view all available variables, from the command line, run

 % hive -e 'set;' 

or from the hive prompt, run

 hive> set; 

Update: I have also started using hivevar variables, putting them into hql snippets that I can include from the Hive CLI using the source command (or pass with the -i option on the command line). The benefit here is that the variable can then be used with or without the hivevar prefix, which allows something akin to global versus local usage.

So, suppose I have setup.hql that sets the tablename variable:

 set hivevar:tablename=mytable; 

then I can load it into Hive:

 hive> source /path/to/setup.hql; 

and use it in a query:

 hive> select * from ${tablename} 

or

 hive> select * from ${hivevar:tablename} 

I could also set a "local" tablename, which would affect the use of ${tablename} but not ${hivevar:tablename}:

 hive> set tablename=newtable;
 hive> select * from ${tablename} -- uses 'newtable'

versus

 hive> select * from ${hivevar:tablename} -- still uses the original 'mytable'

This probably doesn't matter much from the CLI, but it lets you keep hql in a file that uses source while setting some of the variables "locally" for use in the rest of the script.
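The setup.hql approach can also be driven from a shell script with the -i option mentioned above. The following is only a sketch: the temporary directory and the query.hql file name are hypothetical, and mytable is the placeholder table from the answer.

```shell
#!/bin/sh
# Hypothetical scratch directory for the two hql files.
workdir=$(mktemp -d)

# A quoted heredoc delimiter ('EOF') stops the shell from expanding ${...},
# so Hive receives the variable references literally.
cat > "$workdir/setup.hql" <<'EOF'
set hivevar:tablename=mytable;
EOF

cat > "$workdir/query.hql" <<'EOF'
select * from ${tablename};
EOF

if command -v hive >/dev/null 2>&1; then
    # -i runs setup.hql as an initialization file before query.hql.
    hive -i "$workdir/setup.hql" -f "$workdir/query.hql"
else
    echo "hive not found; files written to $workdir"
fi
```

Keeping the variable definitions in their own file means the same query file can be reused with a different setup file.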

+188
Sep 18

Most answers here suggest using the hiveconf or hivevar to store the variable. And all these answers are correct. However, there is another namespace.

In total, there are three namespaces available for storing variables.

  1. hiveconf: Hive started with this namespace; all of Hive's configuration is stored as part of this conf. Initially, variable substitution was not part of Hive, and when it was introduced, all user-defined variables were stored here as well, which is definitely not a good idea. So two more namespaces were created.
  2. hivevar: for storing user variables.
  3. system: for storing system variables.

Therefore, if you are storing a variable as part of a query (e.g. a date or product_number), you should use the hivevar namespace and not the hiveconf namespace.

And this is how it works.

hiveconf is still the default namespace for setting a variable, so if you don't provide any namespace, it will store your variable in the hiveconf namespace.

However, when it comes to referencing a variable, this is not the case. By default, an unprefixed reference refers to the hivevar namespace. Confusing, right? The following example should make it clearer.

If you do not provide a namespace as follows, the var variable will be stored in the hiveconf namespace.

 set var="default_namespace"; 

So, to access this you need to specify the hiveconf namespace

 select ${hiveconf:var}; 

If you do not provide a namespace when referencing it, you get an error, as shown below. The reason is that an unprefixed reference is looked up in the hivevar namespace by default, and hivevar has no variable named var:

 select ${var}; 

Now we explicitly provide the hivevar namespace when setting the variable:

 set hivevar:var="hivevar_namespace";

Since we provide the namespace when referencing it, this will work:

 select ${hivevar:var};

And since hivevar is the default namespace used when referencing a variable, the following works as well:

 select ${var}; 
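In line with the advice to keep query parameters in hivevar, a value can also be supplied from the command line with --hivevar. This is a small sketch that assumes the table foo and column day from the question; the variable name start_day is made up for illustration.

```shell
#!/bin/sh
# Single quotes keep the shell from touching ${start_day};
# Hive resolves the unprefixed reference in the hivevar namespace.
query='select * from foo where day >= "${start_day}";'

if command -v hive >/dev/null 2>&1; then
    # start_day is a hypothetical variable name; foo/day come from the question.
    hive --hivevar start_day=2012-09-16 -e "$query"
else
    # Hive is not installed here; print the query text unchanged.
    printf '%s\n' "$query"
fi
```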
+14
Jan 04 '19 at 2:26

Have you tried using the dollar sign and braces, as follows:

 SELECT * FROM foo WHERE day >= '${CURRENT_DATE}'; 
+7
Sep 17 '12 at 18:41

Two easy ways:

Using hiveconf

 hive> set USER_NAME='FOO';
 hive> select * from foobar where NAME = '${hiveconf:USER_NAME}';

Using hivevar

Set the variables in your CLI and then use them in Hive:

 hive> set hivevar:USER_NAME='FOO';
 hive> select * from foobar where NAME = '${USER_NAME}';
 hive> select * from foobar where NAME = '${hivevar:USER_NAME}';

Documentation: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+VariableSubstitution

+3
Oct 23 '18 at 18:35

Be careful when setting strings and then referring back to them. You need to make sure the quotes do not collide.

  set start_date = '2019-01-21';
  select ${hiveconf:start_date};

This matters when setting dates and then referring to them in code, because the quotes can conflict. The following will not work with the start_date value set above, since the stored value already includes its own quotes:

  '${hiveconf:start_date}'

Take care not to apply single or double quotes twice to a string when referring back to it in a query.

+2
Jan 24 '19 at 18:22

Try this method:

 set t=20;
 select * from myTable where age > '${hiveconf:t}';

This works well on my platform.

0
Aug 08 '18 at 6:36

You can export the variable in a shell script:

 export CURRENT_DATE="2012-09-16"

Then reference it in HiveQL:

 SELECT * FROM foo WHERE day >= '${env:CURRENT_DATE}'
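Putting the two steps together in one script might look like the sketch below. It assumes the table foo from the question and runs the query inline with -e; the fallback branch is only there for machines without Hive.

```shell
#!/bin/sh
# Exported variables are visible to the hive process through the env namespace.
export CURRENT_DATE="2012-09-16"

# Single quotes around the query matter: inside double quotes the shell
# would try to expand ${env:CURRENT_DATE} itself before Hive ever saw it.
query='SELECT * FROM foo WHERE day >= "${env:CURRENT_DATE}";'

if command -v hive >/dev/null 2>&1; then
    hive -e "$query"
else
    # Hive is not installed here; print the query text unchanged.
    printf '%s\n' "$query"
fi
```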

0
Apr 10 '19 at 10:47

You can store another query in a variable and then use it in your code. Note that set stores the query text, not its result; the text is substituted and run wherever the variable is referenced:

 set var=select count(*) from My_table;
 ${hiveconf:var};
-7
May 7 '16 at a.m.