Creating and Using Temporary / Volatile Database Tables In Stata

Addendum: As of Stata 14, volatile tables work without hacks.

Is there a way to configure Stata to work with temporary volatile tables? These tables and their data are deleted when the user logs off the session.

Here is a simple toy example of the SQL I run from Stata to load data from Teradata:

odbc load, exec(" BEGIN TRANSACTION; CREATE VOLATILE MULTISET TABLE vol_tab AS ( SELECT TOP 10 user_id FROM dw_users ) WITH DATA PRIMARY INDEX(user_id) ON COMMIT PRESERVE ROWS; SELECT * FROM vol_tab; END TRANSACTION; ") dsn("mozart"); 

This is the error message I get:

    The ODBC driver reported the following diagnostics
    [Teradata][ODBC Teradata Driver][Teradata Database] Only an ET or null statement is legal after a DDL Statement.
    SQLSTATE=25000
    r(682);

The Stata error code means:

    error . . . . . . . . Return code 682
    could not connect to odbc dsn;
    This usually happens due to incorrect permissions, such as an incorrect username or password. Use set debug on to display the actual error message generated by the ODBC driver.

As far as I can tell, the connection and permissions are fine, because I can pull the data if I run only the "SELECT TOP 10 ..." query. I turned on debugging, but it did not give any additional information.

The session mode is Teradata. The ODBC manager is unixODBC. I am using Stata 13.1 on an Ubuntu server.

I believe the main problem may be that a separate connection is established for each SQL statement, so the volatile table has vanished by the time the SELECT runs. I am waiting for technical support to check this.

I tried using odbc sqlfile, but this approach does not work unless I create a permanent table at the end of it. There is no load option with odbc sqlfile.

Volatile tables seem to work fine in SAS and R. For example, this works:

    library("RODBC")
    db <- odbcConnect("mozart")
    sqlQuery(db,"CREATE VOLATILE MULTISET TABLE vol_tab AS ( SELECT TOP 10 user_id FROM dw_users ) WITH DATA PRIMARY INDEX(user_id) ON COMMIT PRESERVE ROWS; ")
    data <- sqlQuery(db,"select * from vol_tab;",rows_at_time=1)

Perhaps this works because the connection to the database stays open until close(db) is called.

4 answers

This answer is no longer correct. Now Stata allows you to use several SQL statements if the multistatement option is added to the odbc command.
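
As a rough sketch of what that might look like for the query in the question: this assumes Stata 14 or later and reuses the question's DSN, table, and column names. Whether the Teradata driver accepts this exact statement list has not been tested here, and some drivers do not support multiple statements at all.

```stata
* Sketch only: assumes Stata 14 or later and a Teradata DSN named "mozart",
* as in the question. The multistatement option lets exec() carry several
* SQL statements separated by semicolons in a single odbc call.
odbc load, exec("CREATE VOLATILE MULTISET TABLE vol_tab AS (SELECT TOP 10 user_id FROM dw_users) WITH DATA PRIMARY INDEX(user_id) ON COMMIT PRESERVE ROWS; SELECT * FROM vol_tab;") dsn("mozart") clear multistatement
```

Because both statements travel in one call, the SELECT should see the volatile table created just before it.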


The Stata odbc command does not allow you to combine multiple SQL statements into one odbc command, nor to change the Teradata session mode. It also creates a separate connection for each odbc command issued, so the volatile table goes poof by the time you want to use it for something. This makes it impossible to use volatile tables directly.

However, it is possible to use R through Stata to create a Stata data file. You need to install rsource from SSC, and foreign and RODBC in R. The two globals that rsource needs, Rterm_path and Rterm_options, can be defined in sysprofile.ado or in your own profile.ado. As far as I can tell, R does not export datetime values, so I had to convert dates and timestamps by hand. These conversions are somewhat at odds with what the Stata manuals and the Stata blog suggest.

    rsource, terminator(END_OF_R)
    library("RODBC")
    library("foreign")
    db <- odbcConnect("mydsn")
    sqlQuery(db,"CREATE VOLATILE MULTISET TABLE vol_tab AS (SELECT ...) WITH DATA PRIMARY INDEX(...) ON COMMIT PRESERVE ROWS;")
    data <- sqlQuery(db,"SELECT * FROM vol_tab;",rows_at_time=1)
    write.dta(data,"mydata.dta",convert.dates = FALSE)
    close(db)
    END_OF_R

    use "mydata.dta", clear
    /* convert dates and timestamps to Stata format */
    gen stata_date = rdate + td(01jan1970)
    format stata_date %td
    gen double stata_timestamp = (rtimestamp + 315594000)*1000
    format stata_timestamp %tc
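
The two constants in that conversion can be checked in Stata itself. The lines below are only an arithmetic check, not part of the original answer: Stata counts dates from 01jan1960 while R counts from 01jan1970, so the day offset is td(01jan1970), and the same offset in seconds is that number times 86400.

```stata
* Arithmetic check of the epoch offsets used in the conversion.
* Stata dates start at 01jan1960; R dates and POSIX timestamps start at 01jan1970.
display td(01jan1970)                              // 3653 days between the two epochs
display %12.0f td(01jan1970)*24*60*60              // 315619200 seconds
display %12.0f td(01jan1970)*24*60*60 - 315594000  // 25200 seconds, i.e. 7 hours
```

The constant 315594000 is therefore 7 hours short of the plain epoch offset. That gap looks like a UTC offset for the answerer's time zone, though this is an inference rather than something the answer states. For data in another time zone, the plain 315619200 or a different offset would presumably be needed.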

I am not familiar with Stata, but I assume that your ODBC connection is in "ANSI" mode. Try adding this between the create volatile table and select statements:

 commit work; 

If this does not work, you may have to make two separate calls.

UPDATE: Thinking about this a bit more, maybe you can try the following:

 odbc load, exec("select distinct user_id from dw_users where cast(date_confirm as date) > '2011-09-15'") clear dsn("mozart") lowercase; 

In other words, just run the query in one step; do not try to create a volatile table.


What if you try the following with your connection mode set to TERADATA (which is often not the default)?

 odbc load, exec("BT; create volatile table new_usr as (select top 10 user_id from dw_users) with data primary index(user_id) on commit preserve rows; ET; select * from new_usr;") clear dsn("mozart") lowercase; 

The BT; and ET; statements wrap the SQL between them in an explicit transaction. This SQL was tested in SQL Assistant, since I do not have access to the tool you are using. Typically, BT and ET are used to enforce logical transactions (units of work) that must either complete successfully or be rolled back entirely. This may allow you to work around the problem in your tool.

EDIT

If you cannot wrap the creation of the volatile table in BT and ET, can you create a stored procedure or macro that contains all the logic needed for the task, and then call that stored procedure or macro from Stata?


Place your logic between BT and ET:

 BT; ...your logic... ET;

If anything fails in between, the whole transaction rolls back.


