"My alias" is an embedded table and therefore cannot be in the outer part of the outer join

Question

"My alias" is an embedded table and therefore cannot be in the outer part of the outer join

I am trying to run a query similar to:

Select Table1.a,Table1.b,Table2.c From (Select a, max(x) as b from Tbl1 group by a ) as Table1 LEFT JOIN EACH Table2 ON Join Condition

and I get: "Table 1 is an embedded table and therefore cannot be in the outer part of the outer join."

when changing the connection from Outer to Inner, it works (hence, a problem with the syntax ...).

Can someone explain this post, and what should I do differently to avoid it?

thanks

+6

google-bigquery

Nn Jan 19 '14 at 12:27

source share

1 answer

Jordan Tigani · Accepted Answer · 2014-01-19T18:17:40+0000

TL DR: use GROUP EACH BY in your inner query.

Internal request

 Select a, max(x) as b from Tbl1 group by a

not paralyzed. You can calculate partial results in parallel, but in order to know the global max(x) for each value of a ', you only need to run this run in one place. When you use the JOIN EACH in an external query, you instruct the query mechanism that you need to parallelize when executing the JOIN . However, you do not have a parallelizable source, so the request fails.

There are two ways to fix this: first, use an explicitly parallel version of the internal query - just use GROUP EACH BY instead of GROUP BY . This can be done in parallel, because the first step is to sort the base table in field a so that you can calculate the global maximum in field a in a parallel workstation. If the inner query is paralellizable, then the outer query can be executed in parallel. It will look like this:

 Select Table1.a,Table1.b,Table2.c From (Select a, max(x) as b from Tbl1 group EACH by a -- note the EACH keyword here ) as Table1 LEFT JOIN EACH Table2 ON Join Condition

The second option is to use JOIN instead of JOIN EACH for an external request. This can lead to size restrictions depending on the size of table2 (since to perform not every join, the table on the right should be "small".).

This is admittedly suboptimal; You must understand how the request mechanism works in order for your requests to be executed. However, we (on BigQuery and DREMEL teams) are working hard to create “just work” queries, so you don’t have to deal with this kind of problem. We have made some progress in this, over the past few months some of these types of errors have been removed (for example, the inner join also failed until recently), but we still have ways.

Thank you for bringing this to our attention. I filed an internal error so that we could improve this work in the future.

"My alias" is an embedded table and therefore cannot be in the outer part of the outer join

More articles: