Let's say I have the following hypothetical data structure:
create table "country" ( country_id integer, country_name varchar(50), continent varchar(50), constraint country_pkey primary key (country_id) ); create table "person" ( person_id integer, person_name varchar(100), country_id integer, constraint person_pkey primary key (person_id) ); create table "event" ( event_id integer, event_desc varchar(100), country_id integer, constraint event_pkey primary key (event_id) );
I want to request the number of rows of people and events in each country. I decided to use a subquery.
select c.country_name, sum(sub1.person_count) as person_count, sum(sub2.event_count) as event_count from "country" c left join (select country_id, count(*) as person_count from "person" group by country_id) sub1 on (c.country_id=sub1.country_id) left join (select country_id, count(*) as event_count from "event" group by country_id) sub2 on (c.country_id=sub2.country_id) group by c.country_name
I know that you can do this using select commands in the list of fields, but the advantage of using subqueries is that I am more flexible in modifying SQL to compile it and use another field. Say, if I changed the query to display it by continent, it would be as simple as replacing the c.country_name field with c.continent.
My problem is with filtering. If we add a where clause like this:
select c.country_name, sum(sub1.person_count) as person_count, sum(sub2.event_count) as event_count from "country" c left join (select country_id, count(*) as person_count from "person" group by country_id) sub1 on (c.country_id=sub1.country_id) left join (select country_id, count(*) as event_count from "event" group by country_id) sub2 on (c.country_id=sub2.country_id) where c.country_name='UNITED STATES' group by c.country_name
The subqueries seem to still do the counting for all countries. Suppose people and events tables are huge, and I already have country_id indexes for all tables. It is very slow. Should the database only fulfill subqueries for the country that has been filtered? Should I recreate the country filter for each subquery (this is very tedious, and the code is not easily modified)? I use both PostgreSQL 8.3 and 9.0, but I think this happens in other databases.