> It's got a query parser that's dumb as a brick.
While we're on this topic... I have a question on these series
of queries:
-- Query A
select count(*) from customers c
where not exists ( select 1 from orders o
where o.customer_id = c.customer_id )
-- Query B
select count(*) from customers c
where customer_id not in ( select customer_id from orders)
-- Query C
select count(*) from customers c
left join orders o on c.customer_id = o.customer_id
where o.order_id is null
I believe they all achieve the same thing. i.e. How many
customers have never placed an order? I ran these 3 on
MySQL & PG with the following results:
Query A: MySQL=4.74s PostgreSQL=4.23s
Query B: MySQL=4.64s PostgreSQL=?????
Query C: MySQL=5.07s PostgreSQL=3.39s
MySQL's time is pretty consistent for all 3. As you said,
the output from explain is pretty useless so there's not
much else to look at.
PostgreSQL runs A&C slightly faster, which I expected.
However, waiting for query B exceeded my patience and
I had to abort it. The explain output is below, is this
result due to some incorrect setting?
benchdb=# explain select count(*) from customers c
benchdb-# where customer_id not in ( select customer_id from orders);
QUERY PLAN
------------------------------------------------------------------------
------------
Aggregate (cost=16406564027.00..16406564027.01 rows=1 width=0)
-> Seq Scan on customers c (cost=41578.00..16406562777.00
rows=500000 width=0)
Filter: (NOT (subplan))
SubPlan
-> Materialize (cost=41578.00..69391.00 rows=2000000
width=4)
-> Seq Scan on orders (cost=0.00..31765.00
rows=2000000 width=4)
(6 rows)