Thread: Merge join vs merge semi join against primary key

Merge join vs merge semi join against primary key

From: Sean Rhea
Date:
Hello,

I'm seeing some inexplicable (to me) behavior in PostgreSQL 9.2. I checked
the archives, but I still can't explain it. Apologies if I missed something.

1. When I join two tables with "WHERE id IN (...)" versus with an explicit
join, and the join column for the inner table is a primary key, I would expect
the same behavior in both cases, but the optimizer is choosing a merge join in
one case and a merge semi join in the other. There's at most one customer
with a given id. Why not do a semi join?

2. Even though the join methods are different, I would expect about the same
performance in either case, but one query takes only a few hundred
milliseconds while the other takes hundreds of seconds. Ouch!

Can anyone help me understand this behavior?

Some details are below. Let me know if it would be helpful to gather more.

Sean


production=> select version();
                                        version
----------------------------------------------------------------------------------------
 PostgreSQL 9.2.13 on i686-pc-linux-gnu, compiled by gcc (Debian 4.7.2-5) 4.7.2, 32-bit
(1 row)

production=> \d customers
                        Table "public.customers"
     Column    |   Type   |                         Modifiers
---------------+----------+-------------------------------------------------------
 id            | bigint   | not null default nextval('customers_id_seq'::regclass)
 group_id      | bigint   |
...
Indexes:
    "customers_pkey" PRIMARY KEY, btree (id)
...

production=> select count(*) from customers;
 count
--------
 473733
(1 row)

production=> \d balances
                       Table "public.balances"
        Column     |   Type   |                         Modifiers
-------------------+----------+------------------------------------------------------
 id                | bigint   | not null default nextval('balances_id_seq'::regclass)
 balance           | integer  | not null default 0
 tracking_number   | integer  | not null
 customer_id       | bigint   | not null
...
Indexes:
    "balances_pkey" PRIMARY KEY, btree (id)
    "balances_customer_tracking_number_index" UNIQUE, btree (customer_id, tracking_number)
...

production=> select count(*) from balances;
 count
-------
 16876
(1 row)

production=> analyze verbose customers;
INFO:  analyzing "public.customers"
INFO:  "customers": scanned 14280 of 14280 pages, containing 475288 live rows and 1949 dead rows; 300000 rows in sample, 475288 estimated total rows
ANALYZE

production=> analyze verbose balances;
INFO:  analyzing "public.balances"
INFO:  "balances": scanned 202 of 202 pages, containing 16876 live rows and 0 dead rows; 16876 rows in sample, 16876 estimated total rows
ANALYZE

production=> explain analyze SELECT * FROM balances where customer_id IN (SELECT id from customers WHERE group_id = 45);
                                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Semi Join  (cost=2442.14..19958.30 rows=16876 width=80) (actual time=119.905..145.126 rows=7318 loops=1)
   Merge Cond: (balances.customer_id = customers.id)
   ->  Index Scan using balances_customer_id_index on balances  (cost=0.00..727.79 rows=16876 width=80) (actual time=0.302..9.477 rows=16876 loops=1)
   ->  Index Scan using customers_pkey on customers  (cost=0.00..64192.97 rows=184 width=8) (actual time=103.354..126.459 rows=359 loops=1)
         Filter: (group_id = 45)
         Rows Removed by Filter: 141684
 Total runtime: 146.659 ms
(7 rows)

production=> explain analyze SELECT ac.* FROM balances ac join customers o ON o.id = ac.customer_id WHERE o.group_id = 45;
                                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=2214.50..20216.86 rows=30 width=80) (actual time=185.615..201991.752 rows=7318 loops=1)
   Merge Cond: (ac.customer_id = o.id)
   ->  Index Scan using balances_customer_tracking_number_index on balances ac  (cost=0.00..1007.49 rows=16876 width=80) (actual time=0.068..25.036 rows=16876 loops=1)
   ->  Index Scan using customers_pkey on customers o  (cost=0.00..63836.61 rows=836 width=8) (actual time=159.840..201915.765 rows=7672 loops=1)
         Filter: (group_id = 45)
         Rows Removed by Filter: 212699113
 Total runtime: 201995.044 ms
(7 rows)

Re: Merge join vs merge semi join against primary key

From: Sean Rhea
Date:
It does the merge (not-semi) join:

production=> explain analyze SELECT ac.* FROM balances ac JOIN customers o ON (o.id = ac.customer_id AND o.group_id = 45);
                                                                              QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=2172.47..19959.82 rows=6 width=80) (actual time=114.578..243898.199 rows=7318 loops=1)
   Merge Cond: (ac.customer_id = o.id)
   ->  Index Scan using balances_customer_id_index on balances ac  (cost=0.00..727.42 rows=16876 width=80) (actual time=0.025..20.972 rows=16876 loops=1)
   ->  Index Scan using customers_pkey on customers o  (cost=0.00..64811.57 rows=179 width=8) (actual time=92.174..243813.231 rows=7672 loops=1)
         Filter: (group_id = 45)
         Rows Removed by Filter: 212699113
 Total runtime: 243901.595 ms
(7 rows)

Sean



On Fri, Oct 9, 2015 at 1:09 PM, Igor Neyman <ineyman@perceptron.com> wrote:

[...]

What if you rewrite your second query like this:

SELECT ac.*
FROM balances ac JOIN customers o ON (o.id = ac.customer_id AND o.group_id = 45);

Regards,
Igor Neyman


Re: Merge join vs merge semi join against primary key

From: Jeremy Harris
Date:
On 09/10/15 20:52, Sean Rhea wrote:
[...]
>    ->  Index Scan using customers_pkey on customers  (cost=0.00..64192.97
> rows=184 width=8) (actual time=103.354..126.459 rows=359 loops=1)
>          Filter: (group_id = 45)
>          Rows Removed by Filter: 141684
>  Total runtime: 146.659 ms
[...]
>    ->  Index Scan using customers_pkey on customers o  (cost=0.00..63836.61
> rows=836 width=8) (actual time=159.840..201915.765 rows=7672 loops=1)
>          Filter: (group_id = 45)
>          Rows Removed by Filter: 212699113
>  Total runtime: 201995.044 ms

Are you sure the customers table was the same?
--
Jeremy




Re: Merge join vs merge semi join against primary key

From: Igor Neyman
Date:

[...]

What if you rewrite your second query like this:

SELECT ac.*
FROM balances ac JOIN customers o ON (o.id = ac.customer_id AND o.group_id = 45);

Regards,
Igor Neyman

 

 

Re: Merge join vs merge semi join against primary key

From: Igor Neyman
Date:

 

 

From: Sean Rhea [mailto:sean.c.rhea@gmail.com]
Sent: Friday, October 09, 2015 4:30 PM
To: Igor Neyman <ineyman@perceptron.com>
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Merge join vs merge semi join against primary key

 

[...]
Compare these three scans of the same index:

(1)

   ->  Index Scan using customers_pkey on customers o  (cost=0.00..64811.57 rows=179 width=8) (actual time=92.174..243813.231 rows=7672 loops=1)
         Filter: (group_id = 45)
         Rows Removed by Filter: 212699113

(2)

   ->  Index Scan using customers_pkey on customers  (cost=0.00..64192.97 rows=184 width=8) (actual time=103.354..126.459 rows=359 loops=1)
         Filter: (group_id = 45)
         Rows Removed by Filter: 141684

(3)

   ->  Index Scan using customers_pkey on customers o  (cost=0.00..63836.61 rows=836 width=8) (actual time=159.840..201915.765 rows=7672 loops=1)
         Filter: (group_id = 45)
         Rows Removed by Filter: 212699113

This does not make sense.

First, in (1) and (3) the number of rows returned, 7672, differs from (2), which returned 359, even though the same index is scanned with the same filter applied in all three cases.

What do you get if you run: SELECT count(*) FROM customers WHERE group_id = 45?

Regards,

Igor Neyman

 

Re: Merge join vs merge semi join against primary key

From: David Rowley
Date:
On 10 October 2015 at 08:52, Sean Rhea <sean.c.rhea@gmail.com> wrote:

1. When I join two tables with "WHERE id IN (...)" versus with an explicit
join, and the join column for the inner table is a primary key, I would expect
the same behavior in both cases, but the optimizer is choosing a merge join in
one case and a merge semi join in the other. There's at most one customer
with a given id. Why not do a semi join?


Unfortunately, the 9.2 planner does not check whether customers.id is unique, which would allow it to perform a semi join. There is a pending patch in the 9.6 cycle to add this optimisation.
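In the meantime, one way to get the semi join explicitly is to phrase the query with EXISTS, which PostgreSQL plans as a semi join just like the IN form that already performed well:

```sql
-- Semi-join phrasing of the slow explicit join; planned like the IN
-- variant, so it should pick the fast Merge Semi Join plan on 9.2.
SELECT ac.*
FROM balances ac
WHERE EXISTS (SELECT 1
              FROM customers o
              WHERE o.id = ac.customer_id
                AND o.group_id = 45);
```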
 
production=> select count(*) from customers;
 count
--------
 473733
(1 row)

...
 
   ->  Index Scan using customers_pkey on customers o  (cost=0.00..63836.61 rows=836 width=8) (actual time=159.840..201915.765 rows=7672 loops=1)
         Filter: (group_id = 45)
         Rows Removed by Filter: 212699113


Rows Removed by Filter: 212699113 seems to indicate that your 473733 row count for "customers" is incorrect. 

If you're doing lots of filtering on group_id, then perhaps you should think about adding an index on customers (group_id, id).
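Something along these lines (the index name is illustrative):

```sql
-- A composite index on (group_id, id): the group_id filter becomes an
-- index condition instead of a filter applied to every row visited, and
-- the join key (id) is available from the same index.
CREATE INDEX customers_group_id_id_index ON customers (group_id, id);
```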


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Merge join vs merge semi join against primary key

From: Sean Rhea
Date:
All,

No, the customers table is not 100% the same. This is a live production system, so the data is (unfortunately) changing under us a bit here.

That said, there are still some strange things going on. I just reran everything. The query plan hasn't changed, but as Jeremy, Igor, and David all pointed out, there's something funky going on with the apparent size of the customers table. These queries were all run within 5 minutes of each other:

production=> explain analyze SELECT ac.* FROM balances ac JOIN customers o ON (o.id = ac.customer_id AND o.group_id = 45);
                                                                    QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=2475.89..20223.08 rows=7 width=80) (actual time=157.437..243670.853 rows=7318 loops=1)
   Merge Cond: (ac.customer_id = o.id)
   ->  Index Scan using balances_customer_id_index on balances ac  (cost=0.00..727.42 rows=16876 width=80) (actual time=0.489..30.573 rows=16876 loops=1)
   ->  Index Scan using customers_pkey on customers o  (cost=0.00..65080.01 rows=184 width=8) (actual time=127.266..243582.767 rows=7672 loops=1)
         Filter: (group_id = 45)
         Rows Removed by Filter: 212699113
 Total runtime: 243674.288 ms
(7 rows)

production=> select count(*) from customers where group_id = 45;
 count 
-------
   430
(1 row)

production=> select count(*) from customers;
 count  
--------
 476645
(1 row)

Is it possible for explain analyze to somehow produce bad stats? I can't figure out where that 212699113 number is coming from at all.

Sean





On Mon, Oct 12, 2015 at 5:43 PM, David Rowley <david.rowley@2ndquadrant.com> wrote:
[...]

Re: Merge join vs merge semi join against primary key

From: Tom Lane
Date:
Sean Rhea <sean.c.rhea@gmail.com> writes:
> No, the customers table is not 100% the same. This is a live production
> system, so the data is (unfortunately) changing under us a bit here. That
> said, there are still some strange things going on. I just reran
> everything. The query plan time hasn't changed, but as Jeremy, Igor, and
> David all pointed out, there's something funky going on with the apparent
> size of the customers table. These queries were all run within 5 minutes of
> each other:

> production=> explain analyze SELECT ac.* FROM balances ac JOIN customers o
> ON (o.id= ac.customer_id AND o.group_id = 45);
>                                                                     QUERY
> PLAN
>
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>  Merge Join  (cost=2475.89..20223.08 rows=7 width=80) (actual
> time=157.437..243670.853 rows=7318 loops=1)
>    Merge Cond: (ac.customer_id = o.id)   ->  Index Scan using
> balances_customer_id_index on balances ac  (cost=0.00..727.42 rows=16876
> width=80) (actual time=0.489..30.573 rows=16876 loops=1)
>    ->  Index Scan using customers_pkey on customers o  (cost=0.00..65080.01
> rows=184 width=8) (actual time=127.266..243582.767 rows=*7672* loops=1)
>          Filter: (group_id = 45)
>          Rows Removed by Filter: *212699113*
>  Total runtime: 243674.288 ms
> (7 rows)

> production=> select count(*) from customers where group_id = 45;
>  count
> -------
>    430
> (1 row)

What you're looking at there is rows being read repeatedly as a
consequence of the mergejoin applying mark/restore operations to rescan
portions of its righthand input.  This will happen whenever there are
duplicate keys in the lefthand input.
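The effect can be sketched with a toy merge join (illustrative Python, not the actual executor; the data is made up):

```python
def merge_join(left, right):
    """Merge join of two lists of (key, payload), both sorted by key.

    Mimics the executor's mark/restore behaviour: when the left input
    repeats a join key, the right-side cursor is restored to the marked
    start of the matching group and those rows are read again.  `reads`
    counts every right-side row fetched, including rescans.
    """
    out, reads = [], 0
    r = mark = 0
    prev_key = object()            # sentinel: compares unequal to any key
    for lkey, lval in left:
        if lkey == prev_key:
            r = mark               # restore: rescan the matching group
        else:
            while r < len(right) and right[r][0] < lkey:
                reads += 1         # advance past smaller right-side keys
                r += 1
            mark = r               # mark the start of the matching group
        prev_key = lkey
        while r < len(right) and right[r][0] == lkey:
            reads += 1
            out.append((lval, right[r][1]))
            r += 1
    return out, reads

# Three left rows share key 1, so the two matching right rows are read
# three times over: 7 right-side reads against a 3-row right input.
left = [(1, "a"), (1, "b"), (1, "c"), (2, "d")]
right = [(1, "x"), (1, "y"), (2, "z")]
rows, reads = merge_join(left, right)
```

With many duplicate customer_id values on the outer side, the same kind of inflation would show up as right-side rows being read (and filtered out) far more times than the table has rows.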

I think the planner does take the possibility of rescans into account
in its cost estimates, but perhaps it's not weighing it heavily
enough.  It would be interesting to see what you get as a second-choice
plan if you set enable_mergejoin = off.

            regards, tom lane


Re: Merge join vs merge semi join against primary key

From: Sean Rhea
Date:
Tom,

Just to clarify, is the lefthand input customers or balances?

And turning off merge joins "fixes" everything, including the runtime:

production=> set enable_mergejoin = off;
SET
production=> explain analyze SELECT ac.* FROM balances ac JOIN customers o ON (o.id = ac.customer_id AND o.group_id = 45);
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=20288.24..20722.35 rows=7 width=80) (actual time=133.202..140.876 rows=7318 loops=1)
   Hash Cond: (ac.customer_id = o.id)
   ->  Seq Scan on balances ac  (cost=0.00..370.76 rows=16876 width=80) (actual time=0.015..5.853 rows=16876 loops=1)
   ->  Hash  (cost=20285.94..20285.94 rows=184 width=8) (actual time=126.768..126.768 rows=430 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 12kB
         ->  Seq Scan on customers o  (cost=0.00..20285.94 rows=184 width=8) (actual time=16.901..126.606 rows=430 loops=1)
               Filter: (group_id = 45)
               Rows Removed by Filter: 476221
 Total runtime: 142.089 ms
(9 rows)

Sean




On Tue, Oct 13, 2015 at 11:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
[...]