Re: Multi-column join + aggregate subquery resulting in infinite run time - Mailing list pgsql-novice

From Steve Estes
Subject Re: Multi-column join + aggregate subquery resulting in infinite run time
Date
Msg-id CAJjrZPA8vc5zKYGNLtS1JdaoFFBqjXs8ue=+22Rb4Qpehbh_Dg@mail.gmail.com
In response to Multi-column join + aggregate subquery resulting in infinite run time  (Steve Estes <denzera@gmail.com>)
Responses RE: [EXTERNAL] Re: Multi-column join + aggregate subquery resulting in infinite run time  ("Vianello, Dan A" <Dan.Vianello@charter.com>)
List pgsql-novice
Just wanted to bump this in case anyone has any insights.  If you live in a place served by Grubhub or Doordash, I will literally buy you lunch if you can help me figure out why these joins don't take roughly linear time to bolt on, rather than blowing up as I add them.  As I envision it, if you've got a common base table serving as the parent in a LEFT JOIN, each additional table you LEFT JOIN onto it should take roughly the same amount of time: there's no interplay to compute between that table and any of the others, you're just matching on your keys with one pass through the indexes and then moving on to the next one.  Clearly I'm either thinking about it wrong, or doing something wrong once I go beyond one join.

-Steve


On Mon, Jul 6, 2020 at 10:59 AM Steve Estes <denzera@gmail.com> wrote:
Hey gang,

I thought I knew what I was doing with queries, but this has more or less ruined my 4th of July holiday, so I think it's high time to ask for help.  I imagine it's something trivial and obvious to you.

I've got 3 tables in question here:

prod_account_details => 538k records
line_number PK,
acct_id integer, -- nearly unique, but not quite; acct_id PLUS address is unique
address varchar(100),
entity_name varchar(100),
(... a few other fields ...)
btree index on acct_id, hash index on address, also btree on (acct_id, address)

prod_customer_profiles => 532k records, nearly 1:1 with account details
acct_id integer, -- corresponds to account_details.acct_id
match_input_address varchar(100), -- corresponds to account_details.address
(... lots of other fields ...)
btree index on acct_id, hash index on match_input_address, also btree on (acct_id, match_input_address)

prod_dunning_letters => 518k records, 1:M with account_details (covering ~181k acct_ids)
line_number PK,
acct_id integer, -- FK to account_details, not formally via constraint but semantically
(... a few other fields ...)
btree index on acct_id
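
For reference, here's roughly what the DDL looks like -- this is only a sketch: the column types and index names are my approximations of the descriptions above, and the real tables have more columns:

-- Approximate shapes only: types are guesses and the real tables have more columns.
CREATE TABLE prod_account_details (
    line_number serial PRIMARY KEY,
    acct_id     integer,        -- nearly unique; (acct_id, address) is unique
    address     varchar(100),
    entity_name varchar(100)
    -- ... a few other fields ...
);
CREATE INDEX ON prod_account_details USING btree (acct_id);
CREATE INDEX ON prod_account_details USING hash  (address);
CREATE INDEX ON prod_account_details USING btree (acct_id, address);

CREATE TABLE prod_customer_profiles (
    acct_id             integer,       -- corresponds to account_details.acct_id
    match_input_address varchar(100)   -- corresponds to account_details.address
    -- ... lots of other fields ...
);
CREATE INDEX ON prod_customer_profiles USING btree (acct_id);
CREATE INDEX ON prod_customer_profiles USING hash  (match_input_address);
CREATE INDEX ON prod_customer_profiles USING btree (acct_id, match_input_address);

CREATE TABLE prod_dunning_letters (
    line_number serial PRIMARY KEY,
    acct_id     integer        -- FK to account_details semantically, no constraint
    -- ... a few other fields ...
);
CREATE INDEX ON prod_dunning_letters USING btree (acct_id);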


--- Queries ---

So firstly, what works:

(1)

SELECT ad.acct_id, ad.entity_name, ad.address, acx.flag_infobase_match
FROM prod_account_details ad
INNER JOIN prod_customer_profiles acx ON ad.acct_id = acx.acct_id
    AND ad.address = acx.match_input_address;


...returns the 532k matched records in 3.0 seconds.


Now, if we INNER JOIN to the dunning_letters table via a grouped subquery:

(2)

SELECT ad.acct_id, ad.entity_name, ad.address, acx.flag_infobase_match, dl.num_records AS dl_count
FROM prod_account_details ad
INNER JOIN prod_customer_profiles acx ON ad.acct_id = acx.acct_id
    AND ad.address = acx.match_input_address
INNER JOIN (SELECT count(*) AS num_records, acct_id FROM prod_dunning_letters GROUP BY acct_id) dl ON ad.acct_id = dl.acct_id;


...it returns fine too: 161k records in 1.3 seconds.

But now if I do something as simple as changing that second join (the one to the subquery) to a LEFT JOIN, it blows up.  That is, I want all valid accounts, with the count of records in the dunning_letters table appended where any exist:

(3)

SELECT ad.acct_id, ad.entity_name, ad.address, acx.flag_infobase_match, dl.num_records AS dl_count
FROM prod_account_details ad
INNER JOIN prod_customer_profiles acx ON ad.acct_id = acx.acct_id
    AND ad.address = acx.match_input_address
LEFT JOIN (SELECT count(*) AS num_records, acct_id FROM prod_dunning_letters GROUP BY acct_id) dl ON ad.acct_id = dl.acct_id;


What happens is that this runs for 20+ seconds and then starts returning ~100 records every 10 seconds, which at the volume I'm talking about (>500k rows) would take forever to finish.  And next I need to bolt on counts from some other, bigger tables too, so we're quickly approaching heat-death-of-the-universe amounts of runtime here.

Now, what's interesting is that if I relax the join between the first two tables, i.e. drop the address part and join only on acct_id,

(4)

SELECT ad.acct_id, ad.entity_name, ad.address, acx.flag_infobase_match, dl.num_records AS dl_count
FROM prod_account_details ad
INNER JOIN prod_customer_profiles acx ON ad.acct_id = acx.acct_id
    -- AND ad.address = acx.match_input_address
LEFT JOIN (SELECT count(*) AS num_records, acct_id FROM prod_dunning_letters GROUP BY acct_id) dl ON ad.acct_id = dl.acct_id;


...it starts returning immediately and spools out 695k records in 4.7 seconds.  The problem is that we need the address part of the join, because there are duplicate acct_ids on both sides, so without it we get a bunch of cartesian-join junk (which is where the 695k rows come from, versus the ~532k real matches).
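
(For context, the duplicates can be seen with a quick check along these lines -- illustrative only, the exact counts aren't important:)

SELECT acct_id, count(*) AS copies
FROM prod_account_details
GROUP BY acct_id
HAVING count(*) > 1;

-- and the same against prod_customer_profiles
SELECT acct_id, count(*) AS copies
FROM prod_customer_profiles
GROUP BY acct_id
HAVING count(*) > 1;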


--- EXPLAIN query plans ---

So here's the EXPLAIN for query (1), with just the two tables:

Gather  (cost=102995.20..123944.32 rows=1 width=41)
  Workers Planned: 2
  ->  Parallel Hash Join  (cost=101995.20..122944.22 rows=1 width=41)
        Hash Cond: ((ad.acct_id = acx.acct_id) AND (ad.address = (acx.match_input_address)::text))
        ->  Parallel Seq Scan on prod_account_details ad  (cost=0.00..14226.43 rows=224343 width=39)
        ->  Parallel Hash  (cost=97096.88..97096.88 rows=224288 width=26)
              ->  Parallel Seq Scan on prod_customer_profiles acx  (cost=0.00..97096.88 rows=224288 width=26)



Now, when we add the third table, the planner's strategy for that join changes.  Here's the EXPLAIN for (2), with the INNER JOIN that works:

Gather  (cost=22500.40..83636.43 rows=1 width=49)
  Workers Planned: 2
  ->  Nested Loop  (cost=21500.40..82636.33 rows=1 width=49)
        ->  Hash Join  (cost=21500.40..40427.74 rows=56285 width=54)
              Hash Cond: (ad.acct_id = prod_dunning_letters.acct_id)
              ->  Parallel Seq Scan on prod_account_details ad  (cost=0.00..14226.43 rows=224343 width=39)
              ->  Hash  (cost=19345.40..19345.40 rows=123920 width=15)
                    ->  GroupAggregate  (cost=0.42..18106.20 rows=123920 width=15)
                          Group Key: prod_dunning_letters.acct_id
                          ->  Index Only Scan using prod_dunning_letters_acct_id_idx on prod_dunning_letters  (cost=0.42..14269.36 rows=519529 width=7)
        ->  Index Scan using prod_customer_profiles_match_input_address_idx on prod_customer_profiles acx  (cost=0.00..0.74 rows=1 width=26)
              Index Cond: ((match_input_address)::text = ad.address)
              Filter: (ad.acct_id = acct_id)



But change that to the LEFT JOIN from query (3), and the plan becomes:

Nested Loop Left Join  (cost=102995.63..144838.72 rows=1 width=49)
  Join Filter: (ad.acct_id = prod_dunning_letters.acct_id)
  ->  Gather  (cost=102995.20..123944.32 rows=1 width=41)
        Workers Planned: 2
        ->  Parallel Hash Join  (cost=101995.20..122944.22 rows=1 width=41)
              Hash Cond: ((ad.acct_id = acx.acct_id) AND (ad.address = (acx.match_input_address)::text))
              ->  Parallel Seq Scan on prod_account_details ad  (cost=0.00..14226.43 rows=224343 width=39)
              ->  Parallel Hash  (cost=97096.88..97096.88 rows=224288 width=26)
                    ->  Parallel Seq Scan on prod_customer_profiles acx  (cost=0.00..97096.88 rows=224288 width=26)
  ->  GroupAggregate  (cost=0.42..18106.20 rows=123920 width=15)
        Group Key: prod_dunning_letters.acct_id
        ->  Index Only Scan using prod_dunning_letters_acct_id_idx on prod_dunning_letters  (cost=0.42..14269.36 rows=519529 width=7)



Just reading that, I can't readily tell why #2 finishes fast and #3 runs without end.  If the difference were JUST the use of a LEFT JOIN, then why does #4, which differs only in using a single-column join between the first two tables, also finish easily?

This has me befuddled and has stalled the rest of my experimentation, so I'd really appreciate some insight.
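
I realize the plans above are plain EXPLAIN output; if it would help, I can run something along these lines to get actual row counts alongside the estimates (though for query (3) it would of course have to run to completion):

EXPLAIN (ANALYZE, BUFFERS)
SELECT ad.acct_id, ad.entity_name, ad.address, acx.flag_infobase_match, dl.num_records AS dl_count
FROM prod_account_details ad
INNER JOIN prod_customer_profiles acx ON ad.acct_id = acx.acct_id
    AND ad.address = acx.match_input_address
LEFT JOIN (SELECT count(*) AS num_records, acct_id FROM prod_dunning_letters GROUP BY acct_id) dl ON ad.acct_id = dl.acct_id;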

-Steve
