Re: Interesting slow query - Mailing list pgsql-performance

From PFC
Subject Re: Interesting slow query
Date
Msg-id op.ta214xotcigqcu@apollo13
Whole thread Raw
In response to Re: Interesting slow query  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
> Usually we get complaints the other way around (that the NOT EXISTS
> approach is a lot slower).

    Yes, I know ;)
    (I rephrased the query this way to exploit the fact that the planner
would choose a nested loop)

> You did not show any statistics, but I
> suspect the key point here is that the condition id > 1130306 excludes
> most or all of the A and D tables.

    Right.

    Actually :
    - Table r (raw_annonces) contains raw data waiting to be processed
    - Table a (annonces) contains processed data ready for display on the
website (active data)
    - Table d (archive) contains old archived data which can be displayed on
request but is normally excluded from the searches, which normally only
hit recent records. This is to get speedy searches.

    So, records are added into the "raw" table, these have a SERIAL primary
key.
    Then a script processes them and inserts the results into the active
table. 15 days of "raw" records are kept, then they are deleted.
    Periodically old records from "annonces" are moved to the archive.

    The promary key stays the same in the 3 tables.

    The script knows at which id it stopped last time it was run, hence the
(id > x) condition. Normally this excludes the entire "annonces" table,
because we process only new records.

> The planner is not smart about
> making transitive inequality deductions, but you could help it along
> by adding the implied clauses yourself:
>
EXPLAIN ANALYZE SELECT r.* FROM raw_annonces r LEFT JOIN annonces a ON
(a.id=r.id AND a.id > 1180726) LEFT JOIN archive_data d ON (d.id=r.id AND
d.id > 1180726) WHERE a.id IS NULL AND d.id IS NULL AND r.id > 1180726
order by id limit 1;
>
> Whether this is worth doing in your app depends on how often you do
> searches at the end of the ID range ...

    Quite often actually, so I did the mod.
    The interesting part is that, yesterday after ANALYZE the query plan was
horrible, and today, after adding new data I ANALYZED and retried the slow
query, and it was fast again :

EXPLAIN ANALYZE SELECT r.* FROM raw_annonces r LEFT JOIN annonces a ON
(a.id=r.id) LEFT JOIN archive_data d ON (d.id=r.id) WHERE a.id IS NULL AND
d.id IS NULL AND r.id > 1180726 order by id limit 1;
                                                                         QUERY
PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..10.42 rows=1 width=631) (actual time=0.076..0.076
rows=1 loops=1)
    ->  Nested Loop Left Join  (cost=0.00..7129.11 rows=684 width=631)
(actual time=0.074..0.074 rows=1 loops=1)
          Filter: ("inner".id IS NULL)
          ->  Nested Loop Left Join  (cost=0.00..4608.71 rows=684
width=631) (actual time=0.064..0.064 rows=1 loops=1)
                Filter: ("inner".id IS NULL)
                ->  Index Scan using raw_annonces_pkey on raw_annonces r
(cost=0.00..667.56 rows=684 width=631) (actual time=0.013..0.013 rows=1
loops=1)
                      Index Cond: (id > 1180726)
                ->  Index Scan using annonces_pkey on annonces a
(cost=0.00..5.75 rows=1 width=4) (actual time=0.046..0.046 rows=0 loops=1)
                      Index Cond: (a.id = "outer".id)
          ->  Index Scan using archive_data_pkey on archive_data d
(cost=0.00..3.67 rows=1 width=4) (actual time=0.008..0.008 rows=0 loops=1)
                Index Cond: (d.id = "outer".id)
  Total runtime: 0.197 ms

    So I did a few tests...

CREATE TABLE test.raw (id INTEGER PRIMARY KEY);
CREATE TABLE test.active (id INTEGER PRIMARY KEY);
CREATE TABLE test.archive (id INTEGER PRIMARY KEY);
INSERT INTO test.archive SELECT * FROM generate_series( 1, 1000000 );
INSERT INTO test.active SELECT * FROM generate_series( 1000001, 1100000 );
INSERT INTO test.raw SELECT * FROM generate_series( 1050000, 1101000 );
VACUUM ANALYZE;

So we have 1M archived records, 100K active, 51K in the "raw" table of
which 1000 are new.

    Query 1:

EXPLAIN ANALYZE SELECT * FROM test.raw AS raw LEFT JOIN test.active AS
active ON (active.id=raw.id) LEFT JOIN test.archive AS archive ON
(archive.id=raw.id) WHERE raw.id>1100000 AND active.id IS NULL AND
archive.id IS NULL LIMIT 1;
                                                                      QUERY
PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..5.29 rows=1 width=12) (actual time=94.478..94.478
rows=1 loops=1)
    ->  Nested Loop Left Join  (cost=0.00..5400.09 rows=1021 width=12)
(actual time=94.477..94.477 rows=1 loops=1)
          Filter: ("inner".id IS NULL)
          ->  Merge Left Join  (cost=0.00..2310.55 rows=1021 width=8)
(actual time=94.458..94.458 rows=1 loops=1)
                Merge Cond: ("outer".id = "inner".id)
                Filter: ("inner".id IS NULL)
                ->  Index Scan using raw_pkey on raw  (cost=0.00..24.78
rows=1021 width=4) (actual time=0.016..0.016 rows=1 loops=1)
                      Index Cond: (id > 1100000)
                ->  Index Scan using active_pkey on active
(cost=0.00..2023.00 rows=100000 width=4) (actual time=0.005..76.572
rows=100000 loops=1)
          ->  Index Scan using archive_pkey on archive  (cost=0.00..3.01
rows=1 width=4) (actual time=0.013..0.013 rows=0 loops=1)
                Index Cond: (archive.id = "outer".id)
  Total runtime: 94.550 ms

    Query 2:

EXPLAIN ANALYZE SELECT * FROM test.raw AS raw LEFT JOIN test.active AS
active ON (active.id=raw.id AND active.id>1100000) LEFT JOIN test.archive
AS archive ON (archive.id=raw.id AND archive.id > 1100000) WHERE
raw.id>1100000 AND active.id IS NULL AND archive.id IS NULL LIMIT 1;
                                                               QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..0.04 rows=1 width=12) (actual time=0.035..0.035 rows=1
loops=1)
    ->  Merge Left Join  (cost=0.00..37.67 rows=1021 width=12) (actual
time=0.034..0.034 rows=1 loops=1)
          Merge Cond: ("outer".id = "inner".id)
          Filter: ("inner".id IS NULL)
          ->  Merge Left Join  (cost=0.00..30.51 rows=1021 width=8) (actual
time=0.026..0.026 rows=1 loops=1)
                Merge Cond: ("outer".id = "inner".id)
                Filter: ("inner".id IS NULL)
                ->  Index Scan using raw_pkey on raw  (cost=0.00..24.78
rows=1021 width=4) (actual time=0.016..0.016 rows=1 loops=1)
                      Index Cond: (id > 1100000)
                ->  Index Scan using active_pkey on active
(cost=0.00..3.14 rows=10 width=4) (actual time=0.006..0.006 rows=0 loops=1)
                      Index Cond: (id > 1100000)
          ->  Index Scan using archive_pkey on archive  (cost=0.00..4.35
rows=100 width=4) (actual time=0.007..0.007 rows=0 loops=1)
                Index Cond: (id > 1100000)
  Total runtime: 0.101 ms

    OK, you were right ;)

    Query 3:

EXPLAIN ANALYZE SELECT * FROM test.raw AS raw WHERE raw.id > 1100000 AND
NOT EXISTS (SELECT 1 FROM test.active AS a WHERE a.id=raw.id) AND NOT
EXISTS (SELECT 1 FROM test.archive AS a WHERE a.id=raw.id) LIMIT 1;
                                                               QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..24.23 rows=1 width=4) (actual time=0.036..0.036 rows=1
loops=1)
    ->  Index Scan using raw_pkey on raw  (cost=0.00..6178.35 rows=255
width=4) (actual time=0.035..0.035 rows=1 loops=1)
          Index Cond: (id > 1100000)
          Filter: ((NOT (subplan)) AND (NOT (subplan)))
          SubPlan
            ->  Index Scan using archive_pkey on archive a
(cost=0.00..3.01 rows=1 width=0) (actual time=0.007..0.007 rows=0 loops=1)
                  Index Cond: (id = $0)
            ->  Index Scan using active_pkey on active a  (cost=0.00..3.01
rows=1 width=0) (actual time=0.008..0.008 rows=0 loops=1)
                  Index Cond: (id = $0)
  Total runtime: 0.086 ms

    I see a problem with Query 1:
    The Merge Join goes through tables "raw" and "active" in sorted order.
"archive" contains values 1-1000000
"active" contains values 1000001-1100000
"raw" contains values 1050000-1101000

    However it starts at the beginning of "active" ; it would be smarter to
start the index scan of "active" at the lowest value in "raw", ie. to seek
into the right position into the index before beginning to scan it. This
is achieved by your advice on manually adding the "id > x" conditions in
the query.

    However, if I want to join the full tables, dropping the id>x condition :

EXPLAIN ANALYZE SELECT * FROM test.raw AS raw LEFT JOIN test.active AS
active ON (active.id=raw.id) LEFT JOIN test.archive AS archive ON
(archive.id=raw.id) WHERE active.id IS NULL AND archive.id IS NULL;
                                                                   QUERY
PLAN

----------------------------------------------------------------------------------------------------------------------------------------------
  Merge Left Join  (cost=0.00..27305.04 rows=51001 width=12) (actual
time=837.196..838.099 rows=1000 loops=1)
    Merge Cond: ("outer".id = "inner".id)
    Filter: ("inner".id IS NULL)
    ->  Merge Left Join  (cost=0.00..3943.52 rows=51001 width=8) (actual
time=153.495..154.190 rows=1000 loops=1)
          Merge Cond: ("outer".id = "inner".id)
          Filter: ("inner".id IS NULL)
          ->  Index Scan using raw_pkey on raw  (cost=0.00..1033.01
rows=51001 width=4) (actual time=0.012..23.085 rows=51001 loops=1)
          ->  Index Scan using active_pkey on active  (cost=0.00..2023.00
rows=100000 width=4) (actual time=0.004..47.333 rows=100000 loops=1)
    ->  Index Scan using archive_pkey on archive  (cost=0.00..20224.00
rows=1000000 width=4) (actual time=0.043..501.953 rows=1000000 loops=1)
  Total runtime: 838.272 ms

    This is very slow : the Index Scans on "active" and "archive" have to
skip a huge number of rows before getting to the first interesting row. We
know that rows in "active" and "archive" will be of no use if their id is
< (SELECT min(id) FROM test.raw) which is 1050000. Let's rephrase :

EXPLAIN ANALYZE SELECT * FROM test.raw AS raw LEFT JOIN test.active AS
active ON (active.id=raw.id AND active.id >= 1050000) LEFT JOIN
test.archive AS archive ON (archive.id=raw.id AND archive.id >= 1050000)
WHERE active.id IS NULL AND archive.id IS NULL;
                                                                  QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------
  Merge Left Join  (cost=0.00..2837.93 rows=51001 width=12) (actual
time=114.590..115.451 rows=1000 loops=1)
    Merge Cond: ("outer".id = "inner".id)
    Filter: ("inner".id IS NULL)
    ->  Merge Left Join  (cost=0.00..2705.78 rows=51001 width=8) (actual
time=114.576..115.239 rows=1000 loops=1)
          Merge Cond: ("outer".id = "inner".id)
          Filter: ("inner".id IS NULL)
          ->  Index Scan using raw_pkey on raw  (cost=0.00..1033.01
rows=51001 width=4) (actual time=0.012..51.505 rows=51001 loops=1)
          ->  Index Scan using active_pkey on active  (cost=0.00..1158.32
rows=50913 width=4) (actual time=0.009..22.312 rows=50001 loops=1)
                Index Cond: (id >= 1050000)
    ->  Index Scan using archive_pkey on archive  (cost=0.00..4.35 rows=100
width=4) (actual time=0.012..0.012 rows=0 loops=1)
          Index Cond: (id >= 1050000)
  Total runtime: 115.601 ms

    So here's my point : the first operation in the Index Scan in a merge
join could be to seek to the right position in the index before scanning
it. This value is known : it is the first value yielded by the index scan
on "raw".

    This would remove the need for teaching the planner about transitivity,
and also optimize this case where transitivity is useless.












pgsql-performance by date:

Previous
From: Guido Neitzer
Date:
Subject: Re: Posrgres speed problem - solved!
Next
From: Sven Geisler
Date:
Subject: Re: 64-bit vs 32-bit performance ... backwards?