Thread: very slow left join

very slow left join

From
Ben
Date:
I've inherited an Oracle database that I'm porting to Postgres, and this
has been going quite well until now. Unfortunately, I've found one view (a
largish left join) that runs several orders of magnitude slower on
Postgres than it did on Oracle.

=> select version();
                                                  version
----------------------------------------------------------------------------------------------------------
  PostgreSQL 8.2.4 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.1 20070105 (Red Hat 4.1.1-52)
(1 row)


After analyzing the database, the explain analyze output for the query is:

 Nested Loop Left Join  (cost=133.51..15846.99 rows=1 width=312) (actual time=109.131..550711.374 rows=1248 loops=1)
   Join Filter: (log.logkey = ln.logkey)
   ->  Nested Loop  (cost=133.51..267.44 rows=1 width=306) (actual time=15.316..74.074 rows=1248 loops=1)
         ->  Merge Join  (cost=133.51..267.16 rows=1 width=325) (actual time=15.300..60.332 rows=1248 loops=1)
               Merge Cond: (log.eventkey = e.eventkey)
               Join Filter: ((e.clientkey = log.clientkey) AND (e.premiseskey = log.premiseskey))
               ->  Index Scan using log_eventkey_idx on log  (cost=0.00..3732.14 rows=36547 width=167) (actual time=0.015..25.385 rows=36547 loops=1)
                     Filter: (logicaldel = 'N'::bpchar)
               ->  Sort  (cost=133.51..135.00 rows=595 width=328) (actual time=15.185..16.379 rows=1248 loops=1)
                     Sort Key: e.eventkey
                     ->  Hash Join  (cost=1.30..106.09 rows=595 width=328) (actual time=0.073..2.033 rows=1248 loops=1)
                           Hash Cond: ((e.clientkey = p.clientkey) AND (e.premiseskey = p.premiseskey))
                           ->  Seq Scan on event e  (cost=0.00..89.48 rows=1248 width=246) (actual time=0.005..0.481 rows=1248 loops=1)
                           ->  Hash  (cost=1.14..1.14 rows=11 width=82) (actual time=0.059..0.059 rows=11 loops=1)
                                 ->  Seq Scan on premises p  (cost=0.00..1.14 rows=11 width=82) (actual time=0.004..0.020 rows=11 loops=1)
                                       Filter: (logicaldel = 'N'::bpchar)
         ->  Index Scan using severity_pk on severity s  (cost=0.00..0.27 rows=1 width=49) (actual time=0.007..0.009 rows=1 loops=1248)
               Index Cond: (e.severitykey = s.severitykey)
   ->  Seq Scan on lognote ln1  (cost=0.00..15552.67 rows=1195 width=175) (actual time=1.173..440.695 rows=1244 loops=1248)
         Filter: ((logicaldel = 'N'::bpchar) AND (subplan))
         SubPlan
           ->  Limit  (cost=4.30..8.58 rows=1 width=34) (actual time=0.171..0.171 rows=1 loops=2982720)
                 InitPlan
                   ->  GroupAggregate  (cost=0.00..4.30 rows=1 width=110) (actual time=0.089..0.089 rows=1 loops=2982720)
                         ->  Index Scan using lognote_pk on lognote  (cost=0.00..4.28 rows=1 width=110) (actual time=0.086..0.087 rows=1 loops=2982720)
                               Index Cond: ((clientkey = $0) AND (premiseskey = $1) AND (logkey = $2))
                               Filter: ((logicaldel = 'N'::bpchar) AND ((lognotetext ~~ '_%;%'::text) OR (lognotetext ~~ '_%has modified Respond Status to%'::text)))
                 ->  Index Scan using lognote_pk on lognote  (cost=0.00..4.28 rows=1 width=34) (actual time=0.170..0.170 rows=1 loops=2982720)
                       Index Cond: ((clientkey = $0) AND (premiseskey = $1) AND (logkey = $2))
                       Filter: ((logicaldel = 'N'::bpchar) AND (lognotetime = $3))
 Total runtime: 550712.393 ms
(31 rows)


Either side of the left join runs quite fast independently. (The full
query also runs well when made into an inner join, but that's not the
logic I want.) The biggest difference between running each side
independently and together in the left join is that this line in the plan for
the right side of the left join:

->  Index Scan using lognote_pk on lognote  (cost=0.00..4.28 rows=1 width=110) (actual time=0.086..0.087 rows=1 loops=2982720)

...becomes this line when run independently:

->  Index Scan using lognote_pk on lognote  (cost=0.00..4.28 rows=1 width=110) (actual time=0.086..0.087 rows=1 loops=2390)

That's quite a few more loops in the left join. Am I right to think that
it's looping so much because the analyzer is so far off when guessing the
rows for the left side of the join (1 vs. 1248)? Or is there something
else going on? I've tried bumping up analyze stats on a few columns, but
I'm not too sure how to spot which columns it might help with and, sure
enough, it didn't help.
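
For reference, this is the sort of thing I tried (the column choice here is
just an example, since I'm not sure which columns actually matter):

alter table lognote alter column logkey set statistics 1000; -- example column only
analyze lognote; -- re-gather stats so the planner sees the new target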


The actual query:

select *
from
   (
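     -- lj: events joined to their premises, severity, and log rows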
     select *
     from
       event e,
       severity s,
       premises p,
       log
     where
       p.clientkey = e.clientkey and
       p.premiseskey = e.premiseskey and
       p.logicaldel = 'N' and
       log.logicaldel = 'N' and
       e.clientkey = log.clientkey and
       e.premiseskey = log.premiseskey and
       e.eventkey = log.eventkey and
       e.severitykey = s.severitykey
   ) lj
   left join
   (
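     -- ln: for each log, the earliest lognote whose text matches the
     -- responder patterns, with a responder id parsed out of that text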
     select
       clientkey, premiseskey, logkey, lognotetime, logicaldel,
       case
         when (case when instr(lognotetext,';') = 0
                    then instr(lognotetext,' has modified')
                    else instr(lognotetext,';') end) = 0
           then NULL
         else substr(lognotetext, 1,
                (case when instr(lognotetext,';') = 0
                      then instr(lognotetext,' has modified')
                      else instr(lognotetext,';') end) - 1)
       end as responderid
        from lognote ln1
        where
          logicaldel = 'N' and
          lognotekey in
          (
            select lognotekey
            from lognote
            where
              logicaldel = 'N' and
              clientkey = ln1.clientkey and
              premiseskey = ln1.premiseskey and
              logkey = ln1.logkey and
              lognotetime =
                (
                  select min(lognotetime)
                  from lognote
                  where
                     logicaldel = 'N' and
                     (
                       lognotetext like '_%;%' or
                       lognotetext like '_%has modified Respond Status to%'
                     ) and
                     clientkey = ln1.clientkey and
                     premiseskey = ln1.premiseskey and
                     logkey = ln1.logkey
                   group by clientkey, premiseskey, logkey
                 )
              order by lognotekey limit 1
             )
   ) ln on
   (
     lj.logkey = ln.logkey
   )


The instr() calls use this version of instr:
http://www.postgresql.org/docs/8.2/interactive/plpgsql-porting.html#PLPGSQL-PORTING-APPENDIX
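
For reference, the two-argument form boils down to something like this
(just a sketch; the appendix defines the full Oracle-compatible overload set):

-- Sketch of instr(string, substring) only: position() already returns
-- the 1-based offset of the first match, or 0 if there is none.
create function instr(str text, patt text) returns integer as $$
begin
    return position(patt in str);
end;
$$ language plpgsql strict immutable;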



The relevant schema:

                          Table "public.event"
      Column     |            Type             |       Modifiers
----------------+-----------------------------+------------------------
  clientkey      | character(30)               | not null
  premiseskey    | character(30)               | not null
  eventkey       | character(30)               | not null
  severitykey    | character(30)               |
Indexes:
     "event_pk" PRIMARY KEY, btree (clientkey, premiseskey, eventkey), tablespace "data"
Foreign-key constraints:
     "premisesevent" FOREIGN KEY (clientkey, premiseskey) REFERENCES premises(clientkey, premiseskey) DEFERRABLE INITIALLY DEFERRED

             Table "public.severity"
        Column       |     Type      | Modifiers
--------------------+---------------+-----------
  severitykey        | character(30) | not null
  severityname       | text          |
Indexes:
     "severity_pk" PRIMARY KEY, btree (severitykey), tablespace "data"

                        Table "public.premises"
      Column      |            Type             |      Modifiers
-----------------+-----------------------------+---------------------
  clientkey       | character(30)               | not null
  premiseskey     | character(30)               | not null
  logicaldel      | character(1)                | default 'N'::bpchar
Indexes:
     "premises_pk" PRIMARY KEY, btree (clientkey, premiseskey), tablespace "data"
Foreign-key constraints:
     "clientpremises" FOREIGN KEY (clientkey) REFERENCES client(clientkey) DEFERRABLE INITIALLY DEFERRED

                          Table "public.log"
      Column     |            Type             |      Modifiers
----------------+-----------------------------+---------------------
  clientkey      | character(30)               | not null
  premiseskey    | character(30)               | not null
  logkey         | character(30)               | not null
  logicaldel     | character(1)                | default 'N'::bpchar
  eventkey       | character(30)               |
Indexes:
     "log_pk" PRIMARY KEY, btree (clientkey, premiseskey, logkey), tablespace "data"
     "log_ak1" btree (clientkey, premiseskey, logtime, logkey), tablespace "data"
     "log_eventkey_idx" btree (eventkey), tablespace "data"
Foreign-key constraints:
     "premiseslog" FOREIGN KEY (clientkey, premiseskey) REFERENCES premises(clientkey, premiseskey) DEFERRABLE INITIALLY DEFERRED

                      Table "public.lognote"
    Column    |            Type             |      Modifiers
-------------+-----------------------------+---------------------
  clientkey   | character(30)               | not null
  premiseskey | character(30)               | not null
  logkey      | character(30)               | not null
  lognotekey  | character(30)               | not null
  logicaldel  | character(1)                | default 'N'::bpchar
  lognotetext | text                        |
  lognotetime | timestamp without time zone |
Indexes:
     "lognote_pk" PRIMARY KEY, btree (clientkey, premiseskey, logkey, lognotekey), tablespace "data"
     "lognotekey_idx" UNIQUE, btree (lognotekey), tablespace "data"
Foreign-key constraints:
     "log_lognote_fk1" FOREIGN KEY (clientkey, premiseskey, logkey) REFERENCES log(clientkey, premiseskey, logkey) DEFERRABLE INITIALLY DEFERRED



Any help would be appreciated!

Re: very slow left join

From
"Scott Marlowe"
Date:
On Fri, May 16, 2008 at 11:56 AM, Ben <bench@silentmedia.com> wrote:
> I've inherited an Oracle database that I'm porting to Postgres, and this has
> been going quite well until now. Unfortunately, I've found one view (a
> largish left join) that runs several orders of magnitude slower on Postgres
> than it did on Oracle.
>
> => select version();
>                                                 version
> ----------------------------------------------------------------------------------------------------------
>  PostgreSQL 8.2.4 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC)
> 4.1.1 20070105 (Red Hat 4.1.1-52)
> (1 row)

1: Update to 8.2.7.  It's pretty painless, and who knows what
performance bugs you might be fighting that you don't really need to be.

> After analyzing the database, the explain analyze output for the query is:
>
>  Nested Loop Left Join  (cost=133.51..15846.99 rows=1 width=312) (actual
> time=109.131..550711.374 rows=1248 loops=1)
>   Join Filter: (log.logkey = ln.logkey)
>   ->  Nested Loop  (cost=133.51..267.44 rows=1 width=306) (actual
> time=15.316..74.074 rows=1248 loops=1)
SNIP
>  Total runtime: 550712.393 ms

Just for giggles, try running the query like so:

set enable_nestloop = off;
explain analyze ...

and see what happens.  I'm guessing that the nested loops are bad choices here.
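
The setting sticks for your session, so if you're testing interactively you
can flip it back afterwards; this is just the standard pattern, nothing
specific to your view:

set enable_nestloop = off;   -- discourage nested loops for this session
explain analyze ... ;        -- your query here
reset enable_nestloop;       -- back to the default planner behavior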

>        (case when instr(lognotetext,';') = 0 then instr(lognotetext,' has
> modified')
>              else instr(lognotetext,';') end) = 0 then NULL

Try creating indexes on the functions above, and make sure you're
running the db in the C locale if you can. Note you may need to dump /
initdb --locale=C / reload your data if you're not in the C locale
already.  text_pattern_ops may be applicable here, but I'm not sure
how to use it in the above functions.
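
For what it's worth, a text_pattern_ops index is declared like this. A
sketch only: your LIKE patterns start with a wildcard ('_%...'), which a
btree index probably can't match anyway:

-- Helps left-anchored LIKE patterns in non-C locales; likely not usable
-- for patterns that begin with a wildcard, shown here for completeness.
create index lognote_text_pattern_idx on lognote (lognotetext text_pattern_ops);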

>                         Table "public.event"
>     Column     |            Type             |       Modifiers
> ----------------+-----------------------------+------------------------
>  clientkey      | character(30)               | not null
>  premiseskey    | character(30)               | not null
>  eventkey       | character(30)               | not null
>  severitykey    | character(30)               |

Do these really need to be character and not varchar?  varchar / text
are better optimized in pgsql, and character columns often need to be cast
anyway, so you might as well start with varchar.  Unless you REALLY
need padding in your db, avoid char(x).

Don't see anything else, but who knows what someone else might see.

Re: very slow left join

From
Ben
Date:
On Fri, 16 May 2008, Scott Marlowe wrote:

> Just for giggles, try running the query like so:
>
> set enable_nestloop = off;
> explain analyze ...
>
> and see what happens.  I'm guessing that the nested loops are bad choices here.

You guess correctly, sir! Doing so shaves 3 orders of magnitude off the
runtime. That's nice. :) But that brings up the question of why Postgres
thinks nested loops are the way to go. It would be handy if I could make
it guess correctly to begin with and didn't have to turn nested loops off
each time I run this.


>>                         Table "public.event"
>>     Column     |            Type             |       Modifiers
>> ----------------+-----------------------------+------------------------
>>  clientkey      | character(30)               | not null
>>  premiseskey    | character(30)               | not null
>>  eventkey       | character(30)               | not null
>>  severitykey    | character(30)               |
>
> Do these really need to be character and not varchar?  varchar / text
> are better optimized in pgsql, and character often need to be cast
> anyway, so you might as well start with varchar.  Unless you REALLY
> need padding in your db, avoid char(x).

Unfortunately, the people who created this database made all keys 30
character strings, and we're not near a place in our release cycle where
we can fix that.

Re: very slow left join

From
"Scott Marlowe"
Date:
On Fri, May 16, 2008 at 12:21 PM, Ben <bench@silentmedia.com> wrote:
> On Fri, 16 May 2008, Scott Marlowe wrote:
>
>> Just for giggles, try running the query like so:
>>
>> set enable_nestloop = off;
>> explain analyze ...
>>
>> and see what happens.  I'm guessing that the nested loops are bad choices
>> here.
>
> You guess correctly, sir! Doing so shaves 3 orders of magnitude off the
> runtime. That's nice. :) But that brings up the question of why Postgres
> thinks nested loops are the way to go. It would be handy if I could make it
> guess correctly to begin with and didn't have to turn nested loops off each
> time I run this.

Well, I'm guessing that you aren't in locale=C and that the text
functions in your query aren't indexed.  Try creating an index on them,
something like:

create index abc_txtfield_func on mytable (substring(textfield,1,5));

etc., and see if that helps.

As for the char type, I totally understand the issue, having inherited
Oracle dbs before...

Re: very slow left join

From
Ben
Date:
On Fri, 16 May 2008, Scott Marlowe wrote:

> Well, I'm guessing that you aren't in locale=C and that the text

Correct, I am not. And my understanding is that by moving to the C locale,
I would lose utf8 validation, so I don't want to go there. Though, it's
news to me that I would get any kind of select performance boost with
locale=C. Why would it help?

> functions in your query aren't indexed.  Try creating an index on them
> something like:
>
> create index abc_txtfield_func on mytable (substring(textfield,1,5));
>
> etc and see if that helps.

It does not. :(

Re: very slow left join

From
Craig Ringer
Date:
Ben wrote:
> On Fri, 16 May 2008, Scott Marlowe wrote:
>
>> Well, I'm guessing that you aren't in locale=C and that the text
>
> Correct, I am not. And my understanding is that by moving to the C
> locale, I would lose utf8 validation, so I don't want to go there.
> Though, it's news to me that I would get any kind of select performance
> boost with locale=C. Why would it help?

As far as I know the difference is that in the "C" locale PostgreSQL can
use simple byte-ordinal-oriented rules for sorting, character access,
etc. It can ignore the possibility of a character being more than one
byte in size. It can also avoid having to consider pairs of characters
where the ordinality of the numeric byte value of the characters is not
the same as the ordinality of the characters in the locale (i.e., they
don't sort in byte-value order).

If I've understood it correctly (I don't use "C" locale databases
myself and I have not tested any of this), that means that two UTF-8
encoded strings stored in a "C" locale database might not compare how
you expect. They might sort in a different order to what you expect,
especially if one is a 2-byte or more char and the other is only 1 byte.
They might compare non-equal even though they contain the same sequence
of Unicode characters because one is in a decomposed form and one is in
a precomposed form. The database neither knows the encoding of the
strings nor cares about it; it's just treating them as byte sequences
without any interest in their meaning.

If you only ever work with 7-bit ASCII, that might be OK. Ditto if you
never rely on the database for text sorting and comparison.
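
As a quick sketch of the byte-order effect (untested, like the rest of
this, and the second ordering depends on your actual locale):

-- In the C locale comparison is by byte value, so ASCII uppercase sorts
-- before lowercase; a linguistic locale such as en_US interleaves them.
select s from (values ('apple'), ('Banana'), ('cherry')) as t(s) order by s;
-- C locale:     Banana, apple, cherry   ('B' is 0x42, 'a' is 0x61)
-- en_US locale: apple, Banana, cherry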

Someone please yell at me if I've mistaken something here.

--
Craig Ringer