Thread: Very slow query

Very slow query

From
Rory Campbell-Lange
Date:
The following query on some small datasets takes over a second to run.
I'd be grateful for some help in understanding the explain output, and
to remake the code.

Looks like the sort is using up quite a bit of resources. I made an
index on boards using columns "b.n_type, b.n_id, b.t_name" but the index
was not recorded in explain analyze. (see "testindex" below).

I am using PostgreSQL 7.4.2 on i386-pc-linux-gnu (Debian). The query is
in a psql function (I've changed variables to hard coded integers
below). The db is vacuumed every night.

Thanks for any help;
Rory


SELECT DISTINCT
        b.n_id                           as id,
        b.n_type,
        CASE b.n_type WHEN 0 THEN 'personal'
                      WHEN 1 THEN 'private'
                      WHEN 2 THEN 'blog'
                      ELSE 'public'
                      END                as type,
        b.t_name                         as title
    FROM
        boards b, people p, objects o
    WHERE
         b.b_hidden = 'f'
         AND
         (
            b.n_type = 3
            OR
            b.n_creator = 71
            OR
            (   b.n_id = o.n_board_id
                AND
                o.n_creator = 71
                AND
                o.n_joined > 0
            )
         )
    ORDER BY
        b.n_type, b.n_id ASC, b.t_name;

trial=> \d boards
                                         Table "public.boards"
    Column     |            Type             |                        Modifiers
---------------+-----------------------------+----------------------------------------------------------
 n_id          | integer                     | not null default nextval('public.boards_n_id_seq'::text)
 b_hidden      | boolean                     | default false
 dt_created    | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
 dt_modified   | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
 t_mobile      | character varying(15)       |
 t_email       | character varying(50)       |
 n_creator     | integer                     | not null
 n_type        | smallint                    | not null default 0
 t_name        | character varying(100)      | not null
 t_description | character varying(500)      |
 n_id_photo    | integer                     |
 n_bg_colour   | integer                     | default 0
Indexes:
    "boards_pkey" primary key, btree (n_id)
    "boards_t_email_key" unique, btree (t_email)
    "boards_t_mobile_key" unique, btree (t_mobile)
    "testindex" btree (n_type, n_id, t_name)
Foreign-key constraints:
    "$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE
    "$2" FOREIGN KEY (n_id_photo) REFERENCES photo(n_id) ON UPDATE CASCADE ON DELETE SET NULL
Triggers:
    tr_update_modified_time BEFORE UPDATE ON boards FOR EACH ROW EXECUTE PROCEDURE fn_update_modified_time()

trial=> \d people
                                           Table "public.people"
      Column       |            Type             |                        Modifiers
-------------------+-----------------------------+----------------------------------------------------------
 n_id              | integer                     | not null default nextval('public.people_n_id_seq'::text)
 n_object_id       | integer                     |
 n_objects_counter | integer                     | default 0
 b_hidden          | boolean                     | default false
 dt_created        | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
 dt_modified       | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
 t_nickname        | character varying(20)       | not null
 t_firstname       | character varying(20)       | not null
 t_surname         | character varying(25)       | not null
 t_mobile          | character varying(15)       |
 t_email           | character varying(50)       |
 t_password        | character varying(15)       | not null
 b_registered      | boolean                     | default false
 n_email_status    | smallint                    | default 0
 n_txt_status      | smallint                    | default 0
 b_work_hours      | boolean                     | default false
 b_confirmations   | boolean                     | default true
Indexes:
    "people_pkey" primary key, btree (n_id)
    "people_t_email_key" unique, btree (t_email)
    "people_t_mobile_key" unique, btree (t_mobile)
    "people_t_nickname_key" unique, btree (t_nickname)
Foreign-key constraints:
    "object_chk" FOREIGN KEY (n_object_id) REFERENCES objects(n_id) ON UPDATE CASCADE ON DELETE SET NULL
Triggers:
    tr_update_modified_time BEFORE UPDATE ON people FOR EACH ROW EXECUTE PROCEDURE fn_update_modified_time()

trial=> \d objects
                                         Table "public.objects"
    Column     |            Type             |                         Modifiers
---------------+-----------------------------+-----------------------------------------------------------
 n_id          | integer                     | not null default nextval('public.objects_n_id_seq'::text)
 t_text_id     | character varying(25)       | not null
 b_hidden      | boolean                     | default false
 dt_created    | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
 dt_modified   | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
 n_creator     | integer                     | not null
 n_type        | smallint                    | not null default 0
 t_name        | character varying(100)      | not null
 t_description | text                        |
 t_postcode    | character varying(10)       |
 n_id_photo    | integer                     |
 n_board_id    | integer                     | not null
 n_joined      | smallint                    |
 b_trigger     | boolean                     | default true
Indexes:
    "objects_pkey" primary key, btree (n_id)
Foreign-key constraints:
    "$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE
    "$2" FOREIGN KEY (n_id_photo) REFERENCES photo(n_id) ON UPDATE CASCADE ON DELETE SET NULL
    "$3" FOREIGN KEY (n_board_id) REFERENCES boards(n_id) ON UPDATE CASCADE ON DELETE CASCADE



 Unique  (cost=3677.26..3711.48 rows=102 width=18) (actual time=12566.422..13045.404 rows=4 loops=1)
   ->  Sort  (cost=3677.26..3684.10 rows=2738 width=18) (actual time=12566.413..12809.928 rows=158254 loops=1)
         Sort Key: b.n_type, b.n_id, b.t_name, CASE WHEN (b.n_type = 0) THEN 'personal'::text WHEN (b.n_type = 1) THEN
'private'::te
xt WHEN (b.n_type = 2) THEN 'blog'::text ELSE 'public'::text END
         ->  Nested Loop  (cost=3442.79..3520.93 rows=2738 width=18) (actual time=0.244..1052.180 rows=158254 loops=1)
               ->  Seq Scan on people p  (cost=0.00..2.67 rows=67 width=0) (actual time=0.012..0.709 rows=67 loops=1)
               ->  Materialize  (cost=3442.79..3443.20 rows=41 width=18) (actual time=0.005..6.528 rows=2362 loops=67)
                     ->  Nested Loop  (cost=3.27..3442.79 rows=41 width=18) (actual time=0.216..273.709 rows=2362
loops=1)
                           Join Filter: ((("inner".n_id = "outer".n_board_id) OR ("inner".n_type = 3) OR
("inner".n_creator= 71)) A 
ND (("outer".n_creator = 71) OR ("inner".n_type = 3) OR ("inner".n_creator = 71)) AND (("outer".n_joined > 0) OR
("inner".n_type= 3 
) OR ("inner".n_creator = 71)))
                           ->  Seq Scan on objects o  (cost=0.00..27.87 rows=787 width=10) (actual time=0.005..1.372
rows=787loops= 
1)
                           ->  Materialize  (cost=3.27..4.29 rows=102 width=22) (actual time=0.001..0.105 rows=102
loops=787)
                                 ->  Seq Scan on boards b  (cost=0.00..3.27 rows=102 width=22) (actual
time=0.017..0.352rows=102 lo 
ops=1)
                                       Filter: (b_hidden = false)
 Total runtime: 13051.065 ms



--
Rory Campbell-Lange
<rory@campbell-lange.net>
<www.campbell-lange.net>

Re: Very slow query

From
Rory Campbell-Lange
Date:
Sorry for replying to my own post, but I'm anxious for an answer. Should
I provide other information?

Thanks
Rory

On 10/05/04, Rory Campbell-Lange (rory@campbell-lange.net) wrote:
> The following query on some small datasets takes over a second to run.
> I'd be grateful for some help in understanding the explain output, and
> to remake the code.
>
> Looks like the sort is using up quite a bit of resources. I made an
> index on boards using columns "b.n_type, b.n_id, b.t_name" but the index
> was not recorded in explain analyze. (see "testindex" below).
>
> I am using PostgreSQL 7.4.2 on i386-pc-linux-gnu (Debian). The query is
> in a psql function (I've changed variables to hard coded integers
> below). The db is vacuumed every night.
>
> Thanks for any help;
> Rory
>
>
> SELECT DISTINCT
>         b.n_id                           as id,
>         b.n_type,
>         CASE b.n_type WHEN 0 THEN 'personal'
>                       WHEN 1 THEN 'private'
>                       WHEN 2 THEN 'blog'
>                       ELSE 'public'
>                       END                as type,
>         b.t_name                         as title
>     FROM
>         boards b, people p, objects o
>     WHERE
>          b.b_hidden = 'f'
>          AND
>          (
>             b.n_type = 3
>             OR
>             b.n_creator = 71
>             OR
>             (   b.n_id = o.n_board_id
>                 AND
>                 o.n_creator = 71
>                 AND
>                 o.n_joined > 0
>             )
>          )
>     ORDER BY
>         b.n_type, b.n_id ASC, b.t_name;
>
> trial=> \d boards
>                                          Table "public.boards"
>     Column     |            Type             |                        Modifiers
> ---------------+-----------------------------+----------------------------------------------------------
>  n_id          | integer                     | not null default nextval('public.boards_n_id_seq'::text)
>  b_hidden      | boolean                     | default false
>  dt_created    | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
>  dt_modified   | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
>  t_mobile      | character varying(15)       |
>  t_email       | character varying(50)       |
>  n_creator     | integer                     | not null
>  n_type        | smallint                    | not null default 0
>  t_name        | character varying(100)      | not null
>  t_description | character varying(500)      |
>  n_id_photo    | integer                     |
>  n_bg_colour   | integer                     | default 0
> Indexes:
>     "boards_pkey" primary key, btree (n_id)
>     "boards_t_email_key" unique, btree (t_email)
>     "boards_t_mobile_key" unique, btree (t_mobile)
>     "testindex" btree (n_type, n_id, t_name)
> Foreign-key constraints:
>     "$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE
>     "$2" FOREIGN KEY (n_id_photo) REFERENCES photo(n_id) ON UPDATE CASCADE ON DELETE SET NULL
> Triggers:
>     tr_update_modified_time BEFORE UPDATE ON boards FOR EACH ROW EXECUTE PROCEDURE fn_update_modified_time()
>
> trial=> \d people
>                                            Table "public.people"
>       Column       |            Type             |                        Modifiers
> -------------------+-----------------------------+----------------------------------------------------------
>  n_id              | integer                     | not null default nextval('public.people_n_id_seq'::text)
>  n_object_id       | integer                     |
>  n_objects_counter | integer                     | default 0
>  b_hidden          | boolean                     | default false
>  dt_created        | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
>  dt_modified       | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
>  t_nickname        | character varying(20)       | not null
>  t_firstname       | character varying(20)       | not null
>  t_surname         | character varying(25)       | not null
>  t_mobile          | character varying(15)       |
>  t_email           | character varying(50)       |
>  t_password        | character varying(15)       | not null
>  b_registered      | boolean                     | default false
>  n_email_status    | smallint                    | default 0
>  n_txt_status      | smallint                    | default 0
>  b_work_hours      | boolean                     | default false
>  b_confirmations   | boolean                     | default true
> Indexes:
>     "people_pkey" primary key, btree (n_id)
>     "people_t_email_key" unique, btree (t_email)
>     "people_t_mobile_key" unique, btree (t_mobile)
>     "people_t_nickname_key" unique, btree (t_nickname)
> Foreign-key constraints:
>     "object_chk" FOREIGN KEY (n_object_id) REFERENCES objects(n_id) ON UPDATE CASCADE ON DELETE SET NULL
> Triggers:
>     tr_update_modified_time BEFORE UPDATE ON people FOR EACH ROW EXECUTE PROCEDURE fn_update_modified_time()
>
> trial=> \d objects
>                                          Table "public.objects"
>     Column     |            Type             |                         Modifiers
> ---------------+-----------------------------+-----------------------------------------------------------
>  n_id          | integer                     | not null default nextval('public.objects_n_id_seq'::text)
>  t_text_id     | character varying(25)       | not null
>  b_hidden      | boolean                     | default false
>  dt_created    | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
>  dt_modified   | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone
>  n_creator     | integer                     | not null
>  n_type        | smallint                    | not null default 0
>  t_name        | character varying(100)      | not null
>  t_description | text                        |
>  t_postcode    | character varying(10)       |
>  n_id_photo    | integer                     |
>  n_board_id    | integer                     | not null
>  n_joined      | smallint                    |
>  b_trigger     | boolean                     | default true
> Indexes:
>     "objects_pkey" primary key, btree (n_id)
> Foreign-key constraints:
>     "$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE
>     "$2" FOREIGN KEY (n_id_photo) REFERENCES photo(n_id) ON UPDATE CASCADE ON DELETE SET NULL
>     "$3" FOREIGN KEY (n_board_id) REFERENCES boards(n_id) ON UPDATE CASCADE ON DELETE CASCADE
>
>
>
>  Unique  (cost=3677.26..3711.48 rows=102 width=18) (actual time=12566.422..13045.404 rows=4 loops=1)
>    ->  Sort  (cost=3677.26..3684.10 rows=2738 width=18) (actual time=12566.413..12809.928 rows=158254 loops=1)
>          Sort Key: b.n_type, b.n_id, b.t_name, CASE WHEN (b.n_type = 0) THEN 'personal'::text WHEN (b.n_type = 1)
THEN'private'::te 
> xt WHEN (b.n_type = 2) THEN 'blog'::text ELSE 'public'::text END
>          ->  Nested Loop  (cost=3442.79..3520.93 rows=2738 width=18) (actual time=0.244..1052.180 rows=158254
loops=1)
>                ->  Seq Scan on people p  (cost=0.00..2.67 rows=67 width=0) (actual time=0.012..0.709 rows=67 loops=1)
>                ->  Materialize  (cost=3442.79..3443.20 rows=41 width=18) (actual time=0.005..6.528 rows=2362
loops=67)
>                      ->  Nested Loop  (cost=3.27..3442.79 rows=41 width=18) (actual time=0.216..273.709 rows=2362
loops=1)
>                            Join Filter: ((("inner".n_id = "outer".n_board_id) OR ("inner".n_type = 3) OR
("inner".n_creator= 71)) A 
> ND (("outer".n_creator = 71) OR ("inner".n_type = 3) OR ("inner".n_creator = 71)) AND (("outer".n_joined > 0) OR
("inner".n_type= 3 
> ) OR ("inner".n_creator = 71)))
>                            ->  Seq Scan on objects o  (cost=0.00..27.87 rows=787 width=10) (actual time=0.005..1.372
rows=787loops= 
> 1)
>                            ->  Materialize  (cost=3.27..4.29 rows=102 width=22) (actual time=0.001..0.105 rows=102
loops=787)
>                                  ->  Seq Scan on boards b  (cost=0.00..3.27 rows=102 width=22) (actual
time=0.017..0.352rows=102 lo 
> ops=1)
>                                        Filter: (b_hidden = false)
>  Total runtime: 13051.065 ms
>
>
>
> --
> Rory Campbell-Lange
> <rory@campbell-lange.net>
> <www.campbell-lange.net>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
>

--
Rory Campbell-Lange
<rory@campbell-lange.net>
<www.campbell-lange.net>

Re: Very slow query

From
Nick Barr
Date:
[snip]
Rory Campbell-Lange wrote:
>>
>>
>>SELECT DISTINCT
>>        b.n_id                           as id,
>>        b.n_type,
>>        CASE b.n_type WHEN 0 THEN 'personal'
>>                      WHEN 1 THEN 'private'
>>                      WHEN 2 THEN 'blog'
>>                      ELSE 'public'
>>                      END                as type,
>>        b.t_name                         as title
>>    FROM
>>        boards b, people p, objects o
>>    WHERE
>>         b.b_hidden = 'f'
>>         AND
>>         (
>>            b.n_type = 3
>>            OR
>>            b.n_creator = 71
>>            OR
>>            (   b.n_id = o.n_board_id
>>                AND
>>                o.n_creator = 71
>>                AND
>>                o.n_joined > 0
>>            )
>>         )
>>    ORDER BY
>>        b.n_type, b.n_id ASC, b.t_name;
>>

You don't seem to be joining the people table to the boards or objects
table. In fact the people table is never referenced anywhere but the
FROM clause. This might be why it is not running so well. People seems
to be referenced by both boards and objects, so does there need to be a
join between all 3 tables?

[snip]

>>    "$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE

[snip]

>>    "$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE

It is this "missing" join which seems to cause 158254 rows to appear in
this nested loop, when PG only reckons it is going to see 2738 rows

>>         ->  Nested Loop  (cost=3442.79..3520.93 rows=2738 width=18) (actual time=0.244..1052.180 rows=158254
loops=1)

I cant remember what the correct term is, but basically all rows from
the seq scan combines with all rows from the materialize, which is what
is causing the 150,000 odd rows to appear (is that called a cross
product anyone?)

i.e

67 rows from the seq x 2362 rows from the materialize = 158254 output
rows for the nested loop to chug through.

>>               ->  Seq Scan on people p  (cost=0.00..2.67 rows=67 width=0) (actual time=0.012..0.709 rows=67 loops=1)
>>               ->  Materialize  (cost=3442.79..3443.20 rows=41 width=18) (actual time=0.005..6.528 rows=2362
loops=67)

I reckon fix the unconstrained join and your query might run a little
faster.

 >>                     ->  Nested Loop  (cost=3.27..3442.79 rows=41
width=18) (actual time=0.216..273.709 rows=2362 loops=1)

The estimate for this nested loop seems a little off, note the
guesstimate in the first set of brackets of 41, and compare with the
actual result in the second set of 2362. Have you vacuum analyzed recently?

HTH


Nick


Re: Very slow query

From
Paul Thomas
Date:
On 10/05/2004 22:30 Rory Campbell-Lange wrote:
> Sorry for replying to my own post, but I'm anxious for an answer. Should
> I provide other information?

Look carefully at your column types. I can see several smallint columns in
there WHERE clause which are not expicitely typed as such.

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for
Business             |
| Computer Consultants         |
http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+

Re: Very slow query

From
Nick Barr
Date:
Rory Campbell-Lange wrote:

>Wow, this is an education! My php profiler now shows:
>
>0.02   db->db_board_listing C: /var/www/trial/php/db.php:175   M: 867024
>0.02     ob_start           C: /var/www/trial/php/db.php:238   M: 867024
>0.02     pg_exec            C: /var/www/trial/php/db.php:239   M: 908328
>1.24     pg_result_status   C: /var/www/trial/php/db.php:240   M: 908392
>1.24     ob_end_clean       C: /var/www/trial/php/db.php:241   M: 908416
>1.24     db->result_checker C: /var/www/trial/php/db.php:243   M: 867472
>1.24       is_resource      C: /var/www/trial/php/db.php:1317  M: 867472
>1.24     pg_numrows         C: /var/www/trial/php/db.php:248   M: 867568
>1.24     pg_fetch_array     C: /var/www/trial/php/db.php:250   M: 867600
>1.24     pg_fetch_array     C: /var/www/trial/php/db.php:250   M: 867760
>
>and now shows:
>
>C: /var/www/trial/php/core.php:151 M: 867040
> 0.02   db->db_board_listing C: /var/www/trial/php/db.php:175  M: 867040
> 0.02     ob_start           C: /var/www/trial/php/db.php:238  M: 867040
> 0.02     pg_exec            C: /var/www/trial/php/db.php:239  M: 908344
> 0.14     pg_result_status   C: /var/www/trial/php/db.php:240  M: 908408
> 0.14     ob_end_clean       C: /var/www/trial/php/db.php:241  M: 908432
> 0.14     db->result_checker C: /var/www/trial/php/db.php:243  M: 867488
> 0.14       is_resource      C: /var/www/trial/php/db.php:1317 M: 867488
> 0.14     pg_numrows         C: /var/www/trial/php/db.php:248  M: 867584
> 0.14     pg_fetch_array     C: /var/www/trial/php/db.php:250  M: 867616
> 0.14     pg_fetch_array     C: /var/www/trial/php/db.php:250  M: 867776
>
>This is a saving of 1.1 seconds!
>
>
>
Thats a bit better ;-)

>>On 10/05/2004 22:30 Rory Campbell-Lange wrote:
>>
>>
>>>Sorry for replying to my own post, but I'm anxious for an answer. Should
>>>I provide other information?
>>>
>>>
>>Look carefully at your column types. I can see several smallint columns in
>>there WHERE clause which are not expicitely typed as such.
>>
>>
>
>I'm not sure how to do this, Paul. do I do b.n_id::smallint ? Is
>smallint not implied?
>
>Thanks
>Rory
>
>
>
Not quite. Explicit casts are needed when you have any numbers in the
WHERE condition and the columns are not of type integer/int4. For
example I have tweaked your query.

WHERE
    b.b_hidden = 'f'
    AND
    (
        b.n_type = 3::smallint
        OR
        b.n_creator = 71
        OR
        (
            b.n_id = o.n_board_id
            AND
            o.n_creator = 71
            AND
            o.n_joined > 0::smallint
        )
)

Note that b.n_creator and o.n_creator do not need explicit casts because
they are both of type integer anyway. You could of course put them in
for clarity. PG only casts the numbers to integer's, and not to smallint
or bigint, which basically means it does not use any indexes on that
column. This is fixed in 7.4 I believe, which you seem to be running
anyway so you might not be affected.

HTH

Nick


Re: Very slow query

From
Rory Campbell-Lange
Date:
On 11/05/04, Nick Barr (nicky@chuckie.co.uk) wrote:
> Rory Campbell-Lange wrote:

> > > Look carefully at your column types. I can see several smallint
> > > columns in there WHERE clause which are not expicitely typed as
> > > such.

> > I'm not sure how to do this, Paul. do I do b.n_id::smallint ? Is
> > smallint not implied?

> Not quite. Explicit casts are needed when you have any numbers in the
> WHERE condition and the columns are not of type integer/int4. For
> example I have tweaked your query.
...
> Note that b.n_creator and o.n_creator do not need explicit casts because
> they are both of type integer anyway. You could of course put them in
> for clarity. PG only casts the numbers to integer's, and not to smallint
> or bigint, which basically means it does not use any indexes on that
> column. This is fixed in 7.4 I believe, which you seem to be running
> anyway so you might not be affected.

Mmm. Seems like I should try profiling with and without the cast. Maybe
there is little value in defining a column as a smallint, other than
checking the length of input.

Thanks again for your help,
Rory
--
Rory Campbell-Lange
<rory@campbell-lange.net>
<www.campbell-lange.net>

Re: Very slow query

From
Tom Lane
Date:
Rory Campbell-Lange <rory@campbell-lange.net> writes:
> The following query on some small datasets takes over a second to run.
> I'd be grateful for some help in understanding the explain output, and
> to remake the code.

I think the problem is you have an unconstrained join against "people
p", which is wasting a bunch of time generating duplicate records
(and then sorting them, and then discarding them in the DISTINCT step...)

            regards, tom lane