Thread: array can be slow when joining?

array can be slow when joining?

From
Volodymyr Kostyrko
Date:
I don't clearly understand why this happens, but when I try join some
tables using arrays I end up with:

=# explain select count(*) from urls JOIN rules ON urls.tag && rules.tag;
                                       QUERY PLAN

---------------------------------------------------------------------------------------
  Aggregate  (cost=1356.27..1356.28 rows=1 width=0)
    ->  Nested Loop  (cost=20.33..1354.96 rows=523 width=0)
          ->  Seq Scan on rules  (cost=0.00..1.01 rows=1 width=37)
          ->  Bitmap Heap Scan on urls  (cost=20.33..1347.42 rows=523
width=29)
                Recheck Cond: (urls.tag && rules.tag)
                ->  Bitmap Index Scan on url_tag_g  (cost=0.00..20.20
rows=523 width=0)
                      Index Cond: (urls.tag && rules.tag)

Here tag is text[] with list of tags. Whole select takes 142 ms. It
drops down to 42 ms when I add some conditions that strip result table
to zero length.

What am I missing? Is there any other ways to overlap those ones? Or
should I find "any other way"?

--
Sphinx of black quartz judge my vow.

Re: array can be slow when joining?

From
Pavel Stehule
Date:
Hello

if you read a some longer field - longer than 2Kb, then PostgreSQL has
to read this value from a table_toast file. see
http://developer.postgresql.org/pgdocs/postgres/storage-toast.html -
and reading have to be slower than you don't read this field.

Regards

Pavel Stehule

2010/8/29 Volodymyr Kostyrko <c.kworr@gmail.com>:
> I don't clearly understand why this happens, but when I try join some tables
> using arrays I end up with:
>
> =# explain select count(*) from urls JOIN rules ON urls.tag && rules.tag;
>                                      QUERY PLAN
> ---------------------------------------------------------------------------------------
>  Aggregate  (cost=1356.27..1356.28 rows=1 width=0)
>   ->  Nested Loop  (cost=20.33..1354.96 rows=523 width=0)
>         ->  Seq Scan on rules  (cost=0.00..1.01 rows=1 width=37)
>         ->  Bitmap Heap Scan on urls  (cost=20.33..1347.42 rows=523
> width=29)
>               Recheck Cond: (urls.tag && rules.tag)
>               ->  Bitmap Index Scan on url_tag_g  (cost=0.00..20.20 rows=523
> width=0)
>                     Index Cond: (urls.tag && rules.tag)
>
> Here tag is text[] with list of tags. Whole select takes 142 ms. It drops
> down to 42 ms when I add some conditions that strip result table to zero
> length.
>
> What am I missing? Is there any other ways to overlap those ones? Or should
> I find "any other way"?
>
> --
> Sphinx of black quartz judge my vow.
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>