Thread: Query optimization path

Query optimization path

From

"Gaetano Mendola"

Date:

12 January 2003, 11:31:35

Hi all,

I have the following query:

SELECT count(1)
FROM providers p JOIN class_default cd    USING (id_provider)
                 JOIN user_data     ud    USING (id_class)
                 JOIN v_user_traffic ut   USING (id_user)
WHERE id_user_status in (4,5) AND
      p.company = 'XXXXX';

is not slow but I notice that if I do explain analyze with the table
reordered
inside the select in another way the cost change.

------------------- FIRST CASE --------------
explain analyze
SELECT count(1)
FROM providers p JOIN class_default cd    USING (id_provider)
                 JOIN user_data     ud    USING (id_class)
                 JOIN v_user_traffic ut   USING (id_user)
WHERE id_user_status in (4,5) AND
      p.company = 'SOL';

NOTICE:  QUERY PLAN:

Aggregate  (cost=9482.53..9482.53 rows=1 width=32) (actual
time=164.82..164.82 rows=1 loops=1)
  ->  Hash Join  (cost=145.89..9480.58 rows=782 width=32) (actual
time=77.29..164.16 rows=396 loops=1)
        ->  Hash Join  (cost=7.15..9232.71 rows=19870 width=12) (actual
time=1.67..152.21 rows=1170 loops=1)
              ->  Seq Scan on user_traffic u  (cost=0.00..8877.83 rows=19870
width=8) (actual time=0.23..145.39 rows=1170 loops=1)
              ->  Hash  (cost=6.52..6.52 rows=252 width=4) (actual
time=0.85..0.85 rows=0 loops=1)
                    ->  Seq Scan on contracts c  (cost=0.00..6.52 rows=252
width=4) (actual time=0.04..0.52 rows=181 loops=1)
        ->  Hash  (cost=138.05..138.05 rows=276 width=20) (actual
time=8.88..8.88 rows=0 loops=1)
              ->  Nested Loop  (cost=4.02..138.05 rows=276 width=20) (actual
time=1.53..7.87 rows=520 loops=1)
                    ->  Hash Join  (cost=4.02..5.29 rows=1 width=12) (actual
time=0.98..1.14 rows=1 loops=1)
                          ->  Seq Scan on class_default cd  (cost=0.00..1.18
rows=18 width=8) (actual time=0.02..0.09 rows=18 loops=1)
                          ->  Hash  (cost=4.01..4.01 rows=1 width=4) (actual
time=0.21..0.21 rows=0 loops=1)
                                ->  Seq Scan on providers p
(cost=0.00..4.01 rows=1 width=4) (actual time=0.19..0.19 rows=1 loops=1)
                    ->  Index Scan using idx_user_data_class on user_data ud
(cost=0.00..127.99 rows=382 width=8) (actual time=0.52..5.32 rows=520
loops=1)
Total runtime: 165.23 msec


------------------- SECOND CASE --------------
explain analyze
SELECT count(1)
FROM user_data ud JOIN v_user_traffic ut USING (id_user)
                  JOIN class_default cd USING (id_class)
                  JOIN providers p USING (id_provider)
WHERE id_user_status in (4,5) and  p.company = 'SOL';

NOTICE:  QUERY PLAN:

Aggregate  (cost=10194.82..10194.82 rows=1 width=32) (actual
time=210.09..210.09 rows=1 loops=1)
  ->  Hash Join  (cost=324.95..10194.38 rows=174 width=32) (actual
time=123.18..209.47 rows=396 loops=1)
        ->  Hash Join  (cost=320.94..10117.81 rows=14076 width=28) (actual
time=54.17..206.00 rows=1167 loops=1)
              ->  Hash Join  (cost=319.71..9870.25 rows=14076 width=20)
(actual time=53.10..199.45 rows=1167 loops=1)
                    ->  Hash Join  (cost=7.15..9232.71 rows=19870 width=12)
(actual time=1.61..142.42 rows=1170 loops=1)
                          ->  Seq Scan on user_traffic u
(cost=0.00..8877.83 rows=19870 width=8) (actual time=0.23..135.88 rows=1170
loops=1)
                          ->  Hash  (cost=6.52..6.52 rows=252 width=4)
(actual time=0.81..0.81 rows=0 loops=1)
                                ->  Seq Scan on contracts c
(cost=0.00..6.52 rows=252 width=4) (actual time=0.05..0.51 rows=181 loops=1)
                    ->  Hash  (cost=300.15..300.15 rows=4966 width=8)
(actual time=50.89..50.89 rows=0 loops=1)
                          ->  Seq Scan on user_data ud  (cost=0.00..300.15
rows=4966 width=8) (actual time=0.27..42.02 rows=4978 loops=1)
              ->  Hash  (cost=1.18..1.18 rows=18 width=8) (actual
time=0.33..0.33 rows=0 loops=1)
                    ->  Seq Scan on class_default cd  (cost=0.00..1.18
rows=18 width=8) (actual time=0.25..0.30 rows=18 loops=1)
        ->  Hash  (cost=4.01..4.01 rows=1 width=4) (actual time=0.58..0.58
rows=0 loops=1)
              ->  Seq Scan on providers p  (cost=0.00..4.01 rows=1 width=4)
(actual time=0.56..0.56 rows=1 loops=1)
Total runtime: 210.41 msec



I was believing that postgres before to do the query choose the combination
that cost less,
and in this case ( less then 11 table, I have geqo_threshold = 11 ) in an
exaustive way.
Why I obtain two different cost ? Note also that in the first case postgres
use and index
and not in the second.


Ciao
Gaetano.

Re: Query optimization path

From

Bruno Wolff III

Date:

12 January 2003, 11:49:01

On Sun, Jan 12, 2003 at 17:31:34 +0100,
  Gaetano Mendola <mendola@bigfoot.com> wrote:
> Hi all,
>
> I have the following query:
>
> SELECT count(1)
> FROM providers p JOIN class_default cd    USING (id_provider)
>                  JOIN user_data     ud    USING (id_class)
>                  JOIN v_user_traffic ut   USING (id_user)
> WHERE id_user_status in (4,5) AND
>       p.company = 'XXXXX';
>
> is not slow but I notice that if I do explain analyze with the table
> reordered
> inside the select in another way the cost change.

Using "JOIN" fixes the order that tables are joined in. If you are
just doing inner joins, then you probably don't want to use the "JOIN"
keyword. The exception being when there are so many tables being joined
that you want to manually specify at least some of the join ordering in
order to help out the planner.

Outer (and left and right) joins are generally not associative, so the
"JOIN" keyword being required for those cases isn't generally going to
be a problem.

Re: Query optimization path

From

"Gaetano Mendola"

Date:

12 January 2003, 12:07:12

"Bruno Wolff III" <bruno@wolff.to> wrote in message
news:20030112170108.GB31080@wolff.to...
>   Gaetano Mendola <mendola@bigfoot.com> wrote:
> > Hi all,
> >
> > I have the following query:
> >
> > SELECT count(1)
> > FROM providers p JOIN class_default cd    USING (id_provider)
> >                  JOIN user_data     ud    USING (id_class)
> >                  JOIN v_user_traffic ut   USING (id_user)
> > WHERE id_user_status in (4,5) AND
> >       p.company = 'XXXXX';
> >
> > is not slow but I notice that if I do explain analyze with the table
> > reordered
> > inside the select in another way the cost change.

> Using "JOIN" fixes the order that tables are joined in. If you are
> just doing inner joins, then you probably don't want to use the "JOIN"
> keyword. The exception being when there are so many tables being joined
> that you want to manually specify at least some of the join ordering in
> order to help out the planner.

And where is written this behavior ? Is it SQL standard ?


Ciao
Gaetano.

Re: Query optimization path

From

Stephan Szabo

Date:

12 January 2003, 12:28:34

On Sun, 12 Jan 2003, Gaetano Mendola wrote:

> "Bruno Wolff III" <bruno@wolff.to> wrote in message
> news:20030112170108.GB31080@wolff.to...
> >   Gaetano Mendola <mendola@bigfoot.com> wrote:
> > > Hi all,
> > >
> > > I have the following query:
> > >
> > > SELECT count(1)
> > > FROM providers p JOIN class_default cd    USING (id_provider)
> > >                  JOIN user_data     ud    USING (id_class)
> > >                  JOIN v_user_traffic ut   USING (id_user)
> > > WHERE id_user_status in (4,5) AND
> > >       p.company = 'XXXXX';
> > >
> > > is not slow but I notice that if I do explain analyze with the table
> > > reordered
> > > inside the select in another way the cost change.
>
> > Using "JOIN" fixes the order that tables are joined in. If you are
> > just doing inner joins, then you probably don't want to use the "JOIN"
> > keyword. The exception being when there are so many tables being joined
> > that you want to manually specify at least some of the join ordering in
> > order to help out the planner.
>
> And where is written this behavior ? Is it SQL standard ?

http://www.postgresql.org/idocs/index.php?explicit-joins.html

The standard generally has nothing to say about optimization.

Re: Query optimization path

From

Bruno Wolff III

Date:

12 January 2003, 12:41:41

On Sun, Jan 12, 2003 at 18:07:14 +0100,
  Gaetano Mendola <mendola@bigfoot.com> wrote:
> "Bruno Wolff III" <bruno@wolff.to> wrote in message
>
> > Using "JOIN" fixes the order that tables are joined in. If you are
> > just doing inner joins, then you probably don't want to use the "JOIN"
> > keyword. The exception being when there are so many tables being joined
> > that you want to manually specify at least some of the join ordering in
> > order to help out the planner.
>
> And where is written this behavior ? Is it SQL standard ?

The SQL standard isn't going to address performance, just results.

Look in section 10.3 of the User's Guide. This is under Performance Tips
and Controlling the Planner with Explicit JOIN Clauses.

Case Studio Postgresql

From

Vida Luz Arista

Date:

14 January 2003, 13:10:43

Hi,

I am analyzing to Postgresql for e-goverment, nevertheless I need to know
in that companies, country and so that aims this being used this data
base.

Where I can Find this information?

Thanks in Advanced,

Regards.

Re: Case Studio Postgresql

From

Jyry Kuukkanen

Date:

14 January 2003, 13:17:59

Hello Vida Luz Arista

I am not quite sure what do you mean by your message.

Do you want links or such of companies or goverments who are currently
using PostgreSQL?

Cheers,
--jq


On Tue, 14 Jan 2003, Vida Luz Arista wrote:

> Hi,
>
> I am analyzing to Postgresql for e-goverment, nevertheless I need to know
> in that companies, country and so that aims this being used this data
> base.
>
> Where I can Find this information?
>
> Thanks in Advanced,
>
> Regards.
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>

--
--Jyry
"Pornoteollisuudessa liikkuu sadat miljoonat. Sosiaalipornossa vain kymmenet.
Karpolla on vielä asiaa."
C:-(    C:-/    C========8-O    C8-/    C:-(

Re: Case Studio Postgresql

From

"David F. Skoll"

Date:

14 January 2003, 13:27:17

On Tue, 14 Jan 2003, Vida Luz Arista wrote:

> I am analyzing to Postgresql for e-goverment, nevertheless I need to know
> in that companies, country and so that aims this being used this data
> base.

My anti-spam solution at http://www.canit.ca/ is built around
PostgreSQL.  The largest installation is on a couple of Linux servers
that serve about 10,000 e-mail addresses, and the daily message volume
is probably on the order of 80-100K messages.  PostgreSQL has
performed admirably.

--
David.

Re: Case Studio Postgresql

From

Tim Ellis

Date:

14 January 2003, 13:36:24

On Tue, 2003-01-14 at 13:20, Vida Luz Arista wrote:
> I am analyzing to Postgresql for e-goverment, nevertheless I need to know
> in that companies, country and so that aims this being used this data
> base.
>
> Where I can Find this information?

Hola,

There are case studies on the PostgreSQL website:
  http://advocacy.postgresql.org/casestudies

¿Usted intentó utilizar Google para traducir de español al inglés?  El
traductor que usted utilizó hecho muy duro-a-leyo' inglés.  Utilicé
Google para este inglés a la traducción española.
  http://translate.google.com/translate_t

[English: Did you try to use Google to translate from Spanish to
English? The translator you used made very hard-to-read English. I used
Google for this English to Spanish translation.]

--
Tim Ellis
Senior Database Architect

Re: Case Studio Postgresql

From

Vida Luz Arista

Date:

14 January 2003, 14:30:12

Yes, I am searching companies or goverments who are currently
using PostgreSQL

Regards,



On Tue, 14 Jan 2003, Jyry Kuukkanen wrote:

>
> Hello Vida Luz Arista
>
> I am not quite sure what do you mean by your message.
>
> Do you want links or such of companies or goverments who are currently
> using PostgreSQL?
>
> Cheers,
> --jq
>
>
> On Tue, 14 Jan 2003, Vida Luz Arista wrote:
>
> > Hi,
> >
> > I am analyzing to Postgresql for e-goverment, nevertheless I need to know
> > in that companies, country and so that aims this being used this data
> > base.
> >
> > Where I can Find this information?
> >
> > Thanks in Advanced,
> >
> > Regards.
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 6: Have you searched our list archives?
> >
> > http://archives.postgresql.org
> >
>
>

--
\/_\_

Saludos,
"La Vida"