Re: On Differing Optimizer Choices ( Again) - Mailing list pgsql-sql

From Oleg Lebedev
Subject Re: On Differing Optimizer Choices ( Again)
Date
Msg-id 3BB90E21.F8E6CE80@waterford.org
Whole thread Raw
Responses Re: Calling Functions  ("Josh Berkus" <josh@agliodbs.com>)
List pgsql-sql
Hi,
this is a very silly question, but how do I run PL/pgSQL function from SQL?
I created a function test(varchar, varchar) and I am trying to invoke it with
parameters 'hello', 'world' using an SQL statement. I tried selects and executes
- nothing works.
Please help,

Oleg

Mark kirkwood wrote:

> Dear all,
>
> Tom's comments on my previous posting encouraged me think some more about
> this... and now I believe got to the heart of what I was attempting to get
> accross before.
>
> I have a fresh and hopefully clear example.
>
> Ok lets start with a small table called 'dim0' that has a unique key called
> 'd0key'  ( 10000 unique values ). Add to this a large table called 'fact2',
> which has 1000 of these 'd0key' values. There are 3000 duplicates for each
> value uniformly distributed throughout it. ( total of 3000000 rows ).
>
> Consider the query :
>
> SELECT
>        f.d0key,
>        count(f.val)
> FROM fact2 f
> WHERE f.d0key BETWEEN  270 AND 350
> GROUP BY f.d0key
>
> which has execution plan :
>
> Aggregate  (cost=0.00..102500.80 rows=2592 width=8)
>   ->  Group  (cost=0.00..102436.00 rows=25920 width=8)
>         ->  Index Scan using fact2_pk on fact2 f  (cost=0.00..102371.20
> rows=25920 width=8)
>
> If we use 351 instead of 350 we get a sequential scan.
>
> Now examine a similar query, but with 'dim0' joined :
>
> SELECT
>        f.d0key,
>        count(f.val)
> FROM dim0 d0,
>      fact2 f
> WHERE d0.d0key = f.d0key
> AND   f.d0key BETWEEN 270 AND 350
> GROUP BY f.d0key
>
> this has plan :
>
> Aggregate  (cost=0.00..103127.60 rows=2592 width=12)
>   ->  Group  (cost=0.00..103062.80 rows=25920 width=12)
>         ->  Merge Join  (cost=0.00..102998.00 rows=25920 width=12)
>               ->  Index Scan using dim0_pk on dim0 d0  (cost=0.00..213.00
> rows=10000 width=4)
>               ->  Index Scan using fact2_pk on fact2 f  (cost=0.00..102371.20
> rows=25920 width=8)
>
> No surprises there
>
> (If we use 351, again we get a sequential scan used instead ).
>
> So far this is all as one would expect. However suppose we substitute
> 'd0.d0key' in the 'AND' clause instead of 'f.d0key' to obtain :
>
> SELECT
>        f.d0key,
>        count(f.val)
> FROM dim0 d0,
>      fact2 f
> WHERE d0.d0key = f.d0key
> AND   d0.d0key BETWEEN 270 AND 350
> GROUP BY f.d0key
>
> Suddenly the plan is :
>
> Aggregate  (cost=103530.27..104293.15 rows=2624 width=12)
>   ->  Group  (cost=103530.27..104227.54 rows=26244 width=12)
>         ->  Merge Join  (cost=103530.27..104161.93 rows=26244 width=12)
>               ->  Index Scan using dim0_pk on dim0 d0  (cost=0.00..213.00
> rows=10000 width=4)
>               ->  Sort  (cost=103530.27..103530.27 rows=26244 width=8)
>                     ->  Seq Scan on fact2 f  (cost=0.00..101604.00 rows=26244
> width=8)
>
> Now this is interesting, I would have expected an index scan to be still
> used... This behavour was what I was seeing ( in disguised form ) in the
> queries of the previous posting.
>
> ( In fact to encourage an index scan changing 350 down to 313 is required )
>
> I wonder how 7.1.x behaves when faced with this situation?... a build of an
> extra 7.1.3 database I reveals the corrosponding plan for this query is
> (note that for 7.1.3 the magic number for index-> sequential scan is 369
> instead of 350 but bear with me) :
>
> Aggregate  (cost=0.00..118850.17 rows=2970 width=12)
>   ->  Group  (cost=0.00..118775.91 rows=29703 width=12)
>         ->  Nested Loop  (cost=0.00..118701.66 rows=29703 width=12)
>               ->  Index Scan using dim0_pk on dim0 d0  (cost=0.00..67.99
> rows=99 width=4)
>               ->  Index Scan using fact2_pk on fact2 f  (cost=0.00..1194.45
> rows=300 width=8)
>
> So that this version is using an index scan for this query ( in fact will
> keep using one until after d0key=445 - so in some sense a behavour opposite
> to 7.2dev is being exibited)
>
> Now the obvious question to ask here is "why are you are griping about using
> a seq scan...? ". Timing the queries reveals that the index scan is
> considerably faster : specifically  10s against 60s. Additionally 7.1.3
> performs the above query in 10s also - and even "out" at the "extreme end"
> using d0.d0key=445 the elapsed time is just 15s .
>
> Why am I pointing this out ? - well I hope that "field testing" the optimizer
> will occasionally provide food for thought !
>
> regards
>
> Mark
>
> P.s : (I am using 7.2 dev 2001-08-17 and all parameters are default apart
> from shared_buffers and sort_mem)
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)



pgsql-sql by date:

Previous
From: Philip Hallstrom
Date:
Subject: Re: PostgreSQL downloads
Next
From: "Josh Berkus"
Date:
Subject: Re: Calling Functions