Thread: 7.4beta2 vs 7.3.3

7.4beta2 vs 7.3.3

From
Gaetano Mendola
Date:
Hi all,
my impression on 7.4 performances:


===== Good news =====

How expected the use of clausole 'IN' with Postgres 7.4beta2
now is really fast ( see the result below )


===== Bad news =====

I posted time ago about a slow query:

SELECT ul.*
FROM user_logs ul,     user_data ud,     class_default cd
WHERE ul.id_user = ud.id_user AND      ud.id_class = cd.id_class AND      cd.id_provider = 39;


these are the information about the tables involved:

user_logs:    ~1.5 Milion rows
user_data:    ~10000 rows
class_default ~100 rows

and I found also that was better do this query
in three steps:

SELECT id_class
FROM class_default
WHERE id_provider = 39;

SELECT id_user
FROM user_data
WHERE id_class IN ( 48 ); <= result of query above

SELECT *
FROM user_logs
WHERE id_user IN ( 11126, ...., 11769 );   <= 43 values result


This last query runs for 10.30 msec !!!



Here the comparison between Postgres7.3.3 and
Postgres7.4beta2 for that original query:



Postgres 7.3.3

test=# explain analyze select ul.*
test-# from user_logs ul,
test-#      user_data ud,
test-#      class_default cd
test-# where ul.id_user = ud.id_user and
test-#       ud.id_class = cd.id_class and
test-#       cd.id_provider = 39;
QUERY PLAN 


---------------------------------------------------------------------------------------------------------------------------------------------------------
HashJoin  (cost=325.79..36234.20 rows=41957 width=60) (actual 
 
time=6151.29..7022.29 rows=702 loops=1)   Hash Cond: ("outer".id_user = "inner".id_user)   ->  Seq Scan on user_logs ul
(cost=0.00..28251.30 rows=1426530 
 
width=48) (actual time=0.02..5427.07 rows=1426530 loops=1)   ->  Hash  (cost=324.97..324.97 rows=329 width=12) (actual

time=320.97..320.97 rows=0 loops=1)         ->  Nested Loop  (cost=0.00..324.97 rows=329 width=12) (actual 
time=0.24..320.89 rows=43 loops=1)               ->  Seq Scan on class_default cd  (cost=0.00..1.43 
rows=1 width=4) (actual time=0.05..0.07 rows=1 loops=1)                     Filter: (id_provider = 39)               ->
Index Scan using idx_user_data_class on user_data ud  (cost=0.00..318.55 rows=400 width=8) (actual time=0.19..320.72
rows=43
 
loops=1)                     Index Cond: (ud.id_class = "outer".id_class) Total runtime: 7023.15 msec
(10 rows)


Postgres 7.4beta2

test=# explain analyze select ul.*
test-# from user_logs ul,
test-#      user_data ud,
test-#      class_default cd
test-# where ul.id_user = ud.id_user and
test-#       ud.id_class = cd.id_class and
test-#       cd.id_provider = 39;
QUERY PLAN 


-------------------------------------------------------------------------------------------------------------------------------------------------------
HashJoin  (cost=322.94..36127.70 rows=42081 width=47) (actual 
 
time=2629.84..2990.00 rows=702 loops=1)   Hash Cond: ("outer".id_user = "inner".id_user)   ->  Seq Scan on user_logs ul
(cost=0.00..28251.30 rows=1426530 
 
width=47) (actual time=0.03..1738.65 rows=1426530 loops=1)   ->  Hash  (cost=322.12..322.12 rows=330 width=4) (actual 
time=0.78..0.78 rows=0 loops=1)         ->  Nested Loop  (cost=0.00..322.12 rows=330 width=4) (actual 
time=0.19..0.71 rows=43 loops=1)               ->  Seq Scan on class_default cd  (cost=0.00..1.43 
rows=1 width=4) (actual time=0.07..0.08 rows=1 loops=1)                     Filter: (id_provider = 39)               ->
Index Scan using idx_user_data_class on user_data ud  (cost=0.00..315.87 rows=386 width=8) (actual time=0.11..0.54
rows=43
 
loops=1)                     Index Cond: (ud.id_class = "outer".id_class) Total runtime: 2990.70 msec
(10 rows)



The performance are really improved but not yet 10 msecs.




I tried nesting the 3 queries that I was speak about,
I did it only with 7.4 due the fact with 7.3.3 I never
seen the result.



test=# EXPLAIN ANALYZE SELECT *
test-# FROM user_logs
test-# WHERE id_user in (
test(#    SELECT id_user
test(#    FROM user_data
test(#    WHERE id_class in (
test(#        SELECT id_class FROM class_default WHERE id_provider = 39
test(#    )
test(# );
QUERY PLAN 


----------------------------------------------------------------------------------------------------------------------------------------------------
HashIN Join  (cost=322.94..36127.70 rows=42081 width=47) (actual 
 
time=2626.20..2985.53 rows=702 loops=1)   Hash Cond: ("outer".id_user = "inner".id_user)   ->  Seq Scan on user_logs
(cost=0.00..28251.30rows=1426530 
 
width=47) (actual time=0.03..1731.59 rows=1426530 loops=1)   ->  Hash  (cost=322.12..322.12 rows=330 width=4) (actual 
time=0.80..0.80 rows=0 loops=1)         ->  Nested Loop  (cost=1.43..322.12 rows=330 width=4) (actual 
time=0.22..0.72 rows=43 loops=1)               ->  HashAggregate  (cost=1.43..1.43 rows=1 width=4) 
(actual time=0.10..0.10 rows=1 loops=1)                     ->  Seq Scan on class_default  (cost=0.00..1.43 
rows=1 width=4) (actual time=0.07..0.08 rows=1 loops=1)                           Filter: (id_provider = 39)
  ->  Index Scan using idx_user_data_class on user_data 
 
(cost=0.00..315.87 rows=386 width=8) (actual time=0.11..0.54 rows=43 
loops=1)                     Index Cond: (user_data.id_class = "outer".id_class) Total runtime: 2986.33 msec
(11 rows)


How you can see with 7.4 the two queries ( the original with the join
and with the IN nested ) are performing at the same way.



Regards
Gaetano Mendola














Re: 7.4beta2 vs 7.3.3

From
Tom Lane
Date:
Gaetano Mendola <mendola@bigfoot.com> writes:
> ===== Bad news =====
> I posted time ago about a slow query:

> SELECT ul.*
> FROM user_logs ul,
>       user_data ud,
>       class_default cd
> WHERE ul.id_user = ud.id_user AND
>        ud.id_class = cd.id_class AND
>        cd.id_provider = 39;

I think the reason the planner chooses a not-optimal plan here is that
it's misestimating the number of rows involved by a factor of 10
(it thinks there will be 400 matching user_data rows, not 43).

You would probably find better results if you increase the statistics
target for user_data.id_class and re-ANALYZE.  See past discussions in
pgsql-performance.

> Here the comparison between Postgres7.3.3 and
> Postgres7.4beta2 for that original query:

Hm, it sure looks to be exactly the same plan.  The performance
difference seems to be just that the seqscans are faster.  I surmise
that in the 7.3 database you had a lot of dead rows, or at least a lot
of free space.  Possibly you need to vacuum more often to keep down the
amount of junk in the tables.
        regards, tom lane


Re: 7.4beta2 vs 7.3.3

From
Gaetano Mendola
Date:
Tom Lane wrote:
> 
> Hm, it sure looks to be exactly the same plan.  The performance
> difference seems to be just that the seqscans are faster.  I surmise
> that in the 7.3 database you had a lot of dead rows, or at least a lot
> of free space.  Possibly you need to vacuum more often to keep down the
> amount of junk in the tables.

The two databases were created from scratch and the first
operation on it ( after a vacuum analyze ) was just that query.

I will try to pump up the statistic target.


Regards
Gaetano Mendola



Re: 7.4beta2 vs 7.3.3

From
Gaetano Mendola
Date:
Gaetano Mendola wrote:
> Tom Lane wrote:
> 
>>
>> Hm, it sure looks to be exactly the same plan.  The performance
>> difference seems to be just that the seqscans are faster.  I surmise
>> that in the 7.3 database you had a lot of dead rows, or at least a lot
>> of free space.  Possibly you need to vacuum more often to keep down the
>> amount of junk in the tables.
> 
> 
> The two databases were created from scratch and the first
> operation on it ( after a vacuum analyze ) was just that query.
> 
> I will try to pump up the statistic target.

The default_statistics_target was already 1000 for
postgres7.4beta2 and after setting this value to 1000
for postgres7.3.3 I had the same explain with the same
expected row ( almost 400 ) and the almost the same time execution.


Another piece: The query

SELECT ud.*
FROM user_data ud,     class_default cd
WHERE ud.id_class = cd.id_class AND      cd.id_provider = 39;

( I removed the huge table )

take just few ms to be executed.


Regards
Gaetano Mendola












Re: 7.4beta2 vs 7.3.3

From
Tom Lane
Date:
Gaetano Mendola <mendola@bigfoot.com> writes:
> Tom Lane wrote:
>> Hm, it sure looks to be exactly the same plan.  The performance
>> difference seems to be just that the seqscans are faster.  I surmise
>> that in the 7.3 database you had a lot of dead rows, or at least a lot
>> of free space.  Possibly you need to vacuum more often to keep down the
>> amount of junk in the tables.

> The two databases were created from scratch and the first
> operation on it ( after a vacuum analyze ) was just that query.

Y'know, I'd love to think that 7.4 is 2x faster than 7.3 on seqscanning,
but I just can't believe that.  We didn't do anything that could account
for such a speedup.  So there's got to be some inconsistency in your
test conditions.
        regards, tom lane


Re: 7.4beta2 vs 7.3.3

From
Tom Lane
Date:
Gaetano Mendola <mendola@bigfoot.com> writes:
> The default_statistics_target was already 1000 for
> postgres7.4beta2 and after setting this value to 1000
> for postgres7.3.3 I had the same explain with the same
> expected row ( almost 400 ) and the almost the same time execution.

Hmph.  Can you look into why the row count estimate is still off by 10x
with stats target 1000?  Seems like there must be some odd skew in your
data statistics.
        regards, tom lane


Re: 7.4beta2 vs 7.3.3

From
Gaetano Mendola
Date:
Tom Lane wrote:
> Gaetano Mendola <mendola@bigfoot.com> writes:
>
>>Tom Lane wrote:
>>
>>>Hm, it sure looks to be exactly the same plan.  The performance
>>>difference seems to be just that the seqscans are faster.  I surmise
>>>that in the 7.3 database you had a lot of dead rows, or at least a lot
>>>of free space.  Possibly you need to vacuum more often to keep down the
>>>amount of junk in the tables.
>
>
>>The two databases were created from scratch and the first
>>operation on it ( after a vacuum analyze ) was just that query.
>
>
> Y'know, I'd love to think that 7.4 is 2x faster than 7.3 on seqscanning,
> but I just can't believe that.  We didn't do anything that could account
> for such a speedup.  So there's got to be some inconsistency in your
> test conditions.

The machine is the same, the disk too, the main values in the
configuration file are the same ( I put the confs in attachment ).
I was alo able to remove the useless column on that tables, and I
put the dump here:

http://212.198.37.110

The select take long:

Postgres7.3.3:  average 4000 ms
Postgres7.4b2:  average 2600 ms

you can experiment your self with the dump that I gave you

Regards
Gaetano Mendola

tcpip_socket = true
max_connections = 128

shared_buffers = 13000        # I'm running two postgres server
max_fsm_relations = 1000    # min 10, fsm is free space map, ~40 bytes
max_fsm_pages = 2000000
max_locks_per_transaction = 64    # min 10

wal_buffers = 2000

sort_mem = 32768
vacuum_mem = 32768

default_statistics_target = 1000

server_min_messages = debug2

log_connections = true
log_pid = true
log_statement = true
log_duration = true
log_timestamp = true

syslog = 2
syslog_facility = 'LOCAL0'
syslog_ident = 'postgres'

stats_start_collector = true
stats_reset_on_server_start = true
stats_command_string = true
stats_row_level = true
stats_block_level = true

LC_MESSAGES = 'en_US.UTF-8'
LC_MONETARY = 'en_US.UTF-8'
LC_NUMERIC = 'en_US.UTF-8'
LC_TIME = 'en_US.UTF-8'

tcpip_socket = true
max_connections = 128

shared_buffers = 13000
sort_mem = 32768
vacuum_mem = 32768

max_fsm_pages = 2000000
max_fsm_relations = 1000

wal_buffers = 2000

default_statistics_target = 1000

syslog = 2
syslog_facility = 'LOCAL0'
syslog_ident = 'postgres'

log_min_messages = debug2

log_connections = true
log_duration = true
log_pid = true
log_statement = true
log_timestamp = true
log_hostname = false
log_source_port = true



stats_start_collector = true
stats_command_string = true
stats_block_level = true
stats_row_level = true
stats_reset_on_server_start = true


lc_messages = 'en_US.UTF-8'        # locale for system error message strings
lc_monetary = 'en_US.UTF-8'        # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'        # locale for number formatting
lc_time = 'en_US.UTF-8'            # locale for time formatting



Re: 7.4beta2 vs 7.3.3

From
Tom Lane
Date:
Gaetano Mendola <mendola@bigfoot.com> writes:
> The select take long:
> Postgres7.3.3:  average 4000 ms
> Postgres7.4b2:  average 2600 ms
> you can experiment your self with the dump that I gave you

Hm.  I tried to duplicate your results.  I'm getting about 5400 msec
versus 4200 msec, which is a nice version-to-version improvement but
not as large as you're seeing.  (I have --enable-cassert on, though,
and that may be cancelling some of the percentage gain.)

It looks like most of the speedup is in the seq scan step.  I think
it is probably due to this change:

2003-02-03 10:07  tgl
* src/: backend/executor/execMain.c, backend/executor/execScan.c,backend/executor/nodeIndexscan.c,
backend/executor/nodeSeqscan.c,backend/executor/nodeTidscan.c,backend/optimizer/plan/createplan.c,backend/optimizer/plan/planner.c,
backend/optimizer/util/plancat.c,backend/optimizer/util/relnode.c,include/executor/executor.h,include/nodes/relation.h,
include/optimizer/plancat.h:Tweakplanner and executor to avoid doing ExecProject() in table scannodes where it's not
reallynecessary.    In many cases where thescan node is not the topmost plan node (eg, joins, aggregation),it's
possibleto just return the table tuple directly instead ofgenerating an intermediate projection tuple.  In
preliminarytesting,this reduced the CPU time needed for 'SELECT COUNT(*) FROMfoo' by about 10%.
 
        regards, tom lane


Re: 7.4beta2 vs 7.3.3

From
Gaetano Mendola
Date:
Tom Lane wrote:
> Gaetano Mendola <mendola@bigfoot.com> writes:
> 
>>The select take long:
>>Postgres7.3.3:  average 4000 ms
>>Postgres7.4b2:  average 2600 ms
>>you can experiment your self with the dump that I gave you
> 
> 
> Hm.  I tried to duplicate your results.  I'm getting about 5400 msec
> versus 4200 msec, which is a nice version-to-version improvement but
> not as large as you're seeing.  (I have --enable-cassert on, though,
> and that may be cancelling some of the percentage gain.)

May be, I have the --enable-cassert off.

What about the wrong row expected ?
Anyway if the rows expected are 400 ( instead of 43 ) why not an index 
scan, with 400 rows on 1500000 seems a good choise do an index scan, 
isn't it ?


Regards
Gaetano Mendola




Re: 7.4beta2 vs 7.3.3

From
Tom Lane
Date:
Gaetano Mendola <mendola@bigfoot.com> writes:
> What about the wrong row expected ?

After I looked more closely, I realized that the planner hasn't any hope
of getting a really correct answer on that.  You've got
WHERE ... ud.id_class = cd.id_class AND                      cd.id_provider = 39;

Now the planner doesn't have any problem figuring out that this will
select one "cd" row, but the number of "ud" rows matched varies wildly
depending on which one cd.id_class value happens to be involved.
Without actually pre-executing the query it has no way to know which
value will be involved, and so it has to fall back to a default
estimate, which is IIRC (number of rows in ud)/(number of distinct values).
Then there's a similar problem with estimating the number of rows
retrieved from ul.

> Anyway if the rows expected are 400 ( instead of 43 ) why not an index 
> scan, with 400 rows on 1500000 seems a good choise do an index scan, 
> isn't it ?

The trouble here is that because of the very skewed data statistics (in
both ud and ul), the planner can't really be sure that this query will
retrieve only a few rows from either table.  There are other values in
both tables that would have retrieved vastly more data.  The hash join
may be slower for this particular id_provider value, but it won't get
very much worse with other id_provider values --- a nestloop plan will.

Your idea of reducing id_provider to id_class using a separate query
seems like a good one to me --- that will allow the planner to generate
different plans depending on which id_class value is involved.
        regards, tom lane


Re: 7.4beta2 vs 7.3.3

From
Gaetano Mendola
Date:
Tom Lane wrote:
> Your idea of reducing id_provider to id_class using a separate query
> seems like a good one to me --- that will allow the planner to generate
> different plans depending on which id_class value is involved.

However is not a natural way to approch the problem;
Am I wrong ?


Gaetano