Thread: performance problem aftrer update from 7.1 to 7.4.2

performance problem aftrer update from 7.1 to 7.4.2

From
"Development - multi.art.studio"
Date:
Hello everyone,

i just upgraded my old postgres-database from version 7.1 to 7.4.2.
i dumped out my 7.1 database (with pg_dump from 7.1) as an sql-file with
copy-commands and to one file using insert-statements.

after initalizing and starting postgres 7.4 on a different port and
datadirectory, i tried to import the sql-dump with the copy statements.
this import fails, but importing the dump-file with inserts took a long
time but was successfully.
ok, at first i do a vacuum and analyze and fire up the database again.
now i changed my php-scripts to use the new database as datasource.

ok, things look good, but as i was testing some php-sciripts, i
recognized that the querys took about 2 or 5 times longer :(
a test a script which reads a lot of stuff from database, normaly with
pgsql 7.1 it tooks about 4 seconds to display the data, but with
postgres 7.4 it tooks about 25 seconds.
i start the 7.4 pgsql the same way as 7.1 with postmaster -D
/xxx/xxx/...  -N512 -S -F -B2048 -i
i also tried to kill some indexes and recreate them, but this doesnt
matter and the query took the same long time to execute. i also tested
some simple select querys, the will take longer time.

if somebody could help me, and hints on how i can speed up my 'new'
databse would be very nice :)

thanks in advance

yours
volker


Re: performance problem aftrer update from 7.1 to 7.4.2

From
"Joshua D. Drake"
Date:
Hello,

You are probably missing a step some where... I know you got your data
imported but you might try
using the pg_dump from 7.4.2 to grab the 7.1 database and import from
there. It sounds to me like you
are missing an index or something.

Also it would help if we new your data structure, if you could post an
explain from 7.1 and from 7.4 and
possibly the queries that you are running your tests against.

Sincerely,

Joshua D. Drake


Development - multi.art.studio wrote:

> Hello everyone,
>
> i just upgraded my old postgres-database from version 7.1 to 7.4.2.
> i dumped out my 7.1 database (with pg_dump from 7.1) as an sql-file
> with copy-commands and to one file using insert-statements.
>
> after initalizing and starting postgres 7.4 on a different port and
> datadirectory, i tried to import the sql-dump with the copy statements.
> this import fails, but importing the dump-file with inserts took a
> long time but was successfully.
> ok, at first i do a vacuum and analyze and fire up the database again.
> now i changed my php-scripts to use the new database as datasource.
>
> ok, things look good, but as i was testing some php-sciripts, i
> recognized that the querys took about 2 or 5 times longer :(
> a test a script which reads a lot of stuff from database, normaly with
> pgsql 7.1 it tooks about 4 seconds to display the data, but with
> postgres 7.4 it tooks about 25 seconds.
> i start the 7.4 pgsql the same way as 7.1 with postmaster -D
> /xxx/xxx/...  -N512 -S -F -B2048 -i
> i also tried to kill some indexes and recreate them, but this doesnt
> matter and the query took the same long time to execute. i also tested
> some simple select querys, the will take longer time.
>
> if somebody could help me, and hints on how i can speed up my 'new'
> databse would be very nice :)
>
> thanks in advance
>
> yours
> volker
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>      subscribe-nomail command to majordomo@postgresql.org so that your
>      message can get through to the mailing list cleanly



--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL


Re: performance problem aftrer update from 7.1 to 7.4.2

From
"Development - multi.art.studio"
Date:
hello again,

thanks for help :)

i re-dumped the database (called mcms09) from postgres 7.1 with the pg_dump command from 7.4.2 successfully and restored it to 7.4.2 (database called mcms, postgres 7.4.2 running on a different port)

i compared some querys from 7.1 to 7.4.2 and put all the output of the explain command to a website, because i think its too much for posting it here.
i just see that most time is spent in sorting the tables before (?) creating the filter.

7.1:
mcms09=> explain select * from newsletter where site_id='m000000-970' order by date desc,id desc limit 10;
NOTICE:  QUERY PLAN:
Limit  (cost=9.26..9.26 rows=7 width=84) ->  Sort  (cost=9.26..9.26 rows=8 width=84)       ->  Index Scan using site_id_newsletter_key on newsletter  (cost=0.00..9.14 rows=8 width=84)
EXPLAIN
7.4.2:
mcms=# explain select * from newsletter where site_id='m000000-970' order by date desc,id desc limit 10;                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------Limit  (cost=17.78..17.81 rows=10 width=610)  ->  Sort  (cost=17.78..17.81 rows=11 width=610)        Sort Key: date, id        ->  Index Scan using site_id_newsletter_key on newsletter  (cost=0.00..17.59 rows=11 width=610)              Index Cond: ((site_id)::text = 'm000000-970'::text)
(5 rows)

but this query is the first one, done only once, then another subset of querys follow 10 times , the others depend on this first one, in my opinion this cant be the big factor which slows down the dbs

i also added new indexes to both databases on news, newsletter and newsletter_send table, speeding up the query by factor 10 :-) but 7.1 is always faster ....

you can see all the stuff , query, table layout and explains on the website http://www.erdtrabant.de/index.php?i=60500

thanks in advance
volker

Joshua D. Drake wrote:
Hello,

You are probably missing a step some where... I know you got your data imported but you might try
using the pg_dump from 7.4.2 to grab the 7.1 database and import from there. It sounds to me like you
are missing an index or something.

Also it would help if we new your data structure, if you could post an explain from 7.1 and from 7.4 and
possibly the queries that you are running your tests against.

Sincerely,

Joshua D. Drake


Development - multi.art.studio wrote:

Hello everyone,

i just upgraded my old postgres-database from version 7.1 to 7.4.2.
i dumped out my 7.1 database (with pg_dump from 7.1) as an sql-file with copy-commands and to one file using insert-statements.

after initalizing and starting postgres 7.4 on a different port and datadirectory, i tried to import the sql-dump with the copy statements.
this import fails, but importing the dump-file with inserts took a long time but was successfully.
ok, at first i do a vacuum and analyze and fire up the database again.
now i changed my php-scripts to use the new database as datasource.

ok, things look good, but as i was testing some php-sciripts, i recognized that the querys took about 2 or 5 times longer :(
a test a script which reads a lot of stuff from database, normaly with pgsql 7.1 it tooks about 4 seconds to display the data, but with postgres 7.4 it tooks about 25 seconds.
i start the 7.4 pgsql the same way as 7.1 with postmaster -D /xxx/xxx/...  -N512 -S -F -B2048 -i
i also tried to kill some indexes and recreate them, but this doesnt matter and the query took the same long time to execute. i also tested some simple select querys, the will take longer time.

if somebody could help me, and hints on how i can speed up my 'new' databse would be very nice :)

thanks in advance

yours
volker


---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
     subscribe-nomail command to majordomo@postgresql.org so that your
     message can get through to the mailing list cleanly




Re: performance problem aftrer update from 7.1 to 7.4.2

From
Richard Huxton
Date:
On Wednesday 14 April 2004 01:56, Development - multi.art.studio wrote:
>
> i compared some querys from 7.1 to 7.4.2 and put all the output of the
> explain command to a website, because i think its too much for posting
> it here.

Something funny here - notice the width of newsletter (84) in the old version
against the width in the new one (610).

> 7.1:
> mcms09=> explain select * from newsletter where site_id='m000000-970' order
> by date desc,id desc limit 10; NOTICE:  QUERY PLAN:
> Limit  (cost=9.26..9.26 rows=7 width=84)

> 7.4.2:
> mcms=# explain select * from newsletter where site_id='m000000-970' order
> by date desc,id desc limit 10; QUERY PLAN
> ---------------------------------------------------------------------------
>------------------------------ Limit  (cost=17.78..17.81 rows=10 width=610)
>    ->  Sort  (cost=17.78..17.81 rows=11 width=610)

--
  Richard Huxton
  Archonet Ltd

Re: performance problem aftrer update from 7.1 to 7.4.2

From
"Development - multi.art.studio"
Date:
Hello, and thanks for reading

Richard Huxton wrote:
Something funny here - notice the width of newsletter (84) in the old version 
against the width in the new one (610).
 
hmmm, but why? the old mcms09 database is still growing and productive, and 7.4 is not updated and only for testing at this time.
as i do the tests, both database contents where the same, but the old was still used by other users.
 i do some additional tests, and a simple query explain in 7.4 gives much bigger values than in 7.1,
i tried vacuum, but nothing deleteted because everything grows....and only sometimes one
or two entries are deleted....,
last but not least i also reindex the new database, but indexes where all ok, and there are no problems with that (not looking on the final db design)

what happens that 7.4 will took so much time, and width for its querys? (really, i read that pg74 would speed up 2 or 3 times....if good db design) i didnt change anything on the tables.

thanks for any help,
volker

ps:mcms09 will be availablke for download in a few weeks, using pgsql as a backend :)

7.1:
mcms09=> explain select * from newsletter where site_id='m000000-970' order
by date desc,id desc limit 10; NOTICE:  QUERY PLAN:
Limit  (cost=9.26..9.26 rows=7 width=84)   
 
7.4.2:
mcms=# explain select * from newsletter where site_id='m000000-970' order
by date desc,id desc limit 10; QUERY PLAN
---------------------------------------------------------------------------
------------------------------ Limit  (cost=17.78..17.81 rows=10 width=610)  ->  Sort  (cost=17.78..17.81 rows=11 width=610)   
 

Re: performance problem aftrer update from 7.1 to 7.4.2

From
Richard Huxton
Date:
On Wednesday 14 April 2004 11:08, Development - multi.art.studio wrote:
> Richard Huxton wrote:
> >Something funny here - notice the width of newsletter (84) in the old
> > version against the width in the new one (610).
>
> hmmm, but why? the old mcms09 database is still growing and productive,
> and 7.4 is not updated and only for testing at this time.

> what happens that 7.4 will took so much time, and width for its querys?
> (really, i read that pg74 would speed up 2 or 3 times....if good db
> design) i didnt change anything on the tables.

I'm not saying the change in width is your problem here, but it's odd. Compare
a pg_dump of the schema of 7.1 newsletter vs 7.4 newsletter. There are only
two possibilities:
1. PG has changed the way it reports row width (I don't remember any such
change).
2. Something *has* changed.

If it still doesn't make sense, can you post the CREATE TABLE/INDEX etc. for
the newsletter table and I'll try it here and see what figure I get.

Once we've cleared this up, we'll have a look at what your config settings are
and whether they need tuning. PS - EXPLAIN ANALYSE is better than just
EXPLAIN for seeing what happens, it actually runs the query and shows what
happened.
--
  Richard Huxton
  Archonet Ltd

Re: performance problem aftrer update from 7.1 to 7.4.2

From
Tom Lane
Date:
Richard Huxton <dev@archonet.com> writes:
> 1. PG has changed the way it reports row width (I don't remember any such
> change).

My recollection is that up till 7.2, the estimation of widths for
variable-width columns was completely bogus.  Since 7.2 it's driven by
an actual average width for the column as measured by ANALYZE.  So if
the query is selecting some fairly wide variable-width columns then it's
entirely likely for the width estimate to take a big jump.

Given that we haven't seen any EXPLAIN ANALYZE output it's hard to say
anything about what the *real* problem is ...

            regards, tom lane

Re: performance problem aftrer update from 7.1 to 7.4.2

From
"Development - multi.art.studio"
Date:
Hello,
sorry im late, but here are more details:

im wondering why 7.4 doesnt use the newsletter_site_id_date_idx-index
explain with 7.1: without analyze
mcms09=> explain select * from newsletter where site_id='m000000-970' order by date desc,id desc limit 10;
NOTICE:  QUERY PLAN:
Limit  (cost=9.26..9.26 rows=7 width=84)
  ->  Sort  (cost=9.26..9.26 rows=8 width=84)
        ->  Index Scan using newsletter_site_id_date_idx on newsletter  (cost=0.00..9.14 rows=8 width=84)
EXPLAIN
__________________________ 
and with pgsql 7.4.2:
mcms=# explain analyse select * from newsletter where site_id='m000000-970' order by date desc,id desc limit 10;
                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=17.78..17.81 rows=10 width=610) (actual time=0.625..0.645 rows=10 loops=1)
   ->  Sort  (cost=17.78..17.81 rows=11 width=610) (actual time=0.620..0.627 rows=10 loops=1)
         Sort Key: date, id
         ->  Index Scan using site_id_newsletter_key on newsletter  (cost=0.00..17.59 rows=11 width=610) (actual time=0.087..0.286 rows=15 loops=1)
               Index Cond: ((site_id)::text = 'm000000-970'::text)
 Total runtime: 0.766 ms
(6 rows)
 
_____________________________
i also dumped both db-structures, with pg_dump from 7.4.2 
(i also used for dumping out old 7.1 before importing to 7.4.2, 
i also tried pg_dump from 7.1 and restored the db to 7.4, but performace was the same)
from old 7.1 dumped with pg_dump from 7.4.2:
CREATE SEQUENCE newsletter_id_seq   INCREMENT BY 1   MAXVALUE 2147483647   NO MINVALUE   CACHE 1;

CREATE TABLE newsletter (   id integer DEFAULT nextval('"newsletter_id_seq"'::text) NOT NULL,   site_id character varying,   date character varying,   title character varying,   text text,   aktiv smallint DEFAULT 1,   online smallint DEFAULT 1,   subtitle character varying,   show_titles smallint,   show_headline smallint,   bgcolor character varying
);
CREATE UNIQUE INDEX newsletter_id_key ON newsletter USING btree (id int4_ops);
CREATE INDEX aktiv_newsletter_key ON newsletter USING btree (aktiv int2_ops);
CREATE INDEX site_id_newsletter_key ON newsletter USING btree (site_id varchar_ops);
CREATE INDEX date_newsletter_key ON newsletter USING btree (date varchar_ops);
CREATE INDEX online_newsletter_key ON newsletter USING btree (online int2_ops);
CREATE INDEX newsletter_site_id_date_idx ON newsletter USING btree (site_id varchar_ops, date varchar_ops);
mcms09=> \d newsletter                                   Table "newsletter"  Attribute   |       Type        |                       Modifier
---------------+-------------------+-------------------------------------------------------id            | integer           | not null default nextval('"newsletter_id_seq"'::text)site_id       | character varying |date          | character varying |title         | character varying |text          | text              |aktiv         | smallint          | default 1online        | smallint          | default 1subtitle      | character varying |show_titles   | smallint          |show_headline | smallint          |bgcolor       | character varying |
Indices: aktiv_newsletter_key,        date_newsletter_key,        newsletter_id_key,        newsletter_site_id_date_idx,        online_newsletter_key,        site_id_newsletter_key

____________________
and structure from pgsql 7.4.2:
CREATE SEQUENCE newsletter_id_seq
    INCREMENT BY 1   MAXVALUE 2147483647   NO MINVALUE   CACHE 1;

CREATE TABLE newsletter (   id integer DEFAULT nextval('"newsletter_id_seq"'::text) NOT NULL,   site_id character varying,   date character varying,   title character varying,   text text,   aktiv smallint DEFAULT 1,   online smallint DEFAULT 1,   subtitle character varying,   show_titles smallint,   show_headline smallint,   bgcolor character varying
);

CREATE UNIQUE INDEX newsletter_id_key ON newsletter USING btree (id);
CREATE INDEX site_id_newsletter_key ON newsletter USING btree (site_id);
CREATE INDEX date_newsletter_key ON newsletter USING btree (date);
CREATE INDEX online_newsletter_key ON newsletter USING btree (online);
CREATE INDEX newsleter_date_idx ON newsletter USING btree (date);
CREATE INDEX newsletter_site_id_date_idx ON newsletter USING btree (site_id varchar_ops, date varchar_ops);

mcms=# \d newsletter                                Table "public.newsletter"   Column     |       Type        |                       Modifiers
---------------+-------------------+-------------------------------------------------------id            | integer           | not null default nextval('"newsletter_id_seq"'::text)site_id       | character varying |date          | character varying |title         | character varying |text          | text              |aktiv         | smallint          | default 1online        | smallint          | default 1subtitle      | character varying |show_titles   | smallint          |show_headline | smallint          |bgcolor       | character varying |
Indexes:   "newsletter_id_key" unique, btree (id)   "aktiv_newsletter_key" btree (aktiv)   "date_newsletter_key" btree (date)   "newsleter_date_idx" btree (date)   "newsletter_site_id_date_idx" btree (site_id, date)   "online_newsletter_key" btree (online)   "site_id_newsletter_key" btree (site_id)


i tried also creation of index on date only in 7.4, but this does not change anything.
hmmm

special thanks for reading and all comments :)
yours sincerely,
volker
Tom Lane wrote:
Richard Huxton <dev@archonet.com> writes: 
1. PG has changed the way it reports row width (I don't remember any such 
change).   
My recollection is that up till 7.2, the estimation of widths for
variable-width columns was completely bogus.  Since 7.2 it's driven by
an actual average width for the column as measured by ANALYZE.  So if
the query is selecting some fairly wide variable-width columns then it's
entirely likely for the width estimate to take a big jump.

Given that we haven't seen any EXPLAIN ANALYZE output it's hard to say
anything about what the *real* problem is ...
		regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend
 

			
		

Re: performance problem aftrer update from 7.1 to 7.4.2

From
Richard Huxton
Date:
On Thursday 15 April 2004 17:30, Development - multi.art.studio wrote:
> Hello,
>
> sorry im late, but here are more details:
>
> im wondering why 7.4 doesnt use the newsletter_site_id_date_idx-index
>
> explain with 7.1: without analyze
> mcms09=> explain select * from newsletter where site_id='m000000-970' order
> by date desc,id desc limit 10; NOTICE:  QUERY PLAN:
> Limit  (cost=9.26..9.26 rows=7 width=84)
>   ->  Sort  (cost=9.26..9.26 rows=8 width=84)
>         ->  Index Scan using newsletter_site_id_date_idx on newsletter
> (cost=0.00..9.14 rows=8 width=84) EXPLAIN
> __________________________
> and with pgsql 7.4.2:
>
> mcms=# explain analyse select * from newsletter where site_id='m000000-970'
> order by date desc,id desc limit 10;

>  Limit  (cost=17.78..17.81 rows=10 width=610) (actual time=0.625..0.645
> rows=10 loops=1)
>    ->  Sort  (cost=17.78..17.81 rows=11 width=610) (actual
> time=0.620..0.627 rows=10 loops=1)
>          Sort Key: date, id
>          ->  Index Scan using site_id_newsletter_key on newsletter
> (cost=0.00..17.59 rows=11 width=610) (actual time=0.087..0.286 rows=15
> loops=1)
>                Index Cond: ((site_id)::text = 'm000000-970'::text)
>
>  Total runtime: 0.766 ms

It's not using the date index because it's using the id index - there's only
10 matches, so that looks like a good choice to me. It takes less than 1ms,
so I'm not sure this is a good example of a problem.

--
  Richard Huxton
  Archonet Ltd

Re: performance problem aftrer update from 7.1 to 7.4.2

From
"Development - multi.art.studio"
Date:
hello everyone,

Richard Huxton wrote:

>It's not using the date index because it's using the id index - there's only 
>10 matches, so that looks like a good choice to me. It takes less than 1ms, 
>so I'm not sure this is a good example of a problem.

thanks all !
but this doesnt help me a lot, i dont know what what to do? im not an expert. 
ok, i could use another site_id  and it will grow up and gives 500 or more rows back from the select clause.
this takes a lot of more time. 
would this help?

for me it looks as pg742 is not as performant as 7.1 ? and takes a lot of more time, and doesnt look for the correct index? but why this?
why does it take longer than in 7.1? any ideas? what can i do? 

any help would be greatly apreciated.
yours volker
sory for the bad english

Richard Huxton wrote:
On Thursday 15 April 2004 17:30, Development - multi.art.studio wrote: 
Hello,

sorry im late, but here are more details:

im wondering why 7.4 doesnt use the newsletter_site_id_date_idx-index

explain with 7.1: without analyze
mcms09=> explain select * from newsletter where site_id='m000000-970' order
by date desc,id desc limit 10; NOTICE:  QUERY PLAN:
Limit  (cost=9.26..9.26 rows=7 width=84) ->  Sort  (cost=9.26..9.26 rows=8 width=84)       ->  Index Scan using newsletter_site_id_date_idx on newsletter 
(cost=0.00..9.14 rows=8 width=84) EXPLAIN
__________________________
and with pgsql 7.4.2:

mcms=# explain analyse select * from newsletter where site_id='m000000-970'
order by date desc,id desc limit 10;   
 
 Limit  (cost=17.78..17.81 rows=10 width=610) (actual time=0.625..0.645
rows=10 loops=1)  ->  Sort  (cost=17.78..17.81 rows=11 width=610) (actual
time=0.620..0.627 rows=10 loops=1)        Sort Key: date, id        ->  Index Scan using site_id_newsletter_key on newsletter 
(cost=0.00..17.59 rows=11 width=610) (actual time=0.087..0.286 rows=15
loops=1)              Index Cond: ((site_id)::text = 'm000000-970'::text)
Total runtime: 0.766 ms   
It's not using the date index because it's using the id index - there's only 
10 matches, so that looks like a good choice to me. It takes less than 1ms, 
so I'm not sure this is a good example of a problem.
 

Re: performance problem aftrer update from 7.1 to 7.4.2

From
Richard Huxton
Date:
On Saturday 17 April 2004 13:23, Development - multi.art.studio wrote:
> hello everyone,
>
> Richard Huxton wrote:
> >It's not using the date index because it's using the id index - there's
> > only 10 matches, so that looks like a good choice to me. It takes less
> > than 1ms, so I'm not sure this is a good example of a problem.
>
> thanks all !
> but this doesnt help me a lot, i dont know what what to do? im not an
> expert. ok, i could use another site_id  and it will grow up and gives 500
> or more rows back from the select clause. this takes a lot of more time.
> would this help?

If that is the problem, that's what you'll need to post.

> for me it looks as pg742 is not as performant as 7.1 ? and takes a lot of
> more time, and doesnt look for the correct index? but why this? why does it
> take longer than in 7.1? any ideas? what can i do?

It's not something people are generally seeing. In most cases performance is
the same or slightly better. For some queries it can be a lot better.

What I suggest:
1. Compare the two postgresql.conf files and any other config settings and
make sure you know what differences there are and why.
2. Identify what queries seem to be the cause of the problem, and pick one you
think is a good example.
3. VACCUM FULL and ANALYZE both 7.1 and 7.4 databases.
4. Stop the 7.4 server and run EXPLAIN ANALYSE (from psql) for 7.1. If you
want to make sure the data is cached, run it three times and use the last
one.
While you're running the explain, execute "vmstat 1 > vmstat.trace-7.1.txt" in
another terminal.
5. Stop the 7.1 server, restart 7.4 and run EXPLAIN ANALYSE (from psql) for
7.4, same as before, verify that the same results are returned. Same as
before for the caching.
Trace using vmstat for this too.

Post the query SQL and both EXPLAIN ANALYSE outputs along with table
definitions and row counts for each table involved and we can see exactly
where the problem is. If you think this is too much info, you can post it on
the web instead, that's fine.
If you're using the vmstat info, you could post that too, otherwise just keep
it safe in case it's needed.

--
  Richard Huxton
  Archonet Ltd

Re: performance problem aftrer update from 7.1 to 7.4.2

From
"Development - multi.art.studio"
Date:
hello,

>Richard Huxton wrote:
>What I suggest:
>1. Compare the two postgresql.conf files and any other config settings and 
>make sure you know what differences there are and why.
>2. Identify what queries seem to be the cause of the problem, and pick one you 
>think is a good example.
>3. VACCUM FULL and ANALYZE both 7.1 and 7.4 databases.
>4. Stop the 7.4 server and run EXPLAIN ANALYSE (from psql) for 7.1. If you 
>want to make sure the data is cached, run it three times and use the last 
>one.
>While you're running the explain, execute "vmstat 1 > vmstat.trace-7.1.txt" in 
>another terminal.
>5. Stop the 7.1 server, restart 7.4 and run EXPLAIN ANALYSE (from psql) for 
>7.4, same as before, verify that the same results are returned. Same as 
>before for the caching.
>Trace using vmstat for this too.

thanks for the great hints, last days i was trying and trying...... restoring databases using a long time.
but i was surprised on the results.
ok, im not finished because i can stop the production dbs only late in the night after telling some people...but here are some 'heavy' results

i dumped out my database with pgdump from 7.4 as before , one dump with structure only, one with inserts
after that, doing a 'vacuum full' on 7.4 and stopping new 7.4 database.

i created a test-query
(it will show the entries in table newsletter_send (historical datas) with links to valid newsletter and addresses always using a specific site-id ordered by send-date, address-id and internal-id

on old production 7.1:

Query:

mcms09=> select * from newsletter_send where site_id='m200384-000' and newsletter_id in (select id from newsletter where aktiv=1 and site_id='m200384-000') and newsletter_adr_id in (select id from newsletter_address where site_id='m200384-000' and aktiv=1) and sent is not null order by sent desc,newsletter_adr_id desc,id desc;
Cancel request sent
ERROR:  Query was cancelled.

------takes too long time, about 3 minutes without displaying a result ;-)

Explain:

mcms09=> explain select * from newsletter_send where site_id='m200384-000' and newsletter_id in (select id from newsletter where aktiv=1 and site_id='m200384-000') and newsletter_adr_id in (select id from newsletter_address where site_id='m200384-000' and aktiv=1) and sent is not null order by sent desc,newsletter_adr_id desc,id desc;
NOTICE:  QUERY PLAN:

Sort  (cost=4224132.30..4224132.30 rows=29063 width=54)
  ->  Index Scan using newsletter_sent_site_id_idx on newsletter_send  (cost=0.00..4221402.52 rows=29063 width=54)
        SubPlan
          ->  Seq Scan on newsletter_address  (cost=0.00..13.61 rows=133 width=4)
          ->  Seq Scan on newsletter  (cost=0.00..82.95 rows=841 width=4)
EXPLAIN

-------i do this querys about three or five times to be sure its cached

now i started 7.4 again and test it with fresh 7.4.2 restored database:

Query-Explain:

mcms=# explain select * from newsletter_send where site_id='m200384-000' and newsletter_id in (select id from newsletter where aktiv=1 and site_id='m200384-000') and newsletter_adr_id in (select id from newsletter_address where site_id='m200384-000' and aktiv=1) and sent is not null order by sent desc,newsletter_adr_id desc,id desc;
                                          QUERY PLAN
----------------------------------------------------------------------------------------------
 Sort  (cost=6205.83..6275.90 rows=28026 width=76)
   Sort Key: newsletter_send.sent, newsletter_send.newsletter_adr_id, newsletter_send.id
   ->  Hash IN Join  (cost=100.00..3409.13 rows=28026 width=76)
         Hash Cond: ("outer".newsletter_id = "inner".id)
         ->  Hash IN Join  (cost=13.94..2887.87 rows=30991 width=76)
               Hash Cond: ("outer".newsletter_adr_id = "inner".id)
               ->  Seq Scan on newsletter_send  (cost=0.00..2128.29 rows=87145 width=76)
                     Filter: (((site_id)::text = 'm200384-000'::text) AND (sent IS NOT NULL))
               ->  Hash  (cost=13.61..13.61 rows=133 width=4)
                     ->  Seq Scan on newsletter_address  (cost=0.00..13.61 rows=133 width=4)
                           Filter: (((site_id)::text = 'm200384-000'::text) AND (aktiv = 1))
         ->  Hash  (cost=83.95..83.95 rows=841 width=4)
               ->  Seq Scan on newsletter  (cost=0.00..83.95 rows=841 width=4)
                     Filter: ((aktiv = 1) AND ((site_id)::text = 'm200384-000'::text))
(14 rows)


....Heavy! its about 1000-times faster i think, and now postgres speeds up like a rocket.
i dont know why this happens......, but i did another query, because my idea is there are problems in sorting the table in a very simple query....hmmmm, if im not totally wrong ;-) tell me *g
________________________________

ok here are some more tests

Query 7.1:
mcms09=> explain select * from newsletter where aktiv=1 and site_id='m200384_000' order by id desc,date desc;
NOTICE:  QUERY PLAN:
Sort  (cost=9.29..9.29 rows=8 width=84)
  ->  Index Scan using newsletter_site_id_date_idx on newsletter  (cost=0.00..9.17 rows=8 width=84)
EXPLAIN

Query 7.4.2:
mcms=# explain select * from newsletter where aktiv=1 and site_id='m200384_000' order by id desc,date desc;
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Sort  (cost=18.67..18.70 rows=11 width=598)
   Sort Key: id, date
   ->  Index Scan using site_id_newsletter_key on newsletter  (cost=0.00..18.48 rows=11 width=598)
         Index Cond: ((site_id)::text = 'm200384_000'::text)
         Filter: (aktiv = 1)
(5 rows)

now it costs double time to query the table.
Explain show me it sorts the table up by id and date, after that it scans and uses filter.......is this the order postgres works? maybe sorting the table without filters before querying it costs a lot of performance, only result should be sorted.
hmmm,  i will start it with higher debuglevel and try to get out.
>Post the query SQL and both EXPLAIN ANALYSE outputs along with table 
>definitions and row counts for each table involved and we can see exactly 
>where the problem is. If you think this is too much info, you can post it on 
>the web instead, that's fine.
>If you're using the vmstat info, you could post that too, otherwise just keep 
>it safe in case it's needed.




thanks in advance, i updated http://www.erdtrabant.de/index.php?p=&l=de&i=60500

yours sincerely
volker


Richard Huxton wrote:
On Saturday 17 April 2004 13:23, Development - multi.art.studio wrote: 
hello everyone,

Richard Huxton wrote:   
It's not using the date index because it's using the id index - there's
only 10 matches, so that looks like a good choice to me. It takes less
than 1ms, so I'm not sure this is a good example of a problem.     
thanks all !
but this doesnt help me a lot, i dont know what what to do? im not an
expert. ok, i could use another site_id  and it will grow up and gives 500
or more rows back from the select clause. this takes a lot of more time.
would this help?   
If that is the problem, that's what you'll need to post.
 
for me it looks as pg742 is not as performant as 7.1 ? and takes a lot of
more time, and doesnt look for the correct index? but why this? why does it
take longer than in 7.1? any ideas? what can i do?   
It's not something people are generally seeing. In most cases performance is 
the same or slightly better. For some queries it can be a lot better.

What I suggest:
1. Compare the two postgresql.conf files and any other config settings and 
make sure you know what differences there are and why.
2. Identify what queries seem to be the cause of the problem, and pick one you 
think is a good example.
3. VACCUM FULL and ANALYZE both 7.1 and 7.4 databases.
4. Stop the 7.4 server and run EXPLAIN ANALYSE (from psql) for 7.1. If you 
want to make sure the data is cached, run it three times and use the last 
one.
While you're running the explain, execute "vmstat 1 > vmstat.trace-7.1.txt" in 
another terminal.
5. Stop the 7.1 server, restart 7.4 and run EXPLAIN ANALYSE (from psql) for 
7.4, same as before, verify that the same results are returned. Same as 
before for the caching.
Trace using vmstat for this too.

Post the query SQL and both EXPLAIN ANALYSE outputs along with table 
definitions and row counts for each table involved and we can see exactly 
where the problem is. If you think this is too much info, you can post it on 
the web instead, that's fine.
If you're using the vmstat info, you could post that too, otherwise just keep 
it safe in case it's needed.
 

Re: performance problem aftrer update from 7.1 to 7.4.2

From
"Development - multi.art.studio"
Date:
hello again,

i did some mistakes, site-id was wrong in my simple select querys.
(interesting that no matches would take double time on 7.4)

here are the 'right' results, showing to me postgres 7.4 is slightly
slower with simple querys (but 1000-times faster with more complex querys ;-)

with production 7.1:

mcms09=> select count(id) from newsletter where aktiv=1 and site_id='m200384-000';
 count
-------
   845
(1 row)

---some users where working on it and added two newsletter since last dump....

mcms09=> explain select * from newsletter where aktiv=1 and site_id='m200384-000' order by id desc,date desc;
NOTICE:  QUERY PLAN:
Sort  (cost=123.78..123.78 rows=841 width=84)
  ->  Seq Scan on newsletter  (cost=0.00..82.95 rows=841 width=84)
EXPLAIN


and 7.4 test:

mcms=# select count(id) from newsletter where aktiv=1 and site_id='m200384-000';
 count
-------
   843
(1 row)

mcms=# explain select * from newsletter where aktiv=1 and site_id='m200384-000' order by id desc,date desc;
                                QUERY PLAN
---------------------------------------------------------------------------
 Sort  (cost=124.81..126.91 rows=841 width=598)
   Sort Key: id, date
   ->  Seq Scan on newsletter  (cost=0.00..83.95 rows=841 width=598)
         Filter: ((aktiv = 1) AND ((site_id)::text = 'm200384-000'::text))
(4 rows)


so in 7.4 seq-scan is about '1'  (ms or what?) slower than 7.1

really thats not a very big table, my tests will go on!
if any idea, please let me know.

thanks in advance.
volker

Development - multi.art.studio wrote:
hello,

>Richard Huxton wrote:
>What I suggest:
>1. Compare the two postgresql.conf files and any other config settings and 
>make sure you know what differences there are and why.
>2. Identify what queries seem to be the cause of the problem, and pick one you 
>think is a good example.
>3. VACCUM FULL and ANALYZE both 7.1 and 7.4 databases.
>4. Stop the 7.4 server and run EXPLAIN ANALYSE (from psql) for 7.1. If you 
>want to make sure the data is cached, run it three times and use the last 
>one.
>While you're running the explain, execute "vmstat 1 > vmstat.trace-7.1.txt" in 
>another terminal.
>5. Stop the 7.1 server, restart 7.4 and run EXPLAIN ANALYSE (from psql) for 
>7.4, same as before, verify that the same results are returned. Same as 
>before for the caching.
>Trace using vmstat for this too. 

thanks for the great hints, last days i was trying and trying...... restoring databases using a long time.
but i was surprised on the results.
ok, im not finished because i can stop the production dbs only late in the night after telling some people...but here are some 'heavy' results

i dumped out my database with pgdump from 7.4 as before , one dump with structure only, one with inserts
after that, doing a 'vacuum full' on 7.4 and stopping new 7.4 database.

i created a test-query
(it will show the entries in table newsletter_send (historical datas) with links to valid newsletter and addresses always using a specific site-id ordered by send-date, address-id and internal-id

on old production 7.1:

Query:

mcms09=> select * from newsletter_send where site_id='m200384-000' and newsletter_id in (select id from newsletter where aktiv=1 and site_id='m200384-000') and newsletter_adr_id in (select id from newsletter_address where site_id='m200384-000' and aktiv=1) and sent is not null order by sent desc,newsletter_adr_id desc,id desc;
Cancel request sent
ERROR:  Query was cancelled.

------takes too long time, about 3 minutes without displaying a result ;-)

Explain:

mcms09=> explain select * from newsletter_send where site_id='m200384-000' and newsletter_id in (select id from newsletter where aktiv=1 and site_id='m200384-000') and newsletter_adr_id in (select id from newsletter_address where site_id='m200384-000' and aktiv=1) and sent is not null order by sent desc,newsletter_adr_id desc,id desc;
NOTICE:  QUERY PLAN:

Sort  (cost=4224132.30..4224132.30 rows=29063 width=54)
  ->  Index Scan using newsletter_sent_site_id_idx on newsletter_send  (cost=0.00..4221402.52 rows=29063 width=54)
        SubPlan
          ->  Seq Scan on newsletter_address  (cost=0.00..13.61 rows=133 width=4)
          ->  Seq Scan on newsletter  (cost=0.00..82.95 rows=841 width=4)
EXPLAIN

-------i do this querys about three or five times to be sure its cached

now i started 7.4 again and test it with fresh 7.4.2 restored database:

Query-Explain:

mcms=# explain select * from newsletter_send where site_id='m200384-000' and newsletter_id in (select id from newsletter where aktiv=1 and site_id='m200384-000') and newsletter_adr_id in (select id from newsletter_address where site_id='m200384-000' and aktiv=1) and sent is not null order by sent desc,newsletter_adr_id desc,id desc;
                                          QUERY PLAN
----------------------------------------------------------------------------------------------
 Sort  (cost=6205.83..6275.90 rows=28026 width=76)
   Sort Key: newsletter_send.sent, newsletter_send.newsletter_adr_id, newsletter_send.id
   ->  Hash IN Join  (cost=100.00..3409.13 rows=28026 width=76)
         Hash Cond: ("outer".newsletter_id = "inner".id)
         ->  Hash IN Join  (cost=13.94..2887.87 rows=30991 width=76)
               Hash Cond: ("outer".newsletter_adr_id = "inner".id)
               ->  Seq Scan on newsletter_send  (cost=0.00..2128.29 rows=87145 width=76)
                     Filter: (((site_id)::text = 'm200384-000'::text) AND (sent IS NOT NULL))
               ->  Hash  (cost=13.61..13.61 rows=133 width=4)
                     ->  Seq Scan on newsletter_address  (cost=0.00..13.61 rows=133 width=4)
                           Filter: (((site_id)::text = 'm200384-000'::text) AND (aktiv = 1))
         ->  Hash  (cost=83.95..83.95 rows=841 width=4)
               ->  Seq Scan on newsletter  (cost=0.00..83.95 rows=841 width=4)
                     Filter: ((aktiv = 1) AND ((site_id)::text = 'm200384-000'::text))
(14 rows)


....Heavy! its about 1000-times faster i think, and now postgres speeds up like a rocket.
i dont know why this happens......, but i did another query, because my idea is there are problems in sorting the table in a very simple query....hmmmm, if im not totally wrong ;-) tell me *g
________________________________

ok here are some more tests

Query 7.1:
mcms09=> explain select * from newsletter where aktiv=1 and site_id='m200384_000' order by id desc,date desc;
NOTICE:  QUERY PLAN:
Sort  (cost=9.29..9.29 rows=8 width=84)
  ->  Index Scan using newsletter_site_id_date_idx on newsletter  (cost=0.00..9.17 rows=8 width=84)
EXPLAIN

Query 7.4.2:
mcms=# explain select * from newsletter where aktiv=1 and site_id='m200384_000' order by id desc,date desc;
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Sort  (cost=18.67..18.70 rows=11 width=598)
   Sort Key: id, date
   ->  Index Scan using site_id_newsletter_key on newsletter  (cost=0.00..18.48 rows=11 width=598)
         Index Cond: ((site_id)::text = 'm200384_000'::text)
         Filter: (aktiv = 1)
(5 rows)

now it costs double time to query the table.
Explain show me it sorts the table up by id and date, after that it scans and uses filter.......is this the order postgres works? maybe sorting the table without filters before querying it costs a lot of performance, only result should be sorted.
hmmm,  i will start it with higher debuglevel and try to get out.
>Post the query SQL and both EXPLAIN ANALYSE outputs along with table 
>definitions and row counts for each table involved and we can see exactly 
>where the problem is. If you think this is too much info, you can post it on 
>the web instead, that's fine.
>If you're using the vmstat info, you could post that too, otherwise just keep 
>it safe in case it's needed.




thanks in advance, i updated http://www.erdtrabant.de/index.php?p=&l=de&i=60500

yours sincerely
volker
 

Richard Huxton wrote:
On Saturday 17 April 2004 13:23, Development - multi.art.studio wrote: 
hello everyone,

Richard Huxton wrote:   
It's not using the date index because it's using the id index - there's
only 10 matches, so that looks like a good choice to me. It takes less
than 1ms, so I'm not sure this is a good example of a problem.     
thanks all !
but this doesnt help me a lot, i dont know what what to do? im not an
expert. ok, i could use another site_id  and it will grow up and gives 500
or more rows back from the select clause. this takes a lot of more time.
would this help?   
If that is the problem, that's what you'll need to post.
 
for me it looks as pg742 is not as performant as 7.1 ? and takes a lot of
more time, and doesnt look for the correct index? but why this? why does it
take longer than in 7.1? any ideas? what can i do?   
It's not something people are generally seeing. In most cases performance is 
the same or slightly better. For some queries it can be a lot better.

What I suggest:
1. Compare the two postgresql.conf files and any other config settings and 
make sure you know what differences there are and why.
2. Identify what queries seem to be the cause of the problem, and pick one you 
think is a good example.
3. VACCUM FULL and ANALYZE both 7.1 and 7.4 databases.
4. Stop the 7.4 server and run EXPLAIN ANALYSE (from psql) for 7.1. If you 
want to make sure the data is cached, run it three times and use the last 
one.
While you're running the explain, execute "vmstat 1 > vmstat.trace-7.1.txt" in 
another terminal.
5. Stop the 7.1 server, restart 7.4 and run EXPLAIN ANALYSE (from psql) for 
7.4, same as before, verify that the same results are returned. Same as 
before for the caching.
Trace using vmstat for this too.

Post the query SQL and both EXPLAIN ANALYSE outputs along with table 
definitions and row counts for each table involved and we can see exactly 
where the problem is. If you think this is too much info, you can post it on 
the web instead, that's fine.
If you're using the vmstat info, you could post that too, otherwise just keep 
it safe in case it's needed.
 


Re: performance problem aftrer update from 7.1 to 7.4.2

From
Stephan Szabo
Date:
On Thu, 22 Apr 2004, Development - multi.art.studio wrote:

> i did some mistakes, site-id was wrong in my simple select querys.
> (interesting that no matches would take double time on 7.4)
>
> here are the 'right' results, showing to me postgres 7.4 is slightly
> slower with simple querys (but 1000-times faster with more complex
> querys ;-)
>
> with production 7.1:
>
> mcms09=> select count(id) from newsletter where aktiv=1 and
> site_id='m200384-000';
>  count
> -------
>    845
> (1 row)
>
> ---some users where working on it and added two newsletter since last
> dump....
>
> mcms09=> explain select * from newsletter where aktiv=1 and
> site_id='m200384-000' order by id desc,date desc;
> NOTICE:  QUERY PLAN:
> Sort  (cost=123.78..123.78 rows=841 width=84)
>   ->  Seq Scan on newsletter  (cost=0.00..82.95 rows=841 width=84)
> EXPLAIN
>
>
> and 7.4 test:
>
> mcms=# select count(id) from newsletter where aktiv=1 and
> site_id='m200384-000';
>  count
> -------
>    843
> (1 row)
>
> mcms=# explain select * from newsletter where aktiv=1 and
> site_id='m200384-000' order by id desc,date desc;
>                                 QUERY PLAN
> ---------------------------------------------------------------------------
>  Sort  (cost=124.81..126.91 rows=841 width=598)
>    Sort Key: id, date
>    ->  Seq Scan on newsletter  (cost=0.00..83.95 rows=841 width=598)
>          Filter: ((aktiv = 1) AND ((site_id)::text = 'm200384-000'::text))
> (4 rows)
>
>
> so in 7.4 seq-scan is about '1'  (ms or what?) slower than 7.1

The cost numbers from explain have no direct connection to real time.
You'd actually have to compare the time it took to get the results from
the two in order to see how they ran (and be very careful about caching
effects and the like).

Re: performance problem aftrer update from 7.1 to 7.4.2

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> On Thu, 22 Apr 2004, Development - multi.art.studio wrote:
>> so in 7.4 seq-scan is about '1'  (ms or what?) slower than 7.1

> The cost numbers from explain have no direct connection to real time.

... or at least, if your concern is poor choice of query plan, it's
folly to treat them as being equivalent to real time.  Actual runtimes
are the things to compare.

Also, I wouldn't care to bet that estimated costs from 7.1 are very
comparable to 7.4, anyway.  We rejiggered the cost models enough in
between that it's quite likely to be a meaningless comparison.

            regards, tom lane

Re: performance problem aftrer update from 7.1 to 7.4.2

From
"Development - multi.art.studio"
Date:
Hallo all,

thanks for the great help.
i tried around a bit more, using different queries and so on, but pgsql7.4 will always be somewhat slower than 7.1
finally i tested some php code , (after fixing all those wrong integer-values of '' to 0 or null ..... ) on it
 and it uses a lot of more time for fulltext search than 7.1 with less rows in the table, also dynamis site-management
will sometimes be more than 1 second slower.
i cant explain whats wrong, i dont matter, i will update my db and live with that, maybe optimize the querys, but theres not a lot what i can do here ;-)

if somebody is more interested in debugging this and getting out why performance decreases, tell me and i will send you the stuff, a test account and all you need.
i cant examine the problem, only can help

thanks for all, your sincerely
volker


Tom Lane wrote:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes: 
On Thu, 22 Apr 2004, Development - multi.art.studio wrote:   
so in 7.4 seq-scan is about '1'  (ms or what?) slower than 7.1     
 
The cost numbers from explain have no direct connection to real time.   
... or at least, if your concern is poor choice of query plan, it's
folly to treat them as being equivalent to real time.  Actual runtimes
are the things to compare.

Also, I wouldn't care to bet that estimated costs from 7.1 are very
comparable to 7.4, anyway.  We rejiggered the cost models enough in
between that it's quite likely to be a meaningless comparison.
		regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command   (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)