Thread: Terrible performance after deleting/recreating indexes
Hi,

Using PostgreSQL 7.4.2 on Solaris. I'm trying to improve performance on some queries to my databases, so I wanted to try out various index structures.

Since I'm going to be running my performance tests repeatedly, I created some SQL scripts to delete and recreate various index configurations. One of the scripts contains the commands for recreating the 'original' index configuration (i.e. the one I've already got some numbers for). The only thing is, now when I delete and recreate the original indexes and then run the query, I'm finding the performance has gone completely down the tubes compared to what I originally had. A query that used to take 5 minutes to complete now takes hours.

For what it's worth, my query looks something like:

    select * from tbl_1, tbl_2
    where tbl_1.id = tbl_2.id
      and tbl_2.name like 'x%y%'
      and tbl_1.x > 1234567890123
    order by tbl_1.x;

tbl_1 is very big (> 2 million rows)
tbl_2 is relatively small (7000 or so rows)
tbl_1.x is a numeric(13)
tbl_1.id & tbl_2.id are integers
tbl_2.name is a varchar(64)

I've run 'VACUUM ANALYZE' on both tables involved in the query. I also used 'EXPLAIN' and observed that the query plan is completely changed from what it was originally. Any idea why this would be? I would have thought that a freshly created index would have better performance, not worse. I have not done any inserts or updates since recreating the indexes.

thanks in advance,

Bill C
Bill Chandler wrote:
> Using PostgreSQL 7.4.2 on Solaris. [...] Now when I delete and recreate
> the original indexes and then run the query, I'm finding the performance
> has gone completely down the tubes compared to what I originally had. A
> query that used to take 5 minutes to complete now takes hours.
>
> I've run 'VACUUM ANALYZE' on both tables involved in the query. I also
> used 'EXPLAIN' and observed that the query plan is completely changed
> from what it was originally.

Get an EXPLAIN ANALYZE. That gives actual vs. planned time spent; see what is causing the difference. A discrepancy between planned and actual rows is usually an indication of out-of-date stats.

Which are the indexes on these tables? You should list fields with indexes first in the where clause. Also list the most selective field first, so that it eliminates as many rows as possible in the first scan.

I hope you have read the tuning articles on varlena.com and applied some basic tuning.

And post the table schema, hardware config, postgresql config (the important settings, of course) and EXPLAIN ANALYZE output for the queries. That would be something to start with.

Shridhar
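Concretely, the suggested command run against the query from the original post would look something like this (a sketch; the table names and the numeric literal are the poster's placeholders):

```sql
-- Compare the planner's estimated row counts and costs against reality.
-- Large estimate-vs-actual gaps usually point at stale statistics.
EXPLAIN ANALYZE
SELECT *
FROM tbl_1, tbl_2
WHERE tbl_1.id = tbl_2.id
  AND tbl_2.name LIKE 'x%y%'
  AND tbl_1.x > 1234567890123
ORDER BY tbl_1.x;
```

Unlike plain EXPLAIN, this actually executes the query, so it takes as long as the query itself does.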
Thanks for the advice.

On further review, it appears I am only getting this performance degradation when I run the command via a JDBC app. If I do the exact same query from psql, the performance is fine. I've tried both the JDBC2 and JDBC3 jars. Same results.

It definitely seems to correspond to deleting and recreating the indexes, though. The same query through JDBC worked fine before recreating the indexes. Does that make any sense at all?

thanks

Bill

--- Shridhar Daithankar <shridhar@frodo.hserus.net> wrote:
> Get an EXPLAIN ANALYZE. That gives actual vs. planned time spent; see
> what is causing the difference. A discrepancy between planned and
> actual rows is usually an indication of out-of-date stats.
> [...]
> And post the table schema, hardware config, postgresql config (the
> important settings, of course) and EXPLAIN ANALYZE output for the
> queries. That would be something to start with.
That is interesting - both psql and JDBC merely submit statements for the backend to process, so generally you would expect no difference in execution plan or performance.

It might be worth setting "log_statement = true" in postgresql.conf and checking that you are executing *exactly* the same statement in both JDBC and psql.

regards

Mark

P.S.: let's see the output from EXPLAIN ANALYZE :-)

Bill Chandler wrote:
> Thanks for the advice.
>
> On further review, it appears I am only getting this performance
> degradation when I run the command via a JDBC app. If I do the exact
> same query from psql, the performance is fine. I've tried both the
> JDBC2 and JDBC3 jars. Same results.
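For reference, the logging settings in question are plain booleans in the 7.4 series; a minimal postgresql.conf fragment (picked up after a server reload) might look like:

```
# postgresql.conf (7.4.x) -- assumes access to the server's config file
log_statement = true    # log every SQL statement received by the backend
log_duration  = true    # log how long each statement took to execute
```

With both enabled, each statement in the server log is followed by its elapsed time, which makes it easy to see exactly what the JDBC driver is sending and how long each piece takes.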
Thanks for this tip. It turns out there is a difference. I am using cursors (i.e. calling setFetchSize(5000) on my Statement) in JDBC, so the SQL statement is preceded by:

    DECLARE JDBC_CURS_1 CURSOR FOR ...

which is then followed by the SQL statement. This is followed by the separate statement:

    FETCH FORWARD 5000 FROM JDBC_CURS_1;

Also, I don't know if this is significant, but there are a few lines before both of these:

    set datestyle to 'ISO';
    select version(), case when pg_encoding_to_char(1) = 'SQL_ASCII'
        then 'UNKNOWN' else getdatabaseencoding() end;
    set client_encoding = 'UNICODE';
    begin;

The only thing is, though, none of this is new. I was using cursors before as well.

Here is the output from EXPLAIN ANALYZE. Hope it comes out readable:

                                        QUERY PLAN
    -------------------------------------------------------------------------------------
    Sort  (cost=50466.04..50470.45 rows=1765 width=114) (actual time=87237.003..88235.011 rows=108311 loops=1)
      Sort Key: iso_nep_data_update_events.lds
      ->  Merge Join  (cost=49240.03..50370.85 rows=1765 width=114) (actual time=56658.356..65221.995 rows=108311 loops=1)
            Merge Cond: ("outer".obj_id = "inner".obj_id)
            ->  Sort  (cost=198.01..198.16 rows=61 width=65) (actual time=175.947..181.172 rows=3768 loops=1)
                  Sort Key: iso_nep_control.obj_id
                  ->  Seq Scan on iso_nep_control  (cost=0.00..196.20 rows=61 width=65) (actual time=0.056..108.151 rows=3768 loops=1)
                        Filter: ((real_name)::text ~~ 'NEPOOL%REAL%'::text)
            ->  Sort  (cost=49042.02..49598.46 rows=222573 width=69) (actual time=56482.073..58642.901 rows=216528 loops=1)
                  Sort Key: iso_nep_data_update_events.obj_id
                  ->  Index Scan using iso_nep_due_idx1 on iso_nep_data_update_events  (cost=0.00..7183.18 rows=222573 width=69) (actual time=0.179..11739.104 rows=216671 loops=1)
                        Index Cond: (lds > 1088554754000::numeric)
    Total runtime: 88643.330 ms
    (13 rows)

Here is the actual query:

    select iso_nep_DATA_UPDATE_EVENTS.lds,
           iso_nep_DATA_UPDATE_EVENTS.tsds,
           iso_nep_DATA_UPDATE_EVENTS.value,
           iso_nep_DATA_UPDATE_EVENTS.correction,
           iso_nep_DATA_UPDATE_EVENTS.delta_lds_tsds,
           iso_nep_CONTROL.real_name,
           iso_nep_CONTROL.freq,
           iso_nep_CONTROL.type
    from iso_nep_DATA_UPDATE_EVENTS, iso_nep_CONTROL
    where iso_nep_CONTROL.real_name like 'NEPOOL%REAL%' escape '/'
      and iso_nep_DATA_UPDATE_EVENTS.obj_id = iso_nep_CONTROL.obj_id
      and iso_nep_DATA_UPDATE_EVENTS.lds > 1088554754000
    order by lds;

Two tables: iso_nep_data_update_events and iso_nep_control. Basically getting all columns from both tables, joining the tables on obj_id = obj_id. I have unique indexes on iso_nep_control.obj_id (clustered) and iso_nep_control.real_name, and non-unique indexes on iso_nep_data_update_events.lds and iso_nep_data_update_events.obj_id.

thanks,

Bill

--- Mark Kirkwood <markir@coretech.co.nz> wrote:
> It might be worth setting "log_statement = true" in postgresql.conf and
> checking that you are executing *exactly* the same statement in both
> JDBC and psql.
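Putting the logged statements together, the sequence the JDBC driver issues when setFetchSize(5000) is in effect is roughly the following (a sketch reconstructed from the log excerpts above; the cursor name is chosen by the driver, and "..." stands for the application's query):

```sql
BEGIN;                                   -- cursors only live inside a transaction
DECLARE JDBC_CURS_1 CURSOR FOR
    SELECT ...;                          -- the application's query, verbatim
FETCH FORWARD 5000 FROM JDBC_CURS_1;     -- issued again each time the app exhausts a batch
-- ... repeated until a FETCH returns fewer than 5000 rows ...
CLOSE JDBC_CURS_1;
COMMIT;
```

A properly behaving cursor executes the query once at DECLARE/first-FETCH time; subsequent FETCHes should only stream already-computed rows, which is why repeated 5-minute FETCH times are surprising.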
Cursors performance (was: Re: Terrible performance after deleting/recreating indexes)
From: Bill Chandler
Thanks to all who have responded. I now think my problem is not related to deleting/recreating indexes. Somehow it is related to JDBC cursors. It appears that what is happening is that since I'm using a fetch size of 5000, the command:

    FETCH FORWARD 5000 FROM JDBC_CURS_1

is being repeatedly sent to the server as I process the result set from my query. Each time this command is sent, it takes about 5 minutes to return, which is about the amount of time the whole query took to complete before the performance degradation. In other words, it looks as if the full select is being rerun on each fetch.

Now the mystery is why this is happening all of a sudden. I have been running with the fetch size set to 5000 for the last couple of weeks and it did not appear to be doing this (i.e. re-running the entire select statement again). Is this what I should expect when using cursors? I would have thought that the server should "remember" where it left off in the query since the last fetch and continue from there.

Could I have inadvertently changed a parameter somewhere that would cause this behavior?

thanks,

Bill
Bill,

What happens if you do this in psql? You can also turn on duration logging in the backend and log the queries.

dave

On Fri, 2004-07-09 at 16:24, Bill Chandler wrote:
> Thanks to all who have responded. I now think my problem is not related
> to deleting/recreating indexes. Somehow it is related to JDBC cursors.
> [...] Each time this command is sent, it takes about 5 minutes to
> return, which is about the amount of time the whole query took to
> complete before the performance degradation. In other words, it looks
> as if the full select is being rerun on each fetch.

--
Dave Cramer
519 939 0336
ICQ # 14675561
On Fri, 9 Jul 2004, Bill Chandler wrote:
> Thanks to all who have responded. I now think my problem is not related
> to deleting/recreating indexes. Somehow it is related to JDBC cursors.
> It appears that what is happening is that since I'm using a fetch size
> of 5000, the command:
>
> FETCH FORWARD 5000 FROM JDBC_CURS_1

If the top-level node of your execution plan is a sort step, it should take essentially no time to retrieve additional rows after the first fetch. The sort step materializes the results, so future fetches simply need to spit this data back to the client.

I would agree with Dave's suggestion to use log_duration and compare the values for the first and subsequent fetches.

Kris Jurka
Bill Chandler wrote:
> Now the mystery is why this is happening all of a sudden. I have been
> running with the fetch size set to 5000 for the last couple of weeks
> and it did not appear to be doing this (i.e. re-running the entire
> select statement again). Is this what I should expect when using
> cursors? I would have thought that the server should "remember" where
> it left off in the query since the last fetch and continue from there.

I'd check heap size, GC activity (-verbose:gc), CPU use, and swapping activity on the *client* side. It may be that your dataset size or physical memory or something similar has changed sufficiently that garbage collection resulting from the data in each 5k-row batch is killing you.

Can you try a trivial app that runs the same query (with the same fetchsize, autocommit, etc.) via JDBC and does nothing but step forward through the resultset, and see how fast it runs? Perhaps the problem is in your processing logic.

-O
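A minimal version of the trivial test app described above might look like the sketch below (the connection URL, credentials, and query string are placeholders; it assumes the PostgreSQL JDBC driver is on the classpath):

```java
import java.sql.*;

public class FetchTimer {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "user", "password");
        // Cursor-based fetching requires autocommit off, since the
        // driver's cursor only lives inside a transaction.
        conn.setAutoCommit(false);
        Statement st = conn.createStatement();
        st.setFetchSize(5000);              // same batch size as the real app
        long t0 = System.currentTimeMillis();
        ResultSet rs = st.executeQuery("SELECT ...");   // the query under test
        int rows = 0;
        while (rs.next()) {
            rows++;                         // no per-row work: isolates fetch cost
            if (rows % 5000 == 0) {
                System.out.println(rows + " rows in " +
                        (System.currentTimeMillis() - t0) + " ms");
            }
        }
        rs.close();
        st.close();
        conn.commit();
        conn.close();
    }
}
```

If the per-5000-row timings printed here are flat while the real application's are not, the problem is in the application's per-row processing rather than in the server or driver.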
Might be worth doing a little test:

i) modify your code to fetch 1 row at a time
ii) set log_duration = true in your postgresql.conf (as the other posters have suggested)

Then compare with running the query in psql.

regards

Mark

Bill Chandler wrote:
> Thanks to all who have responded. I now think my problem is not related
> to deleting/recreating indexes. Somehow it is related to JDBC cursors.
> [...] Could I have inadvertently changed a parameter somewhere that
> would cause this behavior?