Re: Less rows -> better performance? - Mailing list pgsql-performance

From Christian GRANDIN
Subject Re: Less rows -> better performance?
Date
Msg-id 1568f9ad0807210700k78d55744mcd36838df5b78e8e@mail.gmail.com
In response to Re: Less rows -> better performance?  (Richard Huxton <dev@archonet.com>)
List pgsql-performance
Hi,

Reducing the amount of data will only have an effect on table scans or index scans. If your queries are selective and optimized, it will have no effect.

Before looking for solutions, the first thing to do is to understand what's happening.

If you already know the queries, then EXPLAIN them. Otherwise, you must log statement durations with the log_statement and log_min_duration_statement parameters in postgresql.conf.
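For example, a minimal postgresql.conf sketch (the 250 ms threshold is only illustrative; reload the server after changing these):

# postgresql.conf -- log every statement that runs longer than 250 ms
log_min_duration_statement = 250
# or log everything, regardless of duration:
# log_statement = 'all'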

Before this, you should at least run VACUUM ANALYZE on the database to remove dead row versions and collect up-to-date statistics, so that the explain plans you look at are current.
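For instance (the table and column names below are only illustrative):

-- refresh planner statistics for the whole database
VACUUM ANALYZE;
-- then look at the actual plan and timing of a suspect query
EXPLAIN ANALYZE SELECT * FROM lectures WHERE semester_id = 42;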

Best regards.

Christian

2008/7/21 Richard Huxton <dev@archonet.com>
Andreas Hartmann wrote:

Here's some info about the actual amount of data:

SELECT pg_database.datname,
pg_size_pretty(pg_database_size(pg_database.datname)) AS size
FROM pg_database where pg_database.datname = 'vvz_live_1';

    datname    |  size
---------------+---------
 vvz_live_1    | 2565 MB

I wonder why the actual size is so much bigger than the data-only dump - is this because of index data etc.?
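A quick way to check is to break the size down per table; pg_total_relation_size() includes indexes and TOAST data, while pg_relation_size() counts the table alone:

SELECT relname,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size,
       pg_size_pretty(pg_relation_size(oid)) AS table_only
FROM pg_class
WHERE relkind = 'r'
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 10;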

I suspect Guillaume is right and you've not been vacuuming. That or you've got a *LOT* of indexes. If the database is only 27MB dumped, I'd just dump/restore it.
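The dump/restore cycle could look like this (the new database name is just an example; do this while the application is offline):

pg_dump vvz_live_1 > vvz_live_1.sql
createdb vvz_live_1_fresh
psql -f vvz_live_1.sql vvz_live_1_fresh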

Since the database is read-only, it might be worth running CLUSTER on the main tables if there's a sensible ordering for them.
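For example (the table and index names are made up; CLUSTER ... USING is the 8.3 spelling):

-- rewrite the table in the physical order of one of its indexes
CLUSTER lectures USING lectures_semester_idx;
-- statistics should be refreshed after the rewrite
ANALYZE lectures;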


What in particular is slow?

There's no particular bottleneck (at least none that we're aware of). During the first couple of days after the beginning of the semester, the application's request processing tends to slow down due to the high load (many students assembling their schedules). The customer upgraded the hardware (which already helped a lot), but they asked us to find further approaches to performance optimization.

1. Cache sensibly at the application level (I should have thought there's plenty of opportunity here).
2. Make sure you're using a connection pool and have sized it reasonably (try 4, 8, 16 and see what loads you can support).
3. Use prepared statements where it makes sense. Not sure how you'll manage the interplay between this and connection pooling in JDBC. Not a Java man I'm afraid.
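At the SQL level a prepared statement looks like the sketch below (names are made up; in JDBC the PreparedStatement class plays the same role):

-- parse and plan once ...
PREPARE student_schedule(int) AS
  SELECT * FROM lectures WHERE student_id = $1;
-- ... then execute repeatedly with different parameters
EXECUTE student_schedule(42);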

If you're happy with the query plans, then you're looking to reduce overheads as much as possible during peak times.

4. Offload more of the processing to clients with some fancy ajax-ed interface.
5. Throw in a spare machine as an app server for the first week of term. Presumably your load is 100 times average at this time.

--
 Richard Huxton
 Archonet Ltd



