Performance question (PostgreSQL 7.1.3)

From: Medve Gábor
Hi there,

I have no experience with big databases. Now I have a database whose size
is approx. 7.6 GB (HW/OS: Intel Pentium III 600 MHz, 512 MB RAM, GNU/Linux
2.2.19 kernel). There is a big table in it with 25,318,267 tuples today.
If I run a simple query (SELECT x,y,z FROM table WHERE x = something) it
takes 2m42s to get the results. Is that a reasonable time?
 I've read the documentation and checked the kernel parameters (sem.h
and shmparam.h) and the parameter values in postgresql.conf, and I have
no idea what I could do to get the results in a shorter time.
Thanks in advance for any help.

Greetings from Hungary,
Gabor

Re: Performance question (PostgreSQL 7.1.3)

From: Jochem van Dieten
Medve Gábor wrote:

> Hi there,
>
> I have no experience with big databases. Now I have a database whose size
> is approx. 7.6 GB (HW/OS: Intel Pentium III 600 MHz, 512 MB RAM, GNU/Linux
> 2.2.19 kernel). There is a big table in it with 25,318,267 tuples today.
> If I run a simple query (SELECT x,y,z FROM table WHERE x = something) it
> takes 2m42s to get the results. Is that a reasonable time?


I don't know, you tell me ;) How many rows does the query actually return?
Can you run it under EXPLAIN? Did you VACUUM ANALYZE? Are there any indices
defined on "x"?

Jochem


Re: Performance question (PostgreSQL 7.1.3)

From: "Josh Berkus"
Gabor,

> I have no experience with big databases. Now I have a database whose
> size is approx. 7.6 GB (HW/OS: Intel Pentium III 600 MHz, 512 MB RAM,
> GNU/Linux 2.2.19 kernel). There is a big table in it with 25,318,267
> tuples today. If I run a simple query (SELECT x,y,z FROM table WHERE
> x = something) it takes 2m42s to get the results. Is that a reasonable
> time?
>  I've read the documentation and checked the kernel parameters (sem.h
> and shmparam.h) and the parameter values in postgresql.conf, and I have
> no idea what I could do to get the results in a shorter time.
> Thanks in advance for any help.

Hmmm.. for 25 million records on a single-processor machine, 2 minutes
isn't unreasonable, especially if you have IDE rather than SCSI drives.
Drive access times are your big bottleneck with really large databases.

Things to check:
1. Indexes ... use EXPLAIN to find out how Postgres is finding the data.
2. VACUUM and VACUUM ANALYZE frequency
3. postgresql.conf parameters: sort_mem, shared_buffers, wal_files (make
them high!) -- see the sketch just after this list.
4. Use multiple drives or RAID ... having the log files on a separate
drive from the database makes a huge difference for really large queries,
although mostly for UPDATE and INSERT queries.  Also, putting the OS on
a separate drive doesn't hurt.
5. Postgres version ... an upgrade always improves the query optimizer.
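
For point 3, a rough sketch of what those settings might look like in
postgresql.conf on a 512 MB machine (illustrative values only, not a
recommendation -- tune them to your workload, and note that a larger
shared_buffers may also require raising the kernel's SHMMAX):

    # postgresql.conf (PostgreSQL 7.1) -- illustrative values only
    shared_buffers = 8192     # in 8 kB pages; 8192 pages = 64 MB
    sort_mem = 8192           # per-sort memory, in kB
    wal_files = 8             # pre-allocated WAL segment files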

Also, how large is each tuple?  It makes a difference in retrieval.
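
If you have run VACUUM recently, you can get a rough average from the
system catalogs -- a sketch, where "bigtable" is a placeholder for your
table name and 8192 assumes the default block size:

    SELECT relpages, reltuples,
           relpages * 8192.0 / reltuples AS avg_bytes_per_tuple
      FROM pg_class
     WHERE relname = 'bigtable';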

Finally, I suggest that you read through the PGSQL-SQL list archives.
We discuss performance issues all the time on that list.

-Josh Berkus





______AGLIO DATABASE SOLUTIONS___________________________
                                       Josh Berkus
  Complete information technology      josh@agliodbs.com
   and data management solutions       (415) 565-7293
  for law firms, small businesses        fax 621-2533
    and non-profit organizations.      San Francisco