Gabor,
> I have no experience with big databases. I now have a database whose
> size is approx. 7.6 GB (HW/OS: Intel Pentium III 600MHz, 512MB RAM,
> GNU/Linux 2.2.19 kernel). It contains a big table with 25318267 tuples
> as of today. A simple query (SELECT x,y,z FROM table WHERE x = something)
> takes 2m42s to return its results. Is that a reasonable time?
> I've read the documentation, checked the kernel parameters (sem.h and
> shmparam.h) and the parameter values in postgresql.conf, and I have no
> idea what I could do to get the results in a shorter time.
> Thanks in advance for any help.
Hmmm.. for 25 million records on a single-processor machine, 2 minutes
isn't unreasonable, especially if you have IDE rather than SCSI drives.
Drive access times are your big bottleneck with really large databases.
Things to check:
1. Indexes ... make sure the column in your WHERE clause is indexed, and
use EXPLAIN to find out how Postgres is finding the data (see the first
sketch after this list).
2. VACUUM and VACUUM ANALYZE frequency (second sketch below).
3. Postgresql.conf parameters: sort_mem, shared_buffers, wal_files (make
them high!) -- see the third sketch below.
4. Use multiple drives or RAID ... having the WAL log files on a separate
drive from the database makes a huge difference in really large queries,
although mostly for UPDATE and INSERT queries. Also, putting the OS on
a separate drive doesn't hurt.
5. Postgres version ... an upgrade generally improves the query optimizer.
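
For point 1, a minimal sketch of checking the plan and adding an index.
The table name "bigtable" and the value 42 are just placeholders for
your own:

   EXPLAIN SELECT x, y, z FROM bigtable WHERE x = 42;
   -- "Seq Scan on bigtable" means all 25 million rows get read;
   -- an index on the WHERE column usually fixes that:
   CREATE INDEX bigtable_x_idx ON bigtable (x);
   VACUUM ANALYZE bigtable;
   EXPLAIN SELECT x, y, z FROM bigtable WHERE x = 42;
   -- this should now show an Index Scan using bigtable_x_idx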
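
For point 2, on a table this size you want the planner's statistics kept
current; something like this, run regularly (e.g. nightly), is a
reasonable starting point:

   -- reclaims dead rows and refreshes planner statistics
   VACUUM ANALYZE bigtable;
   -- or, for the whole database:
   VACUUM ANALYZE;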
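
For point 3, a sketch of the relevant postgresql.conf lines. The numbers
are only illustrative -- size them to your 512MB of RAM, and note that
shared_buffers and wal_files only take effect after a restart:

   shared_buffers = 8192    # 8KB pages, so 8192 = 64MB of buffer cache
   sort_mem = 16384         # KB per sort; watch out with many concurrent sorts
   wal_files = 8            # pre-allocated 16MB WAL segments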
Also, how large is each tuple? Row width makes a difference in retrieval
time; the catalog query below gives a rough estimate.
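
A quick, rough way to estimate row width from the system catalogs (again,
"bigtable" is a placeholder, relpages are 8KB each, and the numbers are
only as fresh as your last VACUUM):

   SELECT relname, relpages, reltuples,
          relpages * 8192.0 / reltuples AS approx_bytes_per_row
   FROM pg_class
   WHERE relname = 'bigtable';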
Finally, I suggest that you read through the PGSQL-SQL list archives.
We discuss performance issues all the time on that list.
-Josh Berkus
______AGLIO DATABASE SOLUTIONS___________________________
Josh Berkus
Complete information technology josh@agliodbs.com
and data management solutions (415) 565-7293
for law firms, small businesses fax 621-2533
and non-profit organizations. San Francisco