Thread: Postgres scalability and performance on Windows
Hi all,
I have a Postgres installation that's running at 70-80% CPU usage, while
an MSSQL 7 installation did 'roughly' the same thing with 1-2% CPU load.
Here's the scenario:
300 queries/second
Server: Postgres 8.1.4 on win2k server
CPU: Dual Xeon 3.6 GHz
Memory: 4GB RAM
Disks: 3 x 36GB, 15K RPM SCSI
C# based web application calling postgres functions using npgsql 0.7.
It's an almost completely read-only DB, apart from fortnightly updates.
Table 1 - About 300,000 rows with simple rectangles
Table 2 - 1 million rows
Total size: 300MB
Functions : Simple coordinate reprojection and intersection query + inner join of table1 and table2.
I think I have all the right indexes defined and indeed the performance for queries under low loads is fast.
==================================================================================
postgresql.conf has following settings
max_connections = 150
hared_buffers = 20000 # min 16 or max_connections*2, 8KB each
temp_buffers = 2000 # min 100, 8KB each
max_prepared_transactions = 25 # can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
work_mem = 512 # min 64, size in KB
#maintenance_work_mem = 16384 # min 1024, size in KB
max_stack_depth = 2048
effective_cache_size = 82728 # typically 8KB each
random_page_cost = 4 # units are one sequential page fetch
==================================================================================
SQL Server caches all the data in memory, which is making it faster (it uses about 1.2GB of memory, which is fine).
But Postgres has everything spread across 10-15 processes, with each process using about 10-30MB - not nearly enough to cache all the data - and ends up doing a lot of disk reads.
I've read that Postgres depends on the OS to cache the files; I wonder if this is not happening on Windows.
In any case I cannot believe that having 15-20 processes running on Windows helps. Why not spawn threads instead of processes, which might
be far less expensive and more efficient? Is there any way of doing this?
My question is, should I just accept the performance I am getting as the limit on Windows, or should I be looking at some other params that I might have missed?
Thanks,
Gopal
________________________________________________________________________
This e-mail has been scanned for all viruses by Star. The
service is powered by MessageLabs. For more information on a proactive
anti-virus service working around the clock, around the globe, visit:
http://www.star.net.uk
________________________________________________________________________
Gopal wrote:
> Functions : Simple coordinate reprojection and intersection query +
> inner join of table1 and table2.
>
> I think I have all the right indexes defined and indeed the performance
> for queries under low loads is fast.

Can you do an EXPLAIN ANALYZE on your queries, and send the results back
to the list just to be sure?

> SQL server caches all the data in memory which is making it faster
> (uses about 1.2GB memory - which is fine).
>
> But postgres has everything spread across 10-15 processes, with each
> process using about 10-30MB, not nearly enough to cache all the data
> and ends up doing a lot of disk reads.

I don't know Windows memory management very well, but let me just say
that it's not that simple.

> I've read that postgres depends on OS to cache the files, I wonder if
> this is not happening on windows.

Using the Task Manager, or whatever it's called these days, you can see
how much memory is used for caching.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Am 23.11.2006 um 23:37 schrieb Gopal:

> hared_buffers = 20000            # min 16 or max_connections*2, 8KB each

If this is not a copy & paste error, you should add the "s" at the
beginning of the line. Also you might want to set this to a higher
number. You are setting about 20000 * 8k = 160MB, which might be a bit
too small if you do a lot of queries spread over the whole dataset. I
don't know whether the memory management on Windows handles this well,
but you can give it a try.

> effective_cache_size = 82728     # typically 8KB each

Hmm. I don't know what the real effect of this might be, as the docs
state:

"This parameter has no effect on the size of shared memory allocated by
PostgreSQL, nor does it reserve kernel disk cache; it is used only for
estimation purposes."

You should try optimizing your shared_buffers to cache more of the data.

> But postgres has everything spread across 10-15 processes, with each
> process using about 10-30MB, not nearly enough to cache all the data
> and ends up doing a lot of disk reads.

It's not so easy. PostgreSQL maintains shared buffers which are
accessible by all processes for reading. On a Unix system you can see
this in the output of top - I don't know how this works on Windows.

> In any case I cannot believe that having 15-20 processes running on
> windows helps. Why not spawn threads instead of processes, which might
> be far less expensive and more efficient? Is there any way of doing
> this?

Because it brings you a whole lot of other problems? And because
PostgreSQL is not "made for Windows". PostgreSQL runs very well on
Linux, BSD, Mac OS X and others. The Windows version is quite young.

But before you blame things on PostgreSQL, you should give more
information about the query itself.

> My question is, should I just accept the performance I am getting as
> the limit on windows or should I be looking at some other params that
> I might have missed?

Post the "EXPLAIN ANALYSE SELECT <your query here>" output here. That
might help us understand why you get such a high CPU load.

cug
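As a back-of-the-envelope check of the sizing arithmetic above (assuming PostgreSQL's default 8 KB block size, which is what the config comments in this thread indicate):

```python
# Check the shared_buffers figure quoted above.
# PostgreSQL's page (block) size is 8 KB by default.
BLOCK_SIZE = 8 * 1024

shared_buffers_pages = 20000  # value from the posted postgresql.conf
shared_buffers_mib = shared_buffers_pages * BLOCK_SIZE / 1024**2

print(shared_buffers_mib)  # 156.25 MiB -- the "about 160MB" mentioned above
```

So with a 300 MB dataset, the posted setting caches only about half of it in shared buffers.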
Hi,
Thanks for your suggestions. Here’s an output of the explain analyse. I’ll change the shared_buffers and look at the behaviour again.
"Limit (cost=59.53..59.53 rows=1 width=28) (actual time=15.681..15.681 rows=1 loops=1)"
" -> Sort (cost=59.53..59.53 rows=1 width=28) (actual time=15.678..15.678 rows=1 loops=1)"
" Sort Key: ceil(subq.percentcover), (1::double precision / ds.maxresolution)"
" -> Hash Join (cost=58.19..59.52 rows=1 width=28) (actual time=15.630..15.663 rows=2 loops=1)"
" Hash Cond: ("outer".datasetid = "inner".did)"
" -> Seq Scan on tbl_metadata_dataset ds (cost=0.00..1.21 rows=21 width=24) (actual time=0.006..0.021 rows=21 loops=1)"
" -> Hash (cost=58.18..58.18 rows=1 width=12) (actual time=15.591..15.591 rows=2 loops=1)"
" -> Sort (cost=58.17..58.17 rows=1 width=117) (actual time=15.585..15.586 rows=2 loops=1)"
" Sort Key: tbl_metadata_chunks.datasetid"
" -> HashAggregate (cost=58.13..58.16 rows=1 width=117) (actual time=15.572..15.573 rows=2 loops=1)"
" -> Hash IN Join (cost=3.34..58.10 rows=7 width=117) (actual time=0.261..0.544 rows=50 loops=1)"
" Hash Cond: ("outer".datasetid = "inner".datasetid)"
" -> Bitmap Heap Scan on tbl_metadata_chunks (cost=2.05..56.67 rows=14 width=117) (actual time=0.204..0.384 rows=60 loops=1)"
" Filter: (chunkgeometry && '0103000020E6100000010000000500000058631EDF87ECC1BF608F3D1911694940A0958A8763C9C1BF535069BA846C494026B5F1284FABB8BFAB1577356E6C494094E1170D33F3B8BF7700CC99FA68494058631EDF87ECC1BF608F3D1 (..)"
" -> Bitmap Index Scan on tbl_metadata_chunks_idx3 (cost=0.00..2.05 rows=14 width=0) (actual time=0.192..0.192 rows=60 loops=1)"
" Index Cond: (chunkgeometry && '0103000020E6100000010000000500000058631EDF87ECC1BF608F3D1911694940A0958A8763C9C1BF535069BA846C494026B5F1284FABB8BFAB1577356E6C494094E1170D33F3B8BF7700CC99FA68494058631EDF87ECC (..)"
" -> Hash (cost=1.26..1.26 rows=10 width=4) (actual time=0.037..0.037 rows=10 loops=1)"
" -> Seq Scan on tbl_metadata_dataset (cost=0.00..1.26 rows=10 width=4) (actual time=0.005..0.024 rows=10 loops=1)"
" Filter: (typeofdataid = 1)"
"Total runtime: 15.871 ms"
Gopal
On Fri, 24 Nov 2006 09:22:45 +0100
Guido Neitzer <lists@event-s.net> wrote:

> > effective_cache_size = 82728     # typically 8KB each
>
> Hmm. I don't know what the real effect of this might be as the doc
> states:
>
> "This parameter has no effect on the size of shared memory allocated
> by PostgreSQL, nor does it reserve kernel disk cache; it is used
> only for estimation purposes."

This is a hint to the optimizer about how much of the database may be
in the OS level cache.

---------------------------------
Frank Wiles <frank@wiles.org>
http://www.wiles.org
---------------------------------
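To put that hint in concrete terms for this setup (assuming the default 8 KB page size and the 300 MB total data size reported at the top of the thread):

```python
# What the posted effective_cache_size hint amounts to.
BLOCK_SIZE = 8 * 1024

effective_cache_size_pages = 82728  # value from the posted postgresql.conf
cache_hint_mib = effective_cache_size_pages * BLOCK_SIZE / 1024**2

dataset_mib = 300  # total data size reported earlier in the thread
print(cache_hint_mib)                # ~646 MiB
print(cache_hint_mib > dataset_mib)  # True: planner assumes the OS cache holds it all
```

In other words, the planner is already being told the OS cache can hold the entire dataset; the setting changes plan cost estimates, not actual caching.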
"Gopal" <gopal@getmapping.com> writes: > Thanks for your suggestions. Here's an output of the explain analyse. What's the query exactly, and what are the schemas of the tables it uses (psql \d descriptions would do)? The actual runtime seems to be almost all spent in the hash aggregation step: > -> HashAggregate (cost=58.13..58.16 rows=1 width=117) (actual time=15.572..15.573 rows=2 loops=1) > -> Hash IN Join (cost=3.34..58.10 rows=7 width=117) (actual time=0.261..0.544 rows=50 loops=1) 15 msec seems like a long time to aggregate only 50 rows, so I'm wondering what aggregates are being calculated and over what datatypes... regards, tom lane
Tom,

This is the query and the schema...

Query is:

SELECT subq.percentCover, ds.datasetname, ds.maxresolution
FROM (
    SELECT sum(area(intersection(snaptogrid(chunkgeometry, 0.00000001),
               GeometryFromText('POLYGON((-0.140030845589332 50.8208343077265,-0.138958398039148 50.8478005422809,-0.0963639712296823 50.8471133071392,-0.0974609286275892 50.8201477285483,-0.140030845589332 50.8208343077265))', 4326)))
               * 100 / (0.00114901195862628)) AS percentCover,
           datasetid AS did
    FROM tbl_metadata_chunks
    WHERE chunkgeometry && GeometryFromText('POLYGON((-0.140030845589332 50.8208343077265,-0.138958398039148 50.8478005422809,-0.0963639712296823 50.8471133071392,-0.0974609286275892 50.8201477285483,-0.140030845589332 50.8208343077265))', 4326)
      AND datasetid IN (SELECT datasetid FROM tbl_metadata_dataset WHERE typeofdataid = 1)
    GROUP BY did
    ORDER BY did DESC
) AS subq
INNER JOIN tbl_metadata_dataset AS ds ON subq.did = ds.datasetid
ORDER BY ceil(subq.percentCover), 1/ds.maxresolution DESC;

Schema is

Table 1:

CREATE TABLE public.tbl_metadata_dataset
(
    datasetname varchar(70) NOT NULL,
    maxresolution real,
    typeofdataid integer NOT NULL,
    datasetid serial NOT NULL,
    CONSTRAINT "PK_Dataset" PRIMARY KEY (datasetid)
);

-- Indexes
CREATE INDEX dsnameindex ON tbl_metadata_dataset USING btree (datasetname);

-- Owner
ALTER TABLE public.tbl_metadata_dataset OWNER TO postgres;

-- Triggers
CREATE CONSTRAINT TRIGGER "RI_ConstraintTrigger_2196039" AFTER DELETE ON tbl_metadata_dataset FROM tbl_metadata_chunks NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_del"('dsid', 'tbl_metadata_chunks', 'tbl_metadata_dataset', 'UNSPECIFIED', 'datasetid', 'datasetid');
CREATE CONSTRAINT TRIGGER "RI_ConstraintTrigger_2196040" AFTER UPDATE ON tbl_metadata_dataset FROM tbl_metadata_chunks NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd"('dsid', 'tbl_metadata_chunks', 'tbl_metadata_dataset', 'UNSPECIFIED', 'datasetid', 'datasetid');

Table 2:

CREATE TABLE public.tbl_metadata_chunks
(
    chunkid serial NOT NULL,
    chunkgeometry geometry NOT NULL,
    datasetid integer NOT NULL,
    CONSTRAINT tbl_metadata_chunks_pkey PRIMARY KEY (chunkid),
    CONSTRAINT dsid FOREIGN KEY (datasetid) REFERENCES tbl_metadata_dataset(datasetid)
);

-- Indexes
CREATE INDEX idx_dsid ON tbl_metadata_chunks USING btree (datasetid);
CREATE UNIQUE INDEX tbl_metadata_chunks_idx2 ON tbl_metadata_chunks USING btree (nativetlx, nativetly, datasetid);
CREATE INDEX tbl_metadata_chunks_idx3 ON tbl_metadata_chunks USING gist (chunkgeometry);

-- Owner
ALTER TABLE public.tbl_metadata_chunks OWNER TO postgres;

-- Triggers
CREATE CONSTRAINT TRIGGER "RI_ConstraintTrigger_2194515" AFTER DELETE ON tbl_metadata_chunks FROM tbl_metadata_chunkinfo NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_restrict_del"('fk', 'tbl_metadata_chunkinfo', 'tbl_metadata_chunks', 'UNSPECIFIED', 'chunkid', 'chunkid');
CREATE CONSTRAINT TRIGGER "RI_ConstraintTrigger_2194516" AFTER UPDATE ON tbl_metadata_chunks FROM tbl_metadata_chunkinfo NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_restrict_upd"('fk', 'tbl_metadata_chunkinfo', 'tbl_metadata_chunks', 'UNSPECIFIED', 'chunkid', 'chunkid');
CREATE CONSTRAINT TRIGGER "RI_ConstraintTrigger_2196037" AFTER INSERT ON tbl_metadata_chunks FROM tbl_metadata_dataset NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins"('dsid', 'tbl_metadata_chunks', 'tbl_metadata_dataset', 'UNSPECIFIED', 'datasetid', 'datasetid');
CREATE CONSTRAINT TRIGGER "RI_ConstraintTrigger_2196038" AFTER UPDATE ON tbl_metadata_chunks FROM tbl_metadata_dataset NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_upd"('dsid', 'tbl_metadata_chunks', 'tbl_metadata_dataset', 'UNSPECIFIED', 'datasetid', 'datasetid');
"Gopal" <gopal@getmapping.com> writes: > This is the query and the schema.... > ... > select > sum(area(intersection(snaptogrid(chunkgeometry,0.00000001), > GeometryFromText('POLYGON((-0.140030845589332 > 50.8208343077265,-0.138958398039148 50.8478005422809,-0.0963639712296823 > 50.8471133071392,-0.0974609286275892 50.8201477285483,-0.140030845589332 > 50.8208343077265))',4326))) * 100/ (0.00114901195862628)) as > percentCover, So evidently area(intersection(snaptogrid(...))) takes about 300 microsec per row. The PostGIS hackers would have to comment on whether that seems out-of-line or not, and whether you can make it faster. regards, tom lane
On Nov 28, 2006, at 8:24 AM, Tom Lane wrote:

> "Gopal" <gopal@getmapping.com> writes:
>> This is the query and the schema....
>> ...
>> select
>> sum(area(intersection(snaptogrid(chunkgeometry,0.00000001),
>> GeometryFromText('POLYGON((-0.140030845589332
>> 50.8208343077265,-0.138958398039148
>> 50.8478005422809,-0.0963639712296823
>> 50.8471133071392,-0.0974609286275892
>> 50.8201477285483,-0.140030845589332
>> 50.8208343077265))',4326))) * 100/ (0.00114901195862628)) as
>> percentCover,
>
> So evidently area(intersection(snaptogrid(...))) takes about 300
> microsec per row. The PostGIS hackers would have to comment on
> whether that seems out-of-line or not, and whether you can make it
> faster.

This is consistent with the typical cost for GIS geometry ops -- they
are relatively expensive. When running queries against PostGIS fields
for our apps, about half the CPU time will be spent inside the geometry
ops. Fortunately, there is significant opportunity for improvement in
the performance of the underlying code if anyone found the time to
optimize (and uglify) it for raw speed.

Cheers,

J. Andrew Rogers