Thread: firebird X postgresql 8.1.2 windows, performance comparison
Hello, A friend asked for help to accelerate some postgresql queries on postgresql 8.1.2 for windows. He is comparing with firebird. Firebird was being up to 90 times faster at some queries. Attached is a gziped text file containing some steps I tried on a simple example query. Could get improvements from 270 seconds to 74 seconds. But Firebird effortlessly still can perform the same query at 20 seconds. Please, do you have some suggestion? Thanks. Andre Felipe Machado
Attachment
andremachado wrote: > Hello, > A friend asked for help to accelerate some postgresql queries on postgresql > 8.1.2 for windows. > He is comparing with firebird. > Firebird was being up to 90 times faster at some queries. > Attached is a gziped text file containing some steps I tried on a simple > example query. > Could get improvements from 270 seconds to 74 seconds. > But Firebird effortlessly still can perform the same query at 20 seconds. > Please, do you have some suggestion? > Thanks. > Try increasing your work mem and shared buffers considerably. Sincerely, Joshua D. Drake > Andre Felipe Machado > > > ------------------------------------------------------------------------ > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > -- The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 PostgreSQL Replication, Consulting, Custom Development, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: PLphp, PLperl - http://www.commandprompt.com/
Andre,
I noticed that enable_bitmapscan and enable_seqscan are off, is there a reason for it? Have you tried with enable_bitmapscan on?
How much RAM do you have? What kind of disks are being used?
Beste regards,
Reimer
55-47-33270878
Blumenau - SC - Brazil
andremachado <andremachado@techforce.com.br> escreveu:
Hello,
A friend asked for help to accelerate some postgresql queries on postgresql
8.1.2 for windows.
He is comparing with firebird.
Firebird was being up to 90 times faster at some queries.
Attached is a gziped text file containing some steps I tried on a simple
example query.
Could get improvements from 270 seconds to 74 seconds.
But Firebird effortless ly still can perform the same query at 20 seconds.
Please, do you have some suggestion?
Thanks.
Andre Felipe Machado
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
Yahoo! Acesso Grátis
Internet rápida e grátis. Instale o discador agora!
On Tue, 2006-03-07 at 10:29, andremachado wrote: > Hello, > A friend asked for help to accelerate some postgresql queries on postgresql > 8.1.2 for windows. > He is comparing with firebird. > Firebird was being up to 90 times faster at some queries. > Attached is a gziped text file containing some steps I tried on a simple > example query. > Could get improvements from 270 seconds to 74 seconds. > But Firebird effortlessly still can perform the same query at 20 seconds. > Please, do you have some suggestion? First off, PostgreSQL on Windows is still kinda new, so it's quite possible that on some flavor of unix the disparity we're seeing wouldn't be so great. You may be seeing some issue with PostgreSQL's fairly new windows port instead of some basic postgresql problem. Is this running on the same basic hardware for both databases? I would imagine so, but just wanted to check. As someone else mentioned, try cranking up work mem, and to a lesser extent, shared_buffers. Also, as mentioned, why are bitmap scans and seq scans turned off? Bitmap scans are quite a nice improvement, and sometimes, a sequential scan is faster than an index. Forcing PostgreSQL to always use an index it not really a good idea. Lastly, I noticed that after you clusters on all your indexes, the query planner switched from a merge join to a hash join, and it was slower. You might wanna try turning off hash joins for a quick test to see if merge joins are any faster. Lastly, you might want to compare the two databases running on linux or BSD to see how they compare there.
Scott Marlowe wrote: > Lastly, I noticed that after you clusters on all your indexes, the query > planner switched from a merge join to a hash join, and it was slower. > You might wanna try turning off hash joins for a quick test to see if > merge joins are any faster. Anyway please note that clustering "all indexes" does not really make sense. You can cluster only on one index. If you cluster on another, then the first clustering will be lost. Better make sure to cluster on the one index where it makes the most difference. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Tue, 2006-03-07 at 11:15, Alvaro Herrera wrote: > Scott Marlowe wrote: > > > Lastly, I noticed that after you clusters on all your indexes, the query > > planner switched from a merge join to a hash join, and it was slower. > > You might wanna try turning off hash joins for a quick test to see if > > merge joins are any faster. > > Anyway please note that clustering "all indexes" does not really make > sense. You can cluster only on one index. If you cluster on another, > then the first clustering will be lost. Better make sure to cluster on > the one index where it makes the most difference. Note that I was referring to his clustering on an index for each table. I.e. not on every single index. but he clustered on four tables / indexes at once, so that was what I was referring to. Sorry for any confusion there. So, do you see any obvious, low hanging fruit here?
Scott Marlowe wrote: > On Tue, 2006-03-07 at 11:15, Alvaro Herrera wrote: > > Scott Marlowe wrote: > > > > > Lastly, I noticed that after you clusters on all your indexes, the query > > > planner switched from a merge join to a hash join, and it was slower. > > > You might wanna try turning off hash joins for a quick test to see if > > > merge joins are any faster. > > > > Anyway please note that clustering "all indexes" does not really make > > sense. You can cluster only on one index. If you cluster on another, > > then the first clustering will be lost. Better make sure to cluster on > > the one index where it makes the most difference. > > Note that I was referring to his clustering on an index for each table. > I.e. not on every single index. but he clustered on four tables / > indexes at once, so that was what I was referring to. Sorry for any > confusion there. Ah, sorry, I misinterpreted. > So, do you see any obvious, low hanging fruit here? Sorry, I didn't look at his test case very closely :-( -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Scott Marlowe <smarlowe@g2switchworks.com> writes: > So, do you see any obvious, low hanging fruit here? It would help if we were being told the whole truth about the settings being used. The first few plans are clearly suffering from the "enable_seqscan = off" error, but the last few don't seem to be. I don't trust the SHOW ALL at all since it disagrees with the immediately following retail SHOWs --- there is seemingly a whole lot of parameter changing going on that we are not being told about. It'd also be a good idea to know something about the datatypes involved, particularly for the join keys. regards, tom lane
Hello, Many thanks for the valuable suggestions and insights. The defaults enable_bitmapscan and enable_seqscan were altered by my friend. He already re enabled them (maybe even while I was trying some of the queries). The machine is a P4 3.2GHz, 1 GBram, sata hd, windows 2000. I did not used pg on win before to have any advice to my friend. The previously attached file contains SOME relevant info from the psql session, in order to not clutter file. When some server parameter was modified (at least by me) and server restarted, a new sholl parameter was issued to show the new value. Firebird is running at the same machine. As you can see by the session log, indexes were created on the columns used and tables was first clustered on the indexes actually used by the query. The subsequent cluster commands only recluster on the same indexes previously clustered. shared_buffers was increased from 1000 to 16384 pages effective_cache_size was increased from 1000 to 65535 pages and at the final steps REDUCED to 8192 pages work_mem was increased from 1024 first to 16384 KB and then to 65535 KB. The first 2 parameters reduced time 18%. work_mem reduced time almost 66%. But work_mem easily can exhaust ram with many users connected, as each connection query will use this amount of memory (if I can remember). How much it can grow at this 1 gbram win machine? Some of the docs I already read suggested that indexes should be entirely contained in ram. How to dimension the parameters? Other docs adviced that some memory parameters could actually degrade performance if too big. There are peak points at the performance curve by adjusting mem parameters. I hope tomorrow execute explain with the bitmapscan and seqscan enabled. bitmapscans are almost always faster? The data, as far I know, are a sample real app data (hey, if and when in production it will be even large?). They are almost true random as my friend informed, and according to him, cluster should not really be of benefit. It seems confirmed by the various explain analyze commands before and after clustering. Any suggestions? Do you see some obvious error on the steps at the previous session log file? It seems that Firebird windows can use adequately as much ram it finds and postgresql windows can not. How dimension ram to the indexes? Only by trial and error? I tried some suggested values found at some tuning docs suitable to the available system ram. Thanks Andre Felipe
Andre,
Could not Postgresql file cache being killed by firebird activity?
Haven´t you tried decrease ramdom_page_cost to 3 or 2?
It would be better if only one person will make configuration changes,
otherwise it will be difficult to measure each configuration change impact.
Reimer
Andre Felipe Machado <andremachado@techforce.com.br> escreveu:
Haven´t you tried decrease ramdom_page_cost to 3 or 2?
It would be better if only one person will make configuration changes,
otherwise it will be difficult to measure each configuration change impact.
Reimer
Andre Felipe Machado <andremachado@techforce.com.br> escreveu:
Hello,
Many thanks for the valuable suggestions and insights.
The defaults enable_bitmapscan and enable_seqscan were altered by my
friend. He already re enabled them (maybe even while I was trying some
of the queries).
The machine is a P4 3.2GHz, 1 GBram, sata hd, windows 2000. I did not
used pg on win before to have any advice to my friend.
The previously attached file contains SOME relevant info from the psql
session, in order to not clutter file.
When some server parameter was modified (at least by me) and server
restarted, a new sholl parameter was issued to show the new value.
Firebird is running at the same machine.
As you can see by the session log, indexes were created on the columns
used and tables was first clustered on the indexes actually used by the
query.
The subsequent cluster commands only recluster on the same indexes
previously clustered.
shared_buffers was increased from 1000 to 16384 pages
effective_cache_size was increased from 1000 to 65535 pages and at the
final steps REDUCED to 8192 pages
work_mem was increased from 1024 first to 16384 KB and then to 65535
KB.
The first 2 parameters reduced time 18%.
work_mem reduced time almost 66%.
But work_mem easily can exhaust ram with many users connected, as each
connection query will use this amount of memory (if I can remember).
How much it can grow at this 1 gbram win machine?
Some of the docs I already read suggested that indexes should be
entirely contained in ram. How to dimension the parameters?
Other docs adviced that some memory parameters could actually degrade
performance if too big. There are peak points at the performance curve
by adjusting mem parameters.
I hope tomorrow execute explain with the bitmapscan and seqscan enabled.
bitmapscans are almost always faster?
The data, as far I know, are a sample real app data (hey, if and when in
production it will be even large?). They are almost true random as my
friend informed, and according to him, cluster should not really be of
benefit. It seems confirmed by the various explain analyze commands
before and after clustering.
Any suggestions? Do you see some obvious error on the steps at the
previous session log file?
It seems that Firebird windows can use adequately as much ram it finds
and postgresql windows can not. How dimens ion ram to the indexes? Only
by trial and error? I tried some suggested values found at some tuning
docs suitable to the available system ram.
Thanks
Andre Felipe
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
Yahoo! Acesso Grátis
Internet rápida e grátis. Instale o discador agora!
> I hope tomorrow execute explain with the bitmapscan and seqscan enabled. > bitmapscans are almost always faster? Like all the rest, they're just a tool, which works great when used in its intended purpose : - Fetching just a few percent of the rows from a table is better served by an index scan - Fetching a lot of rows (>30-50%) from a table is better served by a seq scan - Bitmap scan comes in between and it's a very welcome addition. Also Bitmap scan will save your life if you have complex searches, like if you run a dating site and have an index on blondes and an index on boob size, because it can use several indexes in complex AND/OR queries. Common wisdom says simpler databases can be faster than postgres on simple queries. Reality check with pg 8.1 driven by PHP : - SELECT 1 mysql 5 ~ 42 us postgres ~ 70 us - SELECT * FROM users WHERE id=1 mysql 5 ~ 180 us postgres ~ 160 us Of course people doing stupid things, like using the database to keep a hit counter on their website which is updated on every hit, will say that postgres is slow.
Andre Felipe Machado wrote: >It seems that Firebird windows can use adequately as much ram it finds >and postgresql windows can not. > > PostgreSQL relies on the OS cache to utilize RAM. Make sure that most of the RAM is 'available' so Windows can do its thing. effective_cache_size should be set correspondingly high - at least 65535. shared_buffers should be as low as you can get away with (allowing for multiple users). 16384 is 12.5% of your RAM and far too high. AFAIK, PostgreSQL still doesn't differentiate between index blocks and data blocks. >work_mem reduced time almost 66%. >But work_mem easily can exhaust ram with many users connected, as each >connection query will use this amount of memory (if I can remember). >How much it can grow at this 1 gbram win machine? > > work_mem has to be just big enough to allow hash joins to operate efficiently. This varies from query to query and can be set in your code accordingly. However, the 1024 default is just too low for most applications and you'll probably find even 4096 is a huge improvement. You need to find the minimum that delivers acceptable performance in most queries and boost it for selected queries as required. BTW, which version of Firebird is this?
Hello, I got good results on tuning postgresql performance for my friend. One of the queries took almost 10 minutes. Now it completes on 26 miliseconds! (at the second run) A combination of query otimization, indexes choosing (with some droping and clustering), server parameters reconfigurations. Firebird still execute it on almost 2 minutes, much slower. Firebird is much slower than Postgresql at queries without joins. Postgresql is lightning faster than Firebird when manually tunned and without using joins and aggregates functions. The example query and its explain analyze results are attached, with the "show all" output of each config iteration, and indexes created. (UPDATE: i am sending msg from home and does not have the correct log file here. Will send the file at monday....) BUT there are some issues still unknown. The example query executes consistently at 56 seconds, and even at 39 seconds. Firebird executes the same query at 54 seconds the first time and at 20 seconds at next times. Today I went to the machine (was previously executing pg commands remotely) to observe the windows behaviour. Postgresql uses around 30% cpu and hard disk heavily (not so as vacuum) at all executions. Firebird uses around 40% cpu and hard disk heavily at the first execution. The second execution uses around 60% cpu and **NO** disk activity. The previously cited query running at 26 miliseconds down from 10 minutes, can achieve this performance at the second run, with **NO** disk activity. At the first run it uses 1,7 seconds, down from 10 minutes. The hard disk is clearly a bottleneck. 1,7 seconds against 26 miliseconds. So, How "convince" postgresql to use windows disk cache or to read all indexes to ram? It seems that effective_cache_size does not tell postgresql to actually use windows disk cache. What parameter must be configured? Do you have some suggestions? Regards. Andre Felipe Machado www.techforce.com.br
On Fri, Mar 10, 2006 at 10:39:57PM -0300, Andre Felipe Machado wrote: > It seems that effective_cache_size does not tell postgresql to actually > use windows disk cache. No, it just tells PostgreSQL how much cache memory it should expect to have. > What parameter must be configured? > Do you have some suggestions? Well, you could try increasing shared_buffers, but the real question is why Windows isn't caching the data. Are you sure that the data you're reading is small enough to fit entirely in memory? Remember that Firebird has a completely different on-disk storage layout than PostgreSQL, so just because the table fits in memory there doesn't mean it will do so on PostgreSQL. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Andre Felipe Machado wrote: >Postgresql uses around 30% cpu and hard disk heavily (not so as vacuum) >at all executions. >Firebird uses around 40% cpu and hard disk heavily at the first >execution. >The second execution uses around 60% cpu and **NO** disk activity. > >The previously cited query running at 26 miliseconds down from 10 >minutes, can achieve this performance at the second run, with **NO** >disk activity. >At the first run it uses 1,7 seconds, down from 10 minutes. > >The hard disk is clearly a bottleneck. >1,7 seconds against 26 miliseconds. > > >So, >How "convince" postgresql to use windows disk cache or to read all >indexes to ram? >It seems that effective_cache_size does not tell postgresql to actually >use windows disk cache. >What parameter must be configured? >Do you have some suggestions? > > Assuming these are selects and that you have already vacuumed, etc. Look at memory useage. It seems likely that you have a difference in caching behavior. PostgreSQL has its own cache, and failing that will use the OS disk cache. So there may be a number of possible issues involved including whether the data is staying in the OS cache, how much memory is being used for caching, etc. It is also likely that the Windows version of PostgreSQL may have some issues in these areas that the UNIX/Linux versions may not simply because it is more immature. You might even try a vacuum full to retrieve space. This may mean smaller tables, more likely to remain in disk cache, etc. But that would not provide any indication of scalability. Best Wishes, Chris Travers Metatron Technology Consulting