Thread: SLOOOOOOOW
Sorry, but I'd better use this email address; I just hate to use Outlook for this stuff.

To Peter Eisentraut:

Yes, I've read the chapter in the manual.

To Michael Glaesemann:

Locally I run the database on my laptop (Dell D800, 1 GB RAM), but there within VMware with 512 MB of assigned RAM. The target platform, however, is a dual-processor machine with 2 GB.

But, and that's the big but here, I don't care. For me a database has to perform satisfactorily in the first place, without twiddling some obscure knobs or pushing levers just to get acceptable performance when I only have a small set of data. Heck, I'm talking about maybe 45,000 records in total!!! I mean, I used Interbase, MySQL, SQLite and SQL Server before, and for this project Postgres was set, so I had to use it. Which is fine, I wouldn't mind, if I didn't have such troubles.

What I'm working on is just a redesign of a database with some historically grown tables. Not much in them, but there are some tables which should be merged together and others which have to be split up. No big deal.

So my basic idea was to use the flexibility of rules to provide a transparent interface to the frontend, which has the big advantage of not having to change the frontend in most places at all. We have a bit of a homegrown framework (PHP) to show and manipulate the data on the frontend side. Unfortunately it is only easy if you access one table and don't have to update several tables. So my idea was to also use the rule system to put the data into the database and distribute it over the underlying tables, i.e. to have views which separate the physical model from the logical model. This is best practice, isn't it?

Unfortunately there seems to be no way to create triggers on views, which is what I need. A few insert rules are not enough, because I'm working with data that has just been created, so rules alone are not an option. OK, as a workaround I create a table which is there purely to carry an insert trigger that distributes the data over the underlying tables; for selecting, updating, and deleting, the rules are sufficient.

So I'm actually merging some tables with approximately 8,000 + 14,000 + 30,000 records in them, so we're talking about a small database. The performance of selecting data from the views is slow; I mean, there are only around 50,000 records in there in total. It can take up to several seconds to get the data from the views, which is just not fast enough. Updating is even slower; just updating one record takes ages.

The actual migration process, moving the old data to the new tables, is just agonizingly slow. Moving the data from the small table (8,000 entries) takes somewhere (not deterministic) between a few minutes and 40 minutes. Essentially it is just a select from one table into the compatibility view of the new table. It seems to me that each additional row makes the database slower. I noticed that either table (8,000 or 14,000 entries) migrates faster if it happens to be the first of the two. Migrating the 30,000 entries (and that has to be the last one) then takes **hours**!!!

The migration of each table consists of two parts: first move the data out of the table, then update all the linked tables (I had to remove the joins; afterwards they have to point to the new tables). At this point I'm using a lot of subselects (which are slow, but there is no other way). And I actually vacuum and analyze the database after each step; all useful indices are in place and are also being used.

I did some serious stuff with SQL Server and Interbase, and I **never** had those performance problems.
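For illustration, a minimal sketch of the setup described above: a view for reading, a rule for updates, and a helper table whose insert trigger distributes the data. All table, column, and function names here are made up for the example, not the real schema, and the rule only covers one column.

-- logical view over the merged physical tables
CREATE VIEW org_view AS
    SELECT o.orga_id, o.name, a.city
      FROM organisations o
      LEFT JOIN addresses a ON a.orga_id = o.orga_id;

-- updates through the view are rewritten onto the base table
CREATE RULE org_view_upd AS ON UPDATE TO org_view DO INSTEAD
    UPDATE organisations SET name = NEW.name WHERE orga_id = OLD.orga_id;

-- inserts go through a helper table, because triggers cannot be
-- attached to views in the PostgreSQL versions discussed here
CREATE TABLE org_insert_helper (orga_id integer, name text, city text);

CREATE OR REPLACE FUNCTION distribute_org() RETURNS trigger AS '
BEGIN
    INSERT INTO organisations (orga_id, name) VALUES (NEW.orga_id, NEW.name);
    INSERT INTO addresses     (orga_id, city) VALUES (NEW.orga_id, NEW.city);
    RETURN NEW;
END;
' LANGUAGE plpgsql;

CREATE TRIGGER org_insert_trg BEFORE INSERT ON org_insert_helper
    FOR EACH ROW EXECUTE PROCEDURE distribute_org();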
enough of ranting, but I'm totally frustrated

with best regards
Jürgen
On 9/6/05, Jürgen Rose <anykey@gmx.de> wrote:
> I did some serious stuff with SQL Server and Interbase, and I **never**
> had those performance problems.

On a laptop? Under VMware?

I have used MS SQL Server too, and find PostgreSQL to compare favorably in most cases. You may have found a case where it does not. I have found PostgreSQL to work fine out of the box in most cases, with the main 'obscure knob' that needs to be twiddled being sort_mem (or work_mem for 8.0 and up). If you give specifics of your database structure and migration functions, I am sure someone on the list will be able to help you get reasonable performance.

> enough of ranting, but I'm totally frustrated

It will work out.

- Ian

> with best regards
> Jürgen
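The knob Ian mentions can be tried per session before touching any config file, which makes it cheap to test whether sort memory is the bottleneck. A sketch, with a placeholder view name; the value is in kilobytes, and the setting is called work_mem from 8.0 on:

SET sort_mem = 16384;                          -- 16 MB per sort instead of the 1 MB default
EXPLAIN ANALYZE SELECT * FROM some_slow_view;  -- re-run the slow query and compare timings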
Jürgen Rose <anykey@gmx.de> writes:
> enough of ranting, but I'm totally frustrated

So are we, because you haven't provided nearly enough detail to let anyone help you. A complete test case would be good, for instance.

			regards, tom lane
On Tue, 2005-09-06 at 16:09, Jürgen Rose wrote:
> Sorry, but I'd better use this email address; I just hate to use Outlook
> for this stuff.
>
> To Peter Eisentraut:
>
> Yes, I've read the chapter in the manual.
>
> To Michael Glaesemann:
>
> Locally I run the database on my laptop (Dell D800, 1 GB RAM), but there
> within VMware with 512 MB of assigned RAM. The target platform, however,
> is a dual-processor machine with 2 GB.
>
> But, and that's the big but here, I don't care. For me a database has to
> perform satisfactorily in the first place, without twiddling some obscure
> knobs or pushing levers just to get acceptable performance when I only
> have a small set of data. Heck, I'm talking about maybe 45,000 records
> in total!!! I mean, I used Interbase, MySQL, SQLite and SQL Server
> before, and for this project Postgres was set, so I had to use it. Which
> is fine, I wouldn't mind, if I didn't have such troubles.

A couple of points:

1: You wouldn't buy the QE II (a big luxury liner) and complain that it doesn't work well for water skiing and is too complex. It's the QE II.

2: You've given us absolutely nothing we can go on to help you make postgresql work better for you. Nothing. Just one explain analyze output.

3: If you're running one OS on top of another, and then a database on top of that, and all you've got is 512 meg of RAM, don't expect stellar performance, especially from a database that uses shared memory like postgresql does.

4: Don't compare PostgreSQL to those other databases unless you're going to give it a chance. So far, you haven't done so, you've only complained.

5: I have tested a properly tuned PostgreSQL server that ran on about 1/2 the hardware (CPU speed, memory, RAID array) of an MSSQL server and easily outran it. But I took my time, read the docs, and tuned the server OS and PostgreSQL.

6: Databases may appear simple; they are not, and the more complex they are, the more you'll have to do to make full use of them.

So, have you been running vacuum and analyze, do you have the right indexes, are you using queries that can use those indexes, have you turned up sort_mem and a few other easily tweakable settings?

PostgreSQL's use of shared memory, combined with the VERY conservative shared-memory settings of many older operating systems, combined further with the need for PostgreSQL to run on dang near anything, means that, out of the box, it's often not as fast as some other servers. OTOH, it coexists well with other software. If you've ever tried to build an MSSQL or Oracle box that did anything else, you know how those two database engines just consume memory and CPU without really asking.

Here's an explain analyze on one of the production pgsql servers I work on:

explain analyze select count(*) from sometable

                                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=6209.99..6209.99 rows=1 width=0) (actual time=339.200..339.201 rows=1 loops=1)
   ->  Seq Scan on sometable  (cost=0.00..5856.19 rows=141519 width=0) (actual time=0.025..202.636 rows=162427 loops=1)
 Total runtime: 339.262 ms
(3 rows)

Admittedly, the rows aren't that big, but that's a seq scan of 160,000 rows. Not bad, really. And it's quite fast at our more esoteric report-generation queries as well.

Now, we can sit here and argue about how nice it would be if PostgreSQL just configured itself for maximum performance on installation, or you can tell us what runs slow and let us help you fix it. The ball is in your court.
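The routine maintenance Scott is asking about amounts to a one-liner. A sketch, using one of the table names that appears in the plans later in this thread; it should be run regularly (e.g. from cron), not just once after a bulk load:

VACUUM ANALYZE org_organisations;   -- reclaim dead tuples and refresh the planner statistics

or plain VACUUM ANALYZE; to cover every table in the database.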
Scott Marlowe wrote:
> On Tue, 2005-09-06 at 16:09, Jürgen Rose wrote:
>
>> Sorry, but I'd better use this email address; I just hate to use Outlook
>> for this stuff.
>>
>> To Peter Eisentraut:
>>
>> Yes, I've read the chapter in the manual.
>>
>> To Michael Glaesemann:
>>
>> Locally I run the database on my laptop (Dell D800, 1 GB RAM), but there
>> within VMware with 512 MB of assigned RAM. The target platform, however,
>> is a dual-processor machine with 2 GB.
>>
>> But, and that's the big but here, I don't care. For me a database has to
>> perform satisfactorily in the first place, without twiddling some obscure
>> knobs or pushing levers just to get acceptable performance when I only
>> have a small set of data. Heck, I'm talking about maybe 45,000 records
>> in total!!! I mean, I used Interbase, MySQL, SQLite and SQL Server
>> before, and for this project Postgres was set, so I had to use it. Which
>> is fine, I wouldn't mind, if I didn't have such troubles.
>
> A couple of points:
>
> 1: You wouldn't buy the QE II (a big luxury liner) and complain that it
> doesn't work well for water skiing and is too complex. It's the QE II.

I don't get that argument.

> 2: You've given us absolutely nothing we can go on to help you make
> postgresql work better for you. Nothing. Just one explain analyze
> output.

It is a bit complex, and I have now added two explains from the main queries to this mail.

> 3: If you're running one OS on top of another, and then a database on
> top of that, and all you've got is 512 meg of RAM, don't expect stellar
> performance, especially from a database that uses shared memory like
> postgresql does.

I don't expect mega performance, this is just a devel system. I mean, do you develop your database on the production system?

> 4: Don't compare PostgreSQL to those other databases unless you're
> going to give it a chance. So far, you haven't done so, you've only
> complained.

I have used it for over half a year now, and as I said in one of my previous mails, it is quite powerful and flexible, but I'm not impressed with the performance and it has its quirks.

> 5: I have tested a properly tuned PostgreSQL server that ran on about
> 1/2 the hardware (CPU speed, memory, RAID array) of an MSSQL server and
> easily outran it. But I took my time, read the docs, and tuned the
> server OS and PostgreSQL.

As I said before, I don't want to tune my system (my devel system!); in my opinion it has to run with OK performance out of the box.

> 6: Databases may appear simple; they are not, and the more complex they
> are, the more you'll have to do to make full use of them.
>
> So, have you been running vacuum and analyze, do you have the right
> indexes, are you using queries that can use those indexes, have you
> turned up sort_mem and a few other easily tweakable settings?
>
> PostgreSQL's use of shared memory, combined with the VERY conservative
> shared-memory settings of many older operating systems, combined further
> with the need for PostgreSQL to run on dang near anything, means that,
> out of the box, it's often not as fast as some other servers.

I can set the sort_mem to whatever I want, but postgres doesn't care. It just consumes my CPU time but no memory.

> OTOH, it coexists well with other software. If you've ever tried to
> build an MSSQL or Oracle box that did anything else, you know how those
> two database engines just consume memory and CPU without really asking.
> Here's an explain analyze on one of the production pgsql servers I work
> on:
>
> explain analyze select count(*) from sometable
>
>                                                               QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=6209.99..6209.99 rows=1 width=0) (actual time=339.200..339.201 rows=1 loops=1)
>    ->  Seq Scan on sometable  (cost=0.00..5856.19 rows=141519 width=0) (actual time=0.025..202.636 rows=162427 loops=1)
>  Total runtime: 339.262 ms
> (3 rows)
>
> Admittedly, the rows aren't that big, but that's a seq scan of 160,000
> rows. Not bad, really. And it's quite fast at our more esoteric
> report-generation queries as well.
>
> Now, we can sit here and argue about how nice it would be if PostgreSQL
> just configured itself for maximum performance on installation, or you
> can tell us what runs slow and let us help you fix it. The ball is in
> your court.

I'm sorry, but it is a bit complicated to explain the whole structure.

And my main point is that with each row I insert it becomes slower. I will try to reproduce that behaviour, but I don't have much time for it right now.

The following output is for the two main queries. There are some other views depending on them, but their tuning was left as an exercise for later; it seems that will be needed now. The basic idea is just to handle the stored persons and organisations in the same way, therefore using the same tables and having the advantage of being able to create arbitrary relationships between persons and/or organisations. There are more views and tables involved, but these are the ones which are used most, and in all other views.

                                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Left Join  (cost=15664.00..56623.19 rows=33545 width=241)
   Merge Cond: ("outer".orgr_id = "inner".orgr_id)
   ->  Merge Left Join  (cost=15664.00..46776.22 rows=33545 width=235)
         Merge Cond: ("outer".orgr_id = "inner".orgr_id)
         ->  Merge Left Join  (cost=15664.00..36636.71 rows=31800 width=229)
               Merge Cond: ("outer".orgr_id = "inner".orgr_id)
               ->  Merge Left Join  (cost=15664.00..26521.15 rows=31065 width=223)
                     Merge Cond: ("outer".orgr_id = "inner".orgr_id)
                     ->  Merge Left Join  (cost=15664.00..16414.74 rows=30347 width=217)
                           Merge Cond: ("outer".orgr_id = "inner".orgr_id)
                           ->  Sort  (cost=7524.64..7600.50 rows=30347 width=148)
                                 Sort Key: r.orgr_id
                                 ->  Merge Join  (cost=0.00..3729.72 rows=30347 width=148)
                                       Merge Cond: ("outer".orga_id = "inner".orga_id)
                                       ->  Index Scan using pk_org_organisations on org_organisations o  (cost=0.00..1615.77 rows=20175 width=126)
                                             Filter: (orga_type = 'O'::bpchar)
                                       ->  Index Scan using r_orga_orga_orga_orga_id_index on r_orga_orga r  (cost=0.00..1612.55 rows=59116 width=26)
                           ->  Sort  (cost=8139.36..8287.14 rows=59111 width=73)
                                 Sort Key: a.orgr_id
                                 ->  Seq Scan on org_addresses a  (cost=0.00..1634.11 rows=59111 width=73)
                     ->  Index Scan using org_contacts_orgr_id_index on org_contacts c1  (cost=0.00..9570.22 rows=60514 width=10)
                           Filter: (cont_type = 'E'::bpchar)
               ->  Index Scan using org_contacts_orgr_id_index on org_contacts c2  (cost=0.00..9570.22 rows=60514 width=10)
                     Filter: (cont_type = 'P'::bpchar)
         ->  Index Scan using org_contacts_orgr_id_index on org_contacts c3  (cost=0.00..9570.22 rows=62359 width=10)
               Filter: (cont_type = 'F'::bpchar)
   ->  Index Scan using org_contacts_orgr_id_index on org_contacts c4  (cost=0.00..9570.22 rows=23745 width=10)
         Filter: (cont_type = 'H'::bpchar)
(28 rows)

                                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Left Join  (cost=16199.20..88333.55 rows=27660 width=631)
   Merge Cond: ("outer".orgr_id = "inner".orgr_id)
   ->  Merge Left Join  (cost=16199.20..78455.64 rows=27660 width=625)
         Merge Cond: ("outer".orgr_id = "inner".orgr_id)
         ->  Merge Left Join  (cost=16199.20..68589.40 rows=27660 width=619)
               Merge Cond: ("outer".orgr_id = "inner".orgr_id)
               ->  Merge Left Join  (cost=16199.20..58696.51 rows=27660 width=613)
                     Merge Cond: ("outer".orgr_id = "inner".orgr_id)
                     ->  Merge Left Join  (cost=16199.20..48786.43 rows=27660 width=607)
                           Merge Cond: ("outer".orgr_id = "inner".orgr_id)
                           ->  Merge Left Join  (cost=16199.20..38719.72 rows=26221 width=601)
                                 Merge Cond: ("outer".orgr_id = "inner".orgr_id)
                                 ->  Merge Left Join  (cost=16199.20..28673.57 rows=25615 width=595)
                                       Merge Cond: ("outer".orgr_id = "inner".orgr_id)
                                       ->  Merge Left Join  (cost=16199.20..18634.97 rows=25023 width=589)
                                             Merge Cond: ("outer".orgr_id = "inner".orgr_id)
                                             ->  Merge Left Join  (cost=16199.20..16289.45 rows=25023 width=520)
                                                   Merge Cond: ("outer".orgr_id = "inner".orgr_id)
                                                   ->  Sort  (cost=15409.07..15471.62 rows=25023 width=487)
                                                         Sort Key: r.orgr_id
                                                         ->  Merge Left Join  (cost=1303.40..1941.01 rows=25023 width=487)
                                                               Merge Cond: ("outer".orga_id = "inner".orga_id)
                                                               ->  Merge Join  (cost=0.00..4558.02 rows=25023 width=226)
                                                                     Merge Cond: ("outer".orga_id = "inner".orga_id)
                                                                     ->  Merge Join  (cost=0.00..2455.34 rows=16636 width=175)
                                                                           Merge Cond: ("outer".orga_id = "inner".orga_id)
                                                                           ->  Index Scan using org_personsprivate_orga_id_index on org_persons p  (cost=0.00..631.83 rows=16635 width=112)
                                                                           ->  Index Scan using pk_org_organisations on org_organisations o  (cost=0.00..1517.52 rows=39301 width=63)
                                                                     ->  Index Scan using r_orga_orga_orga_orga_id_index on r_orga_orga r  (cost=0.00..1612.55 rows=59116 width=59)
                                                               ->  Sort  (cost=1303.40..1322.78 rows=7750 width=265)
                                                                     Sort Key: v.orga_id
                                                                     ->  Seq Scan on org_personsprivate v  (cost=0.00..227.50 rows=7750 width=265)
                                                   ->  Sort  (cost=790.14..812.97 rows=9133 width=37)
                                                         Sort Key: cp.orgr_id
                                                         ->  Seq Scan on org_contactpersons cp  (cost=0.00..189.33 rows=9133 width=37)
                                             ->  Index Scan using org_addresses_orgr_id_index on org_addresses a  (cost=0.00..1884.97 rows=59111 width=73)
                                       ->  Index Scan using org_contacts_orgr_id_index on org_contacts c1  (cost=0.00..9570.22 rows=60514 width=10)
                                             Filter: (cont_type = 'E'::bpchar)
                                 ->  Index Scan using org_contacts_orgr_id_index on org_contacts c2  (cost=0.00..9570.22 rows=60514 width=10)
                                       Filter: (cont_type = 'P'::bpchar)
                           ->  Index Scan using org_contacts_orgr_id_index on org_contacts c3  (cost=0.00..9570.22 rows=62359 width=10)
                                 Filter: (cont_type = 'F'::bpchar)
                     ->  Index Scan using org_contacts_orgr_id_index on org_contacts c4  (cost=0.00..9570.22 rows=37923 width=10)
                           Filter: (cont_type = 'M'::bpchar)
               ->  Index Scan using org_contacts_orgr_id_index on org_contacts c5  (cost=0.00..9570.22 rows=35502 width=10)
                     Filter: (cont_type = 'S'::bpchar)
         ->  Index Scan using org_contacts_orgr_id_index on org_contacts c6  (cost=0.00..9570.22 rows=31814 width=10)
               Filter: (cont_type = 'N'::bpchar)
   ->  Index Scan using org_contacts_orgr_id_index on org_contacts c7  (cost=0.00..9570.22 rows=33427 width=10)
         Filter: (cont_type = 'X'::bpchar)
(50 rows)
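These are plain EXPLAIN plans, i.e. planner estimates only; rerunning the same queries with EXPLAIN ANALYZE would add the actual per-node times and row counts, which is what makes the slow step visible. A sketch, with a placeholder view name rather than the real one:

EXPLAIN ANALYZE SELECT * FROM org_compat_view;

The "actual time" and "rows" figures printed next to each node can then be compared against the estimates shown above.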
Apparently postgresql runs at 11% to 45% of normal speed in VMware Workstation. Basically it could be about 1/10th the performance for OLTP stuff. See here:

http://www.cl.cam.ac.uk/Research/SRG/netos/xen/performance.html

(Notice also that the web server performance is less than 30% of native.) I daresay those figures are applicable to VMware GSX as well. So unless you are talking about VMware ESX (which I seriously doubt), don't expect VMware stuff to run at near full speed, especially for complex stuff (i.e. not SPECINT), until _maybe_ sometime next year on next year's hardware - Intel's Vanderpool or AMD's Pacifica. In my own experience postgresql runs much faster native than in vmware.

Seriously though, if you are doing development, vmware is fine. But when it's time to test performance, and it's slow, you have to try it on real hardware to confirm whether it really is slow or not. If it's fast enough on vmware, then you're lucky :). Basically, if you are running something in VMware Workstation/GSX, you should be thankful if it runs fast enough. For VMware ESX, apparently stuff like I/O is significantly faster - but I haven't had a chance to test it :).

Even if postgresql somehow performs rather more poorly than other DBs on virtual machines, I doubt the developers should invest much time in optimizing postgresql for current imperfect virtual machines, especially with virtualization-friendly x86 hardware on the horizon.

Link.
On Wed, 2005-09-07 at 01:50, Jürgen Rose wrote:
> Scott Marlowe wrote:
> >
> > A couple of points:
> >
> > 1: You wouldn't buy the QE II (a big luxury liner) and complain that it
> > doesn't work well for water skiing and is too complex. It's the QE II.
>
> I don't get that argument.

The real point is that database performance isn't very interesting from the point of view of a single user. I've seen plenty of database-driven apps that were quite snappy for a single user bog down and fall over dead when a dozen or a hundred users started accessing them. PostgreSQL's strength is in how it handles dozens, hundreds, and sometimes thousands of users. It's not a dog in the single-user situation either, mind you, but that's not a development priority the way multi-user performance is. Right alongside that is the performance of things like schema changes while in use and under load. PostgreSQL also excels at that.

> > 2: You've given us absolutely nothing we can go on to help you make
> > postgresql work better for you. Nothing. Just one explain analyze
> > output.
>
> It is a bit complex, and I have now added two explains from the main
> queries to this mail.

Thanks, however an explain ANALYZE would tell us much more. explain tells us what the query planner thinks is gonna happen; explain analyze tells us both what the query planner expects and what really happens. Far more useful for troubleshooting.

> > 3: If you're running one OS on top of another, and then a database on
> > top of that, and all you've got is 512 meg of RAM, don't expect stellar
> > performance, especially from a database that uses shared memory like
> > postgresql does.
>
> I don't expect mega performance, this is just a devel system. I mean, do
> you develop your database on the production system?

Well, I develop on a system as close as possible to a production system, within reason. We deploy to production on linux, so I develop on linux. My workstation hardware is actually pretty good, considering it's only got a single IDE hard drive.

> > 4: Don't compare PostgreSQL to those other databases unless you're
> > going to give it a chance. So far, you haven't done so, you've only
> > complained.
>
> I have used it for over half a year now, and as I said in one of my
> previous mails, it is quite powerful and flexible, but I'm not impressed
> with the performance and it has its quirks.

But you've used it without any tuning of any kind. That means you haven't really had a chance to see it at its best yet.

> > 5: I have tested a properly tuned PostgreSQL server that ran on about
> > 1/2 the hardware (CPU speed, memory, RAID array) of an MSSQL server and
> > easily outran it. But I took my time, read the docs, and tuned the
> > server OS and PostgreSQL.
>
> As I said before, I don't want to tune my system (my devel system!); in
> my opinion it has to run with OK performance out of the box.

Wait, you're not willing to invest 2 minutes to make one or two minor changes that might double or even treble the speed of your development machine? I can't help you then. Not on that. No one's going to do what it takes to make your development machine run postgresql faster out of the box if it means postgresql can't run out of the box on half the machines it now runs on. There has been some work on auto-tuning it. But since you're running under an emulator, I can't imagine performance is all that important. Are you running Windows with linux on top in vmware, then postgresql?
If that's the only reason you're running vmware, then get the native Windows port and run it; it should be MUCH faster than the emulated one you're running now.

> > 6: Databases may appear simple; they are not, and the more complex they
> > are, the more you'll have to do to make full use of them.
> >
> > So, have you been running vacuum and analyze, do you have the right
> > indexes, are you using queries that can use those indexes, have you
> > turned up sort_mem and a few other easily tweakable settings?
> >
> > PostgreSQL's use of shared memory, combined with the VERY conservative
> > shared-memory settings of many older operating systems, combined
> > further with the need for PostgreSQL to run on dang near anything,
> > means that, out of the box, it's often not as fast as some other
> > servers.
>
> I can set the sort_mem to whatever I want, but postgres doesn't care. It
> just consumes my CPU time but no memory.

Again, have you been running vacuum and analyze regularly? Do you know the difference between a regular and a full vacuum, and which you may need? (Right now a full vacuum may help a bit; if you haven't been vacuuming up til now, then schedule regular vacuums.) sort_mem is just one setting, and unless the database is in the act of sorting, it won't use that memory. Have you read the tuning guide on the varlena site?

> I'm sorry, but it is a bit complicated to explain the whole structure.
>
> And my main point is that with each row I insert it becomes slower. I
> will try to reproduce that behaviour, but I don't have much time for it
> right now.

That sounds like index and/or table bloat. Are you doing any updates or deletes on this table as well? Dead tuples will slow you down over time if they aren't reclaimed.

Could you repost both the query and the explain ANALYZE output for your queries? Thanks. Also, feel free to post your postgresql.conf file.

Note that you're not likely to ever get performance equivalent to "peppy" or something like that on a laptop running under vmware.

Looking at your current query plans, it would seem that increasing both shared memory and sort memory would help, as increasing shared memory might let the planner choose a hash join method that fits in memory and is much faster than a merge or loop join. The sort memory would help with all those sorts, of course.
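To make the suggestion concrete, a hedged sketch of the postgresql.conf entries being referred to; the numbers are only plausible starting points for a machine with around 512 MB of RAM, not tested recommendations, and the names are the 7.4/8.0-era ones:

shared_buffers = 8192          # 8 kB pages, i.e. roughly 64 MB of shared buffer cache
sort_mem = 8192                # kB per sort operation (renamed work_mem in 8.0)
effective_cache_size = 32768   # 8 kB pages the OS is expected to cache (~256 MB)

After changing shared_buffers the server has to be restarted (and the kernel's SHMMAX limit may need to be raised); sort_mem can also be tried per session with SET before committing it to the config file. Regular VACUUM ANALYZE keeps the planner statistics current and reclaims the dead tuples mentioned above.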