Thread: Re: Large databases, performance
On 3 Oct 2002 at 11:57, Robert Treat wrote:

> NOTE: Setting follow up to the performance list
>
> Funny that the status quo seems to be if you need fast selects on data
> that has few inserts to pick mysql, otherwise if you have a lot of
> inserts and don't need super fast selects go with PostgreSQL; yet your
> data seems to cut directly against this.

Well, a couple of things. The number of inserts isn't few: the requirement in the field is 5000/sec. Secondly, PostgreSQL seems to be doing pretty well on parallel selects. If we use MySQL with transaction support, the numbers are really close.

Maybe it's time to rewrite the famous myth that PostgreSQL is slow. When properly tuned, or given enough headroom, it's almost as fast as MySQL.

> I'm curious, did you happen to run the select tests while also running
> the insert tests? IIRC the older mysql versions have to lock the table
> when doing the insert, so select performance goes in the dumper in that
> scenario, perhaps that's not an issue with 3.23.52?

IMO even if it locks tables, that shouldn't affect select performance. It would be fun to watch what happens when we insert multiple chunks of data and fire queries concurrently. I would be surprised if MySQL starts slowing down.

> It also seems like the vacuum after each insert is unnecessary, unless
> your also deleting/updating data behind it. Perhaps just running an
> ANALYZE on the table would suffice while reducing overhead.

I believe that was VACUUM ANALYZE only. But it still takes a lot of time. The good thing is that it's not blocking. Anyway, I don't think such frequent vacuums are needed to convince the planner to choose an index scan over a sequential scan. I am sure it's already convinced.

Regards,
 Shridhar

-----------------------------------------------------------
Shridhar Daithankar
LIMS CPE Team Member, PSPL.
mailto:shridhar_daithankar@persistent.co.in
Phone:- +91-20-5678900 Extn.270
Fax :- +91-20-5678901
-----------------------------------------------------------
On Thu, 2002-10-03 at 12:17, Shridhar Daithankar wrote:
> May be it's time to rewrite famous myth that postgresql is slow. When properly
> tuned or given enough head room, it's almost as fast as mysql..

That myth has been disproven long ago; it just takes a while for everyone to catch on ;-)

> > I'm curious, did you happen to run the select tests while also running
> > the insert tests? IIRC the older mysql versions have to lock the table
> > when doing the insert, so select performance goes in the dumper in that
> > scenario, perhaps that's not an issue with 3.23.52?
>
> IMO even if it locks tables that shouldn't affect select performance. It would
> be fun to watch when we insert multiple chunks of data and fire queries
> concurrently. I would be surprised if mysql starts slowing down..

Hmm... it's been a while since I dug into MySQL internals, but IIRC once the table was locked, you had to wait for the insert to complete so the table would be unlocked and the select could go through. (Maybe this is a myth that I need to get clued in on.)

> > It also seems like the vacuum after each insert is unnecessary, unless
> > your also deleting/updating data behind it. Perhaps just running an
> > ANALYZE on the table would suffice while reducing overhead.
>
> I believe that was vacuum analyze only. But still it takes lot of time. Good
> thing is it's not blocking..
>
> Anyway I don't think such frequent vacuums are going to convince planner to
> choose index scan over sequential scan. I am sure it's already convinced..

My thinking was that if you're just doing inserts, you need to update the statistics but don't need to check on unused tuples.

Robert Treat
On 3 Oct 2002 at 12:26, Robert Treat wrote:
> > May be it's time to rewrite famous myth that postgresql is slow.
>
> That myth has been dis-proven long ago, it just takes awhile for
> everyone to catch on ;-)

:-)

> Hmm... been awhile since I dug into mysql internals, but IIRC once the
> table was locked, you had to wait for the insert to complete so the
> table would be unlocked and the select could go through. (maybe this is
> a myth that I need to get clued in on)

If that turns out to be true, I guess MySQL will nose-dive out the window. Maybe it's time to run a test that's nearer to real-world expectations, especially in terms of concurrency.

I don't think that will be an issue for MySQL with transaction support. The vanilla one might suffer, but not the other one. At least theoretically.

> My thinking was that if your just doing inserts, you need to update the
> statistics but don't need to check on unused tuples.

Is there any other way of doing that besides VACUUM ANALYZE? I thought that was the only way.

Bye
 Shridhar

--
"Even more amazing was the realization that God has Internet access. I
wonder if He has a full newsfeed?" (By Matt Welsh)
Shridhar Daithankar wrote:

> On 3 Oct 2002 at 11:57, Robert Treat wrote:
>
>> Funny that the status quo seems to be if you need fast selects on data
>> that has few inserts to pick mysql, otherwise if you have a lot of
>> inserts and don't need super fast selects go with PostgreSQL; yet your
>> data seems to cut directly against this.
>
> Well, couple of things..
>
> The number of inserts aren't few. it's 5000/sec. required in the field. Secondly
> I don't know really but postgresql seems doing pretty fine in parallel selects.
> If we use mysql with transaction support then numbers are really close..
>
> May be it's time to rewrite famous myth that postgresql is slow. When properly
> tuned or given enough head room, it's almost as fast as mysql..

In the case of concurrent transactions MySQL does not do as well, due to very bad locking behaviour. PostgreSQL is far better because it does row-level locking instead of table locking. If you have many concurrent transactions, MySQL performs some sort of "self denial of service". I'd choose PostgreSQL in order to make sure that the database does not block.

>> I'm curious, did you happen to run the select tests while also running
>> the insert tests? IIRC the older mysql versions have to lock the table
>> when doing the insert, so select performance goes in the dumper in that
>> scenario, perhaps that's not an issue with 3.23.52?
>
> IMO even if it locks tables that shouldn't affect select performance. It would
> be fun to watch when we insert multiple chunks of data and fire queries
> concurrently. I would be surprised if mysql starts slowing down..

In the case of concurrent SELECTs and INSERT/UPDATE/DELETE operations, MySQL will slow down for sure. The more concurrent transactions you have, the worse MySQL will do.

>> It also seems like the vacuum after each insert is unnecessary, unless
>> your also deleting/updating data behind it. Perhaps just running an
>> ANALYZE on the table would suffice while reducing overhead.
>
> I believe that was vacuum analyze only. But still it takes lot of time. Good
> thing is it's not blocking..
>
> Anyway I don't think such frequent vacuums are going to convince planner to
> choose index scan over sequential scan. I am sure it's already convinced..

PostgreSQL allows you to improve execution plans by giving the planner a hint. In addition to that: if you need REAL performance and you are running similar queries repeatedly, consider using SPI. Also, 7.3 will support PREPARE/EXECUTE.

If you are running MySQL, you will not be able to add features to the database easily. In the case of PostgreSQL you have a broad range of simple interfaces which make many things pretty easy (e.g. optimized data types in fewer than 50 lines of C code). PostgreSQL is the database of the future, and you can perform a lot of tuning. MySQL is a simple frontend to a filesystem, and it is fast as long as you are doing SELECT 1+1 operations.

Also keep in mind that PostgreSQL has a wonderful core team. MySQL is built on Monty Widenius, and the core team = Monty. Also: PostgreSQL = ANSI compliant, MySQL = Monty compliant.

In the past few years I have seen no database system that can beat PostgreSQL's flexibility and stability. I am familiar with various database systems, but believe me: PostgreSQL is the best choice.

Hans

--
*Cybertec Geschwinde u Schoenig*
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/1/913 68 09; +43/664/233 90 75
www.postgresql.at, cluster.postgresql.at, www.cybertec.at, kernel.cybertec.at
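[As a sketch of the PREPARE/EXECUTE feature Hans mentions for 7.3: the statement is parsed and planned once, then re-executed cheaply. The table and statement names here are invented for illustration, not from the thread.]

```sql
-- Parse and plan the query once per session...
PREPARE log_lookup (integer) AS
    SELECT * FROM access_log WHERE user_id = $1;

-- ...then run the stored plan repeatedly, skipping parse/plan overhead.
EXECUTE log_lookup (42);
EXECUTE log_lookup (99);
```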
On Thu, 03 Oct 2002 21:47:03 +0530, "Shridhar Daithankar"
<shridhar_daithankar@persistent.co.in> wrote:
> I believe that was vacuum analyze only.

Well, there is

    VACUUM [tablename];

and there is

    ANALYZE [tablename];

And

    VACUUM ANALYZE [tablename];

is VACUUM followed by ANALYZE.

Servus
 Manfred
On Thu, 2002-10-03 at 11:17, Shridhar Daithankar wrote:
> On 3 Oct 2002 at 11:57, Robert Treat wrote:
[snip]
> > I'm curious, did you happen to run the select tests while also running
> > the insert tests? IIRC the older mysql versions have to lock the table
> > when doing the insert, so select performance goes in the dumper in that
> > scenario, perhaps that's not an issue with 3.23.52?
>
> IMO even if it locks tables that shouldn't affect select performance. It would
> be fun to watch when we insert multiple chunks of data and fire queries
> concurrently. I would be surprised if mysql starts slowing down..

What kind of lock, a shared lock or an exclusive lock? If SELECT performance tanked when doing simultaneous INSERTs, then maybe there were exclusive table locks.

--
+------------------------------------------------------------+
| Ron Johnson, Jr.        mailto:ron.l.johnson@cox.net       |
| Jefferson, LA USA       http://members.cox.net/ron.l.johnson |
|                                                            |
| "What other evidence do you have that they are terrorists, |
| other than that they trained in these camps?"              |
| 17-Sep-2002 Katie Couric to an FBI agent regarding the 5   |
| men arrested near Buffalo NY                               |
+------------------------------------------------------------+
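[To make the shared-vs-exclusive distinction concrete, here is a toy sketch in Python, not MySQL's actual lock manager: with a readers-writer lock, any number of SELECTs can read at once, but a writer (an INSERT under table locking) excludes everyone.]

```python
class ReadWriteLock:
    """Toy readers-writer lock illustrating shared vs. exclusive modes."""

    def __init__(self):
        self.readers = 0      # number of "SELECTs" currently reading
        self.writer = False   # True while an "INSERT" holds the lock

    def try_acquire_read(self):
        if self.writer:       # an exclusive holder blocks all readers
            return False
        self.readers += 1
        return True

    def release_read(self):
        self.readers -= 1

    def try_acquire_write(self):
        if self.writer or self.readers:  # writer needs the table to itself
            return False
        self.writer = True
        return True

lock = ReadWriteLock()
assert lock.try_acquire_read()       # two concurrent SELECTs: fine
assert lock.try_acquire_read()
assert not lock.try_acquire_write()  # the INSERT must wait (exclusive)
lock.release_read()
lock.release_read()
assert lock.try_acquire_write()      # now the INSERT can proceed...
assert not lock.try_acquire_read()   # ...and SELECTs queue up behind it
```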
On Thu, 2002-10-03 at 11:51, Hans-Jürgen Schönig wrote:
> Shridhar Daithankar wrote:
> > On 3 Oct 2002 at 11:57, Robert Treat wrote:
[snip]
> PostgreSQL allows you to improve execution plans by giving the planner a
> hint.
> In addition to that: if you need REAL performance and if you are running
> similar queries consider using SPI.

What is SPI?

--
Ron Johnson, Jr.
May I suggest that [PERF] be used instead of [pgsql-performance], to save some room on the subject line.

Ron Johnson wrote:
> On Thu, 2002-10-03 at 11:51, Hans-Jürgen Schönig wrote:
> > PostgreSQL allows you to improve execution plans by giving the planner a
> > hint.
> > In addition to that: if you need REAL performance and if you are running
> > similar queries consider using SPI.
>
> What is SPI?
On Thu, Oct 03, 2002 at 06:51:05PM +0200, Hans-Jürgen Schönig wrote:
> In the case of concurrent transactions MySQL does not do as well due to
> very bad locking behavious. PostgreSQL is far better because it does row
> level locking instead of table locking.

It is my understanding that MySQL no longer does this on InnoDB tables. Whether various bag-on-the-side table types are a good thing I will leave to others; but there's no reason to go 'round making claims about old versions of MySQL, any more than there is a reason to continue to talk about PostgreSQL not being crash safe. MySQL has moved along nearly as quickly as PostgreSQL.

A

--
----
Andrew Sullivan                      204-4141 Yonge Street
Liberty RMS                          Toronto, Ontario Canada
<andrew@libertyrms.info>             M2P 2A8
                                     +1 416 646 3304 x110
On 3 Oct 2002 at 18:53, Manfred Koizar wrote:
> Well there is
>
>     VACUUM [tablename];
>
> and there is
>
>     ANALYZE [tablename];
>
> And
>
>     VACUUM ANALYZE [tablename];
>
> is VACUUM followed by ANALYZE.

I was using VACUUM ANALYZE. Good that you pointed that out. Now I will modify the PostgreSQL auto-vacuum daemon that I wrote to ANALYZE only in the case of excessive inserts. I hope that's lighter on performance than VACUUM ANALYZE.

Bye
 Shridhar

--
Mix's Law:
	There is nothing more permanent than a temporary building.
	There is nothing more permanent than a temporary tax.
On Thu, 3 Oct 2002, Hans-Jürgen Schönig wrote:
> In the case of concurrent transactions MySQL does not do as well due to
> very bad locking behavious. PostgreSQL is far better because it does row
> level locking instead of table locking.
> If you have many concurrent transactions MySQL performs some sort of
> "self-denial-of-service". I'd choose PostgreSQL in order to make sure
> that the database does not block.

While I'm no big fan of MySQL, I must point out that with InnoDB tables the locking is row level, and the ability to handle parallel read/write is much improved. Also, PostgreSQL does NOT use row-level locking; it uses MVCC, which is "better than row level locking", as Tom puts it.

Of course, hot backup is only 2,000 Euros for MySQL with InnoDB tables, while hot backup for PostgreSQL is free. :-)

That said, MySQL still doesn't handle parallel load nearly as well as PostgreSQL; it's just better than it once was.

> Also: Keep in mind that PostgreSQL has a wonderful core team. MySQL is
> built on Monty Widenius and the core team = Monty.
> Also: PostgreSQL = ANSI compilant, MySQL = Monty compliant

This is a very valid point. The "committee" that creates and steers PostgreSQL is very much a meritocracy. The "committee" that steers MySQL is Monty. I'm much happier knowing that every time something important needs to be done, we have a whole cupboard full of curmudgeons arguing the fine points so that the "right thing" gets done.
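[The MVCC behaviour mentioned above can be sketched roughly like this. It is a toy Python model, nothing like PostgreSQL's real tuple headers: each row version records which transaction created it, and a reader's snapshot is just the set of transactions committed when it started, so writers never block readers.]

```python
class Table:
    """Toy MVCC table: row versions carry the creating transaction id
    (xmin) and deleting transaction id (xmax, None if live)."""

    def __init__(self):
        self.rows = []  # list of [xmin, xmax, value]

    def insert(self, xid, value):
        self.rows.append([xid, None, value])

    def visible(self, snapshot):
        # A version is visible if its creator is in the snapshot (i.e.
        # committed before the reader started) and no transaction in the
        # snapshot has deleted it.
        return [value for xmin, xmax, value in self.rows
                if xmin in snapshot and (xmax is None or xmax not in snapshot)]

t = Table()
t.insert(xid=1, value="old row")
snap = {1}                        # reader starts; only xact 1 has committed
t.insert(xid=2, value="new row")  # concurrent insert by xact 2

assert t.visible(snap) == ["old row"]               # reader is unaffected
assert t.visible({1, 2}) == ["old row", "new row"]  # a later snapshot sees both
```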
Andrew Sullivan <andrew@libertyrms.info> wrote:
> On Thu, Oct 03, 2002 at 06:51:05PM +0200, Hans-Jürgen Schönig wrote:
> > In the case of concurrent transactions MySQL does not do as well due to
> > very bad locking behavious. PostgreSQL is far better because it does row
> > level locking instead of table locking.
>
> It is my understanding that MySQL no longer does this on InnoDB
> tables. Whether various bag-on-the-side table types are a good thing
> I will leave to others; but there's no reason to go 'round making
> claims about old versions of MySQL any more than there is a reason to
> continue to talk about PostgreSQL not being crash safe. MySQL has
> moved along nearly as quickly as PostgreSQL.

Locking and transactions are still not fine in MySQL (with InnoDB), though. I tried to do selects on a table I was concurrently inserting into. In a single thread I was constantly inserting 1000 rows per transaction, and while inserting I did some random selects on the same table. It often happened that the insert transactions were aborted due to deadlock problems. That is where I see the problem with locking reads. I like PostgreSQL's MVCC!

Regards,
Michael Paesold
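[When the database aborts a transaction as a deadlock victim, as in Michael's test, the application typically has to retry it. A hypothetical sketch in Python; the exception and function names are invented for illustration.]

```python
class DeadlockDetected(Exception):
    """Stand-in for the error a database raises on a deadlock victim."""

def run_with_retry(txn, max_attempts=5):
    """Re-run a transaction aborted due to deadlock. With lock-based
    reads this retry loop is a fact of life; under MVCC, plain SELECTs
    take no row locks and never join the deadlock cycle at all."""
    for _ in range(max_attempts):
        try:
            return txn()
        except DeadlockDetected:
            continue  # aborted as a deadlock victim: just try again
    raise RuntimeError("gave up after %d deadlock aborts" % max_attempts)

# Simulate an insert batch that is picked as a deadlock victim twice
# before finally committing on the third attempt.
calls = {"n": 0}
def flaky_insert():
    calls["n"] += 1
    if calls["n"] < 3:
        raise DeadlockDetected()
    return "committed 1000 rows"

assert run_with_retry(flaky_insert) == "committed 1000 rows"
assert calls["n"] == 3
```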