Thread: Why Not MySQL?
> Actually, I'm changing the link
>
> http://openacs.org/why-not-mysql.html

Moments after forwarding the link to Ben's piece on why MySQL sucks to this list, he e-mailed me the above note.

Sorry for any inconvenience...

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Service and other goodies at http://donb.photo.net.
> > > Actually, I'm changing the link
> > >
> > > http://openacs.org/why-not-mysql.html
> >
> > Moments after forwarding the link to Ben's piece on why
> > MySQL sucks to this list, he e-mailed me the above note.
> >
> > Sorry for any inconvenience...

I am adding this to the FAQ.

--
Bruce Momjian | http://www.op.net/~candle
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
A very, very good article. I love the comment about MySQL being a filesystem with an SQL interface :-)

However.. I'm faced with a huge dilemma.

We use PostgreSQL for a fairly large application I wrote. The database is still pretty small; it carries info on about 25-30,000 people and about 5,000 jobs. Recently we've had huge trouble with PostgreSQL -- it seems that every month I stump someone with the obscure things that happen to our data :-)

From corrupted indexes to corrupted system tables, it's almost always unrecoverable. Luckily I always have a backup to restore from and the world goes on... We've only recently started to notice that the backend is slowing down. It seems that with every additional applicant added it gets exponentially slower... So, sadly I have to go find another backend for this application -- a commercial one too so we can get "commercial support" (yuck)..

So, could you guys suggest some other backends I might look into? I know it's an odd place for me to ask, but the flat truth is that I think *I* am to blame for my Postgres troubles, and even taking all of the problems into account I think PG is the best damn free RDBMS out there. Its functionality is superior to everyone else's, its developers are no less than amazing, and well -- I trust you guys to give me some honest opinions.. The functionality I need is basically what PG has.. Transactions are a must, as well as some sort of sequence -- stability over performance, but performance is very important too. It also needs to run native on FreeBSD..

Oracle is out as we use FreeBSD and someone out there decided that they wouldn't support FreeBSD (in the license as well as in the code!)..

Thanks guys, especially to all who tried to help in private (Don, Tom -- many others)..

-Mitch

----- Original Message -----
From: Bruce Momjian <pgman@candle.pha.pa.us>
To: Don Baccus <dhogaza@pacifier.com>
Cc: <pgsql-hackers@postgresql.org>
Sent: Tuesday, May 02, 2000 5:55 PM
Subject: Re: [HACKERS] Why Not MySQL?

> > > >Actually, I'm changing the link
> > > >
> > > >http://openacs.org/why-not-mysql.html
> >
> > Moments after forwarding the link to Ben's piece on why
> > MySQL sucks to this list, he e-mailed me the above note.
> >
> > Sorry for any inconvenience...
>
> I am adding this to the FAQ.
>
> --
> Bruce Momjian | http://www.op.net/~candle
> pgman@candle.pha.pa.us | (610) 853-3000
> + If your life is a hard drive, | 830 Blythe Avenue
> + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
> exponentially slower... So, sadly I have to go find another backend for this
> application -- a commercial one too so we can get "commercial support"
> (yuck)..

Actually you might want to give Interbase a try. Version 6 is now in beta and is going to be open sourced. It's been around a while and seems pretty solid. There will be a commercial entity to buy support contracts from, and the newsgroups are very helpful. As a bonus there is a relatively large installed user base already, and very, very nice client-side tools available. See http://www.interbase.com/ or http://www.interbase2000.com/ for further details. If I can't get my questions answered about case sensitivity issues here (no help so far) I will most likely use it myself.

> I need is basically what PG has.. Transactions are a must as well as some
> sort of sequence -- stability over performance but performance is very
> important too. It also needs to run native on FreeBSD..

IB supports transactions and sequences. It's very stable and pretty fast. There was a problem with shared memory in the early beta and earlier releases, but it's fixed now. I am pretty sure it runs on FreeBSD, but I am not sure if it runs natively or under Linux emulation.
At 06:28 PM 5/2/00 -0400, Mitch Vincent wrote:

>So, could you guys suggest some other backends I might look into? I know
>it's an odd place for me to ask but the flat truth is that I think *I* am to
>blame for my Postgres troubles and even taking all of the problems into
>account I think PG is the best damn free RDBMS out there. Its functionality
>is superior to everyone else's, its developers are no less than amazing and
>well -- I trust you guys to give me some honest opinions.. The functionality
>I need is basically what PG has.. Transactions are a must as well as some
>sort of sequence -- stability over performance but performance is very
>important too. It also needs to run native on FreeBSD..

First, have you been having the same problems with PG 7.0? I recall that you had it up on a test system but nothing more.

It's a pity that you've reached this point, because PG is so much better than it was 18 months ago (and before, of course; I mention that timeframe because that's roughly when I first investigated its suitability for the web toolkit project) and the trajectory is definitely in the right direction.

It's also a loss to the development effort, as people who hit bugs are in many ways more useful than people who have no problems (though of course having no bugs for users to stumble across is the best situation!) Still, I understand the need to solve your problems today, not tomorrow.

Interbase is a possible solution. They have a pretty good reputation, and their "super server" (threaded, with connections sharing a buffer cache) should scale well. My rough estimate is that they're at about the place PG will be when 7.1 comes out. I don't know if they support FreeBSD, though. Any reason you can't just put up a box with Linux?

There's an older version of Sybase available at no charge -- again only for Linux, though.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Service and other goodies at http://donb.photo.net.
On Tue, 2 May 2000, Tim Uckun wrote: > If I can't get my questions answered about case sensitivity issues here > (no help so far) I will most likely to use it myself. What questions? *raised eyebrow*
As Don asks, what happened with the v7.0 trials you were doing? Corrupted indices, I've seen occasionally in older versions, but I can't recall ever seeing corrupt system tables ... I don't have a GUI browser right, so searching the archives is kinda tough for me :( Can you refresh my memory for me? There has to be something logical to this, as to what the cause for the corruption is :( From Don's comment, I take it you are using FreeBSD? Version? Stability of the machine? Never crashes? Version of PostgreSQL? Compile/configure options? Do you have any core files in your data/base/* hierarchy that would be the result of a backend crashing? I know you are looking at alternatives, but I'm terrible at letting go of problems :( On Tue, 2 May 2000, Mitch Vincent wrote: > A very, very good article. I love the comment about MySQL being a filesystem > with an SQL interface :-) > > However.. I'm faced with a huge dilemma. > > We use PostgreSQL for a fairly large application I wrote, the database is > still pretty small, it carries info on about 25-30,000 people and about > 5,000 jobs. Recently we've had huge trouble with PostgreSQL -- it seems that > every month I stump someone with the obscure things that happen to our data > :-) > > >From corrupted indexes to corrupted system tables, it's almost always > unrecoverable. Luckily I always have a backup to restore from and the world > goes on... We've only recently started to notice that the backend is slowing > down. It seems that with every additional applicant added it get > exponentially slower... So, sadly I have to go find another backend for this > application -- a commercial one too so we can get "commercial support" > (yuck).. > > So, could you guys suggest some other backends I might look into? I know > it's an odd place for me to ask but the flat truth is that I think *I* am to > blame for my Postgres troubles and even taking all of the problems into > account I think PG is the best damn free RDBMS out there. It's functionality > is superior to everyone else's, it's developers are no less than amazing and > well -- I trust you guys to give me some honest opinions.. The functionality > I need is basically what PG has.. Transactions are a must as well as some > sort of sequence -- stability over performance but performance is very > important too. It also needs to run native on FreeBSD.. > > Oracle is out as we use FreeBSD and someone out there decided that they > wouldn't support FreeBSD (in the license as well as in the code!).. > > Thanks guys, especially to all who tried to help in private (Don, Tom -- > many others).. > > -Mitch > > ----- Original Message ----- > From: Bruce Momjian <pgman@candle.pha.pa.us> > To: Don Baccus <dhogaza@pacifier.com> > Cc: <pgsql-hackers@postgresql.org> > Sent: Tuesday, May 02, 2000 5:55 PM > Subject: Re: [HACKERS] Why Not MySQL? > > > > > > > > >Actually, I'm changing the link > > > > > > > >http://openacs.org/why-not-mysql.html > > > > > > Moments after forwarding the link to Ben's piece on why > > > MySQL sucks to this list, he e-mailed me the above note. > > > > > > Sorry for any inconvenience... > > > > I am adding this to the FAQ. > > > > -- > > Bruce Momjian | http://www.op.net/~candle > > pgman@candle.pha.pa.us | (610) 853-3000 > > + If your life is a hard drive, | 830 Blythe Avenue > > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 > > > Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
> > If I can't get my questions answered about case sensitivity issues here
> > (no help so far) I will most likely use it myself.

My recollection is that it involved needing non-standard case-insensitive LIKE comparisons to get transparent behavior with an existing M$ Access app. So far, we were too polite to ask why one is working so hard to maintain compatibility with a non-standard interface, rather than writing the app to be portable. But I'll ask now. Tim?

- Thomas

btw, it seems to be the case that problems such as these, which might be interesting during slow times (from a theoretical standpoint at least), are decidedly less so during the final stages of a release cycle.

--
Thomas Lockhart  lockhart@alumni.caltech.edu  South Pasadena, California
The Hermit Hacker <scrappy@hub.org> wrote:
>
>On Tue, 2 May 2000, Tim Uckun wrote:
>
>> If I can't get my questions answered about case sensitivity issues here
>> (no help so far) I will most likely use it myself.
>
>What questions? *raised eyebrow*

The question dealt with trying to achieve case-insensitive collation in postgres -- specifically, about possibly rewriting the text_cmp or the varchareq functions. I am wondering what kind of mayhem that might cause. I was told by someone who tried it that it's possible to overload the ~~ (like) operator, but that dropping the = operator cripples the database, so it's not possible to overload the = operator.

As to why I might want to do this, I'll answer that one in response to another question :)

----------
Message To Spammers -- Game Over! Get spam-free email at http://www.MsgTo.com
Thomas Lockhart <lockhart@alumni.caltech.edu> wrote:

>existing M$ Access app. So far, we were too polite to ask why one is
>working so hard to maintain compatibility with a non-standard
>interface, rather than writing the app to be portable. But I'll ask
>now. Tim?

Fair enough question. I agree with you that this is non-standard, typical MS lock-in crap. But I have an application that is written in Access and has outgrown the data engine in Access (which is pretty pathetic). Unfortunately this application is very large, with over 300 tables and over 1400 saved queries (views). The MS solution to this problem is to upgrade to MS-SQL server (vendor lock-in), which processes the queries in the exact same case-insensitive manner. SQL server does not break my application. I, on the other hand, want to avoid upsizing to SQL server. I could use Sybase, which also allows for case-insensitive collation (no surprise there), but I really, really want to use an open source database server.

So right now I have a few choices.

1) Buckle into the vendor lock and be stuck with NT and SQL server.
2) Buy Sybase and spend way more than I want to.
3) Completely rewrite all 1400 queries and write all kinds of new code to make sure the SQL emitted by Access gets intercepted and translated properly.
4) Make Postgres process queries case insensitively.

Well, the third one is out of the question really; I don't have that kind of time or money. It would take me the rest of the year to accomplish that goal, and the database would have to be taken out of commission in the meantime.

>
>btw, it seems to be the case that problems such as these, which might
>be interesting during slow times (from a theoretical standpoint at
>least), are decidedly less so during the final stages of a release
>cycle.

I fully understand that you guys have your own set of priorities. I also appreciate the work you guys have put into making postgres into a database I want to use. Having said all that, I did wait 4 to 5 days without a reply of any sort. It would have been perfectly fine for somebody to say "It's not possible, don't waste your time", "Don't ask this question here", "We are really entirely too busy to deal with this", or even "Go away and don't ever bother us ever again".

----------
Message To Spammers -- Game Over! Get spam-free email at http://www.MsgTo.com
At , Malcontent null wrote:

>The question dealt with trying to achieve case insensitive collation in postgres. Specifically about possibly rewriting the text_cmp or the varchareq functions. I am wondering what kind of mayhem that might cause. I was told by someone who tried it that it's possible to overload the ~~ (like) operator but that dropping the = operator cripples the database so that it's not possible to overload the = operator.
>
>As to why I might want to do this I'll answer that one in response to another question :)

Actually, I'd suggest you state the problem you're trying to solve first, rather than present the reasons you think PG can't handle the problem without showing your hand. Saying "how can I implement the following solution to an unstated problem" rather than simply stating the problem seems ... well, impolite at first glance.

I'm assuming that you actually are seeking a solution, rather than starting some sort of crusade in favor of a pet peeve?

Don't misunderstand ... you may be right, there may be no way to solve your problem efficiently in PG. But until you tell us what you're trying to do, we have no way to decide whether or not you've actually exhausted the possibilities.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Service and other goodies at http://donb.photo.net.
> I fully understand that you guys have your own set of priorities. I also > appreciate the work you guys have put into making postgres into a database > I want to use. Having said all that I did wait 4 to 5 days without a reply > of any sort. It would have been perfectly fine for somebody to say "It's not > possible don't waste your time", "Don't ask this question here", "we are > really entirely too busy to deal with this" or even "go away and don't ever > bother us ever again". Well, none of those things are true, and it is rare that someone would speak for a group this widely distributed to say "we are too busy". In most cases, when the usual suspects are too busy someone else will post an answer to a question, and you are never likely to get a definitive "I'm too busy and everyone else is too". At some point, someone may have time to answer *exactly* the questions you asked. Another strategy to try after the first one failed is to come in with the more detailed problem statement, asking for suggestions on a solution. Particularly if you can phrase it so it is clear that it may solve problems for a larger class of user than the one who managed to grow a M$ Access app to 300 tables and 1400 queries before deciding that Access might be a little light in performance to be suitable. But that's water under the bridge, eh? Anyway, so the larger class of problem is for the Sybase/M$ user who relies on case insensitive queries (which *are* available in Postgres) which are indistinguishable from the SQL92-mandated case-sensitive ones. So we might explore the possibilities for a contrib/ module which does this, though because it touches on replacing existing backend code it may not quite fly since there are some function lookup optimizations which may keep you from overwriting the existing routines. But it would be a neat capability to have; I wonder if it would work right away or if we could tweak the backend to allow this in the future?? Of course the alternative is to just dive in and hack and slash at the backend code. Look in parser/gram.y and utils/adt/like.c for starters... - Thomas -- Thomas Lockhart lockhart@alumni.caltech.edu South Pasadena, California
Mitch Vincent wrote:
>
> A very, very good article. I love the comment about MySQL being a filesystem
> with an SQL interface :-)
>
> However.. I'm faced with a huge dilemma.
>
> We use PostgreSQL for a fairly large application I wrote, the database is
> still pretty small, it carries info on about 25-30,000 people and about
> 5,000 jobs. Recently we've had huge trouble with PostgreSQL -- it seems that
> every month I stump someone with the obscure things that happen to our data
> :-)

What version are you using ?

> From corrupted indexes to corrupted system tables, it's almost always
> unrecoverable. Luckily I always have a backup to restore from and the world
> goes on... We've only recently started to notice that the backend is slowing
> down. It seems that with every additional applicant added it gets
> exponentially slower... So, sadly I have to go find another backend for this
> application -- a commercial one too so we can get "commercial support"
> (yuck)..

Could you be a little more specific on your performance issues ?

The usual way to deal with them is tuning your db structure and/or queries, or setting backend options to use more memory for stuff, or other such things.

If there is something wrong with the structure or queries, then a database switch will help you very little, unless your front-end tool has some special support for _some_ databases and not for others.

> So, could you guys suggest some other backends I might look into?

The usual - Oracle, Interbase, Informix, DB2, Sybase, Solid

The website is usually obtained by putting www in front and com at the end ;)

And let us know of your results.

> I know
> it's an odd place for me to ask but the flat truth is that I think *I* am to
> blame for my Postgres troubles and even taking all of the problems into
> account I think PG is the best damn free RDBMS out there. Its functionality
> is superior to everyone else's, its developers are no less than amazing and
> well -- I trust you guys to give me some honest opinions.. The functionality
> I need is basically what PG has.. Transactions are a must as well as some
> sort of sequence -- stability over performance but performance is very
> important too. It also needs to run native on FreeBSD..
>
> Oracle is out as we use FreeBSD and someone out there decided that they
> wouldn't support FreeBSD (in the license as well as in the code!)..

Is FreeBSD a religious issue there or can it be negotiated ?

-------------
Hannu
Don Baccus <dhogaza@pacifier.com> wrote:
>
>Actually, I'd suggest you state the problem you're trying to solve first,
>rather than present the reasons you think PG can't handle the problem
>without showing your hand. Saying, "how can I implement the following
>solution to an unstated problem" rather than simply stating the problem
>seems ... well, impolite at first glance.

I admit that I may not be the clearest thinking individual on this planet, but I thought I stated my problem in the original post to the best of my ability. I certainly wasn't trying to be rude. Here is a snippet from my original post.

"In a nutshell I want to use postgres as a back end to an access database. This means that all collation done by postgres musht be case insensitive including like clauses."

I left all the spelling mistakes in place :)

----------
Message To Spammers -- Game Over! Get spam-free email at http://www.MsgTo.com
Thomas Lockhart <lockhart@alumni.caltech.edu> wrote:
>
>clear that it may solve problems for a larger class of user than the
>one who managed to grow a M$ Access app to 300 tables and 1400 queries
>before deciding that Access might be a little light in performance to
>be suitable. But that's water under the bridge, eh?

Actually I did post twice; I had hoped that I was being more clear the second time. As for growing the Access database, well, sometimes apps take on a life of their own. Database apps in general tend to be too critical to business to just scrap and rewrite, so they just keep growing.

>Anyway, so the larger class of problem is for the Sybase/M$ user who
>relies on case insensitive queries (which *are* available in Postgres)

If I may.

MS Access, for all of its damnable faults, is the single most popular database in the world. There are a whole slew of people who do nothing except Access programming and make very good money at it. Postgres is a great candidate as a possible back end database engine for Access. This is a big possible application for postgres. To be usable for this purpose, however, it needs a few things.

1) Longer object names (I guess this is possible via a DEFINE)
2) Case insensitive queries.
3) Outer joins (coming soon!).
4) Maybe ADO drivers for the VB users of the world.

I don't know how important Access integration is to the postgres community as a whole though.

>Of course the alternative is to just dive in and hack and slash at the
>backend code. Look in parser/gram.y and utils/adt/like.c for
>starters...

Thanks for the tip, I'll start looking at this right away.

----------
Message To Spammers -- Game Over! Get spam-free email at http://www.MsgTo.com
On Tue, 2 May 2000, The Hermit Hacker wrote: > > As Don asks, what happened with the v7.0 trials you were doing? Corrupted > indices, I've seen occasionally in older versions, but I can't recall ever > seeing corrupt system tables ... > > I don't have a GUI browser right, so searching the archives is kinda tough > for me :( Can you refresh my memory for me? There has to be something > logical to this, as to what the cause for the corruption is :( > > >From Don's comment, I take it you are using FreeBSD? Version? Stability > of the machine? Never crashes? > > Version of PostgreSQL? Compile/configure options? Do you have any core > files in your data/base/* hierarchy that would be the result of a backend > crashing? > > I know you are looking at alternatives, but I'm terrible at letting go of > problems :( His description of table corruption and the system running slower and slower sounds like a disk going bad. I've seen it hundreds of times on news machines. Constant retries while trying to write to the disk will give slowdowns. Having data on a spot of the disk that's unreliable will certainly cause data integrity problems. Mitch, have you thoroughly checked the hardware? Vince. > > > On Tue, 2 May 2000, Mitch Vincent wrote: > > > A very, very good article. I love the comment about MySQL being a filesystem > > with an SQL interface :-) > > > > However.. I'm faced with a huge dilemma. > > > > We use PostgreSQL for a fairly large application I wrote, the database is > > still pretty small, it carries info on about 25-30,000 people and about > > 5,000 jobs. Recently we've had huge trouble with PostgreSQL -- it seems that > > every month I stump someone with the obscure things that happen to our data > > :-) > > > > >From corrupted indexes to corrupted system tables, it's almost always > > unrecoverable. Luckily I always have a backup to restore from and the world > > goes on... We've only recently started to notice that the backend is slowing > > down. It seems that with every additional applicant added it get > > exponentially slower... So, sadly I have to go find another backend for this > > application -- a commercial one too so we can get "commercial support" > > (yuck).. > > > > So, could you guys suggest some other backends I might look into? I know > > it's an odd place for me to ask but the flat truth is that I think *I* am to > > blame for my Postgres troubles and even taking all of the problems into > > account I think PG is the best damn free RDBMS out there. It's functionality > > is superior to everyone else's, it's developers are no less than amazing and > > well -- I trust you guys to give me some honest opinions.. The functionality > > I need is basically what PG has.. Transactions are a must as well as some > > sort of sequence -- stability over performance but performance is very > > important too. It also needs to run native on FreeBSD.. > > > > Oracle is out as we use FreeBSD and someone out there decided that they > > wouldn't support FreeBSD (in the license as well as in the code!).. > > > > Thanks guys, especially to all who tried to help in private (Don, Tom -- > > many others).. > > > > -Mitch > > > > ----- Original Message ----- > > From: Bruce Momjian <pgman@candle.pha.pa.us> > > To: Don Baccus <dhogaza@pacifier.com> > > Cc: <pgsql-hackers@postgresql.org> > > Sent: Tuesday, May 02, 2000 5:55 PM > > Subject: Re: [HACKERS] Why Not MySQL? 
> > > > > >Actually, I'm changing the link
> > > > > >
> > > > > >http://openacs.org/why-not-mysql.html
> > > >
> > > > Moments after forwarding the link to Ben's piece on why
> > > > MySQL sucks to this list, he e-mailed me the above note.
> > > >
> > > > Sorry for any inconvenience...
> > >
> > > I am adding this to the FAQ.
> > >
> > > --
> > > Bruce Momjian | http://www.op.net/~candle
> > > pgman@candle.pha.pa.us | (610) 853-3000
> > > + If your life is a hard drive, | 830 Blythe Avenue
> > > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
>
> Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
> Systems Administrator @ hub.org
> primary: scrappy@hub.org   secondary: scrappy@{freebsd|postgresql}.org

--
==========================================================================
Vince Vielhaber -- KA8CSH   email: vev@michvhf.com   http://www.pop4.net
128K ISDN from $22.00/mo - 56K Dialup from $16.00/mo at Pop4 Networking
Online Campground Directory   http://www.camping-usa.com
Online Giftshop Superstore    http://www.cloudninegifts.com
==========================================================================
> >existing M$ Access app. So far, we were too polite to ask why one is > >working so hard to maintain compatibility with a non-standard > >interface, rather than writing the app to be portable. But I'll ask > >now. Tim? > > Fair enough question. I agree with you that this is non > standard typical MS lock in crap. But I have an application > that is written in access and has outgrown the data engine in > access (which is pretty pathetic). Unfortunately this > application is very large with over 300 tables and over 1400 > saved queries (views). The MS solution to this problem is to > upgrade to MS-SQL server (vendor lock in) which processes the > queries in the exact same case insensitive manner. SQL server > does not break my application. I on the other hand want to > avoid upsizing to SQL server. Not to turn you away from PostgreSQL, but you might want to look at MSDE (Microsoft Data Engine) as an easier step. It has the same query processor as SQL Server, but scales much better (and includes transaction logs etc). See for example http://msdn.microsoft.com/library/backgrnd/html/msdeforvs.htm. The license for MSDE is also included in Office 2000 Pro/Premium, so chances are you may already have the required licenses. Still leaves you in the Microsoft box, though. //Magnus
> If I may. > MS Access for all of it's damnable faults is the single most > popular database in the world. There are a whole slew of > people who do nothing except access programming and make very > good money at it. Postgres is a great candidate as a possible > back end database engine for access. This is a big possible > application for postgres. To be usable for this purpose > however it needs a few things. > 1) Longer object names (I guess this is possible via a DEFINE) > 2) Case insensitive queries. > 3) Outer joins (coming soon!). > 4) Maybe ADO drivers for the VB users of the world. Should be no major need for a separate ADO driver for Pg. You can use ADO going through the ODBC driver, which already exists. //Magnus
Malcontent null wrote: > > >Anyway, so the larger class of problem is for the Sybase/M$ user who > >relies on case insensitive queries (which *are* available in Postgres) Maybe the right place to introduce case-insensitiveness would be in ODBC driver then ? > If I may. > MS Access for all of it's damnable faults is the single most popular > database in the world. There are a whole slew of people who do nothing > except access programming and make very good money at it. Postgres is > a great candidate as a possible back end database engine for access. > This is a big possible application for postgres. To be usable for this > purpose however it needs a few things. > 1) Longer object names (I guess this is possible via a DEFINE) How long should they be ? > 2) Case insensitive queries. Probably only the Access subset ("like", "order by", maybe even "=" ?) > 3) Outer joins (coming soon!). > 4) Maybe ADO drivers for the VB users of the world. AFAIK MS moves fast and ADO will be soon (or is already) officially obsolete. The technology du jour is XML. > I don't know how important access integration is to the postgres > community as a whole though. Probably not a top priority. Oracle is much more often seen as the target. --------------------- Hannu
On Wed, 3 May 2000, Hannu Krosing wrote: > Malcontent null wrote: > > > > >Anyway, so the larger class of problem is for the Sybase/M$ user who > > >relies on case insensitive queries (which *are* available in Postgres) > > Maybe the right place to introduce case-insensitiveness would be in ODBC > driver then ? > > > If I may. > > MS Access for all of it's damnable faults is the single most popular > > database in the world. There are a whole slew of people who do nothing > > except access programming and make very good money at it. Postgres is > > a great candidate as a possible back end database engine for access. > > This is a big possible application for postgres. To be usable for this > > purpose however it needs a few things. > > 1) Longer object names (I guess this is possible via a DEFINE) > > How long should they be ? > > > 2) Case insensitive queries. > > Probably only the Access subset ("like", "order by", maybe even "=" ?) don't we have a 'lower()' function? SELECT * FROM <table> WHERE field ~* 'this string' ORDER BY lower(field);? or SELECT * FROM <table> WHERE lower(field) = lower('StriNg');
On Wed, 3 May 2000, Vince Vielhaber wrote: > On Tue, 2 May 2000, The Hermit Hacker wrote: > > > > > As Don asks, what happened with the v7.0 trials you were doing? Corrupted > > indices, I've seen occasionally in older versions, but I can't recall ever > > seeing corrupt system tables ... > > > > I don't have a GUI browser right, so searching the archives is kinda tough > > for me :( Can you refresh my memory for me? There has to be something > > logical to this, as to what the cause for the corruption is :( > > > > >From Don's comment, I take it you are using FreeBSD? Version? Stability > > of the machine? Never crashes? > > > > Version of PostgreSQL? Compile/configure options? Do you have any core > > files in your data/base/* hierarchy that would be the result of a backend > > crashing? > > > > I know you are looking at alternatives, but I'm terrible at letting go of > > problems :( > > His description of table corruption and the system running slower and > slower sounds like a disk going bad. I've seen it hundreds of times > on news machines. Constant retries while trying to write to the disk > will give slowdowns. Having data on a spot of the disk that's unreliable > will certainly cause data integrity problems. That was one thing I was thinking ... the other was the possibility that he's mount'd async and his machine is rebooting ... *or* he has memory problems causing the shared memory to corrupt, dump the postmaster process which is corrupting his tables ... Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker wrote:
>
> On Wed, 3 May 2000, Hannu Krosing wrote:
>
> > Malcontent null wrote:
> > >
> > > >Anyway, so the larger class of problem is for the Sybase/M$ user who
> > > >relies on case insensitive queries (which *are* available in Postgres)
> >
> > Maybe the right place to introduce case-insensitiveness would be in ODBC
> > driver then ?
> ...
> > > 2) Case insensitive queries.
> >
> > Probably only the Access subset ("like", "order by", maybe even "=" ?)
>
> don't we have a 'lower()' function?
>
> SELECT * FROM <table> WHERE field ~* 'this string' ORDER BY lower(field);?
>
> or
>
> SELECT * FROM <table> WHERE lower(field) = lower('StriNg');

That's what I meant by pushing the case-insensitiveness into ODBC, so that the MS Access program can be made case-insensitive automatically, either A) by rewriting the queries to use lower() or B) by using case-insensitive operators where possible.

----------------
Hannu
On Wed, 03 May 2000 04:12:37 +0000, Thomas Lockhart <lockhart@alumni.caltech.edu> wrote:

>> > If I can't get my questions answered about case sensitivity issues here
>> > (no help so far) I will most likely use it myself.
>
>My recollection is that it involved needing non-standard
>case-insensitive LIKE comparisons to get transparent behavior with an
>existing M$ Access app. So far, we were too polite to ask why one is
>working so hard to maintain compatibility with a non-standard
>interface, rather than writing the app to be portable. But I'll ask
>now. Tim?

If Tim wants a LIKE to be done case-insensitively, why not do:

select foo from table1 where upper(foo) like '%BAR%'

??

sergio
> Of course the alternative is to just dive in and hack and slash at the
> backend code. Look in parser/gram.y and utils/adt/like.c for
> starters...

That would be my recommendation. It is open source, so you can modify it however you like.

--
Bruce Momjian | http://www.op.net/~candle
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Hi Don, thanks for your reply... > First, have you been having the same problems with PG 7.0? I recall that > you had it up on a test system but nothing more. No, I was afraid to run 7.0 beta on the production server. I do have it on my development server however and haven't had any problems with it (of course the devel server is only used by me, the production server is used by about 600 people)... > It's a pity that you've reached this point, because PG is so much better > than it was 18 months ago (and before, of course, I mention that timeframe > because that's roughly when I first investigated its suitability for > the web toolkit project) and the trajectory is definitely in the right > direction. Well, we've started to look into it. My boss is worried about the stability and speed of Postgres in the long run, personally I love PG, I would like to stay with it until at least 7.1, if the problems still continue then maybe look elsewhere. > It's also a loss to the development effort, as people with bugs in many > ways are more useful than people who have no problems (though of course > having no bugs for users to stumble across is the best situation!) > Still, I understand the need to solve your problems today, not tomorrow. > Interbase is a possible solution. They have a pretty good reputation, > and their "super server" (threaded with connections sharing a buffer > cache) should scale well. My rough estimate is that they're at about > the place PG will be when 7.1 comes out. I don't know if they support > FreeBSD, though. Any reason you can't just put up a box with Linux? Ooooo. Mitch hates Linux. It's a long, boring story but lets just say that Linux and I use to be friends, now we're mortal enemies. :-) -Mitch Vincent
> As Don asks, what happened with the v7.0 trials you were doing? Corrupted
> indices, I've seen occasionally in older versions, but I can't recall ever
> seeing corrupt system tables ...

I couldn't run the 7.0 beta on our production server. It was forbidden from "higher up"..

> I don't have a GUI browser right, so searching the archives is kinda tough
> for me :( Can you refresh my memory for me? There has to be something
> logical to this, as to what the cause for the corruption is :(

Ok, the latest thing was "cannot find attribute 15 of relation pg_am" -- I got that when I tried to do a query.

> From Don's comment, I take it you are using FreeBSD? Version? Stability
> of the machine? Never crashes?

FreeBSD 4.0-R

The machine is brand new (we built it because we thought it was a hardware problem before).. Ultra 160 SCSI drives, 512 megs of ECC RAM, PIII 500 processor (soon to be upgraded).

> Version of PostgreSQL? Compile/configure options? Do you have any core
> files in your data/base/* hierarchy that would be the result of a backend
> crashing?

PG 6.5.3, no core files (this latest time at least, in the past there have been).

As far as configure options, nothing, just the default configuration...

> I know you are looking at alternatives, but I'm terrible at letting go of
> problems :(

Me too, that's why I've stayed with PG for 6 months with these problems.. I wish I had more to tell you now, however I had to restore the data from a backup.

Thanks!!

-Mitch
> > We use PostgreSQL for a fairly large application I wrote, the database is > > still pretty small, it carries info on about 25-30,000 people and about > > 5,000 jobs. Recently we've had huge trouble with PostgreSQL -- it seems that > > every month I stump someone with the obscure things that happen to our data > > :-) > > What version are you using ? 6.5.3 :-) > > >From corrupted indexes to corrupted system tables, it's almost always > > unrecoverable. Luckily I always have a backup to restore from and the world > > goes on... We've only recently started to notice that the backend is slowing > > down. It seems that with every additional applicant added it get > > exponentially slower... So, sadly I have to go find another backend for this > > application -- a commercial one too so we can get "commercial support" > > (yuck).. > > Could you be a little more specific on your performance issues ? Well, I'm just noticing that simple select queries are taking 3-5 seconds - on a table with 63 fields, 10000ish rows of data. The ID fields are indexed, as well as several always-searched varchar() fields. Here are some typical queries my application might generate. Please, let me know if you see anything that can be improved! select * from applicants as a where a.created::date = '05-01-2000' and a.firstname ~* '^mitch' limit 10 offset 0 select * from applicants as a,applicants_states as s where a.firstname ~* '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0 .... There are 63 fields in the 'applicants' table, all of which are searchable. Would it be a good or bad thing to index all fields that are searchable? > The usual way to deal wih them is tuning your db structure and/or > queries or > setting backend options to use more memory for stuff or other such > things. I'd love some pointers! This machine has lots-n-lots of memory. I'd love to make postgre use more than normal if it would get me better speed! > If there is something wrong with the structure or queries, then a > database > switch will help you very little, unless your front-end tool has some > special > support for _some_ databases and not for others. PHP/C is what I use to access PostgreSQL. > > So, could you guys suggest some other backends I might look into? > > The usual - Oracle, Interbase, Informix, DB2, Sybase, Solid > > The website is usually obtained by putting www inf front and com at the > end ;) > > And let us know of your results. > > > I know > > it's an odd place for me to ask but the flat truth is that I think *I* am to > > blame for my Postgres troubles and even taking all of the problems into > > account I think PG is the best damn free RDBMS out there. It's functionality > > is superior to everyone else's, it's developers are no less than amazing and > > well -- I trust you guys to give me some honest opinions.. The functionality > > I need is basically what PG has.. Transactions are a must as well as some > > sort of sequence -- stability over performance but performance is very > > important too. It also needs to run native on FreeBSD.. > > > > Oracle is out as we use FreeBSD and someone out there decided that they > > wouldn't support FreeBSD (in the license as well as in the code!).. > > Is FreeBSD a religious issue there or can it be negotiated ? Not religious, though I love that OS like I love my right arm.. I am stuck with x86 hardware -- that's pretty much it (and eliminates Solaris and several other as a possible OS I suppose).. THANKS!! -Mitch Vincent
> His description of table corruption and the system running slower and > slower sounds like a disk going bad. I've seen it hundreds of times > on news machines. Constant retries while trying to write to the disk > will give slowdowns. Having data on a spot of the disk that's unreliable > will certainly cause data integrity problems. > > Mitch, have you thoroughly checked the hardware? Checked and replaced twice. We're using Ultra 160 SCSI drives now so the speed there isn't a problem I hope. :-)
Mitch Vincent wrote: > > > As Don asks, what happened with the v7.0 trials you were doing? Corrupted > > indices, I've seen occasionally in older versions, but I can't recall ever > > seeing corrupt system tables ... > > I couldn't run the 7.0 beta on our production server. It was forbidden from > "higher up".. Hmmm. I wonder what would be the reaction if you showed your "higher ups" the list of bug-fixes between your release and 7.0. It is a *very long* list. I loathe unplanned/forced upgrades, especially when due to reliability problems. I was put in that position with pg 6.5.2 problems. But I have been quite pleased with 7.0 so far. Regards, Ed Loehr
On Wed, 3 May 2000, Mitch Vincent wrote:

> Hi Don, thanks for your reply...
>
> > First, have you been having the same problems with PG 7.0? I recall that
> > you had it up on a test system but nothing more.
>
> No, I was afraid to run 7.0 beta on the production server. I do have it on
> my development server however and haven't had any problems with it (of
> course the devel server is only used by me, the production server is used by
> about 600 people)...

Okay, it's no longer beta :)  Run it, run it :)

My experience so far is that it was basically a 'plug-n-play' sort of thing ... dump your database, upgrade and reload ... the only problem I had was one script where I did something like:

SELECT field as field

without realizing it, and v7.0 won't allow that (field == field) ... other than that, *much* more stable, and appears faster than the old ...

> Ooooo. Mitch hates Linux. It's a long, boring story but lets just say that
> Linux and I use to be friends, now we're mortal enemies. :-)

I like you :)
On Wed, 3 May 2000, Mitch Vincent wrote: > > I don't have a GUI browser right, so searching the archives is kinda tough > > for me :( Can you refresh my memory for me? There has to be something > > logical to this, as to what the cause for the corruption is :( > > Ok, the latest thing was "cannot find attribute 15 of relation pg_am" -- I > got that when I tried to do an query. Someone recently posted similar, and I swore he answered himself with a 'vacuuming fixed the problem' ... but I could be mis-quoting ... > > Version of PostgreSQL? Compile/configure options? Do you have any core > > files in your data/base/* hierarchy that would be the result of a backend > > crashing? > > PG 6.5.3, no core files (this latest time at least, in the past there have > been). > > As far as configure options, nothing, just the default configuration... Okay, with v7.0, I'd recommend adding --enable-debug, so that if a core does creep in, we can analyze it ... My first and foremost recommendation is to upgrade to v7.0 first ... take your test machine and make sure you have no problem with the dump/reload, but v7.0 is, once more, leaps and bounds ahead of v6.5.3 ... no guarantees it will make a difference, but at least it gets you into a release that we're going to be focusing on debugging intensely over the next little while ...
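For reference, the dump/upgrade/reload cycle Marc describes comes down to roughly the following. This is only a sketch of the usual sequence -- check the 7.0 INSTALL and migration notes for the exact steps on your platform; PGDATA is assumed to point at the new, empty data directory:

  # with the old 6.5.3 binaries still running:
  pg_dumpall > db.out

  # build and install 7.0, with debugging symbols as suggested above:
  ./configure --enable-debug
  gmake && gmake install       # GNU make; plain "make" on most Linux boxes

  # initialize a fresh data directory, start the new postmaster against it,
  # then reload the dump:
  initdb
  psql -d template1 -f db.out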
Malcontent null <malcontent@msgto.com> writes: > "In a nutshell I want to use postgres as a back end to an access > database. This means that all collation done by postgres musht be case > insensitive including like clauses." Well, it'd certainly be easy enough to make a case-insensitive set of text comparison functions and LIKE comparator (sounds like you already found out where they live, so go to it). The $64 question is exactly when where and how to invoke them. For your immediate purpose I suppose you can just alter the standard operators, and thus produce a server that can do nothing *but* case-insensitive text comparisons. That will certainly not be acceptable as a general-usage answer, however. There has been a good deal of talk about supporting more of SQL92's national-character and collation features in future releases. What might ultimately happen is that we offer a case-insensitive collation mode that could be assigned to particular text columns, or even made the default for a whole database. Not sure how far down the road that is. In the meantime, there's no shame in running a locally-hacked server. That's one of the things that open source is for, after all ;-) regards, tom lane
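To make Tom's "alter the standard operators" idea concrete, here is a rough, untested sketch of what the LIKE half might look like at the SQL level. The names are illustrative only: ci_like() is invented, and textlike() is the internal routine behind text ~~ in utils/adt/like.c, so verify it is actually callable under that name in your catalogs before trying this. As noted earlier in the thread, replacing built-in operators (especially =) can have unpleasant side effects, so treat this strictly as a starting point on a scratch database:

  -- case-folding wrapper around the built-in LIKE support function
  CREATE FUNCTION ci_like(text, text) RETURNS bool AS '
      SELECT textlike(lower($1), lower($2));
  ' LANGUAGE 'sql';

  -- swap the standard text LIKE operator for the case-insensitive one
  -- (this affects every query in the database)
  DROP OPERATOR ~~ (text, text);
  CREATE OPERATOR ~~ (
      leftarg   = text,
      rightarg  = text,
      procedure = ci_like
  );

After that, a LIKE issued by Access would match 'Mitch' and 'mitch' alike; the = operator remains the hard case, for the reasons mentioned above.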
"Mitch Vincent" <mitch@huntsvilleal.com> writes: >> First, have you been having the same problems with PG 7.0? I recall that >> you had it up on a test system but nothing more. > No, I was afraid to run 7.0 beta on the production server. I do have it on > my development server however and haven't had any problems with it (of > course the devel server is only used by me, the production server is used by > about 600 people)... FWIW, we've fixed a huge number of bugs since 6.5.*. Even the beta versions of 7.0 are more stable than any prior release IMHO (and we've seen no beta test reports that would contradict that). I'd really like to see you try 7.0 before walking away... regards, tom lane
Oh, I'm downloading it to the development server as I type this.. I didn't know RC3 (virtually the release) was out. I plan on starting to use it today if everything works on the development server. - Mitch ----- Original Message ----- From: Tom Lane <tgl@sss.pgh.pa.us> To: Mitch Vincent <mitch@huntsvilleal.com> Cc: <pgsql-hackers@postgresql.org>; Don Baccus <dhogaza@pacifier.com> Sent: Wednesday, May 03, 2000 11:21 AM Subject: Re: [HACKERS] Why Not MySQL? > "Mitch Vincent" <mitch@huntsvilleal.com> writes: > >> First, have you been having the same problems with PG 7.0? I recall that > >> you had it up on a test system but nothing more. > > > No, I was afraid to run 7.0 beta on the production server. I do have it on > > my development server however and haven't had any problems with it (of > > course the devel server is only used by me, the production server is used by > > about 600 people)... > > FWIW, we've fixed a huge number of bugs since 6.5.*. Even the beta > versions of 7.0 are more stable than any prior release IMHO (and we've > seen no beta test reports that would contradict that). I'd really like > to see you try 7.0 before walking away... > > regards, tom lane >
"Mitch Vincent" <mitch@huntsvilleal.com> writes: >> Could you be a little more specific on your performance issues ? > Well, I'm just noticing that simple select queries are taking 3-5 seconds - > on a table with 63 fields, 10000ish rows of data. The ID fields are > indexed, as well as several always-searched varchar() fields. Hmm. What does EXPLAIN show for the query plan? You might also try turning on execution stats (run psql with PGOPTIONS="-d2 -s" for starters) to see what getrusage() can tell. The results will be in the postmaster log and might look like this: StartTransactionCommand query: SELECT usename, relname, relkind, relhasrules FROM pg_class, pg_user WHERE usesysid = relowner and ( relkind = 'r'OR relkind = 'i' OR relkind = 'S') and relname !~ '^pg_' and (relkind != 'i' OR relname !~ '^xinx') ORDER BY relname ProcessQuery ! system usage stats: ! 0.083256 elapsed 0.040000 user 0.000000 system sec ! [0.080000 user 0.020000 sys total] ! 12/1 [46/11] filesystem blocks in/out ! 0/0 [1/2] page faults/reclaims, 0 [0] swaps ! 0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent ! 8/5 [29/10] voluntary/involuntary context switches ! postgres usage stats: ! Shared blocks: 18 read, 0 written, buffer hit rate = 94.29% ! Local blocks: 0 read, 0 written, buffer hit rate = 0.00% ! Direct blocks: 0 read, 0 written CommitTransactionCommand regards, tom lane
On Wed, 3 May 2000, Mitch Vincent wrote:

> Here are some typical queries my application might generate. Please, let me
> know if you see anything that can be improved!

First comment ... Tom Lane always jumps on me on this ... if you are going to send a QUERY to get recommendations on, send in an EXPLAIN on that query also, so that we can see what the backend 'thinks' it's going to do ...

> select * from applicants as a where a.created::date = '05-01-2000' and
> a.firstname ~* '^mitch' limit 10 offset 0

Second comment, which Tom can clarify in case I'm wrong ... when I ran UDMSearch under v6.5.3, there was a problem where a LIKE query was causing a query to take forever to complete ... Tom, at the time, got me to change the query so that instead of:

  url LIKE '%s'

it was:

  (url || '') LIKE '%s'

Now, it was in an earlier RC of v7.0 that I had to do this, and Tom made some changes to the following one to 'fix the problem', but my performance went from several *minutes* to several *seconds* to complete the exact same query ...

> > The usual way to deal with them is tuning your db structure and/or
> > queries or setting backend options to use more memory for stuff or
> > other such things.
>
> I'd love some pointers! This machine has lots-n-lots of memory. I'd love to
> make postgre use more than normal if it would get me better speed!

on my machine (FreeBSD 4.0-STABLE), I'm currently running with a kernel of:

  options SYSVSHM
  options SHMMAXPGS=262144
  options SHMSEG=32
  options SYSVSEM
  options SEMMNI=40
  options SEMMNS=240
  options SEMMNU=120
  options SEMMAP=120
  options SYSVMSG

and a -B set to 4096 and -o ' -S 16384 ' ... the -B deals with the amount of shared memory; the -S I'm using only affects stuff like ORDER BY and GROUP BY (it allocates up to that much RAM for a sort before going to disk) ...

Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org   secondary: scrappy@{freebsd|postgresql}.org
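Put together, Marc's settings amount to a postmaster start line along these lines (the data directory path is just an example; with the default 8k block size, -B 4096 works out to roughly 32MB of shared buffers, and -S 16384 lets each sort use up to about 16MB before spilling to disk):

  postmaster -D /usr/local/pgsql/data -B 4096 -o '-S 16384'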
Okay, single or dual processor machine? I'm running Dual myself, and for any largish DB, I would highly recommend that regardless of OS ... at least if a query starts to hog CPU, you have a second one to use ... On Wed, 3 May 2000, Mitch Vincent wrote: > > > We use PostgreSQL for a fairly large application I wrote, the database > is > > > still pretty small, it carries info on about 25-30,000 people and about > > > 5,000 jobs. Recently we've had huge trouble with PostgreSQL -- it seems > that > > > every month I stump someone with the obscure things that happen to our > data > > > :-) > > > > What version are you using ? > > 6.5.3 :-) > > > > >From corrupted indexes to corrupted system tables, it's almost always > > > unrecoverable. Luckily I always have a backup to restore from and the > world > > > goes on... We've only recently started to notice that the backend is > slowing > > > down. It seems that with every additional applicant added it get > > > exponentially slower... So, sadly I have to go find another backend for > this > > > application -- a commercial one too so we can get "commercial support" > > > (yuck).. > > > > Could you be a little more specific on your performance issues ? > > Well, I'm just noticing that simple select queries are taking 3-5 seconds - > on a table with 63 fields, 10000ish rows of data. The ID fields are > indexed, as well as several always-searched varchar() fields. > > Here are some typical queries my application might generate. Please, let me > know if you see anything that can be improved! > > select * from applicants as a where a.created::date = '05-01-2000' and > a.firstname ~* '^mitch' limit 10 offset 0 > > select * from applicants as a,applicants_states as s where a.firstname ~* > '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0 > > .... There are 63 fields in the 'applicants' table, all of which are > searchable. Would it be a good or bad thing to index all fields that are > searchable? > > > The usual way to deal wih them is tuning your db structure and/or > > queries or > > setting backend options to use more memory for stuff or other such > > things. > > I'd love some pointers! This machine has lots-n-lots of memory. I'd love to > make postgre use more than normal if it would get me better speed! > > > If there is something wrong with the structure or queries, then a > > database > > switch will help you very little, unless your front-end tool has some > > special > > support for _some_ databases and not for others. > > PHP/C is what I use to access PostgreSQL. > > > > So, could you guys suggest some other backends I might look into? > > > > The usual - Oracle, Interbase, Informix, DB2, Sybase, Solid > > > > The website is usually obtained by putting www inf front and com at the > > end ;) > > > > And let us know of your results. > > > > > I know > > > it's an odd place for me to ask but the flat truth is that I think *I* > am to > > > blame for my Postgres troubles and even taking all of the problems into > > > account I think PG is the best damn free RDBMS out there. It's > functionality > > > is superior to everyone else's, it's developers are no less than amazing > and > > > well -- I trust you guys to give me some honest opinions.. The > functionality > > > I need is basically what PG has.. Transactions are a must as well as > some > > > sort of sequence -- stability over performance but performance is very > > > important too. It also needs to run native on FreeBSD.. 
> > > > > > Oracle is out as we use FreeBSD and someone out there decided that they > > > wouldn't support FreeBSD (in the license as well as in the code!).. > > > > Is FreeBSD a religious issue there or can it be negotiated ? > > Not religious, though I love that OS like I love my right arm.. I am stuck > with x86 hardware -- that's pretty much it (and eliminates Solaris and > several other as a possible OS I suppose).. > > THANKS!! > > -Mitch Vincent > > > Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
Hannu Krosing <hannu@tm.ee> writes: > That's what I meant by introducing pushing the case-insensitiveness > into ODBC, I don't believe ODBC parses the query carefully enough to recognize operators that would need to be altered to become case-insensitive. I'm not even sure that it could do that --- does "WHERE f1 = f2" need to be changed to "WHERE lower(f1) = lower(f2)"? No way to know that unless you know the datatypes of f1 and f2, which would mean (a) a moderately complete SQL parser/analyzer and (b) a copy of the system catalogs inside ODBC. Doesn't sound like a promising avenue of attack... regards, tom lane
At 11:44 AM 5/3/00 -0300, The Hermit Hacker wrote:
>My experience so far is that it was basically a 'plug-n-play' sort of
>thing ... dump your database, upgrade and reload ... the only problem I
>had was one script that I did something like:
>
>SELECT field as field

I was able to dump my birdnotes.net database and reload it without change,
if another data point will help your boss's confidence level.

>without realizing it, and v7.0 won't allow that (field == field) ... other
>than that, *much* more stable, and appears faster than the old ...

My system had one query in particular that suffered from a poor plan (a
very bad nested loop); the changes to the optimizer have resulted in a much
better plan in 7.0, which just flies. I've been very happy with 7.0.

Tell your boss we're smarter than he is and switch :)

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at http://donb.photo.net.
At 10:18 AM 5/3/00 -0400, Mitch Vincent wrote:
>No, I was afraid to run 7.0 beta on the production server. I do have it on
>my development server however and haven't had any problems with it (of
>course the devel server is only used by me, the production server is used
>by about 600 people)...

It's being run at http://community.aolserver.com and http://openacs.org, and it seems to be solid.

...

>> Interbase is a possible solution. They have a pretty good reputation,
>> and their "super server" (threaded with connections sharing a buffer
>> cache) should scale well. My rough estimate is that they're at about
>> the place PG will be when 7.1 comes out. I don't know if they support
>> FreeBSD, though. Any reason you can't just put up a box with Linux?

>Ooooo. Mitch hates Linux. It's a long, boring story but let's just say that
>Linux and I used to be friends; now we're mortal enemies. :-)

I love it when personal religion gets in the way of technical decision making!

FreeBSD's great, so is Linux. I personally use Linux, but if a key component of a business of mine would be better served by software only available on FreeBSD, I'd have a server up in a matter of hours.

Oh, well... to each their own.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at http://donb.photo.net.
"Mitch Vincent" <mitch@huntsvilleal.com> writes: > Here are some typical queries my application might generate. Please, let me > know if you see anything that can be improved! > select * from applicants as a where a.created::date = '05-01-2000' and > a.firstname ~* '^mitch' limit 10 offset 0 Neither of these WHERE clauses can be used with a plain-vanilla index (I'm assuming a.created is of time datetime?), so you're getting a simple sequential scan over the whole table --- unless the LIMIT stops it sooner. If the table is large then you could get better performance by arranging for an indexscan using whichever clause is likely to be more selective (I'd be inclined to go for the date, I think, unless your creation dates come in bunches). The trick for the date test would be to have a functional index on date(a.created). I'm not sure how bright 6.5.* is about this, but it definitely works in 7.0: create table foo (f1 datetime); -- a straight index on f1 is no help: create index foof1 on foo(f1); explain select * from foo where f1::date = '05-01-2000'; NOTICE: QUERY PLAN: Seq Scan on foo (cost=0.00..25.00 rows=10 width=8) -- but an index on date(f1) is: create index foof1date on foo(date(f1)); explain select * from foo where f1::date = '05-01-2000'; NOTICE: QUERY PLAN: Index Scan using foof1date on foo (cost=0.00..8.16 rows=10 width=8) If you wanted to make the scan on firstname indexable, you'd need to make an index on lower(firstname) and then change the query to read... lower(a.firstname) ~ '^mitch' or possibly... lower(a.firstname) ~ lower('^mitch') if you don't want to assume the given pattern is lowercase to begin with. (The second example will fail to be indexed under 6.5, but should be just fine in 7.0.) ~* can't use an index under any circumstance, but ~ can if the pattern has a left-anchored fixed prefix. > select * from applicants as a,applicants_states as s where a.firstname ~* > '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0 Again, the ~* clause is not indexable as-is, but the rstate clause would be if you have an index on s.rstate --- however, I imagine that it wouldn't be very selective, either, so it might not be worth the trouble. Changing the query to make the firstname part be indexable could be a win. You also need to look at how the join between a and s is being done. How big are these tables, anyway? > .... There are 63 fields in the 'applicants' table, all of which are > searchable. Would it be a good or bad thing to index all fields that are > searchable? A lot of indexes will hurt your insert/update/delete times, so I wouldn't recommend having a whole bunch of indexes unless searches are far more frequent than changes. What you want is a few well-chosen indexes that match the commonly used kinds of WHERE clauses in your query mix. > I'd love some pointers! This machine has lots-n-lots of memory. I'd love to > make postgre use more than normal if it would get me better speed! Increase postmaster's -B and -S settings ... regards, tom lane
> The trick for the date test would be to have a functional index on
> date(a.created).  I'm not sure how bright 6.5.* is about this, but
> it definitely works in 7.0:
>
> create table foo (f1 datetime);
>
> -- a straight index on f1 is no help:
> create index foof1 on foo(f1);
> explain select * from foo where f1::date = '05-01-2000';
> NOTICE:  QUERY PLAN:
>
> Seq Scan on foo  (cost=0.00..25.00 rows=10 width=8)
>
> -- but an index on date(f1) is:
> create index foof1date on foo(date(f1));
> explain select * from foo where f1::date = '05-01-2000';
> NOTICE:  QUERY PLAN:
>
> Index Scan using foof1date on foo  (cost=0.00..8.16 rows=10 width=8)

Thanks!

> > select * from applicants as a,applicants_states as s where a.firstname ~*
> > '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0
>
> Again, the ~* clause is not indexable as-is, but the rstate clause
> would be if you have an index on s.rstate --- however, I imagine that
> it wouldn't be very selective, either, so it might not be worth the
> trouble.  Changing the query to make the firstname part be indexable
> could be a win.  You also need to look at how the join between a and s
> is being done.  How big are these tables, anyway?

ipa=> select count(app_id) from applicants_states;
 count
--------
 244367

ipa=> select count(app_id) from applicants;
 count
-------
  9791

> > .... There are 63 fields in the 'applicants' table, all of which are
> > searchable. Would it be a good or bad thing to index all fields that are
> > searchable?
>
> A lot of indexes will hurt your insert/update/delete times, so I
> wouldn't recommend having a whole bunch of indexes unless searches are
> far more frequent than changes.  What you want is a few well-chosen
> indexes that match the commonly used kinds of WHERE clauses in your
> query mix.

It's basically a search engine, so yes, searching is FAR more frequently done than inserts/updates/deletes.

> Increase postmaster's -B and -S settings ...

I will. Thanks!

-Mitch Vincent
"Mitch Vincent" <mitch@huntsvilleal.com> writes: >> You also need to look at how the join between a and s >> is being done. How big are these tables, anyway? > ipa=> select count(app_id) from applicants_states; > count > ------ > 244367 > ipa=> select count(app_id) from applicants; > count > ----- > 9791 Now I'm confused --- what's the data model here? I guess each applicants row must match many entries in applicants_states? Anyway, I suspect you definitely want to avoid a nested-loop join :-). It'd be fairly reasonable for the system to use either hash or merge join, I think. What does EXPLAIN show that the system is actually doing with this query? >>>> .... There are 63 fields in the 'applicants' table, all of which are >>>> searchable. Would it be a good or bad thing to index all fields that are >>>> searchable? >> >> A lot of indexes will hurt your insert/update/delete times, so I >> wouldn't recommend having a whole bunch of indexes unless searches are >> far more frequent than changes. What you want is a few well-chosen >> indexes that match the commonly used kinds of WHERE clauses in your >> query mix. > It's basically a search engine so yes, searching is FAR more frequently done > than inserts/updates/deletes Well, there's still a cost to having a lot of seldom-used indexes, because the planner has to sit there and consider whether to use each one for each query. So I'd still recommend looking at your mix of queries and only creating indexes that match reasonably commonly-used WHERE clauses. regards, tom lane
> > ipa=> select count(app_id) from applicants_states;
> >  count
> > --------
> >  244367
>
> > ipa=> select count(app_id) from applicants;
> >  count
> > -------
> >   9791
>
> Now I'm confused --- what's the data model here?  I guess each applicants
> row must match many entries in applicants_states?

Well, that's one possible search a person could do. applicants is the 63-field table that holds general info about the person.

applicants_states is a table having only two fields --> app_id int4, rstate varchar(2) -- this holds all the state abbreviations of the states that an applicant will relocate to. It was either break it out into a different table or make a very large varchar() field in the applicant table that would most of the time be totally blank, but would have to be able to hold every state abbreviation in there, including a delimiter (comma or something) between each one.. We broke it out into another table so each applicant can have a virtually unlimited number of states (or other countries) in the database. So for each applicant there could be 53 records in the applicants_states table.. (there are 53 different abbreviations that are valid in this application).. Clear as mud? :-)

Looking back I think that it would have almost been better to make a 150-200 character varchar field -- perhaps not though..

> Anyway, I suspect you definitely want to avoid a nested-loop join :-).

*cough cough cough* I used to have a subselect there -- in fact you smacked me around and told me to change it!

> It'd be fairly reasonable for the system to use either hash or merge
> join, I think.  What does EXPLAIN show that the system is actually
> doing with this query?

6.5.3 :

ipa=> explain select * from applicants as a,applicants_states as s where a.firstname ~*
ipa-> '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0;
NOTICE:  QUERY PLAN:

Nested Loop  (cost=1693.76 rows=6 width=615)
  ->  Seq Scan on applicants a  (cost=1554.71 rows=50 width=599)
  ->  Index Scan using applicants_states_app_id on applicants_states s  (cost=2.78 rows=1023 width=16)

7.0RC3 (with the exact same data) :

ipa=# explain select * from applicants as a,applicants_states as s where a.firstname ~*
ipa-# '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0;
NOTICE:  QUERY PLAN:

Merge Join  (cost=0.00..29675.86 rows=2397 width=615)
  ->  Index Scan using applicants_states_app_id on applicants_states s  (cost=0.00..23062.15 rows=2445 width=16)
  ->  Index Scan using applicants_app_id on applicants a  (cost=0.00..6581.91 rows=98 width=599)

I did get two errors importing data from the 6.5.3 database into the 7.0 database :

ERROR:  DefineIndex: opclass "date_ops" does not accept datatype "timestamp"
ERROR:  DefineIndex: opclass "date_ops" does not accept datatype "timestamp"

And I have several fields of type 'datetime' in 6.5.3 which seem to get translated to 'timestamp' in 7.0 -- then I try to index them using date_ops..

-Mitch
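For reference, a rough sketch of the layout Mitch is describing (only app_id and rstate are named in the thread; the other column types and widths here are assumptions):

create table applicants (
    app_id    int4,           -- plus roughly 60 more searchable columns
    firstname varchar(32),    -- assumed type/width
    created   datetime
);

create table applicants_states (
    app_id int4,              -- matches applicants.app_id
    rstate varchar(2)         -- one state the applicant will relocate to
);

-- the join-column index that appears in the EXPLAIN output above:
create index applicants_states_app_id on applicants_states (app_id);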
"Mitch Vincent" <mitch@huntsvilleal.com> writes: > applicants_states is a table having only two fields --> app_id int4, rstate > varchar(2) -- this holds all the state abbreviations of the states that an > applicant will relocate too. Ah, got it. If we had better search capabilities on arrays, you could have stored this within the applicants table as an array of char(2) ... but as is, I suspect you did the right thing to make it a second table. >> It'd be fairly reasonable for the system to use either hash or merge >> join, I think. What does EXPLAIN show that the system is actually >> doing with this query? > 6.5.3 : [ nested loop ] > 7.0RC3 (with the exact same data) : [ merge join ] So, may I have the envelope please? What's the timings? > I did get two errors importing data from the 6.5.3 database into the 7.0 > database : > ERROR: DefineIndex: opclass "date_ops" does not accept datatype "timestamp" > ERROR: DefineIndex: opclass "date_ops" does not accept datatype "timestamp" > And I have several fields of type 'datetime' in 6.5.3 which seem to get > translated to 'timestamp' in 7.0 -- then I try to index them using > date_ops.. Oh, that's interesting. I doubt that date_ops will work very well on timestamp data (or on its predecessor datetime). But 7.0 is the first version that actually checks whether your requested index operators are compatible with the column datatype --- previous versions would blindly do what you told them to, and very possibly coredump depending on what the datatypes in question where. I wonder if that mistake was causing some of the instability you had with 6.5? regards, tom lane
> > 6.5.3 : [ nested loop ]
> > 7.0RC3 (with the exact same data) : [ merge join ]
>
> So, may I have the envelope please?  What's the timings?

Eh'? I cut and pasted everything that was printed... Here it is again, with both the beginning and ending prompts :-)

7.0 :

ipa=# explain select * from applicants as a,applicants_states as s where a.firstname ~*
'^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0;
NOTICE:  QUERY PLAN:

Hash Join  (cost=1355.82..5943.73 rows=17 width=615)
  ->  Seq Scan on applicants_states s  (cost=0.00..4492.54 rows=2350 width=16)
  ->  Hash  (cost=1355.54..1355.54 rows=112 width=599)
        ->  Seq Scan on applicants a  (cost=0.00..1355.54 rows=112 width=599)

EXPLAIN
ipa=#

6.5.3 :

ipa=> explain select * from applicants as a,applicants_states as s where a.firstname ~*
'^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0;
NOTICE:  QUERY PLAN:

Nested Loop  (cost=1693.76 rows=6 width=615)
  ->  Seq Scan on applicants a  (cost=1554.71 rows=50 width=599)
  ->  Index Scan using applicants_states_app_id on applicants_states s  (cost=2.78 rows=1023 width=16)

EXPLAIN
ipa=>

> Oh, that's interesting.  I doubt that date_ops will work very well on
> timestamp data (or on its predecessor datetime).  But 7.0 is the first
> version that actually checks whether your requested index operators are
> compatible with the column datatype --- previous versions would blindly
> do what you told them to, and very possibly coredump depending on what
> the datatypes in question were.  I wonder if that mistake was causing
> some of the instability you had with 6.5?

It's very likely that had to do with the stability problems; I indexed all the datetime fields like that..

I indexed the datetime fields as you mentioned previously in your email and WOW, the speed improvement was crazy. It's damn near instant if you're searching just by date created (as many of my lovely users do!)..

-Mitch
Hmm, something else that I see that might break some code between 6.5.* and 7.0 (it did mine) is the fact that dates and date-times are returned the reverse of what they used to be. I make comparisons of the literal string in a few places, and it didn't like 2000-03-03 20:20:02-05 as a date-time :-)

No big deal, but just out of curiosity, why the change?

- Mitch
On Wed, 3 May 2000, Mitch Vincent wrote:

> > > 6.5.3 : [ nested loop ]
> > > 7.0RC3 (with the exact same data) : [ merge join ]
> >
> > So, may I have the envelope please?  What's the timings?
>
> Eh'? I cut and pasted everything that was printed...

EXPLAIN tells how the system will do things ... if you run it without the EXPLAIN, how long does it take to run? :)
> So, may I have the envelope please?  What's the timings?

6.5.3:

StartTransactionCommand
query: select * from applicants as a,applicants_states as s where
a.firstname ~* '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10
offset 0;
ProcessQuery
! system usage stats:
!	0.436792 elapsed 0.275139 user 0.157033 system sec
!	[0.283135 user 0.173026 sys total]
!	0/0 [0/0] filesystem blocks in/out
!	0/149 [0/332] page faults/reclaims, 0 [0] swaps
!	0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent
!	0/8 [2/9] voluntary/involuntary context switches
! postgres usage stats:
!	Shared blocks:  1403 read,  0 written, buffer hit rate = 51.22%
!	Local  blocks:     0 read,  0 written, buffer hit rate =  0.00%
!	Direct blocks:     0 read,  0 written
CommitTransactionCommand
proc_exit(0) [#0]
shmem_exit(0) [#0]
exit(0)

______________________________________________________________________

7.0 :

StartTransactionCommand
query: select * from applicants as a,applicants_states as s where
a.firstname ~* '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10
offset 0;
ProcessQuery
! system usage stats:
!	1.461997 elapsed 1.224377 user 0.234618 system sec
!	[1.238219 user 0.255382 sys total]
!	0/12 [0/12] filesystem blocks in/out
!	0/60 [0/318] page faults/reclaims, 0 [0] swaps
!	0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent
!	0/22 [1/24] voluntary/involuntary context switches
! postgres usage stats:
!	Shared blocks:  2713 read,  0 written, buffer hit rate = 25.34%
!	Local  blocks:     0 read,  0 written, buffer hit rate =  0.00%
!	Direct blocks:     0 read,  0 written
CommitTransactionCommand
proc_exit(0)
shmem_exit(0)
exit(0)
/usr/local/pgsql/bin/postmaster: reaping dead processes...
/usr/local/pgsql/bin/postmaster: CleanupProc: pid 60606 exited with status 0

Sorry, I didn't get what you wanted at first. I'm down to my last brain cell today.

-Mitch
Mitch, one question ... what is the configuration (RAM, CPU, etc) of the v6.5.3 machine vs v7.0 machine? Also, how is your postmaster started up? What options? On Wed, 3 May 2000, Mitch Vincent wrote: > > So, may I have the envelope please? What's the timings? > > 6.5.3: > > StartTransactionCommand > query: select * from applicants as a,applicants_states as s where > a.firstname ~* '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 > offset 0; > ProcessQuery > ! system usage stats: > ! 0.436792 elapsed 0.275139 user 0.157033 system sec > ! [0.283135 user 0.173026 sys total] > ! 0/0 [0/0] filesystem blocks in/out > ! 0/149 [0/332] page faults/reclaims, 0 [0] swaps > ! 0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent > ! 0/8 [2/9] voluntary/involuntary context switches > ! postgres usage stats: > ! Shared blocks: 1403 read, 0 written, buffer hit rate > = 51.22% > ! Local blocks: 0 read, 0 written, buffer hit rate > = 0.00% > ! Direct blocks: 0 read, 0 written > CommitTransactionCommand > proc_exit(0) [#0] > shmem_exit(0) [#0] > exit(0) > > > ______________________________________________________________________ > > 7.0 : > > StartTransactionCommand > query: select * from applicants as a,applicants_states as s where > a.firstname ~* > '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0; > ProcessQuery > ! system usage stats: > ! 1.461997 elapsed 1.224377 user 0.234618 system sec > ! [1.238219 user 0.255382 sys total] > ! 0/12 [0/12] filesystem blocks in/out > ! 0/60 [0/318] page faults/reclaims, 0 [0] swaps > ! 0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent > ! 0/22 [1/24] voluntary/involuntary context switches > ! postgres usage stats: > ! Shared blocks: 2713 read, 0 written, buffer hit rate > = 25.34% > ! Local blocks: 0 read, 0 written, buffer hit rate > = 0.00% > ! Direct blocks: 0 read, 0 written > CommitTransactionCommand > proc_exit(0) > shmem_exit(0) > exit(0) > /usr/local/pgsql/bin/postmaster: reaping dead processes... > /usr/local/pgsql/bin/postmaster: CleanupProc: pid 60606 exited with status 0 > > > > Sorry, I didn't get what you wanted at first. I'm down to my last brain cell > today. > > > > -Mitch > Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
"Mitch Vincent" <mitch@huntsvilleal.com> writes: > Hmm, something else that I see that might break some code between 6.5.* and > 7.0 (it did mine) is the fact that dates and date times are returned the > reverse that they use to be. ISO format to be exact... not just the reverse order, but different layout. It's always been possible to break unsuspecting apps by changing the DATESTYLE; I think it's good practice for an app to set DATESTYLE for itself, if it's dependent on a particular setting. > No big deal but just out of curiosity, why the change? ISO and Y2K political correctness ;-). If you don't like it, set DATESTYLE to the old default ('Postgres' I think). It's also possible to change the system-wide default with a suitable postmaster switch, but I forget the details. regards, tom lane
Mitch Vincent wrote:
>
> 7.0 :
>
> StartTransactionCommand
> query: select * from applicants as a,applicants_states as s where
> a.firstname ~*
> '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0;
> ProcessQuery
> ! system usage stats:

With the number of records in the applicants file (and the probable distribution of firstnames), the most efficient query under 7.0 will work heaps better if you have that index on lower(a.firstname) and stop using ~* (i.e. just use ~). I think this will be especially the case with your 63 fields, ~10,000 records.

I'm guessing that a significant portion of those fields are TEXT or VARCHAR, so record size will be creeping up. The best way to see/show all of the information on this table is to:

	VACUUM VERBOSE ANALYZE applicants;

rather than to just:

	SELECT COUNT(*) FROM applicants;

because you/we will get to see the average record size as well.

Also, I believe I once read that putting all of the fixed-length fields at the start of the record will make for faster access, especially when a scan is being done against those fields. (Can someone confirm this one? :-)

Do you keep statistics regarding what fields people actually _use_ for their query matching? If you can construct indexes to support those frequent queries then you will find huge speed improvements. These speed improvements won't degrade as you add more records, either (at least not to the same extent). You'll probably also find that people will use those query terms more often once they twig to how much more quickly the results come back to them!

Another point: 7.0 will much more frequently choose indexes when you use the LIMIT clause, as you do.

Cheers,
					Andrew McMillan.
--
_____________________________________________________________________
Andrew McMillan, e-mail: Andrew@cat-it.co.nz
Catalyst IT Ltd, PO Box 10-225, Level 22, 105 The Terrace, Wellington
Me: +64 (21) 635 694, Fax: +64 (4) 499 5596, Office: +64 (4) 499 2267
"Mitch Vincent" <mitch@huntsvilleal.com> writes: >> So, may I have the envelope please? What's the timings? > 6.5.3: > ! system usage stats: > ! 0.436792 elapsed 0.275139 user 0.157033 system sec > ! [0.283135 user 0.173026 sys total] > ! 0/0 [0/0] filesystem blocks in/out > ! 0/149 [0/332] page faults/reclaims, 0 [0] swaps > ! 0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent > ! 0/8 [2/9] voluntary/involuntary context switches > ! postgres usage stats: > ! Shared blocks: 1403 read, 0 written, buffer hit rate > = 51.22% > ! Local blocks: 0 read, 0 written, buffer hit rate > = 0.00% > ! Direct blocks: 0 read, 0 written > 7.0 : > ! system usage stats: > ! 1.461997 elapsed 1.224377 user 0.234618 system sec > ! [1.238219 user 0.255382 sys total] > ! 0/12 [0/12] filesystem blocks in/out > ! 0/60 [0/318] page faults/reclaims, 0 [0] swaps > ! 0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent > ! 0/22 [1/24] voluntary/involuntary context switches > ! postgres usage stats: > ! Shared blocks: 2713 read, 0 written, buffer hit rate > = 25.34% > ! Local blocks: 0 read, 0 written, buffer hit rate > = 0.00% > ! Direct blocks: 0 read, 0 written Well, drat. Looks like 7.0's query plan is slower :-(. There's something fishy about the numbers for 6.5.3 though --- how could it have done that query with zero blocks read? Are you sure you are comparing apples to apples here? I wonder whether the 6.5 system already had the tables cached in kernel disk buffers while 7.0 was working from a standing start and had to physically go to the disk. Also, did both versions have the same -B and -S settings? regards, tom lane
> "Mitch Vincent" <mitch@huntsvilleal.com> writes: > > Well, there's still a cost to having a lot of seldom-used indexes, > because the planner has to sit there and consider whether to use each > one for each query. So I'd still recommend looking at your mix of > queries and only creating indexes that match reasonably commonly-used > WHERE clauses. > When doing insert/updates on larger tables (>500.000 entries) these indexes are also time consuming ! For our vertical attribute object storage systems we noticed, that the time for insert/updates are the critical part - they behave very linear in our test suite and they seem to be the limiting factor in our system. Marten
On Wed, 3 May 2000, Mitch Vincent wrote: > > So, may I have the envelope please? What's the timings? > > 6.5.3: > > StartTransactionCommand > query: select * from applicants as a,applicants_states as s where > a.firstname ~* '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 > offset 0; > ProcessQuery > ! system usage stats: > ! 0.436792 elapsed 0.275139 user 0.157033 system sec > ! [0.283135 user 0.173026 sys total] > ! 0/0 [0/0] filesystem blocks in/out > ! 0/149 [0/332] page faults/reclaims, 0 [0] swaps > ! 0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent > ! 0/8 [2/9] voluntary/involuntary context switches How is this to be read? I'm looking at it, and reading it as: 0 - voluntary 8 - involuntary But what about the [2/9]? Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker <scrappy@hub.org> writes: >> ! system usage stats: >> ! 0.436792 elapsed 0.275139 user 0.157033 system sec >> ! [0.283135 user 0.173026 sys total] >> ! 0/0 [0/0] filesystem blocks in/out >> ! 0/149 [0/332] page faults/reclaims, 0 [0] swaps >> ! 0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent >> ! 0/8 [2/9] voluntary/involuntary context switches > How is this to be read? I'm looking at it, and reading it as: > 0 - voluntary > 8 - involuntary > But what about the [2/9]? I believe the numbers outside brackets are for the particular query cycle, and the ones in brackets are total for the process (ie, total since backend start). I didn't design the printout format ;-) ... not real sure what the point is of printing the total-since-start numbers ... regards, tom lane
On Thu, 4 May 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> >> ! system usage stats:
> >> !	0.436792 elapsed 0.275139 user 0.157033 system sec
> >> !	[0.283135 user 0.173026 sys total]
> >> !	0/0 [0/0] filesystem blocks in/out
> >> !	0/149 [0/332] page faults/reclaims, 0 [0] swaps
> >> !	0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent
> >> !	0/8 [2/9] voluntary/involuntary context switches
>
> > How is this to be read?  I'm looking at it, and reading it as:
>
> > 0 - voluntary
> > 8 - involuntary
>
> > But what about the [2/9]?
>
> I believe the numbers outside brackets are for the particular query
> cycle, and the ones in brackets are total for the process (ie, total
> since backend start).
>
> I didn't design the printout format ;-) ... not real sure what the
> point is of printing the total-since-start numbers ...

Okay, that explains that :)

Now, Mitch's results for v7.0 showed something like:

	0/12 filesystem blocks in/out

You interpreted that as 12 reads from the file system ... 'out' I would have interpreted as writes to the file system, which made zero sense ... do we have our 'in/out's backwards here?

One thing that would be nice (who wrote these stats?) would be some way to be able to determine a suitable setting for -S from this ... some way to know that an ORDER BY needed to swap to disk because it needed 32Meg when only 16Meg was allocated for it ... would help give an indication where performance could be improved by either just raising -S (in Mitch's case, where lots of RAM is available) or more RAM should be added ...

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker <scrappy@hub.org> writes:
> Now, Mitch's results for v7.0 showed something like:
> 	0/12 filesystem blocks in/out
> You interpreted that as 12 reads from the file system ... 'out' I would
> have interpreted as writes to the file system, which made zero sense
> ... do we have our 'in/out's backwards here?

Good point.  Writes from a SELECT are certainly possible --- the SELECT
could be writing tuple status-flag updates, if it was the first
transaction to verify commit status of tuples created by a prior
transaction.  But that again raises the issue of whether we've got
a fair comparison.  The 6.5 test apparently only saw already-marked-committed
tuples ...

			regards, tom lane
On Thu, 4 May 2000, Tom Lane wrote: > The Hermit Hacker <scrappy@hub.org> writes: > > Now, Mitch's results for v7.0 showed something like: > > 0/12 filesystem blocks in/out > > You intepreted that as 12 reads from the file system ... 'out' I would > > have interpreted as writes to the file system, which made zero sense > > ... do we have our 'in/out's backwards here? > > Good point. Writes from a SELECT are certainly possible --- the SELECT > could be writing tuple status-flag updates, if it was the first > transaction to verify commit status of tuples created by a prior > transaction. But that again raises the issue of whether we've got > a fair comparison. The 6.5 test apparently only saw already-marked- > committed tuples ... I was hoping that Mitch would have spoken up by now about it, but an email I saw from him stated that the v7.0 machine (development) wasn't as powerful as the v6.5.3 machine (production) ... that might account for it, I just don't know how much different the two machines are ... Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
At 02:36 AM 5/4/00 -0300, The Hermit Hacker wrote:
>One thing that would be nice (who wrote these stats?) would be some way to
>be able to determine a suitable setting for -S from this ... some way to
>know that an ORDER BY needed to swap to disk because it needed 32Meg when
>only 16Meg was allocated for it ... would help give an indication where
>performance could be improved by either just raising -S (in Mitch's case,
>where lots of RAM is available) or more RAM should be added ...

It would also be nice to be able to get at these via SQL, a la Oracle. Then toolkits like OpenACS could easily generate administration pages that present the stats nicely to webmasters, including perhaps putting up links to help pages. If sorting is going to disk, a link to a short page discussing raising -S could be put up.

ACS Classic, the Oracle version, has admin pages (without help; it points you to a bookshelf of Oracle DBA books instead) which display stats, etc., and it is very useful.

A lot of folks using toolkits like this will simply customize the look and feel of the web pages and, at first at least, won't know much or anything about SQL or Postgres and need all the help we can give them.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at http://donb.photo.net.
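Purely to illustrate the idea being wished for here -- no such view exists in 7.0, and the view and column names below are invented:

-- hypothetical: what an SQL-visible stats interface might look like to a toolkit
select name, value
  from pg_backend_stats
 where name in ('sort_disk_spills', 'shared_buffer_hit_rate');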
Ok, the production server is a Celeron 433, 512 Megs of PC100 ECC RAM, 2 18 Gig Ultra 160 SCSI drives (only running at 80 megs since we don't have a 64-bit PCI slot).

There is a big upgrade planned for this box when the motherboard we're waiting for comes out..

The development server is a PII450, 128 Megs of RAM and Ultra 2 SCSI drives.. However it's running several other things other than the database and webserver so it has a small load.

Sorry for not replying sooner, had a small crisis yesterday evening :-)

- Mitch

----- Original Message -----
From: The Hermit Hacker <scrappy@hub.org>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Mitch Vincent <mitch@huntsvilleal.com>; <pgsql-hackers@postgresql.org>
Sent: Thursday, May 04, 2000 1:51 AM
Subject: Re: system usage stats (Was: Re: [HACKERS] Why Not MySQL? )

> On Thu, 4 May 2000, Tom Lane wrote:
>
> > The Hermit Hacker <scrappy@hub.org> writes:
> > > Now, Mitch's results for v7.0 showed something like:
> > > 0/12 filesystem blocks in/out
> > > You interpreted that as 12 reads from the file system ... 'out' I would
> > > have interpreted as writes to the file system, which made zero sense
> > > ... do we have our 'in/out's backwards here?
> >
> > Good point.  Writes from a SELECT are certainly possible --- the SELECT
> > could be writing tuple status-flag updates, if it was the first
> > transaction to verify commit status of tuples created by a prior
> > transaction.  But that again raises the issue of whether we've got
> > a fair comparison.  The 6.5 test apparently only saw already-marked-
> > committed tuples ...
>
> I was hoping that Mitch would have spoken up by now about it, but an email
> I saw from him stated that the v7.0 machine (development) wasn't as
> powerful as the v6.5.3 machine (production) ... that might account for it,
> I just don't know how much different the two machines are ...
>
> Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
> Systems Administrator @ hub.org
> primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
>
On Thu, 4 May 2000, Mitch Vincent wrote:

> Ok, the production server is a Celeron 433, 512 Megs of PC100 ECC RAM, 2 18
> Gig Ultra 160 SCSI drives (only running at 80 megs since we don't have a
> 64-bit PCI slot).
>
> There is a big upgrade planned for this box when the motherboard we're
> waiting for comes out..
>
> The development server is a PII450, 128 Megs of RAM and Ultra 2 SCSI
> drives.. However it's running several other things other than the database
> and webserver so it has a small load.

Wait, correct me if I'm wrong, but the more powerful CPU is in your development server?

My understanding is that a Celeron is a chop'd up PII ... my first recommendation here is that if you are running a *server*, get rid of that Celeron ... from what I've been told about the difference, Celeron is a great, cheap chip for using in a desktop environment (it's what I use at home), but shy away from it in a server environment, as the speed reduction of the reduced cache alone will hurt things ...
> > Ok, the production server is a Celeron 433, 512 Megs of PC100 ECC RAM, 2 18
> > Gig Ultra 160 SCSI drives (only running at 80 megs since we don't have a
> > 64-bit PCI slot).
> >
> > There is a big upgrade planned for this box when the motherboard we're
> > waiting for comes out..
> >
> > The development server is a PII450, 128 Megs of RAM and Ultra 2 SCSI
> > drives.. However it's running several other things other than the database
> > and webserver so it has a small load.
>
> Wait, correct me if I'm wrong, but the more powerful CPU is in your
> development server?
>
> My understanding is that a Celeron is a chop'd up PII ... my first
> recommendation here is that if you are running a *server*, get rid of that
> Celeron ... from what I've been told about the difference, Celeron is a
> great, cheap chip for using in a desktop environment (it's what I use at
> home), but shy away from it in a server environment, as the speed
> reduction of the reduced cache alone will hurt things ...

Ooooooooooh you're preaching to the choir. I know, I'm arguing with someone about this as we speak.

A Celeron is basically a PII with 128k of full-speed cache. NOT a server processor, I know, but sadly I can't do much about it at this point.. It's my understanding that "we" want to wait for an AMD board that has a 64-bit PCI slot because "we" don't like the i840 chipset for some reason "we" can't understand.

As you can tell, I'm a bit upset about the whole thing..

-Mitch
On Thu, 4 May 2000, Mitch Vincent wrote: > > > Ok, the production server is a Celeron 433, 512 MEgs of PC100 ECC RAM, 2 > 18 > > > Gig Ultra 160 SCSI drives (only running at 80 megs since we dono't have > a > > > 64-bit PCI slot). > > > > > > There is a big upgrade planned for this box when the motherboard we're > > > waiting for comes out.. > > > > > > The development server is a PII450, 128 Megs of RAM and Ultra 2 SCSI > > > drives.. However it's running several other things other than the > database > > > and webserver so it has a small load. > > > > Wait, correct me if I'm wrong, but the more powerful CPU is in your > > development server? > > > > My understanding is that a Celeron is a chop'd up PII ... my first > > recommendation here is that if you are running a *server*, get rid of that > > Celeron ... from what I've been told about the difference, Celeron is a > > great, cheap chip for using in a desktop environment (its what I use at > > home), but shy away from it in a server environment, as the speed > > reduction of the reduced cache alone will hurt things ... > > Ooooooooooh you're preaching to the choir. I know, I'm argueing with someone > about this as we speak. > > A Celeron is basically a PII with 128k of full-speed cache. NOT a > server processor, I know but sadly I can't do much about it at this > point.. It's my understanding that "we" want to wait for an AMD board > that has a 64-bit PCI slot because "we" don't like the i840 chipset > for some reason "we" can't understand. Can someone out there that understands CPUs help me out here? My understanding is that Intel vs AMD has benefits depending on use. Unfortunately, I don't recall how it goes ... as a server, Intel is faster, and for graphics processing, AMD is ... or something like that?
At 10:33 AM 5/4/00 -0300, The Hermit Hacker wrote:

>Wait, correct me if I'm wrong, but the more powerful CPU is in your
>development server?
>
>My understanding is that a Celeron is a chop'd up PII ... my first
>recommendation here is that if you are running a *server*, get rid of that
>Celeron ... from what I've been told about the difference, Celeron is a
>great, cheap chip for using in a desktop environment (it's what I use at
>home), but shy away from it in a server environment, as the speed
>reduction of the reduced cache alone will hurt things ...

Celerons have a smaller L2 cache (128K) than PIIs (512K), but it runs full-speed rather than 1/2 speed like the PII cache. Current models aren't "chopped up" in any sense; they're the same core with a smaller but faster cache.

So, applications that have a high cache hit rate can actually run faster on the Celeron.

New Coppermine PIII's (those that end in E or are > 600 MHz) have 256K full-speed cache, the Coppermine-based Celeron II's 128K full-speed. Yes, they cut the cache size in half compared to PII's and non-E PIII's (Katmai cores), but it's full-speed, which turns out to be a win for nearly all applications. Other than cache size and FSB/memory bus speed the new Celerons and PIII's are identical.

All Celerons run with 66 MHz FSB and RAM, current Coppermines with 100 MHz RAM (even those with a 133 MHz front-side bus) or spendy RDRAM, which almost no one is buying.

So, what's the bottom line? The numbers don't tell us much, though I still think Tom's right that the PG 7.0 one is really slower. You just can't say how MUCH slower.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at http://donb.photo.net.
Well well. Mitch gets his dual PIII board/processors -- I guess I sold the idea to the boss-man :-) - Mitch ----- Original Message ----- From: Don Baccus <dhogaza@pacifier.com> To: The Hermit Hacker <scrappy@hub.org>; Mitch Vincent <mitch@huntsvilleal.com> Cc: <pgsql-hackers@postgresql.org> Sent: Thursday, May 04, 2000 9:55 AM Subject: Re: system usage stats (Was: Re: [HACKERS] Why Not MySQL? ) > At 10:33 AM 5/4/00 -0300, The Hermit Hacker wrote: > > >Wait, correct me if I'm wrong, but the more powerful CPU is in your > >development server? > > > >My understanding is that a Celeron is a chop'd up PII ... my first > >recommendation here is that if you are running a *server*, get rid of that > >Celeron ... from what I've been told about the difference, Celeron is a > >great, cheap chip for using in a desktop environment (its what I use at > >home), but shy away from it in a server environment, as the speed > >reduction of the reduced cache alone will hurt things ... > > Celerons have a smaller L2 cache (128K) than PIIs (512K), but it runs > full-speed rather than 1/2 speed like the PII cache. Current models > aren't "chopped up" in any sense, they're the same core with a smaller > but faster cache. > > So, applications that have a high cache hit rate can actually run faster > on the Celeron. > > New Coppermine PIII's (those that end in E or are > 600 MHz) have > 256K full-speed cache, the Coppermine-based Celeron II's 128K > full-speed. Yes, they cut the cache size in half compared to > PII's and non-E PIII's (Katmai cores) but it's full-speed, which > turns out to be a win for nearly all applications. Other than > cache size and FSB/memory bus speed the new Celerons and PIII's > are identical. > > All Celerons run with 66 MHz FSB and RAM, current Coppermines with > 100 MHz RAM (even those with a 133MHz front-side bus) or spendy > RDRAM which almost no one is buying. > > So, what's the bottom line? The numbers don't tell us much, > though I still think Tom's right that the PG7.0 one is really > slower. You just can't say if how MUCH slower. > > > > - Don Baccus, Portland OR <dhogaza@pacifier.com> > Nature photos, on-line guides, Pacific Northwest > Rare Bird Alert Service and other goodies at > http://donb.photo.net. >
> > A Celeron is basically a PII with 128k of full-speed cache. NOT a
> > server processor, I know but sadly I can't do much about it at this
> > point.. It's my understanding that "we" want to wait for an AMD board
> > that has a 64-bit PCI slot because "we" don't like the i840 chipset
> > for some reason "we" can't understand.

> Can someone out there that understands CPUs help me out here? My
> understanding is that Intel vs AMD has benefits depending on
> use. Unfortunately, I don't recall how it goes ... as a server, Intel is
> faster, and for graphics processing, AMD is ... or something like that?

From what I've read, the extra cache in the PII/III gives you a 5% boost over the Celeron (I'll guess more for some server apps). Intel still sells the Xeon chips, which have a cache twice as big as the PII/III, but I'm not sure the clock has kept pace and it was always overpriced wrt performance.

I don't remember which way the Intel/AMD thing goes, but most folks won't notice a 5% difference in speed.

Not that anyone asked, but imho the best price/performance x86 machine has always been a dual processor box one or two clock jumps behind the fastest available. You get ~80% more performance for ~5% more cost than a uniprocessor at the fastest speed. I haven't looked recently to see if there are now uniprocessor machines at the low end that can beat the price/performance of the dual-processor setup.

                      - Thomas

--
Thomas Lockhart				lockhart@alumni.caltech.edu
South Pasadena, California
At 10:54 AM 5/4/00 -0300, The Hermit Hacker wrote: >Can someone out there that understands CPUs help me out here? My >understanding is that Intel vs AMD has benefits depending on >use. Unfortunately, I don't recall how it goes ... as a server, Intel is >faster, and for graphics processing, AMD is ... or something like that? First, take a look at my earlier post (that I didn't send to the list because it seemed irrelevant) about some basic issues regarding the i840. For servers, I think the bottom line at the moment is that the Athlon isn't available in SMP, and won't be 'til later this year. And the Athlon boards haven't been out long, and they're tricky to make, and require high quality power supplies. All and all, I'd look closely at an Athlon for a personal workstation but not a server at this point. For a low-end server (say, static web pages) an i810 or i810E with on-board video is a compact solution that fits nicely in 2U rack cases. A Celeron 366 does a great job for this kind of application, cheaply (see http://donb.photo.net for an example of a C366 at work). Of course, you can only buy faster ones now, they'll do the job even better. For a high-end db server, you probably want an SMP board, I'd guess. Even if you only put in a single processor, this gives the option of adding a second one if load grows and you've got enough RAM, etc. That rules out the Athlon at the moment. As far as head-to-head comparisons of current PIII coppermines and Athlons...the Athlon has 1/2 or 2/5 speed cache (slower speeds at the high end 750 MHz+ I believe), the coppermine full-speed, on-die cache. But the Athlon has more L1 cache. And the Athlon has faster FPP, but more graphics stuff is optimized for Intel. Price performance wise, the Athlon smokes the coppermines as you can buy a much faster part for the same price. Absolute performance wise, in theory a 1 GHz coppermine beats a 1 GHz Athlon for almost anything, but you can't get them unless your name is Michael Dell, etc. For build-your-own types or small systems integrators they don't really exist. By fall, it won't matter. The new generation Intel chips start showing up, Athlon on-die full-speed cache version shows up making the current part look like a slouch, and new generation AMD parts start showing up. We'll all realize our current computers suck, even though they've got more power than most of us can figure out how to use. And ... maybe an Intel solution to the i820/i840 fiasco. We can always hope. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
On Thu, 4 May 2000, Thomas Lockhart wrote: > > > A Celeron is basically a PII with 128k of full-speed cache. NOT a > > > server processor, I know but sadly I can't do much about it at this > > > point.. It's my understanding that "we" want to wait for an AMD board > > > that has a 64-bit PCI slot because "we" don't like the i840 chipset > > > for some reason "we" can't understand. > > Can someone out there that understands CPUs help me out here? My > > understanding is that Intel vs AMD has benefits depending on > > use. Unfortunately, I don't recall how it goes ... as a server, Intel is > > faster, and for graphics processing, AMD is ... or something like that? > > >From what I've read, the extra cache in the PII/III gives you a 5% > boost over the Celeron (I'll guess more for some server apps). Intel > still sells the Xeon chips, which have a cache twice as big as the > PII/III, but I'm not sure the clock has kept pace and it was always > overpriced wrt performance. > > I don't remember which way the Intel/AMD thing goes, but most folks > won't notice a 5% difference in speed. > > Not that anyone asked, but imho the best price/performance x86 machine > has always been a dual processor box one or two clock jumps behind the > fastest available. You get ~80% more performance for ~5% more cost > than a uniprocessor at the fastest speed. I haven't looked recently to > see if there are now uniprocessor machines at the low end that can > beat the price/performance of the dual-processor setup. this is the sort of thing I've been moving towards as well ... went a little cheaper this last time with upgrading my home machine and have a Dual-Celeron ... quite happy with her so far ...
At 02:26 PM 5/4/00 +0000, Thomas Lockhart wrote:

>From what I've read, the extra cache in the PII/III gives you a 5%
>boost over the Celeron (I'll guess more for some server apps).

For a db server the 100 MHz memory bus of the PII/III probably wins more than 5% over the 66 MHz memory bus of the Celeron. This assumes your database is reasonably big. Lots of memory transfers going on...

>Intel
>still sells the Xeon chips, which have a cache twice as big as the
>PII/III, but I'm not sure the clock has kept pace and it was always
>overpriced wrt performance.

Katmai Xeons come with cache sizes ranging from 512K (the same as a PII/PIII Katmai) to 2MB. That cache, though, is FULL SPEED. Gets you about 10% over a PII/III Katmai for server-type benchmarks I've seen. But they're expensive. You can also go 4-way SMP with them, vs. 2-way with PII/III...

Now the new Coppermines have changed things... the Xeon Cu and PII/III Cu both have identical 256K full-speed cache. There's not much reason to buy the Xeon unless you want 4-way SMP, and Intel apparently recognizes this, because the price is only 10% higher for these new parts.

>Not that anyone asked, but imho the best price/performance x86 machine
>has always been a dual processor box one or two clock jumps behind the
>fastest available.

That's what I did, a dual PII450. I bought them (boxed, fan) for $180 each when PIII450s were $250 or so and PIII 500Es $299 (I bought one of the latter for a home workstation). So it's a PII without the latest matrix instructions for graphics? How many games will I run on my web/db server? :)

>You get ~80% more performance for ~5% more cost
>than a uniprocessor at the fastest speed. I haven't looked recently to
>see if there are now uniprocessor machines at the low end that can
>beat the price/performance of the dual-processor setup.

The Athlons do, actually ... but I wouldn't trust the mobos for a remote server at this point, not until they're proven stable. They're tricky to build and finicky about power supplies.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at http://donb.photo.net.
Don Baccus <dhogaza@pacifier.com> writes: > So, what's the bottom line? The numbers don't tell us much, > though I still think Tom's right that the PG7.0 one is really > slower. You just can't say if how MUCH slower. Actually I was a lot more concerned about disk performance than CPU speed. I notice no one's said anything about the relative speed of Mitch's two different disk setups ... regards, tom lane
On Thu, May 04, 2000 at 11:06:20AM -0400, Tom Lane wrote: > Don Baccus <dhogaza@pacifier.com> writes: > > So, what's the bottom line? The numbers don't tell us much, > > though I still think Tom's right that the PG7.0 one is really > > slower. You just can't say if how MUCH slower. > > Actually I was a lot more concerned about disk performance than CPU > speed. I notice no one's said anything about the relative speed > of Mitch's two different disk setups ... > Mitch wrote: > Ok, the production server is a Celeron 433, 512 MEgs of PC100 ECC RAM, 2 18 > Gig Ultra 160 SCSI drives (only running at 80 megs since we dono't have a > 64-bit PCI slot). > > There is a big upgrade planned for this box when the motherboard we're > waiting for comes out.. > > The development server is a PII450, 128 Megs of RAM and Ultra 2 SCSI > drives.. However it's running several other things other than the database > and webserver so it has a small load. > So, 6.5.3 is running on Ultra 160 drives, with the controller throttled to 80 MB/s, and 7.0 is running on Ultra 2 drives, which also has a controller maximum of 80 MB/s. However, the sustained transfer speed of the drives themselves are what should be limiting: if they're all relatively modern drives, 20MB/s is typical, so neither config will max out the controller. (2 drives each, right?) Ross -- Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu> NSBRI Research Scientist/Programmer Computer and Information Technology Institute Rice University, 6100 S. Main St., Houston, TX 77005
At 11:03 AM 5/4/00 -0500, Ross J. Reedstrom wrote: >So, 6.5.3 is running on Ultra 160 drives, with the controller throttled to >80 MB/s, and 7.0 is running on Ultra 2 drives, which also has a controller >maximum of 80 MB/s. However, the sustained transfer speed of the drives >themselves are what should be limiting: if they're all relatively modern >drives, 20MB/s is typical, so neither config will max out the controller. >(2 drives each, right?) Not to mention that seek times make it very difficult to max out a controller even if theoretically possible with four drives, unless you're striping and doing large transfers or lookaheads, etc. If one's got 10K drives and the other 7.2K drives, you'll certainly see a difference in transfer rate and seek time. So ... what are the disk configurations? - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
Production 10k, development 7.2k RPM :-) - Mitch ----- Original Message ----- From: Don Baccus <dhogaza@pacifier.com> To: Ross J. Reedstrom <reedstrm@wallace.ece.rice.edu>; Tom Lane <tgl@sss.pgh.pa.us> Cc: <pgsql-hackers@postgresql.org> Sent: Thursday, May 04, 2000 1:11 PM Subject: Re: system usage stats (Was: Re: [HACKERS] Why Not MySQL? ) > At 11:03 AM 5/4/00 -0500, Ross J. Reedstrom wrote: > > >So, 6.5.3 is running on Ultra 160 drives, with the controller throttled to > >80 MB/s, and 7.0 is running on Ultra 2 drives, which also has a controller > >maximum of 80 MB/s. However, the sustained transfer speed of the drives > >themselves are what should be limiting: if they're all relatively modern > >drives, 20MB/s is typical, so neither config will max out the controller. > >(2 drives each, right?) > > Not to mention that seek times make it very difficult to max out > a controller even if theoretically possible with four drives, unless > you're striping and doing large transfers or lookaheads, etc. > > If one's got 10K drives and the other 7.2K drives, you'll certainly > see a difference in transfer rate and seek time. > > So ... what are the disk configurations? > > > > - Don Baccus, Portland OR <dhogaza@pacifier.com> > Nature photos, on-line guides, Pacific Northwest > Rare Bird Alert Service and other goodies at > http://donb.photo.net. >
> Well, drat. Looks like 7.0's query plan is slower :-(. There's > something fishy about the numbers for 6.5.3 though --- how could it have > done that query with zero blocks read? Are you sure you are comparing > apples to apples here? I wonder whether the 6.5 system already had the > tables cached in kernel disk buffers while 7.0 was working from a > standing start and had to physically go to the disk. This is very possible as the 6.5.3 PG is running on the production server which is constantly being queried. >Also, did both > versions have the same -B and -S settings? I didn't specify any -B or -S settings so both are using their respective defaults.. Thanks!
On Thu, 4 May 2000, Mitch Vincent wrote:

> > Well, drat.  Looks like 7.0's query plan is slower :-(.  There's
> > something fishy about the numbers for 6.5.3 though --- how could it have
> > done that query with zero blocks read?  Are you sure you are comparing
> > apples to apples here?  I wonder whether the 6.5 system already had the
> > tables cached in kernel disk buffers while 7.0 was working from a
> > standing start and had to physically go to the disk.
>
> This is very possible as the 6.5.3 PG is running on the production server
> which is constantly being queried.
>
> > Also, did both
> > versions have the same -B and -S settings?
>
> I didn't specify any -B or -S settings so both are using their respective
> defaults..

For your machine, go with something like '-S 32768' (i.e. 32*1024) to use 32Meg of RAM for ORDER BY/GROUP BY ...

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
I was just playing with some of the suggested lower() indexes and ran into a bit of trouble, it's no doubt from my lack of understanding but still, I wanted to ask.. Do dum : ipa=# create index applicants_firstname on applicants(lower(firstname)); ERROR: DefineIndex: function 'lower(varchar)' does not exist ...that syntax is right, isn't it? Thanks! - Mitch ----- Original Message ----- From: Tom Lane <tgl@sss.pgh.pa.us> To: Mitch Vincent <mitch@huntsvilleal.com> Cc: <pgsql-hackers@postgresql.org> Sent: Wednesday, May 03, 2000 1:31 PM Subject: Re: [HACKERS] Why Not MySQL? > "Mitch Vincent" <mitch@huntsvilleal.com> writes: > > Here are some typical queries my application might generate. Please, let me > > know if you see anything that can be improved! > > > select * from applicants as a where a.created::date = '05-01-2000' and > > a.firstname ~* '^mitch' limit 10 offset 0 > > Neither of these WHERE clauses can be used with a plain-vanilla index > (I'm assuming a.created is of time datetime?), so you're getting a > simple sequential scan over the whole table --- unless the LIMIT stops > it sooner. If the table is large then you could get better performance > by arranging for an indexscan using whichever clause is likely to be > more selective (I'd be inclined to go for the date, I think, unless your > creation dates come in bunches). > > The trick for the date test would be to have a functional index on > date(a.created). I'm not sure how bright 6.5.* is about this, but > it definitely works in 7.0: > > create table foo (f1 datetime); > > -- a straight index on f1 is no help: > create index foof1 on foo(f1); > explain select * from foo where f1::date = '05-01-2000'; > NOTICE: QUERY PLAN: > > Seq Scan on foo (cost=0.00..25.00 rows=10 width=8) > > -- but an index on date(f1) is: > create index foof1date on foo(date(f1)); > explain select * from foo where f1::date = '05-01-2000'; > NOTICE: QUERY PLAN: > > Index Scan using foof1date on foo (cost=0.00..8.16 rows=10 width=8) > > If you wanted to make the scan on firstname indexable, you'd need to > make an index on lower(firstname) and then change the query to read > ... lower(a.firstname) ~ '^mitch' > or possibly > ... lower(a.firstname) ~ lower('^mitch') > if you don't want to assume the given pattern is lowercase to begin > with. (The second example will fail to be indexed under 6.5, but should > be just fine in 7.0.) ~* can't use an index under any circumstance, > but ~ can if the pattern has a left-anchored fixed prefix. > > > select * from applicants as a,applicants_states as s where a.firstname ~* > > '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0 > > Again, the ~* clause is not indexable as-is, but the rstate clause > would be if you have an index on s.rstate --- however, I imagine that > it wouldn't be very selective, either, so it might not be worth the > trouble. Changing the query to make the firstname part be indexable > could be a win. You also need to look at how the join between a and s > is being done. How big are these tables, anyway? > > > .... There are 63 fields in the 'applicants' table, all of which are > > searchable. Would it be a good or bad thing to index all fields that are > > searchable? > > A lot of indexes will hurt your insert/update/delete times, so I > wouldn't recommend having a whole bunch of indexes unless searches are > far more frequent than changes. What you want is a few well-chosen > indexes that match the commonly used kinds of WHERE clauses in your > query mix. 
> > > I'd love some pointers! This machine has lots-n-lots of memory. I'd love to > > make postgre use more than normal if it would get me better speed! > > Increase postmaster's -B and -S settings ... > > regards, tom lane >
"Mitch Vincent" <mitch@huntsvilleal.com> writes: > ipa=# create index applicants_firstname on applicants(lower(firstname)); > ERROR: DefineIndex: function 'lower(varchar)' does not exist > ...that syntax is right, isn't it? Hmm, that's annoying. I guess you are going to have to make that field be of type text. Actually, since text and varchar look the same under the hood, the existing lower() code would work just fine on varchar. One fix for this would be to add a pg_proc entry for lower(varchar), which you could do by hand if you wanted: regression=# create index f1lower on f1v (lower(f1)); ERROR: DefineIndex: function 'lower(varchar)' does not exist regression=# create function lower(varchar) returns text as 'lower' regression-# language 'internal' with (iscachable); CREATE regression=# select * from pg_proc where proname = 'lower';proname | proowner | prolang | proisinh | proistrusted | proiscachable| pronargs | proretset | prorettype | proargtypes | probyte_pct | properbyte_cpu | propercall_cpu | prooutin_ratio| prosrc | probin ---------+----------+---------+----------+--------------+---------------+----------+-----------+------------+-------------+-------------+----------------+----------------+----------------+--------+--------lower | 256 | 11 | f | t | t | 1 | f | 25 | 25 | 100 | 0 | 0 | 100 | lower | -lower | 256 | 11 | f | t | t | 1 | f | 25 | 1043 | 100 | 0 | 0 | 100 | lower | - (2 rows) -- ok, looks like I got it right ... regression=# create index f1lower on f1v (lower(f1)); CREATE This will be a tiny bit slower than if the function were really truly built-in, but it should work well enough. But since type varchar is considered binary-compatible with type text, you shouldn't have had to create the extra function entry. It looks like the indexing routines do not pay attention to binary type compatibility when looking up functions for functional indexes. I'm not going to try fixing that now, but it's something that should be on the TODO list: * Functional indexes should allow functions on binary-compatible types regards, tom lane
Excellent, Tom. Thanks!

It went from

ipa=# explain select * from applicants as a where a.firstname ~* '^mitch' limit 10 offset 0;
NOTICE:  QUERY PLAN:

Seq Scan on applicants a  (cost=0.00..1355.54 rows=98 width=599)

EXPLAIN
ipa=#

To....

ipa=# explain select * from applicants as a where lower(a.firstname) ~ lower('^mitch') limit 10 offset 0;
NOTICE:  QUERY PLAN:

Index Scan using applicants_firstname on applicants a  (cost=0.00..228.47 rows=98 width=599)

EXPLAIN

On 7.0 RC5.

Could putting that function in there even though it's not 'built-in' cause any problems that you can think of? (Mainly worried about any kind of index corruption like I've seen before)

Thanks again!

- Mitch
"Mitch Vincent" <mitch@huntsvilleal.com> writes: > Could putting that function in there even though it's not 'built-in' cause > any problems that you can think of? (Mainly worried about any kind of index > corruption like I've seen before) No, shouldn't be a problem. The only thing non "built in" about it is that fmgr.c has to find the function by name rather than by OID (there's a shortcut for the OIDs of the standard pg_proc entries). So the lookup is a tad slower, that's all. regards, tom lane