Thread: Why Not MySQL?
> Actually, I'm changing the link
>
> http://openacs.org/why-not-mysql.html

Moments after forwarding the link to Ben's piece on why MySQL sucks to this list, he e-mailed me the above note.

Sorry for any inconvenience...

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Service and other goodies at http://donb.photo.net.
> > > Actually, I'm changing the link
> > >
> > > http://openacs.org/why-not-mysql.html
> >
> > Moments after forwarding the link to Ben's piece on why
> > MySQL sucks to this list, he e-mailed me the above note.
> >
> > Sorry for any inconvenience...

I am adding this to the FAQ.

--
Bruce Momjian | http://www.op.net/~candle
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
A very, very good article. I love the comment about MySQL being a filesystem with an SQL interface :-)

However.. I'm faced with a huge dilemma.

We use PostgreSQL for a fairly large application I wrote. The database is still pretty small; it carries info on about 25-30,000 people and about 5,000 jobs. Recently we've had huge trouble with PostgreSQL -- it seems that every month I stump someone with the obscure things that happen to our data :-)

From corrupted indexes to corrupted system tables, it's almost always unrecoverable. Luckily I always have a backup to restore from and the world goes on... We've only recently started to notice that the backend is slowing down. It seems that with every additional applicant added it gets exponentially slower... So, sadly I have to go find another backend for this application -- a commercial one too so we can get "commercial support" (yuck)..

So, could you guys suggest some other backends I might look into? I know it's an odd place for me to ask, but the flat truth is that I think *I* am to blame for my Postgres troubles, and even taking all of the problems into account I think PG is the best damn free RDBMS out there. Its functionality is superior to everyone else's, its developers are no less than amazing, and well -- I trust you guys to give me some honest opinions.. The functionality I need is basically what PG has.. Transactions are a must, as well as some sort of sequence -- stability over performance, but performance is very important too. It also needs to run native on FreeBSD..

Oracle is out as we use FreeBSD and someone out there decided that they wouldn't support FreeBSD (in the license as well as in the code!)..

Thanks guys, especially to all who tried to help in private (Don, Tom -- many others)..

-Mitch

----- Original Message -----
From: Bruce Momjian <pgman@candle.pha.pa.us>
To: Don Baccus <dhogaza@pacifier.com>
Cc: <pgsql-hackers@postgresql.org>
Sent: Tuesday, May 02, 2000 5:55 PM
Subject: Re: [HACKERS] Why Not MySQL?

> > > >Actually, I'm changing the link
> > > >
> > > >http://openacs.org/why-not-mysql.html
> >
> > Moments after forwarding the link to Ben's piece on why
> > MySQL sucks to this list, he e-mailed me the above note.
> >
> > Sorry for any inconvenience...
>
> I am adding this to the FAQ.
>
> --
> Bruce Momjian | http://www.op.net/~candle
> pgman@candle.pha.pa.us | (610) 853-3000
> + If your life is a hard drive, | 830 Blythe Avenue
> + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
> exponentially slower... So, sadly I have to go find another backend for this
> application -- a commercial one too so we can get "commercial support"
> (yuck)..

Actually you might want to give Interbase a try. Version 6 is now in beta and is going to be open sourced. It's been around a while and seems pretty solid. There will be a commercial entity to buy support contracts from, and the newsgroups are very helpful. As a bonus there is a relatively large installed user base already, and very, very nice client-side tools available. See http://www.interbase.com/ or http://www.interbase2000.com/ for further details. If I can't get my questions answered about case sensitivity issues here (no help so far) I will most likely use it myself.

> I need is basically what PG has.. Transactions are a must as well as some
> sort of sequence -- stability over performance but performance is very
> important too. It also needs to run native on FreeBSD..

IB supports transactions and sequences. It's very stable and pretty fast. There was a problem with shared memory in the early beta and earlier releases, but it's fixed now. I am pretty sure it runs on FreeBSD, but I am not sure if it runs natively or under Linux emulation.
At 06:28 PM 5/2/00 -0400, Mitch Vincent wrote:

>So, could you guys suggest some other backends I might look into? I know
>it's an odd place for me to ask but the flat truth is that I think *I* am to
>blame for my Postgres troubles and even taking all of the problems into
>account I think PG is the best damn free RDBMS out there. Its functionality
>is superior to everyone else's, its developers are no less than amazing and
>well -- I trust you guys to give me some honest opinions.. The functionality
>I need is basically what PG has.. Transactions are a must as well as some
>sort of sequence -- stability over performance but performance is very
>important too. It also needs to run native on FreeBSD..

First, have you been having the same problems with PG 7.0? I recall that you had it up on a test system but nothing more.

It's a pity that you've reached this point, because PG is so much better than it was 18 months ago (and before, of course; I mention that timeframe because that's roughly when I first investigated its suitability for the web toolkit project) and the trajectory is definitely in the right direction.

It's also a loss to the development effort, as people who hit bugs are in many ways more useful than people who have no problems (though of course having no bugs for users to stumble across is the best situation!) Still, I understand the need to solve your problems today, not tomorrow.

Interbase is a possible solution. They have a pretty good reputation, and their "super server" (threaded, with connections sharing a buffer cache) should scale well. My rough estimate is that they're at about the place PG will be when 7.1 comes out. I don't know if they support FreeBSD, though. Any reason you can't just put up a box with Linux?

There's an older version of Sybase available at no charge -- again only for Linux, though.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Service and other goodies at http://donb.photo.net.
On Tue, 2 May 2000, Tim Uckun wrote: > If I can't get my questions answered about case sensitivity issues here > (no help so far) I will most likely to use it myself. What questions? *raised eyebrow*
As Don asks, what happened with the v7.0 trials you were doing? Corrupted indices, I've seen occasionally in older versions, but I can't recall ever seeing corrupt system tables ... I don't have a GUI browser right, so searching the archives is kinda tough for me :( Can you refresh my memory for me? There has to be something logical to this, as to what the cause for the corruption is :( From Don's comment, I take it you are using FreeBSD? Version? Stability of the machine? Never crashes? Version of PostgreSQL? Compile/configure options? Do you have any core files in your data/base/* hierarchy that would be the result of a backend crashing? I know you are looking at alternatives, but I'm terrible at letting go of problems :( On Tue, 2 May 2000, Mitch Vincent wrote: > A very, very good article. I love the comment about MySQL being a filesystem > with an SQL interface :-) > > However.. I'm faced with a huge dilemma. > > We use PostgreSQL for a fairly large application I wrote, the database is > still pretty small, it carries info on about 25-30,000 people and about > 5,000 jobs. Recently we've had huge trouble with PostgreSQL -- it seems that > every month I stump someone with the obscure things that happen to our data > :-) > > >From corrupted indexes to corrupted system tables, it's almost always > unrecoverable. Luckily I always have a backup to restore from and the world > goes on... We've only recently started to notice that the backend is slowing > down. It seems that with every additional applicant added it get > exponentially slower... So, sadly I have to go find another backend for this > application -- a commercial one too so we can get "commercial support" > (yuck).. > > So, could you guys suggest some other backends I might look into? I know > it's an odd place for me to ask but the flat truth is that I think *I* am to > blame for my Postgres troubles and even taking all of the problems into > account I think PG is the best damn free RDBMS out there. It's functionality > is superior to everyone else's, it's developers are no less than amazing and > well -- I trust you guys to give me some honest opinions.. The functionality > I need is basically what PG has.. Transactions are a must as well as some > sort of sequence -- stability over performance but performance is very > important too. It also needs to run native on FreeBSD.. > > Oracle is out as we use FreeBSD and someone out there decided that they > wouldn't support FreeBSD (in the license as well as in the code!).. > > Thanks guys, especially to all who tried to help in private (Don, Tom -- > many others).. > > -Mitch > > ----- Original Message ----- > From: Bruce Momjian <pgman@candle.pha.pa.us> > To: Don Baccus <dhogaza@pacifier.com> > Cc: <pgsql-hackers@postgresql.org> > Sent: Tuesday, May 02, 2000 5:55 PM > Subject: Re: [HACKERS] Why Not MySQL? > > > > > > > > >Actually, I'm changing the link > > > > > > > >http://openacs.org/why-not-mysql.html > > > > > > Moments after forwarding the link to Ben's piece on why > > > MySQL sucks to this list, he e-mailed me the above note. > > > > > > Sorry for any inconvenience... > > > > I am adding this to the FAQ. > > > > -- > > Bruce Momjian | http://www.op.net/~candle > > pgman@candle.pha.pa.us | (610) 853-3000 > > + If your life is a hard drive, | 830 Blythe Avenue > > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 > > > Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
> > If I can't get my questions answered about case sensitivity issues here
> > (no help so far) I will most likely use it myself.

My recollection is that it involved needing non-standard case-insensitive LIKE comparisons to get transparent behavior with an existing M$ Access app. So far, we were too polite to ask why one is working so hard to maintain compatibility with a non-standard interface, rather than writing the app to be portable. But I'll ask now. Tim?

- Thomas

btw, it seems to be the case that problems such as these, which might be interesting during slow times (from a theoretical standpoint at least), are decidedly less so during the final stages of a release cycle.

--
Thomas Lockhart  lockhart@alumni.caltech.edu  South Pasadena, California
The Hermit Hacker <scrappy@hub.org> wrote:
>
>On Tue, 2 May 2000, Tim Uckun wrote:
>
>> If I can't get my questions answered about case sensitivity issues here
>> (no help so far) I will most likely use it myself.
>
>What questions? *raised eyebrow*

The question dealt with trying to achieve case-insensitive collation in postgres -- specifically, about possibly rewriting the text_cmp or the varchareq functions. I am wondering what kind of mayhem that might cause. I was told by someone who tried it that it's possible to overload the ~~ (like) operator, but that dropping the = operator cripples the database, so it's not possible to overload the = operator.

As to why I might want to do this, I'll answer that one in response to another question :)

----------
Message To Spammers -- Game Over! Get spam-free email at http://www.MsgTo.com
Thomas Lockhart <lockhart@alumni.caltech.edu> wrote:

>existing M$ Access app. So far, we were too polite to ask why one is
>working so hard to maintain compatibility with a non-standard
>interface, rather than writing the app to be portable. But I'll ask
>now. Tim?

Fair enough question. I agree with you that this is non-standard, typical MS lock-in crap. But I have an application that is written in Access and has outgrown the data engine in Access (which is pretty pathetic). Unfortunately this application is very large, with over 300 tables and over 1400 saved queries (views). The MS solution to this problem is to upgrade to MS-SQL server (vendor lock-in), which processes the queries in the exact same case-insensitive manner. SQL server does not break my application. I, on the other hand, want to avoid upsizing to SQL server. I could use Sybase, which also allows for case-insensitive collation (no surprise there), but I really, really want to use an open source database server.

So right now I have a few choices.

1) Buckle into the vendor lock and be stuck with NT and SQL server.
2) Buy Sybase and spend way more than I want to.
3) Completely rewrite all 1400 queries and write all kinds of new code to make sure the SQL emitted by Access gets intercepted and translated properly.
4) Make Postgres process queries case insensitively.

Well, the third one is out of the question really; I don't have that kind of time or money. It would take me the rest of the year to accomplish that goal, and the database would have to be taken out of commission in the meantime.

>
>btw, it seems to be the case that problems such as these, which might
>be interesting during slow times (from a theoretical standpoint at
>least), are decidedly less so during the final stages of a release
>cycle.

I fully understand that you guys have your own set of priorities. I also appreciate the work you guys have put into making postgres into a database I want to use. Having said all that, I did wait 4 to 5 days without a reply of any sort. It would have been perfectly fine for somebody to say "It's not possible, don't waste your time", "Don't ask this question here", "We are really entirely too busy to deal with this", or even "Go away and don't ever bother us ever again".

----------
Message To Spammers -- Game Over! Get spam-free email at http://www.MsgTo.com
At , Malcontent null wrote:

>The question dealt with trying to achieve case insensitive collation in postgres. Specifically about possibly rewriting the text_cmp or the varchareq functions. I am wondering what kind of mayhem that might cause. I was told by someone who tried it that it's possible to overload the ~~ (like) operator but that dropping the = operator cripples the database so that it's not possible to overload the = operator.
>
>As to why I might want to do this I'll answer that one in response to another question :)

Actually, I'd suggest you state the problem you're trying to solve first, rather than present the reasons you think PG can't handle the problem without showing your hand. Saying "how can I implement the following solution to an unstated problem" rather than simply stating the problem seems ... well, impolite at first glance.

I'm assuming that you actually are seeking a solution, rather than starting some sort of crusade in favor of a pet peeve?

Don't misunderstand ... you may be right, there may be no way to solve your problem efficiently in PG. But until you tell us what you're trying to do, we have no way to decide whether or not you've actually exhausted the possibilities.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Service and other goodies at http://donb.photo.net.
> I fully understand that you guys have your own set of priorities. I also > appreciate the work you guys have put into making postgres into a database > I want to use. Having said all that I did wait 4 to 5 days without a reply > of any sort. It would have been perfectly fine for somebody to say "It's not > possible don't waste your time", "Don't ask this question here", "we are > really entirely too busy to deal with this" or even "go away and don't ever > bother us ever again". Well, none of those things are true, and it is rare that someone would speak for a group this widely distributed to say "we are too busy". In most cases, when the usual suspects are too busy someone else will post an answer to a question, and you are never likely to get a definitive "I'm too busy and everyone else is too". At some point, someone may have time to answer *exactly* the questions you asked. Another strategy to try after the first one failed is to come in with the more detailed problem statement, asking for suggestions on a solution. Particularly if you can phrase it so it is clear that it may solve problems for a larger class of user than the one who managed to grow a M$ Access app to 300 tables and 1400 queries before deciding that Access might be a little light in performance to be suitable. But that's water under the bridge, eh? Anyway, so the larger class of problem is for the Sybase/M$ user who relies on case insensitive queries (which *are* available in Postgres) which are indistinguishable from the SQL92-mandated case-sensitive ones. So we might explore the possibilities for a contrib/ module which does this, though because it touches on replacing existing backend code it may not quite fly since there are some function lookup optimizations which may keep you from overwriting the existing routines. But it would be a neat capability to have; I wonder if it would work right away or if we could tweak the backend to allow this in the future?? Of course the alternative is to just dive in and hack and slash at the backend code. Look in parser/gram.y and utils/adt/like.c for starters... - Thomas -- Thomas Lockhart lockhart@alumni.caltech.edu South Pasadena, California
Mitch Vincent wrote:
>
> A very, very good article. I love the comment about MySQL being a filesystem
> with an SQL interface :-)
>
> However.. I'm faced with a huge dilemma.
>
> We use PostgreSQL for a fairly large application I wrote, the database is
> still pretty small, it carries info on about 25-30,000 people and about
> 5,000 jobs. Recently we've had huge trouble with PostgreSQL -- it seems that
> every month I stump someone with the obscure things that happen to our data
> :-)

What version are you using ?

> From corrupted indexes to corrupted system tables, it's almost always
> unrecoverable. Luckily I always have a backup to restore from and the world
> goes on... We've only recently started to notice that the backend is slowing
> down. It seems that with every additional applicant added it gets
> exponentially slower... So, sadly I have to go find another backend for this
> application -- a commercial one too so we can get "commercial support"
> (yuck)..

Could you be a little more specific on your performance issues ?

The usual way to deal with them is tuning your db structure and/or queries, or setting backend options to use more memory for stuff, or other such things.

If there is something wrong with the structure or queries, then a database switch will help you very little, unless your front-end tool has some special support for _some_ databases and not for others.

> So, could you guys suggest some other backends I might look into?

The usual - Oracle, Interbase, Informix, DB2, Sybase, Solid

The website is usually obtained by putting www in front and com at the end ;)

And let us know of your results.

> I know
> it's an odd place for me to ask but the flat truth is that I think *I* am to
> blame for my Postgres troubles and even taking all of the problems into
> account I think PG is the best damn free RDBMS out there. Its functionality
> is superior to everyone else's, its developers are no less than amazing and
> well -- I trust you guys to give me some honest opinions.. The functionality
> I need is basically what PG has.. Transactions are a must as well as some
> sort of sequence -- stability over performance but performance is very
> important too. It also needs to run native on FreeBSD..
>
> Oracle is out as we use FreeBSD and someone out there decided that they
> wouldn't support FreeBSD (in the license as well as in the code!)..

Is FreeBSD a religious issue there or can it be negotiated ?

-------------
Hannu
Don Baccus <dhogaza@pacifier.com> wrote:
>
>Actually, I'd suggest you state the problem you're trying to solve first,
>rather than present the reasons you think PG can't handle the problem
>without showing your hand. Saying, "how can I implement the following
>solution to an unstated problem" rather than simply stating the problem
>seems ... well, impolite at first glance.

I admit that I may not be the clearest thinking individual on this planet, but I thought I stated my problem in the original post to the best of my ability. I certainly wasn't trying to be rude. Here is a snippet from my original post.

"In a nutshell I want to use postgres as a back end to an access database. This means that all collation done by postgres musht be case insensitive including like clauses."

I left all the spelling mistakes in place :)

----------
Message To Spammers -- Game Over! Get spam-free email at http://www.MsgTo.com
Thomas Lockhart <lockhart@alumni.caltech.edu> wrote:
>
>clear that it may solve problems for a larger class of user than the
>one who managed to grow a M$ Access app to 300 tables and 1400 queries
>before deciding that Access might be a little light in performance to
>be suitable. But that's water under the bridge, eh?

Actually I did post twice; I had hoped that I was being more clear the second time. As for growing the Access database, well, sometimes apps take on a life of their own. Database apps in general tend to be too critical to business to just scrap and rewrite, so they just keep growing.

>Anyway, so the larger class of problem is for the Sybase/M$ user who
>relies on case insensitive queries (which *are* available in Postgres)

If I may.

MS Access, for all of its damnable faults, is the single most popular database in the world. There are a whole slew of people who do nothing except Access programming and make very good money at it. Postgres is a great candidate as a possible back end database engine for Access. This is a big possible application for postgres. To be usable for this purpose, however, it needs a few things.

1) Longer object names (I guess this is possible via a DEFINE)
2) Case insensitive queries.
3) Outer joins (coming soon!).
4) Maybe ADO drivers for the VB users of the world.

I don't know how important Access integration is to the postgres community as a whole though.

>Of course the alternative is to just dive in and hack and slash at the
>backend code. Look in parser/gram.y and utils/adt/like.c for
>starters...

Thanks for the tip, I'll start looking at this right away.

----------
Message To Spammers -- Game Over! Get spam-free email at http://www.MsgTo.com
On Tue, 2 May 2000, The Hermit Hacker wrote: > > As Don asks, what happened with the v7.0 trials you were doing? Corrupted > indices, I've seen occasionally in older versions, but I can't recall ever > seeing corrupt system tables ... > > I don't have a GUI browser right, so searching the archives is kinda tough > for me :( Can you refresh my memory for me? There has to be something > logical to this, as to what the cause for the corruption is :( > > >From Don's comment, I take it you are using FreeBSD? Version? Stability > of the machine? Never crashes? > > Version of PostgreSQL? Compile/configure options? Do you have any core > files in your data/base/* hierarchy that would be the result of a backend > crashing? > > I know you are looking at alternatives, but I'm terrible at letting go of > problems :( His description of table corruption and the system running slower and slower sounds like a disk going bad. I've seen it hundreds of times on news machines. Constant retries while trying to write to the disk will give slowdowns. Having data on a spot of the disk that's unreliable will certainly cause data integrity problems. Mitch, have you thoroughly checked the hardware? Vince. > > > On Tue, 2 May 2000, Mitch Vincent wrote: > > > A very, very good article. I love the comment about MySQL being a filesystem > > with an SQL interface :-) > > > > However.. I'm faced with a huge dilemma. > > > > We use PostgreSQL for a fairly large application I wrote, the database is > > still pretty small, it carries info on about 25-30,000 people and about > > 5,000 jobs. Recently we've had huge trouble with PostgreSQL -- it seems that > > every month I stump someone with the obscure things that happen to our data > > :-) > > > > >From corrupted indexes to corrupted system tables, it's almost always > > unrecoverable. Luckily I always have a backup to restore from and the world > > goes on... We've only recently started to notice that the backend is slowing > > down. It seems that with every additional applicant added it get > > exponentially slower... So, sadly I have to go find another backend for this > > application -- a commercial one too so we can get "commercial support" > > (yuck).. > > > > So, could you guys suggest some other backends I might look into? I know > > it's an odd place for me to ask but the flat truth is that I think *I* am to > > blame for my Postgres troubles and even taking all of the problems into > > account I think PG is the best damn free RDBMS out there. It's functionality > > is superior to everyone else's, it's developers are no less than amazing and > > well -- I trust you guys to give me some honest opinions.. The functionality > > I need is basically what PG has.. Transactions are a must as well as some > > sort of sequence -- stability over performance but performance is very > > important too. It also needs to run native on FreeBSD.. > > > > Oracle is out as we use FreeBSD and someone out there decided that they > > wouldn't support FreeBSD (in the license as well as in the code!).. > > > > Thanks guys, especially to all who tried to help in private (Don, Tom -- > > many others).. > > > > -Mitch > > > > ----- Original Message ----- > > From: Bruce Momjian <pgman@candle.pha.pa.us> > > To: Don Baccus <dhogaza@pacifier.com> > > Cc: <pgsql-hackers@postgresql.org> > > Sent: Tuesday, May 02, 2000 5:55 PM > > Subject: Re: [HACKERS] Why Not MySQL? 
> > > > > >Actually, I'm changing the link
> > > > > >
> > > > > >http://openacs.org/why-not-mysql.html
> > > >
> > > > Moments after forwarding the link to Ben's piece on why
> > > > MySQL sucks to this list, he e-mailed me the above note.
> > > >
> > > > Sorry for any inconvenience...
> > >
> > > I am adding this to the FAQ.
> > >
> > > --
> > > Bruce Momjian | http://www.op.net/~candle
> > > pgman@candle.pha.pa.us | (610) 853-3000
> > > + If your life is a hard drive, | 830 Blythe Avenue
> > > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
>
> Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
> Systems Administrator @ hub.org
> primary: scrappy@hub.org   secondary: scrappy@{freebsd|postgresql}.org

--
==========================================================================
Vince Vielhaber -- KA8CSH   email: vev@michvhf.com   http://www.pop4.net
128K ISDN from $22.00/mo - 56K Dialup from $16.00/mo at Pop4 Networking
Online Campground Directory   http://www.camping-usa.com
Online Giftshop Superstore    http://www.cloudninegifts.com
==========================================================================
> >existing M$ Access app. So far, we were too polite to ask why one is > >working so hard to maintain compatibility with a non-standard > >interface, rather than writing the app to be portable. But I'll ask > >now. Tim? > > Fair enough question. I agree with you that this is non > standard typical MS lock in crap. But I have an application > that is written in access and has outgrown the data engine in > access (which is pretty pathetic). Unfortunately this > application is very large with over 300 tables and over 1400 > saved queries (views). The MS solution to this problem is to > upgrade to MS-SQL server (vendor lock in) which processes the > queries in the exact same case insensitive manner. SQL server > does not break my application. I on the other hand want to > avoid upsizing to SQL server. Not to turn you away from PostgreSQL, but you might want to look at MSDE (Microsoft Data Engine) as an easier step. It has the same query processor as SQL Server, but scales much better (and includes transaction logs etc). See for example http://msdn.microsoft.com/library/backgrnd/html/msdeforvs.htm. The license for MSDE is also included in Office 2000 Pro/Premium, so chances are you may already have the required licenses. Still leaves you in the Microsoft box, though. //Magnus
> If I may. > MS Access for all of it's damnable faults is the single most > popular database in the world. There are a whole slew of > people who do nothing except access programming and make very > good money at it. Postgres is a great candidate as a possible > back end database engine for access. This is a big possible > application for postgres. To be usable for this purpose > however it needs a few things. > 1) Longer object names (I guess this is possible via a DEFINE) > 2) Case insensitive queries. > 3) Outer joins (coming soon!). > 4) Maybe ADO drivers for the VB users of the world. Should be no major need for a separate ADO driver for Pg. You can use ADO going through the ODBC driver, which already exists. //Magnus
Malcontent null wrote: > > >Anyway, so the larger class of problem is for the Sybase/M$ user who > >relies on case insensitive queries (which *are* available in Postgres) Maybe the right place to introduce case-insensitiveness would be in ODBC driver then ? > If I may. > MS Access for all of it's damnable faults is the single most popular > database in the world. There are a whole slew of people who do nothing > except access programming and make very good money at it. Postgres is > a great candidate as a possible back end database engine for access. > This is a big possible application for postgres. To be usable for this > purpose however it needs a few things. > 1) Longer object names (I guess this is possible via a DEFINE) How long should they be ? > 2) Case insensitive queries. Probably only the Access subset ("like", "order by", maybe even "=" ?) > 3) Outer joins (coming soon!). > 4) Maybe ADO drivers for the VB users of the world. AFAIK MS moves fast and ADO will be soon (or is already) officially obsolete. The technology du jour is XML. > I don't know how important access integration is to the postgres > community as a whole though. Probably not a top priority. Oracle is much more often seen as the target. --------------------- Hannu
On Wed, 3 May 2000, Hannu Krosing wrote: > Malcontent null wrote: > > > > >Anyway, so the larger class of problem is for the Sybase/M$ user who > > >relies on case insensitive queries (which *are* available in Postgres) > > Maybe the right place to introduce case-insensitiveness would be in ODBC > driver then ? > > > If I may. > > MS Access for all of it's damnable faults is the single most popular > > database in the world. There are a whole slew of people who do nothing > > except access programming and make very good money at it. Postgres is > > a great candidate as a possible back end database engine for access. > > This is a big possible application for postgres. To be usable for this > > purpose however it needs a few things. > > 1) Longer object names (I guess this is possible via a DEFINE) > > How long should they be ? > > > 2) Case insensitive queries. > > Probably only the Access subset ("like", "order by", maybe even "=" ?) don't we have a 'lower()' function? SELECT * FROM <table> WHERE field ~* 'this string' ORDER BY lower(field);? or SELECT * FROM <table> WHERE lower(field) = lower('StriNg');
On Wed, 3 May 2000, Vince Vielhaber wrote: > On Tue, 2 May 2000, The Hermit Hacker wrote: > > > > > As Don asks, what happened with the v7.0 trials you were doing? Corrupted > > indices, I've seen occasionally in older versions, but I can't recall ever > > seeing corrupt system tables ... > > > > I don't have a GUI browser right, so searching the archives is kinda tough > > for me :( Can you refresh my memory for me? There has to be something > > logical to this, as to what the cause for the corruption is :( > > > > >From Don's comment, I take it you are using FreeBSD? Version? Stability > > of the machine? Never crashes? > > > > Version of PostgreSQL? Compile/configure options? Do you have any core > > files in your data/base/* hierarchy that would be the result of a backend > > crashing? > > > > I know you are looking at alternatives, but I'm terrible at letting go of > > problems :( > > His description of table corruption and the system running slower and > slower sounds like a disk going bad. I've seen it hundreds of times > on news machines. Constant retries while trying to write to the disk > will give slowdowns. Having data on a spot of the disk that's unreliable > will certainly cause data integrity problems. That was one thing I was thinking ... the other was the possibility that he's mount'd async and his machine is rebooting ... *or* he has memory problems causing the shared memory to corrupt, dump the postmaster process which is corrupting his tables ... Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker wrote:
>
> On Wed, 3 May 2000, Hannu Krosing wrote:
>
> > Malcontent null wrote:
> > >
> > > >Anyway, so the larger class of problem is for the Sybase/M$ user who
> > > >relies on case insensitive queries (which *are* available in Postgres)
> >
> > Maybe the right place to introduce case-insensitiveness would be in ODBC
> > driver then ?
> ...
> > > 2) Case insensitive queries.
> >
> > Probably only the Access subset ("like", "order by", maybe even "=" ?)
>
> don't we have a 'lower()' function?
>
> SELECT * FROM <table> WHERE field ~* 'this string' ORDER BY lower(field);?
>
> or
>
> SELECT * FROM <table> WHERE lower(field) = lower('StriNg');

That's what I meant by pushing the case-insensitiveness into ODBC, so that the MS Access program can be made case-insensitive automatically, either A) by rewriting the queries to use lower() or B) by using case-insensitive operators where possible.

----------------
Hannu
On Wed, 03 May 2000 04:12:37 +0000, Thomas Lockhart <lockhart@alumni.caltech.edu> wrote:

>> > If I can't get my questions answered about case sensitivity issues here
>> > (no help so far) I will most likely use it myself.
>
>My recollection is that it involved needing non-standard
>case-insensitive LIKE comparisons to get transparent behavior with an
>existing M$ Access app. So far, we were too polite to ask why one is
>working so hard to maintain compatibility with a non-standard
>interface, rather than writing the app to be portable. But I'll ask
>now. Tim?

If Tim wants a LIKE to be done case-insensitively, why not do:

select foo from table1 where upper(foo) like '%BAR%'

??

sergio
> Of course the alternative is to just dive in and hack and slash at the
> backend code. Look in parser/gram.y and utils/adt/like.c for
> starters...

That would be my recommendation. It is open source, so you can modify it however you like.

--
Bruce Momjian | http://www.op.net/~candle
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Hi Don, thanks for your reply... > First, have you been having the same problems with PG 7.0? I recall that > you had it up on a test system but nothing more. No, I was afraid to run 7.0 beta on the production server. I do have it on my development server however and haven't had any problems with it (of course the devel server is only used by me, the production server is used by about 600 people)... > It's a pity that you've reached this point, because PG is so much better > than it was 18 months ago (and before, of course, I mention that timeframe > because that's roughly when I first investigated its suitability for > the web toolkit project) and the trajectory is definitely in the right > direction. Well, we've started to look into it. My boss is worried about the stability and speed of Postgres in the long run, personally I love PG, I would like to stay with it until at least 7.1, if the problems still continue then maybe look elsewhere. > It's also a loss to the development effort, as people with bugs in many > ways are more useful than people who have no problems (though of course > having no bugs for users to stumble across is the best situation!) > Still, I understand the need to solve your problems today, not tomorrow. > Interbase is a possible solution. They have a pretty good reputation, > and their "super server" (threaded with connections sharing a buffer > cache) should scale well. My rough estimate is that they're at about > the place PG will be when 7.1 comes out. I don't know if they support > FreeBSD, though. Any reason you can't just put up a box with Linux? Ooooo. Mitch hates Linux. It's a long, boring story but lets just say that Linux and I use to be friends, now we're mortal enemies. :-) -Mitch Vincent
> As Don asks, what happened with the v7.0 trials you were doing? Corrupted
> indices, I've seen occasionally in older versions, but I can't recall ever
> seeing corrupt system tables ...

I couldn't run the 7.0 beta on our production server. It was forbidden from "higher up"..

> I don't have a GUI browser right, so searching the archives is kinda tough
> for me :( Can you refresh my memory for me? There has to be something
> logical to this, as to what the cause for the corruption is :(

Ok, the latest thing was "cannot find attribute 15 of relation pg_am" -- I got that when I tried to do a query.

> From Don's comment, I take it you are using FreeBSD? Version? Stability
> of the machine? Never crashes?

FreeBSD 4.0-R

The machine is brand new (we built it because we thought it was a hardware problem before).. Ultra 160 SCSI drives, 512 megs of ECC RAM, PIII 500 processor (soon to be upgraded).

> Version of PostgreSQL? Compile/configure options? Do you have any core
> files in your data/base/* hierarchy that would be the result of a backend
> crashing?

PG 6.5.3, no core files (this latest time at least, in the past there have been).

As far as configure options, nothing, just the default configuration...

> I know you are looking at alternatives, but I'm terrible at letting go of
> problems :(

Me too, that's why I've stayed with PG for 6 months with these problems.. I wish I had more to tell you now, however I had to restore the data from a backup.

Thanks!!

-Mitch
> > We use PostgreSQL for a fairly large application I wrote, the database is > > still pretty small, it carries info on about 25-30,000 people and about > > 5,000 jobs. Recently we've had huge trouble with PostgreSQL -- it seems that > > every month I stump someone with the obscure things that happen to our data > > :-) > > What version are you using ? 6.5.3 :-) > > >From corrupted indexes to corrupted system tables, it's almost always > > unrecoverable. Luckily I always have a backup to restore from and the world > > goes on... We've only recently started to notice that the backend is slowing > > down. It seems that with every additional applicant added it get > > exponentially slower... So, sadly I have to go find another backend for this > > application -- a commercial one too so we can get "commercial support" > > (yuck).. > > Could you be a little more specific on your performance issues ? Well, I'm just noticing that simple select queries are taking 3-5 seconds - on a table with 63 fields, 10000ish rows of data. The ID fields are indexed, as well as several always-searched varchar() fields. Here are some typical queries my application might generate. Please, let me know if you see anything that can be improved! select * from applicants as a where a.created::date = '05-01-2000' and a.firstname ~* '^mitch' limit 10 offset 0 select * from applicants as a,applicants_states as s where a.firstname ~* '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0 .... There are 63 fields in the 'applicants' table, all of which are searchable. Would it be a good or bad thing to index all fields that are searchable? > The usual way to deal wih them is tuning your db structure and/or > queries or > setting backend options to use more memory for stuff or other such > things. I'd love some pointers! This machine has lots-n-lots of memory. I'd love to make postgre use more than normal if it would get me better speed! > If there is something wrong with the structure or queries, then a > database > switch will help you very little, unless your front-end tool has some > special > support for _some_ databases and not for others. PHP/C is what I use to access PostgreSQL. > > So, could you guys suggest some other backends I might look into? > > The usual - Oracle, Interbase, Informix, DB2, Sybase, Solid > > The website is usually obtained by putting www inf front and com at the > end ;) > > And let us know of your results. > > > I know > > it's an odd place for me to ask but the flat truth is that I think *I* am to > > blame for my Postgres troubles and even taking all of the problems into > > account I think PG is the best damn free RDBMS out there. It's functionality > > is superior to everyone else's, it's developers are no less than amazing and > > well -- I trust you guys to give me some honest opinions.. The functionality > > I need is basically what PG has.. Transactions are a must as well as some > > sort of sequence -- stability over performance but performance is very > > important too. It also needs to run native on FreeBSD.. > > > > Oracle is out as we use FreeBSD and someone out there decided that they > > wouldn't support FreeBSD (in the license as well as in the code!).. > > Is FreeBSD a religious issue there or can it be negotiated ? Not religious, though I love that OS like I love my right arm.. I am stuck with x86 hardware -- that's pretty much it (and eliminates Solaris and several other as a possible OS I suppose).. THANKS!! -Mitch Vincent
> His description of table corruption and the system running slower and > slower sounds like a disk going bad. I've seen it hundreds of times > on news machines. Constant retries while trying to write to the disk > will give slowdowns. Having data on a spot of the disk that's unreliable > will certainly cause data integrity problems. > > Mitch, have you thoroughly checked the hardware? Checked and replaced twice. We're using Ultra 160 SCSI drives now so the speed there isn't a problem I hope. :-)
Mitch Vincent wrote: > > > As Don asks, what happened with the v7.0 trials you were doing? Corrupted > > indices, I've seen occasionally in older versions, but I can't recall ever > > seeing corrupt system tables ... > > I couldn't run the 7.0 beta on our production server. It was forbidden from > "higher up".. Hmmm. I wonder what would be the reaction if you showed your "higher ups" the list of bug-fixes between your release and 7.0. It is a *very long* list. I loathe unplanned/forced upgrades, especially when due to reliability problems. I was put in that position with pg 6.5.2 problems. But I have been quite pleased with 7.0 so far. Regards, Ed Loehr
On Wed, 3 May 2000, Mitch Vincent wrote:

> Hi Don, thanks for your reply...
>
> > First, have you been having the same problems with PG 7.0? I recall that
> > you had it up on a test system but nothing more.
>
> No, I was afraid to run 7.0 beta on the production server. I do have it on
> my development server however and haven't had any problems with it (of
> course the devel server is only used by me, the production server is used by
> about 600 people)...

Okay, it's no longer beta :)  Run it, run it :)

My experience so far is that it was basically a 'plug-n-play' sort of thing ... dump your database, upgrade and reload ... the only problem I had was one script where I did something like:

SELECT field as field

without realizing it, and v7.0 won't allow that (field == field) ... other than that, *much* more stable, and appears faster than the old ...

> Ooooo. Mitch hates Linux. It's a long, boring story but lets just say that
> Linux and I use to be friends, now we're mortal enemies. :-)

I like you :)
On Wed, 3 May 2000, Mitch Vincent wrote: > > I don't have a GUI browser right, so searching the archives is kinda tough > > for me :( Can you refresh my memory for me? There has to be something > > logical to this, as to what the cause for the corruption is :( > > Ok, the latest thing was "cannot find attribute 15 of relation pg_am" -- I > got that when I tried to do an query. Someone recently posted similar, and I swore he answered himself with a 'vacuuming fixed the problem' ... but I could be mis-quoting ... > > Version of PostgreSQL? Compile/configure options? Do you have any core > > files in your data/base/* hierarchy that would be the result of a backend > > crashing? > > PG 6.5.3, no core files (this latest time at least, in the past there have > been). > > As far as configure options, nothing, just the default configuration... Okay, with v7.0, I'd recommend adding --enable-debug, so that if a core does creep in, we can analyze it ... My first and foremost recommendation is to upgrade to v7.0 first ... take your test machine and make sure you have no problem with the dump/reload, but v7.0 is, once more, leaps and bounds ahead of v6.5.3 ... no guarantees it will make a difference, but at least it gets you into a release that we're going to be focusing on debugging intensely over the next little while ...
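For reference, the dump/upgrade/reload cycle Marc describes comes down to roughly the following. This is only a sketch of the usual sequence -- check the 7.0 INSTALL and migration notes for the exact steps on your platform; PGDATA is assumed to point at the new, empty data directory:

  # with the old 6.5.3 binaries still running:
  pg_dumpall > db.out

  # build and install 7.0, with debugging symbols as suggested above:
  ./configure --enable-debug
  gmake && gmake install       # GNU make; plain "make" on most Linux boxes

  # initialize a fresh data directory, start the new postmaster against it,
  # then reload the dump:
  initdb
  psql -d template1 -f db.out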
Malcontent null <malcontent@msgto.com> writes: > "In a nutshell I want to use postgres as a back end to an access > database. This means that all collation done by postgres musht be case > insensitive including like clauses." Well, it'd certainly be easy enough to make a case-insensitive set of text comparison functions and LIKE comparator (sounds like you already found out where they live, so go to it). The $64 question is exactly when where and how to invoke them. For your immediate purpose I suppose you can just alter the standard operators, and thus produce a server that can do nothing *but* case-insensitive text comparisons. That will certainly not be acceptable as a general-usage answer, however. There has been a good deal of talk about supporting more of SQL92's national-character and collation features in future releases. What might ultimately happen is that we offer a case-insensitive collation mode that could be assigned to particular text columns, or even made the default for a whole database. Not sure how far down the road that is. In the meantime, there's no shame in running a locally-hacked server. That's one of the things that open source is for, after all ;-) regards, tom lane
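To make Tom's "alter the standard operators" idea concrete, here is a rough, untested sketch of what the LIKE half might look like at the SQL level. The names are illustrative only: ci_like() is invented, and textlike() is the internal routine behind text ~~ in utils/adt/like.c, so verify it is actually callable under that name in your catalogs before trying this. As noted earlier in the thread, replacing built-in operators (especially =) can have unpleasant side effects, so treat this strictly as a starting point on a scratch database:

  -- case-folding wrapper around the built-in LIKE support function
  CREATE FUNCTION ci_like(text, text) RETURNS bool AS '
      SELECT textlike(lower($1), lower($2));
  ' LANGUAGE 'sql';

  -- swap the standard text LIKE operator for the case-insensitive one
  -- (this affects every query in the database)
  DROP OPERATOR ~~ (text, text);
  CREATE OPERATOR ~~ (
      leftarg   = text,
      rightarg  = text,
      procedure = ci_like
  );

After that, a LIKE issued by Access would match 'Mitch' and 'mitch' alike; the = operator remains the hard case, for the reasons mentioned above.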
"Mitch Vincent" <mitch@huntsvilleal.com> writes: >> First, have you been having the same problems with PG 7.0? I recall that >> you had it up on a test system but nothing more. > No, I was afraid to run 7.0 beta on the production server. I do have it on > my development server however and haven't had any problems with it (of > course the devel server is only used by me, the production server is used by > about 600 people)... FWIW, we've fixed a huge number of bugs since 6.5.*. Even the beta versions of 7.0 are more stable than any prior release IMHO (and we've seen no beta test reports that would contradict that). I'd really like to see you try 7.0 before walking away... regards, tom lane
Oh, I'm downloading it to the development server as I type this.. I didn't know RC3 (virtually the release) was out. I plan on starting to use it today if everything works on the development server. - Mitch ----- Original Message ----- From: Tom Lane <tgl@sss.pgh.pa.us> To: Mitch Vincent <mitch@huntsvilleal.com> Cc: <pgsql-hackers@postgresql.org>; Don Baccus <dhogaza@pacifier.com> Sent: Wednesday, May 03, 2000 11:21 AM Subject: Re: [HACKERS] Why Not MySQL? > "Mitch Vincent" <mitch@huntsvilleal.com> writes: > >> First, have you been having the same problems with PG 7.0? I recall that > >> you had it up on a test system but nothing more. > > > No, I was afraid to run 7.0 beta on the production server. I do have it on > > my development server however and haven't had any problems with it (of > > course the devel server is only used by me, the production server is used by > > about 600 people)... > > FWIW, we've fixed a huge number of bugs since 6.5.*. Even the beta > versions of 7.0 are more stable than any prior release IMHO (and we've > seen no beta test reports that would contradict that). I'd really like > to see you try 7.0 before walking away... > > regards, tom lane >
"Mitch Vincent" <mitch@huntsvilleal.com> writes: >> Could you be a little more specific on your performance issues ? > Well, I'm just noticing that simple select queries are taking 3-5 seconds - > on a table with 63 fields, 10000ish rows of data. The ID fields are > indexed, as well as several always-searched varchar() fields. Hmm. What does EXPLAIN show for the query plan? You might also try turning on execution stats (run psql with PGOPTIONS="-d2 -s" for starters) to see what getrusage() can tell. The results will be in the postmaster log and might look like this: StartTransactionCommand query: SELECT usename, relname, relkind, relhasrules FROM pg_class, pg_user WHERE usesysid = relowner and ( relkind = 'r'OR relkind = 'i' OR relkind = 'S') and relname !~ '^pg_' and (relkind != 'i' OR relname !~ '^xinx') ORDER BY relname ProcessQuery ! system usage stats: ! 0.083256 elapsed 0.040000 user 0.000000 system sec ! [0.080000 user 0.020000 sys total] ! 12/1 [46/11] filesystem blocks in/out ! 0/0 [1/2] page faults/reclaims, 0 [0] swaps ! 0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent ! 8/5 [29/10] voluntary/involuntary context switches ! postgres usage stats: ! Shared blocks: 18 read, 0 written, buffer hit rate = 94.29% ! Local blocks: 0 read, 0 written, buffer hit rate = 0.00% ! Direct blocks: 0 read, 0 written CommitTransactionCommand regards, tom lane
On Wed, 3 May 2000, Mitch Vincent wrote:

> Here are some typical queries my application might generate. Please, let me
> know if you see anything that can be improved!

First comment ... Tom Lane always jumps on me on this ... if you are going to send a QUERY to get recommendations on, send in an EXPLAIN on that query also, so that we can see what the backend 'thinks' it's going to do ...

> select * from applicants as a where a.created::date = '05-01-2000' and
> a.firstname ~* '^mitch' limit 10 offset 0

Second comment, which Tom can clarify in case I'm wrong ... when I ran UDMSearch under v6.5.3, there was a problem where a LIKE query was causing a query to take forever to complete ... Tom, at the time, got me to change the query so that instead of:

  url LIKE '%s'

it was:

  (url || '') LIKE '%s'

Now, it was in an earlier RC of v7.0 that I had to do this, and Tom made some changes to the following one to 'fix the problem', but my performance went from several *minutes* to several *seconds* to complete the exact same query ...

> > The usual way to deal with them is tuning your db structure and/or
> > queries or setting backend options to use more memory for stuff or
> > other such things.
>
> I'd love some pointers! This machine has lots-n-lots of memory. I'd love to
> make postgre use more than normal if it would get me better speed!

on my machine (FreeBSD 4.0-STABLE), I'm currently running with a kernel of:

  options SYSVSHM
  options SHMMAXPGS=262144
  options SHMSEG=32
  options SYSVSEM
  options SEMMNI=40
  options SEMMNS=240
  options SEMMNU=120
  options SEMMAP=120
  options SYSVMSG

and a -B set to 4096 and -o ' -S 16384 ' ... the -B deals with the amount of shared memory; the -S I'm using only affects stuff like ORDER BY and GROUP BY (it allocates up to that much RAM for a sort before going to disk) ...

Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org   secondary: scrappy@{freebsd|postgresql}.org
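Put together, Marc's settings amount to a postmaster start line along these lines (the data directory path is just an example; with the default 8k block size, -B 4096 works out to roughly 32MB of shared buffers, and -S 16384 lets each sort use up to about 16MB before spilling to disk):

  postmaster -D /usr/local/pgsql/data -B 4096 -o '-S 16384'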
Okay, single or dual processor machine? I'm running Dual myself, and for any largish DB, I would highly recommend that regardless of OS ... at least if a query starts to hog CPU, you have a second one to use ... On Wed, 3 May 2000, Mitch Vincent wrote: > > > We use PostgreSQL for a fairly large application I wrote, the database > is > > > still pretty small, it carries info on about 25-30,000 people and about > > > 5,000 jobs. Recently we've had huge trouble with PostgreSQL -- it seems > that > > > every month I stump someone with the obscure things that happen to our > data > > > :-) > > > > What version are you using ? > > 6.5.3 :-) > > > > >From corrupted indexes to corrupted system tables, it's almost always > > > unrecoverable. Luckily I always have a backup to restore from and the > world > > > goes on... We've only recently started to notice that the backend is > slowing > > > down. It seems that with every additional applicant added it get > > > exponentially slower... So, sadly I have to go find another backend for > this > > > application -- a commercial one too so we can get "commercial support" > > > (yuck).. > > > > Could you be a little more specific on your performance issues ? > > Well, I'm just noticing that simple select queries are taking 3-5 seconds - > on a table with 63 fields, 10000ish rows of data. The ID fields are > indexed, as well as several always-searched varchar() fields. > > Here are some typical queries my application might generate. Please, let me > know if you see anything that can be improved! > > select * from applicants as a where a.created::date = '05-01-2000' and > a.firstname ~* '^mitch' limit 10 offset 0 > > select * from applicants as a,applicants_states as s where a.firstname ~* > '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0 > > .... There are 63 fields in the 'applicants' table, all of which are > searchable. Would it be a good or bad thing to index all fields that are > searchable? > > > The usual way to deal wih them is tuning your db structure and/or > > queries or > > setting backend options to use more memory for stuff or other such > > things. > > I'd love some pointers! This machine has lots-n-lots of memory. I'd love to > make postgre use more than normal if it would get me better speed! > > > If there is something wrong with the structure or queries, then a > > database > > switch will help you very little, unless your front-end tool has some > > special > > support for _some_ databases and not for others. > > PHP/C is what I use to access PostgreSQL. > > > > So, could you guys suggest some other backends I might look into? > > > > The usual - Oracle, Interbase, Informix, DB2, Sybase, Solid > > > > The website is usually obtained by putting www inf front and com at the > > end ;) > > > > And let us know of your results. > > > > > I know > > > it's an odd place for me to ask but the flat truth is that I think *I* > am to > > > blame for my Postgres troubles and even taking all of the problems into > > > account I think PG is the best damn free RDBMS out there. It's > functionality > > > is superior to everyone else's, it's developers are no less than amazing > and > > > well -- I trust you guys to give me some honest opinions.. The > functionality > > > I need is basically what PG has.. Transactions are a must as well as > some > > > sort of sequence -- stability over performance but performance is very > > > important too. It also needs to run native on FreeBSD.. 
> > > > > > Oracle is out as we use FreeBSD and someone out there decided that they > > > wouldn't support FreeBSD (in the license as well as in the code!).. > > > > Is FreeBSD a religious issue there or can it be negotiated ? > > Not religious, though I love that OS like I love my right arm.. I am stuck > with x86 hardware -- that's pretty much it (and eliminates Solaris and > several other as a possible OS I suppose).. > > THANKS!! > > -Mitch Vincent > > > Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
Hannu Krosing <hannu@tm.ee> writes: > That's what I meant by introducing pushing the case-insensitiveness > into ODBC, I don't believe ODBC parses the query carefully enough to recognize operators that would need to be altered to become case-insensitive. I'm not even sure that it could do that --- does "WHERE f1 = f2" need to be changed to "WHERE lower(f1) = lower(f2)"? No way to know that unless you know the datatypes of f1 and f2, which would mean (a) a moderately complete SQL parser/analyzer and (b) a copy of the system catalogs inside ODBC. Doesn't sound like a promising avenue of attack... regards, tom lane
At 11:44 AM 5/3/00 -0300, The Hermit Hacker wrote:
>My experience so far is that it was basically a 'plug-n-play' sort of
>thing ... dump your database, upgrade and reload ... the only problem I
>had was one script that I did something like:
>
>SELECT field as field

I was able to dump my birdnotes.net database and reload it without change,
if another data point will help your boss's confidence level.

>without realizing it, and v7.0 won't allow that (field == field) ... other
>than that, *much* more stable, and appears faster than the old ...

My system had one query in particular that suffered from a poor plan (a
very bad nested loop); the changes to the optimizer have resulted in a much
better plan in 7.0, which just flies. I've been very happy with 7.0.

Tell your boss we're smarter than he is and switch :)

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at http://donb.photo.net.
At 10:18 AM 5/3/00 -0400, Mitch Vincent wrote:
>No, I was afraid to run 7.0 beta on the production server. I do have it on
>my development server however and haven't had any problems with it (of
>course the devel server is only used by me, the production server is used
>by about 600 people)...

It's being run at http://community.aolserver.com and http://openacs.org, and it seems to be solid.

...

>> Interbase is a possible solution. They have a pretty good reputation,
>> and their "super server" (threaded with connections sharing a buffer
>> cache) should scale well. My rough estimate is that they're at about
>> the place PG will be when 7.1 comes out. I don't know if they support
>> FreeBSD, though. Any reason you can't just put up a box with Linux?

>Ooooo. Mitch hates Linux. It's a long, boring story but let's just say that
>Linux and I used to be friends; now we're mortal enemies. :-)

I love it when personal religion gets in the way of technical decision making!

FreeBSD's great, so is Linux. I personally use Linux, but if a key component of a business of mine would be better served by software only available on FreeBSD, I'd have a server up in a matter of hours.

Oh, well... to each their own.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at http://donb.photo.net.
"Mitch Vincent" <mitch@huntsvilleal.com> writes: > Here are some typical queries my application might generate. Please, let me > know if you see anything that can be improved! > select * from applicants as a where a.created::date = '05-01-2000' and > a.firstname ~* '^mitch' limit 10 offset 0 Neither of these WHERE clauses can be used with a plain-vanilla index (I'm assuming a.created is of time datetime?), so you're getting a simple sequential scan over the whole table --- unless the LIMIT stops it sooner. If the table is large then you could get better performance by arranging for an indexscan using whichever clause is likely to be more selective (I'd be inclined to go for the date, I think, unless your creation dates come in bunches). The trick for the date test would be to have a functional index on date(a.created). I'm not sure how bright 6.5.* is about this, but it definitely works in 7.0: create table foo (f1 datetime); -- a straight index on f1 is no help: create index foof1 on foo(f1); explain select * from foo where f1::date = '05-01-2000'; NOTICE: QUERY PLAN: Seq Scan on foo (cost=0.00..25.00 rows=10 width=8) -- but an index on date(f1) is: create index foof1date on foo(date(f1)); explain select * from foo where f1::date = '05-01-2000'; NOTICE: QUERY PLAN: Index Scan using foof1date on foo (cost=0.00..8.16 rows=10 width=8) If you wanted to make the scan on firstname indexable, you'd need to make an index on lower(firstname) and then change the query to read... lower(a.firstname) ~ '^mitch' or possibly... lower(a.firstname) ~ lower('^mitch') if you don't want to assume the given pattern is lowercase to begin with. (The second example will fail to be indexed under 6.5, but should be just fine in 7.0.) ~* can't use an index under any circumstance, but ~ can if the pattern has a left-anchored fixed prefix. > select * from applicants as a,applicants_states as s where a.firstname ~* > '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0 Again, the ~* clause is not indexable as-is, but the rstate clause would be if you have an index on s.rstate --- however, I imagine that it wouldn't be very selective, either, so it might not be worth the trouble. Changing the query to make the firstname part be indexable could be a win. You also need to look at how the join between a and s is being done. How big are these tables, anyway? > .... There are 63 fields in the 'applicants' table, all of which are > searchable. Would it be a good or bad thing to index all fields that are > searchable? A lot of indexes will hurt your insert/update/delete times, so I wouldn't recommend having a whole bunch of indexes unless searches are far more frequent than changes. What you want is a few well-chosen indexes that match the commonly used kinds of WHERE clauses in your query mix. > I'd love some pointers! This machine has lots-n-lots of memory. I'd love to > make postgre use more than normal if it would get me better speed! Increase postmaster's -B and -S settings ... regards, tom lane
> The trick for the date test would be to have a functional index on
> date(a.created).  I'm not sure how bright 6.5.* is about this, but
> it definitely works in 7.0:
>
> create table foo (f1 datetime);
>
> -- a straight index on f1 is no help:
> create index foof1 on foo(f1);
> explain select * from foo where f1::date = '05-01-2000';
> NOTICE:  QUERY PLAN:
>
> Seq Scan on foo  (cost=0.00..25.00 rows=10 width=8)
>
> -- but an index on date(f1) is:
> create index foof1date on foo(date(f1));
> explain select * from foo where f1::date = '05-01-2000';
> NOTICE:  QUERY PLAN:
>
> Index Scan using foof1date on foo  (cost=0.00..8.16 rows=10 width=8)

Thanks!

> > select * from applicants as a,applicants_states as s where a.firstname ~*
> > '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0
>
> Again, the ~* clause is not indexable as-is, but the rstate clause
> would be if you have an index on s.rstate --- however, I imagine that
> it wouldn't be very selective, either, so it might not be worth the
> trouble.  Changing the query to make the firstname part be indexable
> could be a win.  You also need to look at how the join between a and s
> is being done.  How big are these tables, anyway?

ipa=> select count(app_id) from applicants_states;
 count
--------
 244367

ipa=> select count(app_id) from applicants;
 count
-------
  9791

> > .... There are 63 fields in the 'applicants' table, all of which are
> > searchable. Would it be a good or bad thing to index all fields that are
> > searchable?
>
> A lot of indexes will hurt your insert/update/delete times, so I
> wouldn't recommend having a whole bunch of indexes unless searches are
> far more frequent than changes.  What you want is a few well-chosen
> indexes that match the commonly used kinds of WHERE clauses in your
> query mix.

It's basically a search engine, so yes, searching is FAR more frequently done than inserts/updates/deletes.

> Increase postmaster's -B and -S settings ...

I will. Thanks!

-Mitch Vincent
"Mitch Vincent" <mitch@huntsvilleal.com> writes: >> You also need to look at how the join between a and s >> is being done. How big are these tables, anyway? > ipa=> select count(app_id) from applicants_states; > count > ------ > 244367 > ipa=> select count(app_id) from applicants; > count > ----- > 9791 Now I'm confused --- what's the data model here? I guess each applicants row must match many entries in applicants_states? Anyway, I suspect you definitely want to avoid a nested-loop join :-). It'd be fairly reasonable for the system to use either hash or merge join, I think. What does EXPLAIN show that the system is actually doing with this query? >>>> .... There are 63 fields in the 'applicants' table, all of which are >>>> searchable. Would it be a good or bad thing to index all fields that are >>>> searchable? >> >> A lot of indexes will hurt your insert/update/delete times, so I >> wouldn't recommend having a whole bunch of indexes unless searches are >> far more frequent than changes. What you want is a few well-chosen >> indexes that match the commonly used kinds of WHERE clauses in your >> query mix. > It's basically a search engine so yes, searching is FAR more frequently done > than inserts/updates/deletes Well, there's still a cost to having a lot of seldom-used indexes, because the planner has to sit there and consider whether to use each one for each query. So I'd still recommend looking at your mix of queries and only creating indexes that match reasonably commonly-used WHERE clauses. regards, tom lane
> > ipa=> select count(app_id) from applicants_states;
> >  count
> > --------
> >  244367
>
> > ipa=> select count(app_id) from applicants;
> >  count
> > -------
> >   9791
>
> Now I'm confused --- what's the data model here?  I guess each applicants
> row must match many entries in applicants_states?

Well, that's one possible search a person could do. applicants is the 63-field table that holds general info about the person.

applicants_states is a table having only two fields --> app_id int4, rstate varchar(2) -- this holds all the state abbreviations of the states that an applicant will relocate to. It was either break it out into a different table or make a very large varchar() field in the applicant table that would most of the time be totally blank, but would have to be able to hold every state abbreviation in there, including a delimiter (comma or something) between each one.. We broke it out into another table so each applicant can have a virtually unlimited number of states (or other countries) in the database. So for each applicant there could be 53 records in the applicants_states table.. (there are 53 different abbreviations that are valid in this application).. Clear as mud? :-)

Looking back I think that it would have almost been better to make a 150-200 character varchar field -- perhaps not though..

> Anyway, I suspect you definitely want to avoid a nested-loop join :-).

*cough cough cough* I used to have a subselect there -- in fact you smacked me around and told me to change it!

> It'd be fairly reasonable for the system to use either hash or merge
> join, I think.  What does EXPLAIN show that the system is actually
> doing with this query?

6.5.3 :

ipa=> explain select * from applicants as a,applicants_states as s where a.firstname ~*
ipa-> '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0;
NOTICE:  QUERY PLAN:

Nested Loop  (cost=1693.76 rows=6 width=615)
  ->  Seq Scan on applicants a  (cost=1554.71 rows=50 width=599)
  ->  Index Scan using applicants_states_app_id on applicants_states s  (cost=2.78 rows=1023 width=16)

7.0RC3 (with the exact same data) :

ipa=# explain select * from applicants as a,applicants_states as s where a.firstname ~*
ipa-# '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0;
NOTICE:  QUERY PLAN:

Merge Join  (cost=0.00..29675.86 rows=2397 width=615)
  ->  Index Scan using applicants_states_app_id on applicants_states s  (cost=0.00..23062.15 rows=2445 width=16)
  ->  Index Scan using applicants_app_id on applicants a  (cost=0.00..6581.91 rows=98 width=599)

I did get two errors importing data from the 6.5.3 database into the 7.0 database :

ERROR:  DefineIndex: opclass "date_ops" does not accept datatype "timestamp"
ERROR:  DefineIndex: opclass "date_ops" does not accept datatype "timestamp"

And I have several fields of type 'datetime' in 6.5.3 which seem to get translated to 'timestamp' in 7.0 -- then I try to index them using date_ops..

-Mitch
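For reference, a rough sketch of the layout Mitch is describing (only app_id and rstate are named in the thread; the other column types and widths here are assumptions):

create table applicants (
    app_id    int4,           -- plus roughly 60 more searchable columns
    firstname varchar(32),    -- assumed type/width
    created   datetime
);

create table applicants_states (
    app_id int4,              -- matches applicants.app_id
    rstate varchar(2)         -- one state the applicant will relocate to
);

-- the join-column index that appears in the EXPLAIN output above:
create index applicants_states_app_id on applicants_states (app_id);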
"Mitch Vincent" <mitch@huntsvilleal.com> writes: > applicants_states is a table having only two fields --> app_id int4, rstate > varchar(2) -- this holds all the state abbreviations of the states that an > applicant will relocate too. Ah, got it. If we had better search capabilities on arrays, you could have stored this within the applicants table as an array of char(2) ... but as is, I suspect you did the right thing to make it a second table. >> It'd be fairly reasonable for the system to use either hash or merge >> join, I think. What does EXPLAIN show that the system is actually >> doing with this query? > 6.5.3 : [ nested loop ] > 7.0RC3 (with the exact same data) : [ merge join ] So, may I have the envelope please? What's the timings? > I did get two errors importing data from the 6.5.3 database into the 7.0 > database : > ERROR: DefineIndex: opclass "date_ops" does not accept datatype "timestamp" > ERROR: DefineIndex: opclass "date_ops" does not accept datatype "timestamp" > And I have several fields of type 'datetime' in 6.5.3 which seem to get > translated to 'timestamp' in 7.0 -- then I try to index them using > date_ops.. Oh, that's interesting. I doubt that date_ops will work very well on timestamp data (or on its predecessor datetime). But 7.0 is the first version that actually checks whether your requested index operators are compatible with the column datatype --- previous versions would blindly do what you told them to, and very possibly coredump depending on what the datatypes in question where. I wonder if that mistake was causing some of the instability you had with 6.5? regards, tom lane
> > 6.5.3 : [ nested loop ]
> > 7.0RC3 (with the exact same data) : [ merge join ]
>
> So, may I have the envelope please?  What's the timings?

Eh'? I cut and pasted everything that was printed... Here it is again, with both the beginning and ending prompts :-)

7.0 :

ipa=# explain select * from applicants as a,applicants_states as s where a.firstname ~*
'^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0;
NOTICE:  QUERY PLAN:

Hash Join  (cost=1355.82..5943.73 rows=17 width=615)
  ->  Seq Scan on applicants_states s  (cost=0.00..4492.54 rows=2350 width=16)
  ->  Hash  (cost=1355.54..1355.54 rows=112 width=599)
        ->  Seq Scan on applicants a  (cost=0.00..1355.54 rows=112 width=599)

EXPLAIN
ipa=#

6.5.3 :

ipa=> explain select * from applicants as a,applicants_states as s where a.firstname ~*
'^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0;
NOTICE:  QUERY PLAN:

Nested Loop  (cost=1693.76 rows=6 width=615)
  ->  Seq Scan on applicants a  (cost=1554.71 rows=50 width=599)
  ->  Index Scan using applicants_states_app_id on applicants_states s  (cost=2.78 rows=1023 width=16)

EXPLAIN
ipa=>

> Oh, that's interesting.  I doubt that date_ops will work very well on
> timestamp data (or on its predecessor datetime).  But 7.0 is the first
> version that actually checks whether your requested index operators are
> compatible with the column datatype --- previous versions would blindly
> do what you told them to, and very possibly coredump depending on what
> the datatypes in question were.  I wonder if that mistake was causing
> some of the instability you had with 6.5?

It's very likely that had to do with the stability problems; I indexed all the datetime fields like that..

I indexed the datetime fields as you mentioned previously in your email and WOW, the speed improvement was crazy. It's damn near instant if you're searching just by date created (as many of my lovely users do!)..

-Mitch
Hmm, something else that I see that might break some code between 6.5.* and 7.0 (it did mine) is the fact that dates and date-times are returned the reverse of what they used to be. I make comparisons of the literal string in a few places, and it didn't like 2000-03-03 20:20:02-05 as a date-time :-)

No big deal, but just out of curiosity, why the change?

- Mitch
On Wed, 3 May 2000, Mitch Vincent wrote:

> > > 6.5.3 : [ nested loop ]
> > > 7.0RC3 (with the exact same data) : [ merge join ]
> >
> > So, may I have the envelope please?  What's the timings?
>
> Eh'? I cut and pasted everything that was printed...

EXPLAIN tells how the system will do things ... if you run it without the EXPLAIN, how long does it take to run? :)
> So, may I have the envelope please?  What's the timings?

6.5.3:

StartTransactionCommand
query: select * from applicants as a,applicants_states as s where
a.firstname ~* '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10
offset 0;
ProcessQuery
! system usage stats:
!	0.436792 elapsed 0.275139 user 0.157033 system sec
!	[0.283135 user 0.173026 sys total]
!	0/0 [0/0] filesystem blocks in/out
!	0/149 [0/332] page faults/reclaims, 0 [0] swaps
!	0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent
!	0/8 [2/9] voluntary/involuntary context switches
! postgres usage stats:
!	Shared blocks:  1403 read,  0 written, buffer hit rate = 51.22%
!	Local  blocks:     0 read,  0 written, buffer hit rate =  0.00%
!	Direct blocks:     0 read,  0 written
CommitTransactionCommand
proc_exit(0) [#0]
shmem_exit(0) [#0]
exit(0)

______________________________________________________________________

7.0 :

StartTransactionCommand
query: select * from applicants as a,applicants_states as s where
a.firstname ~* '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10
offset 0;
ProcessQuery
! system usage stats:
!	1.461997 elapsed 1.224377 user 0.234618 system sec
!	[1.238219 user 0.255382 sys total]
!	0/12 [0/12] filesystem blocks in/out
!	0/60 [0/318] page faults/reclaims, 0 [0] swaps
!	0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent
!	0/22 [1/24] voluntary/involuntary context switches
! postgres usage stats:
!	Shared blocks:  2713 read,  0 written, buffer hit rate = 25.34%
!	Local  blocks:     0 read,  0 written, buffer hit rate =  0.00%
!	Direct blocks:     0 read,  0 written
CommitTransactionCommand
proc_exit(0)
shmem_exit(0)
exit(0)
/usr/local/pgsql/bin/postmaster: reaping dead processes...
/usr/local/pgsql/bin/postmaster: CleanupProc: pid 60606 exited with status 0

Sorry, I didn't get what you wanted at first. I'm down to my last brain cell today.

-Mitch
Mitch, one question ... what is the configuration (RAM, CPU, etc) of the v6.5.3 machine vs v7.0 machine? Also, how is your postmaster started up? What options? On Wed, 3 May 2000, Mitch Vincent wrote: > > So, may I have the envelope please? What's the timings? > > 6.5.3: > > StartTransactionCommand > query: select * from applicants as a,applicants_states as s where > a.firstname ~* '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 > offset 0; > ProcessQuery > ! system usage stats: > ! 0.436792 elapsed 0.275139 user 0.157033 system sec > ! [0.283135 user 0.173026 sys total] > ! 0/0 [0/0] filesystem blocks in/out > ! 0/149 [0/332] page faults/reclaims, 0 [0] swaps > ! 0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent > ! 0/8 [2/9] voluntary/involuntary context switches > ! postgres usage stats: > ! Shared blocks: 1403 read, 0 written, buffer hit rate > = 51.22% > ! Local blocks: 0 read, 0 written, buffer hit rate > = 0.00% > ! Direct blocks: 0 read, 0 written > CommitTransactionCommand > proc_exit(0) [#0] > shmem_exit(0) [#0] > exit(0) > > > ______________________________________________________________________ > > 7.0 : > > StartTransactionCommand > query: select * from applicants as a,applicants_states as s where > a.firstname ~* > '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0; > ProcessQuery > ! system usage stats: > ! 1.461997 elapsed 1.224377 user 0.234618 system sec > ! [1.238219 user 0.255382 sys total] > ! 0/12 [0/12] filesystem blocks in/out > ! 0/60 [0/318] page faults/reclaims, 0 [0] swaps > ! 0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent > ! 0/22 [1/24] voluntary/involuntary context switches > ! postgres usage stats: > ! Shared blocks: 2713 read, 0 written, buffer hit rate > = 25.34% > ! Local blocks: 0 read, 0 written, buffer hit rate > = 0.00% > ! Direct blocks: 0 read, 0 written > CommitTransactionCommand > proc_exit(0) > shmem_exit(0) > exit(0) > /usr/local/pgsql/bin/postmaster: reaping dead processes... > /usr/local/pgsql/bin/postmaster: CleanupProc: pid 60606 exited with status 0 > > > > Sorry, I didn't get what you wanted at first. I'm down to my last brain cell > today. > > > > -Mitch > Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
"Mitch Vincent" <mitch@huntsvilleal.com> writes: > Hmm, something else that I see that might break some code between 6.5.* and > 7.0 (it did mine) is the fact that dates and date times are returned the > reverse that they use to be. ISO format to be exact... not just the reverse order, but different layout. It's always been possible to break unsuspecting apps by changing the DATESTYLE; I think it's good practice for an app to set DATESTYLE for itself, if it's dependent on a particular setting. > No big deal but just out of curiosity, why the change? ISO and Y2K political correctness ;-). If you don't like it, set DATESTYLE to the old default ('Postgres' I think). It's also possible to change the system-wide default with a suitable postmaster switch, but I forget the details. regards, tom lane
Mitch Vincent wrote:
>
> 7.0 :
>
> StartTransactionCommand
> query: select * from applicants as a,applicants_states as s where
> a.firstname ~*
> '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0;
> ProcessQuery
> ! system usage stats:

With the number of records in the applicants file (and the probable distribution of firstnames), the most efficient query under 7.0 will work heaps better if you have that index on lower(a.firstname) and stop using ~* (i.e. just use ~). I think this will be especially the case with your 63 fields, ~10,000 records.

I'm guessing that a significant portion of those fields are TEXT or VARCHAR, so record size will be creeping up. The best way to see/show all of the information on this table is to:

	VACUUM VERBOSE ANALYZE applicants;

rather than to just:

	SELECT COUNT(*) FROM applicants;

because you/we will get to see the average record size as well.

Also, I believe I once read that putting all of the fixed-length fields at the start of the record will make for faster access, especially when a scan is being done against those fields. (Can someone confirm this one? :-)

Do you keep statistics regarding what fields people actually _use_ for their query matching? If you can construct indexes to support those frequent queries then you will find huge speed improvements. These speed improvements won't degrade as you add more records, either (at least not to the same extent). You'll probably also find that people will use those query terms more often once they twig to how much more quickly the results come back to them!

Another point: 7.0 will much more frequently choose indexes when you use the LIMIT clause, as you do.

Cheers,
					Andrew McMillan.
--
_____________________________________________________________________
Andrew McMillan, e-mail: Andrew@cat-it.co.nz
Catalyst IT Ltd, PO Box 10-225, Level 22, 105 The Terrace, Wellington
Me: +64 (21) 635 694, Fax: +64 (4) 499 5596, Office: +64 (4) 499 2267
"Mitch Vincent" <mitch@huntsvilleal.com> writes: >> So, may I have the envelope please? What's the timings? > 6.5.3: > ! system usage stats: > ! 0.436792 elapsed 0.275139 user 0.157033 system sec > ! [0.283135 user 0.173026 sys total] > ! 0/0 [0/0] filesystem blocks in/out > ! 0/149 [0/332] page faults/reclaims, 0 [0] swaps > ! 0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent > ! 0/8 [2/9] voluntary/involuntary context switches > ! postgres usage stats: > ! Shared blocks: 1403 read, 0 written, buffer hit rate > = 51.22% > ! Local blocks: 0 read, 0 written, buffer hit rate > = 0.00% > ! Direct blocks: 0 read, 0 written > 7.0 : > ! system usage stats: > ! 1.461997 elapsed 1.224377 user 0.234618 system sec > ! [1.238219 user 0.255382 sys total] > ! 0/12 [0/12] filesystem blocks in/out > ! 0/60 [0/318] page faults/reclaims, 0 [0] swaps > ! 0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent > ! 0/22 [1/24] voluntary/involuntary context switches > ! postgres usage stats: > ! Shared blocks: 2713 read, 0 written, buffer hit rate > = 25.34% > ! Local blocks: 0 read, 0 written, buffer hit rate > = 0.00% > ! Direct blocks: 0 read, 0 written Well, drat. Looks like 7.0's query plan is slower :-(. There's something fishy about the numbers for 6.5.3 though --- how could it have done that query with zero blocks read? Are you sure you are comparing apples to apples here? I wonder whether the 6.5 system already had the tables cached in kernel disk buffers while 7.0 was working from a standing start and had to physically go to the disk. Also, did both versions have the same -B and -S settings? regards, tom lane
> "Mitch Vincent" <mitch@huntsvilleal.com> writes: > > Well, there's still a cost to having a lot of seldom-used indexes, > because the planner has to sit there and consider whether to use each > one for each query. So I'd still recommend looking at your mix of > queries and only creating indexes that match reasonably commonly-used > WHERE clauses. > When doing insert/updates on larger tables (>500.000 entries) these indexes are also time consuming ! For our vertical attribute object storage systems we noticed, that the time for insert/updates are the critical part - they behave very linear in our test suite and they seem to be the limiting factor in our system. Marten
On Wed, 3 May 2000, Mitch Vincent wrote: > > So, may I have the envelope please? What's the timings? > > 6.5.3: > > StartTransactionCommand > query: select * from applicants as a,applicants_states as s where > a.firstname ~* '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 > offset 0; > ProcessQuery > ! system usage stats: > ! 0.436792 elapsed 0.275139 user 0.157033 system sec > ! [0.283135 user 0.173026 sys total] > ! 0/0 [0/0] filesystem blocks in/out > ! 0/149 [0/332] page faults/reclaims, 0 [0] swaps > ! 0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent > ! 0/8 [2/9] voluntary/involuntary context switches How is this to be read? I'm looking at it, and reading it as: 0 - voluntary 8 - involuntary But what about the [2/9]? Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker <scrappy@hub.org> writes: >> ! system usage stats: >> ! 0.436792 elapsed 0.275139 user 0.157033 system sec >> ! [0.283135 user 0.173026 sys total] >> ! 0/0 [0/0] filesystem blocks in/out >> ! 0/149 [0/332] page faults/reclaims, 0 [0] swaps >> ! 0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent >> ! 0/8 [2/9] voluntary/involuntary context switches > How is this to be read? I'm looking at it, and reading it as: > 0 - voluntary > 8 - involuntary > But what about the [2/9]? I believe the numbers outside brackets are for the particular query cycle, and the ones in brackets are total for the process (ie, total since backend start). I didn't design the printout format ;-) ... not real sure what the point is of printing the total-since-start numbers ... regards, tom lane
On Thu, 4 May 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> >> ! system usage stats:
> >> !	0.436792 elapsed 0.275139 user 0.157033 system sec
> >> !	[0.283135 user 0.173026 sys total]
> >> !	0/0 [0/0] filesystem blocks in/out
> >> !	0/149 [0/332] page faults/reclaims, 0 [0] swaps
> >> !	0 [0] signals rcvd, 0/0 [2/2] messages rcvd/sent
> >> !	0/8 [2/9] voluntary/involuntary context switches
>
> > How is this to be read?  I'm looking at it, and reading it as:
>
> > 0 - voluntary
> > 8 - involuntary
>
> > But what about the [2/9]?
>
> I believe the numbers outside brackets are for the particular query
> cycle, and the ones in brackets are total for the process (ie, total
> since backend start).
>
> I didn't design the printout format ;-) ... not real sure what the
> point is of printing the total-since-start numbers ...

Okay, that explains that :)

Now, Mitch's results for v7.0 showed something like:

	0/12 filesystem blocks in/out

You interpreted that as 12 reads from the file system ... 'out' I would have interpreted as writes to the file system, which made zero sense ... do we have our 'in/out's backwards here?

One thing that would be nice (who wrote these stats?) would be some way to be able to determine a suitable setting for -S from this ... some way to know that an ORDER BY needed to swap to disk because it needed 32Meg when only 16Meg was allocated for it ... would help give an indication where performance could be improved by either just raising -S (in Mitch's case, where lots of RAM is available) or more RAM should be added ...

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker <scrappy@hub.org> writes:
> Now, Mitch's results for v7.0 showed something like:
> 	0/12 filesystem blocks in/out
> You interpreted that as 12 reads from the file system ... 'out' I would
> have interpreted as writes to the file system, which made zero sense
> ... do we have our 'in/out's backwards here?

Good point.  Writes from a SELECT are certainly possible --- the SELECT
could be writing tuple status-flag updates, if it was the first
transaction to verify commit status of tuples created by a prior
transaction.  But that again raises the issue of whether we've got
a fair comparison.  The 6.5 test apparently only saw already-marked-committed
tuples ...

			regards, tom lane
On Thu, 4 May 2000, Tom Lane wrote: > The Hermit Hacker <scrappy@hub.org> writes: > > Now, Mitch's results for v7.0 showed something like: > > 0/12 filesystem blocks in/out > > You intepreted that as 12 reads from the file system ... 'out' I would > > have interpreted as writes to the file system, which made zero sense > > ... do we have our 'in/out's backwards here? > > Good point. Writes from a SELECT are certainly possible --- the SELECT > could be writing tuple status-flag updates, if it was the first > transaction to verify commit status of tuples created by a prior > transaction. But that again raises the issue of whether we've got > a fair comparison. The 6.5 test apparently only saw already-marked- > committed tuples ... I was hoping that Mitch would have spoken up by now about it, but an email I saw from him stated that the v7.0 machine (development) wasn't as powerful as the v6.5.3 machine (production) ... that might account for it, I just don't know how much different the two machines are ... Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
At 02:36 AM 5/4/00 -0300, The Hermit Hacker wrote:
>One thing that would be nice (who wrote these stats?) would be some way to
>be able to determine a suitable setting for -S from this ... some way to
>know that an ORDER BY needed to swap to disk because it needed 32Meg when
>only 16Meg was allocated for it ... would help give an indication where
>performance could be improved by either just raising -S (in Mitch's case,
>where lots of RAM is available) or more RAM should be added ...

It would also be nice to be able to get at these via SQL, a la Oracle. Then toolkits like OpenACS could easily generate administration pages that present the stats nicely to webmasters, including perhaps putting up links to help pages. If sorting is going to disk, a link to a short page discussing raising -S could be put up.

ACS Classic, the Oracle version, has admin pages (without help; it points you to a bookshelf of Oracle DBA books instead) which display stats, etc., and it is very useful.

A lot of folks using toolkits like this will simply customize the look and feel of the web pages and, at first at least, won't know much or anything about SQL or Postgres and need all the help we can give them.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at http://donb.photo.net.
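Purely to illustrate the idea being wished for here -- no such view exists in 7.0, and the view and column names below are invented:

-- hypothetical: what an SQL-visible stats interface might look like to a toolkit
select name, value
  from pg_backend_stats
 where name in ('sort_disk_spills', 'shared_buffer_hit_rate');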
Ok, the production server is a Celeron 433, 512 Megs of PC100 ECC RAM, 2 18 Gig Ultra 160 SCSI drives (only running at 80 megs since we don't have a 64-bit PCI slot).

There is a big upgrade planned for this box when the motherboard we're waiting for comes out..

The development server is a PII450, 128 Megs of RAM and Ultra 2 SCSI drives.. However it's running several other things other than the database and webserver so it has a small load.

Sorry for not replying sooner, had a small crisis yesterday evening :-)

- Mitch

----- Original Message -----
From: The Hermit Hacker <scrappy@hub.org>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Mitch Vincent <mitch@huntsvilleal.com>; <pgsql-hackers@postgresql.org>
Sent: Thursday, May 04, 2000 1:51 AM
Subject: Re: system usage stats (Was: Re: [HACKERS] Why Not MySQL? )

> On Thu, 4 May 2000, Tom Lane wrote:
>
> > The Hermit Hacker <scrappy@hub.org> writes:
> > > Now, Mitch's results for v7.0 showed something like:
> > > 0/12 filesystem blocks in/out
> > > You interpreted that as 12 reads from the file system ... 'out' I would
> > > have interpreted as writes to the file system, which made zero sense
> > > ... do we have our 'in/out's backwards here?
> >
> > Good point.  Writes from a SELECT are certainly possible --- the SELECT
> > could be writing tuple status-flag updates, if it was the first
> > transaction to verify commit status of tuples created by a prior
> > transaction.  But that again raises the issue of whether we've got
> > a fair comparison.  The 6.5 test apparently only saw already-marked-
> > committed tuples ...
>
> I was hoping that Mitch would have spoken up by now about it, but an email
> I saw from him stated that the v7.0 machine (development) wasn't as
> powerful as the v6.5.3 machine (production) ... that might account for it,
> I just don't know how much different the two machines are ...
>
> Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
> Systems Administrator @ hub.org
> primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
>
On Thu, 4 May 2000, Mitch Vincent wrote:

> Ok, the production server is a Celeron 433, 512 Megs of PC100 ECC RAM, 2 18
> Gig Ultra 160 SCSI drives (only running at 80 megs since we don't have a
> 64-bit PCI slot).
>
> There is a big upgrade planned for this box when the motherboard we're
> waiting for comes out..
>
> The development server is a PII450, 128 Megs of RAM and Ultra 2 SCSI
> drives.. However it's running several other things other than the database
> and webserver so it has a small load.

Wait, correct me if I'm wrong, but the more powerful CPU is in your development server?

My understanding is that a Celeron is a chop'd up PII ... my first recommendation here is that if you are running a *server*, get rid of that Celeron ... from what I've been told about the difference, Celeron is a great, cheap chip for using in a desktop environment (it's what I use at home), but shy away from it in a server environment, as the speed reduction of the reduced cache alone will hurt things ...
> > Ok, the production server is a Celeron 433, 512 Megs of PC100 ECC RAM, 2 18
> > Gig Ultra 160 SCSI drives (only running at 80 megs since we don't have a
> > 64-bit PCI slot).
> >
> > There is a big upgrade planned for this box when the motherboard we're
> > waiting for comes out..
> >
> > The development server is a PII450, 128 Megs of RAM and Ultra 2 SCSI
> > drives.. However it's running several other things other than the database
> > and webserver so it has a small load.
>
> Wait, correct me if I'm wrong, but the more powerful CPU is in your
> development server?
>
> My understanding is that a Celeron is a chop'd up PII ... my first
> recommendation here is that if you are running a *server*, get rid of that
> Celeron ... from what I've been told about the difference, Celeron is a
> great, cheap chip for using in a desktop environment (it's what I use at
> home), but shy away from it in a server environment, as the speed
> reduction of the reduced cache alone will hurt things ...

Ooooooooooh you're preaching to the choir. I know, I'm arguing with someone about this as we speak.

A Celeron is basically a PII with 128k of full-speed cache. NOT a server processor, I know, but sadly I can't do much about it at this point.. It's my understanding that "we" want to wait for an AMD board that has a 64-bit PCI slot because "we" don't like the i840 chipset for some reason "we" can't understand.

As you can tell, I'm a bit upset about the whole thing..

-Mitch
On Thu, 4 May 2000, Mitch Vincent wrote: > > > Ok, the production server is a Celeron 433, 512 MEgs of PC100 ECC RAM, 2 > 18 > > > Gig Ultra 160 SCSI drives (only running at 80 megs since we dono't have > a > > > 64-bit PCI slot). > > > > > > There is a big upgrade planned for this box when the motherboard we're > > > waiting for comes out.. > > > > > > The development server is a PII450, 128 Megs of RAM and Ultra 2 SCSI > > > drives.. However it's running several other things other than the > database > > > and webserver so it has a small load. > > > > Wait, correct me if I'm wrong, but the more powerful CPU is in your > > development server? > > > > My understanding is that a Celeron is a chop'd up PII ... my first > > recommendation here is that if you are running a *server*, get rid of that > > Celeron ... from what I've been told about the difference, Celeron is a > > great, cheap chip for using in a desktop environment (its what I use at > > home), but shy away from it in a server environment, as the speed > > reduction of the reduced cache alone will hurt things ... > > Ooooooooooh you're preaching to the choir. I know, I'm argueing with someone > about this as we speak. > > A Celeron is basically a PII with 128k of full-speed cache. NOT a > server processor, I know but sadly I can't do much about it at this > point.. It's my understanding that "we" want to wait for an AMD board > that has a 64-bit PCI slot because "we" don't like the i840 chipset > for some reason "we" can't understand. Can someone out there that understands CPUs help me out here? My understanding is that Intel vs AMD has benefits depending on use. Unfortunately, I don't recall how it goes ... as a server, Intel is faster, and for graphics processing, AMD is ... or something like that?
At 10:33 AM 5/4/00 -0300, The Hermit Hacker wrote:

>Wait, correct me if I'm wrong, but the more powerful CPU is in your
>development server?
>
>My understanding is that a Celeron is a chop'd up PII ... my first
>recommendation here is that if you are running a *server*, get rid of that
>Celeron ... from what I've been told about the difference, Celeron is a
>great, cheap chip for using in a desktop environment (it's what I use at
>home), but shy away from it in a server environment, as the speed
>reduction of the reduced cache alone will hurt things ...

Celerons have a smaller L2 cache (128K) than PIIs (512K), but it runs full-speed rather than 1/2 speed like the PII cache. Current models aren't "chopped up" in any sense; they're the same core with a smaller but faster cache.

So, applications that have a high cache hit rate can actually run faster on the Celeron.

New Coppermine PIII's (those that end in E or are > 600 MHz) have 256K full-speed cache, the Coppermine-based Celeron II's 128K full-speed. Yes, they cut the cache size in half compared to PII's and non-E PIII's (Katmai cores), but it's full-speed, which turns out to be a win for nearly all applications. Other than cache size and FSB/memory bus speed the new Celerons and PIII's are identical.

All Celerons run with 66 MHz FSB and RAM, current Coppermines with 100 MHz RAM (even those with a 133 MHz front-side bus) or spendy RDRAM, which almost no one is buying.

So, what's the bottom line? The numbers don't tell us much, though I still think Tom's right that the PG 7.0 one is really slower. You just can't say how MUCH slower.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at http://donb.photo.net.
Well well. Mitch gets his dual PIII board/processors -- I guess I sold the idea to the boss-man :-) - Mitch ----- Original Message ----- From: Don Baccus <dhogaza@pacifier.com> To: The Hermit Hacker <scrappy@hub.org>; Mitch Vincent <mitch@huntsvilleal.com> Cc: <pgsql-hackers@postgresql.org> Sent: Thursday, May 04, 2000 9:55 AM Subject: Re: system usage stats (Was: Re: [HACKERS] Why Not MySQL? ) > At 10:33 AM 5/4/00 -0300, The Hermit Hacker wrote: > > >Wait, correct me if I'm wrong, but the more powerful CPU is in your > >development server? > > > >My understanding is that a Celeron is a chop'd up PII ... my first > >recommendation here is that if you are running a *server*, get rid of that > >Celeron ... from what I've been told about the difference, Celeron is a > >great, cheap chip for using in a desktop environment (its what I use at > >home), but shy away from it in a server environment, as the speed > >reduction of the reduced cache alone will hurt things ... > > Celerons have a smaller L2 cache (128K) than PIIs (512K), but it runs > full-speed rather than 1/2 speed like the PII cache. Current models > aren't "chopped up" in any sense, they're the same core with a smaller > but faster cache. > > So, applications that have a high cache hit rate can actually run faster > on the Celeron. > > New Coppermine PIII's (those that end in E or are > 600 MHz) have > 256K full-speed cache, the Coppermine-based Celeron II's 128K > full-speed. Yes, they cut the cache size in half compared to > PII's and non-E PIII's (Katmai cores) but it's full-speed, which > turns out to be a win for nearly all applications. Other than > cache size and FSB/memory bus speed the new Celerons and PIII's > are identical. > > All Celerons run with 66 MHz FSB and RAM, current Coppermines with > 100 MHz RAM (even those with a 133MHz front-side bus) or spendy > RDRAM which almost no one is buying. > > So, what's the bottom line? The numbers don't tell us much, > though I still think Tom's right that the PG7.0 one is really > slower. You just can't say if how MUCH slower. > > > > - Don Baccus, Portland OR <dhogaza@pacifier.com> > Nature photos, on-line guides, Pacific Northwest > Rare Bird Alert Service and other goodies at > http://donb.photo.net. >
> > A Celeron is basically a PII with 128k of full-speed cache. NOT a
> > server processor, I know but sadly I can't do much about it at this
> > point.. It's my understanding that "we" want to wait for an AMD board
> > that has a 64-bit PCI slot because "we" don't like the i840 chipset
> > for some reason "we" can't understand.

> Can someone out there that understands CPUs help me out here? My
> understanding is that Intel vs AMD has benefits depending on
> use. Unfortunately, I don't recall how it goes ... as a server, Intel is
> faster, and for graphics processing, AMD is ... or something like that?

From what I've read, the extra cache in the PII/III gives you a 5% boost over the Celeron (I'll guess more for some server apps). Intel still sells the Xeon chips, which have a cache twice as big as the PII/III, but I'm not sure the clock has kept pace and it was always overpriced wrt performance.

I don't remember which way the Intel/AMD thing goes, but most folks won't notice a 5% difference in speed.

Not that anyone asked, but imho the best price/performance x86 machine has always been a dual processor box one or two clock jumps behind the fastest available. You get ~80% more performance for ~5% more cost than a uniprocessor at the fastest speed. I haven't looked recently to see if there are now uniprocessor machines at the low end that can beat the price/performance of the dual-processor setup.

                      - Thomas

--
Thomas Lockhart				lockhart@alumni.caltech.edu
South Pasadena, California
At 10:54 AM 5/4/00 -0300, The Hermit Hacker wrote: >Can someone out there that understands CPUs help me out here? My >understanding is that Intel vs AMD has benefits depending on >use. Unfortunately, I don't recall how it goes ... as a server, Intel is >faster, and for graphics processing, AMD is ... or something like that? First, take a look at my earlier post (that I didn't send to the list because it seemed irrelevant) about some basic issues regarding the i840. For servers, I think the bottom line at the moment is that the Athlon isn't available in SMP, and won't be 'til later this year. And the Athlon boards haven't been out long, and they're tricky to make, and require high quality power supplies. All and all, I'd look closely at an Athlon for a personal workstation but not a server at this point. For a low-end server (say, static web pages) an i810 or i810E with on-board video is a compact solution that fits nicely in 2U rack cases. A Celeron 366 does a great job for this kind of application, cheaply (see http://donb.photo.net for an example of a C366 at work). Of course, you can only buy faster ones now, they'll do the job even better. For a high-end db server, you probably want an SMP board, I'd guess. Even if you only put in a single processor, this gives the option of adding a second one if load grows and you've got enough RAM, etc. That rules out the Athlon at the moment. As far as head-to-head comparisons of current PIII coppermines and Athlons...the Athlon has 1/2 or 2/5 speed cache (slower speeds at the high end 750 MHz+ I believe), the coppermine full-speed, on-die cache. But the Athlon has more L1 cache. And the Athlon has faster FPP, but more graphics stuff is optimized for Intel. Price performance wise, the Athlon smokes the coppermines as you can buy a much faster part for the same price. Absolute performance wise, in theory a 1 GHz coppermine beats a 1 GHz Athlon for almost anything, but you can't get them unless your name is Michael Dell, etc. For build-your-own types or small systems integrators they don't really exist. By fall, it won't matter. The new generation Intel chips start showing up, Athlon on-die full-speed cache version shows up making the current part look like a slouch, and new generation AMD parts start showing up. We'll all realize our current computers suck, even though they've got more power than most of us can figure out how to use. And ... maybe an Intel solution to the i820/i840 fiasco. We can always hope. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
On Thu, 4 May 2000, Thomas Lockhart wrote: > > > A Celeron is basically a PII with 128k of full-speed cache. NOT a > > > server processor, I know but sadly I can't do much about it at this > > > point.. It's my understanding that "we" want to wait for an AMD board > > > that has a 64-bit PCI slot because "we" don't like the i840 chipset > > > for some reason "we" can't understand. > > Can someone out there that understands CPUs help me out here? My > > understanding is that Intel vs AMD has benefits depending on > > use. Unfortunately, I don't recall how it goes ... as a server, Intel is > > faster, and for graphics processing, AMD is ... or something like that? > > >From what I've read, the extra cache in the PII/III gives you a 5% > boost over the Celeron (I'll guess more for some server apps). Intel > still sells the Xeon chips, which have a cache twice as big as the > PII/III, but I'm not sure the clock has kept pace and it was always > overpriced wrt performance. > > I don't remember which way the Intel/AMD thing goes, but most folks > won't notice a 5% difference in speed. > > Not that anyone asked, but imho the best price/performance x86 machine > has always been a dual processor box one or two clock jumps behind the > fastest available. You get ~80% more performance for ~5% more cost > than a uniprocessor at the fastest speed. I haven't looked recently to > see if there are now uniprocessor machines at the low end that can > beat the price/performance of the dual-processor setup. this is the sort of thing I've been moving towards as well ... went a little cheaper this last time with upgrading my home machine and have a Dual-Celeron ... quite happy with her so far ...
At 02:26 PM 5/4/00 +0000, Thomas Lockhart wrote:

>From what I've read, the extra cache in the PII/III gives you a 5%
>boost over the Celeron (I'll guess more for some server apps).

For a db server the 100 MHz memory bus of the PII/III probably wins more than 5% over the 66 MHz memory bus of the Celeron. This assumes your database is reasonably big. Lots of memory transfers going on...

>Intel
>still sells the Xeon chips, which have a cache twice as big as the
>PII/III, but I'm not sure the clock has kept pace and it was always
>overpriced wrt performance.

Katmai Xeons come with cache sizes ranging from 512K (the same as a PII/PIII Katmai) to 2MB. That cache, though, is FULL SPEED. Gets you about 10% over a PII/III Katmai for server-type benchmarks I've seen. But they're expensive. You can also go 4-way SMP with them, vs. 2-way with PII/III...

Now the new Coppermines have changed things... the Xeon Cu and PII/III Cu both have identical 256K full-speed cache. There's not much reason to buy the Xeon unless you want 4-way SMP, and Intel apparently recognizes this, because the price is only 10% higher for these new parts.

>Not that anyone asked, but imho the best price/performance x86 machine
>has always been a dual processor box one or two clock jumps behind the
>fastest available.

That's what I did, a dual PII450. I bought them (boxed, fan) for $180 each when PIII450s were $250 or so and PIII 500Es $299 (I bought one of the latter for a home workstation). So it's a PII without the latest matrix instructions for graphics? How many games will I run on my web/db server? :)

>You get ~80% more performance for ~5% more cost
>than a uniprocessor at the fastest speed. I haven't looked recently to
>see if there are now uniprocessor machines at the low end that can
>beat the price/performance of the dual-processor setup.

The Athlons do, actually ... but I wouldn't trust the mobos for a remote server at this point, not until they're proven stable. They're tricky to build and finicky about power supplies.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at http://donb.photo.net.
Don Baccus <dhogaza@pacifier.com> writes: > So, what's the bottom line? The numbers don't tell us much, > though I still think Tom's right that the PG7.0 one is really > slower. You just can't say if how MUCH slower. Actually I was a lot more concerned about disk performance than CPU speed. I notice no one's said anything about the relative speed of Mitch's two different disk setups ... regards, tom lane
On Thu, May 04, 2000 at 11:06:20AM -0400, Tom Lane wrote: > Don Baccus <dhogaza@pacifier.com> writes: > > So, what's the bottom line? The numbers don't tell us much, > > though I still think Tom's right that the PG7.0 one is really > > slower. You just can't say if how MUCH slower. > > Actually I was a lot more concerned about disk performance than CPU > speed. I notice no one's said anything about the relative speed > of Mitch's two different disk setups ... > Mitch wrote: > Ok, the production server is a Celeron 433, 512 MEgs of PC100 ECC RAM, 2 18 > Gig Ultra 160 SCSI drives (only running at 80 megs since we dono't have a > 64-bit PCI slot). > > There is a big upgrade planned for this box when the motherboard we're > waiting for comes out.. > > The development server is a PII450, 128 Megs of RAM and Ultra 2 SCSI > drives.. However it's running several other things other than the database > and webserver so it has a small load. > So, 6.5.3 is running on Ultra 160 drives, with the controller throttled to 80 MB/s, and 7.0 is running on Ultra 2 drives, which also has a controller maximum of 80 MB/s. However, the sustained transfer speed of the drives themselves are what should be limiting: if they're all relatively modern drives, 20MB/s is typical, so neither config will max out the controller. (2 drives each, right?) Ross -- Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu> NSBRI Research Scientist/Programmer Computer and Information Technology Institute Rice University, 6100 S. Main St., Houston, TX 77005
At 11:03 AM 5/4/00 -0500, Ross J. Reedstrom wrote: >So, 6.5.3 is running on Ultra 160 drives, with the controller throttled to >80 MB/s, and 7.0 is running on Ultra 2 drives, which also has a controller >maximum of 80 MB/s. However, the sustained transfer speed of the drives >themselves are what should be limiting: if they're all relatively modern >drives, 20MB/s is typical, so neither config will max out the controller. >(2 drives each, right?) Not to mention that seek times make it very difficult to max out a controller even if theoretically possible with four drives, unless you're striping and doing large transfers or lookaheads, etc. If one's got 10K drives and the other 7.2K drives, you'll certainly see a difference in transfer rate and seek time. So ... what are the disk configurations? - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
Production 10k, development 7.2k RPM :-) - Mitch ----- Original Message ----- From: Don Baccus <dhogaza@pacifier.com> To: Ross J. Reedstrom <reedstrm@wallace.ece.rice.edu>; Tom Lane <tgl@sss.pgh.pa.us> Cc: <pgsql-hackers@postgresql.org> Sent: Thursday, May 04, 2000 1:11 PM Subject: Re: system usage stats (Was: Re: [HACKERS] Why Not MySQL? ) > At 11:03 AM 5/4/00 -0500, Ross J. Reedstrom wrote: > > >So, 6.5.3 is running on Ultra 160 drives, with the controller throttled to > >80 MB/s, and 7.0 is running on Ultra 2 drives, which also has a controller > >maximum of 80 MB/s. However, the sustained transfer speed of the drives > >themselves are what should be limiting: if they're all relatively modern > >drives, 20MB/s is typical, so neither config will max out the controller. > >(2 drives each, right?) > > Not to mention that seek times make it very difficult to max out > a controller even if theoretically possible with four drives, unless > you're striping and doing large transfers or lookaheads, etc. > > If one's got 10K drives and the other 7.2K drives, you'll certainly > see a difference in transfer rate and seek time. > > So ... what are the disk configurations? > > > > - Don Baccus, Portland OR <dhogaza@pacifier.com> > Nature photos, on-line guides, Pacific Northwest > Rare Bird Alert Service and other goodies at > http://donb.photo.net. >
> Well, drat. Looks like 7.0's query plan is slower :-(. There's > something fishy about the numbers for 6.5.3 though --- how could it have > done that query with zero blocks read? Are you sure you are comparing > apples to apples here? I wonder whether the 6.5 system already had the > tables cached in kernel disk buffers while 7.0 was working from a > standing start and had to physically go to the disk. This is very possible as the 6.5.3 PG is running on the production server which is constantly being queried. >Also, did both > versions have the same -B and -S settings? I didn't specify any -B or -S settings so both are using their respective defaults.. Thanks!
On Thu, 4 May 2000, Mitch Vincent wrote:

> > Well, drat.  Looks like 7.0's query plan is slower :-(.  There's
> > something fishy about the numbers for 6.5.3 though --- how could it have
> > done that query with zero blocks read?  Are you sure you are comparing
> > apples to apples here?  I wonder whether the 6.5 system already had the
> > tables cached in kernel disk buffers while 7.0 was working from a
> > standing start and had to physically go to the disk.
>
> This is very possible as the 6.5.3 PG is running on the production server
> which is constantly being queried.
>
> > Also, did both
> > versions have the same -B and -S settings?
>
> I didn't specify any -B or -S settings so both are using their respective
> defaults..

For your machine, go with something like '-S 32768' (i.e. 32*1024) to use 32Meg of RAM for ORDER BY/GROUP BY ...

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
I was just playing with some of the suggested lower() indexes and ran into a bit of trouble, it's no doubt from my lack of understanding but still, I wanted to ask.. Do dum : ipa=# create index applicants_firstname on applicants(lower(firstname)); ERROR: DefineIndex: function 'lower(varchar)' does not exist ...that syntax is right, isn't it? Thanks! - Mitch ----- Original Message ----- From: Tom Lane <tgl@sss.pgh.pa.us> To: Mitch Vincent <mitch@huntsvilleal.com> Cc: <pgsql-hackers@postgresql.org> Sent: Wednesday, May 03, 2000 1:31 PM Subject: Re: [HACKERS] Why Not MySQL? > "Mitch Vincent" <mitch@huntsvilleal.com> writes: > > Here are some typical queries my application might generate. Please, let me > > know if you see anything that can be improved! > > > select * from applicants as a where a.created::date = '05-01-2000' and > > a.firstname ~* '^mitch' limit 10 offset 0 > > Neither of these WHERE clauses can be used with a plain-vanilla index > (I'm assuming a.created is of time datetime?), so you're getting a > simple sequential scan over the whole table --- unless the LIMIT stops > it sooner. If the table is large then you could get better performance > by arranging for an indexscan using whichever clause is likely to be > more selective (I'd be inclined to go for the date, I think, unless your > creation dates come in bunches). > > The trick for the date test would be to have a functional index on > date(a.created). I'm not sure how bright 6.5.* is about this, but > it definitely works in 7.0: > > create table foo (f1 datetime); > > -- a straight index on f1 is no help: > create index foof1 on foo(f1); > explain select * from foo where f1::date = '05-01-2000'; > NOTICE: QUERY PLAN: > > Seq Scan on foo (cost=0.00..25.00 rows=10 width=8) > > -- but an index on date(f1) is: > create index foof1date on foo(date(f1)); > explain select * from foo where f1::date = '05-01-2000'; > NOTICE: QUERY PLAN: > > Index Scan using foof1date on foo (cost=0.00..8.16 rows=10 width=8) > > If you wanted to make the scan on firstname indexable, you'd need to > make an index on lower(firstname) and then change the query to read > ... lower(a.firstname) ~ '^mitch' > or possibly > ... lower(a.firstname) ~ lower('^mitch') > if you don't want to assume the given pattern is lowercase to begin > with. (The second example will fail to be indexed under 6.5, but should > be just fine in 7.0.) ~* can't use an index under any circumstance, > but ~ can if the pattern has a left-anchored fixed prefix. > > > select * from applicants as a,applicants_states as s where a.firstname ~* > > '^mitch' and s.rstate='AL' and s.app_id=a.app_id limit 10 offset 0 > > Again, the ~* clause is not indexable as-is, but the rstate clause > would be if you have an index on s.rstate --- however, I imagine that > it wouldn't be very selective, either, so it might not be worth the > trouble. Changing the query to make the firstname part be indexable > could be a win. You also need to look at how the join between a and s > is being done. How big are these tables, anyway? > > > .... There are 63 fields in the 'applicants' table, all of which are > > searchable. Would it be a good or bad thing to index all fields that are > > searchable? > > A lot of indexes will hurt your insert/update/delete times, so I > wouldn't recommend having a whole bunch of indexes unless searches are > far more frequent than changes. What you want is a few well-chosen > indexes that match the commonly used kinds of WHERE clauses in your > query mix. 
> > > I'd love some pointers! This machine has lots-n-lots of memory. I'd love to > > make postgre use more than normal if it would get me better speed! > > Increase postmaster's -B and -S settings ... > > regards, tom lane >
"Mitch Vincent" <mitch@huntsvilleal.com> writes: > ipa=# create index applicants_firstname on applicants(lower(firstname)); > ERROR: DefineIndex: function 'lower(varchar)' does not exist > ...that syntax is right, isn't it? Hmm, that's annoying. I guess you are going to have to make that field be of type text. Actually, since text and varchar look the same under the hood, the existing lower() code would work just fine on varchar. One fix for this would be to add a pg_proc entry for lower(varchar), which you could do by hand if you wanted: regression=# create index f1lower on f1v (lower(f1)); ERROR: DefineIndex: function 'lower(varchar)' does not exist regression=# create function lower(varchar) returns text as 'lower' regression-# language 'internal' with (iscachable); CREATE regression=# select * from pg_proc where proname = 'lower';proname | proowner | prolang | proisinh | proistrusted | proiscachable| pronargs | proretset | prorettype | proargtypes | probyte_pct | properbyte_cpu | propercall_cpu | prooutin_ratio| prosrc | probin ---------+----------+---------+----------+--------------+---------------+----------+-----------+------------+-------------+-------------+----------------+----------------+----------------+--------+--------lower | 256 | 11 | f | t | t | 1 | f | 25 | 25 | 100 | 0 | 0 | 100 | lower | -lower | 256 | 11 | f | t | t | 1 | f | 25 | 1043 | 100 | 0 | 0 | 100 | lower | - (2 rows) -- ok, looks like I got it right ... regression=# create index f1lower on f1v (lower(f1)); CREATE This will be a tiny bit slower than if the function were really truly built-in, but it should work well enough. But since type varchar is considered binary-compatible with type text, you shouldn't have had to create the extra function entry. It looks like the indexing routines do not pay attention to binary type compatibility when looking up functions for functional indexes. I'm not going to try fixing that now, but it's something that should be on the TODO list: * Functional indexes should allow functions on binary-compatible types regards, tom lane
Excellent, Tom. Thanks!

It went from

ipa=# explain select * from applicants as a where a.firstname ~* '^mitch' limit 10 offset 0;
NOTICE:  QUERY PLAN:

Seq Scan on applicants a  (cost=0.00..1355.54 rows=98 width=599)

EXPLAIN
ipa=#

To....

ipa=# explain select * from applicants as a where lower(a.firstname) ~ lower('^mitch') limit 10 offset 0;
NOTICE:  QUERY PLAN:

Index Scan using applicants_firstname on applicants a  (cost=0.00..228.47 rows=98 width=599)

EXPLAIN

On 7.0 RC5.

Could putting that function in there even though it's not 'built-in' cause any problems that you can think of? (Mainly worried about any kind of index corruption like I've seen before)

Thanks again!

- Mitch
"Mitch Vincent" <mitch@huntsvilleal.com> writes: > Could putting that function in there even though it's not 'built-in' cause > any problems that you can think of? (Mainly worried about any kind of index > corruption like I've seen before) No, shouldn't be a problem. The only thing non "built in" about it is that fmgr.c has to find the function by name rather than by OID (there's a shortcut for the OIDs of the standard pg_proc entries). So the lookup is a tad slower, that's all. regards, tom lane