Thread: The New Slashdot Setup (includes MySql server)

The New Slashdot Setup (includes MySql server)

From
Alessio Bragadini
Date:
Info on the new slashdot.org setup

<http://slashdot.org/article.pl?sid=00/05/18/1427203&mode=nocomment>

interesting because of the plans (meaning $$$) they have to improve
MySql, and because they are the flagship MySql site/application. 

In the comment page, replying to the usual "Why not PostgreSql?" thread
someone pointed out an extract from the MySql docs that seems to me
blatantly false
(http://slashdot.org/comments.pl?sid=00/05/18/1427203&cid=131).

-- 
Alessio F. Bragadini            alessio@albourne.com
APL Financial Services          http://www.sevenseas.org/~alessio
Nicosia, Cyprus                 phone: +357-2-750652

"It is more complicated than you think"               -- The Eighth Networking Truth from RFC 1925


Re: The New Slashdot Setup (includes MySql server)

From
Bruce Momjian
Date:
> Info on the new slashdot.org setup
> 
> <http://slashdot.org/article.pl?sid=00/05/18/1427203&mode=nocomment>
> 
> interesting because of the plans (meaning $$$) they have to improve
> MySql, and because they are the flagship MySql site/application. 
> 
> In the comment page, replying to the usual "Why not PostgreSql?" thread
> someone pointed out an extract from the MySql docs that seems to me
> blatantly false
> (http://slashdot.org/comments.pl?sid=00/05/18/1427203&cid=131).

Just finished reading the thread.  I am surprised how many people
slammed them on their MySQL over PostgreSQL decision.  People are
slamming MySQL all over the place.  :-)

Seems like inertia was the reason to stay with MySQL.  What that means
to me is that for their application space, PostgreSQL already has
superior technology, and people realize it.  This means we are on our
way up, and MySQL is, well, ....

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: The New Slashdot Setup (includes MySql server)

From
The Hermit Hacker
Date:
thanks for the pointer ... I just posted my response ... specifically
pointing out how "accurate" the MySQL docs tend to be *rofl*


On Thu, 18 May 2000, Alessio Bragadini wrote:

> Info on the new slashdot.org setup
> 
> <http://slashdot.org/article.pl?sid=00/05/18/1427203&mode=nocomment>
> 
> interesting because of the plans (meaning $$$) they have to improve
> MySql, and because they are the flagship MySql site/application. 
> 
> In the comment page, replying to the usual "Why not PostgreSql?" thread
> someone pointed out an extract from the MySql docs that seems to me
> blatantly false
> (http://slashdot.org/comments.pl?sid=00/05/18/1427203&cid=131).
> 
> -- 
> Alessio F. Bragadini            alessio@albourne.com
> APL Financial Services          http://www.sevenseas.org/~alessio
> Nicosia, Cyprus                 phone: +357-2-750652
> 
> "It is more complicated than you think"
>                 -- The Eighth Networking Truth from RFC 1925
> 

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: The New Slashdot Setup (includes MySql server)

From
Benjamin Adida
Date:
on 5/18/00 1:12 PM, Bruce Momjian at pgman@candle.pha.pa.us wrote:

> Seems like inertia was the reason to stay with MySQL.  What that means
> to me is that for their application space, PostgreSQL already has
> superior technology, and people realize it.  This means we are on our
> way up, and MySQL is, well, ....

There is this growing desire among some OpenACS people to replicate the
Slashdot functionality in an OpenACS module (probably a weekend's worth of
work). I wish I had a bit more free time to do it. It's time to show what
can be done with a real RDBMS (and a real web application environment, but
that's a different story).

-Ben



Re: The New Slashdot Setup (includes MySql server)

From
The Hermit Hacker
Date:
On Thu, 18 May 2000, Bruce Momjian wrote:

> > Info on the new slashdot.org setup
> > 
> > <http://slashdot.org/article.pl?sid=00/05/18/1427203&mode=nocomment>
> > 
> > interesting because of the plans (meaning $$$) they have to improve
> > MySql, and because they are the flagship MySql site/application. 
> > 
> > In the comment page, replying to the usual "Why not PostgreSql?" thread
> > someone pointed out an extract from the MySql docs that seems to me
> > blatantly false
> > (http://slashdot.org/comments.pl?sid=00/05/18/1427203&cid=131).
> 
> Just finished reading the thread.  I am surprised how many people
> slammed them on their MySQL over PostgreSQL decision.  People are
> slamming MySQL all over the place.  :-)
> 
> Seems like inertia was the reason to stay with MySQL.  What that means
> to me is that for their application space, PostgreSQL already has
> superior technology, and people realize it.  This means we are on our
> way up, and MySQL is, well, ....

In SlashDot's defence here ... I doubt there is much they do that would
require half of what we offer ... it's *very* little INSERT/UPDATE/DELETE
and *a lot* of SELECT ...




Re: The New Slashdot Setup (includes MySql server)

From
Alfred Perlstein
Date:
* The Hermit Hacker <scrappy@hub.org> [000518 11:51] wrote:
> On Thu, 18 May 2000, Bruce Momjian wrote:
> 
> > > Info on the new slashdot.org setup
> > > 
> > > <http://slashdot.org/article.pl?sid=00/05/18/1427203&mode=nocomment>
> > > 
> > > interesting because of the plans (meaning $$$) they have to improve
> > > MySql, and because they are the flagship MySql site/application. 
> > > 
> > > In the comment page, replying to the usual "Why not PostgreSql?" thread
> > > someone pointed out an extract from the MySql docs that seems to me
> > > blatantly false
> > > (http://slashdot.org/comments.pl?sid=00/05/18/1427203&cid=131).
> > 
> > Just finished reading the thread.  I am surprised how many people
> > slammed them on their MySQL over PostgreSQL decision.  People are
> > slamming MySQL all over the place.  :-)
> > 
> > Seems like inertia was the reason to stay with MySQL.  What that means
> > to me is that for their application space, PostgreSQL already has
> > superior technology, and people realize it.  This means we are on our
> > way up, and MySQL is, well, ....
> 
> In SlashDot's defence here ... I doubt there is much they do that would
> require half of what we offer ... it's *very* little INSERT/UPDATE/DELETE
> and *a lot* of SELECT ...

If those guys still are doing multiple selects for each page view after
at least 2 years of being around and choking on the load, they seriously
need to get a clue.  mod_perl... belch!

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."


Re: The New Slashdot Setup (includes MySql server)

From
The Hermit Hacker
Date:
okay, that is a good point ... I know what a difference in performance
the two-selects-vs-one issue can produce ...
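
To illustrate what I mean (table and column names made up), the difference
is basically between doing:

    -- two round trips per page view:
    SELECT title, body, author_id FROM stories WHERE story_id = 42;
    SELECT nickname FROM users WHERE user_id = 17;  -- the author_id fetched above

    -- versus one joined query:
    SELECT s.title, s.body, u.nickname
      FROM stories s, users u
     WHERE s.story_id = 42
       AND s.author_id = u.user_id;

Half the round trips, and one query for the database to optimize.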


On Thu, 18 May 2000, Alfred Perlstein wrote:

> * The Hermit Hacker <scrappy@hub.org> [000518 11:51] wrote:
> > On Thu, 18 May 2000, Bruce Momjian wrote:
> > 
> > > > Info on the new slashdot.org setup
> > > > 
> > > > <http://slashdot.org/article.pl?sid=00/05/18/1427203&mode=nocomment>
> > > > 
> > > > interesting because of the plans (meaning $$$) they have to improve
> > > > MySql, and because they are the flagship MySql site/application. 
> > > > 
> > > > In the comment page, replying to the usual "Why not PostgreSql?" thread
> > > > someone pointed out an extract from the MySql docs that seems to me
> > > > blatantly false
> > > > (http://slashdot.org/comments.pl?sid=00/05/18/1427203&cid=131).
> > > 
> > > Just finished reading the thread.  I am surprised how many people
> > > slammed them on their MySQL over PostgreSQL decision.  People are
> > > slamming MySQL all over the place.  :-)
> > > 
> > > Seems like inertia was the reason to stay with MySQL.  What that means
> > > to me is that for their application space, PostgreSQL already has
> > > superior technology, and people realize it.  This means we are on our
> > > way up, and MySQL is, well, ....
> > 
> > In SlashDot's defence here ... I doubt there is much they do that would
> > require half of what we offer ... it's *very* little INSERT/UPDATE/DELETE
> > and *a lot* of SELECT ...
> 
> If those guys still are doing multiple selects for each page view after
> at least 2 years of being around and choking on the load, they seriously
> need to get a clue.  mod_perl... belch!
> 
> -- 
> -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
> "I have the heart of a child; I keep it in a jar on my desk."
> 

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
Hannu Krosing
Date:
The Hermit Hacker wrote:
> 
> thanks for the pointer ... I just posted my response ... specifically
> pointing out how "accurate" the MySQL docs tend to be *rofl*

And now there is a response to your response stating the following

> 
> The MySQL people have said exactly the same sort of things about
> the PostgreSQL people. So please stop the name-calling and
> the quotes around "test", it's not going to get you anywhere. 
> 
> That being said, the standard MySQL benchmark _still_ is 30 times
> faster for MySQL 3.23 than on PostgreSQL 7.0 (with fsync turned off,
> _and_ nonstandard speed-up PostgreSQL features like VACUUM enabled,

btw, how does one "enable" vacuum ?

> I might add). The main reason seems to be some sort of failure to 
> use the index in the SELECT and UPDATE test loops on the part of 
> PostgreSQL. 
> 
> The benchmark, for the curious, works like this: 
> 
> First it creates a table with an index: 
> 
> create table bench1 (id int NOT NULL, id2 int NOT NULL, id3 int NOT NULL, dummy1 char(30));
> create unique index bench1_index_ on bench1 using btree (id,id2);
> create index bench1_index_1 on bench1 using btree (id3);
> 
> Then it fills the table with 300,000 entries with unique id values. 
> 
> Then, it issues a query like this: 
> 
>         update bench1 set dummy1='updated' where id=1747 
> 
> which causes the backend to do one thousand read() calls. For each query. 

could it be that just updating 1 unique index causes 1k read()'s ?

> No wonder it's slow. An EXPLAIN query states that it's using the 
> index, though. I have no clue what happens here. I've sent this
> to the pgsql-general mailing list and have just reposted it to -hackers. 

I somehow missed it (on -hackers at least) so I repost it here

> Oh yes, the benchmark also revealed that CREATE TABLE in PostgreSQL 7.0 
> leaks about 2k of memory.

-------------------
Hannu


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Matthias Urlichs"
Date:
Hi,

Hannu Krosing:
> And now there is a response to your response stating the following
> 
That response is from me, actually. (I subscribed to -hackers two hours
ago, so I'm sorry if I missed anything.)

> > That being said, the standard MySQL benchmark _still_ is 30 times
> > faster for MySQL 3.23 than on PostgreSQL 7.0 (with fsync turned off,
> > _and_ nonstandard speed-up PostgreSQL features like VACUUM enabled,
> 
> btw, how does one "enable" vacuum ?

run-all-tests ... --fast.

The code has stuff like
  $server->vacuum(1,\$dbh) if $opt_fast and defined $server->{vacuum};

sprinkled at strategic places.

-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
Silence is the element in which great things fashion themselves.              --Thomas Carlyle


> > > That being said, the standard MySQL benchmark _still_ is 30 times
> > > faster for MySQL 3.23 than on PostgreSQL 7.0 (with fsync turned off,
> > > _and_ nonstandard speed-up PostgreSQL features like VACUUM enabled,

VACUUM is not a speed-up feature, it's a slow-down feature. It reclaims
space and that takes time. It does update system statistics which can
help performance if done after a data load or perhaps once a day.
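
For instance, on the benchmark's bench1 table the two forms would be:

    -- reclaim dead space (this also refreshes the page/row counts in pg_class)
    VACUUM bench1;

    -- reclaim space and additionally gather per-column statistics for the planner
    VACUUM ANALYZE bench1;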

But "sprinkling the code" with vacuum sounds like a big performance
killer. Hope you are not counting vacuum as part of your 1000 read()
calls.


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Matthias Urlichs"
Date:
Hi,

Chris:
> VACUUM is not a speed-up feature, it's a slow-down feature. It reclaims
> space and that takes time. It does update system statistics which can
> help performance if done after a data load or perhaps once a day.
> 
OK, thanks for the clarification.

> But "sprinkling the code" with vacuum sounds like a big performance
> killer. Hope you are not counting vacuum as part of your 1000 read()
> calls.
> 
Nonono, the 1000 read() calls are triggered by a simple INSERT or UPDATE
call. They actually scan the pg_index table of the benchmark database. 

Why they do that is another question entirely. (a) these tables should
have indices, and (b) whatever postgres wants to know should have been
cached someplace. Oh yes, (c) what's in pg_index that needs to be 4
MBytes big?

-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
Man is the only animal that laughs and weeps; for he is
the only animal that is struck with the difference between
what things are and what they ought to be.                       -- William Hazlitt (1778-1830)


Matthias Urlichs wrote:

> Nonono, the 1000 read() calls are triggered by a simple INSERT or UPDATE
> call. They actually scan the pg_index table of the benchmark database.

Does this only happen on the first call to INSERT/UPDATE after
connecting to the database, or does it happen with all subsequent calls
too?


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Matthias Urlichs"
Date:
Hi,

Chris:
> Matthias Urlichs wrote:
> 
> > Nonono, the 1000 read() calls are triggered by a simple INSERT or UPDATE
> > call. They actually scan the pg_index table of the benchmark database.
> 
> Does this only happen on the first call to INSERT/UPDATE after
> connecting to the database, or does it happen with all subsequent calls
> too?
> 
All of them. Whatever the server is looking up here, it's _not_ cached.

-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
GRITCH 1. n. A complaint (often caused by a GLITCH (q.v.)).  2. v. To complain.
Often verb-doubled: "Gritch gritch".  3. Glitch.
                                        -- From the AI Hackers' Dictionary


RE: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On
> Behalf Of Matthias Urlichs
> 
> Why they do that is another question entirely. (a) these tables should
> have indices, and (b) whatever postgres wants to know should have been
> cached someplace. Oh yes, (c) what's in pg_index that needs to be 4
> MBytes big?
>

What does 'vacuum pg_index' show ?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
Hiroshi Inoue
Date:
Matthias Urlichs wrote:

> Hi,
>
> Chris:
> > Matthias Urlichs wrote:
> >
> > > Nonono, the 1000 read() calls are triggered by a simple INSERT or UPDATE
> > > call. They actually scan the pg_index table of the benchmark database.
> >
> > Does this only happen on the first call to INSERT/UPDATE after
> > connecting to the database, or does it happen with all subsequent calls
> > too?
> >
> All of them. Whatever the server is looking up here, it's _not_ cached.
>

Maybe the shared buffer isn't large enough to keep all the (4.1MB of) pg_index pages.
So it would read the pages from disk every time.
Unfortunately pg_index has no index for scanning the index entries of a relation now.

However why is pg_index so large ?
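
(The relpages/reltuples columns in pg_class would show it, at least once
a vacuum has refreshed them:)

    SELECT relname, relpages, reltuples
      FROM pg_class
     WHERE relname = 'pg_index';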

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp




Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Matthias Urlichs"
Date:
Hi,

Hiroshi Inoue:
> What does 'vacuum pg_index' show ?
> 
test=> vacuum pg_index;
NOTICE:  Skipping "pg_index" --- only table owner can VACUUM it
VACUUM

OK, so I suppose I should do it as the postgres user...
test=> vacuum pg_index;
VACUUM

The debug output says:
DEBUG:  --Relation pg_index--
DEBUG:  --Relation pg_index--
DEBUG:  Pages 448: Changed 0, reaped 448, Empty 0, New 0; Tup 34: Vac 21443, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 164, MaxLen 164; Re-using: Free/Avail. Space 3574948/3567176; EndEmpty/Avail. Pages 0/447. CPU 0.46s/0.00u sec.
DEBUG:  Index pg_index_indexrelid_index: Pages 86; Tuples 34: Deleted 21443. CPU 0.05s/0.36u sec.
DEBUG:  Rel pg_index: Pages: 448 --> 1; Tuple(s) moved: 2. CPU 0.03s/0.03u sec.
DEBUG:  Index pg_index_indexrelid_index: Pages 86; Tuples 34: Deleted 2. CPU 0.01s/0.00u sec.

... which helped. A lot.

Thanks, everybody. The first quick benchmark run I did afterwards states
that PostgreSQL is now only half as fast as MySQL, instead of the factor
of 30 seen previously, on the MySQL benchmark test.  ;-)

-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
Dorian Graying: The unwillingness to gracefully allow one's body to show signs
of aging.                                -- Douglas Coupland, Generation X


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Matthias Urlichs"
Date:
Hi,

Hiroshi Inoue:
> 
> Maybe the shared buffer isn't large enough to keep all the (4.1MB of) pg_index pages.

That seems to be the case.

> So it would read the pages from disk every time. Unfortunately pg_index
> has no index for scanning the index entries of a relation now.
> 
Well, it's reasonable that you can't keep an index on the table which
states what the indices are. ;-)

... on the other hand, Apple's HFS file system stores all the information
about the on-disk locations of its files in, you guessed it, a B-tree which
is saved on disk as an (invisible) file. Thus, the thing stores the
information about where its own sectors are located inside itself.
To escape this catch-22 situation, the location of the first three
extents (which is usually all it takes anyway) is stored elsewhere.

Possibly, something like this would work with postgres too.

> However why is pg_index so large ?
> 
Creating ten thousand tables will do that to you.

Is there an option I can set to increase the appropriate cache, so that
the backend can keep the data in memory?

-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
Famous last words:       They'd never (be stupid enough to) make him a manager.


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
The Hermit Hacker
Date:
On Fri, 19 May 2000, Matthias Urlichs wrote:

> Hi,
> 
> Hiroshi Inoue:
> > What does 'vacuum pg_index' show ?
> > 
> test=> vacuum pg_index;
> NOTICE:  Skipping "pg_index" --- only table owner can VACUUM it
> VACUUM
> 
> OK, so I suppose I should do it as the postgres user...
> test=> vacuum pg_index;
> VACUUM
> 
> The debug output says:
> DEBUG:  --Relation pg_index--
> DEBUG:  Pages 448: Changed 0, reaped 448, Empty 0, New 0; Tup 34: Vac 21443, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 164, MaxLen 164; Re-using: Free/Avail. Space 3574948/3567176; EndEmpty/Avail. Pages 0/447. CPU 0.46s/0.00u sec.
> DEBUG:  Index pg_index_indexrelid_index: Pages 86; Tuples 34: Deleted 21443. CPU 0.05s/0.36u sec.
> DEBUG:  Rel pg_index: Pages: 448 --> 1; Tuple(s) moved: 2. CPU 0.03s/0.03u sec.
> DEBUG:  Index pg_index_indexrelid_index: Pages 86; Tuples 34: Deleted 2. CPU 0.01s/0.00u sec.
> 
> ... which helped. A lot.
> 
> Thanks, everybody. The first quick benchmark run I did afterwards states
> that PostgreSQL is now only half as fast as MySQL, instead of the factor
> of 30 seen previously, on the MySQL benchmark test.  ;-)

Wow, shock of shocks ... MySQL has more inaccuracies in their docs? *grin*




Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Mitch Vincent"
Date:
> ... which helped. A lot.
>
> Thanks, everybody. The first quick benchmark run I did afterwards states
> that PostgreSQL is now only half as fast as MySQL, instead of the factor
> of 30 seen previously, on the MySQL benchmark test.  ;-)

while (horse == DEAD) {
    beat();
}

... Anyway.. I can see this being true (MySQL being twice as fast as
PostgreSQL), but I don't think that MySQL being faster than PostgreSQL was
ever up for debate. When you take an RDBMS and strip out a huge number of
features, of course you're going to get a faster end product. It's just not
nearly as safe, feature-rich or easy to work with (from a programmer's
standpoint).

I looked at MySQL for my applications for all of ten seconds.... To code in
and around, MySQL just isn't a usable RDBMS for me, and I can hardly see how
it's useful for anyone doing the kind of programming I do..

What it is very good for is something like RADIUS/POP3 authentication; I
use it at my ISP to keep all my user authentication in one place... However,
the only thing I cared about there was speed, and there are all of two
things I ever do to that database: I SELECT (once per auth request) and
occasionally INSERT or UPDATE. Couple that with the fact that there are only
two or three things in the database per user (username, password and domain
for POP3 auth) -- it's just not a very complicated thing to do... I use a
SQL backend because it's very easy to maintain and I can easily write
software to manipulate the data held in the tables -- that's all.
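
(For the curious, the whole schema amounts to something like this -- names
made up:)

    CREATE TABLE pop_users (
        username  varchar(64) NOT NULL,
        domain    varchar(64) NOT NULL,
        passwd    varchar(64) NOT NULL
    );
    CREATE UNIQUE INDEX pop_users_login ON pop_users (username, domain);

    -- the one query that matters, issued on every auth request:
    SELECT passwd FROM pop_users
     WHERE username = 'joe' AND domain = 'example.com';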

With the other applications I and my company write, it's a totally different
story. I just don't see how a person can write any kind of a larger
application and not need all the features MySQL lacks...
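
The obvious one is a multi-statement transaction -- with a made-up table,
something as basic as:

    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
    UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;
    COMMIT;   -- or ROLLBACK; either way it is all-or-nothing

There's simply no way to get that atomicity without transaction support.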

I like MySQL for certain things -- however I've never considered  "MySQL vs
PostgreSQL" -- they're just two totally different databases for totally
different uses IMHO.

-Mitch






Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Matthias Urlichs"
Date:
Hi,

The Hermit Hacker:
> > Thanks, everybody. The first quick benchmark run I did afterwards states
> > that PostgreSQL is now only half as fast as MySQL, instead of the factor
> > of 30 seen previously, on the MySQL benchmark test.  ;-)
> 
> Wow, shock of shocks ... MySQL has more inaccuracies in their docs? *grin*

No, that factor of 30 was my result after running the benchmark for the
first time. Presumably, unless I skip the large_number_of_tables test,
it'll be just as slow the second time around.

The MySQL people probably didn't dig deeper into PostgreSQL's innards.
They don't seem to think it's their job to find out exactly why their
benchmark runs so slow on some other databases, and I don't particularly
fault them for that attitude.


The PostgreSQL community has an attitude too, after all.

One of these might be to answer "you must have had fsync turned on"
whenever somebody reports a way-too-slow benchmark.  In this case,
that's definitely not true.


Another attitude of the PostgreSQL developers might be to answer "run
VACUUM" whenever somebody reports performance problems. That answer is
not helpful at all WRT this benchmark, because the user who caused the
problem ("test", in my case) isn't permitted to run VACUUM on the
pg_index table.

The alternate solution would be for the backend to notice "Gee, I just
scanned a whole heap of what turned out to be empty space in this here
pg_index file, maybe it would be a good idea to call vacuum() on it."

Or, if that doesn't work, increase the buffer for holding its content.


Anyway, I fully expect to have a more reasonable benchmark result by
tomorrow, and the MySQL guys will get a documentation update. Which they
_will_ put in the next update's documentation file. Trust me.  ;-)

-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
"The so-called Christian world is contracepting itself out of existence."-- Fr. L. Kieffer, HLI Reports, August 1989,
asquoted in "The Far   Right, Speaking For Themselves," a Planned Parenthood pamphlet
 


Matthias Urlichs wrote:

> Another attitude of the PostgreSQL developers might be to answer "run
> VACUUM" whenever somebody reports performance problems. That answer is
> not helpful at all WRT this benchmark, because the user who caused the
> problem ("test", in my case) isn't permitted to run VACUUM on the
> pg_index table.

Speaking of which, why can't any user who can change meta-data, also
Vacuum meta-data ? It's not a threat to security is it?


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Michael A. Olson"
Date:
At 02:04 PM 5/19/00 +0200, you wrote:

> Well, it's reasonable that you can't keep an index on the table which
> states what the indices are. ;-)
> 
> ... on the other hand, Apple's HFS file system stores all the information
> about the on-disk locations of their files as a B-Tree in, in, you
> guessed it, a B-Tree which is saved on disk as an (invisible) file.
> Thus, the thing stores the information on where its sectors are located
> at, inside itself.
> To escape this catch-22 situation, the location of the first three
> extents (which is usually all it takes anyway) is stored elsewhere.
> 
> Possibly, something like this would work with postgres too.

This is one of several things we did at Illustra to make the backend
run faster.  I did the design and implementation, but it was a few
years ago, so the details are hazy.  Here's what I remember.

We had to solve three problems:

First, you had to be able to run initdb and bootstrap the system
without the index on pg_index in place.  As I recall, we had to
carefully order the creation of the first several tables to make
that work, but it wasn't rocket science.

Second, when the index on pg_index gets created, you need to update
it with tuples that describe it.  This is really just the same as
hard-coding the pg_attribute attribute entries into pg_attribute --
ugly, but not that bad.

Third, we had to abstract a lot of the hard-coded table scans in
the bowels of the system to call a routine that checked for the
existence of an index on the system table, and used it.  In order
for the index on pg_index to get used, its reldesc had to be nailed
in the cache.  Getting it there at startup was more hard-coded
ugliness, but you only had do to it one time.

The advantage is that you can then index a bunch more of the system
catalog tables, and on a bunch more attributes.  That produced some
surprising speedups.

This was simple enough that I'm certain the same technique would
work in the current engine.
                mike



"Matthias Urlichs" <smurf@noris.net> writes:
>>>> Nonono, the 1000 read() calls are triggered by a simple INSERT or UPDATE
>>>> call. They actually scan the pg_index table of the benchmark database.

Ohh ... pg_index is the culprit!  OK, I know exactly where that's coming
from: the planner is looking around to see what indexes might be
interesting for planning the query.  Several comments here:

1. Probably we ought to try to bypass most of the planning process for
a simple INSERT ... VALUES.  (I thought I had fixed that, but apparently
it's not getting short-circuited soon enough, if the search for indexes
is still happening.)

2. The search is not using either an index or a cache IIRC.  Needs to
be fixed but there may be no suitable index present in 7.0.

3. I have been toying with the notion of having relcache entries store
information about the indexes associated with the table, so that the
planner wouldn't have to search through pg_index at all.  The trouble
with doing that is getting the info updated when an index is added or
dropped; haven't quite figured out how to do that...
        regards, tom lane


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
Bruce Momjian
Date:
> The advantage is that you can then index a bunch more of the system
> catalog tables, and on a bunch more attributes.  That produced some
> surprising speedups.

We have indexes on all system tables that need it.  The pg_index index
was done quite easily and is new for 7.0.  A check for recursion, with a
fallback to sequential scan when looking up pg_index's own rows in
pg_index, allows it to happen.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
Bruce Momjian
Date:
I would like to see if VACUUM ANALYZE helps.

> Hi,
> 
> Hiroshi Inoue:
> > What does 'vacuum pg_index' show ?
> > 
> test=> vacuum pg_index;
> NOTICE:  Skipping "pg_index" --- only table owner can VACUUM it
> VACUUM
> 
> OK, so I suppose I should do it as the postgres user...
> test=> vacuum pg_index;
> VACUUM
> 
> The debug output says:
> DEBUG:  --Relation pg_index--
> DEBUG:  Pages 448: Changed 0, reaped 448, Empty 0, New 0; Tup 34: Vac 21443, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 164, MaxLen 164; Re-using: Free/Avail. Space 3574948/3567176; EndEmpty/Avail. Pages 0/447. CPU 0.46s/0.00u sec.
> DEBUG:  Index pg_index_indexrelid_index: Pages 86; Tuples 34: Deleted 21443. CPU 0.05s/0.36u sec.
> DEBUG:  Rel pg_index: Pages: 448 --> 1; Tuple(s) moved: 2. CPU 0.03s/0.03u sec.
> DEBUG:  Index pg_index_indexrelid_index: Pages 86; Tuples 34: Deleted 2. CPU 0.01s/0.00u sec.
> 
> ... which helped. A lot.
> 
> Thanks, everybody. The first quick benchmark run I did afterwards states
> that PostgreSQL is now only half as fast as MySQL, instead of the factor
> of 30 seen previously, on the MySQL benchmark test.  ;-)
> 
> -- 
> Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
> The quote was selected randomly. Really.       |        http://smurf.noris.de/
> -- 
> Dorian Graying:
>     The unwillingness to gracefully allow one's body to show signs
> of aging.
>             -Douglas Coupland, Generation X
> 


-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Chris <chris@bitmead.com> writes:
>> not helpful at all WRT this benchmark, because the user who caused the
>> problem ("test", in my case) isn't permitted to run VACUUM on the
>> pg_index table.

> Speaking of which, why can't any user who can change meta-data, also
> Vacuum meta-data ? It's not a threat to security is it?

No, but it is a potential route to a denial-of-service attack, because
VACUUM has to acquire an exclusive lock on the target table.  An
unprivileged user can't vacuum pg_index for the same reason he can't
lock it: he could effectively shut down all other users of that
database, at least for a while (and VACUUMs issued in a tight loop
might manage to make things pretty unusable).

The design assumption here is that VACUUMs will be run periodically by a
cron job executing as user postgres; typically once a day at a low-load
time of day is a good plan.

There has been some talk of switching away from the no-overwrite storage
manager to a more conventional overwriting manager.  That'd reduce or
eliminate the need for periodic VACUUMs.  But currently, you can't
really run a Postgres installation without 'em.
        regards, tom lane


Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> The advantage is that you can then index a bunch more of the system
>> catalog tables, and on a bunch more attributes.  That produced some
>> surprising speedups.

> We have indexes on all system tables that need it.

There isn't any fundamental reason why the planner can't be using an
index to scan pg_index; we just need to code it that way.  Right now
it's coded as a sequential scan.

Unfortunately there is no index on pg_index's indrelid column in 7.0,
so this is not fixable without an initdb.  TODO item for 7.1, I guess.
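
Schematically, the per-table lookup the planner needs amounts to the
equivalent of

    SELECT i.indexrelid, i.indkey
      FROM pg_index i, pg_class c
     WHERE i.indrelid = c.oid
       AND c.relname = 'bench1';    -- the benchmark table from this thread

and with no index on indrelid, that means a full scan of pg_index each time.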

More generally, someone should examine the other places where
heap_getnext() loops occur, and see if any of them look like performance
bottlenecks...
        regards, tom lane


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> The advantage is that you can then index a bunch more of the system
> >> catalog tables, and on a bunch more attributes.  That produced some
> >> surprising speedups.
> 
> > We have indexes on all system tables that need it.
> 
> There isn't any fundamental reason why the planner can't be using an
> index to scan pg_index; we just need to code it that way.  Right now
> it's coded as a sequential scan.
> 
> Unfortunately there is no index on pg_index's indrelid column in 7.0,
> so this is not fixable without an initdb.  TODO item for 7.1, I guess.

The reason there is no index is that it is not a unique column, and
at the time I was adding system indexes for 7.0, I was looking for
indexes that could be used for system cache lookups.  The index you are
describing returns multiple tuples, so it would be an actual index call
in the code.  I will add this to the TODO.


> 
> More generally, someone should examine the other places where
> heap_getnext() loops occur, and see if any of them look like performance
> bottlenecks...

Good idea.  Added to TODO.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Michael A. Olson"
Date:
At 12:39 PM 5/19/00 -0400, Tom Lane wrote:

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> 
> > We have indexes on all system tables that need it.
> 
> There isn't any fundamental reason why the planner can't be using an
> index to scan pg_index; we just need to code it that way.  Right now
> it's coded as a sequential scan.

Eliminating the hard-coded seqscans of catalogs in the bowels of the
system was the hardest part of the project.  As I said, it was good
to do.  It made parsing and planning queries much, much faster.
                mike



Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
Bruce Momjian
Date:
> At 12:39 PM 5/19/00 -0400, Tom Lane wrote:
> 
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > 
> > > We have indexes on all system tables that need it.
> > 
> > There isn't any fundamental reason why the planner can't be using an
> > index to scan pg_index; we just need to code it that way.  Right now
> > it's coded as a sequential scan.
> 
> Eliminating the hard-coded seqscans of catalogs in the bowels of the
> system was the hardest part of the project.  As I said, it was good
> to do.  It made parsing and planning queries much, much faster.

All the sequential catalog scans that return one row are gone.  What has
not been done is adding indexes for scans returning more than one row.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Bruce Momjian <pgman@candle.pha.pa.us> writes:
> All the sequential catalog scans that return one row are gone.  What has
> not been done is adding indexes for scans returning more than one row.

I've occasionally wondered whether we can't find a way to use the
catcaches for searches that can return multiple rows.  It'd be easy
enough to add an API for catcache that could return multiple rows given
a nonunique search key.  The problem is how to keep the catcache up to
date with underlying reality for this kind of query.  Deletions of rows
will be handled by the existing catcache invalidation mechanism, but
how can we know when some other backend has added a row that will match
a search condition?  Haven't seen an answer short of scanning the table
every time, which makes the catcache no win at all.
        regards, tom lane


Re: The New Slashdot Setup (includes MySql server)

From
Kaare Rasmussen
Date:
Hmmm.

> can be done with a real RDBMS (and a real web application environment, but
> that's a different story).

Do you happen to know one?



Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > All the sequential catalog scans that return one row are gone.  What has
> > not been done is adding indexes for scans returning more than one row.
> 
> I've occasionally wondered whether we can't find a way to use the
> catcaches for searches that can return multiple rows.  It'd be easy
> enough to add an API for catcache that could return multiple rows given
> a nonunique search key.  The problem is how to keep the catcache up to
> date with underlying reality for this kind of query.  Deletions of rows
> will be handled by the existing catcache invalidation mechanism, but
> how can we know when some other backend has added a row that will match
> a search condition?  Haven't seen an answer short of scanning the table
> every time, which makes the catcache no win at all.

Good point.  You can invalidate stuff, but how to find new stuff that
doesn't have a specific key?

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


"Michael A. Olson" <mao@sleepycat.com> writes:
> Third, we had to abstract a lot of the hard-coded table scans in
> the bowels of the system to call a routine that checked for the
> existence of an index on the system table, and used it.

The way that we've been approaching this is by switching from hard-coded
sequential scans (heap_getnext() calls) to hard-coded indexscans
(index_getnext() calls) at places where performance dictates it.

An advantage of doing it that way is that you don't have the
bootstrapping/circularity problems that Mike describes; the code doesn't
need to consult pg_index to know whether there is an index to use, it
just has the necessary info hard-coded in.  For the same reason it's
very quick.

Nonetheless it's also a pretty ugly answer.  I'd rather the code wasn't
so tightly tied to a particular set of indexes for system tables.

I was thinking about doing something like what Mike describes: replace
uses of heap_beginscan() with calls to a routine that would examine the
passed ScanKey(s) to see if there is a relevant index, and then start
either a heap or index scan as appropriate.  The circularity issue could
be resolved by having that routine have hard-coded knowledge of some of
the system-table indexes (or even all of them, which is still better
than having that knowledge scattered throughout the code).  But the
performance cost of identifying the right index based on ScanKeys gives
me pause.  It's hard to justify that per-search overhead when the
hard-coded approach works well enough.

Thoughts anyone?
        regards, tom lane


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
Bruce Momjian
Date:
> "Michael A. Olson" <mao@sleepycat.com> writes:
> > Third, we had to abstract a lot of the hard-coded table scans in
> > the bowels of the system to call a routine that checked for the
> > existence of an index on the system table, and used it.
> 
> The way that we've been approaching this is by switching from hard-coded
> sequential scans (heap_getnext() calls) to hard-coded indexscans
> (index_getnext() calls) at places where performance dictates it.
> 
> An advantage of doing it that way is that you don't have the
> bootstrapping/circularity problems that Mike describes; the code doesn't
> need to consult pg_index to know whether there is an index to use, it
> just has the necessary info hard-coded in.  For the same reason it's
> very quick.

I like hard-coded.  There aren't many of them, last time I looked. 
Maybe 5-10 that need index scan.  The rest are already done using the
catalog cache.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


RE: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On
> Behalf Of Tom Lane
> 
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> The advantage is that you can then index a bunch more of the system
> >> catalog tables, and on a bunch more attributes.  That produced some
> >> surprising speedups.
> 
> > We have indexes on all system tables that need it.
> 
> There isn't any fundamental reason why the planner can't be using an
> index to scan pg_index; we just need to code it that way.  Right now
> it's coded as a sequential scan.
> 
> Unfortunately there is no index on pg_index's indrelid column in 7.0,
> so this is not fixable without an initdb.  TODO item for 7.1, I guess.
>

I've noticed this before but haven't complained.
As far as I can see, pg_index won't get so big. In fact Matthias's case has
only 1 page after running vacuum on pg_index.  In such cases a
sequential scan is faster than an index scan, as you know.
I don't agree with increasing the number of system indexes too easily.
Though I implemented the REINDEX command to recover system
indexes, it doesn't mean index corruption is welcome.

I know another case. pg_attrdef has no index on (adrelid,attnum)
though it has an index on (adrelid).

> More generally, someone should examine the other places where
> heap_getnext() loops occur, and see if any of them look like performance
> bottlenecks...

Please don't lose the sequential scan code even where a change to
index scan is needed, because the -P option of the standalone postgres
needs sequential scans of the system tables.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp


"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
>> More generally, someone should examine the other places where
>> heap_getnext() loops occur, and see if any of them look like performance
>> bottlenecks...

> Please don't lose the sequential scan code even where a change to
> index scan is needed, because the -P option of the standalone postgres
> needs sequential scans of the system tables.

Good point.  I'd still like not to clutter the code with deciding
which kind of scan to invoke, though.  Maybe we could put the
begin_xxx routine in charge of ignoring a request for an indexscan
when -P is used.  (AFAIR there's no real difference for the calling
code, it sets up scankeys and so forth just the same either way, no?
We should just need a switching layer in front of heap_beginscan/
index_beginscan and heap_getnext/index_getnext...)
        regards, tom lane


"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
>> Unfortunately there is no index on pg_index's indrelid column in 7.0,
>> so this is not fixable without an initdb.  TODO item for 7.1, I guess.

> I've noticed this before but haven't complained.
> As far as I can see, pg_index won't get so big. In fact Matthias's case has
> only 1 page after running vacuum on pg_index.  In such cases a
> sequential scan is faster than an index scan, as you know.

True, but the differential isn't very big either when dealing with
a small table.  I think I'd rather use an index and be assured that
performance doesn't degrade drastically when the database contains
many indexes.

I've also been thinking about ways to implement the relcache-based
caching of index information that I mentioned before.  That doesn't
address the scanning problem in general but it should improve
performance for this part of the planner quite a bit.  The trick is to
ensure that other backends update their cached info whenever an index
is added or deleted.  I thought of one way to do that: force an update
of the owning relation's pg_class tuple during CREATE or DROP INDEX,
even when we don't have any actual change to make in its contents ---
that'd force a relcache invalidate cycle at other backends.  (Maybe
we don't even need to change the pg_class tuple, but just send out a
shared-cache-invalidate message as if we had.)

> I know another case. pg_attrdef has no index on (adrelid,attnum)
> though it has an index on (adrelid).

Doesn't look to me like we need an index on (adrelid,attnum), at
least not in any paths that are common enough to justify maintaining
another index.  The (adrelid) index supports loading attrdef data
into the relcache, which is the only path I'm particularly concerned
about performance of...
        regards, tom lane


I wrote:
> We should just need a switching layer in front of heap_beginscan/
> index_beginscan and heap_getnext/index_getnext...)

After refreshing my memory of how these are used, it seems that
we'd have to change the API of either the heap or index scan routines
in order to unify them like that.  Might be worth doing to maintain
code cleanliness, though.  The places Hiroshi has fixed to support
both index and seq scan look really ugly to my eyes ...
        regards, tom lane


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
Bruce Momjian
Date:
> I've noticed this before but haven't complained.
> As far as I can see, pg_index won't get so big. In fact Matthias's case has
> only 1 page after running vacuum on pg_index.  In such cases a
> sequential scan is faster than an index scan, as you know.
> I don't agree with increasing the number of system indexes too easily.
> Though I implemented the REINDEX command to recover system
> indexes, it doesn't mean index corruption is welcome.
> 
> I know another case. pg_attrdef has no index on (adrelid,attnum)
> though it has an index on (adrelid).
> 
> > More generally, someone should examine the other places where
> > heap_getnext() loops occur, and see if any of them look like performance
> > bottlenecks...
> 
> Please don't lose sequential scan stuff even when changes to
> index scan is needed because -P option of standalone postgres
> needs sequential scan for system tables.

Certainly whatever we do will be discussed.  I realize initdb is an
issue.  However, I am not sure sequential scan is faster than index scan
for finding only a few rows in the table.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
Bruce Momjian
Date:
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> >> Unfortunately there is no index on pg_index's indrelid column in 7.0,
> >> so this is not fixable without an initdb.  TODO item for 7.1, I guess.
> 
> > I've noticed this before but haven't complained.
> > As far as I can see, pg_index won't get so big. In fact Matthias's case has
> > only 1 page after running vacuum on pg_index.  In such cases a
> > sequential scan is faster than an index scan, as you know.
> 
> True, but the differential isn't very big either when dealing with
> a small table.  I think I'd rather use an index and be assured that
> performance doesn't degrade drastically when the database contains
> many indexes.

Agreed.

> 
> I've also been thinking about ways to implement the relcache-based
> caching of index information that I mentioned before.  That doesn't
> address the scanning problem in general but it should improve
> performance for this part of the planner quite a bit.  The trick is to
> ensure that other backends update their cached info whenever an index
> is added or deleted.  I thought of one way to do that: force an update
> of the owning relation's pg_class tuple during CREATE or DROP INDEX,
> even when we don't have any actual change to make in its contents ---
> that'd force a relcache invalidate cycle at other backends.  (Maybe
> we don't even need to change the pg_class tuple, but just send out a
> shared-cache-invalidate message as if we had.)

Oh, good idea.  Just invalidate the relation so we reload.

BTW, Hiroshi is the one who gave me the recursion-prevention fix for the
system index additions for 7.0.


-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


I wrote:
>>>>> Nonono, the 1000 read() calls are triggered by a simple INSERT or UPDATE
>>>>> call. They actually scan the pg_index table of the benchmark database.
>
> Ohh ... pg_index is the culprit!  OK, I know exactly where that's coming
> from: the planner is looking around to see what indexes might be
> interesting for planning the query.  Several comments here:
>
> 1. Probably we ought to try to bypass most of the planning process for
> a simple INSERT ... VALUES.  (I thought I had fixed that, but apparently
> it's not getting short-circuited soon enough, if the search for indexes
> is still happening.)


It never pays to assume you know what's happening without having looked
:-(.  It turns out the planner is not the only culprit: the executor's
ExecOpenIndices() routine *also* does a sequential scan of pg_index.
I did shortcircuit the planner's search in the INSERT ... VALUES case,
but of course the executor still must find out whether the table has
indexes.

In UPDATE, DELETE, or INSERT ... SELECT, pg_index is scanned *twice*,
once in the planner and once in the executor.  (In fact it's worse
than that: the planner scans pg_index separately for each table named
in the query.  At least the executor only does it once since it only
has to worry about one target relation.)

Definitely need to cache the indexing information...
        regards, tom lane


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
Bruce Momjian
Date:
> I wrote:
> > We should just need a switching layer in front of heap_beginscan/
> > index_beginscan and heap_getnext/index_getnext...)
> 
> After refreshing my memory of how these are used, it seems that
> we'd have to change the API of either the heap or index scan routines
> in order to unify them like that.  Might be worth doing to maintain
> code cleanliness, though.  The places Hiroshi has fixed to support
> both index and seq scan look really ugly to my eyes ...

Agreed, and I think there are a few places that have them.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
Thomas Lockhart
Date:
> The MySQL people probably didn't dig deeper into PostgreSQL's innards.
> They don't seem to think it's their job to find out exactly why their
> benchmark runs so slow on some other databases, and I don't particularly
> fault them for that attitude.

Hmm. And then whose job is it to take someone else's work and make it
accurate? If the shoe were on the other foot: if I generated a
benchmark suite and features list, and it contained major and numerous
inaccuracies, who would you expect to be responsible (or at least feel
responsible) for correcting/updating/improving it? 'Twould be me imho.

We've tried, and failed (to date) to contribute information to the
"crashme" travesty. My recollection was a ~30% error rate on
information for Postgres, and I didn't look into the stats for other
databases. Check the archives for details.

> The PostgreSQL community has an attitude too, after all.

Yup ;)

> One of these might be to answer "you must have had fsync turned on"
> whenever somebody reports a way-too-slow benchmark.  In this case,
> that's definitely not true.

I'm sorry that has been your experience. imho, that initial response
might be considered "helpful advice", not "attitude". And I'll submit
that most postings I've seen (I'm mostly on the -hackers list) are
followed up to the bitter end if the poster can state the problem
succinctly and can follow up with specific information. But I'm a
developer, so don't have the right outlook.

> Anyway, I fully expect to have a more reasonable benchmark result by
> tomorrow, and the MySQL guys will get a documentation update. Which they
> _will_ put in the next update's documentation file. Trust me.  ;-)

Fantastic! We've been down this road before, and have had little luck
in getting more than a token update of inaccuracies. Any little bit
helps.

And while you're at it, can you update their docs and web site to make
clear that transactions and atomicity are not anywhere near the
feature list of MySQL yet? TIA
                    - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


RE: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> 
> I wrote:
> > We should just need a switching layer in front of heap_beginscan/
> > index_beginscan and heap_getnext/index_getnext...)
> 
> After refreshing my memory of how these are used, it seems that
> we'd have to change the API of either the heap or index scan routines
> in order to unify them like that.  Might be worth doing to maintain
> code cleanliness, though.  The places Hiroshi has fixed to support
> both index and seq scan look really ugly to my eyes ...
>

Yes, it's ugly unfortunately.  So I hesitated to commit it for a pretty
long time. There's an attempt at such a unification in my trial
implementation of ALTER TABLE DROP COLUMN in command.c.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp


RE: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> 
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> >> Unfortunately there is no index on pg_index's indrelid column in 7.0,
> >> so this is not fixable without an initdb.  TODO item for 7.1, I guess.
> 
> I've also been thinking about ways to implement the relcache-based
> caching of index information that I mentioned before.  That doesn't
> address the scanning problem in general but it should improve
> performance for this part of the planner quite a bit.

Sounds reasonable to me.

> The trick is to
> ensure that other backends update their cached info whenever an index
> is added or deleted.  I thought of one way to do that: force an update
> of the owning relation's pg_class tuple during CREATE or DROP INDEX,
> even when we don't have any actual change to make in its contents ---
> that'd force a relcache invalidate cycle at other backends.  (Maybe
> we don't even need to change the pg_class tuple, but just send out a
> shared-cache-invalidate message as if we had.)
>

Seems CREATE INDEX already sends a shared_cache_invalidate
message. DROP INDEX doesn't ?? I'm not sure.

> > I know another case. pg_attrdef has no index on (adrelid,attnum)
> > though it has an index on (adrelid).
> 
> Doesn't look to me like we need an index on (adrelid,attnum), at
> least not in any paths that are common enough to justify maintaining
> another index.  The (adrelid) index supports loading attrdef data
> into the relcache, which is the only path I'm particularly concerned
> about performance of...
>

It seems to me that an index on (adrelid,adnum) should
exist instead of the current index. It identifies pg_attrdef.
I say *Oops* about it in my trial implementation of ALTER
TABLE DROP COLUMN.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp 



"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
>>>> I know another case. pg_attrdef has no index on (adrelid,attnum)
>>>> though it has an index on (adrelid).
>> 
>> Doesn't look to me like we need an index on (adrelid,attnum), at
>> least not in any paths that are common enough to justify maintaining
>> another index.  The (adrelid) index supports loading attrdef data
>> into the relcache, which is the only path I'm particularly concerned
>> about performance of...

> It seems to me that an index on (adrelid,adnum) should
> exist instead of the current index. It identifies pg_attrdef.
> I say *Oops* about it in my trial implementation of ALTER
> TABLE DROP COLUMN.

Right, I saw that.  But it seems to be the only place where such an
index would be useful.  The relcache-loading routines, which seem to
be the only performance-critical access to pg_attrdef, prefer an index
on adrelid only.  Is it worth maintaining a 2-column index (which is
bulkier and slower than a 1-column one) just to speed up ALTER TABLE
DROP COLUMN?
        regards, tom lane


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Matthias Urlichs"
Date:
Hi,

Tom Lane:
> It never pays to assume you know what's happening without having looked
> :-(.  It turns out the planner is not the only culprit: the executor's
> ExecOpenIndices() routine *also* does a sequential scan of pg_index.

That meshes with my observation that updates seem to do twice as many
read() calls on pg_index as inserts do.

For this test, anyway.

-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
Statistics: Numbers looking for an argument.


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Matthias Urlichs"
Date:
Hi,

Thomas Lockhart:
>
> Hmm. And then whose job is it to take someone else's work and make it
> accurate? If the shoe were on the other foot: if I generated a
> benchmark suite and features list, and it contained major and numerous
> inaccuracies, who would you expect to be responsible (or at least feel
> responsible) for correcting/updating/improving it? 'Twould be me imho.
>
Umm, there's still a difference between saying (a) "it's broken, fix
it", (b) "here's my analysis as to what exactly is broken, can you fix
it", and (c) "here's a patch that fixes it".

I get the distinct impression that most of the communication between the
PostgreSQL and MySQL people has been looking more like (a) in the
past... if I can help both projects by doing some "translation" towards
(b) and (c), if at all possible, then so much the better.

> We've tried, and failed (to date) to contribute information to the
> "crashme" travesty. My recollection was a ~30% error rate on
> information for Postgres, and I didn't look into the stats for other
> databases. Check the archives for details.
>
Attached is the current crashme output. "crash_me_safe" is off only
because of the fact that some tests go beyond available memory.
There's no sense in testing how far you can push a "SELECT a from b where
c = 'xxx(several megabytes worth of Xes)'" query when the size of a TEXT
field is limited to 32k.

Limits with '+' in front of the number say that this is the max value
tested, without implying whether higher values are OK or not.

If you have any remarks, especially about the '=no' results (i.e. you
think PostgreSQL can do that, therefore the crashme test must be wrong
somehow), tell me. Otherwise I'll forward the results to the MySQL
people next week.


The crash-me test script, BTW, is included in MySQL's sql-bench
subdirectory.

--
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
--
The real character of a man is found out by his amusements.
                                -- Joshua Reynolds

Attachment

Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
Mike Mascari
Date:
Matthias Urlichs wrote:
> Attached is the current crashme output. "crash_me_safe" is off only
> because of the fact that some tests go beyond available memory.
> There's no sense in testing how far you can push a 
> "SELECT a from b where c = 'xxx(several megabytes worth of Xes)'"
> query when the size of a TEXT field is limited to 32k.
> 
> Limits with '+' in front of the number say that this is the max value
> tested, without implying whether higher values are OK or not.
> 
> If you have any remarks, especially about the '=no' results (i.e. you
> think PostgreSQL can do that, therefore the crashme test must be wrong
> somehow), tell me. Otherwise I'll forward the results to the MySQL
> people next week.

How about:

1. alter_rename_table = no

The syntax in PostgreSQL is ALTER TABLE x RENAME TO y;

2. atomic_updates = no

Huh? Besides being paranoid about fsync()'ing transactions how is
a transaction based MVCC not atomic with respect to updates?

3. automatic_rowid = no

The description simply says Automatic rowid. Does this apply to
query result sets or to the underlying relation? If the latter,
PostgreSQL has, of course, an OID for every tuple in the
database.

4. binary_items = no

Read up on large objects...

5. connections = 32

This should, of course, be +32, since PostgreSQL can easily
handle hundreds of simultaneous connections.

6. create_table_select = no

Again. PostgreSQL supports CREATE TABLE AS SELECT (i.e. Oracle),
and SELECT INTO syntax.

7. except = no

PostgreSQL has had both INTERSECT and EXCEPT since 6.5.0 (albeit
they're slow).

I'm starting to get very tired of this. I don't see why
PostgreSQL users are obligated to get MySQL tests correct. And
I'm only 15% through the list...

Bottom line...either the test writers are ignorant or deceptive.
Either way I won't trust my data with them...

Mike Mascari


RE: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> 
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> >>>> I know another case. pg_attrdef has no index on (adrelid,attnum)
> >>>> though it has an index on (adrelid).
> >> 
> >> Doesn't look to me like we need an index on (adrelid,attnum), at
> >> least not in any paths that are common enough to justify maintaining
> >> another index.  The (adrelid) index supports loading attrdef data
> >> into the relcache, which is the only path I'm particularly concerned
> >> about performance of...
> 
> > It seems to me that an index on (adrelid,adnum) should
> > exist instead of the current index. It identifies pg_attrdef.
> > I say *Oops* about it in my trial implementation of ALTER
> > TABLE DROP COLUMN.
> 
> Right, I saw that.  But it seems to be the only place where such an
> index would be useful.  The relcache-loading routines, which seem to
> be the only performance-critical access to pg_attrdef, prefer an index
> on adrelid only.  Is it worth maintaining a 2-column index (which is
> bulkier and slower than a 1-column one) just to speed up ALTER TABLE
> DROP COLUMN?
>

I don't mind so much about the performance in this case.
The difference would be little.

Isn't it a fundamental principle to define a primary (unique
identification) constraint for each table?
I had never thought that the one and only index of pg_attrdef
isn't a unique identification index until I came across the
unexpected result of my DROP COLUMN test case.

Comments ?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Matthias Urlichs"
Date:
Hi,

Mike Mascari:
> 
> 1. alter_rename_table = no
> 
> The syntax in PostgreSQL is ALTER TABLE x RENAME TO y;
> 
They say "alter table crash_q rename crash_q1".

What does the official standard say (assuming any exists) -- is the "to"
optional or not?
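
To make the two spellings concrete (this is just an illustration, not
taken from the test script itself):

    alter table crash_q rename crash_q1;        -- what crash-me sends
    alter table crash_q rename to crash_q1;     -- what PostgreSQL accepts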

> 2. atomic_updates = no
> 
> Huh? Besides being paranoid about fsync()'ing transactions how is
> a transaction based MVCC not atomic with respect to updates?
> 
That's a misnomer. They actually mean this:
    create table crash_q (a integer not null);
    create unique index crf on crash_q(a);

    insert into crash_q values (2);
    insert into crash_q values (3);
    insert into crash_q values (1);
    update crash_q set a=a+1;

> 3. automatic_rowid = no
> 
> The description simply says Automatic rowid. Does this apply to
> query result sets or to the underlying relation? If the latter,
> PostgreSQL has, of course, an OID for every tuple in the
> database.
> 
I'll have them fix that. MySQL calls them "_rowid" and apparently tests
only for these.

> 4. binary_items = no
> 
> Read up on large objects...
> 
... with an ... erm ... let's call it "nonstandard" ... interface.

> 5. connections = 32
> 
> This, should, of course be +32, since PostgreSQL can easily
> handle hundreds of simultaneous connections.
> 
The testing code (Perl) looks like this, and it bombs after the 32nd
connection.
  for ($i=1; $i < $max_connections ; $i++)
  {
    if (!($dbh=DBI->connect($server->{'data_source'},$opt_user,$opt_password,
              { PrintError => 0})))
    {
      print "Last connect error: $DBI::errstr\n" if ($opt_debug);
      last;
    }
    $dbh->{LongReadLen}= $longreadlen; # Set retrieval buffer
    print "." if ($opt_debug);
    push(@connect,$dbh);
  }
  print "$i\n";

I do not know where that limit comes from.
It might be the DBI interface to PostgreSQL, or a runtime limit.

Anyway, $max_connections is set to 1000.

> 6. create_table_select = no
> 
> Again. PostgreSQL supports CREATE TABLE AS SELECT (i.e. Oracle),
> and SELECT INTO syntax.

Test code:
    create table crash_q SELECT * from crash_me;

Again, is the "AS" optional or not?

> 7. except = no
> 
> PostgreSQL has had both INTERSECT and EXCEPT since 6.5.0 (albeit
> they're slow).
> 
Looking at the test, we see it doing this:
    create table crash_me (a integer not null,b char(10) not null);
    insert into crash_me (a,b) values (1,'a');
    create table crash_me2 (a integer not null,b char(10) not null, c integer);
    insert into crash_me2 (a,b,c) values (1,'b',1);
    select * from crash_me except select * from crash_me2;

For what it's worth, there is at least one database which doesn't
have this restriction (i.e., that the number of columns must be
identical) (namely SOLID).

So this test needs to be split into two. I'll do that.

> I'm starting to get very tired of this. I don't see why
> PostgreSQL users are obligated to get MySQL tests correct. And
> I'm only 15% through the list...
> 
_Somebody_ has to get these things right. I'm not suggesting that it's
any obligation of yours specifically, but somebody's gotta do it, and
(IMHO) it can only be done by somebody who already knows _something_
about the database to be tested.

> Bottom line...either the test writers are ignorant or deceptive.

Or the tests are just badly written. Or they're too old and suffer from
severe bit rot.


For what it's worth, I do NOT think the people who wrote these tests
are either ignorant or deceptive. Most, if not all, of these tests
are OK when checked against at least one SQLish database.

-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
Freedom of opinion can only exist when
the government thinks itself secure.
                       -- Bertrand Russell (1872-1967)


Re: More Performance

From
"Matthias Urlichs"
Date:
Hi,

I've found another one of these performance problems in the benchmark,
related to another ignored index.

The whole thing works perfectly after a VACUUM ANALYZE on the
table.

IMHO this is somewhat non-optimal. In the absence of information
to the contrary, PostgreSQL should default to using an index if
it might be appropriate, not ignore it.

I am thus throwing away yet another benchmark run -- the query now runs
300 times faster. *Sigh* 

test=# vacuum bench1;
VACUUM
test=# \d bench1
        Table "bench1"
 Attribute |   Type   | Modifier
-----------+----------+----------
 id        | integer  | not null
 id2       | integer  | not null
 id3       | integer  | not null
 dummy1    | char(30) |
Indices: bench1_index_,
         bench1_index_1

test=# \d bench1_index_
Index "bench1_index_"
 Attribute |  Type
-----------+---------
 id        | integer
 id2       | integer
unique btree

test=# \d bench1_index_1
Index "bench1_index_1"
 Attribute |  Type
-----------+---------
 id3       | integer
btree

test=# explain update bench1 set dummy1='updated' where id=150;
NOTICE:  QUERY PLAN:

Seq Scan on bench1  (cost=0.00..6843.00 rows=3000 width=18)

EXPLAIN
test=# vacuum bench1;
VACUUM
test=# explain update bench1 set dummy1='updated' where id=150;
NOTICE:  QUERY PLAN:

Seq Scan on bench1  (cost=0.00..6843.00 rows=3000 width=18)

EXPLAIN
test=# select count(*) from bench1;
 count
--------
 300000
(1 row)

test=# select count(*) from bench1 where id = 150;
 count
-------
     1
(1 row)

test=# explain select count(*) from bench1 where id = 150;
NOTICE:  QUERY PLAN:

Aggregate  (cost=6850.50..6850.50 rows=1 width=4)
  ->  Seq Scan on bench1  (cost=0.00..6843.00 rows=3000 width=4)

EXPLAIN


***************************************************************

Related to this:

test=# explain select id from bench1 order by id;
NOTICE:  QUERY PLAN:

Sort  (cost=38259.21..38259.21 rows=300000 width=4)
  ->  Seq Scan on bench1  (cost=0.00..6093.00 rows=300000 width=4)

EXPLAIN

The basic idea to speed this one up (a lot...) would be to walk the index.

This is _after_ ANALYZE, of course.

-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
To be positive: To be mistaken at the top of one's voice.
               -- Ambrose Bierce, The Devil's Dictionary


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
Bruce Momjian
Date:
I know I am going to regret believing that I will actually make any
difference, but I am going to shoot myself anyway.

I am writing this more for the new PostgreSQL members who were not
around last time than in any belief it will make a difference on the
MySQL end.



> Hi,
> 
> Mike Mascari:
> > 
> > 1. alter_rename_table = no
> > 
> > The syntax in PostgreSQL is ALTER TABLE x RENAME TO y;
> > 
> They say "alter table crash_q rename crash_q1".
> 
> What does the official standard say (assuming any exists) -- is the "to"
> optional or not?

I don't see any RENAME in the SQL92 spec.  Now, how hard is it to do a
'man alter_table' and see what it says at the top of the screen?

> 
> > 2. atomic_updates = no
> > 
> > Huh? Besides being paranoid about fsync()'ing transactions how is
> > a transaction based MVCC not atomic with respect to updates?
> > 
> That's a misnomer. They actually mean this:
> 
>     create table crash_q (a integer not null);
>     create unique index crf on crash_q(a);
> 
>     insert into crash_q values (2);
>     insert into crash_q values (3);
>     insert into crash_q values (1);
>     update crash_q set a=a+1;

Poorly named, huh?  How do you think it got such a name?  This item was
on the crashme tests before TRANSACTION was on there.  Can you explain
how a very exotic issue got on there year(s) before transactions?
Transactions got on there only because I complained.

> 
> > 3. automatic_rowid = no
> > 
> > The description simply says Automatic rowid. Does this apply to
> > query result sets or to the underlying relation? If the latter,
> > PostgreSQL has, of course, an OID for every tuple in the
> > database.
> > 
> I'll have them fix that. MySQL calls them "_rowid" and apparently tests
> only for these.

Well, I don't see _rowid in the SQL spec either, so we are both
non-standard here, though I believe our OID is SQL3.


> 
> > 4. binary_items = no
> > 
> > Read up on large objects...
> > 
> ... with an ... erm ... let's call it "nonstandard" ... interface.

Yes.

> 
> > 5. connections = 32
> > 
> > This, should, of course be +32, since PostgreSQL can easily
> > handle hundreds of simultaneous connections.
> > 
> The testing code (Perl) looks like this, and it bombs after the 32nd
> connection.
> 
>   for ($i=1; $i < $max_connections ; $i++)
>   {
>     if (!($dbh=DBI->connect($server->{'data_source'},$opt_user,$opt_password,
>               { PrintError => 0})))
>     {
>       print "Last connect error: $DBI::errstr\n" if ($opt_debug);
>       last;
>     }
>     $dbh->{LongReadLen}= $longreadlen; # Set retrieval buffer
>     print "." if ($opt_debug);
>     push(@connect,$dbh);
>   }
>   print "$i\n";
> 
> I do not know where that limit comes from.
> It might be the DBI interface to PostgreSQL, or a runtime limit.
> 
> Anyway, $max_connections has the value to 1000.

You have to recompile the backend to increase it.  Not on the client
end.  See FAQ.

> 
> > 6. create_table_select = no
> > 
> > Again. PostgreSQL supports CREATE TABLE AS SELECT (i.e. Oracle),
> > and SELECT INTO syntax.
> 
> Test code:
>     create table crash_q SELECT * from crash_me;
> 
> Again, is the "AS" optional or not?

man create_table.  That is all it takes.  There is no standard for
this.  It is from Oracle.  Is their AS optional?  Does it really matter?

> 
> > 7. except = no
> > 
> > PostgreSQL has had both INTERSECT and EXCEPT since 6.5.0 (albeit
> > they're slow).
> > 
> Looking at the test, we see it doing this:
> 
>     create table crash_me (a integer not null,b char(10) not null);
>     insert into crash_me (a,b) values (1,'a');
>     create table crash_me2 (a integer not null,b char(10) not null, c integer);
>     insert into crash_me2 (a,b,c) values (1,'b',1);
>     select * from crash_me except select * from crash_me2;
> 
> For what it's worth, there is at least one database which doesn't
> have this restriction (i.e., that the number of columns must be
> identical) (namely SOLID).
> 
> So this test needs to be split into two. I'll do that.

So you test EXCEPT by having a different number of columns.  I can see
it now, "Hey we don't have EXCEPT.  PostgreSQL does it, but they can't
handle a different number of columns.  Let's do only that test so we
look equal."

> 
> > I'm starting to get very tired of this. I don't see why
> > PostgreSQL users are obligated to get MySQL tests correct. And
> > I'm only 15% through the list...
> > 
> _Somebody_ has to get these things right. I'm not suggesting that it's
> any obligation of yours specifically, but somebody's gotta do it, and
> (IMHO) it can only be done by somebody who already knows _something_
> about the database to be tested.
> 
> > Bottom line...either the test writers are ignorant or deceptive.
> 
> Or the tests are just badly written. Or they're too old and suffer from
> severe bit rot.
> 
> 
> For what its worth, I do NOT think the people who wrote these tests
> are either ignorant or deceptive. Most, if not all, of these tests
> are OK when checked against at least one SQLish database.

In looking at each of these items, it is impossible for me to believe
that the tests were not written by either very ignorant people ("I can't
run 'man'") or very deceptive people ("Let's make ourselves look good.").

If you view this from outside the MySQL crowd, can you see how we would
feel this way?  This is just a small example of the volumes of reasons
we have in believing this.

If you are going to publish things about other databases on your web
site, you had better do a reasonable job to see that it is accurate and
fair.  If it can't be done, take it off.  Don't leave it up and have it
be wrong, and ignore people in the past who tell you it is wrong.

It never has been fair, and I suspect never will be, because this is
hashed around every year with little change or acknowledgement.

So, yea, we have an attitude.  We are usually nice folks, so if the
majority of us have a bad attitude, there must be some basis for that
feeling, and I can tell you, the majority of us do have a bad attitude
on the topic.

I know the MySQL folks don't have a bad attitude about us, and you know,
they don't because we never did anything like this Crashme to them.  But
actually, we are tired of being pushed by an ignorant/deceptive crashme
test, and we are starting to push back.   But, you can be sure we will
never stoop to the level of the crashme test.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: More Performance

From
Bruce Momjian
Date:
> Hi,
> 
> I've found another one of these performance problems in the benchmark,
> related to another ignored index.
> 
> The whole thing works perfectly after a VACUUM ANALYZE on the
> table.
> 
> IMHO this is somewhat non-optimal. In the absence of information
> to the contrary, PostgreSQL should default to using an index if
> it might be appropriate, not ignore it.

This is an interesting idea.  So you are saying that if a column has no
vacuum analyze statistics, assume it is unique?  Or are you talking
about a table that has never been vacuumed?  Then we assume it is a
large table.  Interesting.  It would help some queries, but hurt others.
We have gone around and around on what the default stats should be.
Tom Lane can comment on this better than I can.

> 
> Related to this:
> 
> test=# explain select id from bench1 order by id;
> NOTICE:  QUERY PLAN:
> 
> Sort  (cost=38259.21..38259.21 rows=300000 width=4)
>   ->  Seq Scan on bench1  (cost=0.00..6093.00 rows=300000 width=4)
> 
> EXPLAIN
> 
> The basic idea to speed this one up (a lot...) would be to walk the index.
> 
> This is _after_ ANALYZE, of course.

But you are grabbing the whole table.  Our indexes are separate files. 
The heap is unordered, meaning a sequential scan and order by is usually
faster than an index walk unless there is a restrictive WHERE clause.

Thanks for the tip about needing an index on pg_index.  That will be in
7.1.  I remember previous crashme rounds did bring up some good info for
us, like the fact older releases couldn't handle trailing comments from
perl.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Matthias Urlichs"
Date:
Hi,

[ Sorry if this reply is much too long. I know that...]

Bruce Momjian:
> I know I am going to regret believing that I will actually make any
> difference, but I am going to shoot myself anyway.
> 
I sincerely hope/believe you're wrong.

> > What does the official standard say (assuming any exists) -- is the "to"
> > optional or not?
> 
> I don't see any RENAME in the SQL92 spec.  Now, how hard is it to do a
> 'man alter_table' and see what it says at the top of the screen?
> 
It's not a question of your manpage vs. their manpage. I can read your
manpage just fine. It's a question of whether there is something that
can be regarded as a standard on it or not. "Official" is a poor wording
in this case -- sorry.

If yes, then the test will be changed to do it the standard way.
If no, then I might have to test for both syntaxes, which is a PITA.


While I'm at it, I note that the last sentence of that manpage says
    The clauses to rename columns and tables are Postgres extensions from SQL92.

Correct me when I'm wrong, but is that really _your_ extension, or
did some other database vendor (who?) come up with it?

> > Anyway, $max_connections has the value to 1000.
> 
> You have to recompile the backend to increase it.  Not on the client
> end.  See FAQ.

I was compiling and running the backend with default options(*). That
means that the tests will show the default limits. It does this for all
the other databases in the crash-me test result suite. (Oracle:40,
mysql:100, solid:248, empress:10, interbase:10)

Anyway, the max value for PostgreSQL, without recompiling the backend,
is 1024 according to the FAQ; but there's no way an automated test can
find out _that_.

I'll add a comment ("installation default") to that test column.

(*) except for fsync, of course, in the interest of fairness.

> man create_table.  That is all it takes.  There is not standard for
> this.  It is from Oracle.  Is their AS optional?  Does it really matter?
> 
No.

What matters is that your opinion is that they are responsible for making
the test 100% accurate. Their reply to that is that many database
vendors actually provided fixes for this test instead of bitching
about how inaccurate it is, thus they feel the obligation is on your
side.

Now I am on neither side. I am, IMHO, thus in a position to ask you
about your opinion of these inaccuracies. I am going to change
the crashme test to be a whole lot more accurate WRT PostgreSQL,
I will feed these changes back to the MySQL people, and they'll
incorporate these changes into their next release. (Their head honcho
(Monty) has said so on their mailing list. I _am_ going to take him up
on it, and I can be quite obnoxious if somebody reneges on a promise.
*EVIL*GRIN* )

If your opinion is that you have a right to be annoyed about all of this
because you went through the whole thing last year, and the year before
that, and ..., ... well, I can understand your point of view.

But I honestly think that the problem is not one of either malice or
stupidity. "Different sets of priorities" and "different project
structure" are equally-valid assumptions. At least for me. Until I'm
proven wrong (IF I am).

> So you test EXCEPT by having a different number of columns.  I can see
> it now, "Hey we don't have EXCEPT.  PostgreSQL does it, but they can't
> handle a different number of columns.  Let's do only that test so we
> look equal."
> 
They might as well have written that test while checking their crash-me
script against SOLID and noting a few features MySQL doesn't have yet.
Or they may have gotten it from them in the first place.


I might add that their test lists 52 features of PostgreSQL which
MySQL doesn't have (13 functions). It also lists 122 features of MySQL
which PostgreSQL doesn't have; 78 of those are extra functions (40 of
these, just for M$-ODBC compatibility).

So it seems that overall, that crash-me test result is reasonably
balanced (39 vs. 44 non-function differences -- let's face it, adding
another function for compatibility with SQL variant FOO is one of the
easier exercises here, whatever the current value of FOO is).

The result is going to be even more balanced when I'm through with it,
but I cannot do that on my own, as I do not have enough experience with
either PostgreSQL or the various SQL standards. Thus, I'm asking.

Is that really a problem?

> If you view this from outside the MySQL crowd, can you see how we would
> feel this way?  This is just a small example of the volumes of reasons
> we have in believing this.
> 
I would like not to view this from any outside, inside, or whatever
viewpoint. My goal is to get at least some part of the petty arguments
out of the way because, in MY book at least, the _real_ "battle", such
as there is, isn't PostgreSQL against MySQL! It's more-or-less-
-open-source databases on one side and closed-source products, some of
which are the equivalent of MS Word in the database world (you know who
I'm talking about ;-) on the other side.

> It never has been fair, and I suspect never will be, because this is
> hashed around every year with little change or acknowledgement.
> 
It is about as fair as a certain comparison chart on your site has been.
It's gone now, thus as far as I'm concerned it's water under the bridge.
Besides, I'm not interested. Some of the members of this list seem
to be pretty much burned out on the whole issue -- I can live with that;
but I'm trying to do something about the problem. Don't shoot the
messenger. ;-)


-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
Lady Luck brings added income today.
Lady friend takes it away tonight.


MySQL's "crashme" (was Re: Performance)

From
Tom Lane
Date:
"Matthias Urlichs" <smurf@noris.net> writes:
>> Hmm. And then whose job is it to take someone else's work and make it
>> accurate? If the shoe were on the other foot: if I generated a
>> benchmark suite and features list, and it contained major and numerous
>> inaccuracies, who would you expect to be responsible (or at least feel
>> responsible) for correcting/updating/improving it? 'Twould be me imho.
>> 
> Umm, there's still a difference between saying (a) "it's broken, fix
> it", (b) "here's my analysis as to what exactly is broken, can you fix
> it", and (c) "here's a patch that fixes it".

Good luck.  Close analysis of the crashme test leaves an extremely bad
taste in the mouth: there are just too many cases where it's clearly
designed as a pro-MySQL advertising tool and not an honest attempt to
describe reality.  Shall we consider details?

> Attached is the current crashme output. "crash_me_safe" is off only
> because of the fact that some tests go beyond available memory.
> There's no sense in testing how far you can push a "SELECT a from b where
> c = 'xxx(several megabytes worth of Xes)'" query when the size of a TEXT
> field is limited to 32k.

I would not like to see us labeled "crashme unsafe" merely because
someone is too impatient to let the test run to conclusion.  But there's
a more interesting problem here: using stock crashme and Postgres 7.0,
on my system it's crashme that crashes and not Postgres!  The crashme
Perl script is a huge memory hog and runs into the kernel's process-size
limit long before the connected backend does.  To get it to run to
completion, I have to reduce the thing's limit on the longest query it
will try:

*** crash-me~    Sat May 20 12:28:11 2000
--- crash-me    Sat May 20 13:21:11 2000
***************
*** 104,110 ****
  #  $max_connections="+1000";       # Number of simultaneous connections
! $max_buffer_size="+16000000";   # size of communication buffer.
  $max_string_size="+8000000";    # Enough for this test
  $max_name_length="+512";        # Actually 256, but ...
  $max_keys="+64";                # Probably too big.
--- 104,110 ----
  #  $max_connections="+1000";       # Number of simultaneous connections
! $max_buffer_size="+1000000";    # size of communication buffer.
  $max_string_size="+8000000";    # Enough for this test
  $max_name_length="+512";        # Actually 256, but ...
  $max_keys="+64";                # Probably too big.

A few months ago I was able to use max_buffer_size = +2000000, but
crashme 1.43 seems to be an even worse memory hog than its predecessors.
At this setting, the Perl process tops out at about 114Mb while the
connected backend grows to 66Mb.  (I run with a process limit of 128Mb.)
To be fair, this could be Perl's fault more than crashme's.  I'm using
Perl 5.005_03 ... anyone know if more recent versions use less memory?


Now, on to some specific complaints:

> alter_drop_col=no            # Alter table drop column

While our ALTER TABLE support is certainly pretty weak, it should be
noted that this test will continue to fail even when we have ALTER TABLE
DROP COLUMN, because crashme is testing for a non-SQL-compliant syntax.

> alter_rename_table=no            # Alter table rename table

We have ALTER TABLE RENAME ... but not under the syntax crashme is
testing.  Since SQL92 doesn't specify a syntax for RENAME, there's no
absolute authority for this --- but a quick check of the comparative
crashme results at http://www.mysql.com/crash-me-choose.htmy shows that
*none* of the major commercial DBMSs "pass" this test.  Rather curious
that crashme uses a MySQL-only syntax for this test, no?

> atomic_updates=no            # atomic updates

What's actually being tested here is whether the DBMS will let you do
"update crash_q set a=a+1" in a table with a unique index on "a" and
consecutive pre-existing values.  In other words, is the uniqueness
constraint checked on a per-tuple-update basis, or deferred to end of
transaction?  It's fair to blame Postgres for not supporting a deferred
uniqueness check, but this test is extremely misleadingly labeled.
A person who hadn't examined the guts of crashme would probably think
it tests whether concurrent transactions see each others' results
atomically.
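
Spelled out with the test's own statements (the comment on where it
trips is mine):

    create table crash_q (a integer not null);
    create unique index crf on crash_q(a);
    insert into crash_q values (2);
    insert into crash_q values (3);
    insert into crash_q values (1);
    -- With per-tuple checking, bumping one row to a value another row
    -- still holds (e.g. 2 -> 3) violates the index immediately, even
    -- though the final state 2,3,4 would be perfectly legal.
    update crash_q set a=a+1;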

> automatic_rowid=no            # Automatic rowid

Test is actually looking for a system column named "_rowid".  Our OIDs
serve the same purpose, and I believe there are equivalent features in
many other DBMSes.  Again, MySQL is the only "passer" of this test,
which says more about their level of standardization than other
people's.
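
For illustration (my own query, not part of the test): every Postgres
heap tuple already carries a system column that serves the same purpose,

    select oid, a, b from crash_me;

but the script only checks for a column literally named "_rowid".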

> binary_items=no                # binary items (0x41)

We have binary literals (per the test name) and hex literals (what
it actually appears to be testing).  Unfortunately for us, ours are
SQL92-compliant syntax, and what crashme is looking for isn't.

> comment_#=no                # # as comment
> comment_--=yes            # -- as comment
> comment_/**/=yes            # /* */ as comment
> comment_//=no                # // as comment

It'd be helpful to the reader if they indicated which two of these
conventions are SQL-compliant ... of course, that might expose the
fact that MySQL isn't ...

> connections=32                # Simultaneous connections

Should probably be noted that this is just the default limit (chosen to
avoid creating problems on small systems) and can easily be raised at
postmaster start time.

> crash_me_safe=no            # crash me safe

I get "yes", and I'd *really* appreciate it if you not submit this
misleading statement.

> create_table_select=no            # create table from select

This is looking for "create table crash_q SELECT * from crash_me",
which again appears to be a MySQL-only syntax.  We have the same feature
but we want "AS" in front of the "SELECT".  Dunno how other DBMSs do it.

> date_zero=no                # Supports 0000-00-00 dates

Note this is not checking to see if the date format yyyy-mm-dd is
accepted, it's checking to see if the specific value '0000-00-00'
is accepted.  Haven't these people heard of NULL?  Another test that
only MySQL "passes".

> except=no                # except

This test is checking:
create table crash_me (a integer not null,b char(10) not null);
create table crash_me2 (a integer not null,b char(10) not null, c integer);
select * from crash_me except select * from crash_me2;
Postgres rejects it with
ERROR:  Each UNION | EXCEPT | INTERSECT query must have the same number of columns.
Unsurprisingly, hardly anyone else accepts it either.

> except_all=no                # except all

While we do not have "except all", when we do, this test will still fail
for the same reason as above.

> func_extra_not=no            # Function NOT in SELECT

What they are looking for here is "SELECT NOT 0", which Postgres rejects
as a type violation.  SQL-compliant "NOT FALSE" would work.
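
That is (an illustration of my own, not part of crashme):

    select not 0;        -- what the test sends; rejected as a type violation
    select not false;    -- the SQL-compliant spelling, which works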

BTW, while I haven't got the patience to go through the function list in
detail, quite a few functions that we actually have are shown as "not
there" because of type resolution issues.  For example they test exp()
with "select exp(1)" which fails because of ambiguity about whether
exp(float8) or exp(numeric) is wanted.  This will get cleaned up soon,
but it's not really a big problem in practice...

> having_with_alias=no            # Having on alias

Again, how curious that MySQL is the only DBMS shown as passing this
test.  Couldn't be because it violates SQL92, could it?

> insert_select=no            # insert INTO ... SELECT ...

We would pass this test if the crashme script weren't buggy: it fails
to clean up after a prior test that creates a crash_q table with
different column names.  The prior test is testing "drop table if
exists", which means the only way to be shown as having this
SQL-standard feature is to implement the not-standard "if exists".
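
Just to show the feature itself is there -- a sketch of my own, reusing
the test's table names, once the leftover table is cleaned up properly:

    drop table crash_q;
    create table crash_q (a integer not null, b char(10) not null);
    insert into crash_q (a,b) select a,b from crash_me;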

> intersect=no                # intersect
> intersect_all=no            # intersect all

See above comments for EXCEPT.

> logical_value=1            # Value of logical operation (1=1)

A rather odd result, considering that what Postgres actually returns for
"SELECT (1=1)" is 't'.  But showing the correct answer isn't one of
crashme's highest priorities...

> minus_neg=no                # Calculate 1--1

Another case where "passing" the test means accepting MySQL's version of
reality instead of SQL92's.  All the SQL-compliant DBMSs think -- is a
comment introducer, so "select a--1 from crash_me" produces an error ...
but not in MySQL ...
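
To illustrate (my example):

    -- In an SQL-compliant parser "--" starts a comment, so this sends
    -- nothing more than an unterminated "select a":
    select a--1 from crash_me;

    -- With a space it is ordinary subtraction of a negative constant:
    select a - -1 from crash_me;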

> quote_ident_with_"=no            # " as identifier quote (ANSI SQL)
> quote_ident_with_[=no            # [] as identifier quote
> quote_ident_with_`=no            # ` as identifier quote

Here at least they admit which variant is ANSI ;-).  Postgres doesn't
pass because we think 'select "A" from crash_me' should look for a
column named upper-case-A, but the column is actually named
lower-case-a.  We are not conforming to the letter of the SQL standard
here --- SQL says an unquoted name should be mapped to all upper case,
not all lower case as we do it, which is how the column got to be named
that way.  We're closer than MySQL though...
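
A quick illustration of the case-folding issue (my own sketch, using the
test's crash_me table, whose column is named lower-case a):

    select a   from crash_me;   -- unquoted: folded to lower case, finds the column
    select "a" from crash_me;   -- quoted lower case: also works
    select "A" from crash_me;   -- quoted upper case: looks for a column named A, fails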

> select_string_size=+16208        # constant string size in SELECT

I got 1048567 here, roughly corresponding to where I set max_buffer_size.
Not sure why you get a smaller answer.

> select_table_update=no            # Update with sub select

We certainly have update with sub select.  What they're looking for is
the non-SQL-compliant syntax
    update crash_q set crash_q.b=
        (select b from crash_me where crash_q.a = crash_me.a);
It works in Postgres if you remove the illegal table specification:
    update crash_q set b=
        (select b from crash_me where crash_q.a = crash_me.a);
 

> type_sql_bit=yes            # Type bit
> type_sql_bit(1_arg)=yes            # Type bit(1 arg)
> type_sql_bit_varying(1_arg)=yes        # Type bit varying(1 arg)

It should probably be noted that we only have syntax-level support for
BIT types in 7.0; they don't actually work.  The test is not deep enough
to notice that, however.


General comments:

It appears that they've cleaned up their act a little bit.  The last
time I examined crashme in any detail, there was an even longer list
of tests that checked for standard features but were careful to use a
nonstandard variant so they could claim that other people failed to
have the feature at all.

More generally, it's difficult to take seriously a test method and
presentation method that puts more weight on how many variant spellings
of "log()" you accept than on whether you have subselects.  (I count
five entries versus two.)

One could also complain about the very large number of tests that are
checking features that are non-SQL if not downright SQL-contradictory,
but are listed simply as bullet points with no pro or con.  A naive
reader would think that green stars are always good; they are not,
but how are you to tell without a copy of the SQL spec in hand?

Finally, the test coverage seems to have been designed with an eye
towards giving MySQL as many green stars as possible, not towards
exercising the most important features of SQL.  It would be interesting
to see considerably more coverage of subselects, for example, and I
expect that'd turn up shortcomings in a number of products including
Postgres.  But it won't happen as long as crashme is a tool of, by, and
for MySQL partisans (at least not till MySQL has subselects, whereupon
the test coverage will no doubt change).


Just FYI, I attach a diff between what you presented and what I get from
running the current crashme.  I don't understand exactly what's causing
the small differences in the values of some of the size limits.
Perhaps it is a side effect of using a different max_buffer_size, but
it seems really weird.
        regards, tom lane

37c37
< crash_me_safe=no            # crash me safe
---
> crash_me_safe=yes            # crash me safe
309c309
< max_char_size=8104            # max char() size
---
> max_char_size=8088            # max char() size
315c315
< max_index_length=2704            # index length
---
> max_index_length=2700            # index length
317c317
< max_index_part_length=2704        # max index part length
---
> max_index_part_length=2700        # max index part length
319,321c319,321
< max_index_varchar_part_length=2704    # index varchar part length
< max_row_length=7949            # max table row length (without blobs)
< max_row_length_with_null=7949        # table row length with nulls (without blobs)
---
> max_index_varchar_part_length=2700    # index varchar part length
> max_row_length=7937            # max table row length (without blobs)
> max_row_length_with_null=7937        # table row length with nulls (without blobs)
326c326
< max_text_size=8104            # max text or blob size
---
> max_text_size=8092            # max text or blob size
328c328
< max_varchar_size=8104            # max varchar() size
---
> max_varchar_size=8088            # max varchar() size
344c344
< operating_system=Linux 2.3.99s-noris-pre9-2 i686    # crash-me tested on
---
> operating_system=HP-UX B.10.20 9000/780    # crash-me tested on
355c355
< query_size=16777216            # query size
---
> query_size=1048576            # query size
369c369
< select_string_size=+16208        # constant string size in SELECT
---
> select_string_size=1048567        # constant string size in SELECT
490c490
< where_string_size=+16208        # constant string size in where
---
> where_string_size=1048542        # constant string size in where

Re: More Performance

From
"Matthias Urlichs"
Date:
Hi,

Bruce Momjian:
> > 
> > test=# explain select id from bench1 order by id;
> > Sort  (cost=38259.21..38259.21 rows=300000 width=4)
> >   ->  Seq Scan on bench1  (cost=0.00..6093.00 rows=300000 width=4)
> > 
> The heap is unordered, meaning a sequential scan and order by is usually
> faster than an index walk unless there is a restrictive WHERE clause.
> 
What heap? The index is a b-tree in this case. Thus you should be able
to walk it and get the sorted result without ever touching the data
file.

Whether that makes sense with the current structure of the PostgreSQL
backend is a different question, of course. Certain other databases
(no, not just MySQL ;-) are capable of doing that optimization, however.

-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
The difference between a rich man and a poor man is this -- the former
eats when he pleases, the latter when he can get it.
                               -- Sir Walter Raleigh


Re: More Performance

From
"Matthias Urlichs"
Date:
Hi,

Bruce Momjian:
> > IMHO this is somewhat non-optimal. In the absence of information
> > to the contrary, PostgreSQL should default to using an index if
> > it might be appropriate, not ignore it.
> 
> This is an interesting idea.  So you are saying that if a column has no
> vacuum analyze statistics, assume it is unique?

Nope. But why should vacuum analyze be the one and only part of
PostgreSQL where statistics are ever updated?

When you have no statistics, and a "column_name=CONSTANT" query for an
indexed column yields exactly one result (actually, "significantly fewer
results than there are 8-kbyte records in the table" would do), you
might want to record the fact that using the index might, in hindsight,
have been a good idea after all.

Then, when the next query like that comes in, you use the index.

Maybe I'm too naive ;-)  but I fail to see how this approach could
be either hard to implement or detrimental to performance.

-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
An Army travels on her stomach.


Re: More Performance

From
Bruce Momjian
Date:
> Hi,
> 
> Bruce Momjian:
> > > 
> > > test=# explain select id from bench1 order by id;
> > > Sort  (cost=38259.21..38259.21 rows=300000 width=4)
> > >   ->  Seq Scan on bench1  (cost=0.00..6093.00 rows=300000 width=4)
> > > 
> > The heap is unordered, meaning a sequential scan and order by is usually
> > faster than an index walk unless there is a restrictive WHERE clause.
> > 
> What heap? The index is a b-tree in this case. Thus you should be able
> to walk it and get the sorted result without ever touching the data
> file.
> 
> Whether that makes sense with the current structure of the PostgreSQL
> backend is a different question, of course. Certain other databases
> (no, not just MySQL ;-) are capable of doing that optimization, however.

We can't read data from the index.  It would be nice if we could, but we
can't.  I think we believe that there are very few cases where this
would be a win.  Usually you need non-indexed data too.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


"Matthias Urlichs" <smurf@noris.net> writes:
>> So you test EXCEPT by having a different number of columns.  I can see
>> it now, "Hey we don't have EXCEPT.  PostgreSQL does it, but they can't
>> handle a different number of columns.  Let's do only that test so we
>> look equal."
>> 
> They might as well have written that test while checking their crash-me
> script against SOLID and noting a few features MySQL doesn't have yet.
> Or they may have gotten it from them in the first place.

Our gripe is not that they're testing an extension we haven't got.
It's that the test result is misleadingly labeled.  It doesn't say
"EXCEPT with incompatible select lists", it says "EXCEPT", full stop.
That's deceptive.  And no, we do not think it's an honest mistake.
It's part of a consistent pattern of misstatements that's been going on
for a long time.  Sure, any one might be an honest mistake, but when you
see the same sort of thing over and over again, your credulity drops to
a low level.  crashme is designed to make MySQL look good and everyone
else (not just Postgres) look bad.

I'm glad to hear your optimism about cleaning this up.  Perhaps you
can actually accomplish something, but most of us decided long ago
that crashme is not meant as a fair comparison.  We have other things
to do than ride herd on crashme and try to keep them to the straight
and narrow, when they clearly have no desire to make it an unbiased
test and will not do so without constant prodding.
        regards, tom lane


Re: More Performance

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> What heap? The index is a b-tree in this case. Thus you should be able
>> to walk it and get the sorted result without ever touching the data
>> file.

> We can't read data from the index.  It would be nice if we could, but we
> can't.

The reason we can't is that we don't store tuple validity data in
indexes.  The index entry has the key value and a pointer to the tuple
in the main heap file, but we have to visit the tuple to find out
whether it's committed or dead.  If we did otherwise, then committing or
killing tuples would be lots slower than it is, because we'd have to
find and mark all the index entries pointing at the tuple, not just the
tuple itself.  It's a tradeoff... but we think it's a good one.

> I think we believe that there are very few cases where this
> would be win.  Usually you need non-indexed data too.

Right, non-toy examples usually read additional data columns anyway.
        regards, tom lane


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
Bruce Momjian
Date:
> Our gripe is not that they're testing an extension we haven't got.
> It's that the test result is misleadingly labeled.  It doesn't say
> "EXCEPT with incompatible select lists", it says "EXCEPT", full stop.
> That's deceptive.  And no, we do not think it's an honest mistake.
> It's part of a consistent pattern of misstatements that's been going on
> for a long time.  Sure, any one might be an honest mistake, but when you
> see the same sort of thing over and over again, your credulity drops to
> a low level.  crashme is designed to make MySQL look good and everyone
> else (not just Postgres) look bad.
> 
> I'm glad to hear your optimism about cleaning this up.  Perhaps you
> can actually accomplish something, but most of us decided long ago
> that crashme is not meant as a fair comparison.  We have other things
> to do than ride herd on crashme and try to keep them to the straight
> and narrow, when they clearly have no desire to make it an unbiased
> test and will not do so without constant prodding.

The basic issue is that you can tell us that this big crashme mess
happened by mistake, and that there was no deceptive intent.

However, really, we are not stupid enough to believe it.

Why don't you find out who wrote this thing, and ask them what they were
thinking when they wrote it?  I bet you will find out our perception is
correct. 

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Matthias Urlichs"
Date:
Hi,

Bruce Momjian:
> > 
> > > 2. atomic_updates = no
> > That's a misnomer. They actually mean this:
> > 
> >     create table crash_q (a integer not null);
> >     create unique index crf on crash_q(a);
> > 
> >     insert into crash_q values (2);
> >     insert into crash_q values (3);
> >     insert into crash_q values (1);
> >     update crash_q set a=a+1;
> 
> Poorly named, huh?  How do you think it got such a name?  This item was
> on the crashme tests before TRANSACTION was on there?

It probably got that name because nobody thought about people
associating atomicity with transactions.

Anyway, the issue isn't all that exotic. ms-sql, mimer, db2, solid and
sybase are listed as supporting this kind of update.


If you can think of an understandable five-word-or-so description for
it, I'll happily rename the test. I've been thinking about it for the
last ten minutes or so, but couldn't come up with one. :-/


A different question is whether the database bungles the update when the
first few rows can be updated and THEN you run into a conflict.

PostgreSQL handles this case correctly, MySQL doesn't => I'll add a
test for it.
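
Roughly like this (my own sketch, not yet in crash-me):

    create table crash_q (a integer not null);
    create unique index crf on crash_q(a);
    insert into crash_q values (1);
    insert into crash_q values (2);
    -- Both rows are set to the same value: whichever row is updated first
    -- succeeds, the second one collides with it.  A transactional engine
    -- rolls the whole statement back; a non-transactional one can leave
    -- the first row already changed.
    update crash_q set a = 10;
    select * from crash_q;   -- should still show 1 and 2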

-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
The only way to be good at everything you do is to only do the things
you are good at.


MySQL crashme test and PostgreSQL

From
Bruce Momjian
Date:
[CC: to general list.]

> > > What does the official standard say (assuming any exists) -- is the "to"
> > > optional or not?
> >
> > I don't see any RENAME in the SQL92 spec.  Now, how hard is it to do a
> > 'man alter_table' and see what it says at the top of the screen?
> >
> It's not a question of your manpage vs. their manpage. I can read your
> manpage just fine. It's a question of whether there is something that
> can be regarded as a standard on it or not. "Official" is a poor wording
> in this case -- sorry.
>
> If yes, then the test will be changed to do it the standard way.
> If no, then I might have to test for both syntaxes, which is a PITA.
>

You know, you are asking what syntax is SQL standard.  It is actually
not our job to report it to you.  If you are responsible for the test,
you should know what the standard says, and test against that.  If you
are not responsible for the test, then it shows that the person who is
responsible for the test doesn't care enough to test for SQL standard
syntax, only for MySQL syntax.

You know, there is a saying, "Do it right, or don't do it at all."  That
is pretty much the PostgreSQL style.  And if you are going to criticize
someone, you better be sure you are right.

We didn't write the crashme test, we don't host it on our web site, we
didn't ask to be in it.  Someone has to be responsible for the test, and
knowing standard SQL syntax, and that must be whoever put it on the
MySQL site.  We really don't want to hear that it dropped from the sky
and landed on the MySQL site, and no one there is responsible for it.

If we put something on our site, we are responsible for it.  If we don't
like something or can't take ownership of it, we remove it.

Now, I am not picking on you.  You may have the best of intentions.  But
basically someone has decided to put it on the MySQL site, and has not
considered it worth their while to learn the SQL standard.  They would
rather make other people tell them about the SQL standard, and maybe,
just maybe, we will fix the test someday.  Well, I will tell you, we
have better things to do than fix the MySQL crashme test.

> What matters is that your opinion is that they are responsible for making
> the test 100% accurate. Their reply to that is that many database
> vendors actually provided fixes for this test instead of bitching
> about how inaccurate it is, thus they feel the obligation is on your
> side.

BINGO!  You know, if other database vendors are stupid enough to do
MySQL's work for them and read the SQL standard for them, well...

You can't just point fingers and say no one at MySQL is responsible.
The MySQL bias is written all through that test.

> Now I am of neither side. I am, IMHO, thus in a position to ask you
> about your opinion of these inaccuracies, I am going to change
> the crashme test to be a whole lot more accurate WRT PostgreSQL,
> I will feed these changes back to the MySQL people, and they'll
> incorporate these changes into their next release. (Their head honcho
> (Monty) has said so on their mailing list. I _am_ going to take him up
> on it, and I can be quite obnoxious if somebody reneges on a promise.
> *EVIL*GRIN* )

You know, how do we know he is not just saying that hoping no one will
actually take him up on it?

You know, Monty was on this list last year, and he asked why we had a
bad attitude about MySQL, and we told him about the crashme test, and
you know, nothing happened.   So I don't think it is very important to
Monty to be fair, or more accurately, he would rather keep a test that
makes MySQL look good, than to spend time making the test fair.  He made
his choice.  I can tell you our reaction would be totally different.

> I might add that their test lists 52 features of PostgreSQL which
> MySQL doesn't have (13 functions). It also lists 122 features of MySQL
> which PostgreSQL doesn't have; 78 of those are extra functions (40 of
> these, just for M$-ODBC compatibility).


>
> So it seems that overall, that crash-me test result is reasonably
> balanced (39 vs. 44 non-function differences -- let's face it, adding
> another function for compatibility with SQL variant FOO is one of the
> easier exercises here, whatever the current value of FOO is).

You have to make the test deceptive to get MySQL to be on par with
PostgreSQL.  Period.  Doesn't MySQL admit they have fewer features than
PostgreSQL?  How did MySQL get an equal score on features?  Answer me
that one.

We have given enough of our time to this, and have pointed out many
problems.  Why don't you go and get those fixed, to show that the MySQL
group is working in good faith on this, and then, go and get a copy of
the standard, or a book about standard SQL, and start actually doing
something about the test.

And if it is not worth your time, and it is not worth any one else's
time at MySQL, then you folks have to admit you want to criticize
PostgreSQL without spending time to be fair about it.

I am going to suggest that no one else in the PostgreSQL group send any
more problem reports about the crashme tests until some changes appear
on the MySQL end.  Tom Lane has already done a great job of illustrating
the issues involved.  Pointing to actual SQL items is not the real
problem.  The MySQL attitude about crashme is the problem.

Also, I have heard about the hit squads attacking MySQL.  I never
condone inaccuracy or attacks, but I can understand why it is happening.

For years, I believe, the deceptiveness of the MySQL crashme test has
hampered acceptance of PostgreSQL.  And our response was just to reply
with our opinion when asked about it.  We didn't create a web page to
attack MySQL and make them look bad.  We believed that in the end, truth
always wins.  So we kept going, and in the end, truth does win.  We have
a $25 million company forming around PostgreSQL, with maybe more to
come.  We are on our way up, even though the MySQL crashme test delayed
us.

And there is a saying "If you are not nice to people on your way up,
they will not be nice to you on the way down."  I bet the hit squads are
frustrated people who have seen unfair things said about PostgreSQL for
years, with nothing they could do about it.  Now they can do something,
and they are talking.  But instead of one web page with deceptive
results, you have 100 people all over the net slamming MySQL.  There is
a certain poetic justice in that.  The saying goes, "Oh what a tangled
web we weave, When first we practice to deceive".

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: MySQL crashme test and PostgreSQL

From
Mike Mascari
Date:
Bruce Momjian wrote:
>
> [CC: to general list.]
>
> > I might add that their test lists 52 features of PostgreSQL which
> > MySQL doesn't have (13 functions). It also lists 122 features of
> > MySQL which PostgreSQL doesn't have; 78 of those are extra
> > functions (40 of these, just for M$-ODBC compatibility).
>
> >
> > So it seems that overall, that crash-me test result is reasonably
> > balanced (39 vs. 44 non-function differences -- let's face it,
> > adding another function for compatibility with SQL variant FOO is
> > one of the easier exercises here, whatever the current value of
> > FOO is).
>
> You have to make the test deceptive to get MySQL to be on par with
> PostgreSQL.  Period.  Doesn't MySQL admit they have fewer features
> than PostgreSQL.  How did MySQL get an equal score on features?
> Answer me that one.

That's easy:

MySQL has type mediumint
PostgreSQL has transactions

MySQL allows " and ' as string markers
PostgreSQL has views

MySQL has case insensitive compare
PostgreSQL has referential integrity

MySQL has support for 0000-00-00 dates
PostgreSQL has subqueries

MySQL has 'drop table if exists'
PostgreSQL has multiversion concurrency control

etc.

See? Equal. I hope my sarcasm is not too overstated.
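
To make the contrast concrete, here is a minimal sketch (table names and
values are invented) of what the right-hand column buys you versus the
left:

    -- PostgreSQL: an error or a change of heart undoes the whole unit of work
    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    UPDATE accounts SET balance = balance + 100 WHERE id = 2;
    ROLLBACK;   -- both updates are undone; without transactions, neither can be

    -- MySQL: the left-hand column
    CREATE TABLE t (n MEDIUMINT);
    DROP TABLE IF EXISTS t;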

Mike Mascari

Re: Performance (was: The New Slashdot Setup (includes MySql server))

From
"Matthias Urlichs"
Date:
Hi,

Bruce Momjian:
> Why don't you find out who wrote this thing, and ask them what they were
> thinking when they wrote it?  I bet you will find out our perception is
> correct. 
> 
I'm working on it.

-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
It's not against any religion to want to dispose of a pigeon.              --Tom Lehrer


Re: MySQL crashme test and PostgreSQL

From
"Matthias Urlichs"
Date:
Hi,

Bruce Momjian:
> Also, I have heard about the hit squads attacking MySQL.  I never
> condone inaccuracy or attacks, but I can understand why it is happening.
> 
You _are_ doing your side of the story a disservice, you know that?

> For years, I believe the deceptiveness of the MySQL crashme test has
> hampered acceptance of PostgreSQL.  And our response was to just reply
> with our opinion when asked about it.

Yeah, I can see that.

Let me tell you up front that your opinion is not at all helpful,
either to the cause of PostgreSQL or to resolving the problems between
you and the MySQL people, especially when stated like this.


This is the Internet. The right thing to do, if somebody spreads bad
information (a biased, inaccurate, wrong, deceptive, what-have-you
crash-me test), would be to write your own test which either prefers
PostgreSQL, or is reasonably neutral.


I'll shut up now, until the first of my patches is in the crash-me
suite. Perhaps that will have _some_ impact here.

-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
Question: "Do you consider $10 a week enough for a longshoreman with
a family to support?"
Answer: "If that's all he can get, and he takes it, I should say it's enough."                       -- J. P. Morgan
(1837-1913)


Re: MySQL crashme test and PostgreSQL

From
"Matthias Urlichs"
Date:
Hi,

Mike Mascari:
> MySQL has type mediumint 
> PostgreSQL has transactions
> 
> MySQL allows 'and' as string markers
> PostgreSQL has views

Look, we all know that transaction support is more important than type
mediumint or the function ODBC LENGTH or cosine or whatever.

But what should I, or anybody else, do about it, in your opinion? Take
the "unimportant" tests out? Invent a couple of inconsequential things
PostgreSQL can do to balance the books? Repeat the "transactions=no"
entry in the crash-me results file ten times?

> See? Equal. I hope my sarcasm is not too overstated.

Sarcasm hasn't helped in the past, and it's unlikely to help in the future.

-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
Lemme through! I'm a necrophiliac!    -- overheard at a traffic accident


Re: More Performance

From
Tom Lane
Date:
"Matthias Urlichs" <smurf@noris.net> writes:
> I've found another one of these performance problems in the benchmark,
> related to another ignored index.
> The whole thing works perfectly after a VACUUM ANALYZE on the
> table.
> IMHO this is somewhat non-optimal. In the absence of information
> to the contrary, PostgreSQL should default to using an index if
> it might be appropriate, not ignore it.

Just FYI: Postgres absolutely does not "ignore" an index in the absence
of VACUUM ANALYZE stats.  However, the default assumptions about
selectivity stats create cost estimates that are not too far different
for index and sequential scans.  On a never-vacuumed table you will
get an indexscan for "WHERE col = foo".  If the table has been vacuumed
but never vacuum analyzed, it turns out that you get varying results
depending on the size of the table and the average tuple size (since the
planner now has non-default info about the table size, but still nothing
about the actual selectivity of the WHERE condition).
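
A minimal way to see the effect described above (table, column, and
constant are assumptions, not taken from the benchmark):

    CREATE TABLE t (col integer);
    CREATE INDEX t_col_idx ON t (col);
    -- ... load data ...
    EXPLAIN SELECT * FROM t WHERE col = 42;   -- plan costed with default selectivity assumptions
    VACUUM ANALYZE t;                         -- gather table-size and selectivity statistics
    EXPLAIN SELECT * FROM t WHERE col = 42;   -- plan costed from the gathered statistics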

The cost estimation code is under active development, and if you check
the pgsql list archives you will find lively discussions about its
deficiencies ;-).  But I'm having a hard time mustering much concern
about whether it behaves optimally in the vacuum-but-no-vacuum-analyze
case.
        regards, tom lane


Re: MySQL's "crashme" (was Re: Performance)

From
"Matthias Urlichs"
Date:
Hi,

Tom Lane:
> I would not like to see us labeled "crashme unsafe" merely because
> someone is too impatient to let the test run to conclusion.

There's not much anybody can do about a backend which dies because of a
"hard" out-of-memory error which the OS only notices when all it can do
is segfault the client.

Anyway, I'll go through your list of problems and create an appropriate
patch for the beast.


Not to forget: THANK YOU for taking the time to go through this.


> > alter_drop_col=no            # Alter table drop column
> 
> While our ALTER TABLE support is certainly pretty weak, it should be
> noted that this test will continue to fail even when we have ALTER TABLE
> DROP COLUMN, because crashme is testing for a non-SQL-compliant syntax.
> 
You mean because the COLUMN keyword is missing? Added.

> > alter_rename_table=no            # Alter table rename table
> 
> We have ALTER TABLE RENAME ... but not under the syntax crashme is
> testing. 

TO keyword added.
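
Concretely, the difference is just the TO keyword (table names are
assumptions; the second form is presumably what the test sent before
this fix):

    ALTER TABLE crash_q RENAME TO crash_q1;   -- the syntax PostgreSQL accepts
    ALTER TABLE crash_q RENAME crash_q1;      -- presumably what crashme tried before the TO was added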

> > atomic_updates=no            # atomic updates
> 
Test clarified and two new tests added. The result now is:

atomic_updates=no            # update constraint check are deferred
atomic_updates_ok=yes            # failed atomic update

MySQL has "no" and "no".

> > automatic_rowid=no            # Automatic rowid
> 
> Test is actually looking for a system column named "_rowid".  Our OIDs
> serve the same purpose

I'll add a test for OIDs.
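
As a sketch (the query forms are a guess at what the test does; crash_q
is the test's own table name):

    SELECT _rowid FROM crash_q;   -- the MySQL-style system column the test looks for
    SELECT oid    FROM crash_q;   -- the PostgreSQL system column that serves the same purpose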

> > binary_items=no                # binary items (0x41)
> 
> We have binary literals (per the test name) and hex literals (what
> it actually appears to be testing).  Unfortunately for us, ours are
> SQL92-compliant syntax, and what crashme is looking for isn't.
> 
I'll tell them to fix that.
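
Side by side, using the value from the test's own label (0x41); the
second and third forms are the SQL92 spellings referred to above:

    SELECT 0x41;          -- MySQL's hex-literal extension, which the test looks for
    SELECT X'41';         -- SQL92 hexadecimal literal
    SELECT B'01000001';   -- SQL92 binary literal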

> > comment_#=no                # # as comment
> 
> It'd be helpful to the reader if they indicated which two of these
> conventions are SQL-compliant ... of course, that might expose the
> fact that MySQL isn't ...
> 
Are there any problems with using '#' as a comment character, given that
MySQL doesn't support user-defined operators?

> > connections=32                # Simultaneous connections
> 
> Should probably be noted that this is just the default limit

Noted.


> > crash_me_safe=no            # crash me safe
> 
> I get "yes", and I'd *really* appreciate it if you not submit this
> misleading statement.
> 
I won't submit test results (not until the thing is cleaned up to
everybody's satisfaction), but I'll change the documentation of the
thing to state that 

>>> Some of the tests you are about to execute require a lot of memory.
>>> Your tests _will_ adversely affect system performance. Either this
>>> crash-me test program, or the actual database back-end, _will_ die with
>>> an out-of-memory error. So might any other program on your system if it
>>> requests more memory at the wrong time.

Good enough?


> > date_zero=no                # Supports 0000-00-00 dates
> Another test that only MySQL "passes".
... and SOLID.

> > except=no                # except
> Unsurprisingly, hardly anyone else accepts it either.
SOLID again.

I'll mark the features that are necessary for SQL compliancy (as well as
those that actually are detrimental to it).  _After_ the actual test
results are cleaned up.

> What they are looking for here is "SELECT NOT 0", which Postgres rejects
> as a type violation.  SQL-compliant "NOT FALSE" would work.
> 
... not with MySQL, which doesn't have FALSE in the first place.  :-(

I added another test for TRUE/FALSE, and fixed the NOT-0 thing.
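
For reference, the two spellings in question:

    SELECT NOT 0;       -- what crashme sends; Postgres rejects it as a type violation
    SELECT NOT FALSE;   -- the SQL-compliant form, which Postgres accepts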

> > having_with_alias=no            # Having on alias
> 
> Again, how curious that MySQL is the only DBMS shown as passing this
> test.  Couldn't be because it violates SQL92, could it?
> 
No, but it's an extremely nice feature to have, IMHO.

I will not do anything about tests for extensions that won't hurt one
way or another. Classifying them, as noted above, should be sufficient.

> > insert_select=no            # insert INTO ... SELECT ...
> 
> We would pass this test if the crashme script weren't buggy: it fails
> to clean up after a prior test that creates a crash_q table with
> different column names.

Fixed.
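
The statement itself is unremarkable; the failure came from the
leftover table.  A sketch of the cleaned-up sequence (column types are
assumptions):

    DROP TABLE crash_q;                             -- the cleanup the earlier test skipped
    CREATE TABLE crash_q (a integer, b char(10));
    INSERT INTO crash_q SELECT a, b FROM crash_me;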

> > logical_value=1            # Value of logical operation (1=1)
> 
> A rather odd result, considering that what Postgres actually returns for
> "SELECT (1=1)" is 't'.  But showing the correct answer isn't one of
> crashme's highest priorities...
> 
> > minus_neg=no                # Calculate 1--1
> 
> Another case where "passing" the test means accepting MySQL's version of
> reality instead of SQL92's.  All the SQL-compliant DBMSs think -- is a
> comment introducer

So does MySQL -- when the '--' is followed by a space.

They do explain that in their documentation.  I have to agree with them --
you can turn a perfectly legal SQL statement into dangerous nonsense
with this kind of comment.

>>> $dbh->do("update foo set bar = baz-$adjust where something-or-other").

That kind of mistake can be rather expensive.
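
Spelled out with the names from the quoted snippet (the WHERE condition
is a placeholder): if $adjust interpolates to a negative number, the
server sees

    UPDATE foo SET bar = baz--3 WHERE some_condition;
    -- Under the standard comment rules, everything from '--' onward is a
    -- comment, so the statement becomes "UPDATE foo SET bar = baz" and every
    -- row is updated.  Writing "baz - -3", or requiring a space after '--'
    -- as MySQL does, avoids the trap.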

> > select_string_size=+16208        # constant string size in SELECT
> 
> I got 1048567 here, roughly corresponding to where I set max_buffer_size.
> Not sure why you get a smaller answer.
> 
Note the '+'. I have changed the test to 2*max_row_size since anything
bigger will return an empty answer anyway.

> > select_table_update=no            # Update with sub select
> 
> We certainly have update with sub select.  What they're looking for is
> the non-SQL-compliant syntax
>     update crash_q set crash_q.b=
>         (select b from crash_me where crash_q.a = crash_me.a);

Gah. Thanks; fixed. 
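
For comparison, the SQL-compliant spelling of the same update, which
PostgreSQL does accept (the target column is simply left unqualified):

    UPDATE crash_q SET b =
        (SELECT b FROM crash_me WHERE crash_q.a = crash_me.a);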

> One could also complain about the very large number of tests that are
> checking features that are non-SQL if not downright SQL-contradictory,
> but are listed simply as bullet points with no pro or con.  A naive
> reader would think that green stars are always good; they are not,
> but how are you to tell without a copy of the SQL spec in hand?
> 
I'll adapt the comments, but that is quite time consuming (and the
changes are extensive) and thus will have to wait until after the first
round is in one of their next alpha-test releases.

-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
"What did you do when the ship sank?"
"I grabbed a cake of soap and washed myself ashore."


Re: MySQL's "crashme" (was Re: Performance)

From
Kaare Rasmussen
Date:
> > minus_neg=no                # Calculate 1--1

Minus_neg expressed as select 1- -1; works. 



Re: Re: MySQL crashme test and PostgreSQL

From
Benjamin Adida
Date:
on 5/21/00 12:34 AM, Matthias Urlichs at smurf@noris.net wrote:

> But what should I, or anybody else, do about it, in your opinion? Take
> the "unimportant" tests out? Invent a couple of inconsequential things
> PostgreSQL can do to balance the books? Repeat the "transactions=no"
> entry in the crash-me results file ten times?

Take the unimportant tests out. Absolutely. Explain why the important tests
are important. The MySQL team is responsible for teaching a generation of
hackers that "transactions aren't important, they're just for lazy coders."

The solution here looks extremely simple. The only risk, of course, is that
it makes MySQL look bad, which I understand could be an unwanted outcome on
your end.

-Ben



Re: MySQL's "crashme" (was Re: Performance)

From
Tom Lane
Date:
"Matthias Urlichs" <smurf@noris.net> writes:
> Tom Lane:
>> I would not like to see us labeled "crashme unsafe" merely because
>> someone is too impatient to let the test run to conclusion.

> There's not much anybody can do about a backend which dies because of a
> "hard" out-of-memory error which the OS only notices when all it can do
> is segfault the client.

I'm just saying that it's unfair to downrate us when the problem is
demonstrably in crashme itself and not in Postgres.

>>>> alter_drop_col=no            # Alter table drop column
>> 
>> While our ALTER TABLE support is certainly pretty weak, it should be
>> noted that this test will continue to fail even when we have ALTER TABLE
>> DROP COLUMN, because crashme is testing for a non-SQL-compliant syntax.
>> 
> You mean because the COLUMN keyword is missing? Added.

No, the COLUMN keyword is optional according to the SQL92 specification:
        <alter table statement> ::=
            ALTER TABLE <table name> <alter table action>

        <alter table action> ::=
              <add column definition>
            | <alter column definition>
            | <drop column definition>
            | <add table constraint definition>
            | <drop table constraint definition>

        <drop column definition> ::=
            DROP [ COLUMN ] <column name> <drop behavior>

        <drop behavior> ::= CASCADE | RESTRICT

What is *not* optional is a <drop behavior> keyword.  Although we don't
yet implement DROP COLUMN, our parser already has this statement in it
--- and it follows the SQL92 grammar.
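
In terms of concrete statements (the column name is assumed, and the
second form is presumably what crashme sends):

    ALTER TABLE crash_q DROP COLUMN b RESTRICT;   -- SQL92: COLUMN optional, <drop behavior> required
    ALTER TABLE crash_q DROP COLUMN b;            -- presumably the crashme form: no <drop behavior>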

>>>> comment_#=no                # # as comment
>> 
>> It'd be helpful to the reader if they indicated which two of these
>> conventions are SQL-compliant ... of course, that might expose the
>> fact that MySQL isn't ...
>> 
> Are there any problems with using '#' as a comment character, given that
> MySQL doesn't support user-defined operators?

Only in that your scripts don't port to spec-compliant DBMSes ...
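
For reference, the usual three conventions (the double-dash form is the
one SQL92 defines; the bracketed form is also widely treated as
standard; the hash form is the MySQL extension at issue):

    SELECT 1;   -- double-dash comment
    SELECT 1;   /* bracketed comment */
    SELECT 1;   # hash comment (MySQL-specific, not portable)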

>>>> Some of the tests you are about to execute require a lot of memory.
>>>> Your tests _will_ adversely affect system performance. Either this
>>>> crash-me test program, or the actual database back-end, _will_ die with
>>>> an out-of-memory error. So might any other program on your system if it
>>>> requests more memory at the wrong time.

> Good enough?

No, pretty misleading I'd say.  Since the crashme script does have a
limit on max_buffer_size, it *will* run to completion if run on a
machine with a sufficiently large per-process memory limit (and enough
swap of course).  I may just be old-fashioned in running with a
not-so-large memory limit.  It'd probably be more helpful if you
document the behavior seen when crashme runs out of memory (for me,
the script just stops cold with no notice) and what to do about it
(reduce max_buffer_size until it runs to completion).

>>>> date_zero=no                # Supports 0000-00-00 dates
>> Another test that only MySQL "passes".
> ... and SOLID.

Still doesn't mean it's a good idea ;-)

>>>> except=no                # except
>> Unsurprisingly, hardly anyone else accepts it either.
> SOLID again.

It'd be appropriate to divide this into two tests, or at least relabel
it.

>>>> minus_neg=no                # Calculate 1--1
>> 
>> Another case where "passing" the test means accepting MySQL's version of
>> reality instead of SQL92's.  All the SQL-compliant DBMSs think -- is a
>> comment introducer

> So does MySQL -- when the '--' is followed by a space.

Considering how much we got ragged on for not being perfectly compliant
with SQL-spec handling of comments (up till 7.0 our parser didn't
recognize "--" as a comment if it was embedded in a multicharacter
operator --- but we knew that was a bug), I don't have a lot of sympathy
for MySQL unilaterally redefining the spec here.  And I have even less
for them devising a test that can only be "passed" by non-spec-compliant
parsers, and then deliberately mislabeling it to give the impression
that the spec-compliant systems are seriously broken.  How about
labeling the results "Fails to recognize -- comment introducer unless
surrounded by whitespace"?
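
In other words:

    SELECT 1--1;    -- per the spec, everything from the first '--' (including the ';')
                    -- is a comment, so what actually gets submitted is just "SELECT 1"
    SELECT 1- -1;   -- unambiguous in any SQL dialect, and returns 2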


Anyway, I am pleased to see you trying to clean up the mess.
Good luck!
        regards, tom lane


Re: [GENERAL] Re: MySQL crashme test and PostgreSQL

From
Bruce Momjian
Date:
> That's easy:
>
> MySQL has type mediumint
> PostgreSQL has transactions
>
> MySQL allows " and ' as string markers
> PostgreSQL has views
>
> MySQL has case insensitive compare
> PostgreSQL has referential integrity
>
> MySQL has support for 0000-00-00 dates
> PostgreSQL has subqueries
>
> MySQL has 'drop table if exists'
> PostgreSQL has multiversion concurrency control
>
> etc.
>
> See? Equal. I hope my sarcasm is not too overstated.

It took me a minute to figure this out.  Wow, that was funny.  I am
still laughing.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Re: MySQL crashme test and PostgreSQL

From
Bruce Momjian
Date:
> Hi,
>
> Bruce Momjian:
> > Also, I have heard about the hit squads attacking MySQL.  I never
> > condone inaccuracy or attacks, but I can understand why it is happening.
> >
> You _are_ doing your side of the story a disservice, you know that?

Hey, I am not saying I like it happening.  All I am saying is that I can
understand why it is happening.  Certainly MSSQL and Oracle are the real
products we need to compete against.

>
> > For years, I believe the deceptiveness of the MySQL crashme test has
> > hampered acceptance of PostgreSQL.  And our response was to just reply
> > with our opinion when asked about it.
>
> Yeah, I can see that.
>
> Let me tell you up front that your opinion is not at all helpful,
> either to the cause of PostgreSQL or to resolving the problems between
> you and the MySQL people, especially when stated like this.
>
>
> This is the Internet. The right thing to do, if somebody spreads bad
> information (a biased, inaccurate, wrong, deceptive, what-have-you
> crash-me test), would be to write your own test which either prefers
> PostgreSQL, or is reasonably neutral.

We have better things to do than compete against deceptive tests.  We
just work to make our product better and better.  Making another
crashme test is not going to make PostgreSQL a better piece of software.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> Isn't it a fundamental principle to define primary(unique
> identification) constraint for each table ?
> I had never thought that the only one index of pg_attrdef 
> isn't an unique identification index until I came across the
> unexpcted result of my DROP COLUMN test case.

Good point --- I was only thinking about the performance aspect, but
if we're going to have unique indexes to prevent errors in other
system tables then pg_attrdef deserves one too.
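
A sketch of the kind of index being argued for (the index name is
invented; adrelid and adnum are the pg_attrdef columns that identify a
row):

    CREATE UNIQUE INDEX pg_attrdef_adrelid_adnum_index
        ON pg_attrdef (adrelid, adnum);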

Actually, I have a more radical proposal: why not get rid of pg_attrdef
entirely, and add its two useful columns (adsrc, adbin) to pg_attribute?
If we allow them to be NULL for attributes with no default, then there's
no space overhead where they're not being used, and we eliminate any
need for the second table.
        regards, tom lane


Re: MySQL's "crashme" (was Re: Performance)

From
"Matthias Urlichs"
Date:
Hi,

Tom Lane:
> I'm just saying that it's unfair to downrate us when the problem is
> demonstrably in crashme itself and not in Postgres.
> 
Right.

>          <drop behavior> ::= CASCADE | RESTRICT
> 
> What is *not* optional is a <drop behavior> keyword.  Although we don't
> yet implement DROP COLUMN, our parser already has this statement in it
> --- and it follows the SQL92 grammar.
> 
Ah, sorry, I apparently misparsed the BNF. (It was kind of late at
night...)

> No, pretty misleading I'd say.  Since the crashme script does have a
> limit on max_buffer_size, it *will* run to completion if run on a
> machine with a sufficiently large per-process memory limit (and enough
> swap of course).

Hmm, I could add an explicit option to limit memory usage instead.
(Right now it's hardcoded in the test script.)

> >>>> date_zero=no                # Supports 0000-00-00 dates
> >> Another test that only MySQL "passes".
> > ... and SOLID.
> 
> Still doesn't mean it's a good idea ;-)
> 
No argument from me...

> >>>> except=no                # except
> It'd be appropriate to divide this into two tests, or at least relabel
> it.
> 
Already done.

> Considering how much we got ragged on for not being perfectly compliant
> with SQL-spec handling of comments (up till 7.0 our parser didn't
> recognize "--" as a comment if it was embedded in a multicharacter
> operator --- but we knew that was a bug), I don't have a lot of sympathy
> for MySQL unilaterally redefining the spec here.

They do note this noncompliance with the SQL spec in their documentation,
along with a few others.

I'll clean this one up (adding a note about the noncompliance) a bit
more, after they incorporate my patch into the next version.

> Anyway, I am pleased to see you trying to clean up the mess.
> Good luck!
> 
Thanks.


-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.       |        http://smurf.noris.de/
-- 
Mathematicians do it symmetrically.


MySQL crashme

From
Bruce Momjian
Date:
Matthias, let me add that I wish you luck in updating the MySQL crashme
test.  You certainly seem to be on top of the issues, and I hope they
can be resolved.

I know a lot of people on this side are hoping you can make it happen. 

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: The New Slashdot Setup (includes MySql server)

From
Jeff MacDonald
Date:
Agreed.  A while back I actually contacted Rob Malda and offered to
convert Slashdot to Postgres.  He asked why I would want to do this; I
said Postgres's features, yada yada.  His reply:

.. that's dandy but we don't need those features.

Sad to say, but MySQL has a niche and Slashdot fills it.

Jeff

On Thu, 18 May 2000, The Hermit Hacker wrote:

> On Thu, 18 May 2000, Bruce Momjian wrote:
> 
> > > Info on the new slashdot.org setup
> > > 
> > > <http://slashdot.org/article.pl?sid=00/05/18/1427203&mode=nocomment>
> > > 
> > > interesting because of the plans (meaning $$$) they have to improve
> > > MySql, and because they are the flagship MySql site/application. 
> > > 
> > > In the comment page, replying to the usual "Why not PostgreSql?" thread
> > > someone pointed out an extract from the MySql docs that seems to me
> > > blatantly false
> > > (http://slashdot.org/comments.pl?sid=00/05/18/1427203&cid=131).
> > 
> > Just finished reading the thread.  I am surprised how many people
> > slammed them on their MySQL over PostgreSQL decision.  People are
> > slamming MySQL all over the place.  :-)
> > 
> > Seems like inertia was the reason to stay with MySQL.  What that means
> > to me is that for their application space, PostgreSQL already has
> > superior technology, and people realize it.  This means we are on our
> > way up, and MySQL is, well, ....
> 
> In SlashDot's defence here ... I doubt there is much they do that would
> require half of what we offer ... it's *very* little INSERT/UPDATE/DELETE
> and *a lot* of SELECT ...
> 
>