Thread: RE: [GENERAL] Benchmarks

RE: [GENERAL] Benchmarks

From: "Culberson, Philip"
In his very insightful post last week, Mike Mascari pointed out that, on
tables with heavy inserts/updates, it was much faster to drop the index,
vacuum analyze, and then rebuild the index.  Maybe VACUUM has a specific
inefficiency in what Mike called "defragmenting" indexes.

[Snip]

8. Not running VACUUM - PostgreSQL won't use indexes, or won't optimize
correctly unless the record count and dispersion estimates are up-to-date.
People have reported problems with running vacuum while under heavy load. We
haven't seen it, but we run vacuum each night at 4:05 a.m. However, if you
perform a LARGE number of INSERTS/UPDATES, it is better for you to do the
following:

DROP INDEX index_on_heavily_used_table;
VACUUM ANALYZE;
CREATE INDEX index_on_heavily_used_table
    ON heavily_used_table (heavily_used_column);

Because VACUUM will sit there and, row by row, essentially "defragment"
your indexes, which can take damn near forever once more than, say,
30,000 rows have been updated or deleted.

[Snip]
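
To make the recipe above concrete, here is a commented sketch; the index,
table, and column names are made up for illustration:

-- Drop the index first so VACUUM does not have to rework index
-- entries row by row while it shuffles tuples around.
DROP INDEX orders_customer_idx;

-- Reclaim dead tuples and refresh the planner's row-count and
-- dispersion statistics in a single pass.
VACUUM ANALYZE;

-- Rebuild the index from scratch in one bulk operation, which is far
-- cheaper than maintaining it incrementally during the vacuum.
CREATE INDEX orders_customer_idx ON orders (customer_id);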

-----Original Message-----
From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
Sent: Thursday, January 06, 2000 10:14 AM
To: Dustin Sallings
Cc: The Hermit Hacker; pgsql-general@hub.org
Subject: Re: [GENERAL] Benchmarks


>     Untrue, vacuum is *extremely* important for updating statistics.
> If you have a lot of data in a table, and you have never vacuumed, you
> might as well not have any indices.  It'd be nice if you could separate
> the stat update from the storage reclaim.  Actually, it'd be nice if you
> could reuse storage, so that an actual vacuum wouldn't be necessary unless
> you just wanted to free up disk space you might end up using again anyway.
>
>     The vacuum also doesn't seem to be very efficient.  In one of my
> databases, a vacuum could take in excess of 24 hours, while I've written a
> small SQL script that does a rename and an INSERT INTO ... SELECT FROM
> that will do the same job in about ten minutes.  This is a database that
> cannot lock for more than a few minutes.

This is serious.  Why would an INSERT / RENAME be so much faster?  Are
we that bad with VACUUM?
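
Dustin's actual script isn't shown, but the rename-and-copy trick he
describes presumably looks something like this sketch (the table and its
columns are invented for illustration):

-- Swap the bloated table for a freshly packed copy instead of
-- vacuuming it; only live rows get copied, so this is fast.
ALTER TABLE hits RENAME TO hits_old;
CREATE TABLE hits (stamp datetime, url text, status int4);
INSERT INTO hits SELECT * FROM hits_old;
-- Dropping the old table releases all of its dead space at once.
DROP TABLE hits_old;
-- Any indexes must then be recreated on the new table.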

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


RE: [GENERAL] Benchmarks

From: Dustin Sallings
On Thu, 6 Jan 2000, Culberson, Philip wrote:

    This is considerably faster.  I never thought about the
indices getting hit here.  Thanks a lot.

# In his very insightful post last week, Mike Mascari pointed out that, on
# tables with heavy inserts/updates, it was much faster to drop the index,
# vacuum analyze, and then rebuild the index.  Maybe VACUUM has a specific
# inefficiency in what Mike called "defragmenting" indexes.
#
# [Snip]

--
SA, beyond.com           My girlfriend asked me which one I like better.
pub  1024/3CAE01D5 1994/11/03 Dustin Sallings <dustin@spy.net>
|    Key fingerprint =  87 02 57 08 02 D0 DA D6  C8 0F 3E 65 51 98 D8 BE
L_______________________ I hope the answer won't upset her. ____________


Re: MySQL / PostgreSQL (was: Postgres object orientation)

From: "Alain TESIO"
> Subsidiary question: why is MySQL excluded from the RDBMS
> comparison on the PostgreSQL www site?

Maybe because it's much faster ;)
Sure, MySQL doesn't support transactions, rollbacks, ...
Maybe this question will sound a bit naive, but why don't
we have the choice to send queries to PostgreSQL as
transactional or not?  It's probably not meaningful to
say that a single query is transactional or not, but what
about a global parameter at the server level?  Forgive me
again for the naivety of this question; it may mean having
two completely different engines, and it would probably
already have been done if it were possible ...

I've compared both engines and MySQL is much faster.
However, I'll need transactions to ensure reliability
for the database updates.  I've thought about using PostgreSQL
for updates and MySQL for selects, with the database being
dumped from PostgreSQL and reloaded into MySQL every
night, probably with specific queries and scripts rather
than a plain dump, to produce a MySQL-compliant file.
Does anyone have experience with a similar solution?

Alain

Re: [GENERAL] Re: MySQL / PostgreSQL (was: Postgres object orientation)

From: ^chewie
On Fri, 7 Jan 2000, Alain TESIO wrote:

> I've compared both engines and MySQL is much faster.

Alain, have you run PostgreSQL with the -F option and compared them then?
I'd venture to say that if you did so, you'd find PostgreSQL a bit more
desirable on the speed front.  Just remember that when you do so, you lose
some recovery possibilities, as you're relying on the OS to sync data
from memory to disk.
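
(For reference, -F is a backend option, so it is normally handed down
through the postmaster; the data directory path below is only an example.)

postmaster -o '-F' -D /usr/local/pgsql/data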

----------------------------------------------------------------
Chad Walstrom                         mailto:chewie@wookimus.net
a.k.a ^chewie, gunnarr               http://wookimus.net/~chewie

   Gnupg = B4AB D627 9CBD 687E 7A31  1950 0CC7 0B18 206C 5AFD
----------------------------------------------------------------


Re: [GENERAL] Re: MySQL / PostgreSQL (was: Postgres object orientation)

From: Ron Chmara
Alain TESIO wrote:
> > Subsidiary question: why is MySQL excluded from the RDBMS
> > comparison on the PostgreSQL www site?

I believe it's fairly stated: MySQL is not an RDBMS, so it
is not listed.  A similar question: why is FileMaker excluded?
FoxPro?

They all have a speedy flat-file structure, glued together
in such a way that relational-like interactions are available.

> Maybe because it's much faster ;)

The speed of not having some relational feature that you
desperately need, but which is not available, is approximately
zero. :-)

I like it for simple structures and simple DBs, but as soon as
you're trying to manage 30-40 tables, MySQL starts to get
in its own way.  For running SQL against a flat DB?  It's much
better than PostgreSQL.  200 tables?  Don't even think about
it; get an RDBMS.

> Sure, MySQL doesn't support transactions, rollbacks, ...

Foreign keys...

> I've compared both engines and MySQL is much faster.

Yup. Different design goals.

> However I'll need transaction to ensure reliability
> for the database updates. I've thought at using PostgreSQL
> for updates, and MySQL for select, the database being
> dumped from PostgreSQL and reloaded into MySQL every
> night. Probably with specific queries and scripts rather
> than a dump to get a MySQL-compliant dump file.
> Has anyone an experience about a similar solution ?

We are in the midst of nightly dumps of MySQL -> PostgreSQL,
in order to gradually migrate to a more robust solution...
however, it seems like your proposed idea _could_ work, with
some decent scripting.  It would be fast for simple scans, too...
basically, you'd need text dumps, which are then re-wrapped
into MySQL-friendly INSERT (or whatever) statements.
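
A minimal version of that nightly transfer, assuming a table that exists
with a matching definition on both sides (the table name and file path are
hypothetical), could be as simple as:

-- In PostgreSQL: dump the table as tab-separated text (NULLs as \N).
COPY orders TO '/tmp/orders.txt';

-- In MySQL: bulk-load the same file; LOAD DATA's defaults (tab-separated
-- fields, \N for NULL) match PostgreSQL's COPY output.
LOAD DATA INFILE '/tmp/orders.txt' INTO TABLE orders;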

HTH,
-Bop

Re: -F option

From: "Alain TESIO"
> Alain, have you run PostgreSQL with the -F option and compared them then?
> I'd venture to say that if you did so, you'd find PostgreSQL a bit more
> desirable on the speed front.  Just remember that when you do so, you lose
> some recovery possibilities, as you're relying on the OS to sync data
> from memory to disk.

Hello,

I've often experienced crashes with the -F option; the script which fills my
database can never run completely.  If any developer is interested, I have
a dump file (about 2 MB gzipped, produced directly by pg_dump) which
crashes on a CREATE INDEX with the -F option set, and works fine
without it.
My configuration is: Linux RedHat 6.0, kernel 2.2-15, PostgreSQL 6.5.3,
64 MB RAM.

Alain



Re: [BUGS] Re: -F option

From: Tom Lane
"Alain TESIO" <tesio@easynet.fr> writes:
> I've often experienced crashes with the -F option; the script which fills my
> database can never run completely.  If any developer is interested, I have
> a dump file (about 2 MB gzipped, produced directly by pg_dump) which
> crashes on a CREATE INDEX with the -F option set, and works fine
> without it.

Really!?  Wow, I can hardly guess what's causing that, but it sure
deserves looking into.  Send me your dump, or give me an FTP or web
pointer if that's easier.

            regards, tom lane