Thread: Re: Large databases, performance

Re: Large databases, performance

From
"Shridhar Daithankar"
Date:
On 3 Oct 2002 at 11:57, Robert Treat wrote:

> NOTE: Setting follow up to the performance list
>
> Funny that the status quo seems to be if you need fast selects on data
> that has few inserts to pick mysql, otherwise if you have a lot of
> inserts and don't need super fast selects go with PostgreSQL; yet your
> data seems to cut directly against this.

Well, couple of things..

The number of inserts aren't few. it's 5000/sec.required in the field Secondly
I don't know really but postgresql seems doing pretty fine in parallel selects.
If we use mysql with transaction support then numbers are really close..

May be it's time to rewrite famous myth that postgresql is slow. When properly
tuned or given enough head room, it's almost as fast as mysql..

> I'm curious, did you happen to run the select tests while also running
> the insert tests? IIRC the older mysql versions have to lock the table
> when doing the insert, so select performance goes in the dumper in that
> scenario, perhaps that's not an issue with 3.23.52?

IMO even if it locks tables that shouldn't affect select performance. It would
be fun to watch when we insert multiple chunks of data and fire queries
concurrently. I would be surprised if mysql starts slowing down..

> It also seems like the vacuum after each insert is unnecessary, unless
> your also deleting/updating data behind it. Perhaps just running an
> ANALYZE on the table would suffice while reducing overhead.

I believe that was vacuum analyze only. But still it takes lot of time. Good
thing is it's not blocking..

Anyway I don't think such frequent vacuums are going to convince planner to
choose index scan over sequential scan. I am sure it's already convinced..

Regards,
 Shridhar

-----------------------------------------------------------
Shridhar Daithankar
LIMS CPE Team Member, PSPL.
mailto:shridhar_daithankar@persistent.co.in
Phone:- +91-20-5678900 Extn.270
Fax  :- +91-20-5678901
-----------------------------------------------------------


Re: Large databases, performance

From
Robert Treat
Date:
On Thu, 2002-10-03 at 12:17, Shridhar Daithankar wrote:
> On 3 Oct 2002 at 11:57, Robert Treat wrote:
> May be it's time to rewrite famous myth that postgresql is slow.

That myth has been dis-proven long ago, it just takes awhile for
everyone to catch on ;-)

When properly
> tuned or given enough head room, it's almost as fast as mysql..
>
> > I'm curious, did you happen to run the select tests while also running
> > the insert tests? IIRC the older mysql versions have to lock the table
> > when doing the insert, so select performance goes in the dumper in that
> > scenario, perhaps that's not an issue with 3.23.52?
>
> IMO even if it locks tables that shouldn't affect select performance. It would
> be fun to watch when we insert multiple chunks of data and fire queries
> concurrently. I would be surprised if mysql starts slowing down..
>

Hmm... been awhile since I dug into mysql internals, but IIRC once the
table was locked, you had to wait for the insert to complete so the
table would be unlocked and the select could go through. (maybe this is
a myth that I need to get clued in on)

> > It also seems like the vacuum after each insert is unnecessary, unless
> > your also deleting/updating data behind it. Perhaps just running an
> > ANALYZE on the table would suffice while reducing overhead.
>
> I believe that was vacuum analyze only. But still it takes lot of time. Good
> thing is it's not blocking..
>
> Anyway I don't think such frequent vacuums are going to convince planner to
> choose index scan over sequential scan. I am sure it's already convinced..
>

My thinking was that if your just doing inserts, you need to update the
statistics but don't need to check on unused tuples.

Robert Treat


Re: Large databases, performance

From
"Shridhar Daithankar"
Date:
On 3 Oct 2002 at 12:26, Robert Treat wrote:

> On Thu, 2002-10-03 at 12:17, Shridhar Daithankar wrote:
> > On 3 Oct 2002 at 11:57, Robert Treat wrote:
> > May be it's time to rewrite famous myth that postgresql is slow.
>
> That myth has been dis-proven long ago, it just takes awhile for
> everyone to catch on ;-)

:-)

> Hmm... been awhile since I dug into mysql internals, but IIRC once the
> table was locked, you had to wait for the insert to complete so the
> table would be unlocked and the select could go through. (maybe this is
> a myth that I need to get clued in on)

If that turns out to be true, I guess mysql will nose dive out of window.. May
be time to run a test that's nearer to real world expectation, especially in
terms on concurrency..

I don't think tat will be an issue with mysql with transaction support. The
vanilla one might suffer.. Not the other one.. At least theoretically..

> My thinking was that if your just doing inserts, you need to update the
> statistics but don't need to check on unused tuples.

Any other way of doing that other than vacuum analyze? I thought that was the
only way..

Bye
 Shridhar

--
"Even more amazing was the realization that God has Internet access.  Iwonder
if He has a full newsfeed?"(By Matt Welsh)


Re: [HACKERS] Large databases, performance

From
Hans-Jürgen Schönig
Date:
Shridhar Daithankar wrote:

>On 3 Oct 2002 at 11:57, Robert Treat wrote:
>
>
>
>>NOTE: Setting follow up to the performance list
>>
>>Funny that the status quo seems to be if you need fast selects on data
>>that has few inserts to pick mysql, otherwise if you have a lot of
>>inserts and don't need super fast selects go with PostgreSQL; yet your
>>data seems to cut directly against this.
>>
>>
>
>Well, couple of things..
>
>The number of inserts aren't few. it's 5000/sec.required in the field Secondly
>I don't know really but postgresql seems doing pretty fine in parallel selects.
>If we use mysql with transaction support then numbers are really close..
>
>May be it's time to rewrite famous myth that postgresql is slow. When properly
>tuned or given enough head room, it's almost as fast as mysql..
>
>

In the case of concurrent transactions MySQL does not do as well due to
very bad locking behavious. PostgreSQL is far better because it does row
level locking instead of table locking.
If you have many concurrent transactions MySQL performs some sort of
"self-denial-of-service". I'd choose PostgreSQL in order to make sure
that the database does not block.


>>I'm curious, did you happen to run the select tests while also running
>>the insert tests? IIRC the older mysql versions have to lock the table
>>when doing the insert, so select performance goes in the dumper in that
>>scenario, perhaps that's not an issue with 3.23.52?
>>
>>
>
>IMO even if it locks tables that shouldn't affect select performance. It would
>be fun to watch when we insert multiple chunks of data and fire queries
>concurrently. I would be surprised if mysql starts slowing down..
>
>

In the case of concurrent SELECTs and INSERT/UPDATE/DELETE operations
MySQL will slow down for sure. The more concurrent transactions you have
the worse MySQL will be.

>>It also seems like the vacuum after each insert is unnecessary, unless
>>your also deleting/updating data behind it. Perhaps just running an
>>ANALYZE on the table would suffice while reducing overhead.
>>
>>
>
>I believe that was vacuum analyze only. But still it takes lot of time. Good
>thing is it's not blocking..
>
>Anyway I don't think such frequent vacuums are going to convince planner to
>choose index scan over sequential scan. I am sure it's already convinced..
>
>

PostgreSQL allows you to improve execution plans by giving the planner a
hint.
In addition to that: if you need REAL performance and if you are running
similar queries consider using SPI.

Also: 7.3 will support PREPARE/EXECUTE.

If you are running MySQL you will not be able to add features to the
database easily.
In the case of PostgreSQL you have a broad range of simple interfaces
which make many things pretty simple (eg. optimized data types in < 50
lines of C code).

PostgreSQL is the database of the future and you can perform a lot of
tuning.
MySQL is a simple frontend to a filesystem and it is fast as long as you
are doing SELECT 1+1 operations.

Also: Keep in mind that PostgreSQL has a wonderful core team. MySQL is
built on Monty Widenius and the core team = Monty.
Also: PostgreSQL = ANSI compilant, MySQL = Monty compliant

In the past few years I have seen that there is no database system which
can beat PostgreSQL's flexibility and stability.
I am familiar with various database systems but believe: PostgreSQL is
the best choice.

    Hans


>Regards,
> Shridhar
>
>-----------------------------------------------------------
>Shridhar Daithankar
>LIMS CPE Team Member, PSPL.
>mailto:shridhar_daithankar@persistent.co.in
>Phone:- +91-20-5678900 Extn.270
>Fax  :- +91-20-5678901
>-----------------------------------------------------------
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 6: Have you searched our list archives?
>
>http://archives.postgresql.org
>
>



--
*Cybertec Geschwinde u Schoenig*
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/1/913 68 09; +43/664/233 90 75
www.postgresql.at <http://www.postgresql.at>, cluster.postgresql.at
<http://cluster.postgresql.at>, www.cybertec.at
<http://www.cybertec.at>, kernel.cybertec.at <http://kernel.cybertec.at>


Re: [HACKERS] Large databases, performance

From
Manfred Koizar
Date:
On Thu, 03 Oct 2002 21:47:03 +0530, "Shridhar Daithankar"
<shridhar_daithankar@persistent.co.in> wrote:
>I believe that was vacuum analyze only.

Well there is

    VACUUM [tablename];

and there is

    ANALYZE [tablename];

And

    VACUUM ANALYZE [tablename];

is VACUUM followed by ANALYZE.

Servus
 Manfred

Re: Large databases, performance

From
Ron Johnson
Date:
On Thu, 2002-10-03 at 11:17, Shridhar Daithankar wrote:
> On 3 Oct 2002 at 11:57, Robert Treat wrote:
>
[snip]
> > I'm curious, did you happen to run the select tests while also running
> > the insert tests? IIRC the older mysql versions have to lock the table
> > when doing the insert, so select performance goes in the dumper in that
> > scenario, perhaps that's not an issue with 3.23.52?
>
> IMO even if it locks tables that shouldn't affect select performance. It would
> be fun to watch when we insert multiple chunks of data and fire queries
> concurrently. I would be surprised if mysql starts slowing down..

What kind of lock?  Shared lock or exclusive lock?  If SELECT
performance tanked when doing simultaneous INSERTs, then maybe there
were exclusive table locks.

--
+------------------------------------------------------------+
| Ron Johnson, Jr.     mailto:ron.l.johnson@cox.net          |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson  |
|                                                            |
| "What other evidence do you have that they are terrorists, |
|  other than that they trained in these camps?"             |
|   17-Sep-2002 Katie Couric to an FBI agent regarding the 5 |
|   men arrested near Buffalo NY                             |
+------------------------------------------------------------+


Re: [HACKERS] Large databases, performance

From
Ron Johnson
Date:
On Thu, 2002-10-03 at 11:51, Hans-Jürgen Schönig wrote:
> Shridhar Daithankar wrote:
>
> >On 3 Oct 2002 at 11:57, Robert Treat wrote:
[snip]
> PostgreSQL allows you to improve execution plans by giving the planner a
> hint.
> In addition to that: if you need REAL performance and if you are running
> similar queries consider using SPI.

What is SPI?

--
+------------------------------------------------------------+
| Ron Johnson, Jr.     mailto:ron.l.johnson@cox.net          |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson  |
|                                                            |
| "What other evidence do you have that they are terrorists, |
|  other than that they trained in these camps?"             |
|   17-Sep-2002 Katie Couric to an FBI agent regarding the 5 |
|   men arrested near Buffalo NY                             |
+------------------------------------------------------------+


use [PERF] instead of

From
Jean-Luc Lachance
Date:
May I suggest that instead of [pgsql-performance] that [PERF] be used to
save some of the subject line.

Ron Johnson wrote:
>
> On Thu, 2002-10-03 at 11:51, Hans-Jürgen Schönig wrote:
> > Shridhar Daithankar wrote:
> >
> > >On 3 Oct 2002 at 11:57, Robert Treat wrote:
> [snip]
> > PostgreSQL allows you to improve execution plans by giving the planner a
> > hint.
> > In addition to that: if you need REAL performance and if you are running
> > similar queries consider using SPI.
>
> What is SPI?
>
> --
> +------------------------------------------------------------+
> | Ron Johnson, Jr.     mailto:ron.l.johnson@cox.net          |
> | Jefferson, LA  USA   http://members.cox.net/ron.l.johnson  |
> |                                                            |
> | "What other evidence do you have that they are terrorists, |
> |  other than that they trained in these camps?"             |
> |   17-Sep-2002 Katie Couric to an FBI agent regarding the 5 |
> |   men arrested near Buffalo NY                             |
> +------------------------------------------------------------+
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Re: [HACKERS] Large databases, performance

From
Andrew Sullivan
Date:
On Thu, Oct 03, 2002 at 06:51:05PM +0200, Hans-J?rgen Sch?nig wrote:

> In the case of concurrent transactions MySQL does not do as well due to
> very bad locking behavious. PostgreSQL is far better because it does row
> level locking instead of table locking.

It is my understanding that MySQL no longer does this on InnoDB
tables.  Whether various bag-on-the-side table types are a good thing
I will leave to others; but there's no reason to go 'round making
claims about old versions of MySQL any more than there is a reason to
continue to talk about PostgreSQL not being crash safe.  MySQL has
moved along nearly as quickly as PostgreSQL.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: [HACKERS] Large databases, performance

From
"Shridhar Daithankar"
Date:
On 3 Oct 2002 at 18:53, Manfred Koizar wrote:

> On Thu, 03 Oct 2002 21:47:03 +0530, "Shridhar Daithankar"
> <shridhar_daithankar@persistent.co.in> wrote:
> >I believe that was vacuum analyze only.
>
> Well there is
>
>     VACUUM [tablename];
>
> and there is
>
>     ANALYZE [tablename];
>
> And
>
>     VACUUM ANALYZE [tablename];
>
> is VACUUM followed by ANALYZE.

I was using vacuum analyze.

Good that you pointed out. Now I will modify the postgresql auto vacuum daemon
that I wrote to analyze only in case of excesive inserts. I hope that's lighter
on performance compared to vacuum analyze..

Bye
 Shridhar

--
Mix's Law:    There is nothing more permanent than a temporary building.    There is
nothing more permanent than a temporary tax.


Re: [GENERAL] [HACKERS] Large databases, performance

From
"scott.marlowe"
Date:
On Thu, 3 Oct 2002, Hans-Jürgen Schönig wrote:

> In the case of concurrent transactions MySQL does not do as well due to
> very bad locking behavious. PostgreSQL is far better because it does row
> level locking instead of table locking.
> If you have many concurrent transactions MySQL performs some sort of
> "self-denial-of-service". I'd choose PostgreSQL in order to make sure
> that the database does not block.

While I'm no big fan of MySQL, I must point out that with innodb tables,
the locking is row level, and the ability to handle parallel read / write
is much improved.

Also, Postgresql does NOT use row level locking, it uses MVCC, which is
"better than row level locking" as Tom puts it.

Of course, hot backup is only 2,000 Euros for an innodb table mysql, while
hot backup for postgresql is free. :-)

That said, MySQL still doesn't handle parallel load nearly as well as
postgresql, it's just better than it once was.

> Also: Keep in mind that PostgreSQL has a wonderful core team. MySQL is
> built on Monty Widenius and the core team = Monty.
> Also: PostgreSQL = ANSI compilant, MySQL = Monty compliant

This is a very valid point.  The "committee" that creates and steers
Postgresql is very much a meritocracy.  The "committee" that steers MySQL
is Monty.

I'm much happier knowing that every time something important needs to be
done we have a whole cupboard full of curmudgeons arguing the fine points
so that the "right thing" gets done.



Re: Large databases, performance

From
"Michael Paesold"
Date:
Andrew Sullivan <andrew@libertyrms.info> wrote:

> On Thu, Oct 03, 2002 at 06:51:05PM +0200, Hans-J?rgen Sch?nig wrote:
>
> > In the case of concurrent transactions MySQL does not do as well due to
> > very bad locking behavious. PostgreSQL is far better because it does row
> > level locking instead of table locking.
>
> It is my understanding that MySQL no longer does this on InnoDB
> tables.  Whether various bag-on-the-side table types are a good thing
> I will leave to others; but there's no reason to go 'round making
> claims about old versions of MySQL any more than there is a reason to
> continue to talk about PostgreSQL not being crash safe.  MySQL has
> moved along nearly as quickly as PostgreSQL.

Locking and transactions is not fine in MySQL (with InnoDB) though. I tried
to do selects on a table I was concurrently inserting to. In a single thread
I was constantly inserting 1000 rows per transaction. While inserting I did
some random selects on the same table. It often happend that the insert
transactions were aborted due to dead lock problems. There I see the problem
with locking reads.
I like PostgreSQL's MVCC!

Regards,
Michael Paesold