Re: Drupal and PostgreSQL - performance issues? - Mailing list pgsql-general

From Ivan Sergio Borgonovo
Subject Re: Drupal and PostgreSQL - performance issues?
Msg-id 20081014184456.0b920620@dawn.webthatworks.it
In response to Re: Drupal and PostgreSQL - performance issues?  ("Scott Marlowe" <scott.marlowe@gmail.com>)
Responses Re: Drupal and PostgreSQL - performance issues?  (Mikkel Høgh <mikkel@hoegh.org>)
further tests with 8.3 was: Re: Drupal and PostgreSQL - performance issues?  (Ivan Sergio Borgonovo <mail@webthatworks.it>)
List pgsql-general
On Tue, 14 Oct 2008 06:56:02 -0600
"Scott Marlowe" <scott.marlowe@gmail.com> wrote:

> >> This is a useful question, but there are reasonable answers to
> >> it. The key underlying principle is that it's impossible to
> >> know what will work well in a given situation until that
> >> situation is tested. That's why benchmarks from someone else's
> >> box are often mostly useless on your box, except for predicting
> >> generalities and then only when they agree with other people's
> >> benchmarks. PostgreSQL ships with a very conservative default
> >> configuration because (among other things, perhaps) 1) it's a
> >> configuration that's very unlikely to fail miserably for most
> >> situations, and 2)

> > So your target is potentially skilled DBAs who have a coffee pot as
> > a testing machine?

> Actually a lot has been done to better tune pgsql out of the box,
> but since it uses shared memory and many oses still come with
> incredibly low shared mem settings we're stuck.

From my naive understanding, the parameters you can tweak for major
improvements can be counted on the fingers of one hand:

http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

Are you saying that expert hands could squeeze out another 20% of
performance after the usual, pretty simple tweaks that an automatic
tool could apply for a general CMS workload?
Or for the specific test Mikkel did?

Wouldn't it be a much better starting point for discussing whether
PostgreSQL is a suitable tool for a CMS if those pretty basic tweaks
were done automatically, or shipped as an example config in the
PostgreSQL distribution?
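
To make the "done automatically" idea concrete, here is a minimal sketch
of what such a helper might look like. The function names are
hypothetical, and the ratios (about 1/4 of RAM for shared_buffers, about
1/2 for effective_cache_size) are commonly cited rules of thumb that I am
assuming here, not values taken from this thread. They also assume a
dedicated database server, which a box that also runs Apache and Drupal
is not.

```python
# Hypothetical helper: derive starting values from total RAM.
# The ratios are assumed rules of thumb for a dedicated DB server,
# not measured optima.

def suggest_settings(ram_mb: int) -> dict:
    """Return suggested starting values for two memory settings."""
    shared_buffers_mb = ram_mb // 4         # assumed: ~25% of RAM
    effective_cache_size_mb = ram_mb // 2   # assumed: ~50% of RAM
    return {
        # Unit suffixes such as "MB" are accepted from PostgreSQL 8.2 on;
        # 8.1 expects raw page counts instead.
        "shared_buffers": f"{shared_buffers_mb}MB",
        "effective_cache_size": f"{effective_cache_size_mb}MB",
    }

def render_conf(settings: dict) -> str:
    """Format the settings as postgresql.conf lines."""
    return "\n".join(f"{name} = {value}" for name, value in settings.items())

if __name__ == "__main__":
    print(render_conf(suggest_settings(4096)))
```

The output is only a starting point; any such values would still have to
be checked against the actual workload.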

> > Still you've another DB that kicks your ass in most common
> > hardware configurations and workloads. Something has to be done
> > about the tuning. Again... an untuned Ferrari can't win an F1 GP
> > competing with a tuned McLaren but it can stay close. A Skoda
> > Fabia can't.

> Except the current benchmark is how fast you can change the tires.

It is not, and anyway you wouldn't reply: "you have to tune PostgreSQL
so you can change tires very fast".
The test was a very low load of Drupal on pg and MySQL.
Are you expecting that under a higher load pg will outperform MySQL?
Then say so. Don't say you have to tune PostgreSQL.
Do you think that PostgreSQL can outperform MySQL under very low load
with tuning?

> > When people come here and ask why PostgreSQL is slow as a Skoda
> > compared to a Ferrari in some tasks and you reply they have to
> > tune... a) they will think you're trying to sell them a Skoda b)
> > they will think you're selling a Ferrari in a mounting kit.

> Actually the most common answer is to ask them if they've actually
> used a realistic benchmark.  Then tune.

Realistic? What was not realistic about Mikkel's test?
I'd say it is not the kind of workload PostgreSQL was built for.
BTW I don't buy the idea that even with correct tuning Drupal is
going to be any faster with PostgreSQL on a mostly "read-only"
benchmark.
This makes for an even more painful experience and undermines the
trust of the people coming here and asking why PostgreSQL is slow
compared to MySQL.
The best replies I've read were from Ang Chin Han.

> > Remember we are talking about PostgreSQL vs. MySQL performance
> > running Drupal.

> Yes, and the very first consideration should be, "Will the db I'm
> choosing be likely to eat my data?"  If you're not sure on that one
> all the benchmarketing in the world won't make a difference.

Maybe a DB eating data is a sustainable solution to your problem.
But I think the web is mature enough that very few web apps worth
using can treat their data integrity as a cheap asset.
This could be a good selling point even for people who grew up with
MySQL.

> > But still people point at benchmarks where PostgreSQL outperforms
> > MySQL.
> > People get puzzled.

> Because they don't understand what databases are and what they do
> maybe?

So? I doubt that pointing them at tuning docs will help them
understand whether PostgreSQL is the right tool.

> > I don't have direct experience on corrupted DB... but I'd say it
> > is easier to program PostgreSQL than MySQL once your project is
> > over 30 lines of code because it is less sloppy.
> > This is easier to prove: point at the docs and to SQL standard.

> Lots of people feel MySQL's tutorial style docs are easier to
> comprehend. Especially those unfamiliar with dbs.  I prefer
> PostgreSQL's docs, as they are more thorough and better suited for a
> semi-knowledgeable DBA.

What I meant was... I don't like a DB that silently turns my strings
into ints or truncates strings to make them fit into a varchar, etc.

> >> it's assumed that if server performance matters, someone will
> >> spend time tuning things. The fact that database X performs
> >> better than PostgreSQL out of the box is fairly irrelevant; if
> >> performance matters, you won't use the defaults, you'll find
> >> better ones that work for you.

> > The fact that out of the box on common hardware PostgreSQL
> > under-performs MySQL with the default config would matter if a few
> > paragraphs below you didn't say that integrity has a *big*
> > performance cost even on read-only operations.
> > When people come back crying that PostgreSQL under-performs with
> > Drupal they generally show a considerable gap between the 2.

> Again, this is almost always for 1 to 5 users.  Real world DBs have
> dozens to hundreds to even thousands of simultaneous users.  My
> PostgreSQL servers at work routinely have 10 or 20 queries running
> at the same time, and peak at 100 or more.

That's more in the right direction for helping people compare MySQL
and PostgreSQL. Still, you don't point them at tuning, because it is
not going to make PostgreSQL shine anyway.

> > But generally the performance gap is astonishing on default
> > configuration.

> Only for unrealistic benchmarks.  Seriously, for any benchmark with
> large concurrency and / or high write percentage, postgreSQL wins.

Then DON'T point at the default config as the culprit.
a) if it is true that config plays a BIG role in *reasonable*
benchmark comparisons with MySQL: offer a reasonable config!
b) if it is not true: stop pointing at tuning. It is, to say the
least, puzzling.

> > It is hard to beat the myth surrounding PostgreSQL...
> > but again... if you've to trade integrity for speed... at least
> > you should have numbers to show what you are talking about. Then
> > people may decide.
> > You're using a X% slower, Y% more reliable DB.
> > You're using a X% slower, Y% more scalable DB. etc...
>
> It's not just integrity for speed!  It's the fact that MySQL has
> serious issues with large concurrency, especially when there's a
> fair bit of writes going on.  This is especially true for myisam,
> but not completely solved in the Oracle-owned innodb table handler.

But no one is going to believe you if the answer is:
MySQL is going to eat your data. That sounds like FUD.
I think that even after 1 year of:
siege -H "Cookie: drupalsessid" -c 5 "http://drupal-site.local/" -b -t30s
MySQL is not going to eat your data.
If it smells like FUD people may think it is.

> > Well, horror stories about PostgreSQL being dog slow are quite
> > common among MySQL users.

> Users who run single thread benchmarks.  Let them pit their MySQL
> servers against my production PostgreSQL servers with a realistic
> load.

This doesn't make the default two-line pitch for PostgreSQL any
better:
- you have to tune
- MySQL will eat your data

> > If I see a performance gap of 50% I'm going to think that it's not
> > going to be that easy to close it with "tuning".
> > That means:
> > - I may think that with a *reasonable* effort I could come close
> > and then I'll have to find good reasons other than
> > performance to choose A over B

> Then you are putting your cart before your horse.  Choosing a db
> based on a single synthetic benchmark is like buying a car based
> on the color of the shift knob.  Quality is far more important.
> And so is your data: "MySQL mangling your data faster than any
> other db!"  is not a good selling point.

But the reply "you have to tune, and MySQL will eat your data" doesn't
get any nearer to the core of the problem.

> > Now... "you have to tune" is not the kind of answer that will help
> > me make a decision in favour of PostgreSQL.

> Then please use MySQL.  I've got a db that works well for me.  When
> MySQL proves incapable of handling the load, then come back and ask
> for help migrating.

OK... then you admit that telling people to tune and scaring them is
just delaying the migration ;)

> > Why do comparisons between PostgreSQL and MySQL come up so
> > frequently?
> >
> > Because MySQL "is the DB of the Web".
> > Many web apps are (were) mainly "read-only" and their data
> > integrity is (was) not so important.
> > Many Web apps are (were) simple.
> >
> > Web apps and CMS are a reasonably large slice of what's moving on
> > the net. Do these applications need the features PostgreSQL has?
>
> Is their data important?  Is downtime a bad thing for them?
>
> > Is there any trade-off? Is it worth paying for that trade-off?
> >
> > Is it worth winning over this audience even if they are not skilled
> > DBAs?
>
> Only if they're willing to learn.  I can't spend all day tuning

I bet some are willing to learn. Just pointing them at tuning and
using FUD-like arguments is not very good advertising for the
PostgreSQL Kung-Fu school.

BTW I hope someone can make good use of this:

2x Xeon HT CPU 3.20GHz (not dual core), 4 GB RAM, RAID 5 SCSI
* absolutely not tuned Apache
* absolutely not tuned Drupal with little content, some blocks and
some Google ads
(just CSS aggregation and caching enabled)
* lightly tuned PostgreSQL 8.1
shared_buffers = 3500
work_mem = 32768
checkpoint_segments = 10
effective_cache_size = 15000
random_page_cost = 3
default_statistics_target = 30
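
For readers used to newer releases: 8.1 takes these memory settings as
raw numbers without unit suffixes. A small sketch of what the values
above amount to, assuming the default 8 kB block size (shared_buffers
and effective_cache_size are counted in pages, work_mem in kB); the
helper name is mine:

```python
# Convert the PostgreSQL 8.1 settings listed above into megabytes.
# Assumes the default 8 kB block size of a standard build.
PAGE_KB = 8

settings = {
    "shared_buffers": 3500,         # number of 8 kB buffers
    "work_mem": 32768,              # kilobytes
    "effective_cache_size": 15000,  # number of 8 kB pages
}

# Size in kB of one unit of each setting.
UNIT_KB = {
    "shared_buffers": PAGE_KB,
    "work_mem": 1,
    "effective_cache_size": PAGE_KB,
}

def to_mb(name: str, value: int) -> float:
    """Return the setting's size in MB given its raw 8.1 value."""
    return value * UNIT_KB[name] / 1024

if __name__ == "__main__":
    for name, value in settings.items():
        print(f"{name}: {to_mb(name, value):.1f} MB")
```

So on a 4 GB machine this is roughly 27 MB of shared buffers, 32 MB of
work_mem per sort or hash operation, and a planner assumption of about
117 MB of cache.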

siege -H "Cookie: drupalsessid" -c1 "localhost/d1" -b -t30s

-c 1
Transactions:                    485 hits
Availability:                 100.00 %
Elapsed time:                  29.95 secs
Data transferred:               5.33 MB
Response time:                  0.06 secs
Transaction rate:              16.19 trans/sec
Throughput:                     0.18 MB/sec
Concurrency:                    1.00
Successful transactions:         485
Failed transactions:               0
Longest transaction:            0.13
Shortest transaction:           0.06

-c 5
Transactions:                   1017 hits
Availability:                 100.00 %
Elapsed time:                  29.61 secs
Data transferred:              11.29 MB
Response time:                  0.15 secs
Transaction rate:              34.35 trans/sec
Throughput:                     0.38 MB/sec
Concurrency:                    4.98
Successful transactions:        1017
Failed transactions:               0
Longest transaction:            0.24
Shortest transaction:           0.08

-c 20
Transactions:                    999 hits
Availability:                 100.00 %
Elapsed time:                  30.11 secs
Data transferred:              11.08 MB
Response time:                  0.60 secs
Transaction rate:              33.18 trans/sec
Throughput:                     0.37 MB/sec
Concurrency:                   19.75
Successful transactions:         999
Failed transactions:               0
Longest transaction:            1.21
Shortest transaction:           0.10

-c 100
Transactions:                   1085 hits
Availability:                 100.00 %
Elapsed time:                  29.97 secs
Data transferred:               9.61 MB
Response time:                  2.54 secs
Transaction rate:              36.20 trans/sec
Throughput:                     0.32 MB/sec
Concurrency:                   91.97
Successful transactions:         911
Failed transactions:               0
Longest transaction:           12.41
Shortest transaction:           0.07

-c 200
Transactions:                   1116 hits
Availability:                 100.00 %
Elapsed time:                  30.02 secs
Data transferred:               9.10 MB
Response time:                  4.85 secs
Transaction rate:              37.18 trans/sec
Throughput:                     0.30 MB/sec
Concurrency:                  180.43
Successful transactions:         852
Failed transactions:               0
Longest transaction:           15.85
Shortest transaction:           0.25

-c 400
Transactions:                   1133 hits
Availability:                 100.00 %
Elapsed time:                  29.76 secs
Data transferred:               8.51 MB
Response time:                  6.98 secs
Transaction rate:              38.07 trans/sec
Throughput:                     0.29 MB/sec
Concurrency:                  265.85
Successful transactions:         736
Failed transactions:               0
Longest transaction:           28.55
Shortest transaction:           0.00
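
A quick way to sanity-check the numbers above, as a sketch: the
transaction rate should equal hits divided by elapsed time, and by
Little's law the concurrency should be roughly rate times average
response time. The tuples below are copied from the siege output; the
function name is mine.

```python
# (clients, hits, elapsed_s, response_s, rate_per_s, reported_concurrency)
# Values copied from the siege runs above.
runs = [
    (1,   485,  29.95, 0.06, 16.19, 1.00),
    (5,   1017, 29.61, 0.15, 34.35, 4.98),
    (20,  999,  30.11, 0.60, 33.18, 19.75),
    (100, 1085, 29.97, 2.54, 36.20, 91.97),
    (200, 1116, 30.02, 4.85, 37.18, 180.43),
    (400, 1133, 29.76, 6.98, 38.07, 265.85),
]

def summarize(run):
    """Recompute rate and concurrency from the other reported figures."""
    clients, hits, elapsed, response, rate, concurrency = run
    return {
        "clients": clients,
        "rate_from_counts": hits / elapsed,          # hits per second
        "littles_law_concurrency": rate * response,  # throughput x latency
        "reported_concurrency": concurrency,
    }

if __name__ == "__main__":
    for run in runs:
        s = summarize(run)
        print(
            f"-c {s['clients']:<4d} "
            f"rate {s['rate_from_counts']:6.2f}/s  "
            f"concurrency {s['littles_law_concurrency']:7.2f} "
            f"(reported {s['reported_concurrency']:.2f})"
        )
```

Read this way, throughput roughly doubles from -c 1 to -c 5 and then
stays between about 33 and 38 trans/sec all the way to -c 400, while
the response time grows with the number of clients.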

--
Ivan Sergio Borgonovo
http://www.webthatworks.it

