Thread: Postgres performance
Hi, I understand this is a very old thread! Note: I like Postgres, not MySQL! I'm a Postgres user: I run Postgres 7.2 and 8.0 on many servers, and I've compared the performance of Postgres and MySQL: Postgres is very slow. But my question is: why? Can't the Postgres team make it faster? Is it because the internal architecture can't change? OK, Postgres supports subqueries and many advanced features, but MySQL supports them now too, and is faster!
mauro wrote:
> Hi, I understand this is a very old thread! Note: I like Postgres,
> not MySQL!

Hi Mauro! Whether you like it or hate it, you're in the right place to ask questions about it.

> I'm a Postgres user: I run Postgres 7.2 and 8.0 on many servers, and
> I've compared the performance of Postgres and MySQL: Postgres is
> very slow.

Very slow how - maxing out CPU or disk I/O? On what hardware? On what queries? With how much data? With how many connections? What configuration changes have you made?

> But my question is: why? Can't the Postgres team make it faster? Is
> it because the internal architecture can't change? OK, Postgres
> supports subqueries and many advanced features, but MySQL supports
> them now too, and is faster!

Not always, AFAICT. The four most common reasons why PG tests slower than MySQL are:
1. You haven't configured, or have misconfigured, PostgreSQL.
2. You are testing a MySQL-tuned application (lots of small, simple queries; no views, no subselects, etc.)
3. You are only testing one connection (try 10, 20, 50 simultaneous users and see who wins then).
4. You are not testing the transaction-safe storage engines in MySQL.

See if you can answer some of the questions above and I'm sure we'll be able to get your database server running smoothly.

--
Richard Huxton
Archonet Ltd
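(On point 1, a quick sanity check is to ask the server what it is actually running with. These are standard psql/SQL commands; if the values come back at the stock 8.0 defaults on a 1 GB machine, the benchmark is measuring the configuration, not the database:)

    -- what is the server actually configured with?
    SHOW shared_buffers;        -- 8.0 default is only 1000 x 8 kB buffers
    SHOW work_mem;              -- per-sort/per-hash memory, default 1024 kB
    SHOW effective_cache_size;  -- planner's guess at the OS cache, default is tiny
    -- or dump every setting at once:
    SHOW ALL;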
> Not always, AFAICT. The four most common reasons why PG tests slower
> than MySQL are:
> 1. You haven't configured, or have misconfigured, PostgreSQL.
> 2. You are testing a MySQL-tuned application (lots of small, simple
>    queries; no views, no subselects, etc.)
> 3. You are only testing one connection (try 10, 20, 50 simultaneous
>    users and see who wins then).
> 4. You are not testing the transaction-safe storage engines in MySQL.
>
> See if you can answer some of the questions above and I'm sure we'll
> be able to get your database server running smoothly.

Hi,
I've used the benchmark http://genecensus.org/yeast/cluster/database/mysql/sql-bench/ (OK, it has no benchmarks for views, sub-selects, transactions, ...). The database files are striped (RAID 0) across two SATA disks (50 MB/s transfer rate), the filesystem is reiserfs (3.6 format, with checks off), there is no I/O-scheduler tuning, and both DBMSs are in their default configuration (so I favour neither).

Total time:
Pgsql: 7h 20'
MySQL: 14' (!!)

This is the configuration where Postgres 8.0 and MySQL are running:

Linux version 2.6.9-1.667 (bhcompile@dolly.build.redhat.com) (gcc version 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)) #1 Tue Nov 2 14:50:10 EST 2004
Memory: 1024128k/1047744k available (2389k kernel code, 22900k reserved, 1276k data, 164k init)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU: AMD Athlon(tm) 64 Processor 3000+ stepping 00
powernow-k8: Found 1 AMD Athlon 64 / Opteron processors (version 1.00.09b)
powernow-k8: BIOS error - no PSB
hda: SAMSUNG SP0411N, ATA DISK drive
hda: max request size: 1024KiB
hda: 78242976 sectors (40060 MB) w/2048KiB Cache, CHS=16383/255/63, UDMA(133)
ata1: SATA max UDMA/133 cmd 0xE800 ctl 0xE402 bmdma 0xD400 irq 193
ata2: SATA max UDMA/133 cmd 0xE000 ctl 0xD802 bmdma 0xD408 irq 193

cpuinfo:
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 12
model name : AMD Athlon(tm) 64 Processor 3000+
stepping : 0
cpu MHz : 2002.652
cache size : 512 KB
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow
bogomips : 3932.16
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

The analysis:

Test                               Postgres (sec)  MySQL (sec)
alter_table_add                                 1           34
alter_table_drop                               27
connect                                        36            4
connect+select_1_row                           45            5
connect+select_simple                          38            4
count                                          21            5
count_distinct                                  9            7
count_distinct_2                               17           11
count_distinct_big                             87           26
count_distinct_group                           57           16
count_distinct_group_on_key                    19            7
count_distinct_group_on_key_parts              56           16
count_distinct_key_prefix                      38            6
count_group_on_key_parts                        7            7
count_on_key                                  226           53
create+drop                                   145            3
create_MANY_tables                             50            3
create_index                                    1            1
create_key+drop                                98            5
create_table                                    0            0
delete_all                                     13            2
delete_all_many_keys                           11           10
delete_big                                      0            0
delete_big_many_keys                           11           10
delete_key                                      6            1
drop_index                                      0            0
drop_table                                      0
drop_table_when_MANY_tables                    23            2
insert                                        177           24
insert_duplicates                              59            6
insert_key                                    695           21
insert_many_fields                             84            2
insert_select_1_key                             6            1
insert_select_2_keys                            8            1
min_max                                         9            3
min_max_on_key                               1145           27
multiple_value_insert                           1
order_by_big                                   25           19
order_by_big_key                               19           14
order_by_big_key2                              17           14
order_by_big_key_desc                          19           15
order_by_big_key_diff                          17           18
order_by_big_key_prefix                        17           15
order_by_key2_diff                              2            2
order_by_key_prefix                             0            1
order_by_range                                  1            1
outer_join                                     40            8
outer_join_found                               38            8
outer_join_not_found                           37            5
outer_join_on_key                              38            6
select_1_row                                    2            0
select_2_rows                                   1            1
select_big                                     16           14
select_big_str                                 12            8
select_column+column                            2            0
select_diff_key                               120           24
select_distinct                                 3
select_group                                   46           20
select_group_when_MANY_tables                   9            1
select_join                                     1
select_key                                     66           25
select_key2                                    59           26
select_key2_return_key                         57           25
select_key2_return_prim                        59           25
select_key_prefix                              59           27
select_key_prefix_join                          3
select_key_return_key                           3           24
select_many_fields                             48            5
select_query_cache                            104           21
select_query_cache2                           104           22
select_range                                  188           42
select_range_key2                            8922            4
select_range_prefix                          8905            4
select_simple                                   1            1
select_simple_join                              0
update_big                                    467            8
update_of_key                                  58            3
update_of_key_big                              27            4
update_of_primary_key_many_keys              2049            5
update_with_key                               269           17
update_with_key_prefix                         88            5
wisc_benchmark                                  3            2
TOTALS                                      25273          847

What do you think about this? Thank you!
On Tue, Mar 01, 2005 at 02:52:31AM -0800, mauro wrote:
> select_range_key2                            8922            4
> select_range_prefix                          8905            4
> update_of_primary_key_many_keys              2049            5

These look suspect, especially the first two, and they account for over 78% of the total. Do you know what the table definitions and queries look like?

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
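(For anyone wanting to dig in the same way, the definition and the plan are one psql session away. sql-bench runs against a table it creates itself - the name bench1 and the id column below are assumptions about that script, not something shown in this thread:)

    -- the table definition as PostgreSQL sees it (psql meta-command):
    \d bench1

    -- the plan, plus actual timings, for one of the suspect range scans:
    EXPLAIN ANALYZE SELECT * FROM bench1 WHERE id BETWEEN 1 AND 100;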
mauro wrote:
>> Not always, AFAICT. The four most common reasons why PG tests slower
>> than MySQL are:
>> 1. You haven't configured, or have misconfigured, PostgreSQL.
>> 2. You are testing a MySQL-tuned application (lots of small, simple
>>    queries; no views, no subselects, etc.)
>> 3. You are only testing one connection (try 10, 20, 50 simultaneous
>>    users and see who wins then).
>> 4. You are not testing the transaction-safe storage engines in MySQL.
>>
>> See if you can answer some of the questions above and I'm sure we'll
>> be able to get your database server running smoothly.
>
> Hi,
> I've used the benchmark
> http://genecensus.org/yeast/cluster/database/mysql/sql-bench/ (OK, it
> has no benchmarks for views, sub-selects, transactions, ...)

Other points about this benchmark:
1. It's a good 5 years old. Nothing wrong in that, but the fact that it hasn't been updated in that time doesn't bode well. If nothing else, it was designed to test PostgreSQL version 6.x.
2. As you say, it doesn't actually use any of the features of a modern database.
3. Although vacuum is mentioned, it's not obvious to me that it's being run. Also, I don't see any analyze run on the populated tables.
4. It wasn't immediately obvious to me how the tests deal with varying amounts of data being cached on different runs.
5. I couldn't see how many simultaneous connections were being tested.
6. In fact, I couldn't find a clear rationale for what these tests are supposed to simulate - what sort of environment.

> The database files are striped (RAID 0) across two SATA disks (50 MB/s
> transfer rate), the filesystem is reiserfs (3.6 format, with checks
> off), there is no I/O-scheduler tuning,

Largely irrelevant for these particular tests.

> and both DBMSs are in their default configuration (so I favour
> neither).

If you're running with the default configuration, you'll want to compare the two on a PII-200 with 32MB of RAM. That's roughly what PG's default config is sized for. PG isn't designed to be run with the default configuration settings; it's designed to run almost anywhere.

> Total time:
> Pgsql: 7h 20'
> MySQL: 14' (!!)
>
> This is the configuration where Postgres 8.0 and MySQL are running:
[snipped long list of hardware details/run results]

> What do you think about this?

I think you didn't read my last message. I'll quote the relevant points again:

- begin quote -
Not always, AFAICT. The four most common reasons why PG tests slower than MySQL are:
1. You haven't configured, or have misconfigured, PostgreSQL.
2. You are testing a MySQL-tuned application (lots of small, simple queries; no views, no subselects, etc.)
3. You are only testing one connection (try 10, 20, 50 simultaneous users and see who wins then).
4. You are not testing the transaction-safe storage engines in MySQL.
- end quote -

How many of these points apply to the benchmark you used? (Hint - it looks like all 4 to me.)

Of course, if, on your production systems, you:
1. Don't intend to configure your database system
2. Don't want views/triggers/subselects/partial indexes/functional indexes/...etc.
3. Only have one simultaneous user
4. Don't use transactions, and don't mind an inconsistent database
then these test results are relevant, and the right choice is clearly MySQL.

If you want to actually come up with some useful test figures, you'll want to:
1. Spend a reasonable amount of time learning how to set up and configure each system.
2. Understand your users' requirements, and design the tests accordingly.
3. Actually use the database to do what it is designed for.
4. Make sure you aren't using SQL structures that favour one database system over another (or have one schema for each database being tested).
5. Account for other factors in your tests - how much time is spent in Java/PHP etc. vs. time in the database?

Best of luck Mauro - realistic testing is not a simple process, and you've got a lot of work ahead of you. Don't forget there's the performance list that can help with specific problems too.

--
Richard Huxton
Archonet Ltd
Mauro Bertoli wrote:
> Hi, thanks a lot! You are right, but I did read your message ;)
> Yes,
> 1- I misconfigured PostgreSQL (I thought it came already configured
> appropriately for the released version - Fedora Core 3 64-bit).
> 2- The benchmark is, clearly after your explanations, a MySQL-tuned
> application test.
> 3- I think the benchmark tests only one connection; in a fast reading
> I saw no threading in the benchmark code to simulate concurrent user
> requests.
> 4- I didn't test transaction-safe tables (transactions aren't used
> explicitly in my application)

Well, do you care whether your data is consistent or not? If not, you don't need transactions.

> I understand it isn't simple.. I use the DBMS in a data analysis
> environment: most of the time is spent in queries (PHP is 0.1%) with
> many sub-selects, and maybe, at the same time, from 1 to 1000 users
> insert/update data. I test the DBMS with my data analysis framework,
> simulating a very heavy load.

You'll find inserts/updates with lots of users is where PostgreSQL works well compared to other systems.

> Do you know where I can find a tutorial on configuring Postgres'
> hardware-dependent internal values?

There's some useful stuff here:
http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
and also here:
http://www.powerpostgresql.com/PerfList

--
Richard Huxton
Archonet Ltd
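(To give a flavour of what those guides cover, here is a minimal sketch of the handful of postgresql.conf settings most 8.0 tuning advice starts with. The parameter names are real 8.0 settings; the values are illustrative guesses for a dedicated 1 GB machine like Mauro's, not recommendations:)

    # postgresql.conf - 8.0 takes raw numbers, no "MB" units
    shared_buffers = 15000          # ~120 MB of 8 kB buffers (default: 1000)
    effective_cache_size = 90000    # ~700 MB: what the OS will likely cache for us
    work_mem = 8192                 # 8 MB per sort/hash - multiplied by concurrent users!
    maintenance_work_mem = 65536    # 64 MB for VACUUM and CREATE INDEX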
Mauro Bertoli wrote:
> Hi Richard, thank you for your appreciated answers!!!
>
> - start quote -
> Well, do you care whether your data is consistent or not? If not,
> you don't need transactions.
> - end quote -
>
> I don't require transactions because the queries aren't complex and
> update a single tuple (in SELECTs, transactions are useless)

You're still using transactions. One per UPDATE/INSERT, granted, but you're using them anyway, even if you don't issue BEGIN...COMMIT. Otherwise you don't know your update was written to disk.

> - start quote -
> You'll find inserts/updates with lots of users is where PostgreSQL
> works well compared to other systems.
> - end quote -
>
> Uhhmm.. this is interesting...
>
> - tutorial links -
> Thx, now I'll read them and test a hardware-tuned configuration... I
> read that it is not very simple... :O
>
> Another question:
> - why aren't Postgres releases already configured (hardware tuning)?
> isn't it possible to configure it during installation?

Configured for what? PG can't tell how many disks you have, or how you've set them up. It also can't tell whether this machine is a dedicated DB server, or sharing space with a webserver, or part of a virtual OS installation whose hardware is shared by 100 other virtual OSes. Occasionally people do propose an auto-tuning utility at setup, but you really need at least a dozen different options to do it properly. Oh, and then you'll need to do it for 30 versions of Unix on a variety of hardware, and Windows too.

> - why does postgres use a new process for every query? (mySQL, if I'm
> not wrong, uses threads... I think it's faster)

Using a separate process for each means a problem in one process only affects that process. Threads aren't necessarily much faster (except on Windows), and in any case that only affects connection time.

> - why is connection time slower? (compared to mySQL)

See above - but it's still not bad. If connection time is a limiting factor for you, then you have a very strange or very large workload. You might want to explore pgpool for connection pooling if you have a large website, though, to avoid having a lot of idle connections.

> - why does postgres require analyze? (mySQL, if I'm not wrong,
> doesn't require it)

PG's planner is statistics-based. That is, it can tell that if you have a list of English surnames then "Smith" occurs more often than "zgwasq". In some cases reading the whole table might be quicker than going to the index many times. ANALYZE scans (a percentage of) the whole table to see if these statistics have changed. This is different from a VACUUM, which recovers space where rows have been deleted or updated.

--
Richard Huxton
Archonet Ltd
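(Those statistics are plain SQL away. A small sketch - the surnames table and its name column are hypothetical - of running ANALYZE and then peeking at what the planner learned via the standard pg_stats view:)

    -- gather statistics on a hypothetical table of English surnames
    ANALYZE surnames;

    -- what the planner now believes about the "name" column
    SELECT attname, n_distinct, most_common_vals, most_common_freqs
      FROM pg_stats
     WHERE tablename = 'surnames' AND attname = 'name';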
On Tue, 2005-03-01 at 04:52, mauro wrote:
>> Not always, AFAICT. The four most common reasons why PG tests slower
>> than MySQL are:
>> 1. You haven't configured, or have misconfigured, PostgreSQL.
>> 2. You are testing a MySQL-tuned application (lots of small, simple
>>    queries; no views, no subselects, etc.)
>> 3. You are only testing one connection (try 10, 20, 50 simultaneous
>>    users and see who wins then).
>> 4. You are not testing the transaction-safe storage engines in MySQL.
>>
>> See if you can answer some of the questions above and I'm sure we'll
>> be able to get your database server running smoothly.
>
> Hi,
> I've used the benchmark
> http://genecensus.org/yeast/cluster/database/mysql/sql-bench/ (OK, it
> has no benchmarks for views, sub-selects, transactions, ...). The
> database files are striped (RAID 0) across two SATA disks (50 MB/s
> transfer rate), the filesystem is reiserfs (3.6 format, with checks
> off), there is no I/O-scheduler tuning, and both DBMSs are in their
> default configuration (so I favour neither). Total time:
> Pgsql: 7h 20'
> MySQL: 14' (!!)

Why is a dragster faster than a freight train? Because it only has to run for 5 or 6 seconds, and you expect the engine to implode on every fourth run. The freight train, on the other hand, has to run day after day and deliver its cargo without damage.

The reason MySQL can be so fast is that it's not really a database in the classical sense. It does floating-point maths on exact numeric types. It does almost no error checking, and if you lose power during updates all your data could quite easily be gone. While it's a fine storage system for certain content-management tasks, it's not reliable enough for things like accounting, or anywhere the answers have to be right.

The reason PostgreSQL is slower is that it (and by extension the team behind it) cares about your data.

Here's a list of the things MySQL will gladly do wrong:
http://sql-info.de/mysql/gotchas.html

I wouldn't trust such a database for any kind of mission-critical system that handled important data, and anyone who does is gambling against the house odds.
> The reason PostgreSQL is slower is that it (and by extension the team
> behind it) cares about your data.

Sure, postgres is (a bit, but not much) slower for a simple query like SELECT * FROM table WHERE id=some number, and postgres is a lot slower for UPDATEs (although I heard that it's faster than MySQL InnoDB)... but try a query with a join on a few tables, even a simple one, and postgres will outperform mysql, sometimes by 2x, sometimes 1000 times. I had a case with a join between 4 tables, two of them having 50k records; I was only pulling 6 records... mysql spent half a second and postgres 0.5 ms... hell!
On Wed, 2005-03-02 at 15:45, PFC wrote:
>> The reason PostgreSQL is slower is that it (and by extension the
>> team behind it) cares about your data.
>
> Sure, postgres is (a bit, but not much) slower for a simple query
> like SELECT * FROM table WHERE id=some number, and postgres is a lot
> slower for UPDATEs (although I heard that it's faster than MySQL
> InnoDB)... but try a query with a join on a few tables, even a simple
> one, and postgres will outperform mysql, sometimes by 2x, sometimes
> 1000 times. I had a case with a join between 4 tables, two of them
> having 50k records; I was only pulling 6 records... mysql spent half
> a second and postgres 0.5 ms... hell!

Or better yet, a query like this:

select a.lt,
       b.perspective as YYY_pers, b.averageresponsetime as YYY_aver,
       b.lowestresponsetime as YYY_lowe, b.highestresponsetime as YYY_high,
       b.totalcount as YYY_tota,
       c.perspective as XXX_pers, c.averageresponsetime as XXX_aver,
       c.lowestresponsetime as XXX_lowe, c.highestresponsetime as XXX_high,
       c.totalcount as XXX_tota,
       d.perspective as BBB_pers, d.averageresponsetime as BBB_aver,
       d.lowestresponsetime as BBB_lowe, d.highestresponsetime as BBB_high,
       d.totalcount as BBB_tota,
       e.perspective as AAA_pers, e.averageresponsetime as AAA_aver,
       e.lowestresponsetime as AAA_lowe, e.highestresponsetime as AAA_high,
       e.totalcount as AAA_tota,
       f.perspective as CCC_pers, f.averageresponsetime as CCC_aver,
       f.lowestresponsetime as CCC_lowe, f.highestresponsetime as CCC_high,
       f.totalcount as CCC_tota,
       g.perspective as ZZZ_pers, g.averageresponsetime as ZZZ_aver,
       g.lowestresponsetime as ZZZ_lowe, g.highestresponsetime as ZZZ_high,
       g.totalcount as ZZZ_tota
from (
    select distinct date_trunc('minutes', lastflushtime) as lt
    from businessrequestsummary
    where lastflushtime between '2005-01-01 00:00:00' and '2005-03-31 00:00:00'
) as a
left join (
    select date_trunc('minutes', lastflushtime) as lt, perspective,
           averageresponsetime, lowestresponsetime, highestresponsetime, totalcount
    from businessrequestsummary
    where perspective='YYY'
) as b on (a.lt=b.lt)
left join (
    select date_trunc('minutes', lastflushtime) as lt, perspective,
           averageresponsetime, lowestresponsetime, highestresponsetime, totalcount
    from businessrequestsummary
    where perspective='XXX'
) as c on (a.lt=c.lt)
left join (
    select date_trunc('minutes', lastflushtime) as lt, perspective,
           averageresponsetime, lowestresponsetime, highestresponsetime, totalcount
    from businessrequestsummary
    where perspective='BBB'
) as d on (a.lt=d.lt)
left join (
    select date_trunc('minutes', lastflushtime) as lt, perspective,
           averageresponsetime, lowestresponsetime, highestresponsetime, totalcount
    from businessrequestsummary
    where perspective='AAA'
) as e on (a.lt=e.lt)
left join (
    select date_trunc('minutes', lastflushtime) as lt, perspective,
           averageresponsetime, lowestresponsetime, highestresponsetime, totalcount
    from businessrequestsummary
    where perspective='CCC'
) as f on (a.lt=f.lt)
left join (
    select date_trunc('minutes', lastflushtime) as lt, perspective,
           averageresponsetime, lowestresponsetime, highestresponsetime, totalcount
    from businessrequestsummary
    where perspective='ZZZ'
) as g on (a.lt=g.lt)

Basically, the more complex the query gets, the worse MySQL generally does, since its query planner is a pretty simple rules-based one.
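(With PostgreSQL you can also watch the planner justify itself. A hypothetical cut-down version of one branch of the query above, wrapped in EXPLAIN ANALYZE, prints the chosen plan together with estimated and actual row counts and timings:)

    -- plan plus actual timings for one perspective's summary rows
    EXPLAIN ANALYZE
    SELECT date_trunc('minute', lastflushtime) AS lt,
           perspective, averageresponsetime, totalcount
      FROM businessrequestsummary
     WHERE perspective = 'YYY';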
This sort of discussion should really go onto -performance, but I'm at pains to stomp out a common misperception.

On Wed, Mar 02, 2005 at 10:45:38PM +0100, PFC wrote:
> Sure, postgres is (a bit, but not much) slower for a simple query
> like SELECT * FROM one table WHERE id=some number, and

This is true _only if_ nobody else is writing at the same time you are. That is, for single-user or read-only databases, MySQL appears to have a really significant advantage when using the standard MyISAM table type. The problem with that table type is that it requires the _whole table_ be locked during write operations.

In any case, for any sort of real database work, nobody sane would use anything except the InnoDB table type. That's a more reasonable fruit-comparison than MySQL using MyISAM. In the latter case, you may as well compare PostgreSQL to flat-file writing.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
A certain description of men are for getting out of debt, yet are
against all taxes for raising money to pay it off.
--Alexander Hamilton
On Wed, 02 Mar 2005 09:00:14 -0600, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
(...)
> The reason PostgreSQL is slower is that it (and by extension the team
> behind it) cares about your data.
>
> Here's a list of the things MySQL will gladly do wrong:
>
> http://sql-info.de/mysql/gotchas.html

Leaving MySQL or other databases out of the equation for the moment: the above site is a purely dynamic website (i.e. no static files, not even images) driven by a PostgreSQL backend. There are several issues with the underlying application (a DIY hack job ;-) which mean it isn't as fast as it could be. However, although I haven't been able to run comparisons with other RDBMSs, I find it hard to imagine where significant speed gains could be made at the database end, especially if stored procedures are not available (any raw speed increase could well be eaten up by the need to implement several critical functions in the application).

Recently I added a function (for another site on the same server, running from the same database) to generate a blog-style calendar for a given month, showing on which days an article was written. Despite involving a three-table join with a longish list of join conditions, it proved to be jaw-droppingly fast (a few milliseconds - fast enough not to have to cache the result anywhere, which is what I was originally expecting to have to do), and as an added bonus it returns the weekday expressed as an integer, so all the application has to do is a little formatting to produce the end result.

I've also run a PostgreSQL-based multi-thousand-page site (with a simpler structure) without any complaints speedwise; and when one of the disks died very nastily during an intensive write operation (software RAID on dodgy hardware), I was even able to rsync the database files direct from the surviving disk over to a backup server and restart PostgreSQL there straight off, without any evident problems. (Disclaimer: it was an emergency, and the data was non-critical; nevertheless I never found any evidence of corruption.)

Ian Barwick
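(For the curious, a rough sketch of the shape such a calendar query can take. The schema here - articles, article_topics, topics - is entirely hypothetical; Ian's actual query isn't shown in the thread. date_part('dow', ...) is what returns the weekday as an integer, 0 = Sunday:)

    -- which days in March 2005 have at least one article, plus the weekday
    SELECT date_trunc('day', a.posted_at)::date AS day,
           date_part('dow', a.posted_at)::int   AS weekday,
           count(*)                             AS articles
      FROM articles a
      JOIN article_topics at2 ON at2.article_id = a.id
      JOIN topics t           ON t.id = at2.topic_id
     WHERE a.posted_at >= '2005-03-01'
       AND a.posted_at <  '2005-04-01'
     GROUP BY 1, 2
     ORDER BY 1;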
lists@boutiquenumerique.com (PFC) writes:
>> The reason PostgreSQL is slower is that it (and by extension the
>> team behind it) cares about your data.
>
> Sure, postgres is (a bit, but not much) slower for a simple query
> like SELECT * FROM one table WHERE id=some number, and postgres is a
> lot slower for UPDATEs (although I heard that it's faster than MySQL
> InnoDB)...

There is a _HUGE_ set of misconceptions here.

1. The speed difference is only reputedly true for simple selects when done against MyISAM tables.

2. That speed difference for UPDATEs is only true if you are talking about ONE MySQL(tm) client doing updates against MyISAM tables. MyISAM does not support row locks; if multiple clients are trying to update a table, they must fight for a single table lock, with the result that updating tables doesn't scale _at all_ with MySQL(tm) for the default table type.

If you only have one process touching the database, MySQL(tm) can therefore look quite a lot better than PostgreSQL. Move to 2 clients and it's not quite so good. Move to 100 concurrent clients all trying to do updates and you may discover that you simply can't do that...

--
let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/linuxxian.html
A VAX is virtually a computer, but not quite.
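(The row-lock point is easy to see from two psql sessions. A minimal sketch - the accounts table is hypothetical - of what MVCC buys you here: neither readers, nor writers of *other* rows, have to wait:)

    -- session 1: update one row and keep the transaction open
    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    -- (no COMMIT yet)

    -- session 2, meanwhile: neither statement blocks under PostgreSQL.
    -- The SELECT sees the pre-update version of row 1 (MVCC), and the
    -- UPDATE touches a different row, so only row-level locks are involved.
    SELECT balance FROM accounts WHERE id = 1;
    UPDATE accounts SET balance = balance + 50 WHERE id = 2;

    -- session 1, finally:
    COMMIT;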
Hi Richard, thank you for your appreciated answers!!!

- start quote -
Well, do you care whether your data is consistent or not? If not, you don't need transactions.
- end quote -

I don't require transactions because the queries aren't complex and update a single tuple (in SELECTs, transactions are useless)

- start quote -
You'll find inserts/updates with lots of users is where PostgreSQL works well compared to other systems.
- end quote -

Uhhmm.. this is interesting...

- tutorial links -
Thx, now I'll read them and test a hardware-tuned configuration... I read that it is not very simple... :O

Another question:
- why aren't Postgres releases already configured (hardware tuning)? isn't it possible to configure it during installation?
- why does postgres use a new process for every query? (mySQL, if I'm not wrong, uses threads... I think it's faster)
- why is connection time slower? (compared to mySQL)
- why does postgres require analyze? (mySQL, if I'm not wrong, doesn't require it)

Your answers will be very appreciated! Thx
Hi, thanks a lot! You are right, but I did read your message ;)

Yes,
1- I misconfigured PostgreSQL (I thought it came already configured appropriately for the released version - Fedora Core 3 64-bit).
2- The benchmark is, clearly after your explanations, a MySQL-tuned application test.
3- I think the benchmark tests only one connection; in a fast reading I saw no threading in the benchmark code to simulate concurrent user requests.
4- I didn't test transaction-safe tables (transactions aren't used explicitly in my application)

I understand it isn't simple.. I use the DBMS in a data analysis environment: most of the time is spent in queries (PHP is 0.1%) with many sub-selects, and maybe, at the same time, from 1 to 1000 users insert/update data. I test the DBMS with my data analysis framework, simulating a very heavy load.

Do you know where I can find a tutorial on configuring Postgres' hardware-dependent internal values?

Thx, best regards,
Mauro
> I don't require transactions because the queries aren't complex and
> update a single tuple (in SELECTs, transactions are useless)

You mean, you have no foreign keys in your database?
And even in SELECTs transactions are definitely useful (think SELECT FOR UPDATE, isolation level SERIALIZABLE...)

> - start quote -
> You'll find inserts/updates with lots of users is where PostgreSQL
> works well compared to other systems.
> - end quote -
> Uhhmm.. this is interesting...

pg does not lock the whole table every time anyone wants to write to it. In MySQL, when you run a big select, all write activity stops while it runs. If you run a big update, all activity other than this update has to wait.

> - why does postgres use a new process for every query? (mySQL, if I'm
> not wrong, uses threads... I think it's faster)

Not for every query - for every CONNECTION.
You are using persistent connections, aren't you? Are you?

> - why is connection time slower? (compared to mySQL)

This is of no importance, as everyone uses persistent connections anyway.

> - why does postgres require analyze? (mySQL, if I'm not wrong,
> doesn't require it)
> Your answers will be very appreciated! Thx

So it has a planner which knows what it's doing ;) instead of just guessing in the dark.

And MySQL requires analyze too (read the docs); optimize table looks like vacuum to me, and sometimes repair table...
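(A minimal sketch of what "transactions in SELECTs" looks like in practice - the accounts table and the amounts are hypothetical. The row is locked when read, so the later update can't race a concurrent writer:)

    BEGIN;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    -- take a row-level lock on the row we're about to modify
    SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    COMMIT;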
>> No, I don't have foreign keys in the older version; in the new one I
>> have them... however I manage relations from app code (PHP)...

Really?
In my experience this is a sure way to get inconsistencies slowly creeping into your database, and you also get a load of funky concurrency issues.

>> doesn't MySQL allow using foreign keys in a safe and fast way then?

It does, IF you use the InnoDB engine... which is slower than postgres... and there are a lot of gotchas.

>>> Not for every query - for every CONNECTION.
>>> You are using persistent connections, aren't you? Are you?
>>
>> I'm using PHP and every user (can be from 1 user to 100 users) must
>> connect to the database... do you know how I can use persistent
>> connections? I think it's impossible... am I wrong?

Well, first, I get a connection establishment time of about 20 ms in mysql and 60 ms in postgres. This information is useless, as I use persistent connections, obviously - it would be crazy to spend 20 ms connecting just to make a 0.5 ms query.

Now, in PHP you can use mysql_pconnect instead of mysql_connect to get a persistent connection. mod_php keeps a pool of connections. The same thing probably applies for postgres, but as I don't use it with PHP (only with Python) I can't tell. Look in the docs for "persistent connections". This way, each Apache server process keeps a persistent connection open and re-uses it for every page. You save the connection establishment time and load.

>>>> - why is connection time slower? (compared to mySQL)

Because MySQL forks a thread whereas Postgres forks a process.

>>> This is of no importance, as everyone uses persistent connections
>>> anyway.
>> See last answer...

I hope my explanations are useful.

>>> And MySQL requires analyze too (read the docs); optimize table
>>> looks like vacuum to me, and sometimes repair table...
>> Ok... they are conceptually implemented in the same mode...

Well, not really. For instance, when you make joins, postgres will look at the ANALYZE stats and say "Hm, this value seems rare, I'll use an index scan to get these few values", or "This column has few distinct values, I'd better load them all into a hash before joining to this big table, instead of making a lot of index scans"... it can get a lot more complicated. MySQL thinks "I see indexed column, I don't know what a hash join is, thus I use index." Both try to estimate the size of result sets to choose plans; postgres generally does it well, mysql sometimes does something which happens to work, and most of the time it makes no difference. Using the MySQL analyze does seem to speed up some of my queries, though. I don't think it has such detailed stats as postgres.

Point is, if the query gets complex, forget MySQL...
>> Really?
>> In my experience this is a sure way to get inconsistencies slowly
>> creeping into your database, and you also get a load of funky
>> concurrency issues.
>
> Yes, you are right... my inserts/updates are very simple and without
> problems, so I thought to hand-code the 'foreign keys' to make the
> management faster/simpler and not overload the db (and use exception
> handling in the code)... but I had a problem with pgSQL because the
> server was very busy and the same query was replicated (because users
> refreshed the page... :( ) so now I have foreign keys...

It's a typical case: the program being stopped between insertion of the parent and the child row. Although in this case FKs are not the solution - transactions are.

> What about it? (I think it's the same with mySQL...) I don't know how
> many users will connect... 1, 10, 1000... must I create a pool with
> 1000 connections? is this fine? if connections aren't released I must
> reset them manually, which is dangerous... don't you think?

Just set the max number of connections for postgres a bit higher than the max number of apache processes in apache.conf.
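(Concretely, that is one line on each side. The numbers below only illustrate the "a bit higher" rule; they are assumptions, not recommended values:)

    # httpd.conf: cap the number of Apache processes (each holds one pconnect)
    MaxClients 150

    # postgresql.conf: leave headroom above that cap
    max_connections = 170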
>>> No, I don't have foreign keys in the older version; in the new one
>>> I have them... however I manage relations from app code (PHP)...
>>
>> Really?
>> In my experience this is a sure way to get inconsistencies slowly
>> creeping into your database, and you also get a load of funky
>> concurrency issues.

Yes, you are right... my inserts/updates are very simple and without problems, so I thought to hand-code the 'foreign keys' to make the management faster/simpler and not overload the db (and use exception handling in the code)... but I had a problem with pgSQL because the server was very busy and the same query was replicated (because users refreshed the page... :( ) so now I have foreign keys...

>> Now, in PHP you can use mysql_pconnect instead of mysql_connect to
>> get a persistent connection. mod_php keeps a pool of connections.
>> The same thing probably applies for postgres, but as I don't use it
>> with PHP (only with Python) I can't tell. Look in the docs for
>> "persistent connections".

mmm... This is interesting... I see it's useful, but there are some 'warnings' about using it (see http://it.php.net/manual/en/features.persistent-connections.php):

"Note, however, that this can have some drawbacks if you are using a database with connection limits that are exceeded by persistent child connections. If your database has a limit of 16 simultaneous connections, and in the course of a busy server session, 17 child threads attempt to connect, one will not be able to. If there are bugs in your scripts which do not allow the connections to shut down (such as infinite loops), the database with only 16 connections may be rapidly swamped. Check your database documentation for information on handling abandoned or idle connections."

What about it? (I think it's the same with mySQL...) I don't know how many users will connect... 1, 10, 1000... must I create a pool with 1000 connections? is this fine? if connections aren't released I must reset them manually, which is dangerous... don't you think?

Thx!
Mauro
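(If you're worried about swamping the server, Postgres can tell you what the pool is actually doing. pg_stat_activity is a standard system view; note that in 8.0 the query text only shows up if stats_command_string is enabled in postgresql.conf:)

    -- how many backends are connected right now?
    SELECT count(*) FROM pg_stat_activity;

    -- who they are and what they are running
    SELECT procpid, usename, current_query FROM pg_stat_activity;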
>> Yes, you are right... my inserts/updates are very simple and without
>> problems, so I thought to hand-code the 'foreign keys' to make the
>> management faster/simpler and not overload the db (and use exception
>> handling in the code)... but I had a problem with pgSQL because the
>> server was very busy and the same query was replicated (because
>> users refreshed the page... :( ) so now I have foreign keys...
>
> It's a typical case: the program being stopped between insertion of
> the parent and the child row. Although in this case FKs are not the
> solution - transactions are.

I insert a tuple (in the table answers) if the key 'iduser->idquestion' doesn't exist, but when the key exists I update the value; this is coded in PHP because I thought that otherwise I'd have to use functions... there are no parent/child rows... how can I use transactions here?

Mauro
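(A minimal sketch of the transactional pattern for exactly this case, assuming a unique constraint on (iduser, idquestion) - the table and key names come from Mauro's description, the value column and the literals are hypothetical:)

    BEGIN;
    -- try the update first
    UPDATE answers SET value = 42 WHERE iduser = 1 AND idquestion = 7;
    -- if the app sees zero rows affected, insert instead; WHERE NOT EXISTS
    -- narrows the race window, and the unique constraint catches the rest
    -- (the app simply retries on a unique-violation error)
    INSERT INTO answers (iduser, idquestion, value)
    SELECT 1, 7, 42
    WHERE NOT EXISTS (SELECT 1 FROM answers
                       WHERE iduser = 1 AND idquestion = 7);
    COMMIT;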
--- PFC <lists@boutiquenumerique.com> wrote:
> You mean, you have no foreign keys in your database?
> And even in SELECTs transactions are definitely useful (think SELECT
> FOR UPDATE, isolation level SERIALIZABLE...)

No, I don't have foreign keys in the older version; in the new one I have them... however I manage relations from app code (PHP)... doesn't MySQL allow using foreign keys in a safe and fast way then?

>> - why does postgres use a new process for every query? (mySQL, if
>> I'm not wrong, uses threads... I think it's faster)
>
> Not for every query - for every CONNECTION.
> You are using persistent connections, aren't you? Are you?

I'm using PHP and every user (can be from 1 user to 100 users) must connect to the database... do you know how I can use persistent connections? I think it's impossible... am I wrong?

>> - why is connection time slower? (compared to mySQL)
>
> This is of no importance, as everyone uses persistent connections
> anyway.

See last answer...

>> - why does postgres require analyze? (mySQL, if I'm not wrong,
>> doesn't require it)
>
> So it has a planner which knows what it's doing ;) instead of just
> guessing in the dark.
>
> And MySQL requires analyze too (read the docs); optimize table looks
> like vacuum to me, and sometimes repair table...

Ok... they are conceptually implemented in the same mode...

Thx a lot! Now my ideas about pg & mySQL are clear (and correct ;) )

Mauro