Thread: MySQL vs PG TPC-H benchmarks

MySQL vs PG TPC-H benchmarks

From
Nick Barr
Date:
Hi,

Has anyone had a look at:

http://people.ac.upc.es/zgomez/

I realize that MySQL & PG cannot really be compared (especially when you
consider the issues that MySQL has with things like data integrity) but
still surely PG would perform better than the stats show (i.e. #7 31.28
seconds versus 42 minutes!!!).

On a side note, it certainly looks like Linux kernel 2.6 is quite a bit
faster in comparison to 2.4.

Nick




Re: MySQL vs PG TPC-H benchmarks

From
Rod Taylor
Date:
> I realize that MySQL & PG cannot really be compared (especially when you
> consider the issues that MySQL has with things like data integrity) but
> still surely PG would perform better than the stats show (i.e. #7 31.28
> seconds versus 42 minutes!!!).

We know that PostgreSQL 7.5 will perform much better than 7.4 did due to
the efforts of OSDN and Tom.

I've enquired as to whether they ran ANALYZE after the data load. They
don't explicitly mention it, and given that they mention it took 2.5 days
to load 1GB of data, they're not regular PostgreSQL users.
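[For reference, the post-load step in question is a one-liner; the table
names below are hypothetical, not from the benchmark:]

```sql
-- After a bulk load, refresh the planner's statistics. Without this,
-- PostgreSQL falls back on default row-count estimates and will often
-- pick sequential scans or poor join orders on large tables.
ANALYZE lineitem;
ANALYZE orders;

-- Or simply analyze every table in the current database:
ANALYZE;
```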

--
Rod Taylor <rbt [at] rbt [dot] ca>

Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
PGP Key: http://www.rbt.ca/signature.asc


Re: MySQL vs PG TPC-H benchmarks

From
Rod Taylor
Date:
On Wed, 2004-04-21 at 08:19, Rod Taylor wrote:
> > I realize that MySQL & PG cannot really be compared (especially when you
> > consider the issues that MySQL has with things like data integrity) but
> > still surely PG would perform better than the stats show (i.e. #7 31.28
> > seconds versus 42 minutes!!!).
>
> We know that PostgreSQL 7.5 will perform much better than 7.4 did due to
> the efforts of OSDN and Tom.

OSDL not OSDN.

> I've enquired as to whether they ran ANALYZE after the data load. They
> don't explicitly mention it, and given that they mention it took 2.5 days
> to load 1GB of data, they're not regular PostgreSQL users.


Re: MySQL vs PG TPC-H benchmarks

From
Paul Thomas
Date:
On 21/04/2004 09:31 Nick Barr wrote:
> Hi,
>
> Has anyone had a look at:
>
> http://people.ac.upc.es/zgomez/
>
> I realize that MySQL & PG cannot really be compared (especially when you
> consider the issues that MySQL has with things like data integrity) but
> still surely PG would perform better than the stats show (i.e. #7 31.28
> seconds versus 42 minutes!!!).

Looks like he's using the default postgresql.conf settings, in which case
I'm not surprised at pg looking so slow. His stated use of foreign keys
invalidates the tests anyway, as MyISAM tables don't support FKs, so we're
probably seeing FK check overheads in pg that are simply ignored by MySQL.
In an honest test, MySQL should be reported as failing those tests.

Perhaps one of the advocacy team will pick up the baton?
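[As an aside, a sketch of the asymmetry being described, with hypothetical
table names: MyISAM parses a REFERENCES clause but silently discards it,
while PostgreSQL enforces it on every write.]

```sql
-- PostgreSQL: the constraint is enforced. Every INSERT into orders
-- incurs a lookup against customer -- the FK-check overhead referred
-- to above.
CREATE TABLE customer (c_custkey integer PRIMARY KEY);
CREATE TABLE orders (
    o_orderkey integer PRIMARY KEY,
    o_custkey  integer REFERENCES customer (c_custkey)
);

-- MySQL with MyISAM accepts the identical REFERENCES syntax but
-- ignores it: inserts pay no integrity-check cost, and orphaned
-- rows are silently accepted.
```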
>
> On a side note it certainly looks like linux kernel 2.6 is quite a bit
> faster in comparision to 2.4.

Yes, I've seen other benchmarks which also show that.

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for Business             |
| Computer Consultants         | http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+

Re: MySQL vs PG TPC-H benchmarks

From
"Cestmir Hybl"
Date:
> Looks like he's using the default postgresql.conf settings in which case
> I'm not suprised at pg looking so slow.

The question, IMHO, is also why the hell PostgreSQL still comes out of the
box with such conservative configuration defaults, totally underestimated for
today's average hardware configuration (1+ GHz CPU, 0.5+ GB RAM, fast FSB,
fast HDD).

It seems a better strategy to me to force that 1% of users to "downgrade"
the config than vice versa.
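[For illustration, the handful of settings most often raised in this
discussion; the values below are hypothetical starting points for a machine
with 0.5+ GB RAM, not recommendations, and the shipped defaults are far
smaller:]

```
# postgresql.conf -- illustrative 7.4-era tuning for a modest modern box
shared_buffers = 4096          # 8 kB pages, ~32 MB of shared buffer cache
sort_mem = 8192                # kB of per-sort working memory
effective_cache_size = 32768   # 8 kB pages; hint at the OS file cache size
```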

regards
ch


Re: MySQL vs PG TPC-H benchmarks

From
Paul Thomas
Date:
On 21/04/2004 14:31 Cestmir Hybl wrote:
> > Looks like he's using the default postgresql.conf settings in which
> > case I'm not suprised at pg looking so slow.
>
> The question also is, IMHO, why the hell, postgreSQL still comes out of
> the box with so stupid configuration defaults, totally underestimated
> for todays average hardware configuration (1+GHz, 0.5+GB RAM, fast FSB,
> fast HDD).
>
> It seems to me better strategy to force that 1% of users to "downgrade"
> cfg. than vice-versa.
>
> regards
> ch
>

This has been discussed many times before. Check the archives.

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for Business             |
| Computer Consultants         | http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+

Re: MySQL vs PG TPC-H benchmarks

From
Josh Berkus
Date:
Folks,

I've sent a polite e-mail to Mr. Gomez offering our help.  Please, nobody
flame him!

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: MySQL vs PG TPC-H benchmarks

From
"Matthew T. O'Connor"
Date:
Paul Thomas wrote:

> Looks like he's using the default postgresql.conf settings in which
> case I'm not suprised at pg looking so slow. His stated use of foreign
> keys invalidates the tests anyway as MyISAM tables don't support FKs
> so we're probably seeing FK check overheads in pg that are simply
> ignore by MySQL. In an honest test, MySQL should be reported as
> failing those tests.


Either failures, or they should not have been using MyISAM; they should
have used the table type that supports FKs. This is just not apples
to apples.



Re: [pgsql-advocacy] MySQL vs PG TPC-H benchmarks

From
Jan Wieck
Date:
Josh Berkus wrote:

> Folks,
>
> I've sent a polite e-mail to Mr. Gomez offering our help.  Please, nobody
> flame him!
>

Please keep in mind that the entire test has, other than perhaps a similar
database schema and query types, nothing to do with TPC-H. I
don't see any kind of SUT. Foreign key support at the DB level is not
required by any of the TPC benchmarks. But the System Under Test, which
is the combination of middleware application and database together with
all the computers and network components these parts run on, must
implement all the required semantics, like ACID properties, referential
integrity &c. One could implement a TPC-H with flat files; it's just a
major pain in the middleware.

A proper TPC benchmark implementation would for example be a complete
PHP+DB application, where the user interaction is done by an emulated
"browser" and what is measured is the HTTP response times, not anything
going on between PHP and the DB. Assuming that all requirements of the
TPC specification are implemented, either by using available DB features
or by including appropriate workarounds in the PHP code, that would very
well lead to something that can compare PHP+MySQL vs. PHP+PostgreSQL.

All TPC benchmarks I have seen are performed by timing such a system
after a considerable ramp-up time, giving the DB system a chance to
properly populate caches and so forth. Rebooting the machine just before
the test is the wrong thing here and will especially kill any advanced
cache algorithms like ARC.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: [pgsql-advocacy] MySQL vs PG TPC-H benchmarks

From
Eduardo Almeida
Date:
Folks,

I'm doing the 100GB TPC-H and I'll show the preliminary
results to our community (Postgres) in 3 weeks, before
finishing the study.

My intention is to carry through a test with a VLDB on
a low cost platform (PostgreSQL, Linux and cheap HW),
not to compare with another DBMS.

So far I can tell you that PG 7.4.2 with kernel 2.6.5,
on a dual Opteron model 240 in RAID 0 with 8 disks
(960 GB), loaded the database in less than 24 hours:
about 7h30min to load the data and 16:09:25 to
create the indexes.

The Power test is still running, and that's why I'll not
present anything so far. For now I'll just send the
list my environment configuration.

- The configuration of the machine is:
Dual opteron 64 bits model 240
4GB RAM
960 GB on RAID 0
Mandrake Linux 64 with Kernel 2.6.5  (I compiled a
kernel for this test)
Java SDK  java version "1.4.2_04"
PostgreSQL JDBC pg74.1jdbc3.jar

- The TPC-H configuration is:
TPC-H 2.0.0
100GB
load using flat files
Refresh functions using java

- The PostgreSQL 7.4.2 configuration is:

add_missing_from               | on
 australian_timezones           | off
 authentication_timeout         | 60
 check_function_bodies          | on
 checkpoint_segments            | 128
 checkpoint_timeout             | 300
 checkpoint_warning             | 30
 client_encoding                | SQL_ASCII
 client_min_messages            | notice
 commit_delay                   | 0
 commit_siblings                | 5
 cpu_index_tuple_cost           | 0.001
 cpu_operator_cost              | 0.0025
 cpu_tuple_cost                 | 0.01
 DateStyle                      | ISO, MDY
 db_user_namespace              | off
 deadlock_timeout               | 1000
 debug_pretty_print             | off
 debug_print_parse              | off
 debug_print_plan               | off
 debug_print_rewritten          | off
 default_statistics_target      | 10
 default_transaction_isolation  | read committed
 default_transaction_read_only  | off
 dynamic_library_path           | $libdir
 effective_cache_size           | 150000
 enable_hashagg                 | on
 enable_hashjoin                | on
 enable_indexscan               | on
 enable_mergejoin               | on
 enable_nestloop                | on
 enable_seqscan                 | on
 enable_sort                    | on
 enable_tidscan                 | on
 explain_pretty_print           | on
 extra_float_digits             | 0
 from_collapse_limit            | 8
 fsync                          | off
 geqo                           | on
 geqo_effort                    | 1
 geqo_generations               | 0
geqo_pool_size                 | 0
 geqo_selection_bias            | 2
 geqo_threshold                 | 11
 join_collapse_limit            | 8
 krb_server_keyfile             | unset
 lc_collate                     | en_US
 lc_ctype                       | en_US
 lc_messages                    | C
 lc_monetary                    | C
 lc_numeric                     | C
 lc_time                        | C
 log_connections                | off
 log_duration                   | off
 log_error_verbosity            | default
 log_executor_stats             | off
 log_hostname                   | off
 log_min_duration_statement     | -1
 log_min_error_statement        | panic
 log_min_messages               | notice
 log_parser_stats               | off
 log_pid                        | off
 log_planner_stats              | off
 log_source_port                | off
 log_statement                  | off
 log_statement_stats            | off
 log_timestamp                  | off
 max_connections                | 10
 max_expr_depth                 | 10000
 max_files_per_process          | 1000
 max_fsm_pages                  | 20000
 max_fsm_relations              | 1000
 max_locks_per_transaction      | 64
 password_encryption            | on
 port                           | 5432
 pre_auth_delay                 | 0
 preload_libraries              | unset
 random_page_cost               | 1.25
 regex_flavor                   | advanced
 rendezvous_name                | unset
 search_path                    | $user,public
 server_encoding                | SQL_ASCII
 server_version                 | 7.4.2
 shared_buffers                 | 40000
 silent_mode                    | off
sort_mem                       | 65536
 sql_inheritance                | on
 ssl                            | off
 statement_timeout              | 10000000
 stats_block_level              | off
 stats_command_string           | off
 stats_reset_on_server_start    | on
 stats_row_level                | off
 stats_start_collector          | on
 superuser_reserved_connections | 2
 syslog                         | 0
 syslog_facility                | LOCAL0
 syslog_ident                   | postgres
 tcpip_socket                   | on
 TimeZone                       | unknown
 trace_notify                   | off
 transaction_isolation          | read committed
 transaction_read_only          | off
 transform_null_equals          | off
 unix_socket_directory          | unset
 unix_socket_group              | unset
 unix_socket_permissions        | 511
 vacuum_mem                     | 65536
 virtual_host                   | unset
 wal_buffers                    | 32
 wal_debug                      | 0
 wal_sync_method                | fdatasync
 zero_damaged_pages             | off
(113 rows)


suggestions, doubts and commentaries are very welcome

regards
______________________________
Eduardo Cunha de Almeida
Administração de Banco de Dados
UFPR - CCE
+55-41-361-3321
eduardo.almeida@ufpr.br
edalmeida@yahoo.com

--- Jan Wieck <JanWieck@Yahoo.com> wrote:
> [quoted message trimmed]





__________________________________
Do you Yahoo!?
Yahoo! Photos: High-quality 4x6 digital prints for 25¢
http://photos.yahoo.com/ph/print_splash

Re: [pgsql-advocacy] MySQL vs PG TPC-H benchmarks

From
Jan Wieck
Date:
Eduardo Almeida wrote:

> Folks,
>
> I’m doing the 100GB TPC-H and I’ll show the previous
> results to our community (Postgres) in 3 weeks before
> finishing the study.
>
> My intention is to carry through a test with a VLDB in
> a low cost platform (PostgreSQL, Linux and cheap HW)
> and not to compare with another DBMS.

QphH and Price/QphH will be enough for us to see where in the list we
are. Unfortunately there are only Sybase and MS SQL results published in
the 100 GB category. The 300 GB category has DB2 as well. Oracle starts
at 1 TB, and in the 10 TB category Oracle and DB2 are the only players left.


Jan

>
> So far I can tell you that the load time on PG 7.4.2
> with kernel 2.6.5 on Opteron 64 model 240 in RAID 0
> with 8 disks (960 GB) loaded the database in less than
> 24 hours.
> About 7hs:30min to load the data and 16:09:25 to
> create the indexes
>
> The Power test still running and that’s why I’ll not
> present anything so far. Now I’ll just send to the
> list my environment configuration.
>
> [environment details and configuration listing trimmed]


--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: [pgsql-advocacy] MySQL vs PG TPC-H benchmarks

From
Grega Bremec
Date:
...and on Thu, Apr 22, 2004 at 05:53:18AM -0700, Eduardo Almeida used the keyboard:
>
> - The configuration of the machine is:
> Dual opteron 64 bits model 240
> 4GB RAM
> 960 GB on RAID 0
> Mandrake Linux 64 with Kernel 2.6.5  (I compiled a
> kernel for this test)
> Java SDK  java version "1.4.2_04"
> PostgreSQL JDBC pg74.1jdbc3.jar
>
> - The TPC-H configuration is:
> TPC-H 2.0.0
> 100GB
> load using flat files
> Refresh functions using java
>

I'll just add for reference, for those who aren't aware of it: a Java
virtual machine for x86_64 only exists in the 1.5 branch so far, and it's so
utterly unstable that almost any notable shuffling around in memory
crashes it. :)

Hence 1.4.2_04 is a 32-bit application running in 32-bit mode.

I won't get into how much this affects the benchmarks, as I haven't
really looked into how CPU- and memory-intensive the refresh functions are
here, so as I said - let's keep it as a reference point.

Cheers,
--
    Grega Bremec
    Senior Administrator
    Noviforum Ltd., Software & Media
    http://www.noviforum.si/


Re: [pgsql-advocacy] MySQL vs PG TPC-H benchmarks

From
Tom Lane
Date:
Eduardo Almeida <edalmeida@yahoo.com> writes:
> About 7hs:30min to load the data and 16:09:25 to
> create the indexes

You could probably improve the index-create time by temporarily
increasing sort_mem.  It wouldn't be unreasonable to give CREATE INDEX
several hundred meg to work in.  (You don't want sort_mem that big
normally, because there may be many sorts happening in parallel,
but in a data-loading context there'll just be one active sort.)
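[A sketch of what is being suggested, assuming 7.4-era settings; the table
and index names are hypothetical, and the value is only illustrative:]

```sql
-- Raise sort_mem for this session only, for the duration of the index
-- build, then return to the normal setting. sort_mem is in kilobytes
-- in 7.4, so this grants the single active sort roughly 500 MB.
SET sort_mem = 512000;
CREATE INDEX i_l_orderkey ON lineitem (l_orderkey);
RESET sort_mem;
```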

            regards, tom lane

Re: [pgsql-advocacy] MySQL vs PG TPC-H benchmarks

From
Tom Lane
Date:
Markus Bertheau <twanger@bluetwanger.de> writes:
>> You could probably improve the index-create time by temporarily
>> increasing sort_mem.  It wouldn't be unreasonable to give CREATE INDEX
>> several hundred meg to work in.  (You don't want sort_mem that big
>> normally, because there may be many sorts happening in parallel,
>> but in a data-loading context there'll just be one active sort.)

> Doesn't this provide a reason for CREATE INDEX not to honour sort_mem?

Already done for 7.5.

http://archives.postgresql.org/pgsql-committers/2004-02/msg00025.php

            regards, tom lane

Re: [pgsql-advocacy] MySQL vs PG TPC-H benchmarks

From
Markus Bertheau
Date:
On Thu, 2004-04-22 at 17:54, Tom Lane wrote:
> Eduardo Almeida <edalmeida@yahoo.com> writes:
> > About 7hs:30min to load the data and 16:09:25 to
> > create the indexes
>
> You could probably improve the index-create time by temporarily
> increasing sort_mem.  It wouldn't be unreasonable to give CREATE INDEX
> several hundred meg to work in.  (You don't want sort_mem that big
> normally, because there may be many sorts happening in parallel,
> but in a data-loading context there'll just be one active sort.)

Doesn't this provide a reason for CREATE INDEX not to honour sort_mem?

--
Markus Bertheau <twanger@bluetwanger.de>


Re: [pgsql-advocacy] MySQL vs PG TPC-H benchmarks

From
Eduardo Almeida
Date:
Folks,

I forgot to mention that I used shell scripts to load
the data and used Java just to run the refresh
functions.

Talking about the sort_mem config: I used 65000, but the
TPC-H specification says that you are not allowed to
change the configuration once you start the benchmark. Is
it a big problem to use 65000? In the 100GB TPC-H we
run 5 streams in parallel for the throughput test! For the
power test I think it is not a problem, because it runs
one query after another.

Another thing is that I set statement_timeout =
10000000.

Some queries may exceed this timeout, and I'll send the
EXPLAIN for those.

The last thing is that Jan forgot to mention that
Teradata doesn't show up now, but in older lists it shows
3TB and 10TB results.

regards
Eduardo
--- Jan Wieck <JanWieck@Yahoo.com> wrote:
> [quoted message trimmed]




