Re: Joel's Performance Issues WAS : Opteron vs Xeon - Mailing list pgsql-performance

From Joel Fradkin
Subject Re: Joel's Performance Issues WAS : Opteron vs Xeon
Date
Msg-id 000f01c54679$8d2f1f70$797ba8c0@jfradkin
Whole thread Raw
In response to Re: Joel's Performance Issues WAS : Opteron vs Xeon  (John A Meinel <john@arbash-meinel.com>)
Responses Re: Joel's Performance Issues WAS : Opteron vs Xeon
List pgsql-performance
Why is MYSQL returning 360,000 rows, while Postgres is only returning
330,000? This may not be important at all, though.
I also assume you are selecting from a plain table, not a view.

Yes plain table. Difference in rows is one of the datasets had sears data in
it. It (speed differences found) is much worse on some of my views, which is
what forced me to start looking at other options.

I suppose knowing your work_mem, and shared_buffers settings would be
useful. I have posted my configs, but will add the Tampa to the bottom
again. My desktop has
# - Memory -

shared_buffers = 8000        # min 16, at least max_connections*2, 8KB
each
work_mem = 8000#1024        # min 64, size in KB
maintenance_work_mem = 16384    # min 1024, size in KB
#max_stack_depth = 2048        # min 100, size in KB

# - Free Space Map -

max_fsm_pages = 30000#20000        # min max_fsm_relations*16, 6 bytes
each
max_fsm_relations = 1000    # min 100, ~50 bytes each
# - Planner Cost Constants -

effective_cache_size = 80000#1000    # typically 8KB each
random_page_cost = 2        # units are one sequential page fetch cost

How were you measuring "data retrieval time"? And how does this compare
to what you were measuring on the other machines? It might be possible
that what you are really measuring is just the time it takes psql to
load up all the data into memory, and then print it out. And since psql
defaults to measuring entry lengths for each column, this may not be
truly comparable.
It *looks* like it only takes 18s for postgres to get the data, but then
it is taking 72s to transfer the data to you. That would be network
latency, or something like that, not database latency.
And I would say that 18s is very close to 16 or 17 seconds.
This was ran on the machine with database (as was MYSQL and MSSQL).
The PG timing was from PGADMIN and the 18 secs was second run, first run was
Same time to return the data and 70 secs to do the first part like 147 secs
all told, compared to the 40 seconds first run of MYSQL and 56 Seconds
MSSQL. MYSQL was done in their query tool, it returns the rows as well and
MSSQL was done in their query analyzer. All three tools appear to use a
similar approach. Just an FYI doing an explain analyze of my problem view
took much longer then actually returning the data in MSSQL and MYSQL. I have
done extensive testing with MYSQL (just this table and two of my problem
views). I am not using the transactional version, because I need the best
speed.


I don't know what commands you were issuing, or how you measured,
though. You might be using some other interface (like ODBC), which I
can't say too much about.

John
=:->

This is the Linux box config.
# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Please note that re-commenting a setting is NOT sufficient to revert it
# to the default value, unless you restart the postmaster.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
# "pg_ctl reload". Some settings, such as listen_address, require
# a postmaster shutdown and restart to take effect.


#---------------------------------------------------------------------------
# FILE LOCATIONS
#---------------------------------------------------------------------------

# The default values of these variables are driven from the -D command line
# switch or PGDATA environment variable, represented here as ConfigDir.
# data_directory = 'ConfigDir'        # use data in another directory
#data_directory = '/pgdata/data'
# hba_file = 'ConfigDir/pg_hba.conf'    # the host-based authentication file
# ident_file = 'ConfigDir/pg_ident.conf'  # the IDENT configuration file

# If external_pid_file is not explicitly set, no extra pid file is written.
# external_pid_file = '(none)'        # write an extra pid file


#---------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#---------------------------------------------------------------------------

# - Connection Settings -

#listen_addresses = 'localhost'    # what IP interface(s) to listen on;
                # defaults to localhost, '*' = any

listen_addresses = '*'
port = 5432
max_connections = 100
    # note: increasing max_connections costs about 500 bytes of shared
    # memory per connection slot, in addition to costs from
shared_buffers
    # and max_locks_per_transaction.
#superuser_reserved_connections = 2
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777    # octal
#rendezvous_name = ''        # defaults to the computer name

# - Security & Authentication -

#authentication_timeout = 60    # 1-600, in seconds
#ssl = false
#password_encryption = true
#krb_server_keyfile = ''
#db_user_namespace = false


#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------

# - Memory -

shared_buffers = 12288 #5000  min 16, at least max_connections*2, 8KB each
#work_mem = 1024        # min 64, size in KB
work_mem = 16384 # 8192
#maintenance_work_mem = 16384    # min 1024, size in KB
#max_stack_depth = 2048        # min 100, size in KB

# - Free Space Map -

max_fsm_pages = 100000    #30000    # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 1500 #1000    # min 100, ~50 bytes each

# - Kernel Resource Usage -

#max_files_per_process = 1000    # min 25
#preload_libraries = ''

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0        # 0-1000 milliseconds
#vacuum_cost_page_hit = 1    # 0-10000 credits
#vacuum_cost_pagE_miss = 10    # 0-10000 credits
#vacuum_cost_page_dirty = 20    # 0-10000 credits
#vacuum_cost_limit = 200    # 0-10000 credits

# - Background writer -

#bgwriter_delay = 200        # 10-10000 milliseconds between rounds
#bgwriter_percent = 1        # 0-100% of dirty buffers in each round
#bgwriter_maxpages = 100    # 0-1000 buffers max per round


#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -
fsync = true            # turns forced synchronization on or off
wal_sync_method = open_sync# fsync    # the default varies across
platforms:
                # fsync, fdatasync, open_sync, or
open_datasync
wal_buffers = 2048#8        # min 4, 8KB each
#commit_delay = 0        # range 0-100000, in microseconds
#commit_siblings = 5        # range 1-1000

# - Checkpoints -

checkpoint_segments = 100 #3    # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 300    # range 30-3600, in seconds
#checkpoint_warning = 30    # 0 is off, in seconds

# - Archiving -

#archive_command = ''        # command to use to archive a logfile
segment


#---------------------------------------------------------------------------
# QUERY TUNING
#---------------------------------------------------------------------------

# - Planner Method Configuration -

#enable_hashagg = true
#enable_hashjoin = true
#enable_indexscan = true
enable_mergejoin = false
#enable_nestloop = true
#enable_seqscan = true
#enable_sort = true
#enable_tidscan = true

# - Planner Cost Constants -

effective_cache_size = 262144    #40000 typically 8KB each
#random_page_cost = 4        # units are one sequential page fetch cost
random_page_cost = 2
#cpu_tuple_cost = 0.01        # (same)
#cpu_index_tuple_cost = 0.001    # (same)
#cpu_operator_cost = 0.0025    # (same)

# - Genetic Query Optimizer -

#geqo = true
#geqo_threshold = 12
#geqo_effort = 5        # range 1-10
#geqo_pool_size = 0        # selects default based on effort
#geqo_generations = 0        # selects default based on effort
#geqo_selection_bias = 2.0    # range 1.5-2.0

# - Other Planner Options -

default_statistics_target = 250#10    # range 1-1000
#from_collapse_limit = 8
#join_collapse_limit = 8    # 1 disables collapsing of explicit JOINs


#---------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#---------------------------------------------------------------------------

# - Where to Log -

#log_destination = 'stderr'    # Valid values are combinations of stderr,
                                # syslog and eventlog, depending on
                                # platform.

# This is relevant when logging to stderr:
redirect_stderr = true      # Enable capturing of stderr into log files.
# These are only relevant if redirect_stderr is true:
log_directory = 'pg_log'    # Directory where log files are written.
                            # May be specified absolute or relative to
PGDATA
log_filename = 'postgresql-%a.log' # Log file name pattern.
                            # May include strftime() escapes
log_truncate_on_rotation = true  # If true, any existing log file of the
                            # same name as the new log file will be
truncated
                            # rather than appended to.  But such truncation
                            # only occurs on time-driven rotation,
                            # not on restarts or size-driven rotation.
                            # Default is false, meaning append to existing
                            # files in all cases.
log_rotation_age = 1440     # Automatic rotation of logfiles will happen
after
                            # so many minutes.  0 to disable.
log_rotation_size = 0       # Automatic rotation of logfiles will happen
after
                            # so many kilobytes of log output.  0 to
disable.

# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'


# - When to Log -

#client_min_messages = notice    # Values, in order of decreasing detail:
                #   debug5, debug4, debug3, debug2, debug1,
                #   log, notice, warning, error

#log_min_messages = notice    # Values, in order of decreasing detail:
                #   debug5, debug4, debug3, debug2, debug1,
                #   info, notice, warning, error, log,
fatal,
                #   panic

#log_error_verbosity = default    # terse, default, or verbose messages

#log_min_error_statement = panic # Values in order of increasing severity:
                 #   debug5, debug4, debug3, debug2, debug1,
                 #   info, notice, warning, error,
panic(off)

#log_min_duration_statement = -1 # -1 is disabled, in milliseconds.

#silent_mode = false         # DO NOT USE without syslog or
redirect_stderr

# - What to Log -

#debug_print_parse = false
#debug_print_rewritten = false
#debug_print_plan = false
#debug_pretty_print = false
#log_connections = false
#log_disconnections = false
#log_duration = false
#log_line_prefix = ''        # e.g. '<%u%%%d> '
                # %u=user name %d=database name
                # %r=remote host and port
                # %p=PID %t=timestamp %i=command tag
                # %c=session id %l=session line number
                # %s=session start timestamp %x=transaction
id
                # %q=stop here in non-session processes
                # %%='%'
#log_statement = 'none'        # none, mod, ddl, all
#log_hostname = false


#---------------------------------------------------------------------------
# RUNTIME STATISTICS
#---------------------------------------------------------------------------

# - Statistics Monitoring -

#log_parser_stats = false
#log_planner_stats = false
#log_executor_stats = false
#log_statement_stats = false

# - Query/Index Statistics Collector -

#stats_start_collector = true
#stats_command_string = false
#stats_block_level = false
#stats_row_level = false
#stats_reset_on_server_start = true


#---------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#---------------------------------------------------------------------------

# - Statement Behavior -

#search_path = '$user,public'    # schema names
#default_tablespace = ''    # a tablespace name, or '' for default
#check_function_bodies = true
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = false
#statement_timeout = 0        # 0 is disabled, in milliseconds

# - Locale and Formatting -

#datestyle = 'iso, mdy'
#timezone = unknown        # actually, defaults to TZ environment
setting
#australian_timezones = false
#extra_float_digits = 0        # min -15, max 2
#client_encoding = sql_ascii    # actually, defaults to database encoding

# These settings are initialized by initdb -- they might be changed
lc_messages = 'en_US.UTF-8'        # locale for system error message
strings
lc_monetary = 'en_US.UTF-8'        # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'        # locale for number formatting
lc_time = 'en_US.UTF-8'            # locale for time formatting

# - Other Defaults -

#explain_pretty_print = true
#dynamic_library_path = '$libdir'


#---------------------------------------------------------------------------
# LOCK MANAGEMENT
#---------------------------------------------------------------------------

#deadlock_timeout = 1000    # in milliseconds
#max_locks_per_transaction = 64    # min 10, ~200*max_connections bytes each


#---------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#---------------------------------------------------------------------------

# - Previous Postgres Versions -

#add_missing_from = true
#regex_flavor = advanced    # advanced, extended, or basic
#sql_inheritance = true
#default_with_oids = true

# - Other Platforms & Clients -

#transform_null_equals = false


pgsql-performance by date:

Previous
From: "Shoaib Burq (VPAC)"
Date:
Subject: Re: two queries and dual cpu (perplexed)
Next
From: Tom Lane
Date:
Subject: Re: When are index scans used over seq scans?