Re: Slow Query - Mailing list pgsql-performance

From Shawn
Subject Re: Slow Query
Date
Msg-id 20070903095749.5045211b@boffin.xmtservices.net
Whole thread Raw
In response to Re: Slow Query  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Slow Query
List pgsql-performance
On Sat, 01 Sep 2007 23:00:10 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Shawn <postgres@xmtservices.net> writes:
> > The query just ran and here is the basic output:
>
> > UPDATE 15445
> > Time: 22121.141 ms
>
> > and
>
> > explain ANALYZE update shawns_data set alias = null;
> >                                                     QUERY
> > PLAN
> > -----------------------------------------------------------------------------------------------------------------
> > Seq Scan on shawns_data  (cost=0.00..465.45 rows=15445 width=480)
> > (actual time=0.034..67.743 rows=15445 loops=1) Total runtime:
> > 1865.002 ms (2 rows)
>
> Hmmm ... did you run the real query and the EXPLAIN in immediate
> succession?  If so, the only reason I can think of for the speed
> difference is that all the rows were fetched already for the second
> run.  Which doesn't make a lot of sense given the hardware specs
> you mentioned.  Try watching "vmstat 1" and see if there's some
> noticeable difference in the behavior.
>

Actually no,

The runs were at least an 24 hours apart.

Running the query standalone was quite interesting.  There's a lot of
disk output:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0   2580 450992  84420 16007864    0    0     1     1    1    2  3  0 94  2
 0  0   2580 450992  84420 16007864    0    0     0     0 1003   18  0  0 100  0
 0  0   2580 450660  84420 16007864    0    0     0  6764 1127  202  0  0 100  0
 0  0   2580 450660  84420 16007864    0    0     0     0 1004   28  0  0 100  0
 0  0   2580 450660  84420 16007864    0    0     0     0 1011   58  0  0 100  0
 0  0   2580 450660  84420 16007864    0    0     0     0 1011   50  0  0 100  0
 0  0   2580 450660  84428 16007864    0    0     0    32 1011   53  0  0 100  0
 0  0   2580 450660  84428 16007864    0    0     0  3034 1095  191  0  0 100  0
 0  0   2580 450164  84428 16007864    0    0     0     0 1032  362  0  0 100  0
 0  0   2580 450164  84428 16007864    0    0     0     0 1002   22  0  0 100  0
 1  0   2580 446444  84428 16012240    0    0     0  2568 1024  398  4  1 95  0
 1  0   2580 456172  84428 16001760    0    0     8 11549 1136  616 10  3 87  0
 0  0   2580 452164  84428 16005524    0    0     4 26580 1885 1479  4  2 93  1
 0  0   2580 452164  84436 16005516    0    0     0    40 1004   26  0  0 100  0
 0  0   2580 452164  84436 16005524    0    0     0     0 1001   36  0  0 100  0
 0  0   2580 452164  84436 16005524    0    0     0     0 1001   24  0  0 100  0
 0  0   2580 452164  84436 16005524    0    0     0     0 1001   36  0  0 100  0
 0  0   2580 452164  84436 16005524    0    0     0     3 1005   27  0  0 100  0
 0  0   2580 452040  84436 16005524    0    0     0     0 1006   41  0  0 100  0
 0  0   2580 452040  84436 16005524    0    0     0     0 1002   20  0  0 100  0

Also it runs a lot faster by itself, so I am thinking there is some query early on that is
really fragmenting the table.  Here's the whole script:

\timing
select now();
create temporary table unit_operation(lid varchar(5), hang_time varchar(3),multi_csd_enabled bool,multi_node_enabled
bool,wide_area_enabledbool); 
create temporary table unit(lid varchar(5),alias char(8),type smallint);
\copy unit from '/tmp/wc_unit'
\copy unit_operation from '/tmp/wc_unit_operation'
truncate csd;
insert into csd(lid,alias,type) select lid,alias,type from unit;
update csd set hang_time = uo.hang_time, multi_csd_enabled = uo.multi_csd_enabled, multi_node_enabled =
uo.multi_node_enabled,wide_area_enabled = uo.wide_area_enabled from unit_operation uo where csd.lid = uo.lid; 
create temporary table groups(lid varchar(5),alias varchar(8));
create temporary table group_operation(lid varchar(15),hang_time varchar(8));
\copy groups from '/tmp/wc_groups'
\copy group_operation from '/tmp/wc_group_operation'
truncate gid;
insert into gid(gid,alias) select lid,alias from groups;
update gid set hang_time = go.hang_time from group_operation go where gid.gid = go.lid;
truncate rc_unit_state;
\copy rc_unit_state from '/tmp/wc_rc_unit_state' with null as ''
update lidrange set csd = false, data = false, disable = false, lost = false;
update lidrange set csd = true from csd where lidrange.lid = csd.lid;
update lidrange set data = true from shawns_data where lidrange.lid = shawns_data.lid;
update lidrange set disable = true from rc_unit_state where lidrange.lid = rc_unit_state.lid;
update shawns_data set alias = null where alias is not null;
update shawns_data set alias = csd.alias from csd where shawns_data.lid = csd.lid;
update lidrange set lost = true from shawns_data where shawns_data.lid = lidrange.lid and shawns_data.lost = true
\i '/home/shawn/vacuum_data.sql'

As you can see there is nothing really accessing the shawns_data table before the query in question.
I have also allocated 380 Megs of RAM to any virtual process, I can't see where any of these tables
could take up this much RAM.  Here is my postgresql.conf:

# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.)  White space may be used.  Comments are introduced
# with '#' anywhere on a line.  The complete list of option names and
# allowed values can be found in the PostgreSQL documentation.  The
# commented-out settings shown in this file represent the default values.
#
# Please note that re-commenting a setting is NOT sufficient to revert it
# to the default value, unless you restart the server.
#
# Any option can also be given as a command line switch to the server,
# e.g., 'postgres -c log_connections=on'.  Some options can be changed at
# run-time with the 'SET' SQL command.
#
# This file is read on server startup and when the server receives a
# SIGHUP.  If you edit the file on a running system, you have to SIGHUP the
# server for the changes to take effect, or use "pg_ctl reload". Some
# settings, which are marked below, require a server shutdown and restart
# to take effect.
#
# Memory units:  kB = kilobytes MB = megabytes GB = gigabytes
# Time units:    ms = milliseconds s = seconds min = minutes h = hours d = days


#---------------------------------------------------------------------------
# FILE LOCATIONS
#---------------------------------------------------------------------------

# The default values of these variables are driven from the -D command line
# switch or PGDATA environment variable, represented here as ConfigDir.

#data_directory = 'ConfigDir'        # use data in another directory
                    # (change requires restart)
#hba_file = 'ConfigDir/pg_hba.conf'    # host-based authentication file
                    # (change requires restart)
#ident_file = 'ConfigDir/pg_ident.conf'    # ident configuration file
                    # (change requires restart)

# If external_pid_file is not explicitly set, no extra PID file is written.
#external_pid_file = '(none)'        # write an extra PID file
                    # (change requires restart)


#---------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#---------------------------------------------------------------------------

# - Connection Settings -

#listen_addresses = 'localhost'        # what IP address(es) to listen on;
                    # comma-separated list of addresses;
                    # defaults to 'localhost', '*' = all
                    # (change requires restart)
#port = 5432                # (change requires restart)
max_connections = 20            # (change requires restart)
# Note: increasing max_connections costs ~400 bytes of shared memory per
# connection slot, plus lock space (see max_locks_per_transaction).  You
# might also need to raise shared_buffers to support more connections.
#superuser_reserved_connections = 3    # (change requires restart)
#unix_socket_directory = ''        # (change requires restart)
#unix_socket_group = ''            # (change requires restart)
#unix_socket_permissions = 0777        # octal
                    # (change requires restart)
#bonjour_name = ''            # defaults to the computer name
                    # (change requires restart)

# - Security & Authentication -

#authentication_timeout = 1min        # 1s-600s
#ssl = off                # (change requires restart)
#password_encryption = on
#db_user_namespace = off

# Kerberos
#krb_server_keyfile = ''        # (change requires restart)
#krb_srvname = 'postgres'        # (change requires restart)
#krb_server_hostname = ''        # empty string matches any keytab entry
                    # (change requires restart)
#krb_caseins_users = off        # (change requires restart)

# - TCP Keepalives -
# see 'man 7 tcp' for details

#tcp_keepalives_idle = 0        # TCP_KEEPIDLE, in seconds;
                    # 0 selects the system default
#tcp_keepalives_interval = 0        # TCP_KEEPINTVL, in seconds;
                    # 0 selects the system default
#tcp_keepalives_count = 0        # TCP_KEEPCNT;
                    # 0 selects the system default


#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------

# - Memory -

shared_buffers = 300MB            # min 128kB or max_connections*16kB
                    # (change requires restart)
#temp_buffers = 8MB            # min 800kB
#max_prepared_transactions = 5        # can be 0 or more
                    # (change requires restart)
# Note: increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
work_mem = 300MB            # min 64kB
maintenance_work_mem = 260MB        # min 1MB
max_stack_depth = 7MB            # min 100kB

# - Free Space Map -

max_fsm_pages = 15360000        # min max_fsm_relations*16, 6 bytes each
                    # (change requires restart)
#max_fsm_relations = 1000        # min 100, ~70 bytes each
                    # (change requires restart)

# - Kernel Resource Usage -

#max_files_per_process = 1000        # min 25
                    # (change requires restart)
#shared_preload_libraries = ''        # (change requires restart)

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0            # 0-1000 milliseconds
#vacuum_cost_page_hit = 1        # 0-10000 credits
#vacuum_cost_page_miss = 10        # 0-10000 credits
#vacuum_cost_page_dirty = 20        # 0-10000 credits
#vacuum_cost_limit = 200        # 0-10000 credits

# - Background writer -

#bgwriter_delay = 200ms            # 10-10000ms between rounds
#bgwriter_lru_percent = 1.0        # 0-100% of LRU buffers scanned/round
#bgwriter_lru_maxpages = 5        # 0-1000 buffers max written/round
#bgwriter_all_percent = 0.333        # 0-100% of all buffers scanned/round
#bgwriter_all_maxpages = 5        # 0-1000 buffers max written/round


#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -

#fsync = on                # turns forced synchronization on or off
#wal_sync_method = fsync        # the default is the first option
                    # supported by the operating system:
                    #   open_datasync
                    #   fdatasync
                    #   fsync
                    #   fsync_writethrough
                    #   open_sync
#full_page_writes = on            # recover from partial page writes
#wal_buffers = 64kB            # min 32kB
                    # (change requires restart)
#commit_delay = 0            # range 0-100000, in microseconds
#commit_siblings = 5            # range 1-1000

# - Checkpoints -

checkpoint_segments = 50        # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 5min        # range 30s-1h
#checkpoint_warning = 30s        # 0 is off

# - Archiving -

#archive_command = ''        # command to use to archive a logfile segment
#archive_timeout = 0        # force a logfile segment switch after this
                # many seconds; 0 is off


#---------------------------------------------------------------------------
# QUERY TUNING
#---------------------------------------------------------------------------

# - Planner Method Configuration -

#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on

# - Planner Cost Constants -

#seq_page_cost = 1.0            # measured on an arbitrary scale
#random_page_cost = 4.0            # same scale as above
#cpu_tuple_cost = 0.01            # same scale as above
#cpu_index_tuple_cost = 0.005        # same scale as above
#cpu_operator_cost = 0.0025        # same scale as above
#effective_cache_size = 128MB

# - Genetic Query Optimizer -

#geqo = on
#geqo_threshold = 12
#geqo_effort = 5            # range 1-10
#geqo_pool_size = 0            # selects default based on effort
#geqo_generations = 0            # selects default based on effort
#geqo_selection_bias = 2.0        # range 1.5-2.0

# - Other Planner Options -

#default_statistics_target = 10        # range 1-1000
#constraint_exclusion = off
#from_collapse_limit = 8
#join_collapse_limit = 8        # 1 disables collapsing of explicit
                    # JOINs


#---------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#---------------------------------------------------------------------------

# - Where to Log -

#log_destination = 'stderr'        # Valid values are combinations of
                    # stderr, syslog and eventlog,
                    # depending on platform.

# This is used when logging to stderr:
#redirect_stderr = off            # Enable capturing of stderr into log
                    # files
                    # (change requires restart)


# These are only used if redirect_stderr is on:
#log_directory = 'pg_log'        # Directory where log files are written
                    # Can be absolute or relative to PGDATA
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # Log file name pattern.
                    # Can include strftime() escapes
log_truncate_on_rotation = off # If on, any existing log file of the same
                    # name as the new log file will be
                    # truncated rather than appended to. But
                    # such truncation only occurs on
                    # time-driven rotation, not on restarts
                    # or size-driven rotation. Default is
                    # off, meaning append to existing files
                    # in all cases.
#log_rotation_age = 1m            # Automatic rotation of logfiles will
                    # happen after that time.  0 to
                    # disable.
log_rotation_size = 10MB        # Automatic rotation of logfiles will
                    # happen after that much log
                    # output.  0 to disable.

# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'


# - When to Log -

#client_min_messages = notice        # Values, in order of decreasing detail:
                    #   debug5
                    #   debug4
                    #   debug3
                    #   debug2
                    #   debug1
                    #   log
                    #   notice
                    #   warning
                    #   error

#log_min_messages = notice        # Values, in order of decreasing detail:
                    #   debug5
                    #   debug4
                    #   debug3
                    #   debug2
                    #   debug1
                    #   info
                    #   notice
                    #   warning
                    #   error
                    #   log
                    #   fatal
                    #   panic

#log_error_verbosity = default        # terse, default, or verbose messages

#log_min_error_statement = error    # Values in order of increasing severity:
                     #   debug5
                    #   debug4
                    #   debug3
                    #   debug2
                    #   debug1
                     #   info
                    #   notice
                    #   warning
                    #   error
                    #   fatal
                    #   panic (effectively off)

#log_min_duration_statement = -1    # -1 is disabled, 0 logs all statements
                    # and their durations.

#silent_mode = off            # DO NOT USE without syslog or
                    # redirect_stderr
                    # (change requires restart)

# - What to Log -

#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = off
#log_connections = off
#log_disconnections = off
#log_duration = off
#log_line_prefix = ''            # Special values:
                    #   %u = user name
                    #   %d = database name
                    #   %r = remote host and port
                    #   %h = remote host
                    #   %p = PID
                    #   %t = timestamp (no milliseconds)
                    #   %m = timestamp with milliseconds
                    #   %i = command tag
                    #   %c = session id
                    #   %l = session line number
                    #   %s = session start timestamp
                    #   %x = transaction id
                    #   %q = stop here in non-session
                    #        processes
                    #   %% = '%'
                    # e.g. '<%u%%%d> '
#log_statement = 'none'            # none, ddl, mod, all
#log_hostname = off


#---------------------------------------------------------------------------
# RUNTIME STATISTICS
#---------------------------------------------------------------------------

# - Query/Index Statistics Collector -

#stats_command_string = on
#update_process_title = on

#stats_start_collector = on        # needed for block or row stats
                    # (change requires restart)
#stats_block_level = off
#stats_row_level = off
#stats_reset_on_server_start = off    # (change requires restart)


# - Statistics Monitoring -

#log_parser_stats = off
#log_planner_stats = off
#log_executor_stats = off
#log_statement_stats = off


#---------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#---------------------------------------------------------------------------

#autovacuum = off            # enable autovacuum subprocess?
                    # 'on' requires stats_start_collector
                    # and stats_row_level to also be on
#autovacuum_naptime = 1min        # time between autovacuum runs
#autovacuum_vacuum_threshold = 500    # min # of tuple updates before
                    # vacuum
#autovacuum_analyze_threshold = 250    # min # of tuple updates before
                    # analyze
#autovacuum_vacuum_scale_factor = 0.2    # fraction of rel size before
                    # vacuum
#autovacuum_analyze_scale_factor = 0.1    # fraction of rel size before
                    # analyze
#autovacuum_freeze_max_age = 200000000    # maximum XID age before forced vacuum
                    # (change requires restart)
#autovacuum_vacuum_cost_delay = -1    # default vacuum cost delay for
                    # autovacuum, -1 means use
                    # vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1    # default vacuum cost limit for
                    # autovacuum, -1 means use
                    # vacuum_cost_limit


#---------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#---------------------------------------------------------------------------

# - Statement Behavior -

#search_path = '"$user",public'        # schema names
#default_tablespace = ''        # a tablespace name, '' uses
                    # the default
#check_function_bodies = on
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = off
#statement_timeout = 0            # 0 is disabled
#vacuum_freeze_min_age = 100000000

# - Locale and Formatting -

datestyle = 'iso, mdy'
#timezone = unknown            # actually, defaults to TZ
                    # environment setting
#timezone_abbreviations = 'Default'     # select the set of available timezone
                    # abbreviations. Currently, there are
                    #   Default
                    #   Australia
                    #   India
                    # However you can also create your own
                    # file in share/timezonesets/.
#extra_float_digits = 0            # min -15, max 2
#client_encoding = sql_ascii        # actually, defaults to database
                    # encoding

# These settings are initialized by initdb -- they might be changed
lc_messages = 'en_US'            # locale for system error message
                    # strings
lc_monetary = 'en_US'            # locale for monetary formatting
lc_numeric = 'en_US'            # locale for number formatting
lc_time = 'en_US'                # locale for time formatting

# - Other Defaults -

#explain_pretty_print = on
#dynamic_library_path = '$libdir'
#local_preload_libraries = ''


#---------------------------------------------------------------------------
# LOCK MANAGEMENT
#---------------------------------------------------------------------------

#deadlock_timeout = 1s
#max_locks_per_transaction = 64        # min 10
                    # (change requires restart)
# Note: each lock table slot uses ~270 bytes of shared memory, and there are
# max_locks_per_transaction * (max_connections + max_prepared_transactions)
# lock table slots.


#---------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#---------------------------------------------------------------------------

# - Previous Postgres Versions -

#add_missing_from = off
#array_nulls = on
#backslash_quote = safe_encoding    # on, off, or safe_encoding
#default_with_oids = off
#escape_string_warning = on
#standard_conforming_strings = off
#regex_flavor = advanced        # advanced, extended, or basic
#sql_inheritance = on

# - Other Platforms & Clients -

#transform_null_equals = off


#---------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#---------------------------------------------------------------------------

#custom_variable_classes = ''        # list of custom variable class names


Any suggestions to improve things?

Shawn




pgsql-performance by date:

Previous
From: Shawn
Date:
Subject: Re: Slow Query
Next
From: "Kevin Grittner"
Date:
Subject: Re: Slow Query