Query plan on identical tables differs . Why ? - Mailing list pgsql-performance

From Fabio Panizzutti
Subject Query plan on identical tables differs . Why ?
Date
Msg-id 005c01c438e7$ced25c30$3c02020a@ufficio
Whole thread Raw
Responses Re: Query plan on identical tables differs . Why ?
List pgsql-performance
Hello I'm tuning a postgresql (7.4.2) server for best performance .
I have a question about the planner .
I have two identical tables : one stores short data (about 2.000.000
record now) and
the other historycal data ( about 8.000.000 record now and growing ...)


A simple test query :   select tag_id,valore_tag,data_tag from
storico_misure where (data_tag>'2004-05-03' and data_tag <'2004-05-12')
and tag_id=37423 ;

Takes 57,637 ms on the short table and 1321,448 ms (!!) on the
historycal table .Tables are vacuumed and reindexed .



Tables and query plans :

\d storico_misure
                   Table "tenore.storico_misure"
         Column          |            Type             | Modifiers
-------------------------+-----------------------------+-----------
 data_tag                | timestamp without time zone | not null
 tag_id                  | integer                     | not null
 unita_misura            | character varying(6)        | not null
 valore_tag              | numeric(20,3)               | not null
 qualita                 | integer                     | not null
 numero_campioni         | numeric(5,0)                |
 frequenza_campionamento | numeric(3,0)                |
Indexes:
    "pk_storico_misure_2" primary key, btree (data_tag, tag_id)
    "pk_anagtstorico_misuree_idx_2" btree (tag_id)
    "storico_misure_data_tag_idx_2" btree (data_tag)

storico=# \d storico_misure_short
                Table "tenore.storico_misure_short"
         Column          |            Type             | Modifiers
-------------------------+-----------------------------+-----------
 data_tag                | timestamp without time zone | not null
 tag_id                  | integer                     | not null
 unita_misura            | character varying(6)        | not null
 valore_tag              | numeric(20,3)               | not null
 qualita                 | integer                     | not null
 numero_campioni         | numeric(5,0)                |
 frequenza_campionamento | numeric(3,0)                |
Indexes:
    "storico_misure_short_pkey_2" primary key, btree (data_tag, tag_id)
    "pk_anagtstorico_misuree_short_idx_2" btree (tag_id)
    "storico_misure_short_data_tag_idx_2" btree (data_tag)

storico=#
storico=#
storico=# explain select tag_id,valore_tag,data_tag from storico_misure
where (data_tag>'2004-05-03' and data_tag <'2004-05-12') and
tag_id=37423 ;

QUERY PLAN

------------------------------------------------------------------------
------------------------------------------------------------------------
--------------------------
 Index Scan using pk_storico_misure_2 on storico_misure
(cost=0.00..1984.64 rows=658 width=21)
   Index Cond: ((data_tag > '2004-05-03 00:00:00'::timestamp without
time zone) AND (data_tag < '2004-05-12 00:00:00'::timestamp without time
zone) AND (tag_id = 37423))
(2 rows)

Time: 1,667 ms
storico=# explain select tag_id,valore_tag,data_tag from
storico_misure_short where (data_tag>'2004-05-03' and data_tag
<'2004-05-12') and tag_id=37423 ;
                                                                   QUERY
PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
-
 Index Scan using pk_anagtstorico_misuree_short_idx_2 on
storico_misure_short  (cost=0.00..1784.04 rows=629 width=20)
   Index Cond: (tag_id = 37423)
   Filter: ((data_tag > '2004-05-03 00:00:00'::timestamp without time
zone) AND (data_tag < '2004-05-12 00:00:00'::timestamp without time
zone))


How can i force the planner to use the same query plan ? I'd like to
test if using the same query plan i've better performace .

Thanks in advance




this is my posgresql.conf

#-----------------------------------------------------------------------
----
# CONNECTIONS AND AUTHENTICATION
#-----------------------------------------------------------------------
----

# - Connection Settings -

tcpip_socket = true
max_connections = 100
    # note: increasing max_connections costs about 500 bytes of
shared
    # memory per connection slot, in addition to costs from
shared_buffers
        # and max_locks_per_transaction.
#superuser_reserved_connections = 2
port = 5432
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777    # octal
#virtual_host = ''        # what interface to listen on; defaults
to any
#rendezvous_name = ''        # defaults to the computer name

# - Security & Authentication -

#authentication_timeout = 60    # 1-600, in seconds
#ssl = false
#password_encryption = true
#krb_server_keyfile = ''
#db_user_namespace = false


#-----------------------------------------------------------------------
----
# RESOURCE USAGE (except WAL)
#-----------------------------------------------------------------------
----

# - Memory -

shared_buffers = 3000        # min 16, at least max_connections*2,
8KB each
sort_mem = 4096         # min 64, size in KB
vacuum_mem = 32768        # min 1024, size in KB

# - Free Space Map -

#max_fsm_pages = 20000        # min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000    # min 100, ~50 bytes each

# - Kernel Resource Usage -

#max_files_per_process = 1000    # min 25
#preload_libraries = ''


#-----------------------------------------------------------------------
----
# WRITE AHEAD LOG
#-----------------------------------------------------------------------
----

# - Settings -

fsync = false            # turns forced synchronization on or off
#wal_sync_method = fsync    # the default varies across platforms:
                # fsync, fdatasync, open_sync, or
open_datasync
#wal_buffers = 8        # min 4, 8KB each

# - Checkpoints -

checkpoint_segments = 12    # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 300    # range 30-3600, in seconds
#checkpoint_warning = 30    # 0 is off, in seconds
#commit_delay = 0        # range 0-100000, in microseconds
#commit_siblings = 5        # range 1-1000


#-----------------------------------------------------------------------
----
# QUERY TUNING
#-----------------------------------------------------------------------
----

# - Planner Method Enabling -

enable_hashagg = false
enable_hashjoin = false
enable_indexscan = true
enable_mergejoin = true
enable_nestloop = false
enable_seqscan = true
enable_sort = false
enable_tidscan = false

# - Planner Cost Constants -

#effective_cache_size = 1000    # typically 8KB each
#random_page_cost = 4        # units are one sequential page fetch
cost
#cpu_tuple_cost = 0.01        # (same)
#cpu_index_tuple_cost = 0.001    # (same)
#cpu_operator_cost = 0.0025    # (same)

# - Genetic Query Optimizer -

#geqo = true
#geqo_threshold = 11
#geqo_effort = 1
#geqo_generations = 0
#geqo_pool_size = 0        # default based on tables in statement,
                # range 128-1024
#geqo_selection_bias = 2.0    # range 1.5-2.0

# - Other Planner Options -

#default_statistics_target = 10    # range 1-1000
#from_collapse_limit = 8
#join_collapse_limit = 8    # 1 disables collapsing of explicit
JOINs


#-----------------------------------------------------------------------
----
# ERROR REPORTING AND LOGGING
#-----------------------------------------------------------------------
----

# - Syslog -

#syslog = 0            # range 0-2; 0=stdout; 1=both; 2=syslog
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'

# - When to Log -

#client_min_messages = notice    # Values, in order of decreasing detail:
                #   debug5, debug4, debug3, debug2,
debug1,
                #   log, info, notice, warning, error

#log_min_messages = notice    # Values, in order of decreasing detail:
                #   debug5, debug4, debug3, debug2,
debug1,
                #   info, notice, warning, error, log,
fatal,
                #   panic

#log_error_verbosity = default   # terse, default, or verbose messages

#log_min_error_statement = panic # Values in order of increasing
severity:
                 #   debug5, debug4, debug3, debug2,
debug1,
                 #   info, notice, warning, error,
panic(off)

#log_min_duration_statement = -1 # Log all statements whose
                 # execution time exceeds the value, in
                 # milliseconds.  Zero prints all
queries.
                 # Minus-one disables.

#silent_mode = false         # DO NOT USE without Syslog!

# - What to Log -

#debug_print_parse = false
#debug_print_rewritten = false
#debug_print_plan = false
#debug_pretty_print = false
#log_connections = false
#log_duration = false
#log_pid = false
#log_statement = false
#log_timestamp = false
#log_hostname = false
#log_source_port = false


#-----------------------------------------------------------------------
----
# RUNTIME STATISTICS
#-----------------------------------------------------------------------
----

# - Statistics Monitoring -

#log_parser_stats = false
#log_planner_stats = false
#log_executor_stats = false
#log_statement_stats = false

# - Query/Index Statistics Collector -

stats_start_collector = true
stats_command_string = true
#stats_block_level = false
stats_row_level = true
#stats_reset_on_server_start = true


#-----------------------------------------------------------------------
----
# CLIENT CONNECTION DEFAULTS
#-----------------------------------------------------------------------
----

# - Statement Behavior -

#search_path = '$user,public'    # schema names
#check_function_bodies = true
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = false
statement_timeout = 360000        # 0 is disabled, in milliseconds

# - Locale and Formatting -

#datestyle = 'iso, mdy'
#timezone = unknown        # actually, defaults to TZ environment
setting
#australian_timezones = false
#extra_float_digits = 0        # min -15, max 2
#client_encoding = sql_ascii    # actually, defaults to database
encoding

# These settings are initialized by initdb -- they may be changed
lc_messages = 'it_IT.UTF-8'        # locale for system error
message strings
lc_monetary = 'it_IT.UTF-8'        # locale for monetary formatting
lc_numeric = 'it_IT.UTF-8'        # locale for number formatting
lc_time = 'it_IT.UTF-8'            # locale for time formatting

# - Other Defaults -

#explain_pretty_print = true
#dynamic_library_path = '$libdir'
#max_expr_depth = 10000        # min 10


#-----------------------------------------------------------------------
----
# LOCK MANAGEMENT
#-----------------------------------------------------------------------
----

#deadlock_timeout = 1000    # in milliseconds
#max_locks_per_transaction = 64    # min 10, ~260*max_connections bytes
each


#-----------------------------------------------------------------------
----
# VERSION/PLATFORM COMPATIBILITY
#-----------------------------------------------------------------------
----

# - Previous Postgres Versions -

#add_missing_from = true
#regex_flavor = advanced    # advanced, extended, or basic
#sql_inheritance = true

# - Other Platforms & Clients -

#transform_null_equals = false









pgsql-performance by date:

Previous
From: Bjoern Metzdorf
Date:
Subject: Quad processor options - summary
Next
From: Neil Conway
Date:
Subject: Re: Clarification on some settings