Thread: select on 1 million records = 6s
Friends,

Who can help me? My SELECT on a table with 1 million records, using an expression index, takes 6 seconds...

Please, I don't know how to make it better.

Thanks
Bruno Rodrigues Siqueira wrote:
> Who can help me? My SELECT on a table with 1 million records,
> using an expression index, takes 6 seconds...

Run your query using

    EXPLAIN ANALYZE SELECT ... your query ...

and then post the results to this newsgroup. Nobody can help until they see the results of EXPLAIN ANALYZE. Also, include all other relevant information, such as Postgres version, operating system, amount of memory, and any changes you have made to the Postgres configuration file.

Craig
Have you analyzed your table before doing this?

On Saturday 28 July 2007, Bruno Rodrigues Siqueira wrote:
> Who can help me? My SELECT on a table with 1 million records,
> using an expression index, takes 6 seconds.
>
> Please, I don't know how to make it better.
>
> Thanks

--
Hervé Piedvache
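(For reference, refreshing the planner's statistics is a one-liner; a minimal sketch, using the table name that appears in Bruno's later posts:)

    -- Recompute planner statistics for the table and its expression index
    ANALYZE detalhamento_bas;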
Ok.

Query:

    EXPLAIN ANALYZE
    select to_char(data_encerramento,'mm/yyyy') as opcoes_mes,
           to_char(data_encerramento,'yyyy-mm') as ordem
    from detalhamento_bas
    where to_char( data_encerramento ,'yyyy-mm')
          between '2006-12' and '2007-01'
    GROUP BY opcoes_mes, ordem
    ORDER BY ordem DESC

Query plan:

    Sort  (cost=11449.37..11449.40 rows=119 width=8) (actual time=14431.537..14431.538 rows=2 loops=1)
      Sort Key: to_char(data_encerramento, 'yyyy-mm'::text)
      ->  HashAggregate  (cost=11448.79..11448.96 rows=119 width=8) (actual time=14431.521..14431.523 rows=2 loops=1)
            ->  Index Scan using detalhamento_bas_idx3003 on detalhamento_bas  (cost=0.00..11442.95 rows=11679 width=8) (actual time=0.135..12719.155 rows=2335819 loops=1)
                  Index Cond: ((to_char(data_encerramento, 'yyyy-mm'::text) >= '2006-12'::text) AND (to_char(data_encerramento, 'yyyy-mm'::text) <= '2007-01'::text))
    Total runtime: 14431.605 ms

Server:

    DELL PowerEdge 2950
    XEON Quad-Core 3.0 GHz
    4 GB RAM
    Linux CentOS 5.0 64-bit
    Postgres 8.1.4

postgresql.conf (settings changed from the stock file; everything else is commented out at its default value):

    listen_addresses = '*'
    max_connections = 10
    shared_buffers = 50000              # 8KB each
    temp_buffers = 1000
    work_mem = 3145728                  # size in KB
    maintenance_work_mem = 4194304      # size in KB
    max_stack_depth = 2048
    max_fsm_pages = 208000
    max_fsm_relations = 10000
    vacuum_cost_delay = 50              # milliseconds
    bgwriter_delay = 200
    bgwriter_lru_percent = 20.0
    bgwriter_lru_maxpages = 100
    bgwriter_all_percent = 3
    bgwriter_all_maxpages = 600
    fsync = off
    full_page_writes = off
    wal_buffers = 2300                  # 8KB each
    commit_delay = 10                   # microseconds
    checkpoint_segments = 256           # 16MB each
    checkpoint_timeout = 300            # seconds
    checkpoint_warning = 99             # seconds
    # all nine enable_* planner options are set to on (their defaults)
    effective_cache_size = 41943040     # 8KB each
    random_page_cost = 1
    cpu_tuple_cost = 0.001
    cpu_index_tuple_cost = 0.0005
    cpu_operator_cost = 0.00025
    constraint_exclusion = on
    join_collapse_limit = 1             # 1 disables collapsing of explicit JOINs
    redirect_stderr = on
    log_directory = 'pg_log'
    log_filename = 'postgresql-%a.log'
    log_truncate_on_rotation = on
    log_rotation_age = 1440
    log_rotation_size = 0
    stats_start_collector = off
    autovacuum = on
    default_transaction_isolation = 'read committed'
    datestyle = 'iso, dmy'
    client_encoding = LATIN1
    lc_messages = 'pt_BR.ISO-8859-1'
    lc_monetary = 'pt_BR.ISO-8859-1'
    lc_numeric = 'pt_BR.ISO-8859-1'
    lc_time = 'pt_BR.ISO-8859-1'
    deadlock_timeout = 1000             # milliseconds

-----Original Message-----
From: Craig James [mailto:craig_james@emolecules.com]
Sent: Saturday, July 28, 2007 16:59
To: Bruno Rodrigues Siqueira; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] select on 1 million records = 6s

> Run your query using EXPLAIN ANALYZE and then post the results to this
> newsgroup. [snip]
On Sat, 2007-07-28 at 17:12 -0300, Bruno Rodrigues Siqueira wrote:
> where
>   to_char( data_encerramento ,'yyyy-mm')
>   between '2006-12' and '2007-01'

Assuming data_encerramento is a date column, try:

    WHERE data_encerramento between '2006-12-01' and '2007-01-31'

gnari
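(Applied to the full query posted earlier in the thread, the suggested rewrite would read roughly as follows; a sketch only, since, as the follow-ups note, the column is actually a timestamp and the inclusive upper bound needs care:)

    EXPLAIN ANALYZE
    SELECT to_char(data_encerramento, 'mm/yyyy') AS opcoes_mes,
           to_char(data_encerramento, 'yyyy-mm') AS ordem
    FROM detalhamento_bas
    WHERE data_encerramento BETWEEN '2006-12-01' AND '2007-01-31'
    GROUP BY opcoes_mes, ordem
    ORDER BY ordem DESC;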
Data_encerramento is a timestamp column. I will try your tip. Thanks

-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On behalf of Ragnar
Sent: Saturday, July 28, 2007 19:36
To: Bruno Rodrigues Siqueira
Cc: pgsql-performance@postgresql.org
Subject: Re: RES: [PERFORM] select on 1 million records = 6s

> Assuming data_encerramento is a date column, try:
>     WHERE data_encerramento between '2006-12-01' and '2007-01-31'
[snip]
Yes. Look at this... and please tell me if you can help me... Thanks

[snipped: the same query, EXPLAIN ANALYZE output, server specs, and postgresql.conf already posted above]

-----Original Message-----
From: Hervé Piedvache [mailto:bill.footcow@gmail.com]
Sent: Saturday, July 28, 2007 16:57
To: pgsql-performance@postgresql.org
Cc: Bruno Rodrigues Siqueira
Subject: Re: [PERFORM] select on 1 million records = 6s

> Have you analyzed your table before doing this?
[snip]
Yes, I do.

-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On behalf of Hervé Piedvache
Sent: Saturday, July 28, 2007 16:57
To: pgsql-performance@postgresql.org
Cc: Bruno Rodrigues Siqueira
Subject: Re: [PERFORM] select on 1 million records = 6s

> Have you analyzed your table before doing this?
[snip]
On 7/28/07, Bruno Rodrigues Siqueira <bruno@ravnus.com> wrote:
> Ok.
> QUERY PLAN
> Sort  (cost=11449.37..11449.40 rows=119 width=8) (actual time=14431.537..14431.538 rows=2 loops=1)
>   Sort Key: to_char(data_encerramento, 'yyyy-mm'::text)
>   ->  HashAggregate  (cost=11448.79..11448.96 rows=119 width=8) (actual time=14431.521..14431.523 rows=2 loops=1)
>         ->  Index Scan using detalhamento_bas_idx3003 on detalhamento_bas  (cost=0.00..11442.95 rows=11679 width=8) (actual time=0.135..12719.155 rows=2335819 loops=1)

See the row mismatch there? It expects about 11k rows and gets back 2.3 million. That's a pretty big misestimate. Have you run analyze recently on this table?

Is there a reason you're doing this:

    to_char( data_encerramento ,'yyyy-mm') between '2006-12' and '2007-01'

when you should be able to just do:

    data_encerramento between '2006-12-01' and '2007-01-31'

? That should be able to use good estimates from analyze. My guess is the planner is making a bad guess because of the way you're handling the dates.

> SERVER
> DELL PowerEdge 2950, XEON Quad-Core 3.0 GHz, 4 GB RAM
> Linux CentOS 5.0 64-bit, Postgres 8.1.4

> shared_buffers = 50000              # min 16 or max_connections*2, 8KB each

400 MB is kind of low for a server with 4 GB of RAM. 25% is more reasonable (i.e. 125000 buffers).

> work_mem = 3145728                  # min 64, size in KB
> maintenance_work_mem = 4194304      # min 1024, size in KB

Whoa nellie! That's ~3 GB of work mem and 4 GB of maintenance work mem. On a machine with 4 GB of RAM, that's a recipe for disaster. Something more reasonable would be 128000 (~125 MB) for each; since you've limited your machine to 10 connections, you should be OK. Setting work_mem too high can run your machine out of memory and into a swap storm that will kill performance.

> fsync = off                         # turns forced synchronization on or off

So the data in this database isn't important? Because that's what fsync = off says to me. Better to buy yourself a nice battery-backed caching RAID controller than to turn off fsync.

> effective_cache_size = 41943040     # typically 8KB each

And your machine has 343,604,830,208 bytes of memory available for caching? Seems a little high to me.

> random_page_cost = 1                # units are one sequential page fetch cost

Seldom if ever is it a good idea to bonk the planner on the head with random_page_cost = 1. Setting it to 1.2 to 1.4 is low enough; 1.4 to 2.0 is more realistic.

> stats_start_collector = off
> #stats_command_string = off
> #stats_block_level = off
> #stats_row_level = off
> #stats_reset_on_server_start = off

I think you need stats_row_level on for autovacuum, but I'm not 100% sure.

Let us know what happens after fixing these settings, running analyze, and running explain analyze, with possible changes to the query.
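(Pulled together, Scott's suggestions would amount to roughly the following postgresql.conf changes. This is a sketch, not a tested configuration; the effective_cache_size figure is an assumption of ~2 GB of OS cache on a 4 GB box, since the post only says the current value is too high:)

    shared_buffers = 125000         # ~1 GB, about 25% of 4 GB RAM (8KB buffers)
    work_mem = 128000               # ~125 MB per sort/hash, in KB
    maintenance_work_mem = 128000   # ~125 MB, in KB
    fsync = on                      # keep the data safe; buy a BBU RAID cache for speed
    effective_cache_size = 262144   # assumed ~2 GB of OS cache, in 8KB pages
    random_page_cost = 1.4          # still low, but not pinned to 1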
On Sat, Jul 28, 2007 at 10:36:16PM +0000, Ragnar wrote:
> On Sat, 2007-07-28 at 17:12 -0300, Bruno Rodrigues Siqueira wrote:
> > where
> >   to_char( data_encerramento ,'yyyy-mm')
> >   between '2006-12' and '2007-01'
>
> Assuming data_encerramento is a date column, try:
> WHERE data_encerramento between '2006-12-01' and '2007-01-31'

IMO, much better would be:

    WHERE data_encerramento >= '2006-12-01'
      AND data_encerramento < '2007-02-01'

This means you don't have to worry about the last day of the month or timestamp precision. In fact, since the field is actually a timestamp, the BETWEEN posted above won't work correctly.

--
Decibel!, aka Jim Nasby                    decibel@decibel.org
EnterpriseDB  http://enterprisedb.com      512.569.9461 (cell)
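(A quick self-contained way to see the point about the inclusive upper bound, a sketch using literal timestamps:)

    -- '2007-01-31' is read as '2007-01-31 00:00:00', so BETWEEN silently
    -- excludes almost all of the last day of the month:
    SELECT timestamp '2007-01-31 14:30:00'
           BETWEEN '2006-12-01' AND '2007-01-31';           -- false

    -- The half-open form covers the whole month, whatever the time of day:
    SELECT timestamp '2007-01-31 14:30:00' >= '2006-12-01'
       AND timestamp '2007-01-31 14:30:00' <  '2007-02-01'; -- true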
Look at it:

    EXPLAIN ANALYZE
    select to_char(data_encerramento,'mm/yyyy') as opcoes_mes,
           to_char(data_encerramento,'yyyy-mm') as ordem
    from detalhamento_bas
    where data_encerramento = '01/12/2006'
    GROUP BY opcoes_mes, ordem
    ORDER BY ordem DESC

    QUERY PLAN
    Sort  (cost=60.72..60.72 rows=1 width=8) (actual time=4.586..4.586 rows=0 loops=1)
      Sort Key: to_char(data_encerramento, 'yyyy-mm'::text)
      ->  HashAggregate  (cost=60.72..60.72 rows=1 width=8) (actual time=4.579..4.579 rows=0 loops=1)
            ->  Index Scan using detalhamento_bas_idx3005 on detalhamento_bas  (cost=0.00..60.67 rows=105 width=8) (actual time=4.576..4.576 rows=0 loops=1)
                  Index Cond: (data_encerramento = '2006-12-01 00:00:00'::timestamp without time zone)
    Total runtime: 4.629 ms

    EXPLAIN ANALYZE
    select to_char(data_encerramento,'mm/yyyy') as opcoes_mes,
           to_char(data_encerramento,'yyyy-mm') as ordem
    from detalhamento_bas
    where data_encerramento >= '01/12/2006'
      and data_encerramento < '01/02/2007'
    GROUP BY opcoes_mes, ordem
    ORDER BY ordem DESC

    QUERY PLAN
    Sort  (cost=219113.10..219113.10 rows=4 width=8) (actual time=10079.212..10079.213 rows=2 loops=1)
      Sort Key: to_char(data_encerramento, 'yyyy-mm'::text)
      ->  HashAggregate  (cost=219113.09..219113.09 rows=4 width=8) (actual time=10079.193..10079.195 rows=2 loops=1)
            ->  Seq Scan on detalhamento_bas  (cost=0.00..217945.41 rows=2335358 width=8) (actual time=0.041..8535.792 rows=2335819 loops=1)
                  Filter: ((data_encerramento >= '2006-12-01 00:00:00'::timestamp without time zone) AND (data_encerramento < '2007-02-01 00:00:00'::timestamp without time zone))
    Total runtime: 10079.256 ms

Strange!!! Why doesn't the index work? None of my queries with date ranges use it... I don't know what to do... Please, help!

Bruno

-----Original Message-----
From: Decibel! [mailto:decibel@decibel.org]
Sent: Sunday, July 29, 2007 13:36
To: Ragnar
Cc: Bruno Rodrigues Siqueira; pgsql-performance@postgresql.org
Subject: Re: RES: [PERFORM] select on 1 million records = 6s

> IMO, much better would be:
> WHERE data_encerramento >= '2006-12-01'
>   AND data_encerramento < '2007-02-01'
[snip]
Scott Marlowe wrote:
> On 7/28/07, Bruno Rodrigues Siqueira <bruno@ravnus.com> wrote:
> > stats_start_collector = off
> > #stats_command_string = off
> > #stats_block_level = off
> > #stats_row_level = off
> > #stats_reset_on_server_start = off
>
> I think you need stats_row_level on for autovacuum, but I'm not 100% sure.

That's correct (and of course you need stats_start_collector on as well). Most likely, autovacuum is not even running.

--
Alvaro Herrera                 http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
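(In other words, for autovacuum to actually run on 8.1, the statistics settings would need to read as follows; a sketch of the corrected lines:)

    stats_start_collector = on   # the collector must be running
    stats_row_level = on         # autovacuum relies on row-level stats
    autovacuum = on              # already on, but it needs the two lines above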
On Sun, Jul 29, 2007 at 01:44:23PM -0300, Bruno Rodrigues Siqueira wrote:
> EXPLAIN ANALYZE
> select to_char(data_encerramento,'mm/yyyy') as opcoes_mes,
>        to_char(data_encerramento,'yyyy-mm') as ordem
> from detalhamento_bas
> where data_encerramento >= '01/12/2006'
>   and data_encerramento < '01/02/2007'
> GROUP BY opcoes_mes, ordem
> ORDER BY ordem DESC
>
> QUERY PLAN
> Sort  (cost=219113.10..219113.10 rows=4 width=8) (actual time=10079.212..10079.213 rows=2 loops=1)
>   Sort Key: to_char(data_encerramento, 'yyyy-mm'::text)
>   ->  HashAggregate  (cost=219113.09..219113.09 rows=4 width=8) (actual time=10079.193..10079.195 rows=2 loops=1)
>         ->  Seq Scan on detalhamento_bas  (cost=0.00..217945.41 rows=2335358 width=8) (actual time=0.041..8535.792 rows=2335819 loops=1)
>               Filter: ((data_encerramento >= '2006-12-01 00:00:00'::timestamp without time zone) AND (data_encerramento < '2007-02-01 00:00:00'::timestamp without time zone))
> Total runtime: 10079.256 ms
>
> Strange!!! Why doesn't the index work?

It's unlikely that it's going to be faster to index scan 2.3M rows than to sequential scan them. Try setting enable_seqscan=false and see if it is or not.

--
Decibel!, aka Jim Nasby                    decibel@decibel.org
EnterpriseDB  http://enterprisedb.com      512.569.9461 (cell)
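(A sketch of that test, run in a single session so the setting does not leak into other connections:)

    SET enable_seqscan = off;   -- session-local; a diagnostic, not a production setting
    EXPLAIN ANALYZE
    select to_char(data_encerramento,'mm/yyyy') as opcoes_mes,
           to_char(data_encerramento,'yyyy-mm') as ordem
    from detalhamento_bas
    where data_encerramento >= '2006-12-01'
      and data_encerramento < '2007-02-01'
    GROUP BY opcoes_mes, ordem
    ORDER BY ordem DESC;
    RESET enable_seqscan;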
Please reply-all so others can learn and contribute.

On Sun, Jul 29, 2007 at 09:38:12PM -0700, Craig James wrote:
> Decibel! wrote:
> > It's unlikely that it's going to be faster to index scan 2.3M rows than
> > to sequential scan them. Try setting enable_seqscan=false and see if it
> > is or not.
>
> Out of curiosity... Doesn't that depend on the table? Are all of the data
> for one row stored contiguously, or are the data stored column-wise? If
> it's the former, and the table has hundreds of columns, or a few columns
> with large text strings, then wouldn't the time for a sequential scan
> depend not on the number of rows, but rather on the total amount of data?

Yes, the time for a seqscan is mostly dependent on table size and not the number of rows. But the number of rows plays a very large role in the cost of an index scan.

--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828
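(Since seqscan time tracks on-disk size, it is worth checking the table's size directly; 8.1 ships size-reporting functions for this. A sketch:)

    -- Heap size alone, and heap plus indexes, in human-readable form
    SELECT pg_size_pretty(pg_relation_size('detalhamento_bas'))       AS table_size,
           pg_size_pretty(pg_total_relation_size('detalhamento_bas')) AS with_indexes;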
Scott Marlowe wrote:
> > random_page_cost = 1    # units are one sequential page fetch cost
>
> Seldom if ever is it a good idea to bonk the planner on the head with
> random_page_cost = 1. Setting it to 1.2 to 1.4 is low enough; 1.4
> to 2.0 is more realistic.

Which is probably the reason why the planner thinks a seq scan is faster than an index scan...

Jan