Thread: select on 1 million records = 6s
Friends,

Who can help me? My SELECT on a table with 1 million records, using an expression index, takes 6 seconds...

Please, I don't know how to make it better.

Thanks
Bruno Rodrigues Siqueira wrote:
> Who can help me? My SELECT on a table with 1 million records,
> using an expression index, takes 6 seconds...

Run your query using

    EXPLAIN ANALYZE SELECT ... your query ...

and then post the results to this newsgroup. Nobody can help until they see the results of EXPLAIN ANALYZE. Also, include all other relevant information, such as Postgres version, operating system, amount of memory, and any changes you have made to the Postgres configuration file.

Craig
Have you analyzed your table before doing this?

On Saturday 28 July 2007, Bruno Rodrigues Siqueira wrote:
> Who can help me? My SELECT on a table with 1 million records,
> using an expression index, takes 6 seconds.
>
> Please, I don't know how to make it better.
>
> Thanks

--
Hervé Piedvache
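(For reference, refreshing the planner's statistics is a one-liner; a minimal sketch, using the table name that appears in Bruno's later posts:)

    -- Recompute planner statistics for the table and its expression index
    ANALYZE detalhamento_bas;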
Ok.

Query:

    EXPLAIN ANALYZE
    select to_char(data_encerramento,'mm/yyyy') as opcoes_mes,
           to_char(data_encerramento,'yyyy-mm') as ordem
    from detalhamento_bas
    where to_char( data_encerramento ,'yyyy-mm')
          between '2006-12' and '2007-01'
    GROUP BY opcoes_mes, ordem
    ORDER BY ordem DESC

Query plan:

    Sort  (cost=11449.37..11449.40 rows=119 width=8) (actual time=14431.537..14431.538 rows=2 loops=1)
      Sort Key: to_char(data_encerramento, 'yyyy-mm'::text)
      ->  HashAggregate  (cost=11448.79..11448.96 rows=119 width=8) (actual time=14431.521..14431.523 rows=2 loops=1)
            ->  Index Scan using detalhamento_bas_idx3003 on detalhamento_bas  (cost=0.00..11442.95 rows=11679 width=8) (actual time=0.135..12719.155 rows=2335819 loops=1)
                  Index Cond: ((to_char(data_encerramento, 'yyyy-mm'::text) >= '2006-12'::text) AND (to_char(data_encerramento, 'yyyy-mm'::text) <= '2007-01'::text))
    Total runtime: 14431.605 ms

Server:

    DELL PowerEdge 2950
    XEON Quad-Core 3.0 GHz
    4 GB RAM
    Linux CentOS 5.0 64-bit
    Postgres 8.1.4

postgresql.conf (settings changed from the stock file; everything else is commented out at its default value):

    listen_addresses = '*'
    max_connections = 10
    shared_buffers = 50000              # 8KB each
    temp_buffers = 1000
    work_mem = 3145728                  # size in KB
    maintenance_work_mem = 4194304      # size in KB
    max_stack_depth = 2048
    max_fsm_pages = 208000
    max_fsm_relations = 10000
    vacuum_cost_delay = 50              # milliseconds
    bgwriter_delay = 200
    bgwriter_lru_percent = 20.0
    bgwriter_lru_maxpages = 100
    bgwriter_all_percent = 3
    bgwriter_all_maxpages = 600
    fsync = off
    full_page_writes = off
    wal_buffers = 2300                  # 8KB each
    commit_delay = 10                   # microseconds
    checkpoint_segments = 256           # 16MB each
    checkpoint_timeout = 300            # seconds
    checkpoint_warning = 99             # seconds
    # all nine enable_* planner options are set to on (their defaults)
    effective_cache_size = 41943040     # 8KB each
    random_page_cost = 1
    cpu_tuple_cost = 0.001
    cpu_index_tuple_cost = 0.0005
    cpu_operator_cost = 0.00025
    constraint_exclusion = on
    join_collapse_limit = 1             # 1 disables collapsing of explicit JOINs
    redirect_stderr = on
    log_directory = 'pg_log'
    log_filename = 'postgresql-%a.log'
    log_truncate_on_rotation = on
    log_rotation_age = 1440
    log_rotation_size = 0
    stats_start_collector = off
    autovacuum = on
    default_transaction_isolation = 'read committed'
    datestyle = 'iso, dmy'
    client_encoding = LATIN1
    lc_messages = 'pt_BR.ISO-8859-1'
    lc_monetary = 'pt_BR.ISO-8859-1'
    lc_numeric = 'pt_BR.ISO-8859-1'
    lc_time = 'pt_BR.ISO-8859-1'
    deadlock_timeout = 1000             # milliseconds

-----Original Message-----
From: Craig James [mailto:craig_james@emolecules.com]
Sent: Saturday, July 28, 2007 16:59
To: Bruno Rodrigues Siqueira; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] select on 1 million records = 6s

> Run your query using EXPLAIN ANALYZE and then post the results to this
> newsgroup. [snip]
On Sat, 2007-07-28 at 17:12 -0300, Bruno Rodrigues Siqueira wrote:
> where
>   to_char( data_encerramento ,'yyyy-mm')
>   between '2006-12' and '2007-01'

Assuming data_encerramento is a date column, try:

    WHERE data_encerramento between '2006-12-01' and '2007-01-31'

gnari
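(Applied to the full query posted earlier in the thread, the suggested rewrite would read roughly as follows; a sketch only, since, as the follow-ups note, the column is actually a timestamp and the inclusive upper bound needs care:)

    EXPLAIN ANALYZE
    SELECT to_char(data_encerramento, 'mm/yyyy') AS opcoes_mes,
           to_char(data_encerramento, 'yyyy-mm') AS ordem
    FROM detalhamento_bas
    WHERE data_encerramento BETWEEN '2006-12-01' AND '2007-01-31'
    GROUP BY opcoes_mes, ordem
    ORDER BY ordem DESC;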
Data_encerramento is a timestamp column. I will try your tip. Thanks

-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On behalf of Ragnar
Sent: Saturday, July 28, 2007 19:36
To: Bruno Rodrigues Siqueira
Cc: pgsql-performance@postgresql.org
Subject: Re: RES: [PERFORM] select on 1 million records = 6s

> Assuming data_encerramento is a date column, try:
>     WHERE data_encerramento between '2006-12-01' and '2007-01-31'
[snip]
Yes. Look at this... and please tell me if you can help me... Thanks

[snipped: the same query, EXPLAIN ANALYZE output, server specs, and postgresql.conf already posted above]

-----Original Message-----
From: Hervé Piedvache [mailto:bill.footcow@gmail.com]
Sent: Saturday, July 28, 2007 16:57
To: pgsql-performance@postgresql.org
Cc: Bruno Rodrigues Siqueira
Subject: Re: [PERFORM] select on 1 million records = 6s

> Have you analyzed your table before doing this?
[snip]
Yes, I do.

-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On behalf of Hervé Piedvache
Sent: Saturday, July 28, 2007 16:57
To: pgsql-performance@postgresql.org
Cc: Bruno Rodrigues Siqueira
Subject: Re: [PERFORM] select on 1 million records = 6s

> Have you analyzed your table before doing this?
[snip]
On 7/28/07, Bruno Rodrigues Siqueira <bruno@ravnus.com> wrote:
> Ok.
> QUERY PLAN
> Sort  (cost=11449.37..11449.40 rows=119 width=8) (actual time=14431.537..14431.538 rows=2 loops=1)
>   Sort Key: to_char(data_encerramento, 'yyyy-mm'::text)
>   ->  HashAggregate  (cost=11448.79..11448.96 rows=119 width=8) (actual time=14431.521..14431.523 rows=2 loops=1)
>         ->  Index Scan using detalhamento_bas_idx3003 on detalhamento_bas  (cost=0.00..11442.95 rows=11679 width=8) (actual time=0.135..12719.155 rows=2335819 loops=1)

See the row mismatch there? It expects about 11k rows and gets back 2.3 million. That's a pretty big misestimate. Have you run analyze recently on this table?

Is there a reason you're doing this:

    to_char( data_encerramento ,'yyyy-mm') between '2006-12' and '2007-01'

when you should be able to just do:

    data_encerramento between '2006-12-01' and '2007-01-31'

? That should be able to use good estimates from analyze. My guess is the planner is making a bad guess because of the way you're handling the dates.

> SERVER
> DELL PowerEdge 2950, XEON Quad-Core 3.0 GHz, 4 GB RAM
> Linux CentOS 5.0 64-bit, Postgres 8.1.4

> shared_buffers = 50000              # min 16 or max_connections*2, 8KB each

400 MB is kind of low for a server with 4 GB of RAM. 25% is more reasonable (i.e. 125000 buffers).

> work_mem = 3145728                  # min 64, size in KB
> maintenance_work_mem = 4194304      # min 1024, size in KB

Whoa nellie! That's ~3 GB of work mem and 4 GB of maintenance work mem. On a machine with 4 GB of RAM, that's a recipe for disaster. Something more reasonable would be 128000 (~125 MB) for each; since you've limited your machine to 10 connections, you should be OK. Setting work_mem too high can run your machine out of memory and into a swap storm that will kill performance.

> fsync = off                         # turns forced synchronization on or off

So the data in this database isn't important? Because that's what fsync = off says to me. Better to buy yourself a nice battery-backed caching RAID controller than to turn off fsync.

> effective_cache_size = 41943040     # typically 8KB each

And your machine has 343,604,830,208 bytes of memory available for caching? Seems a little high to me.

> random_page_cost = 1                # units are one sequential page fetch cost

Seldom if ever is it a good idea to bonk the planner on the head with random_page_cost = 1. Setting it to 1.2 to 1.4 is low enough; 1.4 to 2.0 is more realistic.

> stats_start_collector = off
> #stats_command_string = off
> #stats_block_level = off
> #stats_row_level = off
> #stats_reset_on_server_start = off

I think you need stats_row_level on for autovacuum, but I'm not 100% sure.

Let us know what happens after fixing these settings, running analyze, and running explain analyze, with possible changes to the query.
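(Pulled together, Scott's suggestions would amount to roughly the following postgresql.conf changes. This is a sketch, not a tested configuration; the effective_cache_size figure is an assumption of ~2 GB of OS cache on a 4 GB box, since the post only says the current value is too high:)

    shared_buffers = 125000         # ~1 GB, about 25% of 4 GB RAM (8KB buffers)
    work_mem = 128000               # ~125 MB per sort/hash, in KB
    maintenance_work_mem = 128000   # ~125 MB, in KB
    fsync = on                      # keep the data safe; buy a BBU RAID cache for speed
    effective_cache_size = 262144   # assumed ~2 GB of OS cache, in 8KB pages
    random_page_cost = 1.4          # still low, but not pinned to 1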
On Sat, Jul 28, 2007 at 10:36:16PM +0000, Ragnar wrote:
> On Sat, 2007-07-28 at 17:12 -0300, Bruno Rodrigues Siqueira wrote:
> > where
> >   to_char( data_encerramento ,'yyyy-mm')
> >   between '2006-12' and '2007-01'
>
> Assuming data_encerramento is a date column, try:
> WHERE data_encerramento between '2006-12-01' and '2007-01-31'

IMO, much better would be:

    WHERE data_encerramento >= '2006-12-01'
      AND data_encerramento < '2007-02-01'

This means you don't have to worry about the last day of the month or timestamp precision. In fact, since the field is actually a timestamp, the BETWEEN posted above won't work correctly.

--
Decibel!, aka Jim Nasby                    decibel@decibel.org
EnterpriseDB  http://enterprisedb.com      512.569.9461 (cell)
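(A quick self-contained way to see the point about the inclusive upper bound, a sketch using literal timestamps:)

    -- '2007-01-31' is read as '2007-01-31 00:00:00', so BETWEEN silently
    -- excludes almost all of the last day of the month:
    SELECT timestamp '2007-01-31 14:30:00'
           BETWEEN '2006-12-01' AND '2007-01-31';           -- false

    -- The half-open form covers the whole month, whatever the time of day:
    SELECT timestamp '2007-01-31 14:30:00' >= '2006-12-01'
       AND timestamp '2007-01-31 14:30:00' <  '2007-02-01'; -- true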
Look at it:

    EXPLAIN ANALYZE
    select to_char(data_encerramento,'mm/yyyy') as opcoes_mes,
           to_char(data_encerramento,'yyyy-mm') as ordem
    from detalhamento_bas
    where data_encerramento = '01/12/2006'
    GROUP BY opcoes_mes, ordem
    ORDER BY ordem DESC

    QUERY PLAN
    Sort  (cost=60.72..60.72 rows=1 width=8) (actual time=4.586..4.586 rows=0 loops=1)
      Sort Key: to_char(data_encerramento, 'yyyy-mm'::text)
      ->  HashAggregate  (cost=60.72..60.72 rows=1 width=8) (actual time=4.579..4.579 rows=0 loops=1)
            ->  Index Scan using detalhamento_bas_idx3005 on detalhamento_bas  (cost=0.00..60.67 rows=105 width=8) (actual time=4.576..4.576 rows=0 loops=1)
                  Index Cond: (data_encerramento = '2006-12-01 00:00:00'::timestamp without time zone)
    Total runtime: 4.629 ms

    EXPLAIN ANALYZE
    select to_char(data_encerramento,'mm/yyyy') as opcoes_mes,
           to_char(data_encerramento,'yyyy-mm') as ordem
    from detalhamento_bas
    where data_encerramento >= '01/12/2006'
      and data_encerramento < '01/02/2007'
    GROUP BY opcoes_mes, ordem
    ORDER BY ordem DESC

    QUERY PLAN
    Sort  (cost=219113.10..219113.10 rows=4 width=8) (actual time=10079.212..10079.213 rows=2 loops=1)
      Sort Key: to_char(data_encerramento, 'yyyy-mm'::text)
      ->  HashAggregate  (cost=219113.09..219113.09 rows=4 width=8) (actual time=10079.193..10079.195 rows=2 loops=1)
            ->  Seq Scan on detalhamento_bas  (cost=0.00..217945.41 rows=2335358 width=8) (actual time=0.041..8535.792 rows=2335819 loops=1)
                  Filter: ((data_encerramento >= '2006-12-01 00:00:00'::timestamp without time zone) AND (data_encerramento < '2007-02-01 00:00:00'::timestamp without time zone))
    Total runtime: 10079.256 ms

Strange!!! Why doesn't the index work? None of my queries with date ranges use it... I don't know what to do... Please, help!

Bruno

-----Original Message-----
From: Decibel! [mailto:decibel@decibel.org]
Sent: Sunday, July 29, 2007 13:36
To: Ragnar
Cc: Bruno Rodrigues Siqueira; pgsql-performance@postgresql.org
Subject: Re: RES: [PERFORM] select on 1 million records = 6s

> IMO, much better would be:
> WHERE data_encerramento >= '2006-12-01'
>   AND data_encerramento < '2007-02-01'
[snip]
Scott Marlowe wrote:
> On 7/28/07, Bruno Rodrigues Siqueira <bruno@ravnus.com> wrote:
> > stats_start_collector = off
> > #stats_command_string = off
> > #stats_block_level = off
> > #stats_row_level = off
> > #stats_reset_on_server_start = off
>
> I think you need stats_row_level on for autovacuum, but I'm not 100% sure.

That's correct (and of course you need stats_start_collector on as well). Most likely, autovacuum is not even running.

--
Alvaro Herrera                 http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
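(In other words, for autovacuum to actually run on 8.1, the statistics settings would need to read as follows; a sketch of the corrected lines:)

    stats_start_collector = on   # the collector must be running
    stats_row_level = on         # autovacuum relies on row-level stats
    autovacuum = on              # already on, but it needs the two lines above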
On Sun, Jul 29, 2007 at 01:44:23PM -0300, Bruno Rodrigues Siqueira wrote:
> EXPLAIN ANALYZE
> select to_char(data_encerramento,'mm/yyyy') as opcoes_mes,
>        to_char(data_encerramento,'yyyy-mm') as ordem
> from detalhamento_bas
> where data_encerramento >= '01/12/2006'
>   and data_encerramento < '01/02/2007'
> GROUP BY opcoes_mes, ordem
> ORDER BY ordem DESC
>
> QUERY PLAN
> Sort  (cost=219113.10..219113.10 rows=4 width=8) (actual time=10079.212..10079.213 rows=2 loops=1)
>   Sort Key: to_char(data_encerramento, 'yyyy-mm'::text)
>   ->  HashAggregate  (cost=219113.09..219113.09 rows=4 width=8) (actual time=10079.193..10079.195 rows=2 loops=1)
>         ->  Seq Scan on detalhamento_bas  (cost=0.00..217945.41 rows=2335358 width=8) (actual time=0.041..8535.792 rows=2335819 loops=1)
>               Filter: ((data_encerramento >= '2006-12-01 00:00:00'::timestamp without time zone) AND (data_encerramento < '2007-02-01 00:00:00'::timestamp without time zone))
> Total runtime: 10079.256 ms
>
> Strange!!! Why doesn't the index work?

It's unlikely that it's going to be faster to index scan 2.3M rows than to sequential scan them. Try setting enable_seqscan=false and see if it is or not.

--
Decibel!, aka Jim Nasby                    decibel@decibel.org
EnterpriseDB  http://enterprisedb.com      512.569.9461 (cell)
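(A sketch of that test, run in a single session so the setting does not leak into other connections:)

    SET enable_seqscan = off;   -- session-local; a diagnostic, not a production setting
    EXPLAIN ANALYZE
    select to_char(data_encerramento,'mm/yyyy') as opcoes_mes,
           to_char(data_encerramento,'yyyy-mm') as ordem
    from detalhamento_bas
    where data_encerramento >= '2006-12-01'
      and data_encerramento < '2007-02-01'
    GROUP BY opcoes_mes, ordem
    ORDER BY ordem DESC;
    RESET enable_seqscan;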
Please reply-all so others can learn and contribute.

On Sun, Jul 29, 2007 at 09:38:12PM -0700, Craig James wrote:
> Decibel! wrote:
> > It's unlikely that it's going to be faster to index scan 2.3M rows than
> > to sequential scan them. Try setting enable_seqscan=false and see if it
> > is or not.
>
> Out of curiosity... Doesn't that depend on the table? Are all of the data
> for one row stored contiguously, or are the data stored column-wise? If
> it's the former, and the table has hundreds of columns, or a few columns
> with large text strings, then wouldn't the time for a sequential scan
> depend not on the number of rows, but rather on the total amount of data?

Yes, the time for a seqscan is mostly dependent on table size and not the number of rows. But the number of rows plays a very large role in the cost of an index scan.

--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828
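(Since seqscan time tracks on-disk size, it is worth checking the table's size directly; 8.1 ships size-reporting functions for this. A sketch:)

    -- Heap size alone, and heap plus indexes, in human-readable form
    SELECT pg_size_pretty(pg_relation_size('detalhamento_bas'))       AS table_size,
           pg_size_pretty(pg_total_relation_size('detalhamento_bas')) AS with_indexes;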
Scott Marlowe wrote:
> > random_page_cost = 1    # units are one sequential page fetch cost
>
> Seldom if ever is it a good idea to bonk the planner on the head with
> random_page_cost = 1. Setting it to 1.2 to 1.4 is low enough; 1.4
> to 2.0 is more realistic.

Which is probably the reason why the planner thinks a seq scan is faster than an index scan...

Jan