Thread: postgres 7.4 at 100%
Hi all,

I was running Postgres 7.3 and it was running at about 15% CPU with my application. On Postgres 7.4 on another box, it is running at 100%... My settings are default on both boxes, I think. There are only about 20 inserts per second, which is really low.

Anyone have any ideas as to something I have to change from the default in Postgres 7.4 so that it's not eating up all my CPU? I have no clue how to debug this... Help please! Should I downgrade to 7.3 to see what happens?

BTW, I'm running Postgres 7.3.2 on:

Linux box 2.4.25-040218 #1 SMP Wed Feb 18 17:59:29 CET 2004 i686 i686 i386 GNU/Linux

on a single-processor P4 1.4GHz with 512 MB RAM. Does the SMP kernel do something with a single-processor CPU, or should this not affect Postgres?

Thanks in advance!

Chris
Chris Cheston wrote:
> Hi all,
>
> I was running Postgres 7.3 and it was running at about 15% with my
> application. On Postgres 7.4 on another box, it was running at 100%...

People are going to need more information. Are you talking about CPU/disk IO/memory?

> My settings are default on both boxes I think.

Doubtful - PG crawls with the default settings. Check your old postgresql.conf file and compare. Also, read the tuning article at:
http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php

> There are only about 20 inserts per second, which is really low.
> Anyone have any ideas as to something I have to do to Postgres 7.4 to
> change it from the default so that it's not eating up all my CPU? I
> have no clue how to debug this...

What does top/vmstat/iostat show during heavy usage?

> Should I downgrade to 7.3 to see what happens? BTW
> I'm running Postgres 7.3.2 on:
>
> Linux box 2.4.25-040218 #1 SMP Wed Feb 18 17:59:29 CET 2004 i686 i686
> i386 GNU/Linux
>
> on a single processor P4 1.4GHz, 512 MB RAM. Does the SMP kernel do
> something with the single processor CPU? or should this not affect
> Postgres?

Don't know about the SMP thing. Unlikely that one of the big distributions would mess that up much though.

-- 
Richard Huxton
Archonet Ltd
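For the "compare your conf files" step, one quick approach - a sketch, assuming a stock 7.4-style postgresql.conf at the path shown later in this thread - is to list only the active, uncommented settings in each file, or to ask each running server directly:

  # list only the non-default (uncommented) settings
  grep -E '^[a-z]' /var/lib/pgsql/data/postgresql.conf

  -- or, from psql, inspect a running server
  SHOW shared_buffers;
  SHOW ALL;   -- dumps every setting with its current value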
Hi Richard,

Thanks so much for replying. Please see below. Thanks in advance for any advice,

Chris

> People are going to need more information. Are you talking about
> CPU/disk IO/memory?

CPU is at 100%.

> > My settings are default on both boxes I think.
>
> Doubtful - PG crawls with the default settings. Check your old
> postgresql.conf file and compare. Also, read the tuning article at:
> http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
>
> What does top/vmstat/iostat show during heavy usage?

top:

137 processes: 135 sleeping, 2 running, 0 zombie, 0 stopped
CPU states: 81.4% user  17.9% system  0.0% nice  0.0% iowait  0.5% idle
Mem:  507036k av,  500244k used,    6792k free,      0k shrd,  68024k buff
                   133072k active,        277368k inactive
Swap: 787176k av,   98924k used,  688252k free                232500k cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
20734 postgres  15   0  4028 4028  3244 R    83.6  0.7 348:03   0 postmaster
21249 numnet     9   0 78060  76M  8440 S     7.5 15.3  32:51   0 myapp
18478 user      12   0  1224 1224   884 S     5.7  0.2  57:01   0 top

vmstat:

[root@live root]# vmstat
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 1  0  0  98924   5980  68024 233528   4   3     6    22   28    10  13   6  24

iostat:

Time: 03:18:18 AM
Device:    tps   kB_read/s   kB_wrtn/s   kB_read   kB_wrtn
dev3-0   11.00        0.80      142.40         4       712

Time: 03:18:23 AM
Device:    tps   kB_read/s   kB_wrtn/s   kB_read   kB_wrtn
dev3-0   10.60        0.00      143.20         0       716

or in blocks:

Time: 03:20:58 AM
Device:    tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
dev3-0   25.40         3.20       756.80        16      3784

Time: 03:21:03 AM
Device:    tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
dev3-0   30.20         3.20       841.60        16      4208

extended:

Device:    rrqm/s wrqm/s  r/s   w/s  rsec/s  wsec/s  rkB/s  wkB/s avgrq-sz avgqu-sz await svctm %util
/dev/hda3    0.00  79.20 0.60 30.60    4.80  878.40   2.40 439.20    28.31     0.36  1.15  0.45  1.40

avg-cpu:  %user   %nice    %sys   %idle
          31.00    0.00   18.20   50.80

Device:    rrqm/s wrqm/s  r/s   w/s  rsec/s  wsec/s  rkB/s  wkB/s avgrq-sz avgqu-sz await svctm %util
/dev/hda3    0.00  45.80 0.00 10.80    0.00  452.80   0.00 226.40    41.93     0.08  0.74  0.37  0.40

avg-cpu:  %user   %nice    %sys   %idle
          83.20    0.00   16.60    0.20

Device:    rrqm/s wrqm/s  r/s   w/s  rsec/s  wsec/s  rkB/s  wkB/s avgrq-sz   avgqu-sz await svctm  %util
/dev/hda     0.00  28.20 0.00 10.10    0.00  315.20   0.00 157.60    31.21 4294917.33  0.30 99.01 100.00
/dev/hda1    0.00   0.00 0.00  0.00    0.00    0.00   0.00   0.00     0.00       0.00  0.00  0.00   0.00
/dev/hda2    0.00   0.00 0.00  0.00    0.00    0.00   0.00   0.00     0.00       0.00  0.00  0.00   0.00
/dev/hda3    0.00  28.20 0.00 10.10    0.00  315.20   0.00 157.60    31.21       0.03  0.30  0.20   0.20

My conf file:

[root@live setup]# cat /var/lib/pgsql/data/postgresql.conf

# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
# "pg_ctl reload".


#---------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#---------------------------------------------------------------------------

# - Connection Settings -

tcpip_socket = true
max_connections = 20
        # note: increasing max_connections costs about 500 bytes of shared
        # memory per connection slot, in addition to costs from shared_buffers
        # and max_locks_per_transaction.
#superuser_reserved_connections = 2
#port = 5432
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777 # octal
#virtual_host = ''              # what interface to listen on; defaults to any
#rendezvous_name = ''           # defaults to the computer name

# - Security & Authentication -

#authentication_timeout = 60    # 1-600, in seconds
#ssl = false
#password_encryption = true
#krb_server_keyfile = ''
#db_user_namespace = false


#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------

# - Memory -

shared_buffers = 40             # min 16, at least max_connections*2, 8KB each
#sort_mem = 1024                # min 64, size in KB
#vacuum_mem = 8192              # min 1024, size in KB

# - Free Space Map -

#max_fsm_pages = 20000          # min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000       # min 100, ~50 bytes each

# - Kernel Resource Usage -

#max_files_per_process = 1000   # min 25
#preload_libraries = ''


#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -

#fsync = true                   # turns forced synchronization on or off
#wal_sync_method = fsync        # the default varies across platforms:
                                # fsync, fdatasync, open_sync, or open_datasync
#wal_buffers = 8                # min 4, 8KB each

# - Checkpoints -

#checkpoint_segments = 3        # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 300       # range 30-3600, in seconds
#checkpoint_warning = 30        # 0 is off, in seconds
#commit_delay = 0               # range 0-100000, in microseconds
#commit_siblings = 5            # range 1-1000


#---------------------------------------------------------------------------
# QUERY TUNING
#---------------------------------------------------------------------------

# - Planner Method Enabling -

#enable_hashagg = true
#enable_hashjoin = true
#enable_indexscan = true
#enable_mergejoin = true
#enable_nestloop = true
#enable_seqscan = true
#enable_sort = true
#enable_tidscan = true

# - Planner Cost Constants -

#effective_cache_size = 1000    # typically 8KB each
#random_page_cost = 4           # units are one sequential page fetch cost
#cpu_tuple_cost = 0.01          # (same)
#cpu_index_tuple_cost = 0.001   # (same)
#cpu_operator_cost = 0.0025     # (same)

# - Genetic Query Optimizer -

#geqo = true
#geqo_threshold = 11
#geqo_effort = 1
#geqo_generations = 0
#geqo_pool_size = 0             # default based on tables in statement,
                                # range 128-1024
#geqo_selection_bias = 2.0      # range 1.5-2.0

# - Other Planner Options -

#default_statistics_target = 10 # range 1-1000
#from_collapse_limit = 8
#join_collapse_limit = 8        # 1 disables collapsing of explicit JOINs


#---------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#---------------------------------------------------------------------------

# - Syslog -

syslog = 0                      # range 0-2; 0=stdout; 1=both; 2=syslog
syslog_facility = 'LOCAL0'
syslog_ident = 'postgres'

# - When to Log -

client_min_messages = warning   # Values, in order of decreasing detail:
                                #   debug5, debug4, debug3, debug2, debug1,
                                #   log, info, notice, warning, error

log_min_messages = log          # Values, in order of decreasing detail:
                                #   debug5, debug4, debug3, debug2, debug1,
                                #   info, notice, warning, error, log, fatal,
                                #   panic

log_error_verbosity = default   # terse, default, or verbose messages

log_min_error_statement = warning # Values in order of increasing severity:
                                #   debug5, debug4, debug3, debug2, debug1,
                                #   info, notice, warning, error, panic(off)

#log_min_duration_statement = -1 # Log all statements whose
                                 # execution time exceeds the value, in
                                 # milliseconds. Zero prints all queries.
                                 # Minus-one disables.

#silent_mode = false            # DO NOT USE without Syslog!

# - What to Log -

#debug_print_parse = false
#debug_print_rewritten = false
#debug_print_plan = false
#debug_pretty_print = false
#log_connections = false
#log_duration = false
#log_pid = false
#log_statement = false
#log_timestamp = false
#log_hostname = false
#log_source_port = false


#---------------------------------------------------------------------------
# RUNTIME STATISTICS
#---------------------------------------------------------------------------

# - Statistics Monitoring -

#log_parser_stats = false
#log_planner_stats = false
#log_executor_stats = false
#log_statement_stats = false

# - Query/Index Statistics Collector -

#stats_start_collector = true
#stats_command_string = false
#stats_block_level = false
#stats_row_level = false
#stats_reset_on_server_start = true


#---------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#---------------------------------------------------------------------------

# - Statement Behavior -

#search_path = '$user,public'   # schema names
#check_function_bodies = true
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = false
#statement_timeout = 0          # 0 is disabled, in milliseconds

# - Locale and Formatting -

#datestyle = 'iso, mdy'
#timezone = unknown             # actually, defaults to TZ environment setting
#australian_timezones = false
#extra_float_digits = 0         # min -15, max 2
#client_encoding = sql_ascii    # actually, defaults to database encoding

# These settings are initialized by initdb -- they may be changed
lc_messages = 'en_US.UTF-8'     # locale for system error message strings
lc_monetary = 'en_US.UTF-8'     # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'      # locale for number formatting
lc_time = 'en_US.UTF-8'         # locale for time formatting

# - Other Defaults -

#explain_pretty_print = true
#dynamic_library_path = '$libdir'
#max_expr_depth = 10000         # min 10


#---------------------------------------------------------------------------
# LOCK MANAGEMENT
#---------------------------------------------------------------------------

#deadlock_timeout = 1000        # in milliseconds
#max_locks_per_transaction = 64 # min 10, ~260*max_connections bytes each


#---------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#---------------------------------------------------------------------------

# - Previous Postgres Versions -

#add_missing_from = true
#regex_flavor = advanced        # advanced, extended, or basic
#sql_inheritance = true

# - Other Platforms & Clients -

#transform_null_equals = false
Chris Cheston <ccheston@gmail.com> writes:

> shared_buffers = 40             # min 16, at least max_connections*2, 8KB each

This is ridiculously low for any kind of production server. Try something like 5000-10000 for a start.

-Doug
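For scale: shared_buffers is counted in 8 KB pages, so Doug's range works out to roughly 40-80 MB of shared memory. A minimal sketch of the postgresql.conf change (the exact value is illustrative, not a one-size-fits-all recommendation):

  shared_buffers = 5000           # 5000 * 8 KB pages = ~40 MB of buffer cache

Note that the postmaster must be fully restarted, not just sent a SIGHUP, for a shared_buffers change to take effect.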
Not to mention upping your effective_cache_size.
Doug McNaught wrote:
> Chris Cheston <ccheston@gmail.com> writes:
>
>> shared_buffers = 40             # min 16, at least max_connections*2, 8KB each
>
> This is ridiculously low for any kind of production server. Try
> something like 5000-10000 for a start.
>
> -Doug
-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL
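effective_cache_size is a different kind of knob from shared_buffers: it allocates nothing itself, it only tells the planner how much of the kernel's file-system cache it can expect data to sit in, which makes index scans look cheaper. A sketch based on the "cached" figure (232500k) in the top output earlier in the thread, assuming roughly that much stays available:

  effective_cache_size = 29000    # ~232 MB of OS cache / 8 KB pages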
Hi all,

I upped effective_cache_size to 16000 and I could only up the shared_buffers to 3000. Anything more and postgres would not start. Postmaster is still using lots of CPU. pg_stat_activity shows that only one query is happening at a time, so the requests are probably queueing on this one thread.

Is this the right way to go? Any other suggestions for me to figure out why postmaster is using so much CPU?

Thanks in advance,

Chris

numnet=# select * from pg_stat_activity;
 datid | datname | procpid | usesysid | usename | current_query | query_start
-------+---------+---------+----------+---------+---------------+--------------------------------
 17144 | numnet  |   26120 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.02042-04
 17144 | numnet  |   26121 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.026025-04
 17144 | numnet  |   26122 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.030917-04
 17144 | numnet  |   26123 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.036266-04
 17144 | numnet  |   26124 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.041551-04
 17144 | numnet  |   26125 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.046449-04
 17144 | numnet  |   26126 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.051666-04
 17144 | numnet  |   26127 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.057398-04
 17144 | numnet  |   26128 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.06225-04
 17144 | numnet  |   26129 |      103 | numnet  | SELECT id,name,number,systemid,pin,last,to,from,lastest,start,end,continue,type,status,duration FROM logs WHERE (((from= 'me') and (to= 'you')) and (serverip= '23.6.6.33
 17144 | numnet  |   26147 |      103 | numnet  | <IDLE>        | 2004-06-26 18:03:46.175789-04
(11 rows)

----- Original Message -----
From: Joshua D. Drake <jd@commandprompt.com>
Date: Sat, 26 Jun 2004 07:11:49 -0700
Subject: Re: [PERFORM] postgres 7.4 at 100%
To: Doug McNaught <doug@mcnaught.org>
Cc: pgsql-performance@postgresql.org

Hello,

Not to mention upping your effective_cache_size.

Doug McNaught wrote:
> Chris Cheston <ccheston@gmail.com> writes:
>
>> shared_buffers = 40             # min 16, at least max_connections*2, 8KB each
>
> This is ridiculously low for any kind of production server. Try
> something like 5000-10000 for a start.
>
> -Doug

-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL
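When scanning pg_stat_activity by hand gets tedious, the idle rows can be filtered out. A small sketch (this relies on the stats collector recording command strings; '<IDLE>' is how 7.4 reports a connection with no query in flight):

numnet=# SELECT procpid, query_start, current_query
numnet-# FROM pg_stat_activity
numnet-# WHERE current_query <> '<IDLE>';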
> I upped effective_cache_size to 16000 and I could only up the
> shared_buffers to 3000. Anything more and postgres would not start.

You need to greatly increase the shared memory max setting on your machine so that you can use, at the very least, 10000 shared buffers.

Chris
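On Linux, the setting that stops the postmaster from starting with a larger shared_buffers is normally kernel.shmmax, the limit on a single shared memory segment. A minimal sketch of raising it on a 2.4-series kernel like the one in this thread (128 MB is an illustrative value, sized to comfortably fit 10000+ buffers):

  # check the current limit, in bytes
  cat /proc/sys/kernel/shmmax

  # raise it for the running kernel
  echo 134217728 > /proc/sys/kernel/shmmax

  # make the change survive a reboot
  echo "kernel.shmmax = 134217728" >> /etc/sysctl.conf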
On Sun, 2004-06-27 at 00:33, Christopher Kings-Lynne wrote:
> > I upped effective_cache_size to 16000 and I could only up the
> > shared_buffers to 3000. Anything more and postgres would not start.
>
> You need to greatly increase the shared memory max setting on your
> machine so that you can use, at the very least, 10000 shared buffers.

Doug said the same, yet the PG tuning article recommends not making this too large, as it is just temporarily used by the query queue or so. (I guess the system would benefit from using more memory for file system cache instead.)

So who is correct? The tuning article, or the big-honking-shared-mem proponents?

FWIW: I have a box with 512 MB RAM and see no difference between 4096 and 32768 shared buffers...

Regards,
Frank
Frank Knobbe <frank@knobbe.us> writes:
> On Sun, 2004-06-27 at 00:33, Christopher Kings-Lynne wrote:
>>> I upped effective_cache_size to 16000 and I could only up the
>>> shared_buffers to 3000. Anything more and postgres would not start.
>> You need to greatly increase the shared memory max setting on your
>> machine so that you can use, at the very least, 10000 shared buffers.

> Doug said the same, yet the PG tuning article recommends not making this
> too large, as it is just temporarily used by the query queue or so.

The original report was that the guy had it set to 40 (!?), which is clearly far below the minimum reasonable value. But I'd not expect a huge difference between 3000 and 10000 --- in my experience, 1000 is enough to get you over the "knee" of the performance curve and into the domain of marginal improvements. So while he surely should not go back to 40, it seems there's another factor involved here that we've not recognized yet.

			regards, tom lane
Tom,

> So while he surely should not go back to 40, it seems there's another
> factor involved here that we've not recognized yet.

I'd agree. Actually, the first thing I'd do, were it my machine, is reboot it and run memtest86 overnight. CPU thrashing like that may indicate bad RAM.

If the RAM checks out, I'd like to see the EXPLAIN ANALYZE for some of the longest-running queries, and for those INSERTs.

Also, is the new machine running software RAID?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco
Frank,

> Doug said the same, yet the PG tuning article recommends not making this
> too large, as it is just temporarily used by the query queue or so. (I
> guess the system would benefit from using more memory for file system cache.)

As one of the writers of that article, let me point out:

" -- Medium size data set and 256-512MB available RAM: 16-32MB (2048-4096)
  -- Large dataset and lots of available RAM (1-4GB): 64-256MB (8192-32768) "

While this is probably a little conservative, it's still way bigger than 40.

I would disagree with the folks who suggest 32,000 as a setting for you. On Linux, that's a bit too large; I've never seen performance improvements with shared_buffers greater than 18% of *available* RAM.

-- 
-Josh Berkus
Aglio Database Solutions
San Francisco
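Worked through for the box in this thread, as a rough sketch (the "available" figure is an assumption, since it depends on how much myapp and the kernel leave free): with 512 MB of RAM and perhaps 350 MB genuinely available, 18% comes to about 63 MB, or 63 MB / 8 KB = roughly 8000 buffers as a ceiling. The 2048-4096 range quoted above sits comfortably below that.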
On Mon, 2004-06-28 at 14:40, Josh Berkus wrote:
> As one of the writers of that article, let me point out:
>
> " -- Medium size data set and 256-512MB available RAM: 16-32MB (2048-4096)
>   -- Large dataset and lots of available RAM (1-4GB): 64-256MB (8192-32768) "
>
> While this is probably a little conservative, it's still way bigger than 40.

I agree that 40 is a bit weak :)  Chris' system has only 512 MB of RAM though. I thought the quick response "...for any kind of production server, try 5000-10000..." -- without considering how much memory he has -- was a bit... uhm... eager.

Besides, if the shared memory is used to queue client requests, shouldn't that memory be sized according to workload (i.e. number of clients, transactions per second, etc.) instead of just taking a percentage of the total amount of memory? If there are only a few connections, why waste shared memory on that when the memory could be better used as file system cache to prevent PG from going to the disk so often?

I understand tuning PG is almost an art form, yet it should be based on actual usage patterns, not just system dimensions, don't you agree?

Regards,
Frank
Frank,

> I understand tuning PG is almost an art form, yet it should be based on
> actual usage patterns, not just system dimensions, don't you agree?

Well, it's both. It's more that available RAM determines your *upper* limit; that is, on Linux you don't really want more than 20% allocated to shared_buffers, or you'll be taking memory away from the kernel. Within that limit, data size, query complexity and volume, and whether or not you have long-running procedures tell you whether you're at the low end or the high end.

To further complicate things, these calculations are all going to change with 7.5.

-- 
-Josh Berkus
Aglio Database Solutions
San Francisco
Wow, this simple query is taking 676.24 ms to execute! It only takes 18 ms on our other machine. This table has 150,000 rows. Is this normal?

No, the machine is not running software RAID. Anyone have any ideas as to what I should do next to debug this? I'm really wondering if the Linux OS running SMP is the cause.

Thanks,
Chris

live=# explain analyze SELECT id FROM calllogs WHERE from = 'you';
                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Seq Scan on calllogs  (cost=0.00..136.11 rows=24 width=4) (actual time=0.30..574.72 rows=143485 loops=1)
   Filter: (from = 'you'::character varying)
 Total runtime: 676.24 msec
(3 rows)

EXPLAIN ANALYZE for inserts is fast too.

On Mon, 28 Jun 2004 09:47:59 -0700, Josh Berkus <josh@agliodbs.com> wrote:
>
> Tom,
>
> > So while he surely should not go back to 40, it seems there's another
> > factor involved here that we've not recognized yet.
>
> I'd agree. Actually, the first thing I'd do, were it my machine, is reboot it
> and run memtest86 overnight. CPU thrashing like that may indicate bad RAM.
>
> If the RAM checks out, I'd like to see the EXPLAIN ANALYZE for some of the
> longest-running queries, and for those INSERTs.
>
> Also, is the new machine running software RAID?
>
> --
> Josh Berkus
> Aglio Database Solutions
> San Francisco
> live=# explain analyze SELECT id FROM calllogs WHERE from = 'you';
>                                                 QUERY PLAN
> ----------------------------------------------------------------------------------------------------------
>  Seq Scan on calllogs  (cost=0.00..136.11 rows=24 width=4) (actual
> time=0.30..574.72 rows=143485 loops=1)
>    Filter: (from = 'you'::character varying)
>  Total runtime: 676.24 msec
> (3 rows)

Have you got an index on calllogs(from)?

Have you vacuumed and analyzed that table recently?

Chris
OK, I just vacuumed it and it's taking slightly longer now to execute (only about 8 ms longer, to around 701 ms).

Not using indexes for calllogs(from)... should I? The values for calllogs(from) are not unique (sorry if I'm misunderstanding your point).

Thanks,

Chris

On Tue, 29 Jun 2004 16:21:01 +0800, Christopher Kings-Lynne <chriskl@familyhealth.com.au> wrote:
>
> > live=# explain analyze SELECT id FROM calllogs WHERE from = 'you';
> >                                                 QUERY PLAN
> > ----------------------------------------------------------------------------------------------------------
> >  Seq Scan on calllogs  (cost=0.00..136.11 rows=24 width=4) (actual
> > time=0.30..574.72 rows=143485 loops=1)
> >    Filter: (from = 'you'::character varying)
> >  Total runtime: 676.24 msec
> > (3 rows)
>
> Have you got an index on calllogs(from)?
>
> Have you vacuumed and analyzed that table recently?
>
> Chris
On Tue, Jun 29, 2004 at 01:37:30 -0700, Chris Cheston <ccheston@gmail.com> wrote:
> OK, I just vacuumed it and it's taking slightly longer now to execute
> (only about 8 ms longer, to around 701 ms).
>
> Not using indexes for calllogs(from)... should I? The values for
> calllogs(from) are not unique (sorry if I'm misunderstanding your
> point).

If you are hoping for some other plan than a sequential scan through all of the records, you are going to need an index. You can have an index on a column (or function) that isn't unique for all rows.
Chris Cheston <ccheston@gmail.com> writes:
> Wow, this simple query is taking 676.24 ms to execute! It only takes
> 18 ms on our other machine.
> This table has 150,000 rows. Is this normal?

> live=# explain analyze SELECT id FROM calllogs WHERE from = 'you';
>                                                 QUERY PLAN
> ----------------------------------------------------------------------------------------------------------
>  Seq Scan on calllogs  (cost=0.00..136.11 rows=24 width=4) (actual
> time=0.30..574.72 rows=143485 loops=1)
>    Filter: (from = 'you'::character varying)
>  Total runtime: 676.24 msec
> (3 rows)

So the query is pulling 140K+ rows out of a table with 150K entries? No chance that an index will help for that. You're fortunate that the thing did not try to use an index, though, because it thinks there are only 24 rows matching 'you', which is one of the more spectacular statistical failures I've seen lately. I take it you haven't ANALYZEd this table in a long time?

It is hard to believe that your other machine can pull 140K+ rows in 18 msec, though. Are you sure the table contents are the same in both cases?

If they are, the only reason I can think of for the discrepancy is a large amount of dead space in this copy of the table. What does VACUUM VERBOSE show for it, and how does that compare to what you see on the other machine? Try a CLUSTER or VACUUM FULL to see if you can shrink the table's physical size (number of pages).

			regards, tom lane
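Tom's suggestions map onto a handful of commands. A sketch using the table name from the thread (the index name in the CLUSTER line is hypothetical, since 7.4's CLUSTER needs an existing index to order the table by):

live=# ANALYZE calllogs;                       -- refresh the planner's row estimates
live=# VACUUM VERBOSE calllogs;                -- reports page and dead-tuple counts to compare across machines
live=# VACUUM FULL calllogs;                   -- compacts the table in place; takes an exclusive lock
live=# CLUSTER calllogs_from_idx ON calllogs;  -- alternative: rewrites the table in index order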
Is the from field nullable? If not, try "create index calllogs_from on calllogs ( from );" and then do an explain analyze of your query.

Gavin

Chris Cheston wrote:

> OK, I just vacuumed it and it's taking slightly longer now to execute
> (only about 8 ms longer, to around 701 ms).
>
> Not using indexes for calllogs(from)... should I? The values for
> calllogs(from) are not unique (sorry if I'm misunderstanding your
> point).
>
> Thanks,
>
> Chris
>
> On Tue, 29 Jun 2004 16:21:01 +0800, Christopher Kings-Lynne
> <chriskl@familyhealth.com.au> wrote:
>
>>> live=# explain analyze SELECT id FROM calllogs WHERE from = 'you';
>>>                                                 QUERY PLAN
>>> ----------------------------------------------------------------------------------------------------------
>>>  Seq Scan on calllogs  (cost=0.00..136.11 rows=24 width=4) (actual
>>> time=0.30..574.72 rows=143485 loops=1)
>>>    Filter: (from = 'you'::character varying)
>>>  Total runtime: 676.24 msec
>>> (3 rows)
>>
>> Have you got an index on calllogs(from)?
>>
>> Have you vacuumed and analyzed that table recently?
>>
>> Chris
Oh my, creating an index has absolutely reduced the time it takes to query, from around 700 ms to less than 1 ms!

Thanks so much for all your help. You've saved me!

One question:

Why would I or would I not create multiple indexes in a table? I created another index in the same table and it's improved performance even more.

Thanks,

Chris

On Tue, 29 Jun 2004 09:03:24 -0700, Gavin M. Roy <gmr@ehpg.net> wrote:
>
> Is the from field nullable? If not, try "create index calllogs_from on
> calllogs ( from );" and then do an explain analyze of your query.
>
> Gavin
>
> Chris Cheston wrote:
>
> > OK, I just vacuumed it and it's taking slightly longer now to execute
> > (only about 8 ms longer, to around 701 ms).
> >
> > Not using indexes for calllogs(from)... should I? The values for
> > calllogs(from) are not unique (sorry if I'm misunderstanding your
> > point).
> >
> > Thanks,
> >
> > Chris
> >
> > On Tue, 29 Jun 2004 16:21:01 +0800, Christopher Kings-Lynne
> > <chriskl@familyhealth.com.au> wrote:
> >
> >>> live=# explain analyze SELECT id FROM calllogs WHERE from = 'you';
> >>>                                                 QUERY PLAN
> >>> ----------------------------------------------------------------------------------------------------------
> >>>  Seq Scan on calllogs  (cost=0.00..136.11 rows=24 width=4) (actual
> >>> time=0.30..574.72 rows=143485 loops=1)
> >>>    Filter: (from = 'you'::character varying)
> >>>  Total runtime: 676.24 msec
> >>> (3 rows)
> >>
> >> Have you got an index on calllogs(from)?
> >>
> >> Have you vacuumed and analyzed that table recently?
> >>
> >> Chris
> Why would I or would I not create multiple indexes in a table? I
> created another index in the same table and it's improved performance
> even more.

You create indexes when you need indexes. Indexes are most helpful when they match the WHERE clause of your selects.

So, if you commonly do one query that selects on one column, and another query that selects on two other columns - then create one index on the first column and another index over the second two columns.

Chris
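A sketch of that advice against the thread's schema (the second, two-column index is hypothetical; also note that "from" and "to" are SQL reserved words, so as column names they need double quotes in plain SQL):

create index calllogs_from_idx on calllogs ("from");
create index calllogs_to_systemid_idx on calllogs ("to", systemid);

-- a query filtering on both columns of the second index can then use it:
explain analyze SELECT id FROM calllogs WHERE "to" = 'you' AND systemid = '5';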
I see - thanks very much. I created an index for column 'oid', which I was using in a WHERE. So, rule of thumb: create an index for column(s) which I use in WHERE clauses.

Thanks,

Chris

On Wed, 30 Jun 2004 15:30:52 +0800, Christopher Kings-Lynne <chriskl@familyhealth.com.au> wrote:
>
> > Why would I or would I not create multiple indexes in a table? I
> > created another index in the same table and it's improved performance
> > even more.
>
> You create indexes when you need indexes. Indexes are most helpful when
> they match the WHERE clause of your selects.
>
> So, if you commonly do one query that selects on one column, and another
> query that selects on two other columns - then create one index on the
> first column and another index over the second two columns.
>
> Chris
> I see - thanks very much. I created an index for column 'oid', which I
> was using in a WHERE. So, rule of thumb: create an index for column(s)
> which I use in WHERE clauses.

So to speak. They can also sometimes assist in sorting.

The OID column is special. I suggest adding a unique index to that column. In postgresql it is _possible_ for the oid counter to wrap around, hence if you rely on oids (not necessarily a good idea), it's best to put a unique index on the oid column.

I _strongly_ suggest that you read this:
http://www.postgresql.org/docs/7.4/static/indexes.html

Chris
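A minimal sketch of that suggestion for the table in this thread:

live=# CREATE UNIQUE INDEX calllogs_oid_idx ON calllogs (oid);

With the unique index in place, an INSERT that tries to reuse an already-present oid after the counter wraps around fails with a uniqueness violation, instead of silently producing two rows with the same oid.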
Creating indexes on a table affects insert performance, since every index on the table has to be updated for each insert. From a query standpoint, indexes are a godsend in most cases.
Duane
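One practical consequence of that point, as a sketch with an invented scratch table and file path: for bulk loads it is usually cheaper to build indexes after the data is in, rather than maintaining them row by row during the load.

CREATE TABLE scratch_logs (caller varchar, callee varchar);

-- load first, with no indexes to maintain...
COPY scratch_logs FROM '/tmp/logs.dat';

-- ...then build each index once over the finished table
CREATE INDEX scratch_logs_caller_idx ON scratch_logs (caller);
CREATE INDEX scratch_logs_callee_idx ON scratch_logs (callee);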