Thread: postgres 7.4 at 100%

postgres 7.4 at 100%

From
Chris Cheston
Date:
Hi all,

I was running Postgres 7.3 and it was running at about 15% with my
application. On Postgres 7.4 on another box, it was running at 100%...
My settings are default on both boxes I think.

There are only about 20 inserts per second, which is really low.
Anyone have any ideas as to something I have to do to Postgres 7.4 to
change it from the default so that it's not eating up all my CPU?  I
have no clue how to debug this...

Help please!!!!  Should I downgrade to 7.3 to see what happens?  BTW
I'm running Postgres 7.3.2 on:

Linux box 2.4.25-040218 #1 SMP Wed Feb 18 17:59:29 CET 2004 i686 i686
i386 GNU/Linux

on a single processor P4 1.4GHz, 512 MB RAM.  Does the SMP kernel do
something with the single processor CPU, or should this not affect
psql?

Thanks in advance!!!
Chris

Re: postgres 7.4 at 100%

From
Richard Huxton
Date:
Chris Cheston wrote:
> Hi all,
>
> I was running Postgres 7.3 and it was running at about 15% with my
> application. On Postgres 7.4 on another box, it was running at 100%...

People are going to need more information. Are you talking about
CPU/disk IO/memory?

> My settings are default on both boxes I think.

Doubtful - PG crawls with the default settings. Check your old
postgresql.conf file and compare. Also, read the tuning article at:
  http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php

> There are only about 20 inserts per second, which is really low.
> Anyone have any ideas as to something I have to do to Postgres 7.4 to
> change it from the default so that it's not eating up all my CPU?  I
> have no clue how to debug this...

What does top/vmstat/iostat show during heavy usage?

> Help please!!!!  Should I downgrade to 7.3 to see what happens?  BTW
> I'm running Postgres 7.3.2 on:
>
> Linux box 2.4.25-040218 #1 SMP Wed Feb 18 17:59:29 CET 2004 i686 i686
> i386 GNU/Linux
>
> on a single processor P4 1.4GHz, 512 MB RAM.  Does the SMP kernel do
> something with the single processor CPU, or should this not affect
> psql?

Don't know about the SMP thing. Unlikely that one of the big
distributions would mess that up much though.

--
   Richard Huxton
   Archonet Ltd

Re: postgres 7.4 at 100%

From
Chris Cheston
Date:
Hi Richard,
Thanks so much for replying.  Please see below.
Thanks in advance for any advice,
Chris

> People are going to need more information. Are you talking about
> CPU/disk IO/memory?

## CPU is at 100%.

>
> > My settings are default on both boxes I think.
>
> Doubtful - PG crawls with the default settings. Check your old
> postgresql.conf file and compare. Also, read the tuning article at:
>   http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
>
> > There are only about 20 inserts per second, which is really low.
> > Anyone have any ideas as to something I have to do to Postgres 7.4 to
> > change it from the default so that it's not eating up all my CPU?  I
> > have no clue how to debug this...
>
> What does top/vmstat/iostat show during heavy usage?
>
TOP:

137 processes: 135 sleeping, 2 running, 0 zombie, 0 stopped
CPU states:  81.4% user  17.9% system   0.0% nice   0.0% iowait   0.5% idle
Mem:   507036k av,  500244k used,    6792k free,       0k shrd,   68024k buff
       133072k active,             277368k inactive
Swap:  787176k av,   98924k used,  688252k free                  232500k cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
20734 postgres  15   0  4028 4028  3244 R    83.6  0.7 348:03   0 postmaster
21249 numnet     9   0 78060  76M  8440 S     7.5 15.3  32:51   0 myapp
18478 user     12   0  1224 1224   884 S     5.7  0.2  57:01   0 top

[root@live root]# vmstat
   procs                      memory      swap          io     system      cpu
 r  b  w   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id
 1  0  0  98924   5980  68024 233528    4    3     6    22   28    10 13  6 24

iostat:
Time: 03:18:18 AM
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
dev3-0           11.00         0.80       142.40          4        712

Time: 03:18:23 AM
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
dev3-0           10.60         0.00       143.20          0        716

or blocks:

Time: 03:20:58 AM
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
dev3-0           25.40         3.20       756.80         16       3784

Time: 03:21:03 AM
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
dev3-0           30.20         3.20       841.60         16       4208

extended:

Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
/dev/hda3    0.00  79.20  0.60 30.60    4.80  878.40     2.40   439.20    28.31     0.36    1.15   0.45   1.40

avg-cpu:  %user   %nice    %sys   %idle
          31.00    0.00   18.20   50.80

Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
/dev/hda3    0.00  45.80  0.00 10.80    0.00  452.80     0.00   226.40    41.93     0.08    0.74   0.37   0.40

avg-cpu:  %user   %nice    %sys   %idle
          83.20    0.00   16.60    0.20

Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
/dev/hda     0.00  28.20  0.00 10.10    0.00  315.20     0.00   157.60    31.21 4294917.33    0.30  99.01 100.00
/dev/hda1    0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
/dev/hda2    0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
/dev/hda3    0.00  28.20  0.00 10.10    0.00  315.20     0.00   157.60    31.21     0.03    0.30   0.20   0.20

My conf file:

[root@live setup]# cat /var/lib/pgsql/data/postgresql.conf
# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
# "pg_ctl reload".


#---------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#---------------------------------------------------------------------------

# - Connection Settings -

tcpip_socket = true
max_connections = 20
        # note: increasing max_connections costs about 500 bytes of shared
        # memory per connection slot, in addition to costs from shared_buffers
        # and max_locks_per_transaction.
#superuser_reserved_connections = 2
#port = 5432
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777 # octal
#virtual_host = ''              # what interface to listen on; defaults to any
#rendezvous_name = ''           # defaults to the computer name

# - Security & Authentication -

#authentication_timeout = 60    # 1-600, in seconds
#ssl = false
#password_encryption = true
#krb_server_keyfile = ''
#db_user_namespace = false


#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------

# - Memory -

shared_buffers = 40             # min 16, at least max_connections*2, 8KB each
#sort_mem = 1024                # min 64, size in KB
#vacuum_mem = 8192              # min 1024, size in KB

# - Free Space Map -

#max_fsm_pages = 20000          # min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000       # min 100, ~50 bytes each

# - Kernel Resource Usage -

#max_files_per_process = 1000   # min 25
#preload_libraries = ''


#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -

#fsync = true                   # turns forced synchronization on or off
#wal_sync_method = fsync        # the default varies across platforms:
                                # fsync, fdatasync, open_sync, or open_datasync
#wal_buffers = 8                # min 4, 8KB each

# - Checkpoints -

#checkpoint_segments = 3        # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 300       # range 30-3600, in seconds
#checkpoint_warning = 30        # 0 is off, in seconds
#commit_delay = 0               # range 0-100000, in microseconds
#commit_siblings = 5            # range 1-1000


#---------------------------------------------------------------------------
# QUERY TUNING
#---------------------------------------------------------------------------

# - Planner Method Enabling -

#enable_hashagg = true
#enable_hashjoin = true
#enable_indexscan = true
#enable_mergejoin = true
#enable_nestloop = true
#enable_seqscan = true
#enable_sort = true
#enable_tidscan = true

# - Planner Cost Constants -

#effective_cache_size = 1000    # typically 8KB each
#random_page_cost = 4           # units are one sequential page fetch cost
#cpu_tuple_cost = 0.01          # (same)
#cpu_index_tuple_cost = 0.001   # (same)
#cpu_operator_cost = 0.0025     # (same)

# - Genetic Query Optimizer -

#geqo = true
#geqo_threshold = 11
#geqo_effort = 1
#geqo_generations = 0
#geqo_pool_size = 0             # default based on tables in statement,
                                # range 128-1024
#geqo_selection_bias = 2.0      # range 1.5-2.0

# - Other Planner Options -

#default_statistics_target = 10 # range 1-1000
#from_collapse_limit = 8
#join_collapse_limit = 8        # 1 disables collapsing of explicit JOINs


#---------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#---------------------------------------------------------------------------

# - Syslog -

syslog = 0                      # range 0-2; 0=stdout; 1=both; 2=syslog
syslog_facility = 'LOCAL0'
syslog_ident = 'postgres'

# - When to Log -

client_min_messages = warning   # Values, in order of decreasing detail:
                                #   debug5, debug4, debug3, debug2, debug1,
                                #   log, info, notice, warning, error

log_min_messages = log          # Values, in order of decreasing detail:
                                #   debug5, debug4, debug3, debug2, debug1,
                                #   info, notice, warning, error, log, fatal,
                                #   panic

log_error_verbosity = default   # terse, default, or verbose messages

log_min_error_statement = warning # Values in order of increasing severity:
                                 #   debug5, debug4, debug3, debug2, debug1,
                                 #   info, notice, warning, error, panic(off)

#log_min_duration_statement = -1 # Log all statements whose
                                 # execution time exceeds the value, in
                                 # milliseconds.  Zero prints all queries.
                                 # Minus-one disables.

#silent_mode = false             # DO NOT USE without Syslog!

# - What to Log -

#debug_print_parse = false
#debug_print_rewritten = false
#debug_print_plan = false
#debug_pretty_print = false
#log_connections = false
#log_duration = false
#log_pid = false
#log_statement = false
#log_timestamp = false
#log_hostname = false
#log_source_port = false


#---------------------------------------------------------------------------
# RUNTIME STATISTICS
#---------------------------------------------------------------------------

# - Statistics Monitoring -

#log_parser_stats = false
#log_planner_stats = false
#log_executor_stats = false
#log_statement_stats = false

# - Query/Index Statistics Collector -

#stats_start_collector = true
#stats_command_string = false
#stats_block_level = false
#stats_row_level = false
#stats_reset_on_server_start = true


#---------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#---------------------------------------------------------------------------

# - Statement Behavior -

#search_path = '$user,public'   # schema names
#check_function_bodies = true
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = false
#statement_timeout = 0          # 0 is disabled, in milliseconds

# - Locale and Formatting -

#datestyle = 'iso, mdy'
#timezone = unknown             # actually, defaults to TZ environment setting
#australian_timezones = false
#extra_float_digits = 0         # min -15, max 2
#client_encoding = sql_ascii    # actually, defaults to database encoding

# These settings are initialized by initdb -- they may be changed
lc_messages = 'en_US.UTF-8'             # locale for system error message strings
lc_monetary = 'en_US.UTF-8'             # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'              # locale for number formatting
lc_time = 'en_US.UTF-8'                 # locale for time formatting

# - Other Defaults -

#explain_pretty_print = true
#dynamic_library_path = '$libdir'
#max_expr_depth = 10000         # min 10


#---------------------------------------------------------------------------
# LOCK MANAGEMENT
#---------------------------------------------------------------------------

#deadlock_timeout = 1000        # in milliseconds
#max_locks_per_transaction = 64 # min 10, ~260*max_connections bytes each


#---------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#---------------------------------------------------------------------------

# - Previous Postgres Versions -

#add_missing_from = true
#regex_flavor = advanced        # advanced, extended, or basic
#sql_inheritance = true

# - Other Platforms & Clients -

#transform_null_equals = false

Re: postgres 7.4 at 100%

From
Doug McNaught
Date:
Chris Cheston <ccheston@gmail.com> writes:

> shared_buffers = 40             # min 16, at least max_connections*2, 8KB each

This is ridiculously low for any kind of production server.  Try
something like 5000-10000 for a start.

-Doug

Re: postgres 7.4 at 100%

From
"Joshua D. Drake"
Date:
Hello,

Not to mention upping your effective_cache.

Doug McNaught wrote:
> Chris Cheston <ccheston@gmail.com> writes:
>> shared_buffers = 40             # min 16, at least max_connections*2, 8KB each
> This is ridiculously low for any kind of production server.  Try
> something like 5000-10000 for a start.
>
> -Doug




-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL
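
[For concreteness, the change Doug and Joshua are describing would look
something like this in postgresql.conf; the values are illustrative for a
512 MB box, not figures given in the thread:]

shared_buffers = 5000           # ~40 MB at 8 kB per buffer; needs a postmaster restart
effective_cache_size = 16384    # ~128 MB; a planner estimate of the OS file
                                # cache, not an allocation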

Re: postgres 7.4 at 100%

From
Chris Cheston
Date:
Hi all,

I upped effective_cache to 16000 KB and I could only up the
shared_buffers to 3000.  Anything more and postgres would not start.

Postmaster is still using lots of CPU.  pg_stat_activity shows only
one query happening at a time, so the requests are probably queueing
on this one connection. Is this the right way to go?

Any other suggestions for me to figure out why Postmaster is using so much CPU?

Thanks in advance,
Chris

numnet=# select * from pg_stat_activity;
 datid | datname | procpid | usesysid | usename | current_query | query_start
-------+---------+---------+----------+---------+---------------+-------------------------------
 17144 | numnet  |   26120 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.02042-04
 17144 | numnet  |   26121 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.026025-04
 17144 | numnet  |   26122 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.030917-04
 17144 | numnet  |   26123 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.036266-04
 17144 | numnet  |   26124 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.041551-04
 17144 | numnet  |   26125 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.046449-04
 17144 | numnet  |   26126 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.051666-04
 17144 | numnet  |   26127 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.057398-04
 17144 | numnet  |   26128 |      103 | numnet  | <IDLE>        | 2004-06-26 18:01:24.06225-04
 17144 | numnet  |   26129 |      103 | numnet  | SELECT id,name,number,systemid,pin,last,to,from,lastest,start,end,continue,type,status,duration FROM logs WHERE (((from= 'me') and (to= 'you')) and (serverip= '23.6.6.33 |
 17144 | numnet  |   26147 |      103 | numnet  | <IDLE>        | 2004-06-26 18:03:46.175789-04
(11 rows)





Re: postgres 7.4 at 100%

From
Christopher Kings-Lynne
Date:
> I upped effective_cache to 16000 KB and I could only up the
> shared_buffers to 3000.  Anything more and postgres would not start.

You need to greatly increase the shared memory max setting on your
machine so that you can use, at the very least, 10000 shared buffers.

Chris
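
[On a 2.4-series Linux kernel that ceiling is kernel.shmmax, which commonly
defaults to 32 MB. A rough sketch of raising it; the 128 MB figure is
illustrative, not a value from the thread:]

# check the current limit, in bytes (33554432 = 32 MB is a common default)
cat /proc/sys/kernel/shmmax
# raise the limit to 128 MB for the running kernel
sysctl -w kernel.shmmax=134217728
# persist the setting across reboots
echo "kernel.shmmax = 134217728" >> /etc/sysctl.conf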


Re: postgres 7.4 at 100%

From
Frank Knobbe
Date:
On Sun, 2004-06-27 at 00:33, Christopher Kings-Lynne wrote:
> > I upped effective_cache to 16000 KB and I could only up the
> > shared_buffers to 3000.  Anything more and postgres would not start.
>
> You need to greatly increase the shared memory max setting on your
> machine so that you can use, at the very least, 10000 shared buffers.


Doug said the same, yet the PG Tuning article recommends not making this
too large, as it is just used temporarily by the query queue or so. (I
guess the system would benefit from using more memory for file system cache.)

So who is correct? The tuning article or big-honking-shared-mem
proponents?

FWIW: I have a box with 512 MB RAM and see no difference between 4096 and
32758 shared buffers...

Regards,
Frank



Re: postgres 7.4 at 100%

From
Tom Lane
Date:
Frank Knobbe <frank@knobbe.us> writes:
> On Sun, 2004-06-27 at 00:33, Christopher Kings-Lynne wrote:
>>> I upped effective_cache to 16000 KB and I could only up the
>>> shared_buffers to 3000.  Anything more and postgres would not start.

>> You need to greatly increase the shared memory max setting on your
>> machine so that you can use, at the very least, 10000 shared buffers.

> Doug said the same, yet the PG Tuning article recommends not making this
> too large, as it is just used temporarily by the query queue or so.

The original report was that the guy had it set to 40 (!?), which is
clearly far below the minimum reasonable value.  But I'd not expect a
huge difference between 3000 and 10000 --- in my experience, 1000 is
enough to get you over the "knee" of the performance curve and into
the domain of marginal improvements.

So while he surely should not go back to 40, it seems there's another
factor involved here that we've not recognized yet.

            regards, tom lane

Re: postgres 7.4 at 100%

From
Josh Berkus
Date:
Tom,

> So while he surely should not go back to 40, it seems there's another
> factor involved here that we've not recognized yet.

I'd agree.  Actually, the first thing I'd do, were it my machine, is reboot it
and run memtest86 overnight.    CPU thrashing like that may indicate bad RAM.

If the RAM checks out, I'd like to see the EXPLAIN ANALYZE for some of the
longest-running queries, and for those INSERTS.

Also, is the new machine running Software RAID?

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: postgres 7.4 at 100%

From
Josh Berkus
Date:
Frank,

> Doug said the same, yet the PG Tuning article recommends not making this
> too large, as it is just used temporarily by the query queue or so. (I
> guess the system would benefit from using more memory for file system cache.)

As one of the writers of that article, let me point out:

" -- Medium size data set and 256-512MB available RAM: 16-32MB (2048-4096)
-- Large dataset and lots of available RAM (1-4GB): 64-256MB (8192-32768) "

While this is probably a little conservative, it's still way bigger than 40.

I would disagree with the folks who suggest 32,000 as a setting for you.   On
Linux, that's a bit too large; I've never seen performance improvements with
shared_buffers greater than 18% of *available* RAM.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco
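
[Applied to the 512 MB machine in this thread, that 18% guideline works out
roughly as follows, assuming some 400 MB counts as "available" after the OS
and the application take their share; the numbers are only illustrative:]

18% of ~400 MB available RAM  =  ~72 MB
~72 MB / 8 kB per buffer      =  ~9000 shared_buffers, as an upper bound
                                 rather than a target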


Re: postgres 7.4 at 100%

From
Frank Knobbe
Date:
On Mon, 2004-06-28 at 14:40, Josh Berkus wrote:
> As one of the writers of that article, let me point out:
>
> " -- Medium size data set and 256-512MB available RAM: 16-32MB (2048-4096)
> -- Large dataset and lots of available RAM (1-4GB): 64-256MB (8192-32768) "
>
> While this is probably a little conservative, it's still way bigger than 40.

I agree that 40 is a bit weak :)  Chris' system has only 512 MB of RAM
though. I thought the quick response "..for any kind of production
server, try 5000-10000..." -- without considering how much memory he has
-- was a bit... uhm... eager.

Besides, if the shared memory is used to queue client requests,
shouldn't that memory be sized according to workload (i.e. number of
clients, transactions per second, etc.) instead of just taking a
percentage of the total amount of memory? If there are only a few
connections, why waste shared memory on that when the memory could be
better used as file system cache to prevent PG from going to the disk so
often?

I understand tuning PG is almost an art form, yet it should be based on
actual usage patterns, not just on system dimensions, don't you agree?

Regards,
Frank






Re: postgres 7.4 at 100%

From
Josh Berkus
Date:
Frank,

> I understand tuning PG is almost an art form, yet it should be based on
> actual usage patterns, not just on system dimensions, don't you agree?

Well, it's both.   It's more that available RAM determines your *upper* limit;
that is, on Linux, you don't really want to have more than 20% allocated to
the shared_buffers or you'll be taking memory away from the kernel.

Within that limit, data size, query complexity and volume, and whether or not
you have long-running procedures tell you whether you're at the low end or
the high end.

To further complicate things, these calculations are all going to change with
7.5.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: postgres 7.4 at 100%

From
Chris Cheston
Date:
Wow, this simple query is taking 676.24 ms to execute! It only takes
18 ms on our other machine.

This table has 150,000 rows. Is this normal?

No, the machine is not running software RAID.  Anyone have any ideas
as to what I should do next to debug this? I'm really wondering if the
Linux OS running SMP is the cause.

Thanks,
Chris

live=# explain analyze SELECT id FROM calllogs WHERE from = 'you';
                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Seq Scan on calllogs  (cost=0.00..136.11 rows=24 width=4) (actual
time=0.30..574.72 rows=143485 loops=1)
   Filter: (from = 'you'::character varying)
 Total runtime: 676.24 msec
(3 rows)

explain analyze for inserts is fast too.


On Mon, 28 Jun 2004 09:47:59 -0700, Josh Berkus <josh@agliodbs.com> wrote:
>
> Tom,
>
> > So while he surely should not go back to 40, it seems there's another
> > factor involved here that we've not recognized yet.
>
> I'd agree.  Actually, the first thing I'd do, were it my machine, is reboot it
> and run memtest86 overnight.    CPU thrashing like that may indicate bad RAM.
>
> If the RAM checks out, I'd like to see the EXPLAIN ANALYZE for some of the
> longest-running queries, and for those INSERTS.
>
> Also, is the new machine running Software RAID?
>
> --
> Josh Berkus
> Aglio Database Solutions
> San Francisco

Re: postgres 7.4 at 100%

From
Christopher Kings-Lynne
Date:
> live=# explain analyze SELECT id FROM calllogs WHERE from = 'you';
>                                                 QUERY PLAN
> ----------------------------------------------------------------------------------------------------------
>  Seq Scan on calllogs  (cost=0.00..136.11 rows=24 width=4) (actual
> time=0.30..574.72 rows=143485 loops=1)
>    Filter: (from = 'you'::character varying)
>  Total runtime: 676.24 msec
> (3 rows)

Have you got an index on calllogs(from)?

Have you vacuumed and analyzed that table recently?

Chris


Re: postgres 7.4 at 100%

From
Chris Cheston
Date:
OK, I just vacuumed it and it's taking slightly longer now to execute
(only about 8 ms longer, to around 701 ms).

Not using indexes for calllogs(from)... should I?  The values for
calllogs(from) are not unique (sorry if I'm misunderstanding your
point).

Thanks,

Chris

On Tue, 29 Jun 2004 16:21:01 +0800, Christopher Kings-Lynne
<chriskl@familyhealth.com.au> wrote:
>
> > live=# explain analyze SELECT id FROM calllogs WHERE from = 'you';
> >                                                 QUERY PLAN
> > ----------------------------------------------------------------------------------------------------------
> >  Seq Scan on calllogs  (cost=0.00..136.11 rows=24 width=4) (actual
> > time=0.30..574.72 rows=143485 loops=1)
> >    Filter: (from = 'you'::character varying)
> >  Total runtime: 676.24 msec
> > (3 rows)
>
> Have you got an index on calllogs(from)?
>
> Have you vacuumed and analyzed that table recently?
>
> Chris
>
>

Re: postgres 7.4 at 100%

From
Bruno Wolff III
Date:
On Tue, Jun 29, 2004 at 01:37:30 -0700,
  Chris Cheston <ccheston@gmail.com> wrote:
> OK, I just vacuumed it and it's taking slightly longer now to execute
> (only about 8 ms longer, to around 701 ms).
>
> Not using indexes for calllogs(from)... should I?  The values for
> calllogs(from) are not unique (sorry if I'm misunderstanding your
> point).

If you are hoping for some other plan than a sequential scan through
all of the records you are going to need an index. You can have an
index on a column (or function) that isn't unique for all rows.

Re: postgres 7.4 at 100%

From
Tom Lane
Date:
Chris Cheston <ccheston@gmail.com> writes:
> Wow, this simple query is taking 676.24 ms to execute! It only takes
> 18 ms on our other machine.

> This table has 150,000 rows. Is this normal?

> live=# explain analyze SELECT id FROM calllogs WHERE from = 'you';
>                                                 QUERY PLAN
> ----------------------------------------------------------------------------------------------------------
>  Seq Scan on calllogs  (cost=0.00..136.11 rows=24 width=4) (actual
> time=0.30..574.72 rows=143485 loops=1)
>    Filter: (from = 'you'::character varying)
>  Total runtime: 676.24 msec
> (3 rows)

So the query is pulling 140K+ rows out of a table with 150K entries?
No chance that an index will help for that.  You're fortunate that the
thing did not try to use an index though, because it thinks there are
only 24 rows matching 'you', which is one of the more spectacular
statistical failures I've seen lately.  I take it you haven't ANALYZEd
this table in a long time?

It is hard to believe that your other machine can pull 140K+ rows in
18 msec, though.  Are you sure the table contents are the same in both
cases?

If they are, the only reason I can think of for the discrepancy is a
large amount of dead space in this copy of the table.  What does VACUUM
VERBOSE show for it, and how does that compare to what you see on the
other machine?  Try a CLUSTER or VACUUM FULL to see if you can shrink
the table's physical size (number of pages).

            regards, tom lane
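
[A sketch of the maintenance commands Tom mentions, using the table name
from earlier in the thread. CLUSTER would do the same compaction but needs
an existing index to order by; VACUUM FULL takes an exclusive lock, so it
is best run in a quiet period:]

VACUUM VERBOSE calllogs;  -- reports pages used and dead-row counts
ANALYZE calllogs;         -- refreshes the planner's row-count statistics
VACUUM FULL calllogs;     -- rewrites the table to reclaim dead space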

Re: postgres 7.4 at 100%

From
"Gavin M. Roy"
Date:
Is the from field nullable?  If not, try "create index calllogs_from on
calllogs ( from );" and then do an explain analyze of your query.

Gavin
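
[One caveat on that statement: from is a reserved word in SQL, so the
column name will almost certainly need double quotes. A sketch, assuming
the column really is named from:]

CREATE INDEX calllogs_from ON calllogs ("from");  -- "from" must be double-quoted
ANALYZE calllogs;  -- update statistics so the planner costs the new index sensibly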



Re: postgres 7.4 at 100%

From
Chris Cheston
Date:
Oh my, creating an index has absolutely reduced the time it takes to
query from around 700 ms to less than 1 ms!

Thanks so much for all your help.  You've saved me!

One question:

Why would I or would I not create multiple indexes in a table? I
created another index in the same table and it's improved performance
even more.

Thanks,
Chris

On Tue, 29 Jun 2004 09:03:24 -0700, Gavin M. Roy <gmr@ehpg.net> wrote:
>
> Is the from field nullable?  If not, try "create index calllogs_from on
> calllogs ( from );" and then do an explain analyze of your query.
>
> Gavin

Re: postgres 7.4 at 100%

From
Christopher Kings-Lynne
Date:
> Why would I or would I not create multiple indexes in a table? I
> created another index in the same table and it's improved performance
> even more.

You create indexes when you need indexes.  Indexes are most helpful when
they match the WHERE clause of your selects.

So, if you commonly do one query that selects on one column, and another
query that selects on two other columns - then create one index on the
first column and another index over the other two columns.

Chris
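
[A hedged illustration of that rule; status comes from the query shown
earlier in the thread, while the two-column combination is hypothetical:]

-- one query filters on a single column: give it a single-column index
CREATE INDEX calllogs_status ON calllogs (status);
-- another query filters on two columns together: one index covering both
CREATE INDEX calllogs_from_to ON calllogs ("from", "to");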

Re: postgres 7.4 at 100%

From
Chris Cheston
Date:
I see - thanks very much. I created an index for column 'oid' which I
was using in a WHERE clause.  So, rule of thumb: create an index for the
column(s) I use in WHERE clauses.

Thanks,
Chris

On Wed, 30 Jun 2004 15:30:52 +0800, Christopher Kings-Lynne
<chriskl@familyhealth.com.au> wrote:
>
>
> > Why would I or would I not create multiple indexes in a table? I
> > created another index in the same table and it's improved performance
> > even more.
>
> You create indexes when you need indexes.  Indexes are most helpful when
> they match the WHERE clause of your selects.
>
> So, if you commonly do one query that selects on one column, and another
> query that selects on two other columns - then create one index on the
> first column and another index over the other two columns.
>
> Chris
>

Re: postgres 7.4 at 100%

From
Christopher Kings-Lynne
Date:
> I see - thanks very much. I created an index for column 'oid' which I
> was using in a WHERE clause.  So, rule of thumb: create an index for the
> column(s) I use in WHERE clauses.

So to speak.  They can also sometimes assist in sorting.  The OID column
is special.  I suggest adding a unique index to that column.  In
PostgreSQL it is _possible_ for the oid counter to wrap around, hence if
you rely on oids (not necessarily a good idea), it's best to put a
unique index on the oid column.

I _strongly_ suggest that you read this:

http://www.postgresql.org/docs/7.4/static/indexes.html

Chris
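
[A sketch of that suggestion; oid is the system column, so nothing needs
to change in the table definition:]

-- a unique index speeds oid lookups and makes a duplicate oid after
-- counter wraparound fail loudly instead of matching the wrong row
CREATE UNIQUE INDEX calllogs_oid ON calllogs (oid);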


Re: postgres 7.4 at 100%

From
Duane Lee - EGOVX
Date:

Creating indexes on a table affects insert performance depending on the number of indexes that have to be populated.  From a query standpoint, indexes are a godsend in most cases.

Duane

