Re: High CPU Usage - PostgreSQL 7.3 - Mailing list pgsql-performance

From Neil Hepworth
Subject Re: High CPU Usage - PostgreSQL 7.3
Date
Msg-id 5a6f767a0607100212p55050c79lc456013f90d2c7bc@mail.gmail.com
Whole thread Raw
In response to Re: High CPU Usage - PostgreSQL 7.3  ("Neil Hepworth" <nhepworth@gmail.com>)
Responses Re: High CPU Usage - PostgreSQL 7.3
List pgsql-performance
I should also explain that I run through these queries on multiple
tables and with some slightly different parameters for the
"consolidation" so I run through those 3 queries (or similar) 9 times
and this takes a total of about 2 hours, with high CPU usage.  And I
am running the queries from a remote Java application (using JDBC),
the client is using postgresql-8.0-311.jdbc3.jar.  The explain analyse
results I have provided below are from running via pgAdmin, not the
Java app (I did a vacuum analyse of the db before running them):


*** For the create ***:

-- Executing query:

BEGIN;
EXPLAIN ANALYZE CREATE TABLE fttemp1643 AS SELECT * FROM ftone LIMIT 0;
;
ROLLBACK;

ERROR:  parser: parse error at or near "CREATE" at character 25

Now that surprised me!  I hadn't done an explain on that query before
as it was so simple.  Perhaps not permitted for creates?  If I just
run the create:

-- Executing query:
CREATE TABLE fttemp1643 AS SELECT * FROM ftone LIMIT 0;


Query returned successfully with no result in 48 ms.



*** For the insert ***:

Subquery Scan "*SELECT*"  (cost=59690.11..62038.38 rows=23483
width=16) (actual time=16861.73..36473.12 rows=560094 loops=1)
  ->  Aggregate  (cost=59690.11..62038.38 rows=23483 width=16) (actual
time=16861.72..34243.63 rows=560094 loops=1)
        ->  Group  (cost=59690.11..61451.32 rows=234827 width=16)
(actual time=16861.62..20920.12 rows=709461 loops=1)
              ->  Sort  (cost=59690.11..60277.18 rows=234827 width=16)
(actual time=16861.62..18081.07 rows=709461 loops=1)
                    Sort Key: eppairdefnid, "start"
                    ->  Seq Scan on ftone  (cost=0.00..36446.66
rows=234827 width=16) (actual time=0.45..10320.91 rows=709461 loops=1)
                          Filter: ((consolidation = 60) AND ("start" <
(to_timestamp('2006-07-10 18:43:27.391103+1000'::text,
'YYYY-MM-DDHH24:00:00.0'::text))::timestamp without time zone))
Total runtime: 55378.68 msec


*** For the delete ***:

Hash Join  (cost=0.00..30020.31 rows=425 width=14) (actual
time=3767.47..3767.47 rows=0 loops=1)
  Hash Cond: ("outer".eppairdefnid = "inner".eppairdefnid)
  ->  Seq Scan on ftone  (cost=0.00..23583.33 rows=1286333 width=10)
(actual time=0.04..2299.94 rows=1286333 loops=1)
  ->  Hash  (cost=0.00..0.00 rows=1 width=4) (actual
time=206.01..206.01 rows=0 loops=1)
        ->  Seq Scan on fttemp1600384653  (cost=0.00..0.00 rows=1
width=4) (actual time=206.00..206.00 rows=0 loops=1)
Total runtime: 3767.52 msec


Thanks,
Neil

On 10/07/06, Neil Hepworth <nhepworth@gmail.com> wrote:
> Thanks for the reply.
>
> The database is vacuum analysed regularly and during my testing I
> tried running the vacuum analyse full immediately before the running
> through the set of queries (which does help a bit - reduces the time
> to about 80% but is is still over an hour, with basically 100% CPU).
>
> I'll get back to you with the full explain analyse output (I need to
> re-create my test database back to its original state and that takes a
> while) but I assume the part you're after is that all queries are
> sequential scans, which I initially thought was the problem.  But it
> is my understanding that I cannot make them index scans because a
> large percentage of the table is being returned by the query
> (typically 30%) so the planner will favour a sequential scan over an
> index scan for such a query, correct?  If the queries had been disk
> bound (due to retrieving large amounts of data) I would have
> understood but I am confused as to why a sequential scan would cause
> such high CPU and not high disk activity.
>
> Yes, I wish I could upgrade to the latest version of PostgreSQL but at
> the moment my hands are tied due to dependencies on other applications
> running on our server (obviously we need to update certain queries,
> e.g. delete .. using.. and test with 8.1 first) - I will be pushing
> for an upgrade as soon as possible.  And the fsync=false is a
> "compromise" to try to improve performance (moving to 8.1 would be
> better compromise).
>
> Neil
>
>
> On 10/07/06, Jeff Frost <jeff@frostconsultingllc.com> wrote:
> > On Mon, 10 Jul 2006, Neil Hepworth wrote:
> >
> > > I am running PostgreSQL 7.3 on a Linux box (RHEL 2.1 - Xeon 2.8GHz
> > > with 1GB of RAM) and seeing very high CPU usage (normally over 90%)
> > > when I am running the following queries, and the queries take a long
> > > time to return; over an hour!
> >
> > First off, when is the last time you vacuum analyzed this DB and how often
> > does the vacuum analyze happen.  Please post the EXPLAIN ANALYZE output for
> > each of the queries below.
> >
> > Also, I would strongly urge you to upgrade to a more recent version of
> > postgresql.  We're currently up to 8.1.4 and it has tons of excellent
> > performance enhancements as well as helpful features such as integrated
> > autovacuum, point in time recovery backups, etc.
> >
> > Also, I see that you're running with fsync = false.  That's quite dangerous
> > especially on a production system.
> >
> >
> > >
> > > CREATE TEMPORARY TABLE fttemp1600384653 AS SELECT * FROM ftoneway LIMIT 0;
> > >
> > > INSERT INTO fttemp1600384653 SELECT epId, TO_TIMESTAMP(start,
> > > 'YYYY-MM-DD HH24:00:00.0')::timestamp AS start, 60 AS consolidation,
> > > SUM(cnt) AS cnt FROM ftone WHERE consolidation = 0 AND start <
> > > TO_TIMESTAMP('2006-06-27 18:43:27.391103+1000', 'YYYY-MM-DD
> > > HH24:00:00.0')::timestamp;
> > >
> > > DELETE FROM ONLY ftone WHERE ftoneway.epId= fttemp1600384653.epId;
> > >
> > > The only changes I've made to the default postgresql.comf file are listed
> > > below:
> > >
> > > LC_MESSAGES = 'en_US'
> > > LC_MONETARY = 'en_US'
> > > LC_NUMERIC = 'en_US'
> > > LC_TIME = 'en_US'
> > > tcpip_socket = true
> > > max_connections = 20
> > > effective_cache_size = 32768
> > > wal_buffers = 128
> > > fsync = false
> > > shared_buffers = 3000
> > > max_fsm_relations = 10000
> > > max_fsm_pages = 100000
> > >
> > > The tables are around a million rows but when when I run against
> > > tables of a few hundred thousand rows it still takes tens of minutes
> > > with high CPU.  My database does have a lot of tables (can be several
> > > thousand), can that cause performance issues?
> > >
> > > Thanks,
> > >   Neil
> > >
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 3: Have you checked our extensive FAQ?
> > >
> > >              http://www.postgresql.org/docs/faq
> > >
> > >
> >
> > --
> > Jeff Frost, Owner       <jeff@frostconsultingllc.com>
> > Frost Consulting, LLC   http://www.frostconsultingllc.com/
> > Phone: 650-780-7908     FAX: 650-649-1954
> >
>

pgsql-performance by date:

Previous
From: "Neil Hepworth"
Date:
Subject: Re: High CPU Usage - PostgreSQL 7.3
Next
From: Vivek Khera
Date:
Subject: Re: suggested RAID controller for FreeBSD 6.1 + PostgreSQL 8.1