Re: Performance question - Mailing list pgsql-admin

From Benjamin Krajmalnik
Subject Re: Performance question
Date
Msg-id F4E6A2751A2823418A21D4A160B68988029D7A@fletch.stackdump.local
Whole thread Raw
In response to Performance question  ("Benjamin Krajmalnik" <kraj@illumen.com>)
List pgsql-admin
Unfortunately, the application doing the logging is not written by me,
so I have to play with the deck of cards which I am dealt.
The developer stated that he uses connection pooling, but apparently the
pg odbc driver does not support it, since I can see a new connection
being made for each statement execution.
I have asked him to see if he can create one connection and maintain it
- we'll see.

I have another scenario to try out, whereby instead of making a stored
procedure call via ODBC I will insert the parameters of the SP call into
a table via simple insert statements.  If I see significant improvements
(I am not holding my breath), I will use that route and run a background
peocess on the server to issue the SP calls off of that recordset.

> -----Original Message-----
> From: Jim C. Nasby [mailto:decibel@decibel.org]
> Sent: Monday, December 04, 2006 8:25 PM
> To: Benjamin Krajmalnik
> Cc: pgsql-admin@postgresql.org
> Subject: Re: [ADMIN] Performance question
>
> Please cc the list so others can learn and help.
>
> Yes, if ODBC is tearing the connection down after every call,
> performance *WILL* suck. Setup some kind of persistent connection.
> Better yet, don't make lots of tiny calls to the database if
> you can avoid it.
>
> On Mon, Dec 04, 2006 at 06:28:03PM -0700, Benjamin Krajmalnik wrote:
> > Hi Jim,
> >
> > Apples to apples (as best as I can tell).
> > The test procedure was as follows:
> >
> > I captured the SQL statements which call the stored procedures from
> > our monitoring box to a file.
> > I copied the file to the FreeBSD data server.
> > Executed psql, and pumped the file to it through \i filename.
> > Execution speed for the code in question was sub 10 seconds.
> > Running the query directly via ODBC is taking about 60 ms.
> >
> > The numbers I just stated above are after I removed the
> code which was
> > accessing the one partitioned table where the constraint
> exclusion was
> > not taking place (more on that later).  This code happened to be
> > updating a daily aggregate - so each SP call was selecting the test
> > record for the day and aggregating data.  I ended up adding
> a few more
> > fields to a records which keeps some data in arrays for
> quick access
> > to generate our graphs, and now I run a scheduled task at midnight
> > which takes the previous days aggregates and creates our snapshot
> > records (which we use for historical trending of up to 24 months).
> > This process now takes about 5 seconds, whereas running it
> through the
> > stored procedure it was adding almost 200 miliseconds to
> the execution time).
> > I assume this was caused by the fact that there was a
> select - and the
> > execution planner was not isolating to the single
> partitioned table,
> > followed by either an insert or another update.
> > The new code is much more efficient and loads the db much
> less, so it
> > will be the preferred method.
> > Right now the datra path between the monitoring appliacne
> and the data
> > server is 100Mbit.  We are moving it to Gig tonight - so at least I
> > hope I will realize some speed advantages.  I have been
> monitoring the
> > switch, and the actual throughput is only 150kbit/sec, so I do not
> > have a data throughput bpttlenecg, but rather I suspect
> latencies in
> > the ODBC connection.
> >
> > The item I found interesting is that the statement duration
> was much
> > longer.
> > What I am thinking is that when running through psql, it is using a
> > local pipe.
> > When running through ODBC, each stored procedure call is creating a
> > connection and then disconnecting, so it has added a huge
> overhead.  I
> > think the duration is including the time it is taking to
> > connect/authenticate disconnect/teardown.
> >
> > Thanks for the tip on the constraint exclusion.
> > I need to go back and see how we are querying it for the
> reports, and
> > will adjust it accordingly.
> > Does it make a difference if it is timestamp or timestamp without
> > timezone?
> >
> >
> > > -----Original Message-----
> > > From: Jim Nasby [mailto:decibel@decibel.org]
> > > Sent: Monday, December 04, 2006 5:53 PM
> > > To: Benjamin Krajmalnik
> > > Cc: pgsql-admin@postgresql.org
> > > Subject: Re: [ADMIN] Performance question
> > >
> > > Are you sure you're doing an apples-apples comparison? Is
> the load
> > > on both machines the same, or does production have extra stuff
> > > running?
> > > Have you tried your original test again in the same setup without
> > > ODBC to eliminate that possibility? (or redone your original test
> > > with ODBC).
> > >
> > > As for the constraint exclusion, try changing the query so that
> > > '2006-12-01' is cast to a timestamp (or change the constraints so
> > > that you're constraining dates... the check constraints should
> > > really match the type of the underlying table).
> > >
> > > On Dec 1, 2006, at 9:18 PM, Benjamin Krajmalnik wrote:
> > >
> > > > I am battling a performance issue and was wondering if
> > > someone could
> > > > help.  PostgreSQL 8.1.5, FreeBSD.
> > > >
> > > > I have a very intense stored procedure which performs real time
> > > > aggregation of data.
> > > >
> > > > I captured the stored procedure calls from a production
> system and
> > > > pumped them through psql, logging duration.
> > > > The stored procedure ran at about 30ms per stored
> procedure call -
> > > > which
> > > > was significantly faster than the previous production system.
> > > > Based on
> > > > these results, I was extremely optimistic and migrated
> to the new
> > > > server.
> > > >
> > > > Now that we turned up the new server, the stored procedure
> > > is taking
> > > > over 250ms.
> > > > These are the execution times as recorded by enable
> "all" logging.
> > > >
> > > > The only difference is that when I pumped the data in the test
> > > > environment, I pumped it by reading the script via psql,
> > > whereas the
> > > > current system is calling the stored procedure via ODBC
> through a
> > > > 100Mbit ethernet connection.
> > > >
> > > > I cannot explain the difference between the 2
> scenarios.  Does the
> > > > duration include the overhead of the ODBC layer?
> > > >
> > > > Now, the second part that was troubling relates to the
> execution
> > > > planner.
> > > >
> > > >
> > > > Two of the tables in which data is being entered are
> partitioned
> > > > tables.
> > > > Data is being routed to the partition by using rules.  Each
> > > partition
> > > > has a check constraint on a date range.  Constraint
> exclusion is on.
> > > >
> > > > The first partitioned table is very simple, and the stored
> > > procedure
> > > > simply inserts a record for each test which is executed.  These
> > > > are later used by a procedure running in the background which
> > > calculates
> > > > statistical data.
> > > >
> > > > My problem is with the second partitioned table.  As part of my
> > > > troubleshooting, I omitted the code from the stored procedure
> > > > which accesses this data.  In this table, I attempt to retrieve
> > > > the daily record for a specific test.  If it is not
> found, after
> > > > aggregating data I insert it.  If the record already
> exists, data
> > > > is aggregated and the record is updated.
> > > >
> > > > The key is composed of a date field (monthdate) and an
> int4 field
> > > > (kstestsysid).
> > > > The check constraints are of the form of:
> > > >
> > > > ALTER TABLE tblkssnapshot12
> > > >   ADD CONSTRAINT tblkssnapshot12_chk CHECK (monthdate >=
> > > '2006-12-01
> > > > 00:00:00'::timestamp without time zone AND monthdate <
> '2007-01-01
> > > > 00:00:00'::timestamp without time zone);
> > > >
> > > > I ran a sample query which would be issued by the
> stored procedure
> > > > with the explain option.
> > > > The query was:
> > > >
> > > > select monthdate, kstestssysid from tblkssnapshotdaily
> > > where monthdate
> > > > = '2006-12-01' and kstestssysid = 3143
> > > >
> > > > The explain is as follows:
> > > >
> > > > select monthdate, kstestssysid from tblkssnapshotdaily
> > > where monthdate
> > > > = '2006-12-01' and kstestssysid = 3143 Result
> (cost=0.00..122.49
> > > > rows=25 width=8)
> > > >   ->  Append  (cost=0.00..122.49 rows=25 width=8)
> > > >         ->  Index Scan using
> tblkssnapshotdaily_idx_monthtest on
> > > > tblkssnapshotdaily  (cost=0.00..3.52 rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot01_idx_monthtest on
> > > > tblkssnapshot01 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot02_idx_monthtest on
> > > > tblkssnapshot02 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot03_idx_monthtest on
> > > > tblkssnapshot03 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot04_idx_monthtest on
> > > > tblkssnapshot04 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot05_idx_monthtest on
> > > > tblkssnapshot05 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot06_idx_monthtest on
> > > > tblkssnapshot06 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot07_idx_monthtest on
> > > > tblkssnapshot07 tblkssnapshotdaily  (cost=0.00..5.94
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot08_idx_monthtest on
> > > > tblkssnapshot08 tblkssnapshotdaily  (cost=0.00..4.73
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot09_idx_monthtest on
> > > > tblkssnapshot09 tblkssnapshotdaily  (cost=0.00..5.56
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot10_idx_monthtest on
> > > > tblkssnapshot10 tblkssnapshotdaily  (cost=0.00..5.75
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot11_idx_monthtest on
> > > > tblkssnapshot11 tblkssnapshotdaily  (cost=0.00..5.81
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot12_idx_monthtest on
> > > > tblkssnapshot12 tblkssnapshotdaily  (cost=0.00..4.33
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot13_idx_monthtest on
> > > > tblkssnapshot13 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot14_idx_monthtest on
> > > > tblkssnapshot14 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot15_idx_monthtest on
> > > > tblkssnapshot15 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot16_idx_monthtest on
> > > > tblkssnapshot16 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot17_idx_monthtest on
> > > > tblkssnapshot17 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot18_idx_monthtest on
> > > > tblkssnapshot18 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot19_idx_monthtest on
> > > > tblkssnapshot19 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot20_idx_monthtest on
> > > > tblkssnapshot20 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot21_idx_monthtest on
> > > > tblkssnapshot21 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot22_idx_monthtest on
> > > > tblkssnapshot22 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot23_idx_monthtest on
> > > > tblkssnapshot23 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >         ->  Index Scan using tblkssnapshot24_idx_monthtest on
> > > > tblkssnapshot24 tblkssnapshotdaily  (cost=0.00..4.83
> rows=1 width=8)
> > > >               Index Cond: ((monthdate = '2006-12-01'::date) AND
> > > > (kstestssysid = 3143))
> > > >
> > > > If constraint exclusion is on, why are index scans taking
> > > place on all
> > > > of the tables and not only on the only partitioned table
> > > for which the
> > > > constraint exists?
> > > >
> > > > I went to a partitioned table thinking this would improve
> > > performance
> > > > from the previous schema in which all of the data was in the
> > > > parent table, but the end result was significantly slower
> > > > performance (by orders of magnitude).
> > > >
> > > >
> > > >
> > > > ---------------------------(end of
> > > > broadcast)---------------------------
> > > > TIP 5: don't forget to increase your free space map settings
> > > >
> > >
> > > --
> > > Jim Nasby                                            jim@nasby.net
> > > EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
> > >
> > >
> > >
> >
>
> --
> Jim C. Nasby, Database Architect                decibel@decibel.org
> Give your computer some brain candy! www.distributed.net Team #1828
>
> Windows: "Where do you want to go today?"
> Linux: "Where do you want to go tomorrow?"
> FreeBSD: "Are you guys coming, or what?"
>

pgsql-admin by date:

Previous
From: "Jim C. Nasby"
Date:
Subject: Re: Performance question
Next
From: Jim Nasby
Date:
Subject: Re: dump from 8.x restore to 7.3.x