Re: Performance advice - Mailing list pgsql-performance

From Manfred Koizar
Subject Re: Performance advice
Date
Msg-id 28plfvck3q28qgipvgfjfdc9rk77b170tq@4ax.com
Whole thread Raw
In response to Re: Performance advice  ("Michael Mattox" <michael.mattox@verideon.com>)
List pgsql-performance
On Wed, 25 Jun 2003 11:47:48 +0200, "Michael Mattox"
<michael.mattox@verideon.com> wrote:
>> |INFO:  --Relation public.jdo_sequencex--
>> |INFO:  Pages 28: Changed 1, Empty 0; Tup 1: Vac 5124, Keep 0, UnUsed 0.
>>                                           ^      ^^^^
>> This table could stand more frequent VACUUMs, every 15 minutes or so.
>
>Can you explain what the "Vac" is

That's a long story, where shall I start?  Search for MVCC in the docs
and in the list archives.  So you know that every DELETE and every
UPDATE leaves behind old versions of tuples.  The space occupied by
these cannot be used immediately.  VACUUM is responsible for finding
dead tuples, which are so old that there is no active transaction that
could be interested in their contents, and reclaiming the space.  The
number of such tuples is reported as "Vac".

> and how you knew that it should be vacuumed more often?

jdo_sequencex stores (5000 old versions and 1 active version of) a
single row in 28 pages.  Depending on when you did ANALYSE it and
depending on the SQL statement, the planner might think that a
sequential scan is the most efficient way to access this single row.
A seq scan has to read 28 pages instead of a single page.  Well,
probably all 28 pages are in the OS cache or even in PG's shared
buffers, but 27 pages are just wasted and push out pages you could
make better use of.  And processing those 28 pages does not come at no
CPU cost.  If you VACUUM frequently enough, this relation never grows
beyond one page.

>I'm using Java Data Objects (JDO) which is an O/R mapper.  It generated the
>schema from my object model by default it used a table for a sequence.  I
>just got finished configuring it to use a real postgres sequence.  With the
>way they have it designed, it opens and closes a connection each time it
>retrieves a sequence.  Would I get a performance increase if I modify their
>code to retrieve multiple sequence numbers in one connection?  For example I
>could have it grab 50 at a time, which would replace 50 connections with 1.

Better yet you modify the code to use the normal access functions for
sequences.

Servus
 Manfred

pgsql-performance by date:

Previous
From: SZUCS Gábor
Date:
Subject: Re: Similar querys, better execution time on worst execution plan
Next
From: Andre Schubert
Date:
Subject: problem with pg_statistics