Thread: Seeking advice regarding a design problem

Seeking advice regarding a design problem

From: Wei Weng
I am running PostgreSQL as the database backend, and I have some scripts
that deal with constantly incoming data and insert it into the database
in a fairly complex way, involving a couple of procedures.

But the performance of the database is worse than I had expected. After
the script had run about 100 times, insertion speed slowed down
dramatically, but it went back to its regular fast speed after I did a
vacuum analyze.
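
(By "a vacuum analyze" I mean simply running

VACUUM ANALYZE;

from psql against the database the scripts insert into.)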

How can I redesign the system to avoid the bottleneck? And why is it
that PostgreSQL can slow down so much after doing some complex
operations?


Thanks


-- 
Wei Weng
Network Software Engineer
KenCast Inc.




Re: Seeking advice regarding a design problem

From: Wei Weng
Forgot to mention: the version of PostgreSQL I am running is 7.1.3.

On Fri, 2002-08-02 at 12:16, Wei Weng wrote:
> I am running PostgreSQL as the database backend, and I have some scripts
> that deal with constantly incoming data and insert it into the database
> in a fairly complex way, involving a couple of procedures.
> 
> But the performance of the database is worse than I had expected. After
> the script had run about 100 times, insertion speed slowed down
> dramatically, but it went back to its regular fast speed after I did a
> vacuum analyze.
> 
> How can I redesign the system to avoid the bottleneck? And why is it
> that PostgreSQL can slow down so much after doing some complex
> operations?
> 
> 
> Thanks
> 
> 
> -- 
> Wei Weng
> Network Software Engineer
> KenCast Inc.
> 
> 
> 
-- 
Wei Weng
Network Software Engineer
KenCast Inc.




Re: Seeking advice regarding a design problem

From: Stephan Szabo
On 2 Aug 2002, Wei Weng wrote:

> I am running PostgreSQL as the database backend, and I have some scripts
> that deal with constantly incoming data and insert it into the database
> in a fairly complex way, involving a couple of procedures.
>
> But the performance of the database is worse than I had expected. After
> the script had run about 100 times, insertion speed slowed down
> dramatically, but it went back to its regular fast speed after I did a
> vacuum analyze.
>
> How can I redesign the system to avoid the bottleneck? And why is it

Upgrade to 7.2 so that you can vacuum while other things are going on
and vacuum analyze after modifying a large portion of the database (note
that if the database is particularly large you'll probably need to change
the free space map configuration as well).
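
For example, a minimal sketch (the table name here is just a placeholder
for whatever your scripts actually load into): have the loading script
finish each large batch with

VACUUM ANALYZE incoming_data;

In 7.2 that doesn't take the exclusive table lock that 7.1's vacuum does,
so the other scripts can keep inserting while it runs.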

It's hard to tell what in particular you're running into: is it just a
case that you're accessing the dead tuples and that's slowing it down,
or is it that you're changing the data in a fashion that changes how the
optimizer should be choosing to do queries, and so on.
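
One way to narrow it down is to compare the plan of a typical query before
and after a vacuum analyze (just a sketch; the table and column names are
placeholders):

EXPLAIN SELECT * FROM incoming_data WHERE some_key = 42;

If the plan itself changes after the VACUUM ANALYZE, stale planner
statistics were the problem; if the plan stays the same but simply runs
faster, you were most likely paying for the dead tuples.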




Re: Seeking advice regarding a design problem

From: Wei Weng
On Fri, 2002-08-02 at 14:24, Stephan Szabo wrote:
> On 2 Aug 2002, Wei Weng wrote:
> 
> > I am running PostgreSQL as the database backend, and I have some scripts
> > that deal with constantly incoming data and insert it into the database
> > in a fairly complex way, involving a couple of procedures.
> >
> > But the performance of the database is worse than I had expected. After
> > the script had run about 100 times, insertion speed slowed down
> > dramatically, but it went back to its regular fast speed after I did a
> > vacuum analyze.
> >
> > How can I redesign the system to avoid the bottleneck? And why is it
> 
> Upgrade to 7.2 so that you can vacuum while other things are going on
> and vacuum analyze after modifying a large portion of the database (note
> that if the database is particularly large you'll probably need to change
> the free space map configuration as well).
I found this in my postgresql.conf

#shared_buffers = 64        # 2*max_connections, min 16
#max_fsm_relations = 100    # min 10, fsm is free space map
#max_fsm_pages = 10000      # min 1000, fsm is free space map
#max_locks_per_transaction = 64 # min 10
#wal_buffers = 8            # min 4

Which ones are critical to the insertion performance? I looked for them
in the interactive dev doc, but the descriptions were not clear enough.

> 
> It's hard to tell what in particular you're running into: is it just a
> case that you're accessing the dead tuples and that's slowing it down,
What do you mean by "dead tuples"?


> or is it that you're changing the data in a fashion that changes how the
> optimizer should be choosing to do queries, and so on.
> 

Thanks!
-- 
Wei Weng
Network Software Engineer
KenCast Inc.




Re: Seeking advice regarding a design problem

From: Stephan Szabo
On 2 Aug 2002, Wei Weng wrote:

> On Fri, 2002-08-02 at 14:24, Stephan Szabo wrote:
> > On 2 Aug 2002, Wei Weng wrote:
> >
> > > I am running PostgreSQL as the database backend, and I have some scripts
> > > that deal with constantly incoming data and insert it into the database
> > > in a fairly complex way, involving a couple of procedures.
> > >
> > > But the performance of the database is worse than I had expected. After
> > > the script had run about 100 times, insertion speed slowed down
> > > dramatically, but it went back to its regular fast speed after I did a
> > > vacuum analyze.
> > >
> > > How can I redesign the system to avoid the bottleneck? And why is it
> >
> > Upgrade to 7.2 so that you can vacuum while other things are going on
> > and vacuum analyze after modifying a large portion of the database (note
> > that if the database is particularly large you'll probably need to change
> > the free space map configuration as well).
> I found this in my postgresql.conf
>
> #shared_buffers = 64        # 2*max_connections, min 16
> #max_fsm_relations = 100    # min 10, fsm is free space map
> #max_fsm_pages = 10000      # min 1000, fsm is free space map
> #max_locks_per_transaction = 64 # min 10
> #wal_buffers = 8            # min 4
>
> Which ones are critical to the insertion performance? I looked for them
> in the interactive dev doc, but the descriptions were not clear enough.

In general shared_buffers should be higher than the default.  I'd suggest
incrementing it while testing to get an idea of what works for your
system.

In 7.2, you may want to raise max_fsm_pages if you're noticing that
non-full vacuums are not reclaiming all of your space.
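
As an illustration only (these numbers are placeholders, not tuned
recommendations for your machine), the relevant postgresql.conf lines
might look like

shared_buffers = 2048       # 2048 * 8KB buffers = 16MB, up from the 64-buffer default
max_fsm_pages = 50000       # 7.2 only; raise it if plain VACUUM stops reclaiming space

Both are read at postmaster start, so you need to restart the server for
them to take effect.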

> > It's hard to tell what in particular you're running into: is it just a
> > case that you're accessing the dead tuples and that's slowing it down,
> What do you mean by "dead tuples"?

Tuples that are no longer visible to any transaction.  Postgres uses a
non-overwriting storage manager, so any update or delete leaves the old
row in place.  Vacuum removes rows that no transaction can see.
If you vacuum analyze, you'll get some stats about how many rows were
removed and such.
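
A tiny illustration of the mechanics (hypothetical throwaway table):

CREATE TABLE t (id integer, val text);
INSERT INTO t VALUES (1, 'a');
UPDATE t SET val = 'b' WHERE id = 1;  -- the old ('a') version stays behind as a dead tuple
DELETE FROM t WHERE id = 1;           -- the deleted row also stays until vacuumed
VACUUM ANALYZE t;                     -- reclaims the dead tuples (VACUUM VERBOSE ANALYZE shows the counts)

If your procedures update or delete rows, or roll back inserts, they leave
dead tuples behind in exactly this way, and scans have to step over them
until a vacuum cleans them up.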

Another important question is whether you've got any foreign keys or
triggers on the tables since those may be making a difference as well.
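
For instance (schematic only, not your actual schema):

CREATE TABLE parent (id integer PRIMARY KEY);
CREATE TABLE child (
    id        integer PRIMARY KEY,
    parent_id integer REFERENCES parent(id)
);

Every INSERT into child fires an internal trigger that looks up (and locks)
the matching parent row, so a foreign key like that can account for a large
part of the cost of a bulk load.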