Thread: Seeking advice regarding a design problem
I am running PostgreSQL as the database backend, and I have some scripts that deal with a constant stream of incoming data and insert it into the database in a fairly complex way, involving a couple of procedures.

But the performance of the database is worse than I had expected. After the script has run about 100 times, the speed of the inserts slows down dramatically. It goes back to its regular fast speed after I run a VACUUM ANALYZE.

How can I redesign the system to avoid this bottleneck? And why can PostgreSQL slow down so much after doing some complex operations?

Thanks

--
Wei Weng
Network Software Engineer
KenCast Inc.
Forgot to mention: the version of PostgreSQL I am running is 7.1.3.

On Fri, 2002-08-02 at 12:16, Wei Weng wrote:
> I am running PostgreSQL as the database backend, and I have some scripts
> that deal with a constant stream of incoming data and insert it into the
> database in a fairly complex way, involving a couple of procedures.
>
> But the performance of the database is worse than I had expected. After
> the script has run about 100 times, the speed of the inserts slows down
> dramatically. It goes back to its regular fast speed after I run a
> VACUUM ANALYZE.
>
> How can I redesign the system to avoid this bottleneck? And why can
> PostgreSQL slow down so much after doing some complex operations?

--
Wei Weng
Network Software Engineer
KenCast Inc.
On 2 Aug 2002, Wei Weng wrote:
> I am running PostgreSQL as the database backend, and I have some scripts
> that deal with a constant stream of incoming data and insert it into the
> database in a fairly complex way, involving a couple of procedures.
>
> But the performance of the database is worse than I had expected. After
> the script has run about 100 times, the speed of the inserts slows down
> dramatically. It goes back to its regular fast speed after I run a
> VACUUM ANALYZE.
>
> How can I redesign the system to avoid this bottleneck?

Upgrade to 7.2 so that you can vacuum while other things are going on, and run VACUUM ANALYZE after modifying a large portion of the database. (Note that if the database is particularly large, you'll probably need to change the free space map configuration as well.)

It's hard to tell exactly what you're running into. Is it simply that your queries are scanning dead tuples and that's slowing things down? Is it that you're changing the data in a way that changes how the optimizer should be planning your queries? Etc.
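A minimal sketch of the kind of maintenance being suggested here, assuming the scripts load into a single table (the name incoming_data is hypothetical, not something from the original poster's schema):

-- Run after each batch of script runs, not after every single insert.
-- On 7.2 a plain (non-FULL) VACUUM does not take an exclusive lock, so
-- it can run while the insert scripts keep working.
VACUUM ANALYZE incoming_data;

-- Or sweep the whole database in one pass:
VACUUM ANALYZE;

Scheduling something like this from cron, or from the loading script itself every N runs, should be enough to keep the dead-tuple count, and therefore the insert slowdown described above, under control.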
On Fri, 2002-08-02 at 14:24, Stephan Szabo wrote:
> Upgrade to 7.2 so that you can vacuum while other things are going on,
> and run VACUUM ANALYZE after modifying a large portion of the database.
> (Note that if the database is particularly large, you'll probably need
> to change the free space map configuration as well.)

I found this in my postgresql.conf:

#shared_buffers = 64            # 2*max_connections, min 16
#max_fsm_relations = 100        # min 10, fsm is free space map
#max_fsm_pages = 10000          # min 1000, fsm is free space map
#max_locks_per_transaction = 64 # min 10
#wal_buffers = 8                # min 4

Which of these are critical to insert performance? I looked for them in the interactive developer docs, but the descriptions were not clear enough.

> It's hard to tell exactly what you're running into. Is it simply that
> your queries are scanning dead tuples and that's slowing things down?

What do you mean by "dead tuples"?

> Is it that you're changing the data in a way that changes how the
> optimizer should be planning your queries? Etc.

Thanks!

--
Wei Weng
Network Software Engineer
KenCast Inc.
On 2 Aug 2002, Wei Weng wrote:
> I found this in my postgresql.conf:
>
> #shared_buffers = 64            # 2*max_connections, min 16
> #max_fsm_relations = 100        # min 10, fsm is free space map
> #max_fsm_pages = 10000          # min 1000, fsm is free space map
> #max_locks_per_transaction = 64 # min 10
> #wal_buffers = 8                # min 4
>
> Which of these are critical to insert performance? I looked for them in
> the interactive developer docs, but the descriptions were not clear
> enough.

In general, shared_buffers should be set higher than the default. I'd suggest raising it incrementally while testing to get an idea of what works for your system. On 7.2, you may also want to raise max_fsm_pages if you notice that non-full vacuums are not reclaiming all of your space.

> > It's hard to tell exactly what you're running into. Is it simply that
> > your queries are scanning dead tuples and that's slowing things down?
> What do you mean by "dead tuples"?

Row versions that are no longer visible to any transaction. PostgreSQL uses a non-overwriting storage manager, so an UPDATE or DELETE leaves the old row version in place; VACUUM removes the rows that no transaction can see any more. If you run VACUUM ANALYZE, you'll get some statistics about how many rows were removed and such.

Another important question is whether you've got any foreign keys or triggers on the tables, since those may be making a difference as well.
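For reference, a possible starting point for the two settings mentioned above, mirroring the postgresql.conf snippet earlier in the thread; the specific numbers are only illustrative assumptions and would need to be tested against the actual machine and workload:

shared_buffers = 1024           # up from the default 64; about 8 MB at the
                                # default 8 kB block size
max_fsm_pages = 20000           # up from 10000; raise further if non-full
                                # VACUUM stops reclaiming all the space it should

The lines also have to be uncommented, and the postmaster restarted, before a change to shared_buffers takes effect.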