Re: Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit - Mailing list pgsql-performance

From Heikki Linnakangas
Subject Re: Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit
Date
Msg-id 47D54B6C.7080404@enterprisedb.com
In response to Re: Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Tom Lane wrote:
> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
>> For 8.4, it would be nice to improve that. I tested that on my laptop
>> with a similarly-sized table, inserting each row in a pl/pgsql function
>> with an exception handler, and I got very similar run times. According
>> to oprofile, all the time is spent in TransactionIdIsInProgress. I think
>> it would be pretty straightforward to store the committed subtransaction
>> ids in a sorted array, instead of a linked list, and binary search.
>
> I think the OP is not complaining about the time to run the transaction
> that has all the subtransactions; he's complaining about the time to
> scan the table that it emitted.

If you read the original post carefully, he complained that the seq scan
was slow when executed within the same transaction as populating the
table, and fast if he committed in between.
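
The sorted-array idea quoted above can be sketched roughly as follows. This is only an illustration in Python, not PostgreSQL's actual code: the function names are hypothetical, the linked list is modelled as a plain sequence walked front to back, and xid wraparound is ignored.

```python
from bisect import bisect_left

def subxid_known_linear(subxids, xid):
    """Walk the collection front to back, as a linked list would be: O(n) per lookup."""
    for candidate in subxids:
        if candidate == xid:
            return True
    return False

def subxid_known_sorted(sorted_subxids, xid):
    """Binary search over an ascending array of subtransaction ids: O(log n) per lookup."""
    i = bisect_left(sorted_subxids, xid)
    return i < len(sorted_subxids) and sorted_subxids[i] == xid

# Hypothetical data: 300000 consecutive subtransaction ids, already in ascending order.
subxids = list(range(1000, 1000 + 300000))
```

With one lookup per tuple, a scan over n rows costs on the order of n * n comparisons with the linear walk, versus n * log2(n) with the sorted array.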

>  Presumably, each row in the table has a
> different (sub)transaction ID and so we are thrashing the clog lookup
> mechanism.  It only happens once because after that the XMIN_COMMITTED
> hint bits are set.
>
> This probably ties into the recent discussions about eliminating the
> fixed-size allocations for SLRU buffers --- I suspect it would've run
> better if it could have scaled up the number of pg_clog pages held in
> memory.

I doubt that makes any noticeable difference in this case. 300000
transaction ids fit on fewer than 100 clog pages, and the xmins on heap
pages are nicely in order.
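
As a back-of-the-envelope check of the page count, here is a small sketch. It assumes the default 8 kB block size and 2 status bits per transaction in clog; the constant names mirror those in clog.c, but the helper functions are hypothetical, and xid wraparound is ignored.

```python
BLCKSZ = 8192                                        # assumed default block size, in bytes
CLOG_BITS_PER_XACT = 2                               # status bits stored per transaction
CLOG_XACTS_PER_BYTE = 8 // CLOG_BITS_PER_XACT        # 4 transactions per byte
CLOG_XACTS_PER_PAGE = BLCKSZ * CLOG_XACTS_PER_BYTE   # 32768 transactions per page

def clog_page_of(xid):
    """Index of the clog page holding the status bits for xid."""
    return xid // CLOG_XACTS_PER_PAGE

def clog_pages_touched(first_xid, n_xids):
    """Number of distinct clog pages covered by n_xids consecutive ids."""
    last_xid = first_xid + n_xids - 1
    return clog_page_of(last_xid) - clog_page_of(first_xid) + 1

pages = clog_pages_touched(0, 300000)
```

Under these assumptions, 300000 consecutive ids span 10 or 11 pages depending on alignment, comfortably under the bound given above.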

Getting rid of the fixed-size allocations would be nice for other
reasons, of course.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
