Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit - Mailing list pgsql-patches

From: Heikki Linnakangas
Subject: Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit
Msg-id: 47D80F5C.2060500@enterprisedb.com
In response to: Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit ("Pavan Deolasee" <pavan.deolasee@gmail.com>)
Responses: Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit ("Pavan Deolasee" <pavan.deolasee@gmail.com>)
List: pgsql-patches
Pavan Deolasee wrote:
> On Wed, Mar 12, 2008 at 9:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>>  I didn't like it; it seemed overly complicated (consider dealing with
>>  XID wraparound),
>
> We are talking about subtransactions here. I don't think we support
> subtransaction wrap-around, do we?

Imagine that you start a transaction just before transaction
wrap-around, so that the top-level XID is 2^31-10. Then you start 20
subtransactions. What XIDs will they get? Now how would you map those to
a bitmap?

It's certainly possible; you could, for example, index the bitmap by
each XID's offset from the top transaction's XID. But it does get a bit
complicated.
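
To make that concrete, here is a minimal sketch of the offset-based
mapping in C, with made-up names (SubxactBitmap, subxact_bitmap_test);
it is not code from any patch. It assumes plain 32-bit XIDs where
unsigned subtraction wraps modulo 2^32, and it ignores the reserved low
XID values that real PostgreSQL skips over at wraparound, which is part
of what makes this complicated in practice:

#include <stdint.h>
#include <stdbool.h>

typedef uint32_t TransactionId;

/* Hypothetical bitmap of the current backend's subtransaction XIDs,
 * indexed by each XID's distance from the top-level transaction's XID. */
typedef struct SubxactBitmap
{
    TransactionId top_xid;  /* XID of the top-level transaction */
    uint32_t      nbits;    /* number of offsets the bitmap covers */
    uint8_t      *bits;     /* (nbits + 7) / 8 bytes of bitmap */
} SubxactBitmap;

static bool
subxact_bitmap_test(const SubxactBitmap *bm, TransactionId xid)
{
    /* Unsigned subtraction wraps modulo 2^32, so the offset stays small
     * even if the XID counter wraps between top_xid and xid. */
    uint32_t offset = xid - bm->top_xid;

    if (offset >= bm->nbits)
        return false;       /* XID outside the range the bitmap covers */
    return (bm->bits[offset / 8] >> (offset % 8)) & 1;
}

Note that a slow transaction generating a sparse set of subtransaction
XIDs forces nbits to grow arbitrarily large, which is the other
objection quoted below.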

>> and it would have problems with a slow transaction
>>  generating a sparse set of subtransaction XIDs.
>
> I agree that's the worst case. But is that common? That's what I
> was thinking when I proposed the alternate solution. I thought that can
> happen only if most of the subtransactions abort, which again I thought
> is not a normal case. But frankly I am not sure if my assumption is correct.

It's not that common to have hundreds of thousands of subtransactions to
begin with...

>> I think getting rid of
>>  the linear search will be enough to fix the performance problem.
>
> I wonder if a skewed binary search would help more? For example,
> if we know that the range of xids stored in the array is 1 to 1000 and
> if we are searching for a number closer to 1000, we can break the
> array into <large,small> parts instead of equal parts and then
> search.
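
What is being described is essentially interpolation search: pick the
probe point in proportion to where the target falls within the value
range, rather than always at the midpoint. A minimal, hypothetical
sketch (reusing the TransactionId typedef from the sketch above, and
assuming a sorted array of distinct XIDs with no wraparound in the
stored range):

/* Skewed split a la interpolation search: probe where xid falls
 * proportionally within [xids[low], xids[high-1]]. */
static bool
subxact_contains_skewed(const TransactionId *xids, uint32_t nxids,
                        TransactionId xid)
{
    uint32_t low = 0;
    uint32_t high = nxids;      /* half-open search range [low, high) */

    while (low < high)
    {
        TransactionId lo_val = xids[low];
        TransactionId hi_val = xids[high - 1];
        uint32_t pos;

        if (xid < lo_val || xid > hi_val)
            return false;
        if (lo_val == hi_val)
            return true;        /* one distinct value left; the bounds
                                 * check guarantees xid == lo_val */

        /* 64-bit intermediate avoids overflow in the proportion */
        pos = low + (uint32_t) (((uint64_t) (xid - lo_val) *
                                 (high - low - 1)) / (hi_val - lo_val));

        if (xids[pos] == xid)
            return true;
        else if (xids[pos] < xid)
            low = pos + 1;      /* target is in the larger-value part */
        else
            high = pos;         /* target is in the smaller-value part */
    }
    return false;
}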

Possibly, but I doubt it's worth the trouble. The simple binary search
solved the performance problem well enough. In the OP's test case with
300000 subtransactions, the patch removed any measurable difference
between running the "SELECT COUNT(*)" in the same transaction as the
INSERTs and running it after a COMMIT.
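
For reference, the simple binary search in question is nothing more
exotic than this sketch (again with hypothetical names, reusing the
typedef above; this is not the actual patch):

/* Binary search over the sorted array of the current transaction's
 * subtransaction XIDs, replacing the old O(n) linear scan. */
static bool
subxact_contains(const TransactionId *xids, uint32_t nxids,
                 TransactionId xid)
{
    uint32_t low = 0;
    uint32_t high = nxids;      /* half-open search range [low, high) */

    while (low < high)
    {
        uint32_t mid = low + (high - low) / 2;

        if (xids[mid] == xid)
            return true;
        else if (xids[mid] < xid)
            low = mid + 1;
        else
            high = mid;
    }
    return false;
}

With 300000 subtransactions that is at most roughly 18 probes per
visibility check instead of up to 300000, which is why the linear scan
dominated the sequential scan before the patch.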

> Well, maybe I am making simple things complicated ;-)

I think so :-).

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
