Thread: GIN fast insert

GIN fast insert

From
Teodor Sigaev
Date:
Hi there,

we present two variants of GIN fast insert patch, since we're not sure
what is a the best solution:

v0.28.1
- remove disable cost in gincostestimate
- per http://archives.postgresql.org/message-id/499466D2.4010808@sigaev.ru
   gingettuple could force cleanup of pending list if it got a lossy tidbitmap.
   If such cleanup occurs the gingettuple will rescan pending list again.

v0.28.2
- remove disable cost in gincostestimate
- per http://archives.postgresql.org/message-id/12795.1234379754@sss.pgh.pa.us
   AM can now have only one search method: amgettuple or amgetbitmap.
- GIN now has only amgetbitmap interface
--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Attachment

Re: GIN fast insert

From
Robert Haas
Date:
On Tue, Feb 17, 2009 at 2:28 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
> Hi there,
>
> we present two variants of GIN fast insert patch, since we're not sure
> what is a the best solution:
>
> v0.28.1
> - remove disable cost in gincostestimate
> - per http://archives.postgresql.org/message-id/499466D2.4010808@sigaev.ru
>  gingettuple could force cleanup of pending list if it got a lossy
> tidbitmap.
>  If such cleanup occurs the gingettuple will rescan pending list again.
>
> v0.28.2
> - remove disable cost in gincostestimate
> - per
> http://archives.postgresql.org/message-id/12795.1234379754@sss.pgh.pa.us
>  AM can now have only one search method: amgettuple or amgetbitmap.
> - GIN now has only amgetbitmap interface

I reviewed v0.28.1.  I see that disable_cost is gone from
gincostestimate, but you're still using the pending list to set costs,
and I still think that's bogus.  It seems clear that this is going to
change much faster than plans are going to be invalidated, and if
autovacuum is doing its job, the pending list won't get long enough to
matter much anyway, right?  I don't think this patch should be
touching gincostestimate at all.

I am thinking that it is may not be necessary to remove the
gingettuple interface (as you did in v0.28.2).  Forcing a cleanup of
the pending list seems like a reasonable workaround.  We don't expect
this situation to come up frequently, so if the method we use to
handle it is not terribly efficient, oh well.  The one thing that
concerns me is - what will happen in a hot standby environment, when
that patch is committed?  In that situation, I believe that we can't
call modify any heap or index pages, so...

Some other assorted minor comments on v0.28.1...

1. The description of the "fastupdate" reloption should be reworded
for consistency with other options:

Enables "fast update" feature for this GIN index

2. Why is this implemented as a string reloption rather than a boolean
reloption?  It seems like you want to model this on
autovacuum_enabled.

3. Documentation wordsmithing.  You have the following paragraph:

As of <productname>PostgreSQL</productname> 8.4, this problem is
alleviated by postponing most of the work until the next
<command>VACUUM</>.  Newly inserted index entries are temporarily
stored in an unsorted list of pending entries <command>VACUUM</>
inserts all pending entries into the main <acronym>GIN</acronym> index
data structure, using the same bulk insert techniques used during
initial index creation.  This greatly improves <acronym>GIN</acronym>
index update speed, even counting the additional vacuum overhead.

Here is my proposed rewrite:

As of <productname>PostgreSQL</productname> 8.4, <acronym>GIN</> is
capable of postponing much of this work by inserting new tuples into a
temporary, unsorted list of pending entries.  When the table is
vacuumed, or in some cases when the pending list becomes too large,
the entries are moved to the main <acronym>GIN</acronym> data
structure using the same bulk insert techniques used during initial
index creation.  This greatly improves <acronym>GIN</acronym> index
update speed, even counting the additional vacuum overhead.

4. More wordsmithing.  In the following paragraph, you have:

It's recommended to use properly-configured autovacuum with tables
having <acronym>GIN</acronym> indexes, to keep this overhead to
reasonable levels.

I think it is clearer and more succinct to write simply:

Proper use of autovacuum can minimize this problem.

5. In textsearch.sgml, you say that GIN indexes are moderately slower
to update, but about 10x slower without fastupdate.  Can we provide a
real number in place of "moderately"?  I don't know whether to think
this means 20% or 2x.

6. One of the comment changes in startScanEntry is simply a correction
of a typographical error ("deletion" for "deletition").  You might as
well commit this change separately and remove it from this patch.

7. pg_stat_get_fresh_inserted_tuples.  I am not crazy about the fact
that we call this the pending list in some places, fast update in some
places, and now, here, fresh tuples.  Let's just call it fast insert
tuples.

8. tbm_check_tuple.  The opening curly brace should be uncuddled.  The
curly braces around wordnum = bitnum = 0 are superfluous.

9. gincostestimate.  There are a lot of superfluous whitespace changes
here, and some debugging code that obviously wasn't fully removed.

10. GinPageOpaqueData.  Surely we can come up with a better name than
GIN_LIST. This is yet another name for the same concept.  Why not call
this GIN_FAST_INSERT_LIST?

11. ginInsertCleanup.  "Inserion" is a typo.

Unfortunately, I don't understand all of this patch well enough to
give it as thorough a going-over as it deserves, so my apologies for
whatever I've missed.

...Robert


Re: GIN fast insert

From
Teodor Sigaev
Date:
> and I still think that's bogus.  It seems clear that this is going to
> change much faster than plans are going to be invalidated, and if
> autovacuum is doing its job, the pending list won't get long enough to
> matter much anyway, right?  I don't think this patch should be
> touching gincostestimate at all.
Why not? For any estimation function it's possible to invent workload which will
change cost (or optimal plan) much faster than plan's invalidation.
Gincostestimate depends on statistic on table, not on real index's state, and
plan's invalidation will occur after analyze run.


> I am thinking that it is may not be necessary to remove the
> gingettuple interface (as you did in v0.28.2).  Forcing a cleanup of
> the pending list seems like a reasonable workaround.  We don't expect
> this situation to come up frequently, so if the method we use to
Agree

> handle it is not terribly efficient, oh well.  The one thing that
> concerns me is - what will happen in a hot standby environment, when
> that patch is committed?  In that situation, I believe that we can't
> call modify any heap or index pages, so...

I don't see a problems here, because indexes in postgres don't depend on any
transaction's ids or modes as heap depends. WAL-logger works without that
knowledge too. May be I missed something here or don't understand.

Although heap's pages could be changed by setting commit-status bits on tuple
even in read-only transaction, but that changes are not WAL-logged. That is
correct for index's page too.

> 1. The description of the "fastupdate" reloption should be reworded
> for consistency with other options:
> Enables "fast update" feature for this GIN index
Fixed

>
> 2. Why is this implemented as a string reloption rather than a boolean
> reloption?  It seems like you want to model this on
> autovacuum_enabled.
Fixed

> 3. Documentation wordsmithing.  You have the following paragraph:
Done

> 4. More wordsmithing.  In the following paragraph, you have:
Done

> 5. In textsearch.sgml, you say that GIN indexes are moderately slower
> to update, but about 10x slower without fastupdate.  Can we provide a
> real number in place of "moderately"?  I don't know whether to think
> this means 20% or 2x.
Will make exact measurement some later :)


> 6. One of the comment changes in startScanEntry is simply a correction
> of a typographical error ("deletion" for "deletition").  You might as
> well commit this change separately and remove it from this patch.
Sorry, I don't find what you mean. I found only "begining" (and fixed it)


> 7. pg_stat_get_fresh_inserted_tuples.  I am not crazy about the fact
> that we call this the pending list in some places, fast update in some
> places, and now, here, fresh tuples.  Let's just call it fast insert
> tuples.
It's really inserted ("fresh") tuples in table since last vacuum. This number
doesn't depend on existence of GIN indexes or its modes.


> 8. tbm_check_tuple.  The opening curly brace should be uncuddled.  The
> curly braces around wordnum = bitnum = 0 are superfluous.
It's a copy/paste code from tbm_add_tuples, and I'd like
if
{
...
}
else
{
...
}

much more than
if
..
else
{

}

>
> 9. gincostestimate.  There are a lot of superfluous whitespace changes
> here, and some debugging code that obviously wasn't fully removed.
Cleaned.


> 10. GinPageOpaqueData.  Surely we can come up with a better name than
> GIN_LIST. This is yet another name for the same concept.  Why not call
> this GIN_FAST_INSERT_LIST?
Like other defines here - GIN_DATA, not a GIN_DATA_PAGE, GIN_LEAF - not a
GIN_LEAF_TREE_PAGE.

>
> 11. ginInsertCleanup.  "Inserion" is a typo.
fixed

> Unfortunately, I don't understand all of this patch well enough to
> give it as thorough a going-over as it deserves, so my apologies for
> whatever I've missed.
Thank you a lot for your reviewing.

--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Attachment

Re: GIN fast insert

From
Robert Haas
Date:
On Thu, Feb 19, 2009 at 8:36 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:
>> and I still think that's bogus.  It seems clear that this is going to
>> change much faster than plans are going to be invalidated, and if
>> autovacuum is doing its job, the pending list won't get long enough to
>> matter much anyway, right?  I don't think this patch should be
>> touching gincostestimate at all.
>
> Why not? For any estimation function it's possible to invent workload which
> will change cost (or optimal plan) much faster than plan's invalidation.
> Gincostestimate depends on statistic on table, not on real index's state,
> and plan's invalidation will occur after analyze run.

Hrm, hum, maybe you're right.

>> I am thinking that it is may not be necessary to remove the
>> gingettuple interface (as you did in v0.28.2).  Forcing a cleanup of
>> the pending list seems like a reasonable workaround.  We don't expect
>> this situation to come up frequently, so if the method we use to
>
> Agree
>
>> handle it is not terribly efficient, oh well.  The one thing that
>> concerns me is - what will happen in a hot standby environment, when
>> that patch is committed?  In that situation, I believe that we can't
>> call modify any heap or index pages, so...
>
> I don't see a problems here, because indexes in postgres don't depend on any
> transaction's ids or modes as heap depends. WAL-logger works without that
> knowledge too. May be I missed something here or don't understand.
>
> Although heap's pages could be changed by setting commit-status bits on
> tuple even in read-only transaction, but that changes are not WAL-logged.
> That is correct for index's page too.

It would be helpful if Heikki or Simon could jump in here, but my
understanding is that cleaning up the pending list is a read-write
operation.  I don't think we can do that on a hot standby server.

...Robert


Re: GIN fast insert

From
Heikki Linnakangas
Date:
Robert Haas wrote:
> On Thu, Feb 19, 2009 at 8:36 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:
>>> handle it is not terribly efficient, oh well.  The one thing that
>>> concerns me is - what will happen in a hot standby environment, when
>>> that patch is committed?  In that situation, I believe that we can't
>>> call modify any heap or index pages, so...
>> I don't see a problems here, because indexes in postgres don't depend on any
>> transaction's ids or modes as heap depends. WAL-logger works without that
>> knowledge too. May be I missed something here or don't understand.
>>
>> Although heap's pages could be changed by setting commit-status bits on
>> tuple even in read-only transaction, but that changes are not WAL-logged.
>> That is correct for index's page too.
> 
> It would be helpful if Heikki or Simon could jump in here, but my
> understanding is that cleaning up the pending list is a read-write
> operation.  I don't think we can do that on a hot standby server.

Right, can't do that on a hot standby server.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: GIN fast insert

From
Teodor Sigaev
Date:
> Right, can't do that on a hot standby server.

Is anywhere applicable hot standby patch? Last version on wiki is 9g and it 
can't be applied cleanly.


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


Re: GIN fast insert

From
Heikki Linnakangas
Date:
Teodor Sigaev wrote:
>> Right, can't do that on a hot standby server.
> 
> Is anywhere applicable hot standby patch? Last version on wiki is 9g and 
> it can't be applied cleanly.

The latest version is in Simon's git repository at:

http://git.postgresql.org/?p=~sriggs/simon.git;a=shortlog;h=refs/heads/dev_hot_standby

in the dev_hot_standby branch. I don't think he's posted an updated 
patch based on that work.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: GIN fast insert

From
Simon Riggs
Date:
On Thu, 2009-02-19 at 22:43 -0500, Robert Haas wrote:
> > I don't see a problems here, because indexes in postgres don't
> depend on any
> > transaction's ids or modes as heap depends. WAL-logger works without
> that
> > knowledge too. May be I missed something here or don't understand.
> >
> > Although heap's pages could be changed by setting commit-status bits
> on
> > tuple even in read-only transaction, but that changes are not
> WAL-logged.
> > That is correct for index's page too.
> 
> It would be helpful if Heikki or Simon could jump in here, but my
> understanding is that cleaning up the pending list is a read-write
> operation.  I don't think we can do that on a hot standby server.

>From reading the docs with the patch the pending list is merged into the
main index when a VACUUM is performed. I (think I) can see that
additions to the pending list are WAL logged, so that will work in Hot
Standby. I also see ginEntryInsert() calls during the move from pending
list to main index which means that is WAL logged also. AFAICS this
*could* work during Hot Standby mode.
Best check here http://wiki.postgresql.org/wiki/Hot_Standby#Usage
rather than attempt to read the patch.
Teodor, can you confirm 
* we WAL log the insert into the pending list
* we WAL log the move from the pending list to the main index
* that we maintain the pending list correctly during redo so that it can
be accessed by index scans

The main thing with Hot Standby is that we can't do any writes. So a
pending list cannot change solely because of a gingettuple call on the
*standby*. But that's easy to disable. If all the inserts happened on
the primary node and all the reads happened on the standby, then pending
list would never be cleaned up if the cleanup is triggered only by read.
I would suggest that we trigger cleanup by read at threshold size X and
trigger cleanup by insert at threshold size 5X. That avoids the strange
case mentioned, but generally ensures only reads trigger cleanup. (But
why do we want that??)

I found many parts of the patch and docs quite confusing because of the
way things are named. For me, this is a deferred or delayed insert
technique to allow batching. I would prefer if everything used one
description, rather than "fast", "pending", "delayed" etc. 

Personally, I see ginInsertCleanup() as a scheduled task unrelated to
vacuum. Making the deferred tasks happen at vacuum time is just a
convenient way of having a background task occur regularly. That's OK
for now, but I would like to be able to request a background task
without having to hook into AV.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: GIN fast insert

From
Robert Haas
Date:
On Mon, Feb 23, 2009 at 4:56 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> It would be helpful if Heikki or Simon could jump in here, but my
>> understanding is that cleaning up the pending list is a read-write
>> operation.  I don't think we can do that on a hot standby server.
>
> >From reading the docs with the patch the pending list is merged into the
> main index when a VACUUM is performed. I (think I) can see that
> additions to the pending list are WAL logged, so that will work in Hot
> Standby. I also see ginEntryInsert() calls during the move from pending
> list to main index which means that is WAL logged also. AFAICS this
> *could* work during Hot Standby mode.
> Best check here http://wiki.postgresql.org/wiki/Hot_Standby#Usage
> rather than attempt to read the patch.
> Teodor, can you confirm
> * we WAL log the insert into the pending list
> * we WAL log the move from the pending list to the main index
> * that we maintain the pending list correctly during redo so that it can
> be accessed by index scans
>
> The main thing with Hot Standby is that we can't do any writes. So a
> pending list cannot change solely because of a gingettuple call on the
> *standby*.

That's what I thought.  Thanks for confirming.

> But that's easy to disable. If all the inserts happened on
> the primary node and all the reads happened on the standby, then pending
> list would never be cleaned up if the cleanup is triggered only by read.

No, because the inserts would trigger VACUUM on the primary.

> I would suggest that we trigger cleanup by read at threshold size X and
> trigger cleanup by insert at threshold size 5X. That avoids the strange
> case mentioned, but generally ensures only reads trigger cleanup. (But
> why do we want that??)

I think that's actually not what we want.  What we want is for VACUUM
to deal with it.  Unfortunately that's hard to guarantee since, for
example, someone might turn autovacuum off.  So the issue is what do
we do when we're in the midst of an index scan and our TIDBitmap has
become lossy.  Right now, the answer is that we clean up the pending
list from inside the index scan and then retry the index scan.  I
don't think that's going to work.

I'm starting to think that the right thing to do here is to create a
non-lossy option for TIDBitmap.  Tom has been advocating just losing
the index scan AM altogether, but that risks losing performance in
cases where a LIMIT would have stopped the scan well prior to
completion.

> I found many parts of the patch and docs quite confusing because of the
> way things are named. For me, this is a deferred or delayed insert
> technique to allow batching. I would prefer if everything used one
> description, rather than "fast", "pending", "delayed" etc.

I mentioned this in my previous review (perhaps not quite so
articulately) and I completely agree with you.  It's clear enough
reading the patch because you know that all the changes in the patch
must be related to each other, but once it's applied it's going to be
tough to figure out.

> Personally, I see ginInsertCleanup() as a scheduled task unrelated to
> vacuum. Making the deferred tasks happen at vacuum time is just a
> convenient way of having a background task occur regularly. That's OK
> for now, but I would like to be able to request a background task
> without having to hook into AV.

This has been discussed previously and I assume you will be submitting
a patch at some point, since no one else has volunteered to implement
it.  I think autovacuum is the right way to handle this particular
case because it is a cleanup operation that is not dependent on time
but on write activity and hooks into more or less the same stats
infrastructure, but I don't deny the existence of other cases that
would benefit from a scheduler.

...Robert


Re: GIN fast insert

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I'm starting to think that the right thing to do here is to create a
> non-lossy option for TIDBitmap.  Tom has been advocating just losing
> the index scan AM altogether, but that risks losing performance in
> cases where a LIMIT would have stopped the scan well prior to
> completion.

Actually, I'm going to *insist* that we lose the index AM scan
altogether.  There might be a possibility to put it back in 8.5 or later
if anyone actually makes the case (with some evidence) that it's worth
the trouble.  But right now, what this patch needs is to be made to work
reliably, and having a great deal of complexity added by an inessential
feature is a good way to make sure it doesn't go in at all.
        regards, tom lane


Re: GIN fast insert

From
Robert Haas
Date:
On Mon, Feb 23, 2009 at 10:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I'm starting to think that the right thing to do here is to create a
>> non-lossy option for TIDBitmap.  Tom has been advocating just losing
>> the index scan AM altogether, but that risks losing performance in
>> cases where a LIMIT would have stopped the scan well prior to
>> completion.
>
> Actually, I'm going to *insist* that we lose the index AM scan
> altogether.  There might be a possibility to put it back in 8.5 or later
> if anyone actually makes the case (with some evidence) that it's worth
> the trouble.  But right now, what this patch needs is to be made to work
> reliably, and having a great deal of complexity added by an inessential
> feature is a good way to make sure it doesn't go in at all.

Except that the "inessential" feature in question is a feature that
currently WORKS, and I don't believe that the testing you've done is
anywhere near sufficient to show that no one will be upset if it goes
away.  Without some convincing evidence to support that proposition, I
think it would be better to postpone the whole patch to 8.5 and use
that time to fix the problem, rather than slam something through now
and then possibly find out that we've introduced a performance
regression in cases that used to work well.

In fact, as far as I can see, the performance testing that has been
done on this patch to date is fairly one-sided.  It focuses on the
improvement to index build speed, which is a good thing to test, but I
don't believe anyone has done any testing at all of the cases where
the patch might HURT.  Previously, the relevant case was where someone
has done a bunch of fast-inserts, but autovacuum has not kicked in
yet, or the threshold to trigger autovacuum has not been reached.  Now
we're adding to that every query that currently uses an index scan,
which seemed to be a pretty common plan in the simple cases I tested.

Maybe I'm worrying over nothing?

...Robert


Re: GIN fast insert

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Feb 23, 2009 at 10:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Actually, I'm going to *insist* that we lose the index AM scan
>> altogether.

> Except that the "inessential" feature in question is a feature that
> currently WORKS, and I don't believe that the testing you've done is
> anywhere near sufficient to show that no one will be upset if it goes
> away.

What feature is that --- the ability to get an undefined subset of rows
quickly by using LIMIT without ORDER BY?  Not much of a feature.

> Without some convincing evidence to support that proposition, I
> think it would be better to postpone the whole patch to 8.5 and use
> that time to fix the problem,

Wouldn't bother me any.  We are way overdue for 8.4 already.
        regards, tom lane


Re: GIN fast insert

From
Robert Haas
Date:
On Mon, Feb 23, 2009 at 1:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Mon, Feb 23, 2009 at 10:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Actually, I'm going to *insist* that we lose the index AM scan
>>> altogether.
>
>> Except that the "inessential" feature in question is a feature that
>> currently WORKS, and I don't believe that the testing you've done is
>> anywhere near sufficient to show that no one will be upset if it goes
>> away.
>
> What feature is that --- the ability to get an undefined subset of rows
> quickly by using LIMIT without ORDER BY?  Not much of a feature.

That's a huge over-generalization of the effect of removing index
scans, as I am sure you are well aware.  It only took me about 5
minutes to come up with a test case against CVS HEAD where disabling
index scans resulted in a significant dropoff in performance.  Here it
is:

create table foo (id serial, x int[], primary key (id));
create index foo_gin on foo using gin (x);
insert into foo (x) select array[random()*10000::integer,
random()*10000::integer, random()*10000::integer,
random()*10000::integer] from generate_series(1,10000);
analyze foo;

OK, now here's the query:

select sum(1) from generate_series(1,10000) g left join foo on
array[g] <@ x where x is null;

On my system this takes about 45 ms to execute with default settings
and about 90 ms to execute with index scan disabled.

>> Without some convincing evidence to support that proposition, I
>> think it would be better to postpone the whole patch to 8.5 and use
>> that time to fix the problem,
>
> Wouldn't bother me any.  We are way overdue for 8.4 already.

I completely agree, but there are four patches on the CommitFest wiki
that still need some committer attention before we close up shop:

B-Tree Emulation for GIN
Improve Performance of Multi-Batch Hash Join for Skewed Data Sets
Proposal of PITR performance improvement
SE-PostgreSQL Lite

I have reviewed the second of these already and believe it's in pretty
good shape and may review one or more of the others as time permits,
especially if you or one of the other committers express an opinion
that it would be helpful for me to do that and especially if you
express an opinion on which of them it would be most helpful for.

It is well past time to move "Reducing some DDL Locks to ShareLock" to
committed and leave the unapplied portions for 8.5.  As much as I
would like to have the feature, it is probably also time to think
about punting "Hot Standby" unless it's going to get committed RSN.
At this point, we are definitely holding up both the release of 8.4
and development for 8.5.

...Robert


Re: GIN fast insert

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On my system this takes about 45 ms to execute with default settings
> and about 90 ms to execute with index scan disabled.

[ shrug... ]  That's well within my threshold of pain for this.
In any case, it might be possible to buy some/all of that back with
minor optimization effort on the bitmap-scan code paths; nobody's
ever really bothered to profile that AFAIK.  There is no real
difference in the useful work (page and tuple fetches) getting done
in the two cases, so there's no reason in principle for bitmap scan
to be much slower than indexscan here.  The LIMIT case is the only
one I'm aware of where there's a fundamental reason that bitmap scan
should be slower.
        regards, tom lane


Re: GIN fast insert

From
Robert Haas
Date:
On Tue, Feb 24, 2009 at 10:39 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On my system this takes about 45 ms to execute with default settings
>> and about 90 ms to execute with index scan disabled.
>
> [ shrug... ]  That's well within my threshold of pain for this.
> In any case, it might be possible to buy some/all of that back with
> minor optimization effort on the bitmap-scan code paths; nobody's
> ever really bothered to profile that AFAIK.  There is no real
> difference in the useful work (page and tuple fetches) getting done
> in the two cases, so there's no reason in principle for bitmap scan
> to be much slower than indexscan here.  The LIMIT case is the only
> one I'm aware of where there's a fundamental reason that bitmap scan
> should be slower.

Uh, a semi or anti join stops as soon as one matching row is found,
doesn't it?  ISTM that a semi or anti join is in essence an iterated
limit 1 clause.

...Robert


Re: GIN fast insert

From
Jeff Davis
Date:
On Tue, 2009-02-24 at 00:18 -0500, Robert Haas wrote: 
> It only took me about 5 minutes to come up with a test case against CVS
>  HEAD where disabling index scans resulted in a significant dropoff in
>  performance.  Here it is:

On the other hand, Teodor showed a typical use case and a very
substantial performance gain:

http://archives.postgresql.org/message-id/497F4606.6070303@sigaev.ru

My impression is that GIN is used almost entirely for full text search.
Even though GIN is actually quite general (as the BTree patch shows),
the current users we have to worry about are a fairly narrow group
(correct me if I'm wrong here).

I wonder how many people really use GIN with non-bitmap scans for some
benefit? And even if the benefit exists, does the planner have a way to
identify those cases reliably, or does it have to be done manually?

Regards,Jeff Davis



Re: GIN fast insert

From
Tom Lane
Date:
Jeff Davis <pgsql@j-davis.com> writes:
> On Tue, 2009-02-24 at 00:18 -0500, Robert Haas wrote: 
>> It only took me about 5 minutes to come up with a test case against CVS
>> HEAD where disabling index scans resulted in a significant dropoff in
>> performance.  Here it is:

> On the other hand, Teodor showed a typical use case and a very
> substantial performance gain:

Yeah.  Whatever we do here is a tradeoff (and whether Robert likes it
or not, reliability and code maintainability weigh heavily in the
tradeoff).

> I wonder how many people really use GIN with non-bitmap scans for some
> benefit? And even if the benefit exists, does the planner have a way to
> identify those cases reliably, or does it have to be done manually?

A relevant point there is that most of the estimator functions for
GIN-amenable operators are just smoke and mirrors; so if the planner
is making a good choice between indexscan and bitmapscan at all, it's
mostly luck.  This might get better someday, but not in 8.4.
        regards, tom lane


Re: GIN fast insert

From
Robert Haas
Date:
On Tue, Feb 24, 2009 at 2:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> On the other hand, Teodor showed a typical use case and a very
>> substantial performance gain:
>
> Yeah.  Whatever we do here is a tradeoff (and whether Robert likes it
> or not, reliability and code maintainability weigh heavily in the
> tradeoff).

I have no problem with reliability or code maintainability and I'm not
sure what I said that would give that impression.  If the consensus of
the group is that the performance loss from dropping index scans is
not important, then I'm fine with that, especially if that consensus
is reached in the context of an educated knowledge of what that
performance loss is likely to be.  To me, a 2x slowdown on two-table
anti-join seems pretty bad, but I just work here.  Perhaps nobody else
thinks that a semi-join or anti-join against a GIN index is a
plausible use case (like, find all of the words from the following
list that do not appear in any document)?

If everyone agrees that we don't care about that case (or about
ORDER-BY-without-LIMIT, which is certainly less compelling), then go
ahead and remove it.  I have no horse in this race other than having
been asked to review the patch, which I did.

On the other hand, if a significant number of people think that it
might be a bad idea to make that case significantly worse, then some
redesign work is called for, and that may mean the patch needs to get
bumped.

My own opinion is that it is better to decide on the right design and
then figure out which release that design can go into than it is to
start by deciding this has to go into 8.4 and then figuring out what
can be done in that period of time.  I don't think there is any
question that making GIN continue to support both index scans and
bitmap index scans will make the code more complex, but how bad will
it be?  So far we've ruled out using the planner to prevent index
scans when the pending list is long (because it's not reliable) and
cleaning up the pending list during insert when needed (because it
won't work with Hot Standby).  We haven't decided what WILL work,
apart from ripping out index scans altogether, so to some degree we're
comparing against an unknown.

>> I wonder how many people really use GIN with non-bitmap scans for some
>> benefit? And even if the benefit exists, does the planner have a way to
>> identify those cases reliably, or does it have to be done manually?
>
> A relevant point there is that most of the estimator functions for
> GIN-amenable operators are just smoke and mirrors; so if the planner
> is making a good choice between indexscan and bitmapscan at all, it's
> mostly luck.  This might get better someday, but not in 8.4.

Based on the limited testing I've done thus far, it appears to pick an
index scan for small numbers of rows and a bitmap index scan for
larger number of rows.  Index scans will have lower startup costs
which can be valuable if you only need to scan part of the index (as
in the semi and anti join cases).  I haven't done enough testing to
see if there is any benefit when scanning the whole index and only
returning a few tuples.

...Robert


Re: GIN fast insert

From
Teodor Sigaev
Date:
> Teodor, can you confirm 
> * we WAL log the insert into the pending list
> * we WAL log the move from the pending list to the main index
Yes, I can and I confirm

> * that we maintain the pending list correctly during redo so that it can
> be accessed by index scans
Not sure about correct locking in ginxlog.c - but it should fixable rather easy.


> The main thing with Hot Standby is that we can't do any writes. So a
> pending list cannot change solely because of a gingettuple call on the
> *standby*. But that's easy to disable. If all the inserts happened on
Right. And suggest to increase work_mem.

> the primary node and all the reads happened on the standby, then pending
> list would never be cleaned up if the cleanup is triggered only by read.
> I would suggest that we trigger cleanup by read at threshold size X and
> trigger cleanup by insert at threshold size 5X. That avoids the strange
threshold is not helpful here because for gininsert and gingettuple it depends 
on work_mem setting which could be very different on slave(s) and master. Next, 
it's possible that master never executes read-only queries (i.e. gingettuple 
will not be called) by application's design.

This is a strong objection for invocation of cleanup from gingettuple. So, I'm 
inclined to Tom's idea about removing gingettuple interface although GIN with 
fastupdate=off doesn't affect discussed problem .

> I found many parts of the patch and docs quite confusing because of the
> way things are named. For me, this is a deferred or delayed insert
> technique to allow batching. I would prefer if everything used one
> description, rather than "fast", "pending", "delayed" etc. 
Terminology was taken from inverted index technique.


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


Re: GIN fast insert

From
Teodor Sigaev
Date:
> it be?  So far we've ruled out using the planner to prevent index
> scans when the pending list is long (because it's not reliable) and
> cleaning up the pending list during insert when needed (because it
> won't work with Hot Standby).  We haven't decided what WILL work,

During insert it will work with Hot Standby because there is no any limitation 
for number of pages touched or WAL records. There is a problem with cleanup 
invoked by gingettuple - slave could not start cleanup process at all and hence 
it could emit an error if tidbitmap becomes lossy.

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


Re: GIN fast insert

From
Robert Haas
Date:
On Thu, Feb 26, 2009 at 11:41 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:
>> it be?  So far we've ruled out using the planner to prevent index
>> scans when the pending list is long (because it's not reliable) and
>> cleaning up the pending list during insert when needed (because it
>> won't work with Hot Standby).  We haven't decided what WILL work,
>
> During insert it will work with Hot Standby because there is no any
> limitation for number of pages touched or WAL records. There is a problem
> with cleanup invoked by gingettuple - slave could not start cleanup process
> at all and hence it could emit an error if tidbitmap becomes lossy.

I didn't think about that option - might be reasonable.

...Robert


Re: GIN fast insert

From
Tom Lane
Date:
While I'm looking at this thing...

Why is tbm_add_page() not coded as a simple wrapper around
tbm_mark_page_lossy()?  As coded, it sometimes forces a whole bunch of
pages *other than* the target page to become lossy too, which I cannot
see a reason for it to be doing.
        regards, tom lane


Re: GIN fast insert

From
Teodor Sigaev
Date:
> Why is tbm_add_page() not coded as a simple wrapper around
> tbm_mark_page_lossy()?  As coded, it sometimes forces a whole bunch of
> pages *other than* the target page to become lossy too, which I cannot
> see a reason for it to be doing.

[after digging in tidbitmap]
Oops, I was wrong, I supposed that all pages in chunk should be lossy, but it's 
true only for chunk page. So, tbm_add_page() should only call 
tbm_mark_page_lossy()...

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


Re: GIN fast insert

From
Tom Lane
Date:
Teodor Sigaev <teodor@sigaev.ru> writes:
> Oops, I was wrong, I supposed that all pages in chunk should be lossy, but it's 
> true only for chunk page. So, tbm_add_page() should only call 
> tbm_mark_page_lossy()...

OK, thanks, that's what I thought.  I've changed it in the copy I'm
editing here.

I have another question.  I added the following comment to
ginInsertCleanup(); is it accurate?  (If it isn't, I think
the code is buggy ...)

* This can be called concurrently by multiple backends, so it must cope.* On first glance it looks completely not
concurrent-safeand not crash-safe* either.  The reason it's okay is that multiple insertion of the same entry* is
detectedand treated as a no-op by gininsert.c.  If we crash after* posting entries to the main index and before
removingthem from the* pending list, it's okay because when we redo the posting later on, nothing* bad will happen.
Likewise,if two backends simultaneously try to post* a pending entry into the main index, one will succeed and one will
do*nothing.  We try to notice when someone else is a little bit ahead of* us in the process, but that's just to avoid
wastingcycles.  Only the* action of removing a page from the pending list really needs exclusive* lock.
 

        regards, tom lane


Re: GIN fast insert

From
Teodor Sigaev
Date:
> ginInsertCleanup(); is it accurate?  (If it isn't, I think

Exactly, that is right.

> 
> 
>  * This can be called concurrently by multiple backends, so it must cope.
>  * On first glance it looks completely not concurrent-safe and not crash-safe
>  * either.  The reason it's okay is that multiple insertion of the same entry
>  * is detected and treated as a no-op by gininsert.c.  If we crash after
>  * posting entries to the main index and before removing them from the
>  * pending list, it's okay because when we redo the posting later on, nothing
>  * bad will happen.  Likewise, if two backends simultaneously try to post
>  * a pending entry into the main index, one will succeed and one will do
>  * nothing.  We try to notice when someone else is a little bit ahead of
>  * us in the process, but that's just to avoid wasting cycles.  Only the
>  * action of removing a page from the pending list really needs exclusive
>  * lock.
> 
> 
>             regards, tom lane
> 

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


Re: GIN fast insert

From
Tom Lane
Date:
I've committed this patch with additional editorialization --- mostly
cosmetic changes, except for removing the stats-driven cleanup in favor
of letting cleanups occur during auto-ANALYZE, as per my suggestion
yesterday.
        regards, tom lane


Re: GIN fast insert

From
Josh Berkus
Date:
On 3/24/09 1:18 PM, Tom Lane wrote:
> I've committed this patch with additional editorialization --- mostly
> cosmetic changes, except for removing the stats-driven cleanup in favor
> of letting cleanups occur during auto-ANALYZE, as per my suggestion
> yesterday.

By my count, this was the last patch left for 8.4.  No?

--Josh


Re: GIN fast insert

From
David Fetter
Date:
On Tue, Mar 24, 2009 at 04:55:47PM -0700, Josh Berkus wrote:
> On 3/24/09 1:18 PM, Tom Lane wrote:
>> I've committed this patch with additional editorialization --- mostly
>> cosmetic changes, except for removing the stats-driven cleanup in favor
>> of letting cleanups occur during auto-ANALYZE, as per my suggestion
>> yesterday.
>
> By my count, this was the last patch left for 8.4.  No?

One more patch left according to the Commitfest wiki: B-Tree emulation
for GIN.

http://wiki.postgresql.org/wiki/CommitFestInProgress

Cheers,
David <whisper>beta</>
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate