Re: Bulkdelete and Vacuum operations on custom index - Mailing list pgsql-general

From Tom Lane
Subject Re: Bulkdelete and Vacuum operations on custom index
Msg-id 28168.1268836970@sss.pgh.pa.us
In response to Bulkdelete and Vacuum operations on custom index  (Carsten Kropf <ckropf2@fh-hof.de>)
List pgsql-general
Carsten Kropf <ckropf2@fh-hof.de> writes:
> I am currently implementing some index access methods on top of
> PostgreSQL. So far this is working properly. However, I am now
> implementing bulk deletion and vacuum of the structure, and I am not
> sure how to do it. It would be much easier to just collect statistics
> in bulkdelete and to do the "real deal" of deleting the particular
> entries from my structures when vacuum is called on the index. Is it
> legitimate to do this: just collect statistics, pass the statistics
> and the items to be deleted back to the caller in main memory, and
> perform the real deletion of entries in vacuum?

No.  You *must* make the index entries go away during bulkdelete,
because the heap tuples they are pointing at will be deleted as soon
as it returns.  If you don't do this, and there's a crash before the
vacuum finishes, you have dangling index entries pointing at nonexistent
heap entries, which will lead to big trouble later.  I think you
probably don't even need a crash to have trouble --- consider a
concurrent indexscan query that finds one of those index entries and
tries to visit the heap tuple from it.

The other problem with your sketch is that you can't assume you have an
indefinitely large amount of working memory available.
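
To illustrate the memory point, here is a minimal self-contained C sketch.
It assumes a toy in-memory index and a fixed-size buffer of dead heap
pointers. Every name in it (DemoTid, DemoIndex, DemoDeadBatch,
DEMO_BATCH_CAPACITY, demo_bulkdelete_pass, demo_vacuum) is hypothetical
and is not part of PostgreSQL's API. The idea it shows is that the caller
can only hold a bounded batch of dead pointers, so the bulkdelete step may
run several times and must finish removing each batch before returning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory budget: at most 4 dead TIDs at once. */
#define DEMO_BATCH_CAPACITY 4

/* Toy heap pointer; not PostgreSQL's ItemPointerData. */
typedef struct { unsigned block; unsigned offset; } DemoTid;

/* Bounded buffer of heap pointers that are about to be deleted. */
typedef struct { DemoTid tids[DEMO_BATCH_CAPACITY]; size_t count; } DemoDeadBatch;

/* Statistics carried from one bulkdelete pass to the next. */
typedef struct { size_t passes; size_t tuples_removed; } DemoBulkStats;

/* Toy index: a flat array of heap pointers. */
typedef struct { DemoTid tids[32]; size_t count; } DemoIndex;

/* Linear search of the current batch for one heap pointer. */
bool demo_batch_contains(const DemoDeadBatch *batch, DemoTid tid)
{
    for (size_t i = 0; i < batch->count; i++)
        if (batch->tids[i].block == tid.block &&
            batch->tids[i].offset == tid.offset)
            return true;
    return false;
}

/* One bulkdelete pass: entries in the batch are removed before returning. */
void demo_bulkdelete_pass(DemoIndex *idx, const DemoDeadBatch *batch,
                          DemoBulkStats *stats)
{
    size_t kept = 0;
    for (size_t i = 0; i < idx->count; i++) {
        if (demo_batch_contains(batch, idx->tids[i]))
            stats->tuples_removed++;
        else
            idx->tids[kept++] = idx->tids[i];
    }
    idx->count = kept;
    stats->passes++;
}

/* Driver: feeds dead pointers to the index one bounded batch at a time. */
void demo_vacuum(DemoIndex *idx, const DemoTid *dead, size_t ndead,
                 DemoBulkStats *stats)
{
    DemoDeadBatch batch = { .count = 0 };
    for (size_t i = 0; i < ndead; i++) {
        assert(batch.count < DEMO_BATCH_CAPACITY);
        batch.tids[batch.count++] = dead[i];
        if (batch.count == DEMO_BATCH_CAPACITY) {
            /* Buffer full: flush to the index, then reuse the buffer. */
            demo_bulkdelete_pass(idx, &batch, stats);
            batch.count = 0;
        }
    }
    if (batch.count > 0)
        demo_bulkdelete_pass(idx, &batch, stats);
}
```

With six dead pointers and a capacity of four, demo_vacuum makes two
passes over the index. Neither pass leaves work behind for a later phase
to finish from memory.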

Perhaps you could set a flag on each deleted index tuple during
bulkdelete (with scans knowing to ignore marked tuples) and then do the
physical reorganization at vacuum cleanup.  This would imply doing a
full scan of the index during cleanup (to find the dead entries) but we
do similar things in btree indexes and the performance seems to be OK.
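
The flag-then-reorganize idea can be sketched in self-contained C as
follows. This is only an illustration under stated assumptions: the index
is a toy in-memory array, and all names (ToyTid, ToyIndexEntry, ToyIndex,
toy_bulkdelete, toy_scan, toy_vacuumcleanup, toy_block_is_dead) are
hypothetical rather than PostgreSQL's actual structures or entry points.
The callback parameter is meant to mirror the way vacuum tells bulkdelete
which heap tuples are going away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy heap pointer; not PostgreSQL's ItemPointerData. */
typedef struct { unsigned block; unsigned offset; } ToyTid;

/* Index entry with a "dead" flag that scans must honor. */
typedef struct { int key; ToyTid heap_tid; bool dead; } ToyIndexEntry;

typedef struct { ToyIndexEntry entries[16]; size_t count; } ToyIndex;

/* Callback that reports whether a heap tuple is being deleted. */
typedef bool (*ToyDeadCallback)(const ToyTid *tid, void *state);

/* Phase 1 (bulkdelete): flag every entry whose heap tuple is going away.
 * The entry stays in place but is logically gone once this returns. */
size_t toy_bulkdelete(ToyIndex *idx, ToyDeadCallback is_dead, void *state)
{
    size_t flagged = 0;
    for (size_t i = 0; i < idx->count; i++) {
        ToyIndexEntry *e = &idx->entries[i];
        if (!e->dead && is_dead(&e->heap_tid, state)) {
            e->dead = true;
            flagged++;
        }
    }
    return flagged;
}

/* Scan: return heap pointers for a key, ignoring flagged entries. */
size_t toy_scan(const ToyIndex *idx, int key, ToyTid *out, size_t max_out)
{
    size_t n = 0;
    for (size_t i = 0; i < idx->count && n < max_out; i++) {
        const ToyIndexEntry *e = &idx->entries[i];
        if (!e->dead && e->key == key)
            out[n++] = e->heap_tid;
    }
    return n;
}

/* Phase 2 (vacuum cleanup): full pass that physically drops dead entries. */
size_t toy_vacuumcleanup(ToyIndex *idx)
{
    size_t kept = 0;
    for (size_t i = 0; i < idx->count; i++)
        if (!idx->entries[i].dead)
            idx->entries[kept++] = idx->entries[i];
    assert(kept <= idx->count);
    size_t removed = idx->count - kept;
    idx->count = kept;
    return removed;
}

/* Example callback: treat every tuple on one heap block as dead. */
bool toy_block_is_dead(const ToyTid *tid, void *state)
{
    return tid->block == *(const unsigned *)state;
}
```

After toy_bulkdelete returns, toy_scan can no longer hand out a pointer to
a deleted heap tuple, even though the space is only reclaimed later by
toy_vacuumcleanup. A real access method would also have to make the flag
update crash-safe and safe against concurrent scans, which this sketch
does not attempt.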

BTW, this seems a bit off-topic for pgsql-general.  You'd be better
off asking such questions in -hackers.

            regards, tom lane
