> There's a definitional issue here, which is what does it mean to be
> counting index tuples. I think GIN could bypass the VACUUM error check
> by always returning the heap tuple count as its index tuple count. This
One problem: ambulkdelete doesn't have any access to the heap or to the heap's
statistics (num_tuples in scan_index() and vacuum_index() in vacuum.c), so
ambulkdelete can't set stats->num_index_tuples equal to num_tuples. With
partial indexes the problem gets even worse...
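
For context, the check in question sits in scan_index()/vacuum_index() in
commands/vacuum.c and, simplified (the real code also special-cases partial
indexes), looks roughly like this; num_tuples comes from the heap, which
ambulkdelete never sees:

    if (stats->num_index_tuples != num_tuples)
        ereport(WARNING,
                (errmsg("index \"%s\" contains %.0f row versions, but table contains %.0f row versions",
                        RelationGetRelationName(indrel),
                        stats->num_index_tuples, num_tuples),
                 errhint("Rebuild the index with REINDEX.")));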
After looking into vacuum.c I found the following ways to skip the check:
1) Simplest: just return NULL from ginvacuumcleanup. Disadvantage: we drop any statistics.
2) Quick hack in vacuum.c, to be fixed in the future (spelled out in the sketch after this list):
       if (indrel->rd_rel->relam == GIN_AM_OID)
           stats->num_index_tuples = num_tuples;
       else if (stats->num_index_tuples != num_tuples)
           { check as now }
3) Add a column to pg_am that tells scan_index/vacuum_index to behave as above.
I don't think such a column would be needed often - only for inverted indexes.
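
Spelled out with the surrounding check, quick hack (2) would read roughly like
this (a sketch only; GIN_AM_OID is the constant the GIN patch adds for the new
access method):

    /* In scan_index()/vacuum_index(): take the heap count on faith
     * for GIN, since GIN counts index entries, not heap tuples. */
    if (indrel->rd_rel->relam == GIN_AM_OID)
        stats->num_index_tuples = num_tuples;
    else if (stats->num_index_tuples != num_tuples)
    {
        /* existing mismatch WARNING, unchanged (see above) */
    }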
If there are no objections, on Tuesday we will add quick hack (2) and commit GIN.
After that our plan is:
1) add opclasses for the other array types
2) add indisclustered=true for all GIN indexes via changes in
UpdateIndexRelation() and mark_index_clustered() (see the sketch after this
list). The issue is: can a table currently be clustered on several indexes?
Because GIN is always 'clustered', a table could end up clustered on several
GIN indexes plus one other index. The CLUSTER command on a GIN index should do
nothing. Maybe it would be cleaner to add an indclustered column to pg_am.
3) return to the WAL problem with GiST
4) work on gincostestimate and, possibly, tweak the cost estimates of GIN's opclasses... including the num_tuples issue
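
For plan item (2), a minimal sketch of the idea, assuming the index AM's OID
is available at the point where UpdateIndexRelation() (catalog/index.c) builds
the pg_index row; accessMethodId here is only a stand-in for however that OID
would actually be obtained:

    /* Sketch only: mark GIN indexes as permanently clustered when
     * their pg_index row is first built. */
    values[Anum_pg_index_indisclustered - 1] =
        BoolGetDatum(accessMethodId == GIN_AM_OID);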
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/