On Tue, Jun 12, 2018 at 11:01 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> FWIW, I've looked at this again. I think that the situation Siva
> reported in the first mail can happen before we get commit 3b2787e.
> That is, gin indexes had had a data corruption bug. I've reproduced
> the situation with PostgreSQL 10.1 and observed that a gin index can
> corrupt.

Thank you so much for trying the steps out and reproducing the issue!

It is good that the correctness of query results is not affected when
reading from a gin index that has duplicates. One place where an
assertion will fail is when we compress the posting list [1], but since
it is a debug assert, it is compiled out in release builds. Here, the
correctness of the varbyte encoding is not affected, but the encoded
page will have redundant zero deltas. When we repack leaf items [2],
this may cause creation of a new segment if all the items (including
the duplicates) do not fit in a single page.

[1] - https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/gin/ginpostinglist.c;h=8d2d31ac7236de5e2c62c6fa745af45a2b895b2c;hb=refs/heads/REL_10_STABLE#l209
[2] - https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/gin/gindatapage.c;h=2e5ea479763f32d6dc637e1c27d5975d124f293f;hb=refs/heads/REL_10_STABLE#l1601

Best
Siva
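
To illustrate the point about redundant zero deltas, here is a minimal
Python sketch of delta + varbyte encoding in the general style used for
posting lists. It is not the PostgreSQL code: the function names are
hypothetical, items are plain integers rather than packed item pointers,
and the first item is assumed to be stored uncompressed by the caller.

```python
def encode_varbyte(value: int) -> bytes:
    """Encode a non-negative integer 7 bits per byte, low-order group
    first; a set high bit means another byte follows."""
    if value < 0:
        raise ValueError("deltas must be non-negative (input must be sorted)")
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_deltas(sorted_items: list[int]) -> bytes:
    """Encode every item after the first as a varbyte delta from its
    predecessor. A duplicate item yields a delta of zero, which still
    costs one byte (0x00)."""
    encoded = bytearray()
    prev = sorted_items[0]
    for item in sorted_items[1:]:
        # The C code asserts item > prev here in debug builds; this
        # sketch deliberately allows item == prev to show the effect.
        encoded += encode_varbyte(item - prev)
        prev = item
    return bytes(encoded)


def decode_deltas(first: int, data: bytes) -> list[int]:
    """Inverse of encode_deltas, given the uncompressed first item."""
    items = [first]
    acc = 0
    shift = 0
    for byte in data:
        acc |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7          # continuation: more bits follow
        else:
            items.append(items[-1] + acc)
            acc = 0
            shift = 0
    return items
```

With this sketch, encoding [100, 105, 300] takes 3 bytes, while
[100, 105, 105, 300] takes 4: the duplicate decodes correctly but adds a
redundant zero byte, which is how duplicates can push a segment over its
size limit.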