Re: Multicolumn index corruption on 8.4 beta 2 - Mailing list pgsql-hackers
From | Floris Bos / Maxnet |
---|---|
Subject | Re: Multicolumn index corruption on 8.4 beta 2 |
Date | |
Msg-id | 4A2E4A2C.3090207@je-eigen-domein.nl Whole thread Raw |
In response to | Re: Multicolumn index corruption on 8.4 beta 2 (Richard Huxton <dev@archonet.com>) |
Responses |
Re: Multicolumn index corruption on 8.4 beta 2
Re: Multicolumn index corruption on 8.4 beta 2 Re: Multicolumn index corruption on 8.4 beta 2 |
List | pgsql-hackers |
Hi, Richard Huxton wrote: > Not a hacker myself, but I can tell you that the first question you'll > be asked is "can you produce a test case"? If you can generate the > problem from a test table+generated data that will let people figure out > the problem for you. Unfortunately, I have not been able to produce a test case (yet) on a small data set. While the data in the database is public information, the whole database is about 100 GB, and therefore kinda hard to share. > If not, details of the table schema will be needed, and is there any > pattern to the missed rows? Also - compile settings, character set and > locale details might be relevant too. == Compile settings == No fancy settings. - Clean Opensolaris 2009.06 installation - Installed gcc and gmake packages. - Downloaded source and did a ./configure --disable-readline ; gmake ; gmake install == Postgresql settings == The following settings differ from the defaults: -- shared_buffers=3500MB maintenance_work_mem = 128MB fsync = off synchronous_commit = off checkpoint_segments = 25 -- The locale used when creating the database is SQL_ASCII == Hardware == Tyan barebone 2x Opteron 2376 quadcore 32 GB reg ecc memory 1x Intel X25-E 32 GB SSD for OS and pg_xlog directory 2x Intel X25-E 64 GB SSD (ZFS striping) for the database == Table layout == -- Table "public.posts_index" Column | Type | Modifiers ------------+------------------------+----------------------------------------------------------- cid | integer | not null default nextval('posts_index_cid _seq'::regclass) groupid | integer | not null startdate | integer | not null poster | character varying(64) | not null basefile | character varying(64) | not null subject | character varying(255) |not null size | real | nfo | boolean | c | boolean | parts | integer | totalparts | integer | imdb | integer | ng1 | boolean | default false g2 | integer | default 0 g3 | integer | default 0 data | bytea | Indexes: "posts_index5_pkey" PRIMARY KEY, btree (cid) CLUSTER "gr_idx" btree (groupid, (- cid)) "pgb_idx" btree(poster, groupid, basefile) -- Only noticed problems with the pgb_idx index so far. The problem only occurs on a subset of the rows, at a time. After adding/updating rows and doing a reindex, the rows that were missing before sometimes suddenly do work, but then different ones do not. > And can you post an explain plan for the incorrect scan? In particular is it using a bitmap index scan or a regular indexscan? Or does it happen with either? Happens with both. Index scan: === => explain SELECT count(*) FROM posts_index WHERE poster='Yenc@power-post.org (Yenc-PP-A&A)' AND groupid=300 AND basefile='NIB8124849'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=11.25..11.26 rows=1 width=0) -> Index Scan using pgb_idx on posts_index (cost=0.00..11.25 rows=1 width=0) Index Cond: (((poster)::text = 'Yenc@power-post.org (Yenc-PP-A&A)'::text) AND (groupid = 300) AND ((basefile)::text = 'NIB8124849'::text)) => SELECT count(*) FROM posts_index WHERE poster='Yenc@power-post.org (Yenc-PP-A&A)' AND groupid=300 AND basefile='NIB8124849'; count ------- 0 === When I disable index scan, it uses bitmap without luck: == => set enable_indexscan=false; SET => explain SELECT count(*) FROM posts_index WHERE poster='Yenc@power-post.org (Yenc-PP-A&A)' AND groupid=300 AND basefile='NIB8124849'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=11.26..11.27 rows=1 width=0) -> Bitmap Heap Scan on posts_index (cost=7.24..11.26 rows=1 width=0) Recheck Cond: (((poster)::text = 'Yenc@power-post.org (Yenc-PP-A&A)'::text) AND (groupid = 300) AND ((basefile)::text = 'NIB8124849'::text)) -> Bitmap Index Scan on pgb_idx (cost=0.00..7.24 rows=1 width=0) Index Cond:(((poster)::text = 'Yenc@power-post.org (Yenc-PP-A&A)'::text) AND (groupid = 300) AND ((basefile)::text = 'NIB8124849'::text)) => SELECT count(*) FROM posts_index WHERE poster='Yenc@power-post.org (Yenc-PP-A&A)' AND groupid=300 AND basefile='NIB8124849'; count ------- 0 == Sequential scan does find the row: == => set enable_indexscan=false; SET => set enable_bitmapscan=false; SET => explain SELECT count(*) FROM posts_index WHERE poster='Yenc@power-post.org (Yenc-PP-A&A)' AND groupid=300 AND basefile='NIB8124849'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------ Aggregate (cost=288153.28..288153.29 rows=1 width=0) -> Seq Scan on posts_index (cost=0.00..288153.28 rows=1 width=0) Filter: (((poster)::text = 'Yenc@power-post.org (Yenc-PP-A&A)'::text) AND (groupid = 300) AND ((basefile)::text = 'NIB8124849'::text)) (3 rows) => SELECT count(*) FROM posts_index WHERE poster='Yenc@power-post.org (Yenc-PP-A&A)' AND groupid=300 AND basefile='NIB8124849'; count ------- 1 == Yours sincerely, Floris Bos
pgsql-hackers by date: