Re: WIP: Covering + unique indexes. - Mailing list pgsql-hackers
From | Anastasia Lubennikova |
---|---|
Subject | Re: WIP: Covering + unique indexes. |
Date | |
Msg-id | 57050BFE.7020805@postgrespro.ru Whole thread Raw |
In response to | Re: WIP: Covering + unique indexes. (Peter Geoghegan <pg@heroku.com>) |
Responses |
Re: WIP: Covering + unique indexes.
Re: WIP: Covering + unique indexes. |
List | pgsql-hackers |
06.04.2016 03:05, Peter Geoghegan: > On Tue, Apr 5, 2016 at 1:31 PM, Peter Geoghegan<pg@heroku.com> wrote: >> My new understanding: The extra "included" columns are stored in the >> index, but do not affect its sort order at all. They are no more part >> of the key than, say, the heap TID that the key points to. They are >> just "payload". It was really long and complicated discussion. I'm glad that finally we are in agreement about the patch. Anyway, I think all mentioned questions will be very helpful for the future work on b-tree. > Noticed a few issues following another pass: > > * tuplesort.c should handle the CLUSTER case in the same way as the > btree case. No? Yes, I just missed that cluster uses index sort. Fixed. > * Why have a RelationGetNumberOfAttributes(indexRel) call in > tuplesort_begin_index_btree() at all now? Fixed. > * This critical section is unnecessary, because this happens during > index builds: > > + if (indnkeyatts != indnatts && P_ISLEAF(opageop)) > + { > + /* > + * It's essential to truncate High key here. > + * The purpose is not just to save more space > on this particular page, > + * but to keep whole b-tree structure > consistent. Subsequent insertions > + * assume that hikey is already truncated, and > so they should not > + * worry about it, when copying the high key > into the parent page > + * as a downlink. > + * NOTE It is not crutial for reliability in present, > + * but maybe it will be that in the future. > + * NOTE this code will be changed by the > "btree compression" patch, > + * which is in progress now. > + */ > + keytup = index_reform_tuple(wstate->index, oitup, > + > indnatts, indnkeyatts); > + > + /* delete "wrong" high key, insert keytup as > P_HIKEY. */ > + START_CRIT_SECTION(); > + PageIndexTupleDelete(opage, P_HIKEY); > + > + if (!_bt_pgaddtup(opage, > IndexTupleSize(keytup), keytup, P_HIKEY)) > + elog(ERROR, "failed to rewrite > compressed item in index \"%s\"", > + RelationGetRelationName(wstate->index)); > + END_CRIT_SECTION(); > + } > > Note that START_CRIT_SECTION() promotes any ERROR to PANIC, which > isn't useful here, because we have no buffer lock held, and nothing > must be WAL-logged. > > * Think you forgot to update spghandler(). (You did not add a test for > just that one AM, either) Fixed. > * I wonder why this restriction needs to exist: > > + else > + elog(ERROR, "Expressions are not supported in > included columns."); > > What does not supporting it buy us? Was it just that the pg_index > representation is more complicated, and you wanted to put it off? > > An error like this should use ereport(ERROR, > (errcode(ERRCODE_FEATURE_NOT_SUPPORTED ..., btw. Yes, you get it right. It was a bit complicated to implement and I decided to delay it to the next patch. errmsg is fixed. > * I would like to see index_reform_tuple() assert that the new, > truncated index tuple is definitely <= the original (I worry about the > 1/3 page restriction issue). Maybe you should also change the name of > index_reform_tuple(), per David. Is it possible that the new tuple, containing less attributes than the old one, will have a greater size? Maybe you can give an example? I think that Assert(indnkeyatts <= indnatts); covers this kind of errors. I do not mind to rename this function, but what name would be better? index_truncate_tuple()? > * There is some stray whitespace within RelationGetIndexAttrBitmap(). > I think you should have updated it with code, though. I don't think > it's necessary for HOT updates to work, but I think it could be > necessary so that we don't need to get a row lock that blocks > non-conflict foreign key locking (see heap_update() callers). I think > you need to be careful for non-key columns within the loop in > RelationGetIndexAttrBitmap(), basically, because it seems to still go > through all columns. UPSERT also must call this code, FWIW. > > * I think that a similar omission is also made for the replica > identity stuff in RelationGetIndexAttrBitmap(). Some thought is needed > on how this patch interacts with logical decoding, I guess. Good point. Indexes are everywhere in the code. I missed that RelationGetIndexAttrBitmap() is used not only for REINDEX. I'll discuss it with Theodor and send an updated patch tomorrow. > * Valgrind shows an error with an aggregate statement I tried: > > 2016-04-05 17:01:31.129 PDT 12310 LOG: statement: explain analyze > select count(*) from ab where b > 5 group by a, b; > ==12310== Invalid read of size 4 > ==12310== at 0x656615: match_clause_to_indexcol (indxpath.c:2226) > ==12310== by 0x656615: match_clause_to_index (indxpath.c:2144) > ==12310== by 0x656DBC: match_clauses_to_index (indxpath.c:2115) > ==12310== by 0x658054: match_restriction_clauses_to_index (indxpath.c:2026) > ==12310== by 0x658054: create_index_paths (indxpath.c:269) > ==12310== by 0x64D1DB: set_plain_rel_pathlist (allpaths.c:649) > ==12310== by 0x64D1DB: set_rel_pathlist (allpaths.c:427) > ==12310== by 0x64D93B: set_base_rel_pathlists (allpaths.c:299) > ==12310== by 0x64D93B: make_one_rel (allpaths.c:170) > ==12310== by 0x66876C: query_planner (planmain.c:246) > ==12310== by 0x669FBA: grouping_planner (planner.c:1666) > ==12310== by 0x66D0C9: subquery_planner (planner.c:751) > ==12310== by 0x66D3DA: standard_planner (planner.c:300) > ==12310== by 0x66D714: planner (planner.c:170) > ==12310== by 0x6FD692: pg_plan_query (postgres.c:798) > ==12310== by 0x59082D: ExplainOneQuery (explain.c:350) > ==12310== Address 0xbff290c is 2,508 bytes inside a block of size 8,192 alloc'd > ==12310== at 0x4C2AB80: malloc (in > /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) > ==12310== by 0x81B7FA: AllocSetAlloc (aset.c:853) > ==12310== by 0x81D257: palloc (mcxt.c:907) > ==12310== by 0x4B6F65: RelationGetIndexScan (genam.c:94) > ==12310== by 0x4C135D: btbeginscan (nbtree.c:431) > ==12310== by 0x4B7A5C: index_beginscan_internal (indexam.c:279) > ==12310== by 0x4B7C5A: index_beginscan (indexam.c:222) > ==12310== by 0x4B73D1: systable_beginscan (genam.c:379) > ==12310== by 0x7E8CF9: ScanPgRelation (relcache.c:341) > ==12310== by 0x7EB3C4: RelationBuildDesc (relcache.c:951) > ==12310== by 0x7ECD35: RelationIdGetRelation (relcache.c:1800) > ==12310== by 0x4A4D37: relation_open (heapam.c:1118) > ==12310== > { > <insert_a_suppression_name_here> > Memcheck:Addr4 > fun:match_clause_to_indexcol > fun:match_clause_to_index > fun:match_clauses_to_index > fun:match_restriction_clauses_to_index > fun:create_index_paths > fun:set_plain_rel_pathlist > fun:set_rel_pathlist > fun:set_base_rel_pathlists > fun:make_one_rel > fun:query_planner > fun:grouping_planner > fun:subquery_planner > fun:standard_planner > fun:planner > fun:pg_plan_query > fun:ExplainOneQuery > } > > Separately, I tried "make installcheck-tests TESTS=index_including" > from Postgres + Valgrind, with Valgrind's --track-origins option > enabled (as it was above). I recommend installing Valgrind, and making > sure that the patch shows no errors. I didn't actually find a Valgrind > issue from just using your regression tests (nor did I find an issue > from separately running the regression tests with > CLOBBER_CACHE_ALWAYS, FWIW). > Thank you for advice. Another miss of index->ncolumns to index->nkeycolumns replacement in match_clause_to_index. Fixed. I also updated couple of typos in documentation. Thank you again for the detailed review. -- Anastasia Lubennikova Postgres Professional:http://www.postgrespro.com The Russian Postgres Company
Attachment
pgsql-hackers by date: