Re: [HACKERS] [PATCH] Pageinspect - add functions on GIN and GiSTindexes from gevel - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: [HACKERS] [PATCH] Pageinspect - add functions on GIN and GiSTindexes from gevel |
Date | |
Msg-id | CAEepm=1Vf6edWc33qF5us5Vw4J8QS-3LYtFzFVKYscs6c9+B7g@mail.gmail.com Whole thread Raw |
In response to | [HACKERS] [PATCH] Pageinspect - add functions on GIN and GiST indexes fromgevel (Alexey Chernyshov <a.chernyshov@postgrespro.ru>) |
Responses |
Re: [HACKERS] [PATCH] Pageinspect - add functions on GIN and GiSTindexes from gevel
|
List | pgsql-hackers |
On Sat, Jul 22, 2017 at 12:05 AM, Alexey Chernyshov <a.chernyshov@postgrespro.ru> wrote: > the following patch transfers functionality from gevel module > (http://www.sai.msu.su/~megera/wiki/Gevel) which provides functions for > analyzing GIN and GiST indexes to pageinspect. Gevel was originally > designed by Oleg Bartunov, and Teodor Sigaev for developers of GiST and > GIN indexes. > > Functions added: > - gist_stat(text) - shows statistics on GiST Tree > - gist_tree(text) - shows GiST tree > - gist_tree(text, int4) - shows GiST tree up to MAXLEVEL > - gist_print(text) - prints objects stored in GiST tree > - spgist_stat(text) - shows statistics on SP-GiST > - spgist_print(text) - prints objects stored in index > - gin_value_count() - originally gin_stat(text) - prints estimated counts > for index values > - gin_stats() - originally gin_statpage(text) - shows statistics > - gin_count_estimate(text, tsquery) - shows number of indexed rows matched > query > > Tests also transferred, docs for new functions are added. I run pgindent > over the code, but the result is different from those I expected, so I leave > pgindented one. > The patch is applicable to the commit > 866f4a7c210857aa342bf901558d170325094dde. Hi Alexey, This patch still applies but doesn't build after commits 2cd70845 and c6293249. ginfuncs.c:91:47: error: invalid type argument of ‘->’ (have ‘FormData_pg_attribute’) st->index->rd_att->attrs[st->attnum]->attbyval, ...several similar errors... For example, that expression needs to be changed to: TupleDescAttr(st->index->rd_att, attnum)->attbyval Here is some superficial review since I'm here: +/* + * process_tuples + * Retrieves the number of TIDs in a tuple. + */ +static void +process_tuple(FuncCallContext *funcctx, GinStatState * st, IndexTuple itup) "process_tuples" vs "process_tuple" + /* do no distiguish various null category */ "do not distinguish various null categories" + st->nulls[0] = (st->category == GIN_CAT_NORM_KEY) ? false : true; That's a long way to write (st->category != GIN_CAT_NORM_KEY)! + * We scaned the whole page, so we should take right page "scanned" +/* + * refind_position + * Refinds a previois position, at returns it has correctly + * set offset and buffer is locked. + */ "previous", s/, at returns/. On return/ + memset(st->nulls, false, 2 * sizeof(*st->nulls)); "false" looks strange here where an int is expected. The size expression is weird. I think you should just write: st->nulls[0] = false; st->nulls[1] = false; Investigating the types more closely, I see that 'nulls' is declared like this (in struct GinStatState): + char nulls[2]; Then later you do this: + htuple = heap_form_tuple(funcctx->attinmeta->tupdesc, + st->dvalues, + st->nulls); But that's not OK, heap_form_tuple takes bool *. bool and char are not interchangeable (they may have different sizes on some platforms, among other problems, even though usually they are both 1 byte). So I think you should declare it as "bool nulls[2]". Meanwhile in TypeStorage you have a member "bool *nulls", which you then initialise like so: + st->nulls = (char *) palloc((tupdesc->natts + 2) * sizeof(*st->nulls)); That cast is wrong. +/* + * gin_count_estimate + * Outputs number of indexed rows matched query. It doesn't touch heap at + * all. Maybe "... number of indexed rows matching query."? + if (attnum < 0 || attnum >= st->index->rd_att->natts) + elog(ERROR, "Wrong column's number"); + st->attnum = attnum; Maybe "invalid column number" or "invalid attribute number"? + elog(ERROR, "Column type is not a tsvector"); s/C/c/ (according to convention). + * Prints various stat about index internals. s/stat/stats/ + elog(ERROR, "Relation %s.%s is not an index", s/R/r/ + elog(ERROR, "Index %s.%s has wrong type", s/I/i/ + <function>gin_value_count</function> prints estimated counts for each + indexed values, single-argument function will return result for a first + column of index. For example: "... for each indexed value. The single argument version will return results for the first column of an index. For example:" + <function>gin_count_estimate</function> outputs number of indexed + rows matched query. It doesn't touch heap at all. For example: "... outputs the number of indexed rows matched by a query. ..." + <function>gist_print</function> prints objects stored in + <acronym>GiST</acronym> tree, works only if objects in index have + textual representation (<function>type_out</function> functions should be + implemented for the given object type). It's known to work with R-tree + <acronym>GiST</acronym> based index. Note, in example below, objects are + of type box. For example: "... prints objects stored in a GiST tree. it works only if the objects in the index have textual representation (type_out functions should be implemented for the given object type). It's known to work with R-tree GiST-based indexes. Note: in the example below, objects are of type box. For example:" + <function>gin_value_count</function> prints estimated counts for each + indexed value for a given column. For example: Maybe "for a given column (starting from zero)"? + <function>spgist_print</function> prints objects stored in + <acronym>SP-GiST</acronym> tree, works only if objects in index have + textual representation (<function>type_out</function> functions should + be implemented for the given object type). "... tree. It works only if ..." + Note 1. in example below we used quad_point_ops which uses point + for leaf and prefix value, but doesn't use node_label at all. + Use type 'int' as dummy type for prefix or/and node_label. "... in the example ..." A few random whitespace changes: - (errmsg("must be superuser to use raw page functions")))); + (errmsg("must be superuser to use raw page functions"))) + ); - opaq->flags, GIN_META))); + opaq->flags, GIN_META)) + ); - flags[nflags++] = DirectFunctionCall1(to_hex32, Int32GetDatum(flagbits)); + flags[nflags++] = DirectFunctionCall1(to_hex32, + Int32GetDatum(flagbits)); - (GIN_DATA | GIN_LEAF | GIN_COMPRESSED)))); + (GIN_DATA | GIN_LEAF | GIN_COMPRESSED))) + ); -- Thomas Munro http://www.enterprisedb.com
pgsql-hackers by date: