Re: How to implement Gin method? - Mailing list pgsql-hackers
From | Martijn van Oosterhout |
---|---|
Subject | Re: How to implement Gin method? |
Date | |
Msg-id | 20130708195638.GC15254@svana.org Whole thread Raw |
In response to | Re: How to implement Gin method? (kenji uno <ku@digitaldolphins.jp>) |
List | pgsql-hackers |
On Mon, Jul 08, 2013 at 03:21:09PM +0900, kenji uno wrote: > Hi. > > Ok, ok thanks. > > My problem is to shorten time of searching full text stored in text field. Ok, your explanation of your problem really helps, thanks. > However the following query is very slow! 9,400ms. It uses "Seq Scan" lol. > > SELECT * FROM xxx > WHERE TRUE > AND (ddstrike(title,'ABC') OR (filter(fts1body) @> filter('AAA') AND ddstrike(fts1body,'AAA'))) Well, in this case it still needs to scan the whole table to search the title obviously. > Apply filter to "title" column too. > > The best query result costs 3,700ms. 18 hits. It surely uses expected query plan: two "Bitmap index scan" -> "Bitmap Or"-> "Bitmap Heap Scan". > > SELECT * FROM xxx > WHERE TRUE > AND (filter(title) @> filter('ABC') OR filter(fts1body) @> filter('ABC')) > AND (ddstrike(title,'ABC') OR ddstrike(fts1body,'ABC')) It would be useful to see the "explain analyze" of this query. Note that looking up 11,000 entries in an index could very take as long as sequentially scanning the whole table. > However, I may think good idea which uses inverted index. I think your above idea is a good one, but you need to work out why your above implementation didn't work out and why you think implementing it directly will be better. > > So I want to know... > - the actual work of extractQuery and consistant. > - the detail interface of extractValue/extractQuery/consistant. It may help understanding. > > I looked at contrib/_int.sql of PG8.2.22 Whoa, very old version, please look at something newer. For example the RECHECK flag below is no longer used. > There are definitions of int[] GIN support. > --- > CREATE OPERATOR CLASS gin__int_ops > FOR TYPE _int4 USING gin > AS > OPERATOR 3 &&, > OPERATOR 6 = (anyarray, anyarray) RECHECK, > OPERATOR 7 @>, > OPERATOR 8 <@ RECHECK, > OPERATOR 13 @, > OPERATOR 14 ~ RECHECK, > OPERATOR 20 @@ (_int4, query_int), > FUNCTION 1 btint4cmp (int4, int4), > FUNCTION 2 ginarrayextract (anyarray, internal), > FUNCTION 3 ginint4_queryextract (internal, internal, int2), > FUNCTION 4 ginint4_consistent (internal, int2, internal), > STORAGE int4; > --- > Both ginint4_queryextract and ginint4_consistent assume that "query" argument is a PGARRAY (ArrayType *). Where is it decided?Is it array of STORAGE type? Remember the above uses operators which are what is indexed. The left hand side is the array. The right hand side is whatever is defined. intarray defines the operator &&(int[], int[]) hence the "query" argument is int[] in that case. Apparently intarray accepts many kinds of queries, it is the operators that define what actually happens. > Both extractQuery(ginint4_queryextract) and extractValue(ginarrayextract) seem to return similar value type. They returnDatum array of int4. Is it array of STORAGE type? From my reading of http://www.postgresql.org/docs/9.2/static/gin-extensibility.html, yes they must return an array of the STORAGE type. The last paragraph on that page says: The actual data types of the various Datum values mentioned above vary depending on the operator class. The item valuespassed to extractValue are always of the operator class's input type, and all key values must be of the class'sSTORAGE type. The type of the query argument passed to extractQuery and consistent is whatever is specified asthe right-hand input type of the class member operator identified by the strategy number. This need not be the sameas the item type, so long as key values of the correct type can be extracted from it. > I want to understand the overview of GIN extension. Please let us know what the documentation is missing so it can be improved. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > He who writes carelessly confesses thereby at the very outset that he does > not attach much importance to his own thoughts. -- Arthur Schopenhauer
pgsql-hackers by date: