Thread: Partitioned tables and covering indexes
Hi, Trying covering indexes on partitioned tables i get this error """ postgres=# create index on t1_part (i) include (t); ERROR: cache lookup failed for opclass 0 """ To reproduce: create table t1_part (i int, t text) partition by hash (i); create table t1_part_0 partition of t1_part for values with (modulus 2, remainder 0); create table t1_part_1 partition of t1_part for values with (modulus 2, remainder 1); insert into t1_part values (1, repeat('abcdefquerty', 20)); create index on t1_part (i) include (t); -- Jaime Casanova www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018/04/10 16:07, Jaime Casanova wrote: > Hi, > > Trying covering indexes on partitioned tables i get this error > """ > postgres=# create index on t1_part (i) include (t); > ERROR: cache lookup failed for opclass 0 > """ > > To reproduce: > > create table t1_part (i int, t text) partition by hash (i); > create table t1_part_0 partition of t1_part for values with (modulus > 2, remainder 0); > create table t1_part_1 partition of t1_part for values with (modulus > 2, remainder 1); > insert into t1_part values (1, repeat('abcdefquerty', 20)); > > create index on t1_part (i) include (t); It seems that the bug is caused due to the original IndexStmt that DefineIndex receives being overwritten when processing the INCLUDE columns. Also, the original copy of it should be used when recursing for defining the index in partitions. Does the attached fix look correct? Haven't checked the fix with ATTACH PARTITION though. Maybe add this to open items list? Thanks, Amit
Attachment
On Tue, Apr 10, 2018 at 12:47 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp > wrote:
On 2018/04/10 16:07, Jaime Casanova wrote:
> Hi,
>
> Trying covering indexes on partitioned tables i get this error
> """
> postgres=# create index on t1_part (i) include (t);
> ERROR: cache lookup failed for opclass 0
> """
>
> To reproduce:
>
> create table t1_part (i int, t text) partition by hash (i);
> create table t1_part_0 partition of t1_part for values with (modulus
> 2, remainder 0);
> create table t1_part_1 partition of t1_part for values with (modulus
> 2, remainder 1);
> insert into t1_part values (1, repeat('abcdefquerty', 20));
>
> create index on t1_part (i) include (t);
It seems that the bug is caused due to the original IndexStmt that
DefineIndex receives being overwritten when processing the INCLUDE
columns. Also, the original copy of it should be used when recursing for
defining the index in partitions.
Does the attached fix look correct? Haven't checked the fix with ATTACH
PARTITION though.
Attached patch seems to fix the problem. However, I would rather get
rid of modifying stmt->indexParams. That seems to be more logical
for me. Also, it would be good to check some covering indexes on
partitioned tables. See the attached patch.
------
Alexander Korotkov
Postgres Professional: http://www. postgrespro.com
The Russian Postgres Company
Alexander Korotkov
Postgres Professional: http://www.
The Russian Postgres Company
Attachment
> Does the attached fix look correct? Haven't checked the fix with ATTACH > PARTITION though. > > > Attached patch seems to fix the problem. However, I would rather get > rid of modifying stmt->indexParams. That seems to be more logical > for me. Also, it would be good to check some covering indexes on > partitioned tables. See the attached patch. Seems right way, do not modify incoming object and do not copy rather large and deep nested structure as suggested by Amit. But it will be better to have a ATTACH PARTITION test too. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On 10 April 2018 at 10:36, Teodor Sigaev <teodor@sigaev.ru> wrote: >> Does the attached fix look correct? Haven't checked the fix with >> ATTACH >> PARTITION though. >> >> >> Attached patch seems to fix the problem. However, I would rather get >> rid of modifying stmt->indexParams. That seems to be more logical >> for me. Also, it would be good to check some covering indexes on >> partitioned tables. See the attached patch. > > Seems right way, do not modify incoming object and do not copy rather large > and deep nested structure as suggested by Amit. > > But it will be better to have a ATTACH PARTITION test too. > the patch worked for me, i also tried some combinations using ATTACH PARTITION and found no problems -- Jaime Casanova www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi. On 2018/04/11 0:36, Teodor Sigaev wrote: >> Does the attached fix look correct? Haven't checked the fix with >> ATTACH >> PARTITION though. >> >> >> Attached patch seems to fix the problem. However, I would rather get >> rid of modifying stmt->indexParams. That seems to be more logical >> for me. Also, it would be good to check some covering indexes on >> partitioned tables. See the attached patch. > > Seems right way, do not modify incoming object and do not copy rather > large and deep nested structure as suggested by Amit. Yeah, Alexander's suggested way of using a separate variable for indexParams is better. > But it will be better to have a ATTACH PARTITION test too. I have added tests. Actually, instead of modifying existing tests, I think it might be better to have a separate section at the end of indexing.sql to test covering indexes feature for partitioned tables. Attached find updated patch. Thanks, Amit
Attachment
Thank you, pushed Amit Langote wrote: > Hi. > > On 2018/04/11 0:36, Teodor Sigaev wrote: >>> Does the attached fix look correct? Haven't checked the fix with >>> ATTACH >>> PARTITION though. >>> >>> >>> Attached patch seems to fix the problem. However, I would rather get >>> rid of modifying stmt->indexParams. That seems to be more logical >>> for me. Also, it would be good to check some covering indexes on >>> partitioned tables. See the attached patch. >> >> Seems right way, do not modify incoming object and do not copy rather >> large and deep nested structure as suggested by Amit. > > Yeah, Alexander's suggested way of using a separate variable for > indexParams is better. > >> But it will be better to have a ATTACH PARTITION test too. > > I have added tests. Actually, instead of modifying existing tests, I > think it might be better to have a separate section at the end of > indexing.sql to test covering indexes feature for partitioned tables. > > Attached find updated patch. > > Thanks, > Amit > -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Patch makes buildfarm almost red, patch is temporary reverted. Actually, discovered bug is not related to patch except new test faces with it, problem is: CompareIndexInfo() checks rd_opfamily for equality for all attributes, not only for key attribute. Obviously, CompareIndexInfo() needs more work: for (i = 0; i < info1->ii_NumIndexAttrs; i++) { if (maplen < info2->ii_KeyAttrNumbers[i]) Seems, we can go out from ii_KeyAttrNumbers array. Amit Langote wrote: > Hi. > > On 2018/04/11 0:36, Teodor Sigaev wrote: >>> Does the attached fix look correct? Haven't checked the fix with >>> ATTACH >>> PARTITION though. >>> >>> >>> Attached patch seems to fix the problem. However, I would rather get >>> rid of modifying stmt->indexParams. That seems to be more logical >>> for me. Also, it would be good to check some covering indexes on >>> partitioned tables. See the attached patch. >> >> Seems right way, do not modify incoming object and do not copy rather >> large and deep nested structure as suggested by Amit. > > Yeah, Alexander's suggested way of using a separate variable for > indexParams is better. > >> But it will be better to have a ATTACH PARTITION test too. > > I have added tests. Actually, instead of modifying existing tests, I > think it might be better to have a separate section at the end of > indexing.sql to test covering indexes feature for partitioned tables. > > Attached find updated patch. > > Thanks, > Amit > -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
> Actually, discovered bug is not related to patch except new test faces > with it, > problem is: CompareIndexInfo() checks rd_opfamily for equality for all > attributes, not only for key attribute. Patch attached. But it seems to me, field's names of IndexInfo structure are a bit confused now: int ii_NumIndexAttrs; /* total number of columns in index */ int ii_NumIndexKeyAttrs; /* number of key columns in index */ AttrNumber ii_KeyAttrNumbers[INDEX_MAX_KEYS]; ii_KeyAttrNumbers contains all columns, i.e. it contains ii_NumIndexAttrs number of columns, not a ii_NumIndexKeyAttrs number as easy to think. I suggest rename ii_KeyAttrNumbers to ii_AttrNumbers or ii_IndexAttrNumbers. Opinions? > for (i = 0; i < info1->ii_NumIndexAttrs; i++) > { > if (maplen < info2->ii_KeyAttrNumbers[i]) > -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Attachment
Teodor Sigaev wrote: > Patch attached. I wonder why this is a problem in opfamilies but not collations. If we don't compare collations, wouldn't it make more sense to break out of the loop once the number of keys is reached? When this code was written, there was no question as to what length the opfamilies/collations the arrays were; it was obvious that they must be of the length of the index attributes. It's not obvious now. Maybe add a comment about that? > But it seems to me, field's names of > IndexInfo structure are a bit confused now: > int ii_NumIndexAttrs; /* total number of columns in index */ > int ii_NumIndexKeyAttrs; /* number of key columns in index */ > AttrNumber ii_KeyAttrNumbers[INDEX_MAX_KEYS]; > > ii_KeyAttrNumbers contains all columns, i.e. it contains ii_NumIndexAttrs > number of columns, not a ii_NumIndexKeyAttrs number as easy to think. > > I suggest rename ii_KeyAttrNumbers to ii_AttrNumbers or ii_IndexAttrNumbers. > Opinions? Yeah, the current situation seems very odd. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Apr 11, 2018 at 10:47 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Teodor Sigaev wrote:
> Patch attached.
I wonder why this is a problem in opfamilies but not collations.
If we don't compare collations, wouldn't it make more sense to break out
of the loop once the number of keys is reached?
It appears that INCLUDE columns might have collation defined.
For instance, following query is working:
create index t_s1_s2_idx on t (s1) include (s2 collate "en_US.UTF-8");
However, I don't see any point in defining collations here, because
INCLUDE attributes exist solely for index-only scans. So, index just
can return value of INCLUDE attribute "as is", no point to do something
with collation.
So, I propose to disable collations for INCLUDE attributes.
When this code was written, there was no question as to what length the
opfamilies/collations the arrays were; it was obvious that they must be
of the length of the index attributes. It's not obvious now. Maybe add
a comment about that?
In b3b7f789 Tom have resized one array size from total number of
attributes to number of key attributes. And I like idea of applying the
same to all other array: make them sized to total number of attributes
with filling of absent attributes with 0.
> But it seems to me, field's names of
> IndexInfo structure are a bit confused now:
> int ii_NumIndexAttrs; /* total number of columns in index */
> int ii_NumIndexKeyAttrs; /* number of key columns in index */
> AttrNumber ii_KeyAttrNumbers[INDEX_MAX_KEYS];
>
> ii_KeyAttrNumbers contains all columns, i.e. it contains ii_NumIndexAttrs
> number of columns, not a ii_NumIndexKeyAttrs number as easy to think.
>
> I suggest rename ii_KeyAttrNumbers to ii_AttrNumbers or ii_IndexAttrNumbers.
> Opinions?
Yeah, the current situation seems very odd.
+1 for ii_IndexAttrNumbers.
------
Alexander Korotkov
Postgres Professional: http://www. postgrespro.com
The Russian Postgres Company
Alexander Korotkov
Postgres Professional: http://www.
The Russian Postgres Company
On Wed, Apr 11, 2018 at 1:58 PM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote: > It appears that INCLUDE columns might have collation defined. > For instance, following query is working: > > create index t_s1_s2_idx on t (s1) include (s2 collate "en_US.UTF-8"); > > However, I don't see any point in defining collations here, because > INCLUDE attributes exist solely for index-only scans. So, index just > can return value of INCLUDE attribute "as is", no point to do something > with collation. > > So, I propose to disable collations for INCLUDE attributes. Hmm. I'm not sure that that's exactly the right thing to do. We seem to want to have case-insensitive collations in the future. The fact that you can spell out collation name in ON CONFLICT's unique index inference specification suggests this, for example. I think that a collation is theoretically allowed to affect the behavior of equality, even though so far we've never tried to make that work for any collatable datatype. Maybe the right thing to do is to say that any collation will work equally well (as will any opclass). Maybe that's the same thing as what you said, though. > +1 for ii_IndexAttrNumbers. +1 -- Peter Geoghegan
On 4/11/18 17:08, Peter Geoghegan wrote: >> However, I don't see any point in defining collations here, because >> INCLUDE attributes exist solely for index-only scans. So, index just >> can return value of INCLUDE attribute "as is", no point to do something >> with collation. >> >> So, I propose to disable collations for INCLUDE attributes. > Hmm. I'm not sure that that's exactly the right thing to do. We seem > to want to have case-insensitive collations in the future. The fact > that you can spell out collation name in ON CONFLICT's unique index > inference specification suggests this, for example. I think that a > collation is theoretically allowed to affect the behavior of equality, > even though so far we've never tried to make that work for any > collatable datatype. But in this case it doesn't even do equality comparison, it just returns the value. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Apr 11, 2018 at 2:29 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > But in this case it doesn't even do equality comparison, it just returns > the value. That's the idea that I tried to express. The point is that we need to tell the user that there is no need to worry about it, rather than that they're wrong to ask about it. Though we should probably actually just throw an error. -- Peter Geoghegan
On 4/11/18 17:38, Peter Geoghegan wrote: > On Wed, Apr 11, 2018 at 2:29 PM, Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: >> But in this case it doesn't even do equality comparison, it just returns >> the value. > > That's the idea that I tried to express. The point is that we need to > tell the user that there is no need to worry about it, rather than > that they're wrong to ask about it. Though we should probably actually > just throw an error. Or maybe it should be the collation of the underlying table columns. Otherwise the collation returned by an index-only scan would be different from a table scan, no? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Geoghegan wrote: > On Wed, Apr 11, 2018 at 2:29 PM, Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: >> But in this case it doesn't even do equality comparison, it just returns >> the value. > > That's the idea that I tried to express. The point is that we need to > tell the user that there is no need to worry about it, rather than > that they're wrong to ask about it. Though we should probably actually > just throw an error. Any operation over including columns is a deal for filter, not an index, so collation will not be used somewhere inside index. 2Alexander: ComputeIndexAttrs() may use whole vectors (for including columns too) just to simplify coding, in other places, seems, better to have exact size of vectors. Using full-sized vectors could mask a error, for exmaple, with setting opfamily to InvalidOid for key column. But if you insist on that, I'll modify attached patch to use full-sized vectors anywhere. In attached path I did: 1) fix a bug in CompareIndexInfo() which checks opfamily for including column 2) correct size for all vectors 3) Add assertion in various places to be sure that we don't try to use including column as key column 4) per Peter Geoghegan idea add a error message if somebody adds options to include column instead silently ignore it. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Attachment
>> That's the idea that I tried to express. The point is that we need to >> tell the user that there is no need to worry about it, rather than >> that they're wrong to ask about it. Though we should probably actually >> just throw an error. > > Or maybe it should be the collation of the underlying table columns. > Otherwise the collation returned by an index-only scan would be > different from a table scan, no? +1, dangerous -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On Thu, Apr 12, 2018 at 1:14 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:
+1, dangerousThat's the idea that I tried to express. The point is that we need to
tell the user that there is no need to worry about it, rather than
that they're wrong to ask about it. Though we should probably actually
just throw an error.
Or maybe it should be the collation of the underlying table columns.
Otherwise the collation returned by an index-only scan would be
different from a table scan, no?
I'm OK with collation of included columns to be the same as collation
of underlying table columns. But I still think we should throw an error
when user is trying to specify his own collation of included columns.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Thu, Apr 12, 2018 at 1:14 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:
Peter Geoghegan wrote:On Wed, Apr 11, 2018 at 2:29 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote: But in this case it doesn't even do equality comparison, it just returns
the value.
That's the idea that I tried to express. The point is that we need to
tell the user that there is no need to worry about it, rather than
that they're wrong to ask about it. Though we should probably actually
just throw an error.
Any operation over including columns is a deal for filter, not an index, so collation will not be used somewhere inside index.
2Alexander: ComputeIndexAttrs() may use whole vectors (for including columns too) just to simplify coding, in other places, seems, better to have exact size of vectors. Using full-sized vectors could mask a error, for exmaple, with setting opfamily to InvalidOid for key column. But if you insist on that, I'll modify attached patch to use full-sized vectors anywhere.
In attached path I did:
1) fix a bug in CompareIndexInfo() which checks opfamily for including column
2) correct size for all vectors
3) Add assertion in various places to be sure that we don't try to use including column as key column
4) per Peter Geoghegan idea add a error message if somebody adds options to include column instead silently ignore it.
I don't insist on using full-sized vectors everywhere. Attached patch looks OK for me. I haven't test it though.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Thanks to everyone, this part is pushed. I will waiting a bit before pushing topic-stater patch to have a time to look on buildfarm. Alvaro, could you add a comment to CompareIndexInfo() to clarify why it doesn't compare indoptions (like DESC/ASC etc)? It doesn't matter for uniqueness of index but changes an order and it's not clear at least for me why we don't pay attention for that. > In attached path I did: > 1) fix a bug in CompareIndexInfo() which checks opfamily for including column > 2) correct size for all vectors > 3) Add assertion in various places to be sure that we don't try to use including > column as key column > 4) per Peter Geoghegan idea add a error message if somebody adds options to > include column instead silently ignore it. > > -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
pushed. Hope, second try will be successful. Teodor Sigaev wrote: > Thank you, pushed > > Amit Langote wrote: >> Hi. >> >> On 2018/04/11 0:36, Teodor Sigaev wrote: >>>> Does the attached fix look correct? Haven't checked the fix with >>>> ATTACH >>>> PARTITION though. >>>> >>>> >>>> Attached patch seems to fix the problem. However, I would rather get >>>> rid of modifying stmt->indexParams. That seems to be more logical >>>> for me. Also, it would be good to check some covering indexes on >>>> partitioned tables. See the attached patch. >>> >>> Seems right way, do not modify incoming object and do not copy rather >>> large and deep nested structure as suggested by Amit. >> >> Yeah, Alexander's suggested way of using a separate variable for >> indexParams is better. >> >>> But it will be better to have a ATTACH PARTITION test too. >> >> I have added tests. Actually, instead of modifying existing tests, I >> think it might be better to have a separate section at the end of >> indexing.sql to test covering indexes feature for partitioned tables. >> >> Attached find updated patch. >> >> Thanks, >> Amit >> > -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On Thu, Apr 12, 2018 at 6:14 AM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote: > I'm OK with collation of included columns to be the same as collation > of underlying table columns. But I still think we should throw an error > when user is trying to specify his own collation of included columns. I agree. The collation of a table column is just setting a default for how it gets interpreted in queries, but the collation of an index column affects the ordering of the index. For INCLUDE columns, the latter isn't relevant, so the value has no meaning. Letting people set a meaningless value sometimes gets us into trouble (see also the nearby thread on TABLESPACE settings on partitioned tables). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company