Re: [HACKERS] Proposal: Local indexes for partitioned table - Mailing list pgsql-hackers

From David Rowley
Subject Re: [HACKERS] Proposal: Local indexes for partitioned table
Date
Msg-id CAKJS1f8Dn7swSEhOWwzZzssW7747YB=2Hi+T7uGud40dur69-g@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Proposal: Local indexes for partitioned table  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: [HACKERS] Proposal: Local indexes for partitioned table
List pgsql-hackers
On 9 January 2018 at 09:54, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> I removed the pg_index.indparentidx column that previous versions add,
> and replaced it with pg_inherits rows.  This makes the code a little bit
> bulkier in a couple of places, but nothing terrible.  As a benefit,
> there's no extra index in pg_index now.

Hi Álvaro,

I've made a pass over this patch. It's mostly in pretty good shape,
but I did find a few things to note down.

1. "either partition" -> "either the partition"

+       so the partition index is dropped together with either
partition it indexes,
+       or with the parent index it is attached to.

2. In findDependentObjects(), should the following if test be below
the ReleaseDeletionLock call?

+ if (foundDep->deptype == DEPENDENCY_INTERNAL_AUTO)
+ break;
  ReleaseDeletionLock(object);

3. The following existing comment indicates that we'll be creating a
disk file for the index. Should we update this comment to say that
this is the case only for RELKIND_INDEX?

/*
* create the index relation's relcache entry and physical disk file. (If
* we fail further down, it's the smgr's responsibility to remove the disk
* file again.)
*/

Maybe:

/*
* create the index relation's relcache entry and for non-partitioned
* indexes, a physical disk file. (If we fail further down, it's the
* smgr's responsibility to remove the disk file again.)
*/

4. Extra newline

+ recordDependencyOn(&myself, &referenced, DEPENDENCY_INTERNAL_AUTO);
+ }
+
+

5. Do you think it's out of scope to support expression indexes?

/*
* Expression indexes are currently not considered equal.  Not needed for
* current callers.
*/
if (info1->ii_Expressions != NIL || info2->ii_Expressions != NIL)
return false;

6. pg_inherits.inhseqno is int, not smallint. I understand you copied
this from StoreCatalogInheritance1(), so maybe a fix should be put in
before this patch is?

void
StoreSingleInheritance(Oid relationId, Oid parentOid, int16 seqNumber)

and

values[Anum_pg_inherits_inhseqno - 1] = Int16GetDatum(seqNumber);

7. Is the following a bug fix? If so, should it not be fixed and
backpatched, rather than sneaked in here?

@@ -10119,6 +10195,7 @@ ATExecChangeOwner(Oid relationOid, Oid
newOwnerId, bool recursing, LOCKMODE lock
  * relation, as well as its toast table (if it has one).
  */
  if (tuple_class->relkind == RELKIND_RELATION ||
+ tuple_class->relkind == RELKIND_PARTITIONED_TABLE ||

8. Missing if (attachInfos) pfree(attachInfos);

+ if (idxes)
+ pfree(idxes);
+ if (attachRelIdxs)
+ pfree(attachRelIdxs);

9. The first OR branch of the Assert will always be false, so might as
well get rid of it.

+ if (!has_superclass(idxid))
+ continue;
+
+ Assert(!has_superclass(idxid) ||
+    (IndexGetRelation(get_partition_parent(idxid), false) ==
+    RelationGetRelid(rel)));

10. lock -> locks:

/* we already hold lock on both tables, so this is safe: */

11. "already the right" -> "already in the right"

/* Silently do nothing if already the right state */

12. It seems to be RangeVarGetRelidExtended() that locks the partition
index, not the callback.

/* no deadlock risk: our callback above already acquired the lock */

13. We seem to only be holding an AccessShareLock on the partitioned
table. What happens if someone concurrently does ALTER TABLE
partitioned_table DETACH our_partition;?

parentTbl = relation_open(parentIdx->rd_index->indrelid, AccessShareLock);

and;
/* Make sure it indexes a partition of the other index's table */
partDesc = RelationGetPartitionDesc(parentTbl);
found = false;
for (i = 0; i < partDesc->nparts; i++)
{
if (partDesc->oids[i] == state.partitionOid)
{
found = true;
break;
}
}

Wouldn't you also need an AccessExclusiveLock to prevent another
session trying to ALTER INDEX part_index ATTACH PARTITION leaf_index;
at the same time? The concurrent session could be trying to ATTACH it
to a subpartition and this comment could be attaching to the parent.

14. validatePartitionedIndex does not verify that the index it is
checking is valid. Consider the following:

create table part (a int, b int) partition by list(a);
create table part1 partition of part for values in(1) partition by list(b)
create table part1 partition of part for values in(1) partition by list(b);
create table part2 partition of part1 for values in (1);
create index on only part1 (a);
create index on only part (a);
alter index part_a_idx attach partition part1_a_idx;
\d part
                Table "public.part"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 a      | integer |           |          |
 b      | integer |           |          |
Partition key: LIST (a)
Indexes:
    "part_a_idx" btree (a) <--- should be INVALID
Number of partitions: 1 (Use \d+ to list them.)

\d part1
               Table "public.part1"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 a      | integer |           |          |
 b      | integer |           |          |
Partition of: part FOR VALUES IN (1)
Partition key: LIST (b)
Indexes:
    "part1_a_idx" btree (a) INVALID
Number of partitions: 1 (Use \d+ to list them.)

But that probably means you also need more code to search up to parent
partitions and try and validate any invalid indexes once a
sub-partition index is made valid.

As, if the part_a_idx had correctly been marked as INVALID, and then we do:

CREATE INDEX part2_a_idx ON part2 (a);
ALTER INDEX part1_a_idx ATTACH PARTITION part2_a_idx;

This would validate part1_a_idx, but we'd also then need to attempt
validation on part_a_idx. (I'm still reading, so perhaps you've
already thought of that.)

15. Can you explain why you do the following for CREATE INDEX, but not
for CREATE INDEX IF NOT EXISTS?

@@ -7338,6 +7363,7 @@ IndexStmt: CREATE opt_unique INDEX
opt_concurrently opt_index_name
  n->concurrent = $4;
  n->idxname = $5;
  n->relation = $7;
+ n->relationId = InvalidOid;

It's maybe not required anyway since makeNode() will memset zero the
memory, but I think it's best to do it unless I've misunderstood what
it's for.

16. It's not exactly a problem, but I had a quick look and I don't see
any other uses of list_free() checking for a non-NIL list first:

+ if (inheritors)
+ list_free(inheritors);

Probably might as well just remove the if test.

17. Just a stylistic thing; I think it would be neater to set isindex
using the existing switch(). It would just need some case shuffling.

@@ -439,6 +440,7 @@ RelationParseRelOptions(Relation relation, HeapTuple tuple)
  case RELKIND_RELATION:
  case RELKIND_TOASTVALUE:
  case RELKIND_INDEX:
+ case RELKIND_PARTITIONED_INDEX:
  case RELKIND_VIEW:
  case RELKIND_MATVIEW:
  case RELKIND_PARTITIONED_TABLE:
@@ -452,10 +454,12 @@ RelationParseRelOptions(Relation relation,
HeapTuple tuple)
  * we might not have any other for pg_class yet (consider executing this
  * code for pg_class itself)
  */
+ isindex = relation->rd_rel->relkind == RELKIND_INDEX ||
+ relation->rd_rel->relkind == RELKIND_PARTITIONED_INDEX;

Or maybe it's even better to set relation->rd_amroutine->amoptions
into a local variable and use it unconditionally in the call to
extractRelOptions().

18. At most it can't do any harm, but is the following still needed? I
originally thought this would have been for pg_index changes. What's
changed?

-#define CATALOG_VERSION_NO 201712251
+#define CATALOG_VERSION_NO 201712291

I admit to not quite being as thorough in my review with the pg_dump code.

The tests seem fine, but I'd like to see a test for #14 once that's fixed.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: BUG #14941: Vacuum crashes
Next
From: Masahiko Sawada
Date:
Subject: Re: [HACKERS] Replication status in logical replication