Thread: Bug about drop index concurrently

Bug about drop index concurrently

From
"李杰(慎追)"
Date:

Hi hackers,

In recently, I discovered a postgres bug, and I hope I can ask you for the best solution.
The problem is as follows:

postgres=# explain analyze select * from xxx where a=500;

ERROR:  could not open relation with OID 25989

The structure of my table is as follows:

postgres=# \d xxx

                Table "public.xxx"

 Column |  Type   | Collation | Nullable | Default 

--------+---------+-----------+----------+---------

 a      | integer |           |          | 

 b      | text    |           |          | 


postgres=# select count(*) from xxx;

 count  

--------

 800000

(1 row)


postgres=# select * from xxx limit 3;


 a |                b                 

---+----------------------------------

 1 | 203c51477570aa517cfa317155dcc52c

 2 | b58da31baa5c78fee4642fd398bd5909

 3 | c7c475bf0a3ca2fc2afc4812a4d44c58


I opened the log file and saw that the index of table xxx was deleted,

postgres=# drop index CONCURRENTLY idx_xxx ;

DROP INDEX


In order to reproduce this bug, I created and deleted the index again and again on the master.
What is hard to understand is that this bug cannot be repeated 100%.
I wrote a script that loops over the master and runs the following two sentences.

postgres=# create index idx_xxx on xxx (a);

postgres=# drop index CONCURRENTLY idx_xxx ;

postgres=# create index idx_xxx on xxx (a);

postgres=# drop index CONCURRENTLY idx_xxx ;

...

...

...

At the same time, I started two clients in the standby, 

respectively execute the following sql on the table xxx:


postgres=# explain analyze select * from xxx where a=500;

postgres=# \watch 0.1


After a few minutes, the bug will appear.


I finally confirmed my guess, I used an index scan in the standby query,
but deleted the index on the master at the same time.
Curious, I went to read the source code of Postgres. I found that
 regular DROP INDEX commands imposes a AccessExclusiveLock on the table,
 while drop index concurrently commands only used ShareUpdateExclusiveLock.

As we all know, only AccessExclusiveLock and  AccessShareLock ,a select's  lock ,
are mutually exclusive, and AccessShareLock can't block ShareUpdateExclusiveLock.
This is very weird and not desirable.

This is of course, developers must have thought of this, so we can see in the source 
code, before the drop index concurrently, will wait for all transactions using this
 index to wait for detection.

 But this only exists on the master, my query is executed on the standby.
 I use the pg_waldump tool to parse the wal file, and analyze the stantup process,
 I found that there is no similar operation on the standby, so it will appear that 
 when I execute the query on the standby, the index will be deleted by others.

I think this is a bug that will affect the user's experience. we need to fix it.

 I have imagined that the logic that detects the  query transaction and

 waits for it to end is implemented on the standby,but this may increase the

log application delay and the delay is exacerbated that cause the master and backup. 

This is not desirable if the query concurrency is large.


All in all, I expect that you can provide a solution that can use drop index concurrently 

without affecting the master-slave delay.


Sincerely look forward to your reply and thanks.


adger




Re: Bug about drop index concurrently

From
Tomas Vondra
Date:
Hi,

I can trivially reproduce this - it's enough to create a master-standby
setup, and then do this on the master

  CREATE TABLE t (a int, b int);
  INSERT INTO t SELECT i, i FROM generate_series(1,10000) s(i);

and run pgbench with this script

  DROP INDEX CONCURRENTLY IF EXISTS t_a_idx;
  CREATE INDEX t_a_idx ON t(a);

while on the standby there's another pgbench running this script

  EXPLAIN ANALYZE SELECT * FROM t WHERE a = 10000;

and it fails pretty fast for me. With an extra assert(false) added to
src/backend/access/common/relation.c I get a backtrace like this:

    Program terminated with signal SIGABRT, Aborted.
    #0  0x00007c32e458fe35 in raise () from /lib64/libc.so.6
    Missing separate debuginfos, use: dnf debuginfo-install glibc-2.29-22.fc30.x86_64
    (gdb) bt
    #0  0x00007c32e458fe35 in raise () from /lib64/libc.so.6
    #1  0x00007c32e457a895 in abort () from /lib64/libc.so.6
    #2  0x0000000000a58579 in ExceptionalCondition (conditionName=0xacd9bc "!(0)", errorType=0xacd95b
"FailedAssertion",fileName=0xacd950 "relation.c", lineNumber=64) at assert.c:54
 
    #3  0x000000000048d1bd in relation_open (relationId=38216, lockmode=1) at relation.c:64
    #4  0x00000000005082e4 in index_open (relationId=38216, lockmode=1) at indexam.c:130
    #5  0x000000000080ac3f in get_relation_info (root=0x21698b8, relationObjectId=16385, inhparent=false,
rel=0x220ce60)at plancat.c:196
 
    #6  0x00000000008118c6 in build_simple_rel (root=0x21698b8, relid=1, parent=0x0) at relnode.c:292
    #7  0x00000000007d485d in add_base_rels_to_query (root=0x21698b8, jtnode=0x2169478) at initsplan.c:114
    #8  0x00000000007d48a3 in add_base_rels_to_query (root=0x21698b8, jtnode=0x21693e0) at initsplan.c:122
    #9  0x00000000007d8fad in query_planner (root=0x21698b8, qp_callback=0x7ded97 <standard_qp_callback>,
qp_extra=0x7fffa4834f10)at planmain.c:168
 
    #10 0x00000000007dc316 in grouping_planner (root=0x21698b8, inheritance_update=false, tuple_fraction=0) at
planner.c:2048
    #11 0x00000000007da7ca in subquery_planner (glob=0x220d078, parse=0x2168f78, parent_root=0x0, hasRecursion=false,
tuple_fraction=0)at planner.c:1012
 
    #12 0x00000000007d942c in standard_planner (parse=0x2168f78, cursorOptions=256, boundParams=0x0) at planner.c:406
    #13 0x00000000007d91e8 in planner (parse=0x2168f78, cursorOptions=256, boundParams=0x0) at planner.c:275
    #14 0x00000000008e1b0d in pg_plan_query (querytree=0x2168f78, cursorOptions=256, boundParams=0x0) at
postgres.c:878
    #15 0x0000000000658683 in ExplainOneQuery (query=0x2168f78, cursorOptions=256, into=0x0, es=0x220cd90,
queryString=0x21407b8"explain analyze select * from t where a = 10000;", params=0x0, queryEnv=0x0) at explain.c:367
 
    #16 0x0000000000658386 in ExplainQuery (pstate=0x220cc28, stmt=0x2141728, queryString=0x21407b8 "explain analyze
select* from t where a = 10000;", params=0x0, queryEnv=0x0, dest=0x220cb90) at explain.c:255
 
    #17 0x00000000008ea218 in standard_ProcessUtility (pstmt=0x21425c0, queryString=0x21407b8 "explain analyze select *
fromt where a = 10000;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x220cb90,
 
        completionTag=0x7fffa48355c0 "") at utility.c:675
    #18 0x00000000008e9a45 in ProcessUtility (pstmt=0x21425c0, queryString=0x21407b8 "explain analyze select * from t
wherea = 10000;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x220cb90,
 
        completionTag=0x7fffa48355c0 "") at utility.c:360
    #19 0x00000000008e8a0c in PortalRunUtility (portal=0x219c278, pstmt=0x21425c0, isTopLevel=true,
setHoldSnapshot=true,dest=0x220cb90, completionTag=0x7fffa48355c0 "") at pquery.c:1175
 
    #20 0x00000000008e871a in FillPortalStore (portal=0x219c278, isTopLevel=true) at pquery.c:1035
    #21 0x00000000008e8075 in PortalRun (portal=0x219c278, count=9223372036854775807, isTopLevel=true, run_once=true,
dest=0x21efb90,altdest=0x21efb90, completionTag=0x7fffa48357b0 "") at pquery.c:765
 
    #22 0x00000000008e207c in exec_simple_query (query_string=0x21407b8 "explain analyze select * from t where a =
10000;")at postgres.c:1215
 
    #23 0x00000000008e636e in PostgresMain (argc=1, argv=0x216c600, dbname=0x216c4e0 "test", username=0x213c3f8 "user")
atpostgres.c:4236
 
    #24 0x000000000083c71e in BackendRun (port=0x2165850) at postmaster.c:4437
    #25 0x000000000083beef in BackendStartup (port=0x2165850) at postmaster.c:4128
    #26 0x0000000000838313 in ServerLoop () at postmaster.c:1704
    #27 0x0000000000837bbf in PostmasterMain (argc=3, argv=0x213a360) at postmaster.c:1377
    #28 0x0000000000759643 in main (argc=3, argv=0x213a360) at main.c:228

So my guess is the root cause is pretty simple - we close/unlock the
indexes after completing the query, but then EXPLAIN tries to open it
again when producing the explain plan.

I don't have a very good idea how to fix this, as explain has no idea
which indexes will be used by the query, etc.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



回复:Bug about drop index concurrently

From
"李杰(慎追)"
Date:
Thanks for the quick reply.
And sorry I haven’t got back to you sooner .

I have seen this backtrace in the core file, and I also looked at the bug in the standby because there is no lock in the drop index concurrently.

However, when our business will perform a large number of queries in the standby, this problem will occur more frequently. So we are trying to solve this problem, and the solution we are currently dealing with is to ban it.

Of course, we considered applying the method of waiting to detect the query lock on the master to the standby, but worried about affecting the standby application log delay, so we gave up that.

If you have a better solution in the future, please push it to the new version, or email it, thank you very much.

regards.

adger.

------------------------------------------------------------------
发件人:Tomas Vondra <tomas.vondra@2ndquadrant.com>
发送时间:2019年10月19日(星期六) 02:00
收件人:李杰(慎追) <adger.lj@alibaba-inc.com>
抄 送:pgsql-hackers <pgsql-hackers@lists.postgresql.org>
主 题:Re: Bug about drop index concurrently

Hi,

I can trivially reproduce this - it's enough to create a master-standby
setup, and then do this on the master

  CREATE TABLE t (a int, b int);
  INSERT INTO t SELECT i, i FROM generate_series(1,10000) s(i);

and run pgbench with this script

  DROP INDEX CONCURRENTLY IF EXISTS t_a_idx;
  CREATE INDEX t_a_idx ON t(a);

while on the standby there's another pgbench running this script

  EXPLAIN ANALYZE SELECT * FROM t WHERE a = 10000;

and it fails pretty fast for me. With an extra assert(false) added to
src/backend/access/common/relation.c I get a backtrace like this:

    Program terminated with signal SIGABRT, Aborted.
    #0  0x00007c32e458fe35 in raise () from /lib64/libc.so.6
    Missing separate debuginfos, use: dnf debuginfo-install glibc-2.29-22.fc30.x86_64
    (gdb) bt
    #0  0x00007c32e458fe35 in raise () from /lib64/libc.so.6
    #1  0x00007c32e457a895 in abort () from /lib64/libc.so.6
    #2  0x0000000000a58579 in ExceptionalCondition (conditionName=0xacd9bc "!(0)", errorType=0xacd95b "FailedAssertion", fileName=0xacd950 "relation.c", lineNumber=64) at assert.c:54
    #3  0x000000000048d1bd in relation_open (relationId=38216, lockmode=1) at relation.c:64
    #4  0x00000000005082e4 in index_open (relationId=38216, lockmode=1) at indexam.c:130
    #5  0x000000000080ac3f in get_relation_info (root=0x21698b8, relationObjectId=16385, inhparent=false, rel=0x220ce60) at plancat.c:196
    #6  0x00000000008118c6 in build_simple_rel (root=0x21698b8, relid=1, parent=0x0) at relnode.c:292
    #7  0x00000000007d485d in add_base_rels_to_query (root=0x21698b8, jtnode=0x2169478) at initsplan.c:114
    #8  0x00000000007d48a3 in add_base_rels_to_query (root=0x21698b8, jtnode=0x21693e0) at initsplan.c:122
    #9  0x00000000007d8fad in query_planner (root=0x21698b8, qp_callback=0x7ded97 <standard_qp_callback>, qp_extra=0x7fffa4834f10) at planmain.c:168
    #10 0x00000000007dc316 in grouping_planner (root=0x21698b8, inheritance_update=false, tuple_fraction=0) at planner.c:2048
    #11 0x00000000007da7ca in subquery_planner (glob=0x220d078, parse=0x2168f78, parent_root=0x0, hasRecursion=false, tuple_fraction=0) at planner.c:1012
    #12 0x00000000007d942c in standard_planner (parse=0x2168f78, cursorOptions=256, boundParams=0x0) at planner.c:406
    #13 0x00000000007d91e8 in planner (parse=0x2168f78, cursorOptions=256, boundParams=0x0) at planner.c:275
    #14 0x00000000008e1b0d in pg_plan_query (querytree=0x2168f78, cursorOptions=256, boundParams=0x0) at postgres.c:878
    #15 0x0000000000658683 in ExplainOneQuery (query=0x2168f78, cursorOptions=256, into=0x0, es=0x220cd90, queryString=0x21407b8 "explain analyze select * from t where a = 10000;", params=0x0, queryEnv=0x0) at explain.c:367
    #16 0x0000000000658386 in ExplainQuery (pstate=0x220cc28, stmt=0x2141728, queryString=0x21407b8 "explain analyze select * from t where a = 10000;", params=0x0, queryEnv=0x0, dest=0x220cb90) at explain.c:255
    #17 0x00000000008ea218 in standard_ProcessUtility (pstmt=0x21425c0, queryString=0x21407b8 "explain analyze select * from t where a = 10000;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x220cb90,
        completionTag=0x7fffa48355c0 "") at utility.c:675
    #18 0x00000000008e9a45 in ProcessUtility (pstmt=0x21425c0, queryString=0x21407b8 "explain analyze select * from t where a = 10000;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x220cb90,
        completionTag=0x7fffa48355c0 "") at utility.c:360
    #19 0x00000000008e8a0c in PortalRunUtility (portal=0x219c278, pstmt=0x21425c0, isTopLevel=true, setHoldSnapshot=true, dest=0x220cb90, completionTag=0x7fffa48355c0 "") at pquery.c:1175
    #20 0x00000000008e871a in FillPortalStore (portal=0x219c278, isTopLevel=true) at pquery.c:1035
    #21 0x00000000008e8075 in PortalRun (portal=0x219c278, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x21efb90, altdest=0x21efb90, completionTag=0x7fffa48357b0 "") at pquery.c:765
    #22 0x00000000008e207c in exec_simple_query (query_string=0x21407b8 "explain analyze select * from t where a = 10000;") at postgres.c:1215
    #23 0x00000000008e636e in PostgresMain (argc=1, argv=0x216c600, dbname=0x216c4e0 "test", username=0x213c3f8 "user") at postgres.c:4236
    #24 0x000000000083c71e in BackendRun (port=0x2165850) at postmaster.c:4437
    #25 0x000000000083beef in BackendStartup (port=0x2165850) at postmaster.c:4128
    #26 0x0000000000838313 in ServerLoop () at postmaster.c:1704
    #27 0x0000000000837bbf in PostmasterMain (argc=3, argv=0x213a360) at postmaster.c:1377
    #28 0x0000000000759643 in main (argc=3, argv=0x213a360) at main.c:228

So my guess is the root cause is pretty simple - we close/unlock the
indexes after completing the query, but then EXPLAIN tries to open it
again when producing the explain plan.

I don't have a very good idea how to fix this, as explain has no idea
which indexes will be used by the query, etc.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

回复:Bug about drop index concurrently

From
"李杰(慎追)"
Date:
Hi all,

I am sorry to bother you again.

I want to consult again, about the last time I raised a bug about drop index, are you going to deal with it in the future? Is it to ban it or to propose a repair plan, what is your next plan?

Sincerely look forward to your reply and thanks.

adger.

------------------------------------------------------------------
发件人:Tomas Vondra <tomas.vondra@2ndquadrant.com>
发送时间:2019年10月19日(星期六) 02:00
收件人:李杰(慎追) <adger.lj@alibaba-inc.com>
抄 送:pgsql-hackers <pgsql-hackers@lists.postgresql.org>
主 题:Re: Bug about drop index concurrently

Hi,

I can trivially reproduce this - it's enough to create a master-standby
setup, and then do this on the master

  CREATE TABLE t (a int, b int);
  INSERT INTO t SELECT i, i FROM generate_series(1,10000) s(i);

and run pgbench with this script

  DROP INDEX CONCURRENTLY IF EXISTS t_a_idx;
  CREATE INDEX t_a_idx ON t(a);

while on the standby there's another pgbench running this script

  EXPLAIN ANALYZE SELECT * FROM t WHERE a = 10000;

and it fails pretty fast for me. With an extra assert(false) added to
src/backend/access/common/relation.c I get a backtrace like this:

    Program terminated with signal SIGABRT, Aborted.
    #0  0x00007c32e458fe35 in raise () from /lib64/libc.so.6
    Missing separate debuginfos, use: dnf debuginfo-install glibc-2.29-22.fc30.x86_64
    (gdb) bt
    #0  0x00007c32e458fe35 in raise () from /lib64/libc.so.6
    #1  0x00007c32e457a895 in abort () from /lib64/libc.so.6
    #2  0x0000000000a58579 in ExceptionalCondition (conditionName=0xacd9bc "!(0)", errorType=0xacd95b "FailedAssertion", fileName=0xacd950 "relation.c", lineNumber=64) at assert.c:54
    #3  0x000000000048d1bd in relation_open (relationId=38216, lockmode=1) at relation.c:64
    #4  0x00000000005082e4 in index_open (relationId=38216, lockmode=1) at indexam.c:130
    #5  0x000000000080ac3f in get_relation_info (root=0x21698b8, relationObjectId=16385, inhparent=false, rel=0x220ce60) at plancat.c:196
    #6  0x00000000008118c6 in build_simple_rel (root=0x21698b8, relid=1, parent=0x0) at relnode.c:292
    #7  0x00000000007d485d in add_base_rels_to_query (root=0x21698b8, jtnode=0x2169478) at initsplan.c:114
    #8  0x00000000007d48a3 in add_base_rels_to_query (root=0x21698b8, jtnode=0x21693e0) at initsplan.c:122
    #9  0x00000000007d8fad in query_planner (root=0x21698b8, qp_callback=0x7ded97 <standard_qp_callback>, qp_extra=0x7fffa4834f10) at planmain.c:168
    #10 0x00000000007dc316 in grouping_planner (root=0x21698b8, inheritance_update=false, tuple_fraction=0) at planner.c:2048
    #11 0x00000000007da7ca in subquery_planner (glob=0x220d078, parse=0x2168f78, parent_root=0x0, hasRecursion=false, tuple_fraction=0) at planner.c:1012
    #12 0x00000000007d942c in standard_planner (parse=0x2168f78, cursorOptions=256, boundParams=0x0) at planner.c:406
    #13 0x00000000007d91e8 in planner (parse=0x2168f78, cursorOptions=256, boundParams=0x0) at planner.c:275
    #14 0x00000000008e1b0d in pg_plan_query (querytree=0x2168f78, cursorOptions=256, boundParams=0x0) at postgres.c:878
    #15 0x0000000000658683 in ExplainOneQuery (query=0x2168f78, cursorOptions=256, into=0x0, es=0x220cd90, queryString=0x21407b8 "explain analyze select * from t where a = 10000;", params=0x0, queryEnv=0x0) at explain.c:367
    #16 0x0000000000658386 in ExplainQuery (pstate=0x220cc28, stmt=0x2141728, queryString=0x21407b8 "explain analyze select * from t where a = 10000;", params=0x0, queryEnv=0x0, dest=0x220cb90) at explain.c:255
    #17 0x00000000008ea218 in standard_ProcessUtility (pstmt=0x21425c0, queryString=0x21407b8 "explain analyze select * from t where a = 10000;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x220cb90,
        completionTag=0x7fffa48355c0 "") at utility.c:675
    #18 0x00000000008e9a45 in ProcessUtility (pstmt=0x21425c0, queryString=0x21407b8 "explain analyze select * from t where a = 10000;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x220cb90,
        completionTag=0x7fffa48355c0 "") at utility.c:360
    #19 0x00000000008e8a0c in PortalRunUtility (portal=0x219c278, pstmt=0x21425c0, isTopLevel=true, setHoldSnapshot=true, dest=0x220cb90, completionTag=0x7fffa48355c0 "") at pquery.c:1175
    #20 0x00000000008e871a in FillPortalStore (portal=0x219c278, isTopLevel=true) at pquery.c:1035
    #21 0x00000000008e8075 in PortalRun (portal=0x219c278, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x21efb90, altdest=0x21efb90, completionTag=0x7fffa48357b0 "") at pquery.c:765
    #22 0x00000000008e207c in exec_simple_query (query_string=0x21407b8 "explain analyze select * from t where a = 10000;") at postgres.c:1215
    #23 0x00000000008e636e in PostgresMain (argc=1, argv=0x216c600, dbname=0x216c4e0 "test", username=0x213c3f8 "user") at postgres.c:4236
    #24 0x000000000083c71e in BackendRun (port=0x2165850) at postmaster.c:4437
    #25 0x000000000083beef in BackendStartup (port=0x2165850) at postmaster.c:4128
    #26 0x0000000000838313 in ServerLoop () at postmaster.c:1704
    #27 0x0000000000837bbf in PostmasterMain (argc=3, argv=0x213a360) at postmaster.c:1377
    #28 0x0000000000759643 in main (argc=3, argv=0x213a360) at main.c:228

So my guess is the root cause is pretty simple - we close/unlock the
indexes after completing the query, but then EXPLAIN tries to open it
again when producing the explain plan.

I don't have a very good idea how to fix this, as explain has no idea
which indexes will be used by the query, etc.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: 回复:Bug about drop index concurrently

From
Tomas Vondra
Date:
On Mon, Oct 21, 2019 at 10:36:04AM +0800, 李杰(慎追) wrote:
>Thanks for the quick reply.  And sorry I haven’t got back to you sooner
>.
>
>I have seen this backtrace in the core file, and I also looked at the
>bug in the standby because there is no lock in the drop index
>concurrently.
>

I'm a bit confused. You shouldn't see any crashes and/or core files in
this scenario, for two reasons. Firstly, I assume you're running a
regular build without asserts. Secondly, I had to add an extra assert to
trigger the failure. So what core are you talking about?

Also, it's not clear to me what do you mean by "bug in the standby" or
no lock in the drop index concurrently. Can you explain?

>However, when our business will perform a large number of queries in
>the standby, this problem will occur more frequently. So we are trying
>to solve this problem, and the solution we are currently dealing with
>is to ban it.
>

Hmm, so you observe the issue with regular queries, not just EXPLAIN
ANALYZE?

>Of course, we considered applying the method of waiting to detect the
>query lock on the master to the standby, but worried about affecting
>the standby application log delay, so we gave up that.
>

I don't understand? What method?

>If you have a better solution in the future, please push it to the new
>version, or email it, thank you very much.
>


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: 回复:Bug about drop index concurrently

From
Tomas Vondra
Date:
On Tue, Oct 22, 2019 at 03:24:41PM +0800, 李杰(慎追) wrote:
>Hi all,
>
>I am sorry to bother you again.
>
>I want to consult again, about the last time I raised a bug about drop
>index, are you going to deal with it in the future? Is it to ban it or
>to propose a repair plan, what is your next plan?
>

It does seem like a bug, i.e. something we need to fix.

Not sure what/how we could ban?


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Bug about drop index concurrently

From
Tomas Vondra
Date:
On Fri, Oct 18, 2019 at 05:00:54PM +0200, Tomas Vondra wrote:
>Hi,
>
>I can trivially reproduce this - it's enough to create a master-standby
>setup, and then do this on the master
>
> CREATE TABLE t (a int, b int);
> INSERT INTO t SELECT i, i FROM generate_series(1,10000) s(i);
>
>and run pgbench with this script
>
> DROP INDEX CONCURRENTLY IF EXISTS t_a_idx;
> CREATE INDEX t_a_idx ON t(a);
>
>while on the standby there's another pgbench running this script
>
> EXPLAIN ANALYZE SELECT * FROM t WHERE a = 10000;
>
>and it fails pretty fast for me. With an extra assert(false) added to
>src/backend/access/common/relation.c I get a backtrace like this:
>
>   Program terminated with signal SIGABRT, Aborted.
>   #0  0x00007c32e458fe35 in raise () from /lib64/libc.so.6
>   Missing separate debuginfos, use: dnf debuginfo-install glibc-2.29-22.fc30.x86_64
>   (gdb) bt
>   #0  0x00007c32e458fe35 in raise () from /lib64/libc.so.6
>   #1  0x00007c32e457a895 in abort () from /lib64/libc.so.6
>   #2  0x0000000000a58579 in ExceptionalCondition (conditionName=0xacd9bc "!(0)", errorType=0xacd95b
"FailedAssertion",fileName=0xacd950 "relation.c", lineNumber=64) at assert.c:54
 
>   #3  0x000000000048d1bd in relation_open (relationId=38216, lockmode=1) at relation.c:64
>   #4  0x00000000005082e4 in index_open (relationId=38216, lockmode=1) at indexam.c:130
>   #5  0x000000000080ac3f in get_relation_info (root=0x21698b8, relationObjectId=16385, inhparent=false,
rel=0x220ce60)at plancat.c:196
 
>   #6  0x00000000008118c6 in build_simple_rel (root=0x21698b8, relid=1, parent=0x0) at relnode.c:292
>   #7  0x00000000007d485d in add_base_rels_to_query (root=0x21698b8, jtnode=0x2169478) at initsplan.c:114
>   #8  0x00000000007d48a3 in add_base_rels_to_query (root=0x21698b8, jtnode=0x21693e0) at initsplan.c:122
>   #9  0x00000000007d8fad in query_planner (root=0x21698b8, qp_callback=0x7ded97 <standard_qp_callback>,
qp_extra=0x7fffa4834f10)at planmain.c:168
 
>   #10 0x00000000007dc316 in grouping_planner (root=0x21698b8, inheritance_update=false, tuple_fraction=0) at
planner.c:2048
>   #11 0x00000000007da7ca in subquery_planner (glob=0x220d078, parse=0x2168f78, parent_root=0x0, hasRecursion=false,
tuple_fraction=0)at planner.c:1012
 
>   #12 0x00000000007d942c in standard_planner (parse=0x2168f78, cursorOptions=256, boundParams=0x0) at planner.c:406
>   #13 0x00000000007d91e8 in planner (parse=0x2168f78, cursorOptions=256, boundParams=0x0) at planner.c:275
>   #14 0x00000000008e1b0d in pg_plan_query (querytree=0x2168f78, cursorOptions=256, boundParams=0x0) at
postgres.c:878
>   #15 0x0000000000658683 in ExplainOneQuery (query=0x2168f78, cursorOptions=256, into=0x0, es=0x220cd90,
queryString=0x21407b8"explain analyze select * from t where a = 10000;", params=0x0, queryEnv=0x0) at explain.c:367
 
>   #16 0x0000000000658386 in ExplainQuery (pstate=0x220cc28, stmt=0x2141728, queryString=0x21407b8 "explain analyze
select* from t where a = 10000;", params=0x0, queryEnv=0x0, dest=0x220cb90) at explain.c:255
 
>   #17 0x00000000008ea218 in standard_ProcessUtility (pstmt=0x21425c0, queryString=0x21407b8 "explain analyze select *
fromt where a = 10000;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x220cb90,
 
>       completionTag=0x7fffa48355c0 "") at utility.c:675
>   #18 0x00000000008e9a45 in ProcessUtility (pstmt=0x21425c0, queryString=0x21407b8 "explain analyze select * from t
wherea = 10000;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x220cb90,
 
>       completionTag=0x7fffa48355c0 "") at utility.c:360
>   #19 0x00000000008e8a0c in PortalRunUtility (portal=0x219c278, pstmt=0x21425c0, isTopLevel=true,
setHoldSnapshot=true,dest=0x220cb90, completionTag=0x7fffa48355c0 "") at pquery.c:1175
 
>   #20 0x00000000008e871a in FillPortalStore (portal=0x219c278, isTopLevel=true) at pquery.c:1035
>   #21 0x00000000008e8075 in PortalRun (portal=0x219c278, count=9223372036854775807, isTopLevel=true, run_once=true,
dest=0x21efb90,altdest=0x21efb90, completionTag=0x7fffa48357b0 "") at pquery.c:765
 
>   #22 0x00000000008e207c in exec_simple_query (query_string=0x21407b8 "explain analyze select * from t where a =
10000;")at postgres.c:1215
 
>   #23 0x00000000008e636e in PostgresMain (argc=1, argv=0x216c600, dbname=0x216c4e0 "test", username=0x213c3f8 "user")
atpostgres.c:4236
 
>   #24 0x000000000083c71e in BackendRun (port=0x2165850) at postmaster.c:4437
>   #25 0x000000000083beef in BackendStartup (port=0x2165850) at postmaster.c:4128
>   #26 0x0000000000838313 in ServerLoop () at postmaster.c:1704
>   #27 0x0000000000837bbf in PostmasterMain (argc=3, argv=0x213a360) at postmaster.c:1377
>   #28 0x0000000000759643 in main (argc=3, argv=0x213a360) at main.c:228
>
>So my guess is the root cause is pretty simple - we close/unlock the
>indexes after completing the query, but then EXPLAIN tries to open it
>again when producing the explain plan.
>

D'oh! I've just looked at this issue more carefully, and I realize this
suggestion (that it's due to releasing a lock too early) is just bogus.
Sorry about the confusion :-(

In fact, I think you've been 100% correct in your analysis regarding the
root cause, which is that we don't realize there's a query on a standby,
using the index that we're dropping (and REINDEX CONCURRENTLY seems to
have exactly the same issue).

I've reproduced this on all releases back to 10, I suppose it affects
all releases with DROP INDEX CONCURRENTLY (I haven't tried, but I don't
see why it wouldn't).

I still think it's a bug, and we'll need to fix it somehow. Not sure
how, though.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



回复:回复:Bug about drop index concurrently

From
"李杰(慎追)"
Date:
>
>I'm a bit confused. You shouldn't see any crashes and/or core files in
>this scenario, for two reasons. Firstly, I assume you're running a
>regular build without asserts. Secondly, I had to add an extra assert to
>trigger the failure. So what core are you talking about?
>
Sorry, I should explain it more clearly.
I saw the core file because I modified the postgres source code and added Assert to it.
>
>Also, it's not clear to me what do you mean by "bug in the standby" or
>no lock in the drop index concurrently. Can you explain?
>
"bug in the standby" means that we built a master-slave instance, when we executed a large number of queries on the standby, we executed 'drop index concurrently' on the master so that get ‘error’ in the standby. Although it is not 100%, it will appear.
no lock in the drop index concurrently :::  I think this is because there are not enough advanced locks when executing ‘ drop index concurrently’.

>Hmm, so you observe the issue with regular queries, not just EXPLAIN
>ANALYZE?

yeah, we have seen this error frequently.

>>Of course, we considered applying the method of waiting to detect the
>>query lock on the master to the standby, but worried about affecting
>>the standby application log delay, so we gave up that.
>>
>I don't understand? What method?
>

I analyzed this problem, I used to find out the cause of this problem, I also executed 'drop index concurrently' and ‘explain select * from xxx’  on the master, but the bug did not appear as expected.
So I went to analyze the source code. I found that there is such a mechanism on the master that when the 'drop index concurrently' is execute, it wait will every transaction that saw the old index state has finished. source code is as follows follow as:

WaitForLockers(heaplocktag, AccessExclusiveLock);

Therefore, I think that if this method is also available in standby, then the error will not appear. but I worried about affecting the standby application log delay, so we gave up that.





------------------------------------------------------------------
发件人:Tomas Vondra <tomas.vondra@2ndquadrant.com>
发送时间:2019年10月23日(星期三) 01:47
收件人:李杰(慎追) <adger.lj@alibaba-inc.com>
抄 送:pgsql-hackers <pgsql-hackers@lists.postgresql.org>
主 题:Re: 回复:Bug about drop index concurrently

On Mon, Oct 21, 2019 at 10:36:04AM +0800, 李杰(慎追) wrote:
>Thanks for the quick reply.  And sorry I haven’t got back to you sooner
>.
>
>I have seen this backtrace in the core file, and I also looked at the
>bug in the standby because there is no lock in the drop index
>concurrently.
>

I'm a bit confused. You shouldn't see any crashes and/or core files in
this scenario, for two reasons. Firstly, I assume you're running a
regular build without asserts. Secondly, I had to add an extra assert to
trigger the failure. So what core are you talking about?

Also, it's not clear to me what do you mean by "bug in the standby" or
no lock in the drop index concurrently. Can you explain?

>However, when our business will perform a large number of queries in
>the standby, this problem will occur more frequently. So we are trying
>to solve this problem, and the solution we are currently dealing with
>is to ban it.
>

Hmm, so you observe the issue with regular queries, not just EXPLAIN
ANALYZE?

>Of course, we considered applying the method of waiting to detect the
>query lock on the master to the standby, but worried about affecting
>the standby application log delay, so we gave up that.
>

I don't understand? What method?

>If you have a better solution in the future, please push it to the new
>version, or email it, thank you very much.
>


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

回复:Bug about drop index concurrently

From
"李杰(慎追)"
Date:
Ah ha ha , this is great, I am very ashamed of my English expression, did not let you clearly understand my mail.

now, I am very glad that you can understand this. I sincerely hope that I can help you. I am also a postgres fan, a freshly graduated student.

We have all confirmed that this bug will only appear on the standby and will not appear on the master.But it does affect the use of standby.

For this bug, I proposed two options, one is to disable this feature (drop index concurrently), the other is to wait for the standby select like on the master, but it may affect the application delay of the log. Because this bug appears on the standby, I think both methods have advantages and disadvantages. So I hope that you can discuss it so much that it will help you.

Sincerely 

adger


------------------------------------------------------------------
发件人:Tomas Vondra <tomas.vondra@2ndquadrant.com>
发送时间:2019年10月23日(星期三) 06:49
收件人:李杰(慎追) <adger.lj@alibaba-inc.com>
抄 送:pgsql-hackers <pgsql-hackers@lists.postgresql.org>
主 题:Re: Bug about drop index concurrently

On Fri, Oct 18, 2019 at 05:00:54PM +0200, Tomas Vondra wrote:
>Hi,
>
>I can trivially reproduce this - it's enough to create a master-standby
>setup, and then do this on the master
>
> CREATE TABLE t (a int, b int);
> INSERT INTO t SELECT i, i FROM generate_series(1,10000) s(i);
>
>and run pgbench with this script
>
> DROP INDEX CONCURRENTLY IF EXISTS t_a_idx;
> CREATE INDEX t_a_idx ON t(a);
>
>while on the standby there's another pgbench running this script
>
> EXPLAIN ANALYZE SELECT * FROM t WHERE a = 10000;
>
>and it fails pretty fast for me. With an extra assert(false) added to
>src/backend/access/common/relation.c I get a backtrace like this:
>
>   Program terminated with signal SIGABRT, Aborted.
>   #0  0x00007c32e458fe35 in raise () from /lib64/libc.so.6
>   Missing separate debuginfos, use: dnf debuginfo-install glibc-2.29-22.fc30.x86_64
>   (gdb) bt
>   #0  0x00007c32e458fe35 in raise () from /lib64/libc.so.6
>   #1  0x00007c32e457a895 in abort () from /lib64/libc.so.6
>   #2  0x0000000000a58579 in ExceptionalCondition (conditionName=0xacd9bc "!(0)", errorType=0xacd95b "FailedAssertion", fileName=0xacd950 "relation.c", lineNumber=64) at assert.c:54
>   #3  0x000000000048d1bd in relation_open (relationId=38216, lockmode=1) at relation.c:64
>   #4  0x00000000005082e4 in index_open (relationId=38216, lockmode=1) at indexam.c:130
>   #5  0x000000000080ac3f in get_relation_info (root=0x21698b8, relationObjectId=16385, inhparent=false, rel=0x220ce60) at plancat.c:196
>   #6  0x00000000008118c6 in build_simple_rel (root=0x21698b8, relid=1, parent=0x0) at relnode.c:292
>   #7  0x00000000007d485d in add_base_rels_to_query (root=0x21698b8, jtnode=0x2169478) at initsplan.c:114
>   #8  0x00000000007d48a3 in add_base_rels_to_query (root=0x21698b8, jtnode=0x21693e0) at initsplan.c:122
>   #9  0x00000000007d8fad in query_planner (root=0x21698b8, qp_callback=0x7ded97 <standard_qp_callback>, qp_extra=0x7fffa4834f10) at planmain.c:168
>   #10 0x00000000007dc316 in grouping_planner (root=0x21698b8, inheritance_update=false, tuple_fraction=0) at planner.c:2048
>   #11 0x00000000007da7ca in subquery_planner (glob=0x220d078, parse=0x2168f78, parent_root=0x0, hasRecursion=false, tuple_fraction=0) at planner.c:1012
>   #12 0x00000000007d942c in standard_planner (parse=0x2168f78, cursorOptions=256, boundParams=0x0) at planner.c:406
>   #13 0x00000000007d91e8 in planner (parse=0x2168f78, cursorOptions=256, boundParams=0x0) at planner.c:275
>   #14 0x00000000008e1b0d in pg_plan_query (querytree=0x2168f78, cursorOptions=256, boundParams=0x0) at postgres.c:878
>   #15 0x0000000000658683 in ExplainOneQuery (query=0x2168f78, cursorOptions=256, into=0x0, es=0x220cd90, queryString=0x21407b8 "explain analyze select * from t where a = 10000;", params=0x0, queryEnv=0x0) at explain.c:367
>   #16 0x0000000000658386 in ExplainQuery (pstate=0x220cc28, stmt=0x2141728, queryString=0x21407b8 "explain analyze select * from t where a = 10000;", params=0x0, queryEnv=0x0, dest=0x220cb90) at explain.c:255
>   #17 0x00000000008ea218 in standard_ProcessUtility (pstmt=0x21425c0, queryString=0x21407b8 "explain analyze select * from t where a = 10000;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x220cb90,
>       completionTag=0x7fffa48355c0 "") at utility.c:675
>   #18 0x00000000008e9a45 in ProcessUtility (pstmt=0x21425c0, queryString=0x21407b8 "explain analyze select * from t where a = 10000;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x220cb90,
>       completionTag=0x7fffa48355c0 "") at utility.c:360
>   #19 0x00000000008e8a0c in PortalRunUtility (portal=0x219c278, pstmt=0x21425c0, isTopLevel=true, setHoldSnapshot=true, dest=0x220cb90, completionTag=0x7fffa48355c0 "") at pquery.c:1175
>   #20 0x00000000008e871a in FillPortalStore (portal=0x219c278, isTopLevel=true) at pquery.c:1035
>   #21 0x00000000008e8075 in PortalRun (portal=0x219c278, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x21efb90, altdest=0x21efb90, completionTag=0x7fffa48357b0 "") at pquery.c:765
>   #22 0x00000000008e207c in exec_simple_query (query_string=0x21407b8 "explain analyze select * from t where a = 10000;") at postgres.c:1215
>   #23 0x00000000008e636e in PostgresMain (argc=1, argv=0x216c600, dbname=0x216c4e0 "test", username=0x213c3f8 "user") at postgres.c:4236
>   #24 0x000000000083c71e in BackendRun (port=0x2165850) at postmaster.c:4437
>   #25 0x000000000083beef in BackendStartup (port=0x2165850) at postmaster.c:4128
>   #26 0x0000000000838313 in ServerLoop () at postmaster.c:1704
>   #27 0x0000000000837bbf in PostmasterMain (argc=3, argv=0x213a360) at postmaster.c:1377
>   #28 0x0000000000759643 in main (argc=3, argv=0x213a360) at main.c:228
>
>So my guess is the root cause is pretty simple - we close/unlock the
>indexes after completing the query, but then EXPLAIN tries to open it
>again when producing the explain plan.
>

D'oh! I've just looked at this issue more carefully, and I realize this
suggestion (that it's due to releasing a lock too early) is just bogus.
Sorry about the confusion :-(

In fact, I think you've been 100% correct in your analysis regarding the
root cause, which is that we don't realize there's a query on a standby,
using the index that we're dropping (and REINDEX CONCURRENTLY seems to
have exactly the same issue).

I've reproduced this on all releases back to 10, I suppose it affects
all releases with DROP INDEX CONCURRENTLY (I haven't tried, but I don't
see why it wouldn't).

I still think it's a bug, and we'll need to fix it somehow. Not sure
how, though.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 

Re: 回复:回复:Bug aboutdrop index concurrently

From
Tomas Vondra
Date:
On Wed, Oct 23, 2019 at 02:38:45PM +0800, 李杰(慎追) wrote:
>>
>>I'm a bit confused. You shouldn't see any crashes and/or core files in
>>this scenario, for two reasons. Firstly, I assume you're running a
>>regular build without asserts. Secondly, I had to add an extra assert
>>to trigger the failure. So what core are you talking about?
>>
>Sorry, I should explain it more clearly.  I saw the core file because I
>modified the postgres source code and added Assert to it.
>>

OK

>>Also, it's not clear to me what do you mean by "bug in the standby" or
>>no lock in the drop index concurrently. Can you explain?
>>
>"bug in the standby" means that we built a master-slave instance, when
>we executed a large number of queries on the standby, we executed 'drop
>index concurrently' on the master so that get ‘error’ in the standby.
>Although it is not 100%, it will appear.  no lock in the drop index
>concurrently :::  I think this is because there are not enough advanced
>locks when executing ‘ drop index concurrently’.
>

OK, thanks for the clarification. Yes, it won't appear every time, it's
likely timing-sensitive (I'll explain in a minute).

>>Hmm, so you observe the issue with regular queries, not just EXPLAIN
>>ANALYZE?
>
>yeah, we have seen this error frequently.
>

That suggests you're doing a lot of 'drop index concurrently', right?

>>>Of course, we considered applying the method of waiting to detect the
>>>query lock on the master to the standby, but worried about affecting
>>>the standby application log delay, so we gave up that.
>>>
>>I don't understand? What method?
>>
>
>I analyzed this problem, I used to find out the cause of this problem,
>I also executed 'drop index concurrently' and ‘explain select * from
>xxx’  on the master, but the bug did not appear as expected.  So I went
>to analyze the source code. I found that there is such a mechanism on
>the master that when the 'drop index concurrently' is execute, it wait
>will every transaction that saw the old index state has finished.
>source code is as follows follow as:
>
>WaitForLockers(heaplocktag, AccessExclusiveLock);
>
>Therefore, I think that if this method is also available in standby,
>then the error will not appear. but I worried about affecting the
>standby application log delay, so we gave up that.
>

Yes, but we can't really do that, I'm afraid.

We certainly can't do that on the master because we simply don't have
the necessary information about locks from the standby, and we really
don't want to have it, because with a busy standby that might be quite a
bit of data (plust the standby would have to wait for the master to
confirm each lock acquisition, I think which seems pretty terrible).

On the standby, we don't really have an idea that the there's a drop
index running - we only get information about AE locks, and a bunch of
catalog updates. I don't think we have a way to determine this is a drop
index in concurrent mode :-(

More preciresly, the master sends information about AccessExclusiveLock
in XLOG_STANDBY_LOCK wal record (in xl_standby_locks struct). And when
the standby replays that, it should acquire those locks.

For regular DROP INDEX we send this:

rmgr: Standby     ... desc: LOCK xid 8888 db 16384 rel 16385 
rmgr: Standby     ... desc: LOCK xid 8888 db 16384 rel 20573 
... catalog changes ...
rmgr: Transaction ... desc: COMMIT 2019-10-23 22:42:27.950995 CEST;
    rels: base/16384/20573; inval msgs: catcache 32 catcache 7 catcache
    6 catcache 50 catcache 49 relcache 20573 relcache 16385 snapshot 2608

while for DROP IDNEX CONCURRENTLY we send this

rmgr: Heap        ... desc: INPLACE ... catalog update
rmgr: Standby     ... desc: INVALIDATIONS ; inval msgs: catcache 32
  relcache 21288 relcache 16385
rmgr: Heap        ... desc: INPLACE ... catalog update
rmgr: Standby     ... desc: INVALIDATIONS ; inval msgs: catcache 32
  relcache 21288 relcache 16385
rmgr: Standby     ... desc: LOCK xid 10326 db 16384 rel 21288 
... catalog updates ...
rmgr: Transaction ... desc: COMMIT 2019-10-23 23:47:10.042568 CEST;
   rels: base/16384/21288; inval msgs: catcache 32 catcache 7 catcache 6
   catcache 50 catcache 49 relcache 21288 relcache 16385 snapshot 2608

So just a single lock on the index, but not the lock on the relation
itself (which makes sense, because for DROP INDEX CONCURRENTLY we don't
get an exclusive lock on the table).

I'm not quite familiar with this part of the code, but the SELECT
backends are clearly getting a stale list of indexes from relcache, and
try to open an index which was already removed by the redo. We do
acquire the lock on the index itself, but that's not sufficient :-(

Not sure how to fix this. I wonder if we could invalidate the relcache
for the relation at some point. Or maybe we could add additional
information to the WAL to make the redo wait for all lock waiters, just
like on the master. But that might be tricky because of deadlocks, and
because  the redo could easily get "stuck" waiting for a long-running
queries.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



回复:回复:回复:Bug about drop index concurrently

From
"李杰(慎追)"
Date:
>That suggests you're doing a lot of 'drop index concurrently', right?

Not completely, In the actual scene,  in fact, I didn't perform too much 'drop index concurrently' on the master. I just just execute a lot of queries on the standby. You know, it will have a certain probability every time you execute ‘drop index concurrently’ on the master. Although it is small, it may appear.

>Yes, but we can't really do that, I'm afraid.
>
>We certainly can't do that on the master because we simply don't have
>the necessary information about locks from the standby, and we really
>don't want to have it, because with a busy standby that might be quite a
>bit of data (plust the standby would have to wait for the master to
>confirm each lock acquisition, I think which seems pretty terrible).
>
yeah,  we can't do this and will lose too much in order to achieve this.

>On the standby, we don't really have an idea that the there's a drop
>index running - we only get information about AE locks, and a bunch of
>catalog updates. I don't think we have a way to determine this is a drop
>index in concurrent mode :-(
>
>More preciresly, the master sends information about AccessExclusiveLock
>in XLOG_STANDBY_LOCK wal record (in xl_standby_locks struct). And when
>the standby replays that, it should acquire those locks.
>
>For regular DROP INDEX we send this:
>
>rmgr: Standby     ... desc: LOCK xid 8888 db 16384 rel 16385 
>rmgr: Standby     ... desc: LOCK xid 8888 db 16384 rel 20573 
>... catalog changes ...
>rmgr: Transaction ... desc: COMMIT 2019-10-23 22:42:27.950995 CEST;
>   rels: base/16384/20573; inval msgs: catcache 32 catcache 7 catcache
> 6 catcache 50 catcache 49 relcache 20573 relcache 16385 snapshot 2608
>
>while for DROP IDNEX CONCURRENTLY we send this
>
>rmgr: Heap        ... desc: INPLACE ... catalog update
>rmgr: Standby     ... desc: INVALIDATIONS ; inval msgs: catcache 32
>relcache 21288 relcache 16385
>rmgr: Heap        ... desc: INPLACE ... catalog update
>rmgr: Standby     ... desc: INVALIDATIONS ; inval msgs: catcache 32
>  relcache 21288 relcache 16385
>rmgr: Standby     ... desc: LOCK xid 10326 db 16384 rel 21288 
>... catalog updates ...
>rmgr: Transaction ... desc: COMMIT 2019-10-23 23:47:10.042568 CEST;
>   rels: base/16384/21288; inval msgs: catcache 32 catcache 7 catcache 6
>   catcache 50 catcache 49 relcache 21288 relcache 16385 snapshot 2608
>
yeah, you are right, I got this .

>So just a single lock on the index, but not the lock on the relation
>itself (which makes sense, because for DROP INDEX CONCURRENTLY we don'>t get an exclusive lock on the table).
>
>I'm not quite familiar with this part of the code, but the SELECT
>backends are clearly getting a stale list of indexes from relcache, and
>try to open an index which was already removed by the redo. We do
>acquire the lock on the index itself, but that's not sufficient :-(
>
>Not sure how to fix this. I wonder if we could invalidate the relcache
>for the relation at some point.
--method one --
 >Or maybe we could add additional
>information to the WAL to make the redo wait for all lock waiters, just
>like on the master. But that might be tricky because of deadlocks, and
>because  the redo could easily get "stuck" waiting for a long-running
>queries.
--method two --

for the method one , I don't think this can solve the problem at all. 

Because we can't predict on the standby when the master will 'drop index concurrently'. I also have a way to manually reproduce this bug in order to make it easier for you to understand the problem. You can try to follow the steps below:
1. We build a connection on the standby, then use gdb to attach to the backend process
2. We set a breakpoint at  plancat.c:196 in get_relation_info
3. We execute 'explain select ' on this backend, in gdb you will see that the breakpoint has hit, and the query will hang.
4.We execute 'drop index concurrently' on the master. (If 'drop index' will be blocked, because there is a query on our standby, the master can not get EXlock). 
5. on the standby ,let gdb continue.
We will see an error(ERROR:  could not open relation with OIDXXX) in the standby client.

Therefore, we can see that the query is executed first by the standby, and then the 'drop index concurrently' executed by the master. So no matter how fast we can make relcache invalid, there is no way to prevent this error from happening, because there is always a moment when the index has been dropped by the master. In other words, there is no way to predict the master's 'drop index concurrently' on standby.

for the method two, you are right, deadlock is possible, and the most unbearable is that it will bring delay to the log apply on standb, resulting in inconsistent data between the master and the standby.

All in all, I think this bug is a flaw in postgres design. We need to think carefully about how to handle it better. Even we can learn from other database products. I hope I can help you.

Thank you very much for your attention.

Regards.

adger.



------------------------------------------------------------------
发件人:Tomas Vondra <tomas.vondra@2ndquadrant.com>
发送时间:2019年10月24日(星期四) 06:04
收件人:李杰(慎追) <adger.lj@alibaba-inc.com>
抄 送:pgsql-hackers <pgsql-hackers@lists.postgresql.org>
主 题:Re: 回复:回复:Bug about drop index concurrently

On Wed, Oct 23, 2019 at 02:38:45PM +0800, 李杰(慎追) wrote:
>>
>>I'm a bit confused. You shouldn't see any crashes and/or core files in
>>this scenario, for two reasons. Firstly, I assume you're running a
>>regular build without asserts. Secondly, I had to add an extra assert
>>to trigger the failure. So what core are you talking about?
>>
>Sorry, I should explain it more clearly.  I saw the core file because I
>modified the postgres source code and added Assert to it.
>>

OK

>>Also, it's not clear to me what do you mean by "bug in the standby" or
>>no lock in the drop index concurrently. Can you explain?
>>
>"bug in the standby" means that we built a master-slave instance, when
>we executed a large number of queries on the standby, we executed 'drop
>index concurrently' on the master so that get ‘error’ in the standby.
>Although it is not 100%, it will appear.  no lock in the drop index
>concurrently :::  I think this is because there are not enough advanced
>locks when executing ‘ drop index concurrently’.
>

OK, thanks for the clarification. Yes, it won't appear every time, it's
likely timing-sensitive (I'll explain in a minute).

>>Hmm, so you observe the issue with regular queries, not just EXPLAIN
>>ANALYZE?
>
>yeah, we have seen this error frequently.
>

That suggests you're doing a lot of 'drop index concurrently', right?

>>>Of course, we considered applying the method of waiting to detect the
>>>query lock on the master to the standby, but worried about affecting
>>>the standby application log delay, so we gave up that.
>>>
>>I don't understand? What method?
>>
>
>I analyzed this problem, I used to find out the cause of this problem,
>I also executed 'drop index concurrently' and ‘explain select * from
>xxx’  on the master, but the bug did not appear as expected.  So I went
>to analyze the source code. I found that there is such a mechanism on
>the master that when the 'drop index concurrently' is execute, it wait
>will every transaction that saw the old index state has finished.
>source code is as follows follow as:
>
>WaitForLockers(heaplocktag, AccessExclusiveLock);
>
>Therefore, I think that if this method is also available in standby,
>then the error will not appear. but I worried about affecting the
>standby application log delay, so we gave up that.
>

Yes, but we can't really do that, I'm afraid.

We certainly can't do that on the master because we simply don't have
the necessary information about locks from the standby, and we really
don't want to have it, because with a busy standby that might be quite a
bit of data (plust the standby would have to wait for the master to
confirm each lock acquisition, I think which seems pretty terrible).

On the standby, we don't really have an idea that the there's a drop
index running - we only get information about AE locks, and a bunch of
catalog updates. I don't think we have a way to determine this is a drop
index in concurrent mode :-(

More preciresly, the master sends information about AccessExclusiveLock
in XLOG_STANDBY_LOCK wal record (in xl_standby_locks struct). And when
the standby replays that, it should acquire those locks.

For regular DROP INDEX we send this:

rmgr: Standby     ... desc: LOCK xid 8888 db 16384 rel 16385 
rmgr: Standby     ... desc: LOCK xid 8888 db 16384 rel 20573 
... catalog changes ...
rmgr: Transaction ... desc: COMMIT 2019-10-23 22:42:27.950995 CEST;
    rels: base/16384/20573; inval msgs: catcache 32 catcache 7 catcache
    6 catcache 50 catcache 49 relcache 20573 relcache 16385 snapshot 2608

while for DROP IDNEX CONCURRENTLY we send this

rmgr: Heap        ... desc: INPLACE ... catalog update
rmgr: Standby     ... desc: INVALIDATIONS ; inval msgs: catcache 32
  relcache 21288 relcache 16385
rmgr: Heap        ... desc: INPLACE ... catalog update
rmgr: Standby     ... desc: INVALIDATIONS ; inval msgs: catcache 32
  relcache 21288 relcache 16385
rmgr: Standby     ... desc: LOCK xid 10326 db 16384 rel 21288 
... catalog updates ...
rmgr: Transaction ... desc: COMMIT 2019-10-23 23:47:10.042568 CEST;
   rels: base/16384/21288; inval msgs: catcache 32 catcache 7 catcache 6
   catcache 50 catcache 49 relcache 21288 relcache 16385 snapshot 2608

So just a single lock on the index, but not the lock on the relation
itself (which makes sense, because for DROP INDEX CONCURRENTLY we don't
get an exclusive lock on the table).

I'm not quite familiar with this part of the code, but the SELECT
backends are clearly getting a stale list of indexes from relcache, and
try to open an index which was already removed by the redo. We do
acquire the lock on the index itself, but that's not sufficient :-(

Not sure how to fix this. I wonder if we could invalidate the relcache
for the relation at some point. Or maybe we could add additional
information to the WAL to make the redo wait for all lock waiters, just
like on the master. But that might be tricky because of deadlocks, and
because  the redo could easily get "stuck" waiting for a long-running
queries.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services