Thread: Tow kinds of different result while using create index concurrently

Hello:

 

I  have question about PG's "create index concurrently". I think it is a bug perhaps.

 

I  make two tables tab01 and tab02, they have no relationships.

I think "create index concurrently " on tab02 will not be influenced by  transaction on tab01.

But the result differs:

 

My first program:        transaction via ecpg(with host variable as where condition),  psql's "create index concurrently" succeeded.

 

My second program:  transaction via ecpg,  psql's "create index concurrently" is blocked until  ecpg program disconnect.

 

My third Test:              transaction via psql,  another psql's "create index concurrently" succeeded.

 

My fourth Test:           transaction via psql(with pg_sleep),  another psql's "create index concurrently"  is blocked until psql transaction done(commit).

 

I am using PostgreSQL9.1.2. And on PostgreSQL9.2.4, the result is same.

 

My data:

--------------------------------------------------------------------

[postgres@server bin]$ ./psql -U tester -d tester

psql (9.1.2)

Type "help" for help.

 

tester=> \d tab01;

           Table "public.tab01"

 Column |         Type         | Modifiers

--------+----------------------+-----------

 id     | integer              |

 cd     | character varying(4) |

 

tester=> \d tab02;

     Table "public.tab02"

 Column |  Type   | Modifiers

--------+---------+-----------

 id     | integer |

 value  | integer |

 

tester=> select * from tab01;

 id | cd

----+----

  1 | 14

  2 | 15

  3 | 14

(3 rows)

 

tester=> select * from tab02;

 id | value

----+-------

  1 |   100

  2 |   200

  3 |   300

(3 rows)

 

tester=>

---------------------------------------------------------------------

 

My testing method  for  First program and Second program:

 

While my ecpg program is sleeping,

I open a terminal connect PG with psql,

then  send "create index concurrently idx_tab02_id_new on tab02(id)"

 

For my first  program, I can build index successfully.

For my second program, I can not build index, the sql statement is blocked until  ecpg program disconnect from PG.

 

My table tab01 and tab02 has no relationships.

And I don't think that  my ecpg program will potentially use the index of tab02.

 

In fact , If I look into  the c program created by ecpg-- test02.c

I can find this:

------------------

   { ECPGdo(__LINE__, 0, 1, "db_conn", 0, ECPGst_normal, "select count ( * ) from tab01 where cd = $1 ",

        ECPGt_char,(vcd),(long)4 + 1,(long)1,(4 + 1)*sizeof(char),

        ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L, ECPGt_EOIT,

        ECPGt_int,&(vCount),(long)1,(long)1,sizeof(int),

        ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L, ECPGt_EORT);}

------------------

If I quoted the $1 manually and then compile it, then I can "create index concurrently" while my ecpg program running:

 

------------------

   { ECPGdo(__LINE__, 0, 1, "db_conn", 0, ECPGst_normal, "select count ( * ) from tab01 where cd = '$1' ",

        ECPGt_char,(vcd),(long)4 + 1,(long)1,(4 + 1)*sizeof(char),

        ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L, ECPGt_EOIT,

        ECPGt_int,&(vCount),(long)1,(long)1,sizeof(int),

        ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L, ECPGt_EORT);}

------------------

 

 

Here is my program 's source:

***My first program:

-------------------------------------

[root@server soft]# cat ./test01/test01.pc

int main()

{

   EXEC SQL BEGIN DECLARE SECTION;

         int vCount;

         char vcd[4+1];                        

   EXEC SQL END DECLARE SECTION;

 

   EXEC SQL CONNECT TO 'tester@127.0.0.1:5432' AS db_conn

     USER tester IDENTIFIED BY tester;

                                                                     

   EXEC SQL AT db_conn SELECT COUNT(*)            

        INTO :vCount FROM tab01;

                                                                       

   fprintf(stderr,"count is:%d\n",vCount);

 

   fprintf(stderr,"Before disconnect,sleep for 500 seconds\n");

   sleep(500);

 

   EXEC SQL DISCONNECT db_conn;

 

   fprintf(stderr,"After disconnect,sleep for 600 seconds\n");

   sleep(600);

 

   return 0;                                                                                                                    

}

[root@server soft]#

--------------------------------------------------------

 

***My Second Program:

--------------------------------------------------------

 

[root@server soft]# cat ./test02/test02.pc

int main()

{

   EXEC SQL BEGIN DECLARE SECTION;

         int vCount;

         char vcd[4+1];                        

   EXEC SQL END DECLARE SECTION;

 

   EXEC SQL CONNECT TO 'tester@127.0.0.1:5432' AS db_conn

     USER tester IDENTIFIED BY tester;

  

   char *pCd="14";

 

   memset(vcd,'\0',5);

   strncpy(vcd, pCd,4);             

                                                                     

   EXEC SQL AT db_conn SELECT COUNT(*)            

        INTO :vCount FROM tab01 WHERE cd = :vcd;

                                                                        

   fprintf(stderr,"count is:%d\n",vCount);

 

   fprintf(stderr,"Before disconnect,sleep for 500 seconds\n");

   sleep(500);

 

   EXEC SQL DISCONNECT db_conn;

 

   fprintf(stderr,"After disconnect,sleep for 600 seconds\n");

   sleep(600);

 

   return 0;                                                                                                                    

}

 [root@server soft]#

--------------------------------------------------------

 

And also, I can find another strange phenomenon via psql about "create index concurrently":

This time I use two psql client:

 

***My Third Test:

----------------------------------------------------------------------

Client 1:

 

[postgres@server pgsql]$ ./bin/psql -d tester -U tester

psql (9.1.2)

Type "help" for help.

 

tester=> begin;

BEGIN

tester=> select * from tab01 where cd = '14';

 id | cd

----+----

  1 | 14

  3 | 14

(2 rows)

 

tester=>

 

Client 2:

After Client 1 make  a select,it does:

 

[postgres@server pgsql]$ ./bin/psql -d tester -U tester

psql (9.1.2)

Type "help" for help.

 

tester=> create index concurrently idx_tab02_id_new on tab02(id);

And then quickly succeeded.

----------------------------------------------------------------------

 

***My Fourth Test:

----------------------------------------------------------------------

Client 1:

[postgres@server pgsql]$ ./bin/psql -d tester -U tester

psql (9.1.2)

Type "help" for help.

 

tester=> begin;

BEGIN

tester=> select * from tab01 where cd = '14';

 id | cd

----+----

  1 | 14

  3 | 14

(2 rows)

 

tester=> select pg_sleep(500);

 pg_sleep

----------

 

(1 row)

 

tester=>

 

Client 2:

During client1's pg_sleep, or even after pg_sleep,

As far as client1 don’t  finish transaction. The "create index concurrently will not succeed":

 

[postgres@server pgsql]$ ./bin/psql -d tester -U tester

psql (9.1.2)

Type "help" for help.

 

tester=> create index concurrently idx_tab02_id_new on tab02(id);

 

----------------------------------------------------------------------

 

Thanks in advance.

 

Re: Tow kinds of different result while using create index concurrently

From
Jeff Janes
Date:
On Thu, Jun 20, 2013 at 1:27 AM, 高健 <luckyjackgao@gmail.com> wrote:

Hello:

 

I  have question about PG's "create index concurrently". I think it is a bug perhaps.

 

I  make two tables tab01 and tab02, they have no relationships.

I think "create index concurrently " on tab02 will not be influenced by  transaction on tab01.

But the result differs:


This is expected.  In order to not interfere with "normal" activity, a concurrent index build has to volunteer to be blocked by such activity instead.  From the doc: "When this option is used, PostgreSQL must perform two scans of the table, and in addition it must wait for all existing transactions that could potentially use the index to terminate."

Now in your case, perhaps the argument could be made that the transaction hosting the 1st concurrent build could not potentially use the 2nd-building index, but there is no convenient way for PostgreSQL to detect that fact.

 Cheers,

Jeff
Thanks Jeff

But What I can't understand is:
In  My first test, the "create index concurrently" works well. 
In My second test,  the "create index concurrently" can not work.

The difference is only on ecpg's select statement : 
One use host variable of char (its value is of integer 14) in select statement,  
While the other is just a simple select.

If the transaction will potentially the index, it should be same on my first test and second test.

My customer want to use PG on their 7x24 environment, while rebuilding index periodically.
If I can't do it on PG, it really confused me...

sincerely yours
Jian 

2013/6/21 Jeff Janes <jeff.janes@gmail.com>
On Thu, Jun 20, 2013 at 1:27 AM, 高健 <luckyjackgao@gmail.com> wrote:

Hello:

 

I  have question about PG's "create index concurrently". I think it is a bug perhaps.

 

I  make two tables tab01 and tab02, they have no relationships.

I think "create index concurrently " on tab02 will not be influenced by  transaction on tab01.

But the result differs:


This is expected.  In order to not interfere with "normal" activity, a concurrent index build has to volunteer to be blocked by such activity instead.  From the doc: "When this option is used, PostgreSQL must perform two scans of the table, and in addition it must wait for all existing transactions that could potentially use the index to terminate."

Now in your case, perhaps the argument could be made that the transaction hosting the 1st concurrent build could not potentially use the 2nd-building index, but there is no convenient way for PostgreSQL to detect that fact.

 Cheers,

Jeff

Hello: 

Sorry for disturbing again. 

I traced source code of PG, and found that:

When the create index concurrently statement is called,The following calling relationship is there:

PortalRunMulti--> PortalRunUtility-->Standard_ProcessUtility-->DefineIndex

Here I omit some code of DefineIndex function in order to say my point clearly:

{     

      …

      old_snapshots = GetCurrentVirtualXIDs(snapshot->xmin, true, false,

                       PROC_IS_AUTOVACUUM | PROC_IN_VACUUM, &n_old_snapshots);

       for (i = 0; i < n_old_snapshots; i++)

       {     

            …

            if (VirtualTransactionIdIsValid(old_snapshots[i]))

                  VirtualXactLockTableWait(old_snapshots[i]);

       }

      …

}  

For my first test program (mainly select * from tab02), After GetCurrentVirtualXIDs function run,n_old_snapshots is 0 ,The for (i = 0; i < n_old_snapshots; i++) loop will not be executed, 

So index creation is not blocked and succeeded.

For my second test program(mainly select * from tab02 where cd=14), After GetCurrentVirtualXIDs function run,n_old_snapshots is 1, The for (i = 0; i < n_old_snapshots; i++) loop will be executed, 

Then Because of VirtualXactLockTableWait(old_snapshots[i]) running, index creation is blocked.

 For the similar sql statement, the source code running logic differs, I think that there might be something wrong in the source code.

2013/6/21 高健 <luckyjackgao@gmail.com>
Thanks Jeff

But What I can't understand is:
In  My first test, the "create index concurrently" works well. 
In My second test,  the "create index concurrently" can not work.

The difference is only on ecpg's select statement : 
One use host variable of char (its value is of integer 14) in select statement,  
While the other is just a simple select.

If the transaction will potentially the index, it should be same on my first test and second test.

My customer want to use PG on their 7x24 environment, while rebuilding index periodically.
If I can't do it on PG, it really confused me...

sincerely yours
Jian 


2013/6/21 Jeff Janes <jeff.janes@gmail.com>
On Thu, Jun 20, 2013 at 1:27 AM, 高健 <luckyjackgao@gmail.com> wrote:

Hello:

 

I  have question about PG's "create index concurrently". I think it is a bug perhaps.

 

I  make two tables tab01 and tab02, they have no relationships.

I think "create index concurrently " on tab02 will not be influenced by  transaction on tab01.

But the result differs:


This is expected.  In order to not interfere with "normal" activity, a concurrent index build has to volunteer to be blocked by such activity instead.  From the doc: "When this option is used, PostgreSQL must perform two scans of the table, and in addition it must wait for all existing transactions that could potentially use the index to terminate."

Now in your case, perhaps the argument could be made that the transaction hosting the 1st concurrent build could not potentially use the 2nd-building index, but there is no convenient way for PostgreSQL to detect that fact.

 Cheers,

Jeff


Hello:

 

When I traced the source code of PG more deeply.

I can find the following:

 

VirtualTransactionId *

GetCurrentVirtualXIDs(TransactionId limitXmin, bool excludeXmin0,

                      bool allDbs, int excludeVacuum,

                      int *nvxids)

{

    ……

    for (index = 0; index < arrayP->numProcs; index++)

    {

        volatile PGPROC *proc = arrayP->procs[index];

 

        ……

 

        if (allDbs || proc->databaseId == MyDatabaseId)

        {

            /* Fetch xmin just once - might change on us */

            TransactionId pxmin = proc->xmin;


            if (excludeXmin0 && !TransactionIdIsValid(pxmin))

                continue;

            /*

             * InvalidTransactionId precedes all other XIDs, so a proc that

             * hasn't set xmin yet will not be rejected by this test.

             */

            if (!TransactionIdIsValid(limitXmin) ||

                TransactionIdPrecedesOrEquals(pxmin, limitXmin))

            {

                VirtualTransactionId vxid;

 

                GET_VXID_FROM_PGPROC(vxid, *proc);


                if (VirtualTransactionIdIsValid(vxid))

                    vxids[count++] = vxid;

            }

        }

    }

 

    LWLockRelease(ProcArrayLock);

 

    *nvxids = count;

    return vxids;

}

 

For my first test program,  when  <if (allDbs || proc->databaseId == MyDatabaseId) > condition is met,

The pxmin is 0, so <(excludeXmin0 && !TransactionIdIsValid(pxmin))> is met, there Is no chance to make count++.

 

For my second test program,  when  <if (allDbs || proc->databaseId == MyDatabaseId) > condition is met,

The pxmin is not 0, so <(excludeXmin0 && !TransactionIdIsValid(pxmin))> is not met, The vxids[count++] = vxid is executed.

 

The pxmin is from proc->xmin.

And I heard that when a process is working, it get transaction id from system, then use it as xmin when inserting a record.

Why the proc->xmin can be 0 ? Is it a bug?



2013/6/25 高健 <luckyjackgao@gmail.com>

Hello: 

Sorry for disturbing again. 

I traced source code of PG, and found that:

When the create index concurrently statement is called,The following calling relationship is there:

PortalRunMulti--> PortalRunUtility-->Standard_ProcessUtility-->DefineIndex

Here I omit some code of DefineIndex function in order to say my point clearly:

{     

      …

      old_snapshots = GetCurrentVirtualXIDs(snapshot->xmin, true, false,

                       PROC_IS_AUTOVACUUM | PROC_IN_VACUUM, &n_old_snapshots);

       for (i = 0; i < n_old_snapshots; i++)

       {     

            …

            if (VirtualTransactionIdIsValid(old_snapshots[i]))

                  VirtualXactLockTableWait(old_snapshots[i]);

       }

      …

}  

For my first test program (mainly select * from tab02), After GetCurrentVirtualXIDs function run,n_old_snapshots is 0 ,The for (i = 0; i < n_old_snapshots; i++) loop will not be executed, 

So index creation is not blocked and succeeded.

For my second test program(mainly select * from tab02 where cd=14), After GetCurrentVirtualXIDs function run,n_old_snapshots is 1, The for (i = 0; i < n_old_snapshots; i++) loop will be executed, 

Then Because of VirtualXactLockTableWait(old_snapshots[i]) running, index creation is blocked.

 For the similar sql statement, the source code running logic differs, I think that there might be something wrong in the source code.


2013/6/21 高健 <luckyjackgao@gmail.com>
Thanks Jeff

But What I can't understand is:
In  My first test, the "create index concurrently" works well. 
In My second test,  the "create index concurrently" can not work.

The difference is only on ecpg's select statement : 
One use host variable of char (its value is of integer 14) in select statement,  
While the other is just a simple select.

If the transaction will potentially the index, it should be same on my first test and second test.

My customer want to use PG on their 7x24 environment, while rebuilding index periodically.
If I can't do it on PG, it really confused me...

sincerely yours
Jian 


2013/6/21 Jeff Janes <jeff.janes@gmail.com>
On Thu, Jun 20, 2013 at 1:27 AM, 高健 <luckyjackgao@gmail.com> wrote:

Hello:

 

I  have question about PG's "create index concurrently". I think it is a bug perhaps.

 

I  make two tables tab01 and tab02, they have no relationships.

I think "create index concurrently " on tab02 will not be influenced by  transaction on tab01.

But the result differs:


This is expected.  In order to not interfere with "normal" activity, a concurrent index build has to volunteer to be blocked by such activity instead.  From the doc: "When this option is used, PostgreSQL must perform two scans of the table, and in addition it must wait for all existing transactions that could potentially use the index to terminate."

Now in your case, perhaps the argument could be made that the transaction hosting the 1st concurrent build could not potentially use the 2nd-building index, but there is no convenient way for PostgreSQL to detect that fact.

 Cheers,

Jeff