Thread: Create index concurrently hanging with big table on pgsql 9.3.12

Create index concurrently hanging with big table on pgsql 9.3.12

From
Robert Anderson
Date:
Hi,

We are trying to create a index concurrently but, at least apparently, it hangs in a infinite loop and never ends.

Our version:
 
flip=# select version();
                                                    version
--------------------------------
 PostgreSQL 9.3.12 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 2012
0313 (Red Hat 4.4.7-16), 64-bit
(1 row)


Index creation:

CREATE INDEX CONCURRENTLY index_texto
  ON flip_pagina_edicao
  USING hash
  (texto COLLATE pg_catalog."default");  -- texto is a text data type.

Size of the table:

flip=# select pg_size_pretty(pg_total_relation_size('flip_pagina_edicao'));
 pg_size_pretty
----------------
 956 GB
(1 row)


Process strace:

....
semop(622611, {{6, 1, 0}}, 1)           = 0
semop(655380, {{5, -1, 0}}, 1)          = 0
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
semop(655380, {{5, -1, 0}}, 1)          = 0
semop(622611, {{6, 1, 0}}, 1)           = 0
semop(655380, {{5, -1, 0}}, 1)          = 0
semop(622611, {{6, 1, 0}}, 1)           = 0
semop(622611, {{6, 1, 0}}, 1)           = 0
semop(622611, {{6, 1, 0}}, 1)           = 0
semop(655380, {{5, -1, 0}}, 1)          = 0
semop(622611, {{6, 1, 0}}, 1)           = 0
semop(655380, {{5, -1, 0}}, 1)          = 0
semop(622611, {{6, 1, 0}}, 1)           = 0
semop(557073, {{2, 1, 0}}, 1)           = 0
semop(655380, {{5, -1, 0}}, 1)          = 0
semop(655380, {{5, -1, 0}}, 1)          = 0
semop(655380, {{5, -1, 0}}, 1)          = 0
semop(655380, {{5, -1, 0}}, 1)          = 0
...


Thanks in advance.

Robert

Re: Create index concurrently hanging with big table on pgsql 9.3.12

From
"Joshua D. Drake"
Date:
On 05/09/2016 05:04 AM, Robert Anderson wrote:
> Hi,
>
> We are trying to create a index concurrently but, at least apparently,
> it hangs in a infinite loop and never ends.

Apparently how?

How long did you wait?

JD

--
Command Prompt, Inc.                  http://the.postgres.company/
                         +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


Re: Create index concurrently hanging with big table on pgsql 9.3.12

From
Melvin Davidson
Date:
Try the following query. See if it shows you if another transaction is blocking the needed locks to create the index.

SELECT
   w.query as waiting_query,
   w.pid as waiting_pid,
   w.usename as w_user,
   l.pid as blocking_pid,
   l.usename as blocking_user,
   t.schemaname || '.' || t.relname as tablename
   FROM pg_stat_activity w
   JOIN pg_locks l1 ON (w.pid = l1.pid and not l1.granted)
   JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
   JOIN pg_stat_activity l ON (l2.pid = l.pid)
   JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
   WHERE w.waiting;


On Mon, May 9, 2016 at 11:50 AM, Joshua D. Drake <jd@commandprompt.com> wrote:
On 05/09/2016 05:04 AM, Robert Anderson wrote:
Hi,

We are trying to create a index concurrently but, at least apparently,
it hangs in a infinite loop and never ends.

Apparently how?

How long did you wait?

JD

--
Command Prompt, Inc.                  http://the.postgres.company/
                        +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: Create index concurrently hanging with big table on pgsql 9.3.12

From
Robert Anderson
Date:
Hi,

There aren't transactions blocking:

postgres=# SELECT
postgres-#    w.query as waiting_query,
postgres-#    w.pid as waiting_pid,
postgres-#    w.usename as w_user,
postgres-#    l.pid as blocking_pid,
postgres-#    l.usename as blocking_user,
postgres-#    t.schemaname || '.' || t.relname as tablename
postgres-#    FROM pg_stat_activity w
postgres-#    JOIN pg_locks l1 ON (w.pid = l1.pid and not l1.granted)
postgres-#    JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
postgres-#    JOIN pg_stat_activity l ON (l2.pid = l.pid)
postgres-#    JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
postgres-#    WHERE w.waiting;
 waiting_query | waiting_pid | w_user | blocking_pid | blocking_user | tablename
---------------+-------------+--------+--------------+---------------+-----------
(0 rows)

How long I'm waiting:

postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_activity where query like 'CREATE%';
-[ RECORD 1 ]----+----------------------------------------
datid            | 16434
datname          | flip
pid              | 3990
usesysid         | 10
usename          | postgres
application_name | psql
client_addr      |
client_hostname  |
client_port      | -1
backend_start    | 2016-05-07 11:48:39.218398-03
xact_start       | 2016-05-07 11:48:43.417734-03
query_start      | 2016-05-07 11:48:43.417734-03
state_change     | 2016-05-07 11:48:43.417742-03
waiting          | f
state            | active
query            | CREATE INDEX CONCURRENTLY index_texto
                 |   ON flip_pagina_edicao
                 |   USING hash
                 |   (texto COLLATE pg_catalog."default");

postgres=#


In attachment follows a strace sample of the running process.

2016-05-09 13:25 GMT-03:00 Melvin Davidson <melvin6925@gmail.com>:
Try the following query. See if it shows you if another transaction is blocking the needed locks to create the index.

SELECT
   w.query as waiting_query,
   w.pid as waiting_pid,
   w.usename as w_user,
   l.pid as blocking_pid,
   l.usename as blocking_user,
   t.schemaname || '.' || t.relname as tablename
   FROM pg_stat_activity w
   JOIN pg_locks l1 ON (w.pid = l1.pid and not l1.granted)
   JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
   JOIN pg_stat_activity l ON (l2.pid = l.pid)
   JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
   WHERE w.waiting;


On Mon, May 9, 2016 at 11:50 AM, Joshua D. Drake <jd@commandprompt.com> wrote:
On 05/09/2016 05:04 AM, Robert Anderson wrote:
Hi,

We are trying to create a index concurrently but, at least apparently,
it hangs in a infinite loop and never ends.

Apparently how?

How long did you wait?

JD


--
Command Prompt, Inc.                  http://the.postgres.company/
                        +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Attachment
Robert Anderson <ranomail@gmail.com> writes:
> There aren't transactions blocking:

> postgres=# SELECT
> postgres-#    w.query as waiting_query,
> postgres-#    w.pid as waiting_pid,
> postgres-#    w.usename as w_user,
> postgres-#    l.pid as blocking_pid,
> postgres-#    l.usename as blocking_user,
> postgres-#    t.schemaname || '.' || t.relname as tablename
> postgres-#    FROM pg_stat_activity w
> postgres-#    JOIN pg_locks l1 ON (w.pid = l1.pid and not l1.granted)
> postgres-#    JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
> postgres-#    JOIN pg_stat_activity l ON (l2.pid = l.pid)
> postgres-#    JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
> postgres-#    WHERE w.waiting;
>  waiting_query | waiting_pid | w_user | blocking_pid | blocking_user |
> tablename
> ---------------+-------------+--------+--------------+---------------+-----------
> (0 rows)

This test proves little, because that last JOIN will discard locks on
non-table objects, and what CREATE INDEX CONCURRENTLY would be most
likely to be blocked on is transaction VXIDs.  However, since
pg_stat_activity claims that "waiting" is false, probably there isn't
anything in pg_locks.  Still, it'd be better to do
"select * from pg_stat_activity where pid = 3990" and be sure.

            regards, tom lane


Re: Create index concurrently hanging with big table on pgsql 9.3.12

From
Robert Anderson
Date:
Only one line returned:

postgres=# select * from pg_stat_activity where pid=3990;
-[ RECORD 1 ]----+----------------------------------------
datid            | 16434
datname          | flip
pid              | 3990
usesysid         | 10
usename          | postgres
application_name | psql
client_addr      |
client_hostname  |
client_port      | -1
backend_start    | 2016-05-07 11:48:39.218398-03
xact_start       | 2016-05-07 11:48:43.417734-03
query_start      | 2016-05-07 11:48:43.417734-03
state_change     | 2016-05-07 11:48:43.417742-03
waiting          | f
state            | active
query            | CREATE INDEX CONCURRENTLY index_texto
                 |   ON flip_pagina_edicao
                 |   USING hash
                 |   (texto COLLATE pg_catalog."default");

postgres=#

2016-05-09 14:20 GMT-03:00 Tom Lane <tgl@sss.pgh.pa.us>:
Robert Anderson <ranomail@gmail.com> writes:
> There aren't transactions blocking:

> postgres=# SELECT
> postgres-#    w.query as waiting_query,
> postgres-#    w.pid as waiting_pid,
> postgres-#    w.usename as w_user,
> postgres-#    l.pid as blocking_pid,
> postgres-#    l.usename as blocking_user,
> postgres-#    t.schemaname || '.' || t.relname as tablename
> postgres-#    FROM pg_stat_activity w
> postgres-#    JOIN pg_locks l1 ON (w.pid = l1.pid and not l1.granted)
> postgres-#    JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
> postgres-#    JOIN pg_stat_activity l ON (l2.pid = l.pid)
> postgres-#    JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
> postgres-#    WHERE w.waiting;
>  waiting_query | waiting_pid | w_user | blocking_pid | blocking_user |
> tablename
> ---------------+-------------+--------+--------------+---------------+-----------
> (0 rows)

This test proves little, because that last JOIN will discard locks on
non-table objects, and what CREATE INDEX CONCURRENTLY would be most
likely to be blocked on is transaction VXIDs.  However, since
pg_stat_activity claims that "waiting" is false, probably there isn't
anything in pg_locks.  Still, it'd be better to do
"select * from pg_stat_activity where pid = 3990" and be sure.

                        regards, tom lane

Re: Create index concurrently hanging with big table on pgsql 9.3.12

From
"Joshua D. Drake"
Date:
On 05/09/2016 10:32 AM, Robert Anderson wrote:
> Only one line returned:
>
> postgres=# select * from pg_stat_activity where pid=3990;
> -[ RECORD 1 ]----+----------------------------------------
> datid            | 16434
> datname          | flip
> pid              | 3990
> usesysid         | 10
> usename          | postgres
> application_name | psql
> client_addr      |
> client_hostname  |
> client_port      | -1
> backend_start    | 2016-05-07 11:48:39.218398-03
> xact_start       | 2016-05-07 11:48:43.417734-03
> query_start      | 2016-05-07 11:48:43.417734-03
> state_change     | 2016-05-07 11:48:43.417742-03
> waiting          | f
> state            | active
> query            | CREATE INDEX CONCURRENTLY index_texto
>                   |   ON flip_pagina_edicao
>                   |   USING hash
>                   |   (texto COLLATE pg_catalog."default");

So it isn't blocked by a lock. You still haven't stated why you think it
is blocked? How long has it been taking? What is your
maintenance_work_mem set to?

JD



--
Command Prompt, Inc.                  http://the.postgres.company/
                         +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


Re: Create index concurrently hanging with big table on pgsql 9.3.12

From
Robert Anderson
Date:
 You still haven't stated why you think it is blocked? 

Ouput from iotop:

3990 be/4 postgres    0.00 B/s    0.00 B/s  0.00 %  0.00 % postgres: postgres flip [local] CREATE INDEX

The process isn't reading or writing anything for many hours, but it's using almost 90% of CPU.

How long has it been taking? 

backend_start    | 2016-05-07 11:48:39.218398-03

More than 50 hours.

What is your maintenance_work_mem set to?

maintenance_work_mem = 352MB



2016-05-09 14:34 GMT-03:00 Joshua D. Drake <jd@commandprompt.com>:
On 05/09/2016 10:32 AM, Robert Anderson wrote:
Only one line returned:

postgres=# select * from pg_stat_activity where pid=3990;
-[ RECORD 1 ]----+----------------------------------------
datid            | 16434
datname          | flip
pid              | 3990
usesysid         | 10
usename          | postgres
application_name | psql
client_addr      |
client_hostname  |
client_port      | -1
backend_start    | 2016-05-07 11:48:39.218398-03
xact_start       | 2016-05-07 11:48:43.417734-03
query_start      | 2016-05-07 11:48:43.417734-03
state_change     | 2016-05-07 11:48:43.417742-03
waiting          | f
state            | active
query            | CREATE INDEX CONCURRENTLY index_texto
                  |   ON flip_pagina_edicao
                  |   USING hash
                  |   (texto COLLATE pg_catalog."default");

So it isn't blocked by a lock. You still haven't stated why you think it is blocked? How long has it been taking? What is your maintenance_work_mem set to?


JD



--
Command Prompt, Inc.                  http://the.postgres.company/
                        +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.

Robert Anderson <ranomail@gmail.com> writes:
> Only one line returned:
> postgres=# select * from pg_stat_activity where pid=3990;

Aaah, sorry, that was a brain fade.  I meant to ask about rows in
pg_locks with that pid.

            regards, tom lane