Home > mailing lists

Re: Reindex taking forever, and 99% CPU - Mailing list pgsql-general

From	Phoenix Kiula
Subject	Re: Reindex taking forever, and 99% CPU
Date	August 7, 2014 04:52:31
Msg-id	CAFWfU=t7t8rxNnvWN4ByhHq44O1LtCBnU_upYEW-jco1Fw_m5Q@mail.gmail.com Whole thread Raw
In response to	Re: Reindex taking forever, and 99% CPU (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses	Re: Reindex taking forever, and 99% CPU (Kevin Grittner <kgrittn@ymail.com>)
List	pgsql-general

Tree view

Thank you for the very specific idea of pg_stat_user.

This is what I see (the output is also included in email below, but
this is easier to read) --
https://gist.github.com/anonymous/53f748a8c6c454b804b3

The output here  (might become a jumbled mess)--

=# SELECT * from pg_stat_user_tables where relname='bigtb';
 relid  | schemaname | relname | seq_scan | seq_tup_read | idx_scan |
idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd |
n_live_tup | n_dead_tup |          last_vacuum          |
last_autovacuum |         last_analyze          | last_autoanalyze

--------+------------+---------+----------+--------------+----------+---------------+-----------+-----------+-----------+---------------+------------+------------+-------------------------------+-----------------+-------------------------------+------------------
 105954 | public     | bigtb   |        0 |            0 |   220396 |
       89781 |     63516 |         6 |       910 |             1 |
634879579 |         39 | 2014-08-06 20:12:47.163055-04 |
  | 2014-08-06 20:19:40.317771-04 | (1 row)

Time: 50.844 ms



We spent some time to do some massive cleaning of the data from this
table. Brought it down to around 630 million rows. Overall size of the
table including indexes is about 120GB anyway.

More stats that we could manage are pretty-pasted here:
https://gist.github.com/anonymous/21aaeae10584013c3820

The biggest table ("bigtb" -- codename for pasting on public forum)
stores some URLs. The most important index is for this table is the
"alias" column, which is varchar(35) as you can see.

Table definition also pasted below:



                              Table "public.bigtb"
     Column      |            Type             |            Modifiers
-----------------+-----------------------------+---------------------------------
 alias           | character varying(35)       | not null
 url             | text                        | not null
 user_registered | boolean                     |
 private_key     | character varying(6)        | default NULL::character varying
 modify_date     | timestamp without time zone | default now()
 ip              | bigint                      |
 url_md5         | text                        |

Indexes:
    "idx_bigtb_pkey" PRIMARY KEY, btree (alias)
    "idx_bigtb_ip_url" UNIQUE, btree (ip, url_md5)
    "idx_bigtb_modify_date" btree (modify_date)
    "idx_bigtb_urlmd5" btree (url_md5)
Check constraints:
    "bigtb_alias_check" CHECK (alias::text ~ '[-.~a-z0-9_]'::text)
Referenced by:
    TABLE "bigtb_registered" CONSTRAINT "fk_bigtb_registered" FOREIGN
KEY (alias) REFERENCES bigtb(alias) MATCH FULL ON UPDATE CASCADE ON
DELETE CASCADE
    TABLE "interesting" CONSTRAINT "interesting_alias_fkey" FOREIGN
KEY (alias) REFERENCES bigtb(alias) MATCH FULL ON UPDATE CASCADE
Rules:
    __track_bigtb_deleted AS
    ON DELETE TO bigtb
   WHERE NOT (EXISTS ( SELECT bigtb_deleted.alias
           FROM bigtb_deleted
          WHERE bigtb_deleted.alias::text = old.alias::text)) DO
INSERT INTO bigtb_deleted (alias, url, user_registered, modify_date)
  VALUES (old.alias, old.url, old.user_registered, old.modify_date)




What else could I do here?

As you will see in the code shared above (GIST Github link) the stats
for this table are:

bigtb -

   row count: 634,879,168
   inserted:  65613
   updated: 6
   deleted:  1013



There are recent numbers. The DB has been going down often. But
deletions would be around 20,000 per week. Updates are lowest. INSERT
and SELECT are huge, with of course SELECT being the biggest activity
(high traffic website).

We did put PGBouncer for some pooling benefits, and "memcached" for
taking some load off the postgresql server. As of this writing, the
memcached thing is caching around 200,000 URLs which would otherwise
have been a query based on the index on the "alias" column --
"idx_bigtb_pkey".

What other info can I share?

Suppose we might have to explore partitioning, which would probably be
via first letter of the alias? This would lead to around 26 + 9 = 35
sub-tables. Is this too many?

My CONFIG settings:


max_connections                 = 180       # Was 250!  -
http://www.php.net/manual/en/function.pg-pconnect.php#20309
superuser_reserved_connections  = 5
shared_buffers                  = 512MB
effective_cache_size            = 1200MB   # Nov 11 2011, was 1500MB
temp_buffers                    = 32MB     # min 800kB
maintenance_work_mem            = 320MB    # min 1MB, was 128MB
work_mem                        = 64MB
wal_buffers                     = 20MB     # min 32kB
fsync                           = on       # turns forced
synchronization on or off
checkpoint_segments             = 128      # was 128
checkpoint_timeout              = 1000     # was 1000
enable_indexscan                = on
log_min_duration_statement      = 1000



Much appreciate any further ideas!





On Sun, Aug 3, 2014 at 9:29 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> On 08/02/2014 07:37 PM, Phoenix Kiula wrote:
>>>
>>> In your original post you said it was stopping on pg_class so now I am
>>> confused.
>>
>>
>>
>>
>> No need to be confused. The vacuum thing is a bit tricky for laymen
>> like myself. The "pg_class" seemed to be associated to this table.
>> Anyway, even before the upgrade, the vacuum was stopping at this table
>> and taking forever.
>>
>> The question is: what now. Where can I give you information from?
>> IOSTAT I've already shared.
>>
>> Will the work_mem settings affect the manual REINDEX that's still
>> running? What can I do to speed up the REINDEX? Should I change my
>> autovacuum settings for this table specifcally (it's the only mammoth
>> table in the DB, and our main one)?
>
>
> Adding to my previous post, some information from the statistic collector
> would be useful. See here for more information:
>
> http://www.postgresql.org/docs/9.0/static/monitoring-stats.html#MONITORING-STATS-VIEWS-TABLE
>
> For now the output of:
>
> SELECT * from pg_stat_user_tables where relname='your_table_name';
>
> might prove helpful.
>
>
>>
>> Thanks.
>>
>>
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com

pgsql-general by date:

From: Peter Geoghegan
Date: 07 August 2014, 04:47:01
Subject: Re: The dangers of streaming across versions of glibc: A cautionary tale

From: David G Johnston
Date: 07 August 2014, 04:52:53
Subject: Re: Need help in tuning

Re: Reindex taking forever, and 99% CPU - Mailing list pgsql-general

Previous

Next