Re: [GENERAL] bloat indexes - opinion - Mailing list pgsql-general

From Patrick B
Subject Re: [GENERAL] bloat indexes - opinion
Date
Msg-id CAJNY3isDuHMuLJQuhf933mMp5NR-xct-kFFeJahRi5Tcq1O+yg@mail.gmail.com
Whole thread Raw
In response to Re: [GENERAL] bloat indexes - opinion  (Patrick B <patrickbakerbr@gmail.com>)
Responses Re: [GENERAL] bloat indexes - opinion  (Adrian Klaver <adrian.klaver@aklaver.com>)
List pgsql-general
2017-02-22 11:11 GMT+13:00 Patrick B <patrickbakerbr@gmail.com>:
2017-02-22 10:59 GMT+13:00 Adrian Klaver <adrian.klaver@aklaver.com>:
On 02/21/2017 01:44 PM, Patrick B wrote:
> Hi guys,
>
> I've got a lot of bloat indexes on my 4TB database.
>
> Let's take this example:
>
>     Table: seg
>     Index: ix_filter_by_tree
>     Times_used: 1018082183
>     Table_size: 18 GB -- wrong. The table is mostly on pg_toast table.
>     Its real size is 2TB

How do you know one number is right and the other is wrong?


1. on that table (seg) i store binary data. It is impossible to have only 18GB of it.
2. 
SELECT schema_name, 
       pg_size_pretty(sum(table_size)::bigint),
       (sum(table_size) / pg_database_size(current_database())) * 100
FROM (
  SELECT pg_catalog.pg_namespace.nspname as schema_name,
         pg_relation_size(pg_catalog.pg_class.oid) as table_size
  FROM   pg_catalog.pg_class
     JOIN pg_catalog.pg_namespace ON relnamespace = pg_catalog.pg_namespace.oid
) t
GROUP BY schema_name
ORDER BY schema_name

pg_toast 2706 GB 82.62112838877240860000 <-- this belongs to the seg table.

 

Have you looked at the functions here?:
https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT

>     Index_size: 17 GB
>     Num_writes 16245023
>     Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree
>     (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)
>
>
>
> What is the real impact of a bloat index? If I reindex it, queries will
> be faster?
>
> Thanks
> Patrick



I ran the query before and after the reindex, and it seems it did not help on performance.

The query I used:
explain analyze select * from seg where full_path = '/userfile/123';

Before reindex:
Index Scan using ix_filter_by_tree on seg  (cost=0.00..144.87 rows=215 width=8) (actual time=0.047..0.047 rows=1 loops=1)
  Index Cond: (full_path = '/userfile/123')
Total runtime: 0.059 ms
(3 rows)

After reindex:
Index Scan using ix_filter_by_tree on seg  (cost=0.00..141.83 rows=220 width=8) (actual time=0.021..0.021 rows=1 loops=1)
  Index Cond: (full_path = '/userfile/123')
Total runtime: 0.036 ms
(3 rows)

Note that the 'cost' is pretty much the same.

My question is:
If I have a bloat index. Why do I need to reindex it if I got none performance improvements? 

Cheers
Patrick

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: R: R: [GENERAL] Slow queries on very big (and partitioned) table
Next
From: François Beaulieu
Date:
Subject: Re: [GENERAL] NOTIFY command impact