Thread: How to fast the REINDEX

How to fast the REINDEX

From
raghavendra t
Date:
Hi All,
 
I have a table with 40GB size, it has few indexes on it. When i try to REINDEX on the table, its take a long time. I tried increasing the maintenance_work_mem, but still i havnt find a satisfying result.
 
Questions
=======
1. What are the parameters will effect, when issuing the REINDEX command
2. Best possible way to increase the spead of the REINDEX
 
Thanks in Advance
 
Regards
Raghavendra
 

Re: How to fast the REINDEX

From
"Kevin Grittner"
Date:
raghavendra t <raagavendra.rao@gmail.com> wrote:

> I have a table with 40GB size, it has few indexes on it.

What does the table look like?  What indexes are there?

> When i try to REINDEX on the table,

Why are you doing that?

> its take a long time.

How long?

> I tried increasing the maintenance_work_mem, but still i havnt
> find a satisfying result.

What run time are you expecting?

> Questions
> =======
> 1. What are the parameters will effect, when issuing the REINDEX
>    command
> 2. Best possible way to increase the spead of the REINDEX

It's hard to answer that without more information, like PostgreSQL
version and configuration, for starters.  See:

http://wiki.postgresql.org/wiki/SlowQueryQuestions

My best guess is that you can make them instantaneous by not running
them.  A good VACUUM policy should make such runs unnecessary in
most cases -- at least on recent PostgreSQL versions.

-Kevin

Re: How to fast the REINDEX

From
raghavendra t
Date:
Hi Kevin,
 
Thank you for the update,
 
>>What does the table look like?  What indexes are there?
Table has a combination of byteas. Indexes are b-tree and Partial
 
>>Why are you doing that?
Our table face lot of updates and deletes in a day, so we prefer reindex to update the indexes as well overcome with a corrupted index.
 
>> How long?
More than 4 hrs..
 
>>What run time are you expecting?
Less than what it is taking at present.
 
>>It's hard to answer that without more information, like PostgreSQL
>>version and configuration, for starters.  See:
                                                  version
------------------------------------------------------------------------------------------------------------
 PostgreSQL 8.4.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-44), 32-bit
(1 row)
Expected the performance question..
 
Regards
Raghavendra

On Thu, Apr 1, 2010 at 2:32 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
raghavendra t <raagavendra.rao@gmail.com> wrote:

> I have a table with 40GB size, it has few indexes on it.

What does the table look like?  What indexes are there?

> When i try to REINDEX on the table,

Why are you doing that?

> its take a long time.

How long?

> I tried increasing the maintenance_work_mem, but still i havnt
> find a satisfying result.

What run time are you expecting?

> Questions
> =======
> 1. What are the parameters will effect, when issuing the REINDEX
>    command
> 2. Best possible way to increase the spead of the REINDEX

It's hard to answer that without more information, like PostgreSQL
version and configuration, for starters.  See:

http://wiki.postgresql.org/wiki/SlowQueryQuestions

My best guess is that you can make them instantaneous by not running
them.  A good VACUUM policy should make such runs unnecessary in
most cases -- at least on recent PostgreSQL versions.

-Kevin

Re: How to fast the REINDEX

From
"Kevin Grittner"
Date:
raghavendra t <raagavendra.rao@gmail.com> wrote:

> overcome with a corrupted index.

If this is a one-time fix for a corrupted index, did you look at
CREATE INDEX CONCURRENTLY?  You could avoid any down time while you
fix things up.

http://www.postgresql.org/docs/8.4/interactive/sql-createindex.html

-Kevin

Re: How to fast the REINDEX

From
raghavendra t
Date:
If this is a one-time fix for a corrupted index, did you look at
CREATE INDEX CONCURRENTLY?  You could avoid any down time while you
fix things up.
Using CREATE INDEX CONCURRENTLY will avoid the exclusive locks on the table, but my question is, how to get a performance on the existing indexes. You mean to say , drop the existing indexes and create the index with CONCURRENTLY. Does this give the performance back.
 
Regards
Raghavendra
 
 
On Thu, Apr 1, 2010 at 3:10 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
raghavendra t <raagavendra.rao@gmail.com> wrote:

> overcome with a corrupted index.

If this is a one-time fix for a corrupted index, did you look at
CREATE INDEX CONCURRENTLY?  You could avoid any down time while you
fix things up.

http://www.postgresql.org/docs/8.4/interactive/sql-createindex.html

-Kevin

Re: How to fast the REINDEX

From
"Kevin Grittner"
Date:
raghavendra t <raagavendra.rao@gmail.com> wrote:

> my question is, how to get a performance on the existing indexes.
> You mean to say , drop the existing indexes and create the index
> with CONCURRENTLY. Does this give the performance back.

You would normally want to create first and then drop the old ones,
unless the old ones are hopelessly corrupted.  Since you still
haven't given me any information to suggest you need to reindex
except for the mention of corruption, or any information to help
identify where the performance bottleneck is, I can't see any other
improvements to suggest at this point.

-Kevin

Re: How to fast the REINDEX

From
raghavendra t
Date:
Thank you for the suggestion.

On Thu, Apr 1, 2010 at 3:21 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
raghavendra t <raagavendra.rao@gmail.com> wrote:

> my question is, how to get a performance on the existing indexes.
> You mean to say , drop the existing indexes and create the index
> with CONCURRENTLY. Does this give the performance back.

You would normally want to create first and then drop the old ones,
unless the old ones are hopelessly corrupted.  Since you still
haven't given me any information to suggest you need to reindex
except for the mention of corruption, or any information to help
identify where the performance bottleneck is, I can't see any other
improvements to suggest at this point.

-Kevin

Re: How to fast the REINDEX

From
"Kevin Grittner"
Date:
raghavendra t <raagavendra.rao@gmail.com> wrote:
> Thank you for the suggestion.

I'm sorry I couldn't come up with more, but what you've provided so
far is roughly equivalent to me telling you that it takes over four
hours to travel to see my Uncle Jim, and then asking you how I can
find out how he's doing in less time than that.  There's just not
much to go on.  :-(

If you proceed with the course suggested in the URL I referenced,
people on the list have a chance to be more helpful to you.

-Kevin

Re: How to fast the REINDEX

From
Jaime Casanova
Date:
On Wed, Mar 31, 2010 at 5:33 PM, raghavendra t
<raagavendra.rao@gmail.com> wrote:
>
>>>Why are you doing that?
> Our table face lot of updates and deletes in a day, so we prefer reindex to
> update the indexes as well overcome with a corrupted index.
>

do you have a corrupted index? if not, there is nothing to do...
REINDEX is not a mantenance task on postgres

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: How to fast the REINDEX

From
raghavendra t
Date:
I'm sorry I couldn't come up with more, but what you've provided so
far is roughly equivalent to me telling you that it takes over four
hours to travel to see my Uncle Jim, and then asking you how I can
find out how he's doing in less time than that.  There's just not
much to go on.  :-(

If you proceed with the course suggested in the URL I referenced,
people on the list have a chance to be more helpful to you.
Instead of looking into the priority of the question or where it has to be posted, it would be appreciated to keep a discussion to the point mentioned.  Truely this question belong to some other place as you have mentioned in the URL. But answer for Q1 might be expected alteast. Hope i could get the information from the other Thread in other catagory.
 
Thank you
 
Regards
Raghavendra
 
 
On Thu, Apr 1, 2010 at 3:40 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
raghavendra t <raagavendra.rao@gmail.com> wrote:
> Thank you for the suggestion.

I'm sorry I couldn't come up with more, but what you've provided so
far is roughly equivalent to me telling you that it takes over four
hours to travel to see my Uncle Jim, and then asking you how I can
find out how he's doing in less time than that.  There's just not
much to go on.  :-(

If you proceed with the course suggested in the URL I referenced,
people on the list have a chance to be more helpful to you.

-Kevin

Re: How to fast the REINDEX

From
Craig Ringer
Date:
Jaime Casanova wrote:
> On Wed, Mar 31, 2010 at 5:33 PM, raghavendra t
> <raagavendra.rao@gmail.com> wrote:
>>>> Why are you doing that?
>> Our table face lot of updates and deletes in a day, so we prefer reindex to
>> update the indexes as well overcome with a corrupted index.
>>
>
> do you have a corrupted index? if not, there is nothing to do...
> REINDEX is not a mantenance task on postgres

Actually, if your free_space_map (pre 8.4) isn't up to keeping track of
bloat, or autovac isn't running enough, you're likely to get bloat of
indexes as well as tables that may need VACUUM FULL + REINDEX to
properly clean up.

It's probably better to fix your fsm/autovac settings then CLUSTER the
table so it doesn't happen again, though.

--
Craig Ringer

Re: How to fast the REINDEX

From
Steve Clark
Date:
On 03/31/2010 11:11 PM, Craig Ringer wrote:
> Jaime Casanova wrote:
>> On Wed, Mar 31, 2010 at 5:33 PM, raghavendra t
>> <raagavendra.rao@gmail.com>  wrote:
>>>>> Why are you doing that?
>>> Our table face lot of updates and deletes in a day, so we prefer reindex to
>>> update the indexes as well overcome with a corrupted index.
>>>
>>
>> do you have a corrupted index? if not, there is nothing to do...
>> REINDEX is not a mantenance task on postgres
>
> Actually, if your free_space_map (pre 8.4) isn't up to keeping track of
> bloat, or autovac isn't running enough, you're likely to get bloat of
> indexes as well as tables that may need VACUUM FULL + REINDEX to
> properly clean up.
>
> It's probably better to fix your fsm/autovac settings then CLUSTER the
> table so it doesn't happen again, though.
>
> --
> Craig Ringer
>
So am I to understand I don't need to do daily reindexing as a maintenance
measure with 8.3.7 on FreeBSD.

--
Stephen Clark
NetWolves
Sr. Software Engineer III
Phone: 813-579-3200
Fax: 813-882-0209
Email: steve.clark@netwolves.com
www.netwolves.com

Re: How to fast the REINDEX

From
Greg Smith
Date:
raghavendra t wrote:
> 1. What are the parameters will effect, when issuing the REINDEX command
> 2. Best possible way to increase the spead of the REINDEX

If you haven't done the usual general tuning on your server, that might
help.  http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server is
an introduction.  If increasing maintainance_work_mem alone doesn't
help, I'd try increases to checkpoint_segments and then shared_buffers
next.  Those are the three parameters mostly likely to speed that up.

The things already suggested in this thread are still valid though.
Needing to REINDEX suggests there may be a problem with your database
better addressed by running autovacuum more regularly.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: How to fast the REINDEX

From
"Pierre C"
Date:
> So am I to understand I don't need to do daily reindexing as a
> maintenance measure with 8.3.7 on FreeBSD.

Sometimes it's better to have indexes with some space in them so every
insert doesn't hit a full index page and triggers a page split to make
some space.
Of course if the index is 90% holes you got a problem ;)



Re: How to fast the REINDEX

From
raghavendra t
Date:

Hi All,
 
System Config
---------------------
CPU - Intel® Xenon® CPU
CPU Speed - 3.16 GHz
Server Model - Sun Fire X4150
RAM-Size - 16GB
Steve:
So am I to understand I don't need to do daily reindexing as a maintenance measure with 8.3.7 on FreeBSD.
 
My question is something like Steve's, why we should not do reindexing as our maintenance task. I was doing reindex only to get
a best fit and not fall short of 90% hole, bcoz my table has lot of updates and deletes. We also has the weekly maintance of VACUUM, but still reindex takes lot of time.
 
Present Paramters settings
----------------------------------------
maintainence_work_mem - 1GB
Checkpoint_segment and Wal_buffers are default values.
 
 
Kevin, Pierre, Greg, Steve, Hannu, Jorge -----   Thank you for your wonderfull support and giving me the correct picture on REINDEX on this thread. I appoligies if i couldnt have shared the proper information in resolving my issue. Is the above information provided by me will help out in tuning better.
 
Regards
Raghavendra
 
 
 
 
On Thu, Apr 1, 2010 at 5:55 PM, Pierre C <lists@peufeu.com> wrote:

So am I to understand I don't need to do daily reindexing as a maintenance measure with 8.3.7 on FreeBSD.

Sometimes it's better to have indexes with some space in them so every insert doesn't hit a full index page and triggers a page split to make some space.
Of course if the index is 90% holes you got a problem ;)



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)

To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: How to fast the REINDEX

From
Brad Nicholson
Date:
On Thu, 2010-04-01 at 19:17 +0530, raghavendra t wrote:
>
> Hi All,
>
> System Config
> ---------------------
> CPU - Intel® Xenon® CPU
> CPU Speed - 3.16 GHz
> Server Model - Sun Fire X4150
> RAM-Size - 16GB
>
>         Steve:
>         So am I to understand I don't need to do daily reindexing as a
>         maintenance measure with 8.3.7 on FreeBSD.
>
> My question is something like Steve's, why we should not do reindexing
> as our maintenance task. I was doing reindex only to get
> a best fit and not fall short of 90% hole, bcoz my table has lot of
> updates and deletes. We also has the weekly maintance of VACUUM, but
> still reindex takes lot of time.

This is your problem.  You should enable autovaccuum, let the vacuums
happen more frequently, and this problem will go away.  You will still
have to fix the underlying bloat a last time though.


--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.



Re: How to fast the REINDEX

From
"Kevin Grittner"
Date:
raghavendra t <raagavendra.rao@gmail.com> wrote:

> System Config
> ---------------------
> CPU - Intel* Xenon* CPU
> CPU Speed - 3.16 GHz
> Server Model - Sun Fire X4150
> RAM-Size - 16GB

The disk system matters a lot, too.  How many drives do you have in
what RAID configuration(s)?

> My question is something like Steve's, why we should not do
> reindexing as our maintenance task.

If your VACUUM policy is good, the REINDEX should not be necessary.
A good VACUUM policy, in my experience, usually involves setting it
to VACUUM any table in which 20% or more of the rows have changed
(with autovacuum_vacuum_threshold set pretty low).  Cut those about
in half for the autovacuum ANALYZE trigger point.  You may need to
use cost limits to avoid a hit on the production workload when
autovacuum kicks in.  If you do need that, I've found a 10ms naptime
is adequate for us.  Then, try running VACUUM ANALYZE VERBOSE
*nightly* (again, with cost limits if needed to avoid impact on
other processes).  Capture the output, as it can be used to find
where you have bloat problems. Monitor the last few lines to make
sure your fsm (free space manager) settings are high enough -- it'll
tell you if they're not.

If you do this, you should be able to stop running REINDEX without
any performance hit.  There will be some dead space in the indexes,
but this will likely help with performance of UPDATE and INSERT, as
page splits will happen less frequently, and PostgreSQL won't have
to constantly be asking the OS for more disk space.

> I was doing reindex only to get a best fit and not fall short of
> 90% hole, bcoz my table has lot of updates and deletes. We also
> has the weekly maintance of VACUUM, but still reindex takes lot of
> time.

VACUUM won't help a lot with REINDEX time, since REINDEX has to read
the entire table once per index and build everything up from scratch
every time.  If you VACUUM often enough, it is kept in good shape as
you go.

> Present Paramters settings
> ----------------------------------------
> maintainence_work_mem - 1GB
> Checkpoint_segment and Wal_buffers are default values.

You will probably benefit from increasing those last two.  Is
everything else at the default?  There are a few others which almost
always need to be tuned to your run time environment.  The defaults
are designed to allow the server to start and run even on a very
small desktop machine, so that someone's first "test drive" isn't
marred by problems.  When you gear up for production use, you
normally need to tune it.

> Kevin, Pierre, Greg, Steve, Hannu, Jorge -----   Thank you for
> your wonderfull support and giving me the correct picture on
> REINDEX on this thread. I appoligies if i couldnt have shared the
> proper information in resolving my issue. Is the above information
> provided by me will help out in tuning better.

I'm starting to get a better picture of the environment.  I really
think that if you modify your VACUUM policy you can drop the REINDEX
and be in much better shape overall.  If you provide information on
your disk subsystem, show us what your postgresql.conf file looks
like (with all comments stripped out), and give us a general idea of
the workload, we might be able to suggest some tuning that will help
you overall.  And you might think about when and how to upgrade --
autovacuum is something which has been getting better with major
releases, and performance in general has been improving markedly.

I hope this helps.

-Kevin

Re: How to fast the REINDEX

From
Robert Haas
Date:
On Thu, Apr 1, 2010 at 9:47 AM, raghavendra t <raagavendra.rao@gmail.com> wrote:
> and deletes. We also has the weekly maintance of VACUUM, but still reindex
> takes lot of time.

If you only VACUUM once a week, *everything* is going to take a lot of time.

...Robert

Re: How to fast the REINDEX

From
Hannu Krosing
Date:
On Thu, 2010-04-01 at 04:27 +0530, raghavendra t wrote:
>         I'm sorry I couldn't come up with more, but what you've
>         provided so
>         far is roughly equivalent to me telling you that it takes over
>         four
>         hours to travel to see my Uncle Jim, and then asking you how I
>         can
>         find out how he's doing in less time than that.  There's just
>         not
>         much to go on.  :-(
>
>         If you proceed with the course suggested in the URL I
>         referenced,
>         people on the list have a chance to be more helpful to you.
> Instead of looking into the priority of the question or where it has
> to be posted, it would be appreciated to keep a discussion to the
> point mentioned.  Truely this question belong to some other place as
> you have mentioned in the URL. But answer for Q1 might be expected
> alteast.

Ok, here is my answer to your Q1:

Q1. What are the parameters will effect, when issuing the REINDEX
command

A: Assuming you meant what parameters affect performance of REINDEX
command.

Most parameters that affect general performance affect also REINDEX
command.

Some that affect more are:

* amount of RAM in your server - the most important thing

* speed of disk subsystem - next most important in case not all of
active data fits in memory

Tunables

*  maintenance_work_mem - affects how much of sorting can be done in
memory, if you can afford to have maintenance_work_mem > largest index
size then sorting for index creation can be done in RAM only and is
significantly faster than when doing tape sort with intermediate files
on disks.

* wal_buffers - the bigger the better here, but competes with how big
you can make maintenance_work_mem . If more of heap and created indexes
can be kept in shared memory, everything runs faster.

* checkpoint_segments - affects how often whole wal_buffers is synced to
disk, if done too often then wastes lot of disk bandwidth for no good
reason.

* other chekpoint_* - tune to avoid excessive checkpointing.

> Hope i could get the information from the other Thread in other
> catagory.

Nah, actually [PERFORM] is the right place to ask.

Just most people got the impression that you may be doing unnecessary
REINDEXing, and the best way to speed up unneeded things is not to do
them ;)

> Thank you
>
> Regards
> Raghavendra



--
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
   Services, Consulting and Training