Re: FILLFACTOR and increasing index - Mailing list pgsql-general

From Boszormenyi Zoltan
Subject Re: FILLFACTOR and increasing index
Date
Msg-id 4DF51021.2020607@cybertec.at
Whole thread Raw
In response to Re: FILLFACTOR and increasing index  (Tomas Vondra <tv@fuzzy.cz>)
Responses Re: FILLFACTOR and increasing index  (Tomas Vondra <tv@fuzzy.cz>)
List pgsql-general
Hi,

2011-05-12 00:28 keltezéssel, Tomas Vondra írta:
> Hi,
>
> I've studied the implementation of the btree indexes and how exactly the
> fillfactor is used, and in general
>
> - when a page split happens, the process needs to obtain more locks
>   than with simple insert, which may result in contention with other
>   processes that modify the index (the same pages)
>
> - the fillfactor is used only for the leaf pages, the rest of the index
>   does not use it (directly)
>
> So lowering the number of page splits might remove some contention when
> there's a lot of processes accessing the same pages.
>
> But that's the theory - I really was not able to come up with a test
> that benefits from lower fillfactor. Actually the lower the fillfactor,
> the larger the index, which may be a significant issue with huge indexes.

We recently had a testcase for exercising FILLFACTOR on indexes.
Several (15+) GB raw data arrives daily and must be imported into
the database for analytic purposes, the table is heavily partitioned
and each partition has 5 or 6 indexes. The importer is heavily threaded
and uses COPY to insert the data. This is strictly an INSERT-only scenario,
the analysis comes later. This is where FILLFACTOR=70 helped to
reduce the index contention.

With the default 90% (and 100% on the table itself) when a lot of data
arrived in burst that were in the same time interval so 4-8 threads tried
to push data into the same partition, individual data chunks (about 10-15MB
each, about 63500 pieces) were imported in about 1.5-2.5 minutes, with
a few spikes over 4 minutes. This was the result of a few days long
network outage, the data collectors sent their backlogs.

When all indexes were re-created with FILLFACTOR=70, the import times
went down considerably. Out of about 63000 raw data chunks, there were
only 6 or 8 where the import time fell in the 10-25 seconds range, the rest
of them were imported under 10 seconds and the majority of them (65%)
were imported under 3 seconds. The system has 24 cores, so we can use
8 of them dedicated to importing. This almost 1TB data was imported
in about 3.5-4 hours after the FILLFACTOR change. Without it, it didn't
finish in 2 days.

Best regards,
Zoltán Böszörményi

> So in your case, I'd probably go with the default fillfactor (90), and
> maybe I'd consider even using fillfactor=100, to make the index as dense
> as possible.
>
> Anyway I guess the impact of this will be negligible, compared to other
> parameters (shared buffers, work_mem, etc.).
>
> regards
> Tomas
>


--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/


pgsql-general by date:

Previous
From: Hanno Schlichting
Date:
Subject: Tweaking bytea / large object block sizes?
Next
From: Vincent Veyron
Date:
Subject: Re: Tweaking bytea / large object block sizes?