Thread: Index creation time and distribution
Hi -performance, I experienced this morning a performance problem when we imported a dump in a 8.1 database. The table is 5 millions rows large and when the dump creates an index on a specific text column called clazz it takes 27 minutes while on the other columns, it only takes a couple of seconds: LOG: duration: 1636301.317 ms statement: CREATE INDEX index_journal_clazz ON journal USING btree (clazz); LOG: duration: 20613.009 ms statement: CREATE INDEX index_journal_date ON journal USING btree (date); LOG: duration: 10653.290 ms statement: CREATE INDEX index_journal_modifieur ON journal USING btree (modifieur); LOG: duration: 15031.579 ms statement: CREATE INDEX index_journal_objectid ON journal USING btree (objectid); The only weird thing about this column is that 4.7 millions of rows have the exact same value. A partial index excluding this value is really fast to create but, as the database is used via JDBC and prepared statements, this index is totally useless (the plan is created before the BIND so it can't use the partial index). FWIW we can't use ?protocolVersion=2 with this application so it's not an option. As part of the deployment process of this application, we often need to drop/create/restore the database and 25 minutes is really longer than we can afford. So my questions are: - is the index creation time so correlated with the distribution? I was quite surprised by this behaviour. The time is essentially CPU time. - if not, what can I check to diagnose this problem? - IIRC, 8.3 could allow me to use the partial index as the query should be planned after the BIND (plans are unnamed). Am I right? Thanks for any input. -- Guillaume
"Guillaume Smet" <guillaume.smet@gmail.com> writes: > I experienced this morning a performance problem when we imported a > dump in a 8.1 database. > The table is 5 millions rows large and when the dump creates an index > on a specific text column called clazz it takes 27 minutes while on > the other columns, it only takes a couple of seconds: > The only weird thing about this column is that 4.7 millions of rows > have the exact same value. Do you have maintenance_work_mem set large enough that the index creation sort is done in-memory? 8.1 depends on the platform's qsort and a lot of them are kinda pessimal for input like this. 8.2 (which uses our own qsort) seems to perform better in a quick test. regards, tom lane
On Thu, May 22, 2008 at 3:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Do you have maintenance_work_mem set large enough that the index > creation sort is done in-memory? 8.1 depends on the platform's qsort > and a lot of them are kinda pessimal for input like this. FWIW, it's a 32 bits CentOS 4.6 box. maintenance_work_mem is set to 256 MB and the size of the index is 400 MB. Should I try to raise it up to 512 MB? The server only has 2GB of RAM so it seems a bit high. > 8.2 (which uses our own qsort) seems to perform better in a quick > test. Mmmmh OK. I was considering an upgrade to 8.3 in the next months anyway. Do we agree that in the case of unnamed prepared statement, 8.3 plans the query after the BIND? The partial index seems to be a better solution anyway, considering that it's 12 MB vs 400 MB. Thanks. -- Guillaume
On Thu, 22 May 2008, Tom Lane wrote: > Do you have maintenance_work_mem set large enough that the index > creation sort is done in-memory? 8.1 depends on the platform's qsort > and a lot of them are kinda pessimal for input like this. Looking at the fact that other indexes on the same table are created quickly, it seems that the maintenance_work_mem isn't the issue - the sort algorithm is. Having lots of elements the same value is a worst-case-scenario for a naive quicksort. I am in the middle of testing sorting algorithms for a performance lecture I'm going to give, and one of the best algorithms I have seen yet is that used in Java's java.util.Arrays.sort(). I haven't been able to beat it with any other comparison sort yet (although I have beaten it with a bucket sort, but I wouldn't recommend such an algorithm for a database). From the JavaDoc: > The sorting algorithm is a tuned quicksort, adapted from Jon L. Bentley > and M. Douglas McIlroy's "Engineering a Sort Function", > Software-Practice and Experience, Vol. 23(11) P. 1249-1265 (November > 1993). This algorithm offers n*log(n) performance on many data sets that > cause other quicksorts to degrade to quadratic performance. Matthew -- First law of computing: Anything can go wro sig: Segmentation fault. core dumped.
On Thu, May 22, 2008 at 6:32 AM, Guillaume Smet <guillaume.smet@gmail.com> wrote: > Hi -performance, > > > LOG: duration: 1636301.317 ms statement: CREATE INDEX > index_journal_clazz ON journal USING btree (clazz); > LOG: duration: 20613.009 ms statement: CREATE INDEX > index_journal_date ON journal USING btree (date); Just curious, what happens if you create the date index first, then the clazz one?
On Thu, May 22, 2008 at 6:50 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote: > Just curious, what happens if you create the date index first, then > the clazz one? It's not due to any cache effect if it's your question. It's mostly CPU time and changing the order doesn't change the behaviour. I'll make some tests with 8.3 in a few weeks (I'll be out of town next week) to see if using PostgreSQL qsort reduces the problem. -- Guillaume
"Guillaume Smet" <guillaume.smet@gmail.com> writes: > On Thu, May 22, 2008 at 3:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Do you have maintenance_work_mem set large enough that the index >> creation sort is done in-memory? 8.1 depends on the platform's qsort >> and a lot of them are kinda pessimal for input like this. > maintenance_work_mem is set to 256 MB and the size of the index is 400 MB. > Should I try to raise it up to 512 MB? The server only has 2GB of RAM > so it seems a bit high. Hmm, that's most likely not going to be enough to get it to do an in-memory sort ... try turning on trace_sort to see. But anyway, if you are in the on-disk sort regime, 8.3 is only going to be marginally faster for such a case --- it's going to have to write all the index entries out and read 'em back in anyway. >> 8.2 (which uses our own qsort) seems to perform better in a quick >> test. > Mmmmh OK. I was considering an upgrade to 8.3 in the next months anyway. > Do we agree that in the case of unnamed prepared statement, 8.3 plans > the query after the BIND? The partial index seems to be a better > solution anyway, considering that it's 12 MB vs 400 MB. Ermm .. this is in fact mostly broken in 8.3.0 and 8.3.1. If you don't want to wait for 8.3.2, you need this patch: http://archives.postgresql.org/pgsql-committers/2008-03/msg00566.php regards, tom lane
On Thu, May 22, 2008 at 9:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Ermm .. this is in fact mostly broken in 8.3.0 and 8.3.1. If you don't > want to wait for 8.3.2, you need this patch: > http://archives.postgresql.org/pgsql-committers/2008-03/msg00566.php That's what I had in mind. We have to test a lot of things before even considering an upgrade so that's not really a problem for us to wait for 8.3.2. Thanks. -- Guillaume