Re: [HACKERS] Index creation takes for ever - Mailing list pgsql-patches
From | Bruce Momjian |
---|---|
Subject | Re: [HACKERS] Index creation takes for ever |
Date | |
Msg-id | 200312010450.hB14ovH16330@candle.pha.pa.us Whole thread Raw |
In response to | Re: [HACKERS] Index creation takes for ever (Manfred Koizar <mkoi-pg@aon.at>) |
Responses |
Re: [HACKERS] Index creation takes for ever
|
List | pgsql-patches |
Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches I will try to apply it within the next 48 hours. --------------------------------------------------------------------------- Manfred Koizar wrote: > On Mon, 01 Sep 2003 08:46:09 -0400, Tom Lane <tgl@sss.pgh.pa.us> > wrote: > >ohp@pyrenet.fr writes: > >> it took 69 minutes to finish, 75% of this time was devoted to create 2 > >> indexes on varchar(2) with value being 'O', 'N' or null; > > > >I still say it's either strcoll or qsort's fault. > > If qsort is to blame, then maybe this patch could help. It sorts > equal key values on item pointer. And if it doesn't help index > creation speed, at least the resulting index has better correlation. > > Test script: > CREATE TABLE t (i int NOT NULL, t text NOT NULL); > INSERT INTO t VALUES (1, 'lajshdflasjhdflajhsdfljhasdlfjhasdf'); > INSERT INTO t SELECT * FROM t; > INSERT INTO t SELECT * FROM t; > INSERT INTO t VALUES (100, 's,dmfa.,smdn.famsndfamdnsbfmansdbf'); > INSERT INTO t SELECT * FROM t; > INSERT INTO t SELECT * FROM t; > INSERT INTO t SELECT * FROM t; > INSERT INTO t SELECT * FROM t; > INSERT INTO t SELECT * FROM t; > INSERT INTO t SELECT * FROM t; > INSERT INTO t SELECT * FROM t; > INSERT INTO t SELECT * FROM t; > INSERT INTO t SELECT * FROM t; > INSERT INTO t SELECT * FROM t; > INSERT INTO t SELECT * FROM t; > INSERT INTO t SELECT * FROM t; > INSERT INTO t SELECT * FROM t; > INSERT INTO t SELECT * FROM t; > ANALYZE t; > CREATE INDEX t_i ON t(i); > SET enable_seqscan = 0; > SELECT ctid FROM t WHERE i=100 LIMIT 10; > > Result without patch: > ctid > ---------- > (153,14) > (306,23) > (305,80) > (152,91) > (76,68) > (38,34) > (153,34) > (305,50) > (9,62) > (305,40) > (10 rows) > > Result with patch: > ctid > -------- > (0,5) > (0,10) > (0,15) > (0,20) > (0,25) > (0,30) > (0,35) > (0,40) > (0,45) > (0,50) > (10 rows) > > For testing purposes I have made a second patch that provides a > boolean GUC variable sort_index. It is available here: > http://www.pivot.at/pg/23.test-IdxTupleSort.diff > > Servus > Manfred > diff -ruN ../base/src/backend/utils/sort/tuplesort.c src/backend/utils/sort/tuplesort.c > --- ../base/src/backend/utils/sort/tuplesort.c 2003-08-17 21:58:06.000000000 +0200 > +++ src/backend/utils/sort/tuplesort.c 2003-09-05 10:04:22.000000000 +0200 > @@ -2071,6 +2071,33 @@ > (errcode(ERRCODE_UNIQUE_VIOLATION), > errmsg("could not create unique index"), > errdetail("Table contains duplicated values."))); > + else > + { > + /* > + * If key values are equal, we sort on ItemPointer. This might help > + * for some bad qsort implementation having performance problems > + * with many equal items. OTOH I wouldn't trust such a weak qsort > + * to handle pre-sorted sequences very well ... > + * > + * Anyway, this code doesn't hurt much, and it helps produce indices > + * with better index correlation which is a good thing per se. > + */ > + ItemPointer tid1 = &tuple1->t_tid; > + ItemPointer tid2 = &tuple2->t_tid; > + BlockNumber blk1 = ItemPointerGetBlockNumber(tid1); > + BlockNumber blk2 = ItemPointerGetBlockNumber(tid2); > + > + if (blk1 != blk2) > + return (blk1 < blk2) ? -1 : 1; > + else > + { > + OffsetNumber pos1 = ItemPointerGetOffsetNumber(tid1); > + OffsetNumber pos2 = ItemPointerGetOffsetNumber(tid2); > + > + if (pos1 != pos2) > + return (pos1 < pos2) ? -1 : 1; > + } > + } > > return 0; > } > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
pgsql-patches by date: