RE: pgsql is 75 times faster with my new index scan - Mailing list pgsql-hackers
From | Mikheev, Vadim |
---|---|
Subject | RE: pgsql is 75 times faster with my new index scan |
Date | |
Msg-id | 8F4C99C66D04D4118F580090272A7A23018D0E@SECTORBASE1 Whole thread Raw |
In response to | pgsql is 75 times faster with my new index scan (devik@cdi.cz) |
List | pgsql-hackers |
Why not implement *true* CLUSTER? With cluster, all heap tuples will be in cluster index. Vadim > -----Original Message----- > From: devik@cdi.cz [mailto:devik@cdi.cz] > Sent: Tuesday, September 26, 2000 2:15 AM > To: pgsql-hackers@postgresql.org > Subject: [HACKERS] pgsql is 75 times faster with my new index scan > > > Hello, > I recently spoke about extending index scan to be able > to take data directly from index pages. I wanted to know > whether should I spend my time and implement it. > So that I hacked last pgsql a bit to use proposed scan > mode and did some measurements (see bellow). Measurements > was done on (id int,txt varchar(20)) table with 1 000 000 rows > with btree index on both attrs. Query involved was: > select id,count(txt) from big group by id; > Duplicates distribution on id column was 1:1000. I was run > query twice after linux restart to ensure proper cache > utilization (on disk heap & index was 90MB in total). > So I think that by implementing this scan mode we can expect > to gain huge speedup in all queries which uses indices and > can found all data in their pages. > > Problems: > my changes implemented only indexscan and new cost function. > it doesn't work when index pages contains tuples which doesn't > belong to our transaction. test was done after vacuum and > only one tx running. > > TODO: > - add HeapTupleHeaderData into each IndexTupleData > - change code to reflect above > - when deleting-updating heap then also update tuples' > HeapTupleHeaderData in indices > > The last step could be done in two ways. First by limiting > number of indices for one table we can store coresponding > indices' TIDs in each heap tuple. The update is then simple > taking one disk write. > Or do it in standart way - lookup appropriate index tuple > by traversing index. It will cost us more disk accesses. > > Is someone interested in this ?? > regards devik > > With current indexscan: > ! system usage stats: > ! 1812.534505 elapsed 93.060547 user 149.447266 system sec > ! [93.118164 user 149.474609 sys total] > ! 0/0 [0/0] filesystem blocks in/out > ! 130978/32 [131603/297] page faults/reclaims, 132 [132] swaps > ! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent > ! 0/0 [0/0] voluntary/involuntary context switches > ! postgres usage stats: > ! Shared blocks: 555587 read, 551155 written, buffer hit > rate = 44.68% > ! Local blocks: 0 read, 0 written, buffer hit > rate = 0.00% > ! Direct blocks: 0 read, 0 written > > With improved indexscan: > ! system usage stats: > ! 23.686788 elapsed 22.157227 user 0.372071 system sec > ! [22.193359 user 0.385742 sys total] > ! 0/0 [0/0] filesystem blocks in/out > ! 1186/42 [1467/266] page faults/reclaims, 0 [0] swaps > ! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent > ! 0/0 [0/0] voluntary/involuntary context switches > ! postgres usage stats: > ! Shared blocks: 4385 read, 0 written, buffer hit > rate = 4.32% > ! Local blocks: 0 read, 0 written, buffer hit > rate = 0.00% > ! Direct blocks: 0 read, 0 written > >
pgsql-hackers by date: