Re: Minmax indexes - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Minmax indexes
Date
Msg-id 20140812225244.GJ5728@eldon.alvh.no-ip.org
Whole thread Raw
In response to Re: Minmax indexes  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: Minmax indexes
List pgsql-hackers
Heikki Linnakangas wrote:
> I couldn't resist starting to hack on this, and implemented the
> scheme I've been having in mind:
>
> 1. MMTuple contains the block number of the heap page (range) that
> the tuple represents. Vacuum is no longer needed to clean up old
> tuples; when an index tuples is updated, the old tuple is deleted
> atomically with the insertion of a new tuple and updating the
> revmap, so no garbage is left behind.
>
> 2. LockTuple is gone. When following the pointer from revmap to
> MMTuple, the block number is used to check that you land on the
> right tuple. If not, the search is started over, looking at the
> revmap again.

Thanks, looks good, yeah.  Did you just forget to attach the
access/rmgrdesc/minmaxdesc.c file, or did you ignore it altogether?
Anyway I hacked one up, and cleaned up some other things.

> I'm sure this still needs some cleanup, but here's the patch, based
> on your v14. Now that I know what this approach looks like, I still
> like it much better. The insert and update code is somewhat more
> complicated, because you have to be careful to lock the old page,
> new page, and revmap page in the right order. But it's not too bad,
> and it gets rid of all the complexity in vacuum.

It seems there is some issue here, because pageinspect tells me the
index is not growing properly for some reason.  minmax_revmap_data gives
me this array of TIDs after a bunch of insert/vacuum/delete/ etc:


"(2,1)","(2,2)","(2,3)","(2,4)","(2,5)","(4,1)","(5,1)","(6,1)","(7,1)","(8,1)","(9,1)","(10,1)","(11,1)","(12,1)","(13,1)","(14,1)","(15,1)","(16,1)","(17,1)","(18,1)","(19,1)","(20,1)","(21,1)","(22,1)","(23,1)","(24,1)","(25,1)","(26,1)","(27,1)","(28,1)","(29,1)","(30,1)","(31,1)","(32,1)","(33,1)","(34,1)","(35,1)","(36,1)","(37,1)","(38,1)","(39,1)","(40,1)","(41,1)","(42,1)","(43,1)","(44,1)","(45,1)","(46,1)","(47,1)","(48,1)","(49,1)","(50,1)","(51,1)","(52,1)","(53,1)","(54,1)","(55,1)","(56,1)","(57,1)","(58,1)","(59,1)","(60,1)","(61,1)","(62,1)","(63,1)","(64,1)","(65,1)","(66,1)","(67,1)","(68,1)","(69,1)","(70,1)","(71,1)","(72,1)","(73,1)","(74,1)","(75,1)","(76,1)","(77,1)","(78,1)","(79,1)","(80,1)","(81,1)","(82,1)","(83,1)","(84,1)","(85,1)","(86,1)","(87,1)","(88,1)","(89,1)","(90,1)","(91,1)","(92,1)","(93,1)","(94,1)","(95,1)","(96,1)","(97,1)","(98,1)","(99,1)","(100,1)","(101,1)","(102,1)","(103,1)","(104,1)","(105,1)","(106,1)","(107,1)","(108,1)","(109,1)","(110,1)","(111,1)","(112,1)","(113,1)","(114,1)","(115,1)","(116,1)","(117,1)","(118,1)","(119,1)","(120,1)","(121,1)","(122,1)","(123,1)","(124,1)","(125,1)","(126,1)","(127,1)","(128,1)","(129,1)","(130,1)","(131,1)","(132,1)","(133,1)","(134,1)"

There are some who would think that getting one item per page is
suboptimal.  (Maybe it's just a missing FSM update somewhere.)


I've been hacking away a bit more at this; will post updated patch
probably tomorrow (was about to post but just found a memory stomp in
pageinspect.)

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: WAL format and API changes (9.5)
Next
From: Stephen Frost
Date:
Subject: Re: replication commands and log_statements