Thread: Re: [HACKERS] No heap lookups on index

Re: [HACKERS] No heap lookups on index

From: Glen Parker
Tom Lane wrote:
>>What ever happened to grouped heap reads, i.e. building a list of tuples
>>from the index, sorting in heap order, then reading the heap in a batch?
>
>
> Done in 8.1.  I'm uncertain whether Scott knows about that ...

That's GREAT news!  Is that the "Bitmap Scan" item in the what's new
list (http://www.postgresql.org/docs/whatsnew)?  I didn't even notice it
when I read it the first time.  I'm really looking forward to our
upcoming 8.1 upgrade.

-Glen
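
The grouped heap read described in the quoted question (collect tuple
pointers from the index, sort them in heap order, read the heap in one
pass) can be sketched roughly as below. This is only an illustration of
the idea as worded in the question, not PostgreSQL's code: the 8.1 bitmap
scan builds an in-memory bitmap of heap pages rather than sorting a list.
TupleId and sort_into_heap_order are hypothetical names made up for this
sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tuple pointer: heap block number plus slot within the block. */
typedef struct {
    unsigned int   block;
    unsigned short offset;
} TupleId;

/* Order by block first, then by offset within the block. */
static int tid_cmp(const void *a, const void *b)
{
    const TupleId *x = a;
    const TupleId *y = b;

    if (x->block != y->block)
        return x->block < y->block ? -1 : 1;
    if (x->offset != y->offset)
        return x->offset < y->offset ? -1 : 1;
    return 0;
}

/*
 * Sort tuple pointers gathered from an index into heap order and return
 * how many distinct heap blocks a batched read would have to visit.
 */
size_t sort_into_heap_order(TupleId *tids, size_t n)
{
    assert(tids != NULL || n == 0);
    if (n == 0)
        return 0;

    qsort(tids, n, sizeof(TupleId), tid_cmp);

    size_t blocks = 1;
    for (size_t i = 1; i < n; i++)
        if (tids[i].block != tids[i - 1].block)
            blocks++;
    return blocks;
}
```

After the sort, every tuple on a given block is adjacent in the array, so
each block needs to be fetched only once and the fetches go in ascending
block order.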

Re: [HACKERS] No heap lookups on index

From: Bruce Momjian
Glen Parker wrote:
> Tom Lane wrote:
> >>What ever happened to grouped heap reads, i.e. building a list of tuples
> >>from the index, sorting in heap order, then reading the heap in a batch?
> >
> >
> > Done in 8.1.  I'm uncertain whether Scott knows about that ...
>
> That's GREAT news!  Is that the "Bitmap Scan" item in the what's new
> list (http://www.postgresql.org/docs/whatsnew)?  I didn't even notice it

Yes.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: [HACKERS] No heap lookups on index

From: "Jim C. Nasby"
On Wed, Jan 18, 2006 at 10:11:26PM -0500, Bruce Momjian wrote:
> Glen Parker wrote:
> > Tom Lane wrote:
> > >>What ever happened to grouped heap reads, i.e. building a list of tuples
> > >>from the index, sorting in heap order, then reading the heap in a batch?
> > >
> > >
> > > Done in 8.1.  I'm uncertain whether Scott knows about that ...
> >
> > That's GREAT news!  Is that the "Bitmap Scan" item in the what's new
> > list (http://www.postgresql.org/docs/whatsnew)?  I didn't even notice it
>
> Yes.

But note that some recent testing indicated that even if you read a file
in sequential order, just skipping over random sections, once you are
reading ~5% of the file you might as well just read the entire thing, so
the amount this helps may be questionable. That testing came up in the
thread about using block sampling instead of row sampling for ANALYZE.
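
For readers without the attachment, the access pattern being tested
(ascending order, skipping random sections) might look something like the
sketch below. This is not the seqtest.c that was posted; read_skipping,
the 8192-byte BLOCK_SIZE and the per-block coin flip are all assumptions
made for illustration, and it does no timing of its own.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192   /* assumed block size, chosen for this sketch */

/*
 * Read roughly `fraction` of the blocks of `path` in ascending order,
 * skipping the rest. Returns the number of blocks read, or -1 on error.
 */
long read_skipping(const char *path, double fraction, unsigned int seed)
{
    assert(fraction >= 0.0 && fraction <= 1.0);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }

    long nblocks = (long) (st.st_size / BLOCK_SIZE);
    long nread = 0;
    char buf[BLOCK_SIZE];

    srand(seed);
    for (long blk = 0; blk < nblocks; blk++) {
        /* Pick each block independently with probability `fraction`. */
        if ((double) rand() / ((double) RAND_MAX + 1.0) >= fraction)
            continue;   /* skipped: no I/O is issued for this block */

        if (pread(fd, buf, BLOCK_SIZE, (off_t) blk * BLOCK_SIZE) != BLOCK_SIZE) {
            close(fd);
            return -1;
        }
        nread++;
    }

    close(fd);
    return nread;
}
```

Timing calls at several fractions against a full read of the same file,
with the OS cache cold each time, would be one way to look for the
crossover point.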

I suspect the issue is that rotational delay is becoming just as
'damaging' as track-to-track seek delay. If that's true, the only way to
improve things would be to order reads taking both track seek time and
rotational position into account. Theoretically the drive could do this,
though I don't know if any actually do.

If my guess is correct then random reads may not be that much more
expensive than a sequential read that skips large chunks of the file.
This is because most files will cover a fairly small number of tracks,
so head positioning time will be minimal compared to rotational delay.
It would be interesting to modify the test code that was posted (see
attached) so that it read randomly instead of just skipping random
amounts.
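
A rough sketch of that modification, assuming the same per-block
selection but with the chosen blocks visited in shuffled order instead of
ascending order. Again, this is not based on the posted seqtest.c;
read_random_order, shuffle_blocks and BLOCK_SIZE are hypothetical names
for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192   /* assumed block size, chosen for this sketch */

/* Fisher-Yates shuffle; rand() % k is slightly biased, fine for a sketch. */
void shuffle_blocks(long *blocks, long n)
{
    for (long i = n - 1; i > 0; i--) {
        long j = rand() % (i + 1);
        long tmp = blocks[i];
        blocks[i] = blocks[j];
        blocks[j] = tmp;
    }
}

/*
 * Pick roughly `fraction` of the blocks of `path`, then read them in
 * random order. Returns the number of blocks read, or -1 on error.
 */
long read_random_order(const char *path, double fraction, unsigned int seed)
{
    assert(fraction >= 0.0 && fraction <= 1.0);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }

    long nblocks = (long) (st.st_size / BLOCK_SIZE);
    long *chosen = malloc((size_t) (nblocks > 0 ? nblocks : 1) * sizeof(long));
    if (chosen == NULL) {
        close(fd);
        return -1;
    }

    long nchosen = 0;
    srand(seed);
    for (long blk = 0; blk < nblocks; blk++)
        if ((double) rand() / ((double) RAND_MAX + 1.0) < fraction)
            chosen[nchosen++] = blk;

    shuffle_blocks(chosen, nchosen);   /* the only change from skip-reading */

    char buf[BLOCK_SIZE];
    long nread = 0;
    for (long i = 0; i < nchosen; i++) {
        if (pread(fd, buf, BLOCK_SIZE, (off_t) chosen[i] * BLOCK_SIZE) != BLOCK_SIZE) {
            nread = -1;
            break;
        }
        nread++;
    }

    free(chosen);
    close(fd);
    return nread;
}
```

With the same seed and fraction, the set of blocks touched matches what an
ascending pass using the same selection rule would touch, so any timing
difference between the two would come from visit order alone.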

Just for grins, I ran seqtest.c a number of times, using various
percentages and file sizes. Results also attached...
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Attachment