Re: MaxOffsetNumber for Table AMs - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: MaxOffsetNumber for Table AMs
Msg-id CAH2-Wz=9EGgNe5bzAY9i3AkCWYKMTBj5KaVf2zsVyQcnhCJa3g@mail.gmail.com
In response to Re: MaxOffsetNumber for Table AMs  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: MaxOffsetNumber for Table AMs
List pgsql-hackers
On Mon, May 3, 2021 at 5:15 PM Jeff Davis <pgsql@j-davis.com> wrote:
> I don't see why in-core changes are a strict requirement. It doesn't
> make too much difference if a lossy TID doesn't correspond exactly to
> the columnar layout -- it should be fine as long as there's locality,
> right?

But look at the details: tidbitmap.c uses MaxHeapTuplesPerPage as its
MAX_TUPLES_PER_PAGE, which seems like a problem -- that's 291 with
default BLCKSZ. I doubt that that restriction is something that you
can afford to live with, even just for the time being.
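For reference, the 291 figure can be reproduced with a small standalone sketch. It does not include any PostgreSQL headers; the sizes below are assumed values for a default 64-bit build (24-byte page header, 23-byte heap tuple header, 4-byte line pointer, 8-byte maximum alignment), and every sketch_ name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for a default 64-bit build; not read from PostgreSQL headers. */
#define SKETCH_PAGE_HEADER_SIZE   24  /* stands in for SizeOfPageHeaderData */
#define SKETCH_TUPLE_HEADER_SIZE  23  /* stands in for SizeofHeapTupleHeader */
#define SKETCH_ITEM_ID_SIZE       4   /* stands in for sizeof(ItemIdData) */
#define SKETCH_MAXIMUM_ALIGNOF    8   /* assumed maximum alignment */

/* Round len up to the next multiple of the assumed maximum alignment. */
static size_t
sketch_maxalign(size_t len)
{
    return (len + SKETCH_MAXIMUM_ALIGNOF - 1) &
        ~((size_t) SKETCH_MAXIMUM_ALIGNOF - 1);
}

/*
 * Upper bound on heap tuples per page: each tuple costs at least an
 * aligned tuple header plus one line pointer.
 */
static size_t
sketch_max_heap_tuples_per_page(size_t blcksz)
{
    size_t per_tuple = sketch_maxalign(SKETCH_TUPLE_HEADER_SIZE) +
        SKETCH_ITEM_ID_SIZE;

    assert(blcksz > SKETCH_PAGE_HEADER_SIZE);
    return (blcksz - SKETCH_PAGE_HEADER_SIZE) / per_tuple;
}
```

With an 8192-byte block this gives (8192 - 24) / (24 + 4) = 291, which is far below what a table AM that packs many logical rows into one block number would want.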

> > It seems senseless to *require* table AMs to support something like a
> > bitmap scan.
>
> I am not yet convinced that it's "senseless", but it is optional and
> there's probably a reason that it's not required.

I mean it's senseless to require it in the general case.

> We still need to address the fact that two features have had a minor
> collision: indexes on a partitioned table and table AMs that don't
> necessarily support all index types. It's not good to just throw an
> error, because we could be forcing the user to manually manage the
> indexes on hundreds of partitions just because some tables have a
> different AM and it doesn't support the index type.

I don't see why that's necessarily a problem. Why, in general, should
every table AM be able to support every index AM?

I find it puzzling that nobody can find one single thing that the
table AM interface *can't* do. What are the chances that the
abstraction really is perfect?

> > I don't think it's a coincidence that GIN is the index AM
> > that looks like it presents at least 2 problems for the columnar
> > table
> > AM. To me this suggests that this will need a much higher level
> > discussion.
>
> One problem is that ginpostinglist.c restricts the use of offset
> numbers higher than MaxOffsetNumber - 1. At best, that's a confusing
> and unnecessary off-by-one error that we happen to be stuck with
> because it affects the on-disk format. Now that I'm past that
> particular confusion, I can live with a workaround until we do
> something better.
>
> What is the other problem with GIN?

I just meant the tidbitmap.c stuff, and so on. There is really one big
problem: GIN leverages, in many different ways, the fact that bitmap
scans are all that it supports. The reality is that it was designed
to work with heapam -- that's how it evolved. It seems rather unlikely
that problems are confined to this ginpostinglist.c representational
issue -- which is very surface-level. The only way to figure it out is
to try to make it work and see what happens, though, so perhaps it
isn't worth discussing any further until that happens.
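To make the representational issue concrete, here is a rough sketch of how a posting-list encoding can end up excluding the highest offset number. It assumes the offset is packed into the low 11 bits of a 64-bit integer and that MaxOffsetNumber is BLCKSZ / sizeof(ItemIdData) = 2048 with default settings. The sketch_ names are hypothetical and this is not the ginpostinglist.c code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 11 bits are reserved for the offset number. */
#define SKETCH_OFFSET_BITS        11
/* Assumption: BLCKSZ / sizeof(ItemIdData) with an 8192-byte block. */
#define SKETCH_MAX_OFFSET_NUMBER  (8192 / 4)

/* An offset is representable only if it fits in the reserved bits. */
static int
sketch_offset_fits(uint16_t offset)
{
    return offset < (1u << SKETCH_OFFSET_BITS);
}

/* Pack (block, offset) into one integer: block in the high bits. */
static uint64_t
sketch_pack_tid(uint32_t block, uint16_t offset)
{
    assert(sketch_offset_fits(offset));
    return ((uint64_t) block << SKETCH_OFFSET_BITS) | offset;
}

/* Recover the offset number from a packed value. */
static uint16_t
sketch_unpack_offset(uint64_t val)
{
    return (uint16_t) (val & ((1u << SKETCH_OFFSET_BITS) - 1));
}

/* Recover the block number from a packed value. */
static uint32_t
sketch_unpack_block(uint64_t val)
{
    return (uint32_t) (val >> SKETCH_OFFSET_BITS);
}
```

Under these assumptions the largest offset that fits is 2047, one less than the assumed MaxOffsetNumber of 2048, which is the off-by-one Jeff describes. It says nothing about the deeper heapam assumptions elsewhere in GIN.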

--
Peter Geoghegan


