Re: tableam vs. TOAST - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: tableam vs. TOAST |
Date | |
Msg-id | CA+TgmoYWOQMo086PNZ0H50xya-YPZ+3BLq=2-39rHpvzWpNZDw@mail.gmail.com |
In response to | Re: tableam vs. TOAST (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>) |
List | pgsql-hackers |
On Mon, Nov 11, 2019 at 8:51 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> Compared to the previous patch (v7) where the API just had a "use this
> AM for TOAST" field and the other extreme of pushing TOAST entirely
> inside the heap AM, this seems like the worst of both worlds, with the
> maximum additional complexity.

There might be a misunderstanding here. These patches would still have
a "use this AM for TOAST" callback, just as the previous set did, but I
didn't include that here, because this is talking about a different
part of the problem. The purpose of that callback is to determine which
AM will be used to create the toast table. The purpose of these patches
is to be able to detoast a value given nothing but the TOAST pointer
extracted from the heap tuple, while removing the present assumption
that the TOAST table is a heap table.

(The current coding is actually seriously inconsistent, because right
now, the code that creates TOAST tables always uses the same AM as the
main heap; but the detoasting code only works with heap tables, which
means that no non-heap AM can use the TOAST system at all. If
necessary, we could commit the patch to allow the TOAST table AM to be
changed first, and then handle allowing the detoasting logic to cope
with a non-heap AM as a separate matter.)

> I don't think we need to nail down this API for eternity, so I'd be
> happy to err on the side of practicality here. However, it seems it's
> not quite clear what for example the requirements and wishes from zheap
> would be. What's the simplest way to move this forward?

The only thing zheap needs - in the current design, anyway - is the
ability to change the chunk size. However, I think that's mostly
because we haven't spent a lot of time thinking about how to do TOAST
better than the heap does TOAST today. I think it is desirable to allow
for more options than that. That's why I like this approach more than
the previous one.
The previous approach allowed the chunk size to be variable, but
permitted no other AM-specific variation; this one allows the AM to
detoast in any way that it likes. The downside of that is that if you
really do only want to vary the chunk size, you'll have to repeat
somewhat more code. That's sad, but we're never likely to have enough
AMs for that to be a really serious problem, and if we do, the
AM-specific callbacks for those AMs that just want a different chunk
size could call a common helper function.

> The refactorings you proposed seem reasonable on their own, and I have
> some additional comments on that if we decide to go forward in this
> direction. One thing that's confusing is that the TOAST tables have
> fields chunk_id and chunk_seq, but when an error message talks about
> "chunk %d" or "chunk number %d", they usually mean the "seq" and not the
> "id".

Well, we've got errors like this right now:

unexpected chunk number %d (expected %d) for toast value %u in %s

So at least in this case, and I think in many cases, we're referring to
the chunk_id as "toast value %u" and the chunk_seq as "chunk number
%d". I think that's pretty good terminology. It's unfortunate that the
TOAST table columns are called chunk_id and chunk_seq rather than, say,
value_id and chunk_number, and I guess we could possibly change that
without breaking too many things, but I'm not sure that changing the
error messages would help anybody. We could try to rephrase the error
message to mention the two values in the opposite order, which to me
would be more clear, but I'm not exactly sure how to do that without
writing rather awkward English.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
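
As a rough illustration of the "AM-specific callback plus common helper"
idea discussed in the message above, here is a minimal standalone sketch.
It is not code from the patch set and not PostgreSQL's API: every type
and function name (ChunkSketch, ToastPointerSketch, detoast_callback,
detoast_fixed_chunks, heap_like_detoast, zheap_like_detoast) is
hypothetical, the chunk sizes 4 and 6 are toy values, and the chunks are
assumed to arrive already sorted by chunk_seq. Only the "unexpected chunk
number" message text is taken from the message; the other two error
strings are invented for the sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one row of a TOAST table. */
typedef struct ChunkSketch
{
    uint32_t    chunk_id;   /* which toasted value the chunk belongs to */
    int32_t     chunk_seq;  /* position of the chunk within that value */
    const char *data;
    size_t      len;
} ChunkSketch;

/* Hypothetical stand-in for a TOAST pointer plus the rows it refers to. */
typedef struct ToastPointerSketch
{
    uint32_t           value_id;   /* matches ChunkSketch.chunk_id */
    size_t             total_size; /* size of the reassembled value */
    const char        *relname;    /* used only in error messages */
    const ChunkSketch *chunks;     /* assumed sorted by chunk_seq */
    size_t             nchunks;
} ToastPointerSketch;

/* Per-AM detoast callback: 0 on success, -1 with err filled in on failure. */
typedef int (*detoast_callback) (const ToastPointerSketch *ptr, char *out,
                                 char *err, size_t errlen);

/* Common helper for AMs that differ from each other only in chunk size. */
static int
detoast_fixed_chunks(const ToastPointerSketch *ptr, size_t chunk_size,
                     char *out, char *err, size_t errlen)
{
    size_t  copied = 0;
    int32_t expected_seq = 0;

    for (size_t i = 0; i < ptr->nchunks; i++)
    {
        const ChunkSketch *c = &ptr->chunks[i];
        size_t remaining, want;

        if (c->chunk_id != ptr->value_id)
            continue;               /* chunk of some other toast value */

        if (c->chunk_seq != expected_seq)
        {
            /* chunk_seq is the "chunk number", chunk_id the "toast value" */
            snprintf(err, errlen,
                     "unexpected chunk number %d (expected %d) for toast value %u in %s",
                     (int) c->chunk_seq, (int) expected_seq,
                     (unsigned) ptr->value_id, ptr->relname);
            return -1;
        }

        /* Every chunk is full-sized except possibly the last one. */
        remaining = ptr->total_size - copied;
        want = remaining < chunk_size ? remaining : chunk_size;
        if (c->len != want)
        {
            snprintf(err, errlen, "sketch: chunk %d has size %zu, wanted %zu",
                     (int) c->chunk_seq, c->len, want);
            return -1;
        }

        memcpy(out + copied, c->data, c->len);
        copied += c->len;
        expected_seq++;
    }

    if (copied != ptr->total_size)
    {
        snprintf(err, errlen, "sketch: value %u is incomplete",
                 (unsigned) ptr->value_id);
        return -1;
    }
    return 0;
}

/* Two hypothetical AMs whose callbacks only pick a chunk size. */
static int
heap_like_detoast(const ToastPointerSketch *ptr, char *out,
                  char *err, size_t errlen)
{
    return detoast_fixed_chunks(ptr, 4, out, err, errlen);
}

static int
zheap_like_detoast(const ToastPointerSketch *ptr, char *out,
                   char *err, size_t errlen)
{
    return detoast_fixed_chunks(ptr, 6, out, err, errlen);
}
```

A caller would hold a detoast_callback for whichever AM owns the TOAST
data and invoke it with a pointer and an output buffer of at least
total_size bytes. An AM that wants something other than fixed-size
chunks would supply its own callback and skip the helper entirely, which
is the extra flexibility argued for above.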