Re: Column lookup in a row performance - Mailing list pgsql-hackers

From Павлухин Иван
Subject Re: Column lookup in a row performance
Msg-id CAOykqKf6GuyZV+pq6kjM6R9ToK7whA7aMKi4RFFYCPhb_7jFwA@mail.gmail.com
In response to Re: Column lookup in a row performance  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom, thanks for your answer. It definitely makes the picture in my mind
clearer.

On Tue, 2 Apr 2019 at 18:41, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Павлухин Иван <vololo100@gmail.com> writes:
> >> (1) Backwards compatibility, and (2) it's not clear that a different
> >> layout would be a win for all cases.
>
> > I am curious about (2). For my own understanding, it would be good to
> > find at least one case where a layout with lengths/offsets in a header
> > is crucially worse. I would be happy if someone could elaborate.
>
> It seems like you think the only figure of merit here is how fast
> deform_heap_tuple runs.  That's not the case.  There are at least
> two issues:
>
> 1.  You're not going to be able to do this without making tuples
> larger overall in many cases; but more data means more I/O which
> means less performance.  I base this objection on the observation
> that our existing design allows single-byte length "words" in many
> common cases, but it's really hard to see how you could avoid
> storing a full-size offset for each column if you want to be able
> to access each column in O(1) time without any examination of other
> columns.
>
> 2.  Our existing system design has an across-the-board assumption
> that each variable-length datum has its length embedded in it,
> so that a single pointer carries enough information for any called
> function to work with the value.  If you remove the length word
> and expect the length to be computed by subtracting two offsets that
> are not even physically adjacent to the datum, that stops working.
> There is no fix for that that doesn't add performance costs and
> complexity.
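
A minimal sketch of point 2, under stated assumptions: ToyDatum,
toy_make, toy_length and offset_length are hypothetical names, and the
4-byte length word is a simplification, not PostgreSQL's actual varlena
format. It contrasts a value that carries its own length with a value
whose length has to be derived from offsets stored elsewhere.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical self-describing datum: a length word followed by the
 * payload. A simplification, not PostgreSQL's real varlena layout. */
typedef struct ToyDatum {
    uint32_t len;     /* payload length in bytes */
    char     data[];  /* payload follows the length word */
} ToyDatum;

/* Build a ToyDatum from a C string (the terminating NUL is not stored). */
static ToyDatum *toy_make(const char *s)
{
    uint32_t len = (uint32_t) strlen(s);
    ToyDatum *d = malloc(sizeof(ToyDatum) + len);

    if (d == NULL)
        abort();
    d->len = len;
    memcpy(d->data, s, len);
    return d;
}

/* With an embedded length, a single pointer is all a callee needs. */
static uint32_t toy_length(const ToyDatum *d)
{
    return d->len;
}

/* With an offset table in the row header, the payload has no length of
 * its own: the callee also needs the table and the column number. */
static uint32_t offset_length(const uint32_t *end_offsets, int col)
{
    uint32_t start = (col == 0) ? 0 : end_offsets[col - 1];

    assert(end_offsets[col] >= start);
    return end_offsets[col] - start;
}
```

toy_length works with nothing but the datum pointer, which is the
property Tom describes. offset_length needs two extra arguments, so any
function that today accepts a bare pointer would need a wider signature
or a copy of the value with a length prepended.
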
>
> Practically speaking, even if we were willing to lose on-disk database
> compatibility, point 2 breaks so many internal and extension APIs that
> there's no chance whatever that we could remove the length-word datum
> headers.  That means that the added fields in tuple headers would be
> pure added space with no offsetting savings in the data size, making
> point 1 quite a lot worse.
>
>                         regards, tom lane
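
The space argument in point 1 can be sketched as simple arithmetic. This
is an illustration only: the function names are hypothetical, the 2-byte
offset width is an assumption, and alignment padding, the null bitmap
and the fixed tuple header are all ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Row size when every value carries an inline length header of
 * hdr_bytes bytes (hdr_bytes = 0 models values with no header). */
static size_t inline_header_size(const size_t *payload_lens, size_t ncols,
                                 size_t hdr_bytes)
{
    size_t total = 0;

    assert(payload_lens != NULL || ncols == 0);
    for (size_t i = 0; i < ncols; i++)
        total += hdr_bytes + payload_lens[i];
    return total;
}

/* Row size when the row header additionally stores one fixed-width
 * offset per column. inline_hdr_bytes > 0 models the case where the
 * per-datum length words cannot be removed. */
static size_t offset_table_size(const size_t *payload_lens, size_t ncols,
                                size_t offset_bytes, size_t inline_hdr_bytes)
{
    return ncols * offset_bytes
        + inline_header_size(payload_lens, ncols, inline_hdr_bytes);
}
```

For four short values of 3, 5, 2 and 6 bytes, single-byte inline headers
give 20 bytes. A table of 2-byte offsets with the length words removed
gives 24 bytes, and the same table with the length words kept, which is
the case Tom considers the only practical one, gives 28 bytes.
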



--
Best regards,
Ivan Pavlukhin