Thread: [HACKERS] MAIN, Uncompressed?
It looks like we need a new Column Storage option for MAIN, Uncompressed. We have these Column Storage options Plain - inline only, uncompressed Main - inline until external as last resort, compressible External - external, uncompressed Extended - external, compressible So there is no option for Main, but not compressible... With reference to code... there seems to be no way to skip step 3 /* ----------* Compress and/or save external until data fits into target length** 1: Inline compress attributes with attstorage'x', and store very* large attributes with attstorage 'x' or 'e' external immediately* 2: Store attributeswith attstorage 'x' or 'e' external* 3: Inline compress attributes with attstorage 'm'* 4: Store attributes withattstorage 'm' external* ----------*/ Not sure what to call this new option? MAINU? Objections? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Simon Riggs <simon@2ndquadrant.com> writes: > It looks like we need a new Column Storage option for MAIN, Uncompressed. > We have these Column Storage options > Plain - inline only, uncompressed > Main - inline until external as last resort, compressible > External - external, uncompressed > Extended - external, compressible > So there is no option for Main, but not compressible... Doesn't Plain serve the purpose? In point of fact, though, "never inline and never compress" is not really a useful option, as it can be more easily read as "fail on wide values". regards, tom lane
On 25 August 2017 at 12:24, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Simon Riggs <simon@2ndquadrant.com> writes: >> It looks like we need a new Column Storage option for MAIN, Uncompressed. >> We have these Column Storage options >> Plain - inline only, uncompressed >> Main - inline until external as last resort, compressible >> External - external, uncompressed >> Extended - external, compressible > >> So there is no option for Main, but not compressible... > > Doesn't Plain serve the purpose? No, because that just gives an error if you try to insert a large column value. > In point of fact, though, "never inline and never compress" is not really > a useful option, as it can be more easily read as "fail on wide values". Agreed, but that is not what I am proposing. Main is roughly what is wanted, yet it always tries to compress. If you already know that won't be useful it should be possible to turn compression off. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Simon Riggs <simon@2ndquadrant.com> writes: > Main is roughly what is wanted, yet it always tries to compress. If > you already know that won't be useful it should be possible to turn > compression off. If you know compression isn't useful, but you don't want to fail on wide values, then "external" should serve the purpose. regards, tom lane
On 25 August 2017 at 13:21, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Simon Riggs <simon@2ndquadrant.com> writes: >> Main is roughly what is wanted, yet it always tries to compress. If >> you already know that won't be useful it should be possible to turn >> compression off. > > If you know compression isn't useful, but you don't want to fail on > wide values, then "external" should serve the purpose. Well, almost. External toasts at 2048-ish bytes whereas Main toasts at 8160 bytes. The rows are typically near 4kB long, so if marked External they would always be toasted. It's desirable to have the full row in the heap block, rather than have to access heap-toastindex-toastblocks in all cases. The data is also incompressible, so Main just wastes time on insert. Hence, we have a missing option. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Simon Riggs <simon@2ndquadrant.com> writes: > On 25 August 2017 at 13:21, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> If you know compression isn't useful, but you don't want to fail on >> wide values, then "external" should serve the purpose. > Well, almost. External toasts at 2048-ish bytes whereas Main toasts at > 8160 bytes. > The rows are typically near 4kB long, so if marked External they would > always be toasted. > It's desirable to have the full row in the heap block, rather than > have to access heap-toastindex-toastblocks in all cases. > The data is also incompressible, so Main just wastes time on insert. > Hence, we have a missing option. Maybe, but the use case seems mighty narrow. regards, tom lane
On 25 August 2017 at 14:08, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Simon Riggs <simon@2ndquadrant.com> writes: >> On 25 August 2017 at 13:21, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> If you know compression isn't useful, but you don't want to fail on >>> wide values, then "external" should serve the purpose. > >> Well, almost. External toasts at 2048-ish bytes whereas Main toasts at >> 8160 bytes. >> The rows are typically near 4kB long, so if marked External they would >> always be toasted. >> It's desirable to have the full row in the heap block, rather than >> have to access heap-toastindex-toastblocks in all cases. >> The data is also incompressible, so Main just wastes time on insert. >> Hence, we have a missing option. > > Maybe, but the use case seems mighty narrow. JSON blobs between 2kB and 8160 bytes are very common. String length is maybe a poisson distribution, definitely not uniform. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 25 August 2017 at 19:59, Simon Riggs <simon@2ndquadrant.com> wrote: > > On 25 August 2017 at 14:08, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > Maybe, but the use case seems mighty narrow. > > JSON blobs between 2kB and 8160 bytes are very common. > > String length is maybe a poisson distribution, definitely not uniform. But JSON blobs should be highly compressible. Even jsonb will be quite compressible. That said I always found remembering the mapping from these names to various behaviours to be quite hard to use. I would have found it far more useful to have two separate properties I could set "compress" and "external" or perhaps even more useful would be to set some kind of guideline size threshold for each (and perhaps a second size compress threshold and external threshold for the whole tuple). I think this is a particularly old piece of code and we're lucky the default heuristics have served well for all this time because I doubt many people fiddle with these storage attributes. The time may have come to come up with a better UI for the storage attributes because people are doing new things (like json) and wanting more control over this heuristic. For what it's worth I think a good start would be to give people more visibility into what the tuptoaster heuristic is actually doing to their data and that will encourage people to give feedback about when they're surprised and are frustrated by the existing UI. -- greg
Greg Stark <stark@mit.edu> writes: > I think this is a particularly old piece of code and we're lucky the > default heuristics have served well for all this time because I doubt > many people fiddle with these storage attributes. The time may have > come to come up with a better UI for the storage attributes because > people are doing new things (like json) and wanting more control over > this heuristic. Yeah, I could get behind a basic rethinking of the tuptoaster control knobs. I'm just not in love with Simon's specific proposal, especially not if we can't think of a better name for it than "MAINU". > For what it's worth I think a good start would be to give people more > visibility into what the tuptoaster heuristic is actually doing to > their data and that will encourage people to give feedback about when > they're surprised and are frustrated by the existing UI. Hm, what might that look like exactly? More pgstattuple functionality perhaps? regards, tom lane
On 25 August 2017 at 20:53, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Greg Stark <stark@mit.edu> writes: >> I think this is a particularly old piece of code and we're lucky the >> default heuristics have served well for all this time because I doubt >> many people fiddle with these storage attributes. The time may have >> come to come up with a better UI for the storage attributes because >> people are doing new things (like json) and wanting more control over >> this heuristic. > > Yeah, I could get behind a basic rethinking of the tuptoaster control > knobs. I'm just not in love with Simon's specific proposal, especially > not if we can't think of a better name for it than "MAINU". Extended/External would be just fine if you could set the toast target, so I think a better suggestion would be to make "toast_target" a per-attribute option . -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 26/08/17 12:18, Simon Riggs wrote: > On 25 August 2017 at 20:53, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Greg Stark <stark@mit.edu> writes: >>> I think this is a particularly old piece of code and we're lucky the >>> default heuristics have served well for all this time because I doubt >>> many people fiddle with these storage attributes. The time may have >>> come to come up with a better UI for the storage attributes because >>> people are doing new things (like json) and wanting more control over >>> this heuristic. >> Yeah, I could get behind a basic rethinking of the tuptoaster control >> knobs. I'm just not in love with Simon's specific proposal, especially >> not if we can't think of a better name for it than "MAINU". > Extended/External would be just fine if you could set the toast > target, so I think a better suggestion would be to make "toast_target" > a per-attribute option . > +1, have thought about this myself previously....thank you for bringing it up! regards Mark
On 26 August 2017 at 05:40, Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote: > On 26/08/17 12:18, Simon Riggs wrote: > >> On 25 August 2017 at 20:53, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> >>> Greg Stark <stark@mit.edu> writes: >>>> >>>> I think this is a particularly old piece of code and we're lucky the >>>> default heuristics have served well for all this time because I doubt >>>> many people fiddle with these storage attributes. The time may have >>>> come to come up with a better UI for the storage attributes because >>>> people are doing new things (like json) and wanting more control over >>>> this heuristic. >>> >>> Yeah, I could get behind a basic rethinking of the tuptoaster control >>> knobs. I'm just not in love with Simon's specific proposal, especially >>> not if we can't think of a better name for it than "MAINU". >> >> Extended/External would be just fine if you could set the toast >> target, so I think a better suggestion would be to make "toast_target" >> a per-attribute option . >> > > +1, have thought about this myself previously....thank you for bringing it > up! OK, so table-level option for "toast_tuple_target", not attribute-level option The attached patch and test shows this concept is useful and doesn't affect existing data. For 4x 4000 byte rows: * by default we use 1 heap block and 3 toast blocks * toast_tuple_target=4080 uses 2 heap blocks and 0 toast blocks -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On 29 August 2017 at 07:58, Simon Riggs <simon@2ndquadrant.com> wrote: > On 26 August 2017 at 05:40, Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote: >> On 26/08/17 12:18, Simon Riggs wrote: >> >>> On 25 August 2017 at 20:53, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> >>>> Greg Stark <stark@mit.edu> writes: >>>>> >>>>> I think this is a particularly old piece of code and we're lucky the >>>>> default heuristics have served well for all this time because I doubt >>>>> many people fiddle with these storage attributes. The time may have >>>>> come to come up with a better UI for the storage attributes because >>>>> people are doing new things (like json) and wanting more control over >>>>> this heuristic. >>>> >>>> Yeah, I could get behind a basic rethinking of the tuptoaster control >>>> knobs. I'm just not in love with Simon's specific proposal, especially >>>> not if we can't think of a better name for it than "MAINU". >>> >>> Extended/External would be just fine if you could set the toast >>> target, so I think a better suggestion would be to make "toast_target" >>> a per-attribute option . >>> >> >> +1, have thought about this myself previously....thank you for bringing it >> up! > > OK, so table-level option for "toast_tuple_target", not attribute-level option > > The attached patch and test shows this concept is useful and doesn't > affect existing data. > > For 4x 4000 byte rows: > * by default we use 1 heap block and 3 toast blocks > * toast_tuple_target=4080 uses 2 heap blocks and 0 toast blocks New patch, v2, since one line in the docs failed to apply because of recent changes. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On 09/12/2017 12:11 PM, Simon Riggs wrote: > >> OK, so table-level option for "toast_tuple_target", not attribute-level option >> >> The attached patch and test shows this concept is useful and doesn't >> affect existing data. >> >> For 4x 4000 byte rows: >> * by default we use 1 heap block and 3 toast blocks >> * toast_tuple_target=4080 uses 2 heap blocks and 0 toast blocks > > New patch, v2, since one line in the docs failed to apply because of > recent changes. > This has bitrotted ever so slightly. I fixed that and took the opportunity to edit the docs text slightly to improve the clarity a bit. Revised patch attached - adjust to your taste. The code is extremely simple and small and straightforward, and the tests are good. Marking as Ready for Committer, cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services