Re: RFC/WIP: adding new configuration options to TOAST - Mailing list pgsql-hackers
From | Bill Moran
Subject | Re: RFC/WIP: adding new configuration options to TOAST
Date |
Msg-id | 20151103215835.6d0161c3916bb70d5b21c5db@potentialtech.com
In response to | Re: RFC/WIP: adding new configuration options to TOAST (Jeff Janes <jeff.janes@gmail.com>)
Responses | Re: RFC/WIP: adding new configuration options to TOAST
List | pgsql-hackers
On Tue, 3 Nov 2015 18:34:39 -0800
Jeff Janes <jeff.janes@gmail.com> wrote:

> On Tue, Nov 3, 2015 at 5:21 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> > On 3 November 2015 at 23:04, Bill Moran <wmoran@potentialtech.com> wrote:
> >>
> >> Looking for feedback to see if anyone sees any issues or has any
> >> suggestions on what I'm doing. The attached patch alters 3 things
> >> with regard to TOAST behavior:
> >
> > COMPRESSION_TEST_SIZE (2) seems useful.
> >
> > The other two mostly seem like options nobody's going to know are
> > there, or know how to sensibly set if they do notice them. What's the
> > driving reason behind those, the problem you're trying to solve? Why
> > make them configurable per-table (or at all)?
>
> I currently have a table with one column which has a median width of
> 500 bytes, a 90th percentile of 650 bytes, and makes up 75% of the
> table's size, and the column is rarely used, while the table itself is
> frequently seq scanned. I'd very much like to drive that column out
> of main and into toast. I think target_tuple_size would let me do
> that.

That's exactly the use case. As it currently stands, any tuple smaller
than about 2K will never be toasted. So if you have 1900 bytes of highly
compressible text that is rarely queried while the table's other columns
are accessed frequently, there is no way to force that text out of line
from the main table, or to have it compressed. The two new configurables
let the DBA make the tradeoff between CPU usage and storage efficiency.

Since the TOAST code processes the column that will provide the largest
gain first, in the use case you describe you could calculate the combined
size of the other columns and set target_tuple_size just a bit larger
than that; the large column should then get moved into the TOAST table in
most or all cases (depending on how predictable the other columns' sizes
are).

The compression threshold is similarly hard-coded in current versions. I
feel that allowing the DBA to control how much savings is required before
incurring the overhead of compression is worthwhile, especially on a
per-table basis. For example, compression on an archive table could be
very aggressive, whereas compression on a frequently accessed table might
only be justified if it saves a lot of space. How much space compression
saves is highly dependent on the data being stored.

I don't have anything remotely like statistical advice on how much
improvement can actually be gained yet. Once I have per-table values
implemented, it will be much, much easier to test the impact.

It does sound like I need to spend a little more time improving the
documentation to ensure that it's clear what these values achieve.

> (Per-column control would be even nicer, but I'd take what I can get)

Oddly, I hadn't considered getting as granular as per-column, but now
that you've got me thinking about it, it seems like a logical step to
take.

-- 
Bill Moran
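For illustration, per-table knobs along these lines would presumably surface as relation storage parameters set with ALTER TABLE ... SET (...). The sketch below is purely hypothetical: target_tuple_size follows the name used in this thread, compression_savings_pct is a placeholder for the proposed compression threshold, and the table names are made up; the actual reloption names, units, and defaults depend on the final patch.

    -- Jeff's case: the non-wide columns total well under the ~2K TOAST
    -- threshold, so lowering the per-table target should push the ~500-byte
    -- column out of the main heap on most inserts.
    ALTER TABLE seq_scanned_table SET (target_tuple_size = 256);

    -- Archive table: compress aggressively, even for modest savings.
    ALTER TABLE archive_table SET (compression_savings_pct = 5);

    -- Frequently accessed table: only compress when it buys a lot of space.
    ALTER TABLE hot_table SET (compression_savings_pct = 40);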