Thread: Higher TOAST compression.
Friendly greetings!
I'd like to get a higher compression ratio on our database.

From the PostgreSQL 8.3 documentation:
"The TOAST code is triggered only when a row value to be stored in a
table is wider than TOAST_TUPLE_THRESHOLD bytes (normally 2 kB). The
TOAST code will compress and/or move field values out-of-line until
the row value is shorter than TOAST_TUPLE_TARGET bytes (also normally
2 kB) or no more gains can be had."

According to the source code:
TOAST_TUPLE_THRESHOLD = a page (8KB) divided by TOAST_TUPLES_PER_PAGE (4 by default) = 2KB
TOAST_TUPLE_TARGET = TOAST_TUPLE_THRESHOLD = 2KB

If I understood correctly, compression stops when the data to TOAST
is equal to TOAST_TUPLE_TARGET.

What about changing TOAST_TUPLE_TARGET to get higher compression (by
having more TOASTed records)? I'd like to change
TOAST_TUPLES_PER_PAGE, maybe from 4 to 8. Is that correct? Did I miss
something?

I did some statistics, and I would have many more TOASTed records, as
most of them are between 1KB and 2KB. The servers have a lot of free
CPU (2x4 cores) and are running out of I/O; I hope to save some I/O.

PS: The tables are clustered and all required indexes are present.

Any thought? Idea? Thank you.

-- 
F4FQM
Kerunix Flan
Laurent Laborde
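For reference, the macros in question live in
src/include/access/tuptoaster.h. In 8.3 they look roughly like this
(paraphrased; the exact macro spelling varies a bit between versions):

/* Paraphrased from src/include/access/tuptoaster.h (8.3). */

/* Aim for at least this many tuples per heap page. */
#define TOAST_TUPLES_PER_PAGE	4

/*
 * Toasting kicks in above this size: roughly (BLCKSZ minus page and
 * item-pointer overhead) / TOAST_TUPLES_PER_PAGE, i.e. about 2000
 * bytes with the default 8 KB BLCKSZ.
 */
#define TOAST_TUPLE_THRESHOLD \
	MAXALIGN_DOWN((BLCKSZ - \
				   MAXALIGN(sizeof(PageHeaderData) + \
							(TOAST_TUPLES_PER_PAGE - 1) * sizeof(ItemIdData))) / \
				  TOAST_TUPLES_PER_PAGE)

/* ...and toasting stops once the row is back under the target. */
#define TOAST_TUPLE_TARGET		TOAST_TUPLE_THRESHOLD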
Laurent Laborde <kerdezixe@gmail.com> wrote:

> What about changing TOAST_TUPLE_TARGET to get higher compression (by
> having more TOASTed records)?
>
> I'd like to change TOAST_TUPLES_PER_PAGE, maybe from 4 to 8.
> Is that correct? Did I miss something?
>
> I did some statistics, and I would have many more TOASTed records,
> as most of them are between 1KB and 2KB.

It seems like it might be reasonable to have a separate threshold for
compression from the one for out-of-line storage. Since I've been in
that code recently, I would be pretty comfortable doing something
about that; but, as is so often the case, the problem will probably
be getting agreement on what would be a good change.

Ignoring for a moment the fact that "low hanging fruit" in the form
of *very* large values can be handled first, the options would seem
to be:

(1) Just hard-code a lower default threshold for compression than for
out-of-line storage.

(2) Add a GUC or two to control the thresholds.

(3) Allow override of the thresholds for individual columns.

Are any of these non-controversial? Which do people like? What did I
miss?

-Kevin
On Fri, 2009-07-17 at 12:50 -0500, Kevin Grittner wrote:
> Laurent Laborde <kerdezixe@gmail.com> wrote:
>
> (3) Allow override of the thresholds for individual columns.
>
> Are any of these non-controversial? Which do people like? What did
> I miss?

I would skip 1 and 2 and have (3).

Joshua D. Drake

-- 
PostgreSQL - XMPP: jdrake@jabber.postgresql.org
  Consulting, Development, Support, Training
  503-667-4564 - http://www.commandprompt.com/
  The PostgreSQL Company, serving since 1997
"Joshua D. Drake" <jd@commandprompt.com> wrote: > On Fri, 2009-07-17 at 12:50 -0500, Kevin Grittner wrote: >> (3) Allow override of the thresholds for individual columns. > I would skip 1 and 2 and have (3). Sure, pick the one which requires new syntax! ;-) How about two new ALTER TABLE actions: ALTER [ COLUMN ] column SET COMPRESSION_THRESHOLD integer ALTER [ COLUMN ] columnSET EXTERNAL_THRESHOLD integer Or can you suggest something better? Laurent, would something like this address your needs? I was assuming that if the point is to reduce I/O, you were interested in doing more compression, not in storing more values out-of-line (in the separate TOAST table)? Would it be manageable to tune this on a column-by- column basis? -Kevin
On Fri, 2009-07-17 at 14:21 -0500, Kevin Grittner wrote:
> "Joshua D. Drake" <jd@commandprompt.com> wrote:
>> On Fri, 2009-07-17 at 12:50 -0500, Kevin Grittner wrote:
>
>>> (3) Allow override of the thresholds for individual columns.
>
>> I would skip 1 and 2 and have (3).
>
> Sure, pick the one which requires new syntax! ;-)

Better than the argument for a new GUC :).

> How about two new ALTER TABLE actions:
>
> ALTER [ COLUMN ] column SET COMPRESSION_THRESHOLD integer
> ALTER [ COLUMN ] column SET EXTERNAL_THRESHOLD integer
>
> Or can you suggest something better?

Seems reasonable to me.

Joshua D. Drake

-- 
PostgreSQL - XMPP: jdrake@jabber.postgresql.org
  Consulting, Development, Support, Training
  503-667-4564 - http://www.commandprompt.com/
  The PostgreSQL Company, serving since 1997
On Fri, Jul 17, 2009 at 9:21 PM, Kevin Grittner<Kevin.Grittner@wicourts.gov> wrote:
> "Joshua D. Drake" <jd@commandprompt.com> wrote:
>> On Fri, 2009-07-17 at 12:50 -0500, Kevin Grittner wrote:
>
>>> (3) Allow override of the thresholds for individual columns.
>
>> I would skip 1 and 2 and have (3).
>
> Sure, pick the one which requires new syntax! ;-)
>
> How about two new ALTER TABLE actions:
>
> ALTER [ COLUMN ] column SET COMPRESSION_THRESHOLD integer
> ALTER [ COLUMN ] column SET EXTERNAL_THRESHOLD integer
>
> Or can you suggest something better?
>
> Laurent, would something like this address your needs? I was
> assuming that, since the point is to reduce I/O, you were interested
> in doing more compression, not in storing more values out-of-line
> (in the separate TOAST table)? Would it be manageable to tune this
> on a column-by-column basis?

Certainly! We already alter the storage type on some columns; adding
some more won't be a problem. :)

But... on which version are you planning to do that?

We're still using PostgreSQL 8.3, because we use Slony-I 1.2.15 and
upgrading to 8.4 is a *major* pain (discussed on the Slony mailing
list). Slony-I 1.2.15 won't compile on 8.4, and upgrading to Slony-I
2.0.x requires rebuilding the whole cluster (and upgrading to Pg 8.4
requires a rebuild too). So we'd need to upgrade both Slony and
PostgreSQL at once, with an impossible downtime :) We'll stay on Pg
8.3 until the Slony developers find a better upgrade solution.

The proposed solution sounds really good to me. But, for now, if I
could have a simple patch for 8.3 (e.g. changing a #define in the
source code), I'd be very happy :)

Is it OK to just change TOAST_TUPLES_PER_PAGE?

Thank you for all your replies and proposed solutions :)

PS: I'm not a C coder, but if you have some Perl to be
patched/cleaned, I'm here :)

-- 
Laurent Laborde
Sysadmin at http://www.over-blog.com/
Laurent Laborde <kerdezixe@gmail.com> wrote:
> Kevin Grittner<Kevin.Grittner@wicourts.gov> wrote:
>> How about two new ALTER TABLE actions:
>>
>> ALTER [ COLUMN ] column SET COMPRESSION_THRESHOLD integer
>> ALTER [ COLUMN ] column SET EXTERNAL_THRESHOLD integer

>> Laurent, would something like this address your needs?

> Certainly! We already alter the storage type on some columns; adding
> some more won't be a problem. :)
>
> But... on which version are you planning to do that?

The patch, if there's consensus that it's a good idea, would be for
8.5. Since it is new functionality, there wouldn't be a back-port to
prior releases. Of course, I wouldn't be starting to work on such a
patch until after our current code commit phase, which ends August
15th.

> We'll stay on Pg 8.3 until the Slony developers find a better
> upgrade solution.
>
> The proposed solution sounds really good to me. But, for now, if I
> could have a simple patch for 8.3 (e.g. changing a #define in the
> source code), I'd be very happy :)
>
> Is it OK to just change TOAST_TUPLES_PER_PAGE?

The thing that worries me about that is that it would tend to force
more data to be stored out-of-line, which might *increase* your I/O;
since the whole point of this exercise is to try to *decrease* it,
that seems pretty iffy. However, once we get to the end of code
commit, I might be able to give you a little one-off patch that would
be more aggressive about compression without affecting out-of-line
storage. Hard-coded, like what you're talking about, but with a
little more finesse.

-Kevin
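Kevin's worry is easy to quantify: the inline threshold scales
inversely with TOAST_TUPLES_PER_PAGE. A small self-contained
computation (page and item-pointer overhead approximated, 8 KB pages
assumed; these are not the real PostgreSQL constants) shows what
moving from 4 to 8 does:

#include <stdio.h>

/* Toy model of the TOAST threshold arithmetic; header sizes are
 * rough approximations, not PostgreSQL's actual values. */
#define BLCKSZ      8192
#define PAGE_HEADER 24          /* roughly sizeof(PageHeaderData) */
#define ITEM_ID     4           /* roughly sizeof(ItemIdData) */

static int max_bytes_per_tuple(int tuples_per_page)
{
	return (BLCKSZ - (PAGE_HEADER + tuples_per_page * ITEM_ID))
		   / tuples_per_page;
}

int main(void)
{
	/* Default: about 2 KB, so 1-2 KB rows stay inline. */
	printf("4 tuples/page -> threshold ~%d bytes\n", max_bytes_per_tuple(4));
	/* With 8: about 1 KB, so those same rows get compressed AND,
	 * if still too big, pushed out of line to the TOAST table. */
	printf("8 tuples/page -> threshold ~%d bytes\n", max_bytes_per_tuple(8));
	return 0;
}

Since most of Laurent's rows are between 1 KB and 2 KB, halving the
threshold would toast nearly all of them, which is exactly the extra
seek traffic Kevin is warning about.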
On Fri, Jul 17, 2009 at 10:40 PM, Kevin Grittner<Kevin.Grittner@wicourts.gov> wrote:
> Laurent Laborde <kerdezixe@gmail.com> wrote:
>
>> But... on which version are you planning to do that?
>
> The patch, if there's consensus that it's a good idea, would be for
> 8.5. Since it is new functionality, there wouldn't be a back-port
> to prior releases. Of course, I wouldn't be starting to work on
> such a patch until after our current code commit phase, which ends
> August 15th.

Sure, no problem.

>> We'll stay on Pg 8.3 until the Slony developers find a better
>> upgrade solution.
>>
>> The proposed solution sounds really good to me. But, for now, if I
>> could have a simple patch for 8.3 (e.g. changing a #define in the
>> source code), I'd be very happy :)
>>
>> Is it OK to just change TOAST_TUPLES_PER_PAGE?
>
> The thing that worries me about that is that it would tend to force
> more data to be stored out-of-line, which might *increase* your
> I/O; since the whole point of this exercise is to try to *decrease*
> it, that seems pretty iffy. However, once we get to the end of
> code commit, I might be able to give you a little one-off patch
> that would be more aggressive about compression without affecting
> out-of-line storage. Hard-coded, like what you're talking about,
> but with a little more finesse.

Awesome! Yes, I understand the problem.

What about SET STORAGE MAIN, then? To prevent out-of-line storage?

We use PLAIN on some specific columns (I don't know why; it was there
before I joined over-blog) and the default EXTENDED storage for all
other columns.

Thank you :)

-- 
Laurent Laborde
Sysadmin @ http://www.over-blog.com/
Laurent Laborde <kerdezixe@gmail.com> wrote:

> What about SET STORAGE MAIN, then? To prevent out-of-line storage?

Well, that doesn't try as hard as you might think to keep from
storing data out-of-line. It uses the same threshold as the default
EXTENDED storage, but saves the out-of-line option for such columns
as the last thing to try to get the tuple within the threshold. It
is because I wrote a very small patch to address that issue that I
jumped in on your issue.

If you want to try my patch, here:

http://archives.postgresql.org/message-id/4A3638530200002500027A95@gw.wicourts.gov

you could adjust both TOAST_TUPLES_PER_PAGE_MAIN and
TOAST_TUPLES_PER_PAGE to suit your needs and set storage for columns
to MAIN as needed. Be very cautious if you try this, because this
patch has not yet been reviewed or accepted.

-Kevin
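For context, the core of the referenced patch is to give MAIN columns
a separate, much more permissive target than the default one. A
sketch (the _MAIN names are taken from the message above; the exact
values and the MaximumBytesPerTuple() helper are assumptions, since
the patch was still under review):

/* Sketch only; not the committed code. */

/* Default target for EXTENDED/EXTERNAL columns: 4 tuples per page. */
#define TOAST_TUPLES_PER_PAGE		4
#define TOAST_TUPLE_TARGET		MaximumBytesPerTuple(TOAST_TUPLES_PER_PAGE)

/*
 * MAIN columns aim for just 1 tuple per page, so they are pushed out
 * of line only when the tuple cannot fit on a page at all.
 */
#define TOAST_TUPLES_PER_PAGE_MAIN	1
#define TOAST_TUPLE_TARGET_MAIN		MaximumBytesPerTuple(TOAST_TUPLES_PER_PAGE_MAIN)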
On Fri, Jul 17, 2009 at 11:10 PM, Kevin Grittner<Kevin.Grittner@wicourts.gov> wrote:
> Laurent Laborde <kerdezixe@gmail.com> wrote:
>
>> What about SET STORAGE MAIN, then? To prevent out-of-line storage?
>
> Well, that doesn't try as hard as you might think to keep from
> storing data out-of-line. It uses the same threshold as the default
> EXTENDED storage, but saves the out-of-line option for such columns
> as the last thing to try to get the tuple within the threshold. It
> is because I wrote a very small patch to address that issue that I
> jumped in on your issue.
>
> If you want to try my patch, here:
>
> http://archives.postgresql.org/message-id/4A3638530200002500027A95@gw.wicourts.gov
>
> you could adjust both TOAST_TUPLES_PER_PAGE_MAIN and
> TOAST_TUPLES_PER_PAGE to suit your needs and set storage for columns
> to MAIN as needed. Be very cautious if you try this, because this
> patch has not yet been reviewed or accepted.

I'll take a look at it, compile, and try that (on a test server).

Thank you :)

-- 
Laurent Laborde
Sysadmin @ http://www.over-blog.com/
Hi again!

I also took a look at another possibility to improve compression.
There are two compression strategies:

static const PGLZ_Strategy strategy_default_data = {
	256,	/* Data chunks less than 256 bytes are not
			 * compressed */
	6144,	/* Data chunks >= 6K force compression, unless
			 * compressed output is larger than input */
	20,		/* Below 6K, compression rates below 20% mean
			 * fallback to uncompressed */
	128,	/* Stop history lookup if a match of 128 bytes
			 * is found */
	10		/* Lower good match size by 10% at every
			 * lookup loop iteration */
};
const PGLZ_Strategy *const PGLZ_strategy_default = &strategy_default_data;

static const PGLZ_Strategy strategy_always_data = {
	0,		/* Chunks of any size are compressed */
	0,
	0,		/* It's enough to save one single byte */
	128,	/* Stop history lookup if a match of 128 bytes
			 * is found */
	6		/* Look harder for a good match */
};
const PGLZ_Strategy *const PGLZ_strategy_always = &strategy_always_data;

1) "strategy_always_data" seems to never be used.
2) The default strategy could be more aggressive (with a higher CPU
cost).

Additionally, we use a patched version that modifies the default
strategy. If I understand correctly, instead of being more aggressive
about compression, it is *LESS* aggressive:

static const PGLZ_Strategy strategy_default_data = {
	32,			/* Data chunks less than 32 bytes are not compressed */
	1024 * 1024,	/* Data chunks over 1MB are not compressed either */
	25,			/* Require 25% compression rate, or not worth it */
	1024,		/* Give up if no compression in the first 1KB */
	128,		/* Stop history lookup if a match of 128 bytes is found */
	10			/* Lower good match size by 10% at every loop iteration */
};
const PGLZ_Strategy *const PGLZ_strategy_default = &strategy_default_data;

Isn't it?

What about setting "PGLZ_strategy_always" as the default strategy
(insane CPU cost?)? Or something in-between?

Thank you.

-- 
Laurent Laborde
Sysadmin @ http://www.over-blog.com/
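To map those bare initializer values to meanings: the strategy struct
in 8.3's pg_lzcompress.h looks roughly like this (field names are
approximate; note that the patched variant above follows a newer
six-field layout, which adds a maximum input size and a "give up
after N bytes without progress" cutoff):

/* Approximate 8.3 layout of PGLZ_Strategy (src/include/utils/pg_lzcompress.h). */
typedef struct PGLZ_Strategy
{
	int32	min_input_size;		/* don't compress inputs smaller than this */
	int32	force_input_size;	/* always try to compress inputs >= this */
	int32	min_comp_rate;		/* minimum % savings to keep the compressed form */
	int32	match_size_good;	/* stop the history lookup on a match this long */
	int32	match_size_drop;	/* relax the "good match" size by this % per loop */
} PGLZ_Strategy;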
Laurent Laborde <kerdezixe@gmail.com> wrote:

> There are two compression strategies:

> static const PGLZ_Strategy strategy_default_data = {

> static const PGLZ_Strategy strategy_always_data = {

> 1) "strategy_always_data" seems to never be used.

A quick grep sure makes it look that way. I will look closer later.

> 2) The default strategy could be more aggressive (with a higher CPU
> cost).

> What about setting "PGLZ_strategy_always" as the default strategy
> (insane CPU cost?)? Or something in-between?

That goes beyond what I was trying to do with my recent patch. What
you propose may be useful, but there would need to be much discussion
and benchmarking, and it would be a new patch.

If you have any benchmark information on relative speed and space
savings, please post it.

-Kevin
On Mon, Jul 20, 2009 at 6:04 PM, Kevin Grittner<Kevin.Grittner@wicourts.gov> wrote:
>
>> What about setting "PGLZ_strategy_always" as the default strategy
>> (insane CPU cost?)? Or something in-between?
>
> That goes beyond what I was trying to do with my recent patch. What
> you propose may be useful, but there would need to be much
> discussion and benchmarking, and it would be a new patch.
>
> If you have any benchmark information on relative speed and space
> savings, please post it.

I will try that as soon as my spare production server (2x4-core Xeon,
32GB RAM, 8 SAS disks) is back to life.

I wasn't thinking of the very aggressive (strategy_always) compression
strategy as a default for a PostgreSQL release; not everyone is
IO-bound :) Just as a default setting here at over-blog (definitely
IO-bound, with large articles and comments).

Anyway, I will try and report on the different strategies here.

Thank you again for your feedback.

-- 
Laurent Laborde
Sysadmin @ http://www.over-blog.com/
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > It seems like it might be reasonable to have a separate threshold for > compression from that for out-of-line storage. Since I've been in > that code recently, I would be pretty comfortable doing something > about that; but, as is so often the case, the problem will probably be > getting agreement on what would be a good change. > Ignoring for a moment the fact that "low hanging fruit" in the form of > *very* large values can be handled first, the options would seem to > be: > (1) Just hard-code a lower default threshold for compression than for > out-of-line storage. > (2) Add a GUC or two to control thresholds. > (3) Allow override of the thresholds for individual columns. I'm not clear how this would work. The toast code is designed around hitting a target for the overall tuple size; how would it make sense to treat compression and out-of-lining differently? And especially, how could you have per-column targets? I could see having a reloption that allowed per-table adjustment of the target tuple width... regards, tom lane
My 1st applied patch is the safest and simplest, in pg_lzcompress.c:

static const PGLZ_Strategy strategy_default_data = {
	256,	/* Data chunks less than 256 bytes are not compressed */
	256,	/* Force compression on data chunks >= 256 bytes */
	1,		/* Compression rates below 1% mean fallback to uncompressed */
	256,	/* Stop history lookup if a match of 256 bytes is found */
	6		/* Lower good match size by 6% at every lookup iteration */
};
const PGLZ_Strategy *const PGLZ_strategy_default = &strategy_default_data;

I need to test for a few days, but the first tests show that we're
still IO-bound :)

The most obvious effect is a reduction by a factor of 2~10 in the
size of some TOAST tables. It seems that a lot of records are now
kept inline instead of being stored in the TOAST table.

I will come back later with some numbers :)

The next patch will be a modified version of Kevin's patch (it
doesn't directly apply to our source code, as I'm using PostgreSQL
8.3 and his patch is for 8.4), plus a change in the TOAST threshold
and target.

What do you think about the parameters I used in the compression
strategy?

PS: The biggest records are French text and HTML (blog data:
articles, comments, ...).

Thank you.

-- 
Laurent Laborde
Sysadmin @ http://www.over-blog.com/
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> It seems like it might be reasonable to have a separate threshold
>> for compression from the one for out-of-line storage. Since I've
>> been in that code recently, I would be pretty comfortable doing
>> something about that; but, as is so often the case, the problem
>> will probably be getting agreement on what would be a good change.

> I'm not clear how this would work. The toast code is designed
> around hitting a target for the overall tuple size; how would it
> make sense to treat compression and out-of-lining differently?

The current steps are:

1: Inline compress attributes with attstorage 'x', and store very
   large attributes with attstorage 'x' or 'e' external immediately.
2: Store attributes with attstorage 'x' or 'e' external.
3: Inline compress attributes with attstorage 'm'.
4: Store attributes with attstorage 'm' external.

If we had separate compression and external storage tuple targets:

Is there any reason not to include 'm' in the first inline
compression phase (step 1)? It does seem reasonable to store "very
large attributes" externally in the first pass, but it would be
pretty easy to include 'm' in the compression while skipping it for
the external storage during step 1. In this phase we would use the
compression target.

Step 2 would use the target tuple size for external storage, which
would probably usually be >= the compression target. If we want to
allow a compression target > external storage target, I guess we
would have to let the smaller target go first; however, I'm not
really sure there is a sane use case for a compression target larger
than the external storage target.

Step 3 would go away, since its work could be moved to step 1.

Step 4 would maintain the behavior created by the recent patch.

> And especially, how could you have per-column targets?
>
> I could see having a reloption that allowed per-table adjustment of
> the target tuple width...

Yeah, this would have to be done by table, not by column. The
compression configuration mentioned by Laurent, if we want to make
that tunable, could make sense by column; but the toast tuple size
targets would clearly need to be by table.

-Kevin
I wrote:

> If we want to allow a compression target > external storage target,
> I guess we would have to let the smaller target go first

Scratch that part: even with a compression target larger than the
external storage target, it would make sense to use the same sequence
of steps.

-Kevin
Sorry I responded so quickly this early; I keep having additional
thoughts....

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:

>> And especially, how could you have per-column targets?

> Yeah, this would have to be done by table, not by column.

If we had two optional targets per column, we could process any
columns with such targets as a "step 0", before starting the tuple
size checks. I think that makes sense, so I'm flip-flopping on that
as a possibility. Now, whether that's overkill is another question.

-Kevin
On Wed, Jul 22, 2009 at 10:54 AM, Laurent Laborde<kerdezixe@gmail.com> wrote:
> My 1st applied patch is the safest and simplest, in pg_lzcompress.c:
>
> static const PGLZ_Strategy strategy_default_data = {
>	256,	/* Data chunks less than 256 bytes are not compressed */
>	256,	/* Force compression on data chunks >= 256 bytes */
>	1,		/* Compression rates below 1% mean fallback to uncompressed */
>	256,	/* Stop history lookup if a match of 256 bytes is found */
>	6		/* Lower good match size by 6% at every lookup iteration */
> };
> const PGLZ_Strategy *const PGLZ_strategy_default = &strategy_default_data;

I've been testing in production since yesterday.
It greatly improved %iowait.

My 1st guess is that PostgreSQL keeps more data inline instead of
moving it out to the TOAST table, massively reducing I/O seeks and
resulting in a higher I/O throughput (iostat shows a 5~25MB/s
bandwidth at 100% util instead of 2~5MB/s at 100% util).

So... now I'm not sure anymore about lowering the tuples per page
from 4 to 8. Doing that would mean more data in the TOAST table...

-- 
Laurent "ker2x" Laborde
Sysadmin @ http://www.over-blog.com/
Laurent Laborde <kerdezixe@gmail.com> wrote:

> (iostat shows a 5~25MB/s bandwidth at 100% util instead of 2~5MB/s
> at 100% util).

Any numbers for overall benefit at the application level?

> So... now I'm not sure anymore about lowering the tuples per page
> from 4 to 8. Doing that would mean more data in the TOAST table...

Yeah, I've been skeptical that it would be a good thing for your
situation unless the lower target applied only to more aggressive
compression, not to out-of-line storage.

If you can wait for a week or two, I can give you a "proof of
concept" patch to use separate targets for compression and
out-of-line storage. It would be interesting to see how much that
helps when combined with your more aggressive compression
configuration.

-Kevin
On Thu, Jul 23, 2009 at 4:45 PM, Kevin Grittner<Kevin.Grittner@wicourts.gov> wrote:
> Laurent Laborde <kerdezixe@gmail.com> wrote:
>
>> (iostat shows a 5~25MB/s bandwidth at 100% util instead of 2~5MB/s
>> at 100% util).
>
> Any numbers for overall benefit at the application level?
>
>> So... now I'm not sure anymore about lowering the tuples per page
>> from 4 to 8. Doing that would mean more data in the TOAST table...
>
> Yeah, I've been skeptical that it would be a good thing for your
> situation unless the lower target applied only to more aggressive
> compression, not to out-of-line storage.

I tested changing the (external) tuples per page to 8. As expected,
it very badly impacts I/O performance, as many tuples (which are
always read) are now stored externally, with some extremely high
iowait peaks because of the additional disk seeks required to get the
toasted data (the average I/O bandwidth dropped):

Cpu0 : 5.3%us, 3.0%sy, 0.0%ni, 7.0%id, 83.4%wa, 0.7%hi, 0.7%si, 0.0%st
Cpu1 : 4.3%us, 1.3%sy, 0.0%ni, 5.7%id, 88.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 3.3%us, 0.7%sy, 0.0%ni, 8.0%id, 88.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 3.7%us, 0.7%sy, 0.0%ni, 4.7%id, 91.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 4.0%us, 1.3%sy, 0.0%ni, 8.0%id, 86.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 3.7%us, 0.3%sy, 0.0%ni, 5.7%id, 90.3%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 3.0%us, 0.7%sy, 0.0%ni, 6.7%id, 89.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 2.7%us, 0.7%sy, 0.0%ni, 7.7%id, 89.0%wa, 0.0%hi, 0.0%si, 0.0%st

> If you can wait for a week or two, I can give you a "proof of
> concept" patch to use separate targets for compression and
> out-of-line storage. It would be interesting to see how much that
> helps when combined with your more aggressive compression
> configuration.

Of course, of course, I can wait.

All my patches and tests are released in a public repository:
http://github.com/ker2x/AkaneSQL/tree/master

I'll continue to patch PostgreSQL/AkaneSQL, for fun and learning
purposes :) Hoping to be good enough, someday, to submit patches
here.

-- 
Laurent Laborde
Sysadmin @ http://www.over-blog.com/
I'm currently rewriting the whole toaster stuff to simply define:

- a compression threshold (size limit to compress, in Nths of a page)
- an external threshold (size limit to externalize compressed data,
  in Nths of a page)

I'm keeping TOAST_INDEX_TARGET and EXTERN_TUPLES_PER_PAGE.

I expect a lot of trial and error, as it will be my first real patch.
http://github.com/ker2x/AkaneSQL/tree/master as usual.

-- 
F4FQM
Kerunix Flan
Laurent Laborde
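A sketch of what those two thresholds could look like (hypothetical
names and values, not from any posted patch; both expressed in Nths
of a page as described above, with MaximumBytesPerTuple() standing in
for the usual page-size arithmetic):

/*
 * Hypothetical sketch only: illustrative names and values for the
 * two separate targets, not code from any posted patch.
 */

/* Start compressing once the tuple exceeds ~BLCKSZ / 8 (~1 KB)... */
#define TOAST_COMPRESS_TUPLES_PER_PAGE	8
#define TOAST_COMPRESS_TARGET \
	MaximumBytesPerTuple(TOAST_COMPRESS_TUPLES_PER_PAGE)

/* ...but move data out of line only past ~BLCKSZ / 4 (~2 KB). */
#define TOAST_EXTERNAL_TUPLES_PER_PAGE	4
#define TOAST_EXTERNAL_TARGET \
	MaximumBytesPerTuple(TOAST_EXTERNAL_TUPLES_PER_PAGE)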
On Tue, Jul 28, 2009 at 2:36 PM, Laurent Laborde<kerdezixe@gmail.com> wrote:
> I'm currently rewriting the whole toaster stuff to simply define:
>
> - a compression threshold (size limit to compress, in Nths of a page)
> - an external threshold (size limit to externalize compressed data,
>   in Nths of a page)
>
> I'm keeping TOAST_INDEX_TARGET and EXTERN_TUPLES_PER_PAGE.
>
> I expect a lot of trial and error, as it will be my first real patch.
> http://github.com/ker2x/AkaneSQL/tree/master as usual.

Rewritten. All 114 tests passed.

I'm testing it on our platform (currently replicating ~80GB of data
via Slony; heavy read tests will come later).

If it works, and if you're interested, I may try to write a patch for
8.5.

*hugs*

-- 
Laurent "ker2x" Laborde
Sysadmin @ http://www.over-blog.com/
Laurent Laborde <kerdezixe@gmail.com> wrote:

> If it works, and if you're interested, I may try to write a patch
> for 8.5.

Cool! I can help with it if you wish. If you haven't already done
so, be sure to read this carefully:

http://wiki.postgresql.org/wiki/Developer_FAQ

Also, be sure you are taking into account the comments of others on
this thread. In particular, there was feedback regarding the level
at which to apply this (hard-coded, global but configurable, or
column-specific); what feedback we had was in favor of making it
configurable by column. If you're not comfortable with doing that
part of it, and you can get proof-of-concept benchmarks with a
hard-coded global change, that would help convince people that it is
worth adding code to support such capabilities at whatever level(s)
people agree is best. I've worked in the parser portion before, so
that part would be familiar to me if you want help there.

Finally, you should probably consider volunteering to review a patch
or two for the next commitfest. :-) To ensure timely review of
submitted patches, while still allowing the reviewers some guarantee
of unencumbered time to write their own patches, there is a
development cycle which alternates between coding and reviewing.

-Kevin
On Tue, Jul 28, 2009 at 12:44 PM, Kevin Grittner<Kevin.Grittner@wicourts.gov> wrote:
> Finally, you should probably consider volunteering to review a patch
> or two for the next commitfest. :-) To ensure timely review of
> submitted patches, while still allowing the reviewers some guarantee
> of unencumbered time to write their own patches, there is a
> development cycle which alternates between coding and reviewing.

Yep. I don't want to throw stones at anyone in particular, and
particularly not at first-time patch submitters, but there are always
more people writing patches than volunteering to review them. This
is partially offset by the fact that it usually takes less time to
review a patch than it does to write one; however, I know that I have
had a difficult time managing this CommitFest, reviewing 2 patches,
and keeping on top of the changes requested for the 10 patches I
submitted. As a result, one or more of my patches may fall out of
this CommitFest, whereas if I had not volunteered to manage the
CommitFest, or had not reviewed any patches myself, it would have
been much easier. I'm not bent out of shape about it, but it would
definitely be nice if everyone could take a turn reviewing,
especially regular contributors.

...Robert
On Jul 23, 2009, at 6:22 AM, Laurent Laborde wrote:
> On Wed, Jul 22, 2009 at 10:54 AM, Laurent
> Laborde<kerdezixe@gmail.com> wrote:
>> My 1st applied patch is the safest and simplest, in pg_lzcompress.c:
>>
>> static const PGLZ_Strategy strategy_default_data = {
>>	256,	/* Data chunks less than 256 bytes are not compressed */
>>	256,	/* Force compression on data chunks >= 256 bytes */
>>	1,		/* Compression rates below 1% mean fallback to uncompressed */
>>	256,	/* Stop history lookup if a match of 256 bytes is found */
>>	6		/* Lower good match size by 6% at every lookup iteration */
>> };
>> const PGLZ_Strategy *const PGLZ_strategy_default = &strategy_default_data;
>
> I've been testing in production since yesterday.
> It greatly improved %iowait.
>
> My 1st guess is that PostgreSQL keeps more data inline instead of
> moving it out to the TOAST table, massively reducing I/O seeks and
> resulting in a higher I/O throughput (iostat shows a 5~25MB/s
> bandwidth at 100% util instead of 2~5MB/s at 100% util).
>
> So... now I'm not sure anymore about lowering the tuples per page
> from 4 to 8. Doing that would mean more data in the TOAST table...

What's the typical size of your data that's being toasted? I actually
have a number of cases where I'd like to push data into external
storage, because it seriously hurts tuple density (and I doubt it'd
compress well).

-- 
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828