Re: more about pg_toast growth - Mailing list pgsql-general
From | Jan Wieck |
---|---|
Subject | Re: more about pg_toast growth |
Date | |
Msg-id | 200203132016.g2DKGDr32370@saturn.janwieck.net Whole thread Raw |
In response to | Re: more about pg_toast growth ("Jeffrey W. Baker" <jwbaker@acm.org>) |
Responses |
Re: more about pg_toast growth
("Jeffrey W. Baker" <jwbaker@acm.org>)
|
List | pgsql-general |
Jeffrey W. Baker wrote: > On Wed, 2002-03-13 at 07:22, Jan Wieck wrote: > > [...] > > > > Remember, TOAST doesn't only come in slices, don't you > > usually brown it? Meaning, the data gets compressed (with a > > lousy but really fast algorithm). What kind of data is > > resp_body? 50% compression ratio ... I guess it's html, > > right? > > It is gzipped and base64-encoded text. It's somewhat strange that a > fast LZ would deflate it very much, but I guess it must be an artifact > of the base64. The initial gzip tends to deflate the data by about 90%. Now THAT is very surprising to me! The SLZ algorithm used in TOAST will for sure not be able to squeeze anything out of a gzip compressed stream. The result would be bigger again. B64 changes the file size basically to 4/3rd, but since the input stream is gzipped, the resulting B64 stream shouldn't contain patterns that SLZ can use to reduce the size again. Are you sure you're B64-encoding the gzipped text? I mean, you have an average body size of 23K "gzipped", so you're telling that the average uncompressed body size is about 230K? You are storing 230 Megabytes of raw body data per hour? Man, who is writing all that text? > > > Anyway, I would suggest you increase the max_fsm_pages > > parameter. Commented out parameters in the postgresql.conf > > file means "default". You said you're doing about 1,000 > > inserts an hour and a daily bulk delete of approx. 24,000. > > Assuming most of the toast tuples are contigous, that'd mean > > you are freeing something like 35,000 toast pages. I would > > suggest a freespace map size of 50,000 pages, to start with. > > That should at least lower the growth rate. If you still see > > growth, go ahead and increase it further. > > I will happily do so. What is the argument against increasing the free > space map? Does it consume more memory? The machine has 4GB main > memory, and I wouln't notice increased consumption of a few megabytes. It allocates some more shared memory. It's surely in the range of a few megabytes, so no need to worry in this case. > What triggers storage in the toast tables? Is it because of a tuple > longer than some threshold? It is triggered by the attempt to store a row bigger than 1/4 of the blocksize. Blocksize defaults to 8K. The toaster tries to shrink down the row size by first compressing attribute by attribute, then by moving them out to the toast table. As soon as the row fit's into the 2K it'll stop and the storage happens. The 1/4 blocksize is something I choose because it had the best performance in my tests. The theory behind it is that your key fields are likely to be the small ones that remain uncompressed in the main-row. If more such main rows fit into fewer blocks, you have better caching of key attributes, thus faster scans and joins. And access to the huge values is only done after the final result set is collected, what usually becomes a smaller percentage of the entire data, the larger the data is. So it all reduces I/O. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com # _________________________________________________________ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com
pgsql-general by date: