Thread: Column storage (EXTERNAL/EXTENDED) settings for bytea/text column

Column storage (EXTERNAL/EXTENDED) settings for bytea/text column

From
Joel Stevenson
Date:
Hi all,

I'm trying to do some comparisons between the EXTERNAL and the EXTENDED storage methods on a bytea column and from the
outsidethe setting doesn't appear to affect the value stored on initial insert, but perhaps I'm looking at the wrong
numbers. If I create two new tables with a single bytea column and set one of them to external storage, then insert an
existingbytea value from another table into each one, they appear to be of exactly the same size.  This is using PG
9.0.3on Debian Lenny, using the backports-sloppy deb package of PG 9. 

(I've verified that the first table has "extended" storage via pg_attribute and that the second table has external.)

create table obj1 ( object bytea );
create table obj2 ( object bytea );
alter table obj2 alter column object set storage external;
insert into obj1 ( object ) select object from serialized_content where id = 12345;
insert into obj2 ( object ) select object from obj1;
select pg_total_relation_size('obj1') as o1, pg_total_relation_size( (select reltoastrelid from pg_class where relname
='obj1' ) ) as otoast1, pg_total_relation_size('obj2') as o2, pg_total_relation_size( (select reltoastrelid from
pg_classwhere relname = 'obj2' ) ) as otoast2; 
  o1   | otoast1 |  o2   | otoast2
-------+---------+-------+---------
 65536 |   57344 | 65536 |   57344
(1 row)

Now at this point if I perform a vacuum full on one or both, they'll both shrink down to a bit over half that size:

vacuum full obj1;
vacuum full obj2;
select pg_total_relation_size('obj1') as o1, pg_total_relation_size( (select reltoastrelid from pg_class where relname
='obj1' ) ) as otoast1, pg_total_relation_size('obj2') as o2, pg_total_relation_size( (select reltoastrelid from
pg_classwhere relname = 'obj2' ) ) as otoast2; 
  o1   | otoast1 |  o2   | otoast2
-------+---------+-------+---------
 40960 |   32768 | 40960 |   32768

This shrinkage doesn't occur with a normal vacuum, in fact the reported sizes increase if I do a normal vacuum at that
point,but that's not germane to my question AFAICT. 

Can I use the relation size like this to determine whether or not compression is happening for these toast columns?  If
not,is there a way that I can confirm that it is or isn't active?  The results appear to be similar for text columns. 

Thanks,
Joel


Re: Column storage (EXTERNAL/EXTENDED) settings for bytea/text column

From
Bill Moran
Date:
In response to Joel Stevenson <jstevenson@bepress.com>:

> Hi all,
>
> I'm trying to do some comparisons between the EXTERNAL and the EXTENDED storage methods on a bytea column and from
theoutside the setting doesn't appear to affect the value stored on initial insert, but perhaps I'm looking at the
wrongnumbers.  If I create two new tables with a single bytea column and set one of them to external storage, then
insertan existing bytea value from another table into each one, they appear to be of exactly the same size.  This is
usingPG 9.0.3 on Debian Lenny, using the backports-sloppy deb package of PG 9. 
>
> (I've verified that the first table has "extended" storage via pg_attribute and that the second table has external.)
>
> create table obj1 ( object bytea );
> create table obj2 ( object bytea );
> alter table obj2 alter column object set storage external;
> insert into obj1 ( object ) select object from serialized_content where id = 12345;
> insert into obj2 ( object ) select object from obj1;
> select pg_total_relation_size('obj1') as o1, pg_total_relation_size( (select reltoastrelid from pg_class where
relname= 'obj1' ) ) as otoast1, pg_total_relation_size('obj2') as o2, pg_total_relation_size( (select reltoastrelid
frompg_class where relname = 'obj2' ) ) as otoast2; 
>   o1   | otoast1 |  o2   | otoast2
> -------+---------+-------+---------
>  65536 |   57344 | 65536 |   57344
> (1 row)
>
> Now at this point if I perform a vacuum full on one or both, they'll both shrink down to a bit over half that size:
>
> vacuum full obj1;
> vacuum full obj2;
> select pg_total_relation_size('obj1') as o1, pg_total_relation_size( (select reltoastrelid from pg_class where
relname= 'obj1' ) ) as otoast1, pg_total_relation_size('obj2') as o2, pg_total_relation_size( (select reltoastrelid
frompg_class where relname = 'obj2' ) ) as otoast2; 
>   o1   | otoast1 |  o2   | otoast2
> -------+---------+-------+---------
>  40960 |   32768 | 40960 |   32768

I'm not an expert, but it looks like you're not storing enough data to
actually see the difference, since the actual sizes of the tables will
always be rounded to an even page size.  With only 1 row, it's always
going to take a minimum amount.

Also, are you sure you're storing compressible data?  For example, if
you're putting PNG or JPEG images in there, they're not going to compress
any.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

Re: Column storage (EXTERNAL/EXTENDED) settings for bytea/text column

From
Joel Stevenson
Date:
On Apr 12, 2011, at 10:33 AM, Bill Moran wrote:

> In response to Joel Stevenson <jstevenson@bepress.com>:
>
>> select pg_total_relation_size('obj1') as o1, pg_total_relation_size( (select reltoastrelid from pg_class where
relname= 'obj1' ) ) as otoast1, pg_total_relation_size('obj2') as o2, pg_total_relation_size( (select reltoastrelid
frompg_class where relname = 'obj2' ) ) as otoast2; 
>>  o1   | otoast1 |  o2   | otoast2
>> -------+---------+-------+---------
>> 40960 |   32768 | 40960 |   32768
>
> I'm not an expert, but it looks like you're not storing enough data to
> actually see the difference, since the actual sizes of the tables will
> always be rounded to an even page size.  With only 1 row, it's always
> going to take a minimum amount.
>
> Also, are you sure you're storing compressible data?  For example, if
> you're putting PNG or JPEG images in there, they're not going to compress
> any.
>

Thanks for the reply, Bill.

The data is very compressible, the raw data is 33392 bytes long and gzips down to 6965 bytes.  As far as not storing
enough,the description of the 'SET STORAGE' clause and the TOAST strategy it sounds like the TOASTer will try to
compressanything that doesn't fit into the PG page ( default 8Kb ) so I would've thought that compression would be used
forthe EXTENDED column and not used for the EXTERNAL column since my single-row data is larger than that. 

To be certain I stored 10 rows of that data and rechecked the reported size after a vacuum full:

select pg_total_relation_size('obj1') as o1, pg_total_relation_size( (select reltoastrelid from pg_class where relname
='obj1' ) ) as otoast1, pg_total_relation_size('obj2') as o2, pg_total_relation_size( (select reltoastrelid from
pg_classwhere relname = 'obj2' ) ) as otoast2; 
   o1   | otoast1 |   o2   | otoast2
--------+---------+--------+---------
 147456 |  139264 | 147456 |  139264

So, again from the outside, the column storage settings don't appear to be behaving as I'd expect them too.

Stumped.

- Joel

Re: Column storage (EXTERNAL/EXTENDED) settings for bytea/text column

From
Noah Misch
Date:
On Mon, Apr 11, 2011 at 03:19:23PM -0700, Joel Stevenson wrote:
> I'm trying to do some comparisons between the EXTERNAL and the EXTENDED storage methods on a bytea column and from
theoutside the setting doesn't appear to affect the value stored on initial insert, but perhaps I'm looking at the
wrongnumbers.  If I create two new tables with a single bytea column and set one of them to external storage, then
insertan existing bytea value from another table into each one, they appear to be of exactly the same size.  This is
usingPG 9.0.3 on Debian Lenny, using the backports-sloppy deb package of PG 9. 
>
> (I've verified that the first table has "extended" storage via pg_attribute and that the second table has external.)
>
> create table obj1 ( object bytea );
> create table obj2 ( object bytea );
> alter table obj2 alter column object set storage external;
> insert into obj1 ( object ) select object from serialized_content where id = 12345;
> insert into obj2 ( object ) select object from obj1;

If the value that shows up for insertion is already compressed, EXTERNAL storage
will not decompress it.  Change this line to

  insert into obj2 ( object ) select object || '' from obj1;

to observe the effect you seek.

Given the purpose of EXTERNAL storage, this might qualify as a bug.

> select pg_total_relation_size('obj1') as o1, pg_total_relation_size( (select reltoastrelid from pg_class where
relname= 'obj1' ) ) as otoast1, pg_total_relation_size('obj2') as o2, pg_total_relation_size( (select reltoastrelid
frompg_class where relname = 'obj2' ) ) as otoast2; 
>   o1   | otoast1 |  o2   | otoast2
> -------+---------+-------+---------
>  65536 |   57344 | 65536 |   57344

> Can I use the relation size like this to determine whether or not compression is happening for these toast columns?
Ifnot, is there a way that I can confirm that it is or isn't active?  The results appear to be similar for text
columns.

Yes; the sizes you're seeing through that method should be accurate.

nm

Re: Column storage (EXTERNAL/EXTENDED) settings for bytea/text column

From
Joel Stevenson
Date:
Thanks for the help with that, Noah.  Indeed the sizes do look like I'd expect them to if I force deflation of the
byteavalue before inserting it into the EXTENDED column.   

On Apr 21, 2011, at 2:02 PM, Noah Misch wrote:

> On Mon, Apr 11, 2011 at 03:19:23PM -0700, Joel Stevenson wrote:
>> create table obj1 ( object bytea );
>> create table obj2 ( object bytea );
>> alter table obj2 alter column object set storage external;
>> insert into obj1 ( object ) select object from serialized_content where id = 12345;
>> insert into obj2 ( object ) select object from obj1;
>
> If the value that shows up for insertion is already compressed, EXTERNAL storage
> will not decompress it.  Change this line to
>
>  insert into obj2 ( object ) select object || '' from obj1;
>
> to observe the effect you seek.
>
> Given the purpose of EXTERNAL storage, this might qualify as a bug.