Re: Database storage bloat - Mailing list pgsql-admin

From: Tom Lane
Subject: Re: Database storage bloat
Msg-id: 15732.1081446563@sss.pgh.pa.us
In response to: Re: Database storage bloat ("Tony and Bryn Reina" <reina_ga@hotmail.com>)
List: pgsql-admin

"Tony and Bryn Reina" <reina_ga@hotmail.com> writes:
> CREATE TABLE SegmentValues (
>      dbIndex                 integer REFERENCES EntityFile (dbIndex),
>      dwunitid  smallint,
>      dwsampleindex  smallint,
>      dtimestamp  float4,
>      dvalue   float4,
>      PRIMARY KEY (dbIndex, dtimestamp, dwsampleindex, dwunitid)
> );

> I suppose one thing the binary flat file may be doing is not including the
> time stamp in table SegmentValues. Since I know the sampling rate, I can
> just calculate the timestamp on the fly by the rate times the index
> (assuming no time offset). That would lose a float4 field, but would add
> back a smallint field to the table.

That won't buy you anything at all --- the two bytes saved would be lost
again to alignment padding.  (I'm assuming you're on PC hardware with
MAXALIGN = 4 bytes.)
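
The padding argument can be checked with a rough sketch. It assumes a simplified layout model: each column starts at a multiple of its own alignment, and the row's data area is rounded up to MAXALIGN. The table `TYPE_INFO`, the helper names, and the position of the replacement smallint column are illustrative assumptions, not PostgreSQL internals.

```python
# Simplified model of row data layout (an assumption for illustration,
# not PostgreSQL's actual tuple-forming code).
# Each entry maps a type name to (size in bytes, required alignment).
TYPE_INFO = {
    "integer": (4, 4),
    "smallint": (2, 2),
    "float4": (4, 4),
}


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def row_data_bytes(column_types, maxalign=4):
    """Return the padded size of the data area for one row."""
    offset = 0
    for type_name in column_types:
        size, alignment = TYPE_INFO[type_name]
        offset = align_up(offset, alignment)  # pad before the column if needed
        offset += size
    return align_up(offset, maxalign)  # pad the whole row to MAXALIGN


# Column order as in the CREATE TABLE above.
current = ["integer", "smallint", "smallint", "float4", "float4"]
# Hypothetical variant: dtimestamp (float4) swapped for a smallint index.
proposed = ["integer", "smallint", "smallint", "smallint", "float4"]

print(row_data_bytes(current))   # 16
print(row_data_bytes(proposed))  # 16
```

Under these assumptions both layouts occupy 16 bytes, so the 14 bytes of actual data in the proposed layout gain nothing once padding is applied.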

I don't see orders-of-magnitude bloat here though.  You've got 16 bytes
of useful data per row (which I suppose was 12 bytes in the flat file?).
There will be 28 bytes of overhead per table row.  In addition the index
will require 12 data bytes + 12 overhead bytes per entry; allowing for
the fact that b-tree only likes to pack pages about 2/3rds full, we could
estimate index size as about 36 bytes per original row, giving an
aggregate bloat factor of 6.67X compared to a binary flat file if the
flat file needed 12 bytes per row.
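
The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch of the calculation in the preceding paragraph; the function and parameter names are made up for illustration, and the byte counts are the ones quoted in the message, not values measured from a server.

```python
from fractions import Fraction


def bloat_estimate(flat_bytes=12, data_bytes=16, row_overhead=28,
                   index_data=12, index_overhead=12,
                   index_fill=Fraction(2, 3)):
    """Return (index bytes per row, total bytes per row, bloat factor)."""
    # An index entry is data plus overhead, spread over pages that are
    # only about index_fill full.
    index_per_row = (index_data + index_overhead) / index_fill
    total_per_row = data_bytes + row_overhead + index_per_row
    return index_per_row, total_per_row, total_per_row / flat_bytes


index_per_row, total_per_row, factor = bloat_estimate()
print(index_per_row)            # 36
print(total_per_row)            # 80
print(f"{float(factor):.2f}X")  # 6.67X
```

With these figures the table and index together come to 80 bytes per original row, which is where the 6.67X factor against a 12-byte flat-file record comes from.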

The only way I could see to get to a 65X bloat factor would be if you'd
repeatedly updated the table rows without vacuuming.

            regards, tom lane
