Thread: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Aleksey Tsalolikhin
Date:
 We're replicating a PostgreSQL 8.4.x database using Slony1-1.2.x

 The origin database "data/base" directory is 197 GB in size.

 The slave database "data/base" directory is 562 GB in size and is
 over 75% filesystem utilization which has set off the "disk free" siren.

 My biggest table* measures 154 GB on the origin, and 533 GB on
 the slave.  (*As reported by

 SELECT relname as "Table", pg_size_pretty(pg_total_relation_size(relid))
   As "Size" from pg_catalog.pg_statio_user_tables
   ORDER BY pg_total_relation_size(relid) DESC;
 )

 I took a peek at this table on the slave using pgadmin3. The table
 has auto-vacuum enabled, and TOAST autovacuum enabled.

 There are 8.6 million live tuples, and 1.5 million dead tuples.

 Last autovacuum was over a month ago.

 Last autoanalyze was 3 hours ago.

 Table size is 4 Gigs, and TOAST table size is 527 Gigs.
 Indexes size is 3 Gigs.
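
 (For reference, a rough 8.4-friendly query that splits heap, TOAST and
 index sizes per table; the split is approximate and uses only the
 built-in size functions:

 SELECT c.relname,
        pg_size_pretty(pg_relation_size(c.oid)) AS heap_size,
        pg_size_pretty(CASE WHEN c.reltoastrelid <> 0
                            THEN pg_total_relation_size(c.reltoastrelid)
                            ELSE 0 END) AS toast_size,
        pg_size_pretty(pg_total_relation_size(c.oid)
                       - pg_relation_size(c.oid)
                       - CASE WHEN c.reltoastrelid <> 0
                              THEN pg_total_relation_size(c.reltoastrelid)
                              ELSE 0 END) AS index_size
   FROM pg_class c
   JOIN pg_namespace n ON n.oid = c.relnamespace
  WHERE c.relkind = 'r'
    AND n.nspname NOT IN ('pg_catalog', 'information_schema')
  ORDER BY pg_total_relation_size(c.oid) DESC
  LIMIT 10;
 )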

 Autovacuum threshold is 20%, and the table is just under that threshold.

 I ran vacuum analyze verbose.  But the filesystem is still at 76%
utilization.
 In fact, now, the "data/base" directory has grown to 565 GB.

 Why is my slave bigger than my master?  How can I compact it, please?

Best,
Aleksey

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Aleksey Tsalolikhin
Date:
On Tue, Mar 6, 2012 at 7:05 PM, Aleksey Tsalolikhin
<atsaloli.tech@gmail.com> wrote:
>  We're replicating a PostgreSQL 8.4.x database using Slony1-1.2.x
>
>  The origin database "data/base" directory is 197 GB in size.
>
>  The slave database "data/base" directory is 562 GB in size and is
>  over 75% filesystem utilization which has set off the "disk free" siren.
>
>  My biggest table* measures 154 GB on the origin, and 533 GB on
>  the slave.  (*As reported by
>
>  SELECT relname as "Table", pg_size_pretty(pg_total_relation_size(relid))
>   As "Size" from pg_catalog.pg_statio_user_tables
>   ORDER BY pg_total_relation_size(relid) DESC;
>  )

I ran VACUUM FULL on this table, but it is still over 500 GB in size.
And growing...
I'm up to 77% utilization on the filesystem.

"check_postgres --action=bloat" now returns OK.  So it's not bloat.
What else could it be?

Best,
Aleksey

Re: Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
"Joshua D. Drake"
Date:
On 03/07/2012 06:27 PM, Aleksey Tsalolikhin wrote:

>>   SELECT relname as "Table", pg_size_pretty(pg_total_relation_size(relid))
>>    As "Size" from pg_catalog.pg_statio_user_tables
>>    ORDER BY pg_total_relation_size(relid) DESC;
>>   )
>
> I ran VACUUM FULL on this table, but it is still over 500 GB in size.
> And growing...
> I'm up to 77% utilization on the filesystem.
>
> "check_postgres --action=bloat" now returns OK.  So it's not bloat.
> What else could it be?

Try disabling replication on that table and clustering the table and
then re-enabling replication. I would have to double check but I think
check_postgres --action=bloat only checks for dead space, not usable
space, so you could actually still have bloat, just bloat that is usable.

Alternatively you could disable replication on that table, truncate the
table, and then re-enable replication for that table. A concern would be
that it is a large table regardless, which means you are going to
hold open a transaction to refill it.


JD


>
> Best,
> Aleksey
>


--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
The PostgreSQL Conference - http://www.postgresqlconference.org/
@cmdpromptinc - @postgresconf - 509-416-6579

Re: Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Aleksey Tsalolikhin
Date:
Dear Joshua,

  You wrote:

> Try disabling replication on that table and clustering the table and then re-enabling
> replication.
...
> Alternatively you could disable replication on that table, truncate the table, and then
> re-enable replication for that table. A concern would be that it is a large table
> regardless, which means you are going to hold open a transaction to refill it.

I don't see any way to disable replication on a table in Slony.  I do
see I can remove a table from the replication set, and then add it back
in.  Is that what you meant, or am I missing something?

I ask because I know when a table is added to a replication set, it is
copied over in full from origin to slave, and since this table is huge,
I'll need to schedule a maintenance window to minimize impact on
production.

Yours truly,
Aleksey

Re: Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Aleksey Tsalolikhin
Date:
On Wed, Mar 7, 2012 at 8:12 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
>
>> "check_postgres --action=bloat" returns OK [after VACUUM FULL].  So it's not bloat.
>> What else could it be?
>
> I would have to double check but I think
> check_postgres --action=bloat only checks for dead space, not usable space,
> so you could actually still have bloat, just bloat that is usable.

This is how check_postgres checks.  How would I check for usable bloat,
to confirm that that's what I am running into?  What is usable bloat,
anyway?  (Is there some write-up on it?)


SELECT
  current_database() AS db, schemaname, tablename,
  reltuples::bigint AS tups, relpages::bigint AS pages, otta,
  ROUND(CASE WHEN otta=0 OR sml.relpages=0 OR sml.relpages=otta THEN 0.0
             ELSE sml.relpages/otta::numeric END,1) AS tbloat,
  CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages,
  CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes,
  CASE WHEN relpages < otta THEN '0 bytes'::text
       ELSE (bs*(relpages-otta))::bigint || ' bytes' END AS wastedsize,
  iname, ituples::bigint AS itups, ipages::bigint AS ipages, iotta,
  ROUND(CASE WHEN iotta=0 OR ipages=0 OR ipages=iotta THEN 0.0
             ELSE ipages/iotta::numeric END,1) AS ibloat,
  CASE WHEN ipages < iotta THEN 0 ELSE ipages::bigint - iotta END AS wastedipages,
  CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes,
  CASE WHEN ipages < iotta THEN '0 bytes'
       ELSE (bs*(ipages-iotta))::bigint || ' bytes' END AS wastedisize,
  CASE WHEN relpages < otta THEN
         CASE WHEN ipages < iotta THEN 0 ELSE ipages-iotta::bigint END
       ELSE CASE WHEN ipages < iotta THEN relpages-otta::bigint
                 ELSE relpages-otta::bigint + ipages-iotta::bigint END
  END AS totalwastedbytes
FROM (
  SELECT
    schemaname, tablename, cc.reltuples, cc.relpages, bs,
    CEIL((cc.reltuples*((datahdr+ma-
      (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta,
    COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples,
    COALESCE(c2.relpages,0) AS ipages,
    COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols
  FROM (
    SELECT
      ma,bs,schemaname,tablename,
      (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
      (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
    FROM (
      SELECT
        schemaname, tablename, hdr, ma, bs,
        SUM((1-null_frac)*avg_width) AS datawidth,
        MAX(null_frac) AS maxfracsum,
        hdr+(
          SELECT 1+count(*)/8
          FROM pg_stats s2
          WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
        ) AS nullhdr
      FROM pg_stats s, (
        SELECT
          (SELECT current_setting('block_size')::numeric) AS bs,
          CASE WHEN SUBSTRING(SPLIT_PART(v, ' ', 2) FROM '#"[0-9]+.[0-9]+#"%' for '#')
            IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
          CASE WHEN v ~ 'mingw32' OR v ~ '64-bit' THEN 8 ELSE 4 END AS ma
        FROM (SELECT version() AS v) AS foo
      ) AS constants
      GROUP BY 1,2,3,4,5
    ) AS foo
  ) AS rs
  JOIN pg_class cc ON cc.relname = rs.tablename
  JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname
    AND nn.nspname <> 'information_schema'
  LEFT JOIN pg_index i ON indrelid = cc.oid
  LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid
) AS sml
WHERE sml.relpages - otta > 0 OR ipages - iotta > 10
ORDER BY totalwastedbytes DESC LIMIT 10

Yours,
Aleksey

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Stuart Bishop
Date:
On Wed, Mar 7, 2012 at 10:05 AM, Aleksey Tsalolikhin
<atsaloli.tech@gmail.com> wrote:
>  We're replicating a PostgreSQL 8.4.x database using Slony1-1.2.x
>
>  The origin database "data/base" directory is 197 GB in size.
>
>  The slave database "data/base" directory is 562 GB in size and is
>  over 75% filesystem utilization which has set off the "disk free" siren.
>
>  My biggest table* measures 154 GB on the origin, and 533 GB on
>  the slave.  (*As reported by
>
>  SELECT relname as "Table", pg_size_pretty(pg_total_relation_size(relid))
>   As "Size" from pg_catalog.pg_statio_user_tables
>   ORDER BY pg_total_relation_size(relid) DESC;
>  )
>
>  I took a peek at this table on the slave using pgadmin3. The table
>  has auto-vacuum enabled, and TOAST autovacuum enabled.
>
>  There are 8.6 million live tuples, and 1.5 million dead tuples.
>
>  Last autovacuum was over a month ago.
>
>  Last autoanalyze was 3 hours ago.
>
>  Table size is 4 Gigs, and TOAST table size is 527 Gigs.
>  Indexes size is 3 Gigs.
>
>  Autovacuum threshold is 20%, and the table is just under that threshold.
>
>  I ran vacuum analyze verbose.  But the filesystem is still at 76%
> utilization.
>  In fact, now, the "data/base" directory has grown to 565 GB.
>
>  Why is my slave bigger than my master?  How can I compact it, please?

Do you have a long running transaction on the slave? vacuum will not
reuse space that was freed after the longest running transaction.
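
(A quick way to check for that on the slave, on 8.4; any session with an
old xact_start, including an idle-in-transaction one, is a suspect:

SELECT procpid, usename, xact_start, current_query
  FROM pg_stat_activity
 WHERE xact_start IS NOT NULL
 ORDER BY xact_start
 LIMIT 10;
)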

You need to use the CLUSTER command to compact it, or VACUUM FULL
followed by a REINDEX if you don't have enough disk space to run
CLUSTER. And neither of these will do anything if the space is still
live because some old transaction might still need to access the old
tuples.


--
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Aleksey Tsalolikhin
Date:
On Wed, Mar 7, 2012 at 10:05 AM, Aleksey Tsalolikhin
<atsaloli.tech@gmail.com> wrote:
>  We're replicating a PostgreSQL 8.4.x database using Slony1-1.2.x
>
>
>  My biggest table measures 154 GB on the origin, and 533 GB on
>  the slave.
>
>  Why is my slave bigger than my master?  How can I compact it, please?


On Wed, Mar 7, 2012 at 11:54 PM, Stuart Bishop
<stuart@stuartbishop.net> wrote back:
>
> Do you have a long running transaction on the slave? vacuum will not
> reuse space that was freed after the longest running transaction.
>
> You need to use the CLUSTER command to compact it, or VACUUM FULL
> followed by a REINDEX if you don't have enough disk space to run
> CLUSTER. And neither of these will do anything if the space is still
> live because some old transaction might still need to access the old
> tuples.

Dear Stuart,

  We do not run any transactions on the slave besides a pg_dump of the
entire database every 3 hours.  I don't have enough disk space to CLUSTER
the table; I ran VACUUM FULL yesterday, and I just fired up a REINDEX
TABLE.

  I'd love to get some insight into how much logical data I have versus how
much physical space it is taking up.  Is there some admin tool or command
or query that will report that?  For each table (and index), I'd like
to know how
much data is in that object (logical data size) and how much space it is taking
up on disk (physical data size).

Yours,
Aleksey

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Scott Marlowe
Date:
On Thu, Mar 8, 2012 at 12:10 PM, Aleksey Tsalolikhin
<atsaloli.tech@gmail.com> wrote:
> Dear Stuart,
>
>  We do not run any transactions on the slave besides a pg_dump of the
> entire database every 3 hours.  I don't have enough disk space to CLUSTER
> the table; I ran VACUUM FULL yesterday, and I just fired up a REINDEX
> TABLE.
>
>  I'd love to get some insight into how much logical data I have versus how
> much physical space it is taking up.  Is there some admin tool or command
> or query that will report that?  For each table (and index), I'd like
> to know how
> much data is in that object (logical data size) and how much space it is taking
> up on disk (physical data size).

Do you do things like truncate on the master?  Cause truncates don't
get replicated in slony.

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Aleksey Tsalolikhin
Date:

On Thu, Mar 8, 2012 at 11:58 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> Do you do things like truncate on the master?  Cause truncates don't
> get replicated in slony.


Dear Scott,

  No, we do not truncate this table on the master.  We only add to it.

  The REINDEX TABLE completed and the table is still swollen.

Yours,
Aleksey

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Scott Marlowe
Date:
On Thu, Mar 8, 2012 at 1:10 PM, Aleksey Tsalolikhin
<atsaloli.tech@gmail.com> wrote:
> On Thu, Mar 8, 2012 at 11:58 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>> Do you do things like truncate on the master?  Cause truncates don't
>> get replicated in slony.
>
>
> Dear Scott,
>
>  No, we do not truncate this table on the master.  We only add to it.
>
>  The REINDEX TABLE completed and the table is still swollen.

If you pg_dump -t tablename from each machine, are the backups about
the same size?

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Aleksey Tsalolikhin
Date:
Dear Scott,

  When I pg_dump -t bigtablename on the Slony slave, the dump file is
212G in size.

  I am unable to perform the same test on the master until I get a
maintenance window, which may not be for a few weeks, as it does
impact our production system when we dump from the master (the web app
gets noticeably slower).

  I compare this 212 GB size (logical size) with the 550 GB reported
size (physical size on disk), which corresponds to / aligns with "du
-sh" output, 550 GB.

  I remember now I had a similar issue about a year and a half ago,
when we had a jump in database size, also with replication involved,
and one site became bigger than the other.  I talked to Bruce M. about
it at some conference and he suggested looking at his site, where he
has an explanation of how Postgres stores data at a low level, like the
structure of a page and so on.  Unfortunately I was unable to carve out
the time to drill into it then, and just continued running with the
larger database size...  so now this issue is coming back to haunt me,
even bigger now.

  The size had doubled earlier, and now it has nearly tripled.

  I'm afraid the easiest (quickest) solution will be for me to destroy
the RAID 1E array and rebuild it as a RAID 5 array, which would give
me a bigger filesystem, buying me time to study up on what Bruce
suggested, or else to hire a professional Postgres consultant (if
$WORK ever coughs up the money).

  Our resident Oracle DBA expert (since we don't have a Postgres one)
suggested I try truncating the table on the slave (with replication
down) and then restoring it from this pg_dump, just to see if the size
of the new table will be 200 GB or 500 GB.  If 200, we're home free;
if 500, we need to continue to investigate.

  In the meantime, I owe you the size of the bigtable from
production...   if anybody has any other suggestions, I am all ears.

Yours very truly,
Aleksey


On 3/8/12, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> If you pg_dump -t tablename from each machine, are the backups about
> the same size?
>

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Nur Hidayat
Date:
Hai Aleksey,

I once had the same problem. In my case it's because most of my table uses the text datatype.
When I changed the field type to character varying (1000), the database size reduced significantly.

Unfortunately, I haven't investigated more, but it looks like it comes down to how Postgres stores data.

Regards,
Nur Hidayat






Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
John R Pierce
Date:
On 03/12/12 12:06 AM, Nur Hidayat wrote:
>
> I once had the same problem. In my case it's because most of my table
> uses the text datatype.
> When I changed the field type to character varying (1000), the database
> size reduced significantly.
>
> Unfortunately, I haven't investigated more, but it looks like it comes
> down to how Postgres stores data.

That doesn't make any sense.  text and character varying storage is
exactly the same; the only difference is that varchar has an optional
length constraint.



--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast


Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
"Nur Hidayat"
Date:
Yes, I am aware of that, but that's the fact I'm facing.
Right now I'm happy enough that my system runs well without eating up my drive :)
I'll investigate more later when time is available :)

Cheers,
Nur Hidayat
 


.::.
Sent from my BlackBerry®
powered by The ESQ Way 165


Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
"Nur Hidayat"
Date:
FYI, after I changed the text field into character varying, I vacuumed the whole database, resulting in a much smaller
database size.
 

Cheers,
NH



.::.
Sent from my BlackBerry®
powered by The ESQ Way 165


Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Alban Hertroys
Date:
On 12 March 2012 09:20, Nur Hidayat <hidayat365@gmail.com> wrote:
> FYI, after I changed the text field into character varying, I vacuumed the whole database, resulting in a much
> smaller database size.

What I think happened in your case is that, because of the data-type
change, every row in the table got rewritten to a new version where said
column was of the new type. The subsequent vacuum then removed the old
(bloated) rows with the old type from the database file.

And thus you ended up with a clean table.




--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
"Nur Hidayat"
Date:
If I didn't change the data type to character varying, vacuuming the database doesn't reduce the database size.



.::.
Sent from my BlackBerry®
powered by The ESQ Way 165


Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
John R Pierce
Date:
On 03/12/12 2:28 PM, Nur Hidayat wrote:
> If I didn't change the data type to character varying, vacuuming the database doesn't reduce the database size.

Changing the data type required every tuple to get rewritten.  A
vacuum full, or a cluster, likely would have done the same or better
reduction in size.



--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast


Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Vick Khera
Date:
On Mon, Mar 12, 2012 at 3:06 AM, Nur Hidayat <hidayat365@gmail.com> wrote:
> I once had the same problem. In my case it's because most of my table uses
> the text datatype.
> When I changed the field type to character varying (1000), the database size
> reduced significantly.

I'll bet what happened was postgres re-wrote your table for you,
effectively doing a compaction.  You can get a similar effect by doing
an alter table and "changing" an INTEGER field to be INTEGER.
Postgres does not optimize that to a no-op, so you get the re-writing
effect.
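
(For example; the table and column names here are made up, and on 8.4 this
takes an exclusive lock and rewrites the whole table while it runs:

ALTER TABLE bigtable ALTER COLUMN some_integer_column TYPE INTEGER;
)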

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Aleksey Tsalolikhin
Date:
On Tue, Mar 13, 2012 at 5:14 AM, Vick Khera <vivek@khera.org> wrote:
> I'll bet what happened was postgres re-wrote your table for you,
> effectively doing a compaction.  You can get a similar effect by doing
> an alter table and "changing" an INTEGER field to be INTEGER.
> Postgres does not optimize that to a no-op, so you get the re-writing
> effect.

How does table rewriting work?  Does it happen a row at a time or all at once?

In other words, how much free disk space is needed on an 800 TB filesystem
to rewrite a 550 TB table?  (Have I got enough space?)

Aleksey

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Scott Marlowe
Date:
On Wed, Mar 14, 2012 at 8:24 PM, Aleksey Tsalolikhin
<atsaloli.tech@gmail.com> wrote:
> On Tue, Mar 13, 2012 at 5:14 AM, Vick Khera <vivek@khera.org> wrote:
>> I'll bet what happened was postgres re-wrote your table for you,
>> effectively doing a compaction.  You can get a similar effect by doing
>> an alter table and "changing" an INTEGER field to be INTEGER.
>> Postgres does not optimize that to a no-op, so you get the re-writing
>> effect.
>
> How does table rewriting work?  Does it happen a row at a time or all at once?

When you do something like an alter type or an update without a where
clause, it will have to make a new copy of every existing row in the
table.

> In other words, how much free disk space is needed on an 800 TB filesystem
> to rewrite a 550 TB table?  (Have I got enough space?)

If you update the whole table at once (a la alter table alter column
or update with no where) then it has to have the space for all the
real data to fit.  The size of the file system isn't as important as
how much free space is left.  IFF it's 800TB with exactly 550TB
used, then you have 250TB free.  The good news is that if the table is
bloated, it should be able to just write to the free space in the
table that's already there.  This requires proper vacuuming and on
older versions free space map settings.  With a really large table,
older versions of pgsql (<=8.3) tended to blow out their free space
map if you didn't crank them up.  Newer versions just need proper
regular vacuuming.

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Aleksey Tsalolikhin
Date:
Executive summary:  Why would the TOAST table on the slave have 4x the
page count of the master?  Is there a way to compact it if I don't
have enough disk space to duplicate the table?  How do I prevent this
situation from recurring?

On Wed, Mar 14, 2012 at 7:38 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>
> The good news is that if the table is
> bloated, it should be able to just write to the free space in the
> table that's already there.

Thank you, I got it.  The table is not bloated, as per
check_postgres.pl --action=bloat

I compared number of pages on the large table between the Slony origin
node and the slave, using "SELECT relname, relpages FROM pg_class"

origin:  386,918
slave: 421,235

origin toast: 19,211,059
slave toast: 70,035,481

How about that?

reltuples on origin toast table:  6.76368e+07
reltuples on slave toast table:  7.00354e+07
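
(For the record, a single query along these lines pulls the heap and TOAST
figures together on each node; the table name is a placeholder:

SELECT c.relname, c.relpages, c.reltuples,
       t.relname AS toast_table, t.relpages AS toast_pages,
       t.reltuples AS toast_tuples
  FROM pg_class c
  JOIN pg_class t ON t.oid = c.reltoastrelid
 WHERE c.relname = 'bigtable';
)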

Here is my situation:

1. I've got an 800 GB (not TB as I wrote earlier today) filesystem.

2. 595 GB is in use by the Postgres database, and over 527 GB by this
large table (including TOAST)

3. 155 GB is free.

4. pg_dump of this large table comes out to 212 GB in size (so
physical size is nearly 3x the logical size)

Why does the slave TOAST table have 4x the page count of the master?

Best,
Aleksey

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Aleksey Tsalolikhin
Date:
On Wed, Mar 14, 2012 at 8:06 PM, Aleksey Tsalolikhin
<atsaloli.tech@gmail.com> wrote:
> Executive summary:  Why would the TOAST table on the slave have 4x the
> page count of the master?

  Hypothesis:  the pages of the TOAST table on the slave have more
unused space in them than the pages of the TOAST table on the master.

  How do I test this hypothesis?  I need to examine the raw pages on disk.
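
  (One way to test it without reading raw pages, assuming the pgstattuple
contrib module is installed, is to let it report live, dead and free space
for the relation; the TOAST table name below is a placeholder, and the
function scans the whole relation, so it will take a while on a 527 GB
table:

SELECT tuple_percent, dead_tuple_percent, free_percent
  FROM pgstattuple('pg_toast.pg_toast_NNNNN');
)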

  I'm looking at
http://momjian.us/main/writings/pgsql/internalpics.pdf   Slide 52
shows "internal page structure"  but I don't know how to actually look
at a page.
 Is it possible to do a raw dump of the pages of a table?  Do I just
look at the hexdump of the file corresponding to relfilenode?
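
 (If the pageinspect contrib module is available, it can decode a block
from SQL rather than from a hexdump; block 0 of a placeholder TOAST table
is shown here:

SELECT lp, lp_off, lp_len, t_xmin, t_xmax, t_ctid
  FROM heap_page_items(get_raw_page('pg_toast.pg_toast_NNNNN', 0));
)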

Best,
Aleksey

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Scott Marlowe
Date:
On Wed, Mar 14, 2012 at 9:06 PM, Aleksey Tsalolikhin
<atsaloli.tech@gmail.com> wrote:
>
> On Wed, Mar 14, 2012 at 7:38 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>>
>> The good news is that if the table is
>> bloated, it should be able to just write to the free space in the
>> table that's already there.
>
> Thank you, I got it.  The table is not bloated, as per
> check_postgres.pl --action=bloat

Are you sure you're checking the toast table that goes with whatever
parent table?

Easy way to tell. du -s /var/lib/data/base dir, then update a few
thousand rows, roll it back, and run du -s again.  Compare.  If the du
numbers stay the same then you're updating pre-allocated space and
should be ok.  If there's a delta, compute it per tuple updated,
multiply by tuples and that's how much you'll need.

If the du -s numbers don't change, or change only a little, then feel
free to either run a single update while running

watch "df -h /var/lib/where/my/data/dir/lives"

and be ready to hit CTRL-C if you see it running your machine out of
disk space.

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Scott Marlowe
Date:
On Wed, Mar 14, 2012 at 10:57 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> If the du -s numbers don't change, or change only a little, then feel
> free to either run a single update while running
>
> watch "df -h /var/lib/where/my/data/dir/lives"
>
> and be ready to hit CTRL-C if you see it running your machine out of
> disk space.

OR batch them in batches of 10,000 or so and df or du in between.

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Aleksey Tsalolikhin
Date:
On Wed, Mar 14, 2012 at 9:57 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> Are you sure you're checking the toast table that goes with whatever
> parent table?

Yep.   I find out the relation id of the TOAST table:

"select reltoastrelid from pg_class where relname = 'parent_table_name';"

Find out the relation name of the TOAST table:

"select X::regclass;"  (where X is what I got from step 1 above)

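The two steps can also be collapsed into one, e.g.:

"select reltoastrelid::regclass from pg_class where relname = 'parent_table_name';"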

> Easy way to tell. du -s /var/lib/data/base dir, then update a few
> thousand rows, roll it back, and run du -s again.  Compare.  If the du
> numbers stay the same then you're updating pre-allocated space and
> should be ok.

I don't think I can run this test while the system is in production...
we do 250-300 database transactions per second under low load... high
load takes us above 1500 tps.  My numbers for "du -sh data/base" would
be affected by the production workload.

Let me ask you this:  I've been looking at "select ctid from
big_table" on the master and slave and I notice that pages can have
"holes" in them.  Some pages have rows that go sequentially from 0 to
26 or 27 or so, and some pages have rows that go:

 (431665,2)
 (431665,5)
 (431665,8)
 (431665,11)
 (431665,14)
 (431665,17)
 (431665,20)
 (431665,23)

That's the last page on the slave.  It has  only 8 rows in it.  It's
composed mostly of holes.  That aligns with my hypothesis that pages
on the slave have less data in them than pages on the master.  (Which
would explain why the slave has 4x the number of pages compared to the
master.)
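
A rough way to put a number on rows per page on each node, using only
pg_class (the relation name is a placeholder):

 SELECT relname, reltuples::bigint AS tuples, relpages,
        (reltuples / NULLIF(relpages, 0))::numeric(10,1) AS avg_tuples_per_page
   FROM pg_class
  WHERE relname = 'pg_toast_NNNNN';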

Is there any way to consolidate the pages on the slave without taking
replication offline?

Best,
Aleksey

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Filip Rembiałkowski
Date:
On Thu, Mar 15, 2012 at 6:43 AM, Aleksey Tsalolikhin <atsaloli.tech@gmail.com> wrote:

Is there any way to consolidate the pages on the slave without taking
replication offline?

maybe CLUSTER?

filip@dev=> create table foobar (id serial primary key, load text);
CREATE TABLE
filip@dev=> insert into foobar(load)select md5(random()::text) from generate_series(1,100000);
INSERT 0 100000
filip@dev=> delete from foobar where id%4<>3; -- delete 75% of the table
DELETE 75000
filip@dev=> select pg_relation_size('foobar');
 pg_relation_size
------------------
          6832128
(1 row)

filip@dev=> CLUSTER foobar using foobar_pkey;
CLUSTER
filip@dev=> select pg_relation_size('foobar');
 pg_relation_size
------------------
          1712128
(1 row)

Of course events destined to this table will be queued by Slony while the table is locked.


Filip



Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Greg Williamson
Date:
Filip Rembiałkowski suggested:


>> On Thu, Mar 15, 2012 at 6:43 AM, Aleksey Tsalolikhin <atsaloli.tech@gmail.com> wrote:
>>
>>
>> Is there any way to consolidate the pages on the slave without taking
>> replication offline?
>>
>>
>maybe CLUSTER?
>
<...>
>
>Of course events destined to this table will be queued by Slony while the table is locked.
>

I've not much recent experience with Slony, but possibly pg_reorg, found at:

  <http://pgfoundry.org/projects/reorg/>

would be of use ...

Greg Williamson

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Aleksey Tsalolikhin
Date:
On Thu, Mar 15, 2012 at 6:43 AM, Aleksey Tsalolikhin
<atsaloli.tech@gmail.com> wrote:
> Is there any way to consolidate the pages on the slave without taking
> replication offline?

Filip Rembiałkowski suggested:   maybe CLUSTER?

Greg Williamson suggested: pg_reorg


Thank you, Filip and Greg.  They would both work IF I had enough free
space on the slave,
which, sadly, I do not. :(

CLUSTER requires free space at least equal to the sum of the table
size and the index sizes.

pg_reorg requires an amount of space twice as large as the target table and indexes.

Too bad I can't say "CLUSTER TABLE tablename USING_ARRAY
/dev/temp/array" or something
like that, using an external array for temporary storage just for the
cluster.  I do have an external
USB drive with more than enough free space on it.

I've got a maintenance window scheduled for tomorrow night to get rid
of the holes in the pages on the
slave.  My plan is to shut down the application, destroy the
replication set, re-create it, and start
replication, which will do a full sync.  It's a little overkill but I
have this procedure documented
and I've done it before.

I expect that after the table is TRUNCATE'd on the Slave and COPY'ied over from
the Master, we won't have holes in the pages....   I sure hope so!!

Our database is about 200 GB - over a WAN link, last time it took 8
hours to do a full sync, I expect it'll be
more like 9 or 10 hours this time.

I still don't know where these holes came from... and what is
responsible.  Our Postgres 8.4.9 or
Slony 1.2.21 or some combination of the two?  We don't delete rows
from that table AFAIK so I'm
really confused why the page contents would be so sparse...  and since
I'm not fixing the root cause
I'm sure the problem will come back.   But at least this keeps my
slave operational (which it won't be
once the filesystem hits 100%)

Thanks for all the help and suggestions so far. I really appreciate it!

Any more pointers would be welcome...

Too bad there is no way to compact the rows/pages within the table
without using a temporary table.  That would be the silver bullet for my
particular predicament.

Yours,
Aleksey

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Steve Crawford
Date:
I'm coming into this conversation *way* late so forgive me if this has
been hashed out already.

On 03/16/2012 12:20 PM, Aleksey Tsalolikhin wrote:
> CLUSTER requires free space at least equal to the sum of the table size
> and the index sizes.
>

Although it is not documented in an absolutely clear way, I do not
believe this is true, in part due to the different cluster options and
the lack of specificity on whether "table size" refers to the on-disk
size of the new or the old table.

I *think* you can get away with only sufficient free space to store the
*new* table and indexes which, on a heavily bloated table, may be
significantly less than the space required for a full copy of the
bloated table/indexes.

Depending on your schema and which tables are using space, you might be
able to start clustering the smaller tables and progressively free
additional space that can be used when clustering the larger tables.

> Too bad there is no way to compact the rows/pages within the table
> without using a temporary table.  That would be the silver bullet for my
> particular predicament.
Isn't that "VACUUM FULL"? Much slower than CLUSTER and can cause index
bloat (solvable by subsequent reindexing) but does not create a
duplicate of the table.
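
In other words, something along these lines (the table name is a
placeholder):

  VACUUM FULL VERBOSE bigtable;
  REINDEX TABLE bigtable;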

Cheers,
Steve


Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Alban Hertroys
Date:
On 16 Mar 2012, at 20:20, Aleksey Tsalolikhin wrote:

> CLUSTER requires free space at least equal to the sum of the table
> size and the index sizes.
>
> pg_reorg requires an amount of space twice as large as the target table and indexes.
>
> Too bad I can't say "CLUSTER TABLE tablename USING_ARRAY
> /dev/temp/array" or something
> like that, using an external array for temporary storage just for the
> cluster.  I do have an external
> USB drive with more than enough free space on it.


You could move the table to a different TABLESPACE, namely one on that USB disk. I'm not familiar with the intrinsics
of tablespaces, but it's possible that moving it would trigger a rewrite of the contents such that the effect would be
similar to what CLUSTER would do for you - it probably just moves the files though, in which case you'd perform the
CLUSTER on the new TABLESPACE and then move it back.
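
A sketch of that, with placeholder names and paths. Note that ALTER TABLE ... SET TABLESPACE copies the table's files
block-for-block, so the compaction itself would still have to come from the CLUSTER step, and I'm not certain whether
the TOAST relation follows the parent automatically:

  CREATE TABLESPACE usbspace LOCATION '/mnt/usb/pgtemp';
  ALTER TABLE bigtable SET TABLESPACE usbspace;    -- copy the table's files to the USB disk
  CLUSTER bigtable USING bigtable_pkey;            -- rewrite and compact it there
  ALTER TABLE bigtable SET TABLESPACE pg_default;  -- copy the compacted table back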

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.


Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Scott Marlowe
Date:
On Fri, Mar 16, 2012 at 1:20 PM, Aleksey Tsalolikhin
<atsaloli.tech@gmail.com> wrote:
> On Thu, Mar 15, 2012 at 6:43 AM, Aleksey Tsalolikhin
> <atsaloli.tech@gmail.com> wrote:
>> Is there any way to consolidate the pages on the slave without taking
>> replication offline?
>
> Filip Rembiałkowski suggested:   maybe CLUSTER?
>
> Greg Williamson suggested: pg_reorg
>
>
> Thank you, Filip and Greg.  They would both work IF I had enough free
> space on the slave,
> which, sadly, I do not. :(
>
> CLUSTER requires free space at least equal to the sum of the table
> size and the index sizes.
>
> pg_reorg requires an amount of space twice as large as the target table and indexes.
>
> Too bad I can't say "CLUSTER TABLE tablename USING_ARRAY
> /dev/temp/array" or something
> like that, using an external array for temporary storage just for the
> cluster.  I do have an external
> USB drive with more than enough free space on it.
>
> I've got a maintenance window scheduled for tomorrow night to get rid
> of the holes in the pages on the
> slave.  My plan is to shut down the application, destroy the
> replication set, re-create it, and start
> replication, which will do a full sync.  It's a little overkill but I
> have this procedure documented
> and I've done it before.

If you've got other big tables in the set, you could put that one
table into its own set, then drop that set and resubscribe a new set
with just that table in it, then merge the sets.

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Aleksey Tsalolikhin
Date:
On Fri, Mar 16, 2012 at 2:03 PM, Steve Crawford
<scrawford@pinpointresearch.com> wrote:
>
>
> I *think* you can get away with only sufficient free space to store the
> *new* table and indexes

Yeah; I don't have that much free space.  Just 30 GB short.  :(


> Depending on your schema and which tables are using space, you might be able
> to start clustering the smaller tables and progressively free additional
> space that can be used when clustering the larger tables.

Clever idea, thank you.  Unfortunately the usage is 95%+ in this one big table.


>> Too bad there is no way to compact the rows/pages within the table
>> without using a temporary table.  That would be the silver bullet for my
>> particular predicament.
>
> Isn't that "VACUUM FULL"? Much slower than CLUSTER and can cause index bloat
> (solvable by subsequent reindexing) but does not create a duplicate of the
> table.

I *did* run "VACUUM FULL" on the table, but I still have holes in my pages.
(i.e. examining a page using "select ctid" shows me row 2, 8, 15, 20, etc.
not 1 2 3 4 ...)

Is this a problem with "VACUUM FULL" or with my understanding?

Thank you, Scott, for the suggestion to move the big table to a separate
Slony replication set; and to Alban for the suggestion of moving it to a
different TABLESPACE.  That is especially interesting, and I think exactly
what I was looking for (rewrite the table using an external storage
device).  Unfortunately I am out of time to resolve this "live"; we'll
take things down tomorrow night (I only get Saturday nights for
maintenance, and not too often, either) and just re-sync.

This has been extremely educational; thank you!

And I still don't know why I have holes in my pages.......  should I have holes
in my pages??  why only on the slave and not on the master?  (the slave has
4x the number of pages compared to the master, and there's less rows per
page on the slave.)

Yours very truly,
Aleksey

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
dennis jenkins
Date:
On Fri, Mar 16, 2012 at 2:20 PM, Aleksey Tsalolikhin
<atsaloli.tech@gmail.com> wrote:
> On Thu, Mar 15, 2012 at 6:43 AM, Aleksey Tsalolikhin
> <atsaloli.tech@gmail.com> wrote:

> Our database is about 200 GB - over a WAN link, last time it took 8
> hours to do a full sync, I expect it'll be
> more like 9 or 10 hours this time.
>

Aleksey, a suggestion:  The vast majority of the postgresql wire
protocol compresses well.  If your WAN link is not already compressed,
construct a compressed SSH tunnel for the postgresql TCP port in the
WAN link.  I've done this when rebuilding a 300GB database (via slony)
over a bandwidth-limited (2MB/s) VPN link and it cut the replication
resync time down significantly.

Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Alban Hertroys
Date:
On 17 Mar 2012, at 4:21, Aleksey Tsalolikhin wrote:

> And I still don't know why I have holes in my pages.......  should I have holes
> in my pages??  why only on the slave and not on the master?  (the slave has
> 4x the number of pages compared to the master, and there's less rows per
> page on the slave.)


I recall someone in here had a similar problem where the table's FILLFACTOR on the slave was different from the one on
the master. Perhaps that would explain the gaps you're seeing?

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.


Re: A 154 GB table swelled to 527 GB on the Slony slave. How to compact it?

From
Mark Felder
Date:
On Sat, 17 Mar 2012 10:46:00 -0500, dennis jenkins
<dennis.jenkins.75@gmail.com> wrote:
>
> Aleksey, a suggestion:  The vast majority of the postgresql wire
> protocol compresses well.  If your WAN link is not already compressed,
> construct a compressed SSH tunnel for the postgresql TCP port in the
> WAN link.  I've done this when rebuilding a 300GB database (via slony)
> over a bandwidth-limited (2MB/s) VPN link and it cut the replication
> resync time down significantly.
>

SSH with the HPN patchset[1] would help as well if the link has higher
latency or if you're CPU-limited, as it can then use multiple threads. It
works wonderfully for me on a 35 Mbit link. If you have a lower-bandwidth
link that wouldn't benefit from the HPN patchset anyway, it may be worth
forcing Blowfish instead of AES to keep the CPU load lower.

Hope that helps!


[1] http://www.psc.edu/networking/projects/hpn-ssh/

FYI, the HPN patchset is included in the base OpenSSH of FreeBSD 9 now.