Re: large table - Mailing list pgsql-general

From Bill Moran
Subject Re: large table
Date
Msg-id 20140922143852.e7bf5228d618f2f6e4ee1f3a@potentialtech.com
Whole thread Raw
In response to large table  (Luke Coldiron <lukecoldiron@hotmail.com>)
Responses Re: large table
List pgsql-general
On Mon, 22 Sep 2014 11:17:05 -0700
Luke Coldiron <lukecoldiron@hotmail.com> wrote:

> I am trying to figure out what would have caused a table in a PostgreSQL 8.4.16 to get into a state where there is
only1 live tuple and has only ever had one 1 tuple but the size of the table is huge. 
>
> CREATE TABLE public.myTable(  myColumn timestamp with time zone NOT NULL);
>
> Note: there is no primary key or index on this table.
> CREATE OR REPLACE FUNCTION public.myFunc()RETURNS VOID AS $$BEGIN   UPDATE public.myTable SET myColumn =
CLOCK_TIMESTAMP();
>    IF NOT FOUND THEN      INSERT INTO public.myTable(myColumn) VALUES (CLOCK_TIMESTAMP());   END IF;END;$$ LANGUAGE
plpgsqlVOLATILE STRICT SECURITY DEFINER; 
> CREATE OR REPLACE FUNCTION public.wasUpdated(OUT was_updated boolean)RETURNS BOOLEAN AS $$BEGIN   was_updated :=
COALESCE((SELECTmyColumn FROM public.myTable) > (CLOCK_TIMESTAMP() - INTERVAL '5 SECOND'),
                      FALSE);END;$$ LANGUAGE plpgsql VOLATILE STRICT SECURITY DEFINER; 
> SELECT *FROM pg_stat_all_tablesWHERE relname = 'myTable';

[snip]

> The actual size of the table is around 33 MB.
> The myFunc function is called every 2.5 seconds and the wasUpdated function every 2 seconds by separate processes.
> I realize that running a FULL VACUUM or CLUSTER command on the table will resolve the issue but I am more interested
ina root cause that explains why this table would end up in this state. I have tried to reproduce this issue by running
theexact setup and have not been able to get the table to grow like this example. Any plausible cause'es or
explanationswould be much appreciated. 

The UPDATE in myFunc() creates a new row every 2.5 seconds when it updates the row.
The data from those rows is only reclaimed when a vacuum is run.  So (for example)
if autovacuum only triggers a vacuum every 250 seconds, there will be 249 rows worth
of space in the table, on average.  With the other process querying the table, it's
possible that the row that it's looking at will be a something that _should_ be
reclaimable, so vacuum may not clear up all the free space.

As far as running the exact setup: if you're not getting the same results, then
your setup isn't exactly the same.  It's likely that there are things going on in the
setup you're curious about that you're not aware of, such as additional queries on
the table, additional load that causes operations to take a little longer, thus
resulting in different overlap of competing operations, etc.

Keep in mind that a short-lived incident might have resulted in table bloat that
won't be reclaimed by autovacuum.  I.e., if autovacuum wasn't running for a while,
this table would just keep bloating; then when you start autovacuum, it will
maintain the table size, but it won't get any smaller.

I can't make any sense of the data you provided, it's all on seperate rows and I've
given up on trying to figure out what number goes with which value, so I don't know
exactly what the situation is.  It's likely that you can improve on the situation
by tweaking the autovacuum settings for this table to vacuum it more aggressively.

Although, you don't seem to have a _problem_ that you've stated.  Are you seeing
performance issues?  Is 33M too much data and filling up the drive (not being
sarcastic here, as there are various mobile applications where 33M could be
important, even now).  Because, if this isn't actually causing any problems, I
wouldn't really worry about it.

--
Bill Moran
I need your help to succeed:
http://gamesbybill.com


pgsql-general by date:

Previous
From: Alan Hodgson
Date:
Subject: Re: large table
Next
From: Luke Coldiron
Date:
Subject: Re: large table