Thread: Dead rows not getting removed during vacuum

Dead rows not getting removed during vacuum

From
Granthana Biswas
Date:
Hello All,

Has anyone ever faced the issue of dead rows not getting removed during vacuum even if there are no open transactions/connections?

We have been facing this during every scheduled vacuum which is done after closing all other database connections:

119278 dead row versions cannot be removed yet.
The number of dead rows varies with each table.

Regards,
Granthana

Re: Dead rows not getting removed during vacuum

From
Venkata Balaji Nagothi
Date:
On Thu, Mar 20, 2014 at 5:27 AM, Granthana Biswas <granthana@zedo.com> wrote:

Hello All,

Has anyone ever faced the issue of dead rows not getting removed during vacuum even if there are no open transactions/connections?

We have been facing this during every scheduled vacuum which is done after closing all other database connections:

119278 dead row versions cannot be removed yet.
This means there are transactions in progress when the vacuum was running. Do you see any such situation ?

Please let us know, which version of PostgreSQL is this.

Venkata Balaji N

Sr. Database Administrator
Fujitsu Australia

Re: Dead rows not getting removed during vacuum

From
Sergey Konoplev
Date:
On Thu, Mar 20, 2014 at 5:27 AM, Granthana Biswas <granthana@zedo.com> wrote:
> Has anyone ever faced the issue of dead rows not getting removed during
> vacuum even if there are no open transactions/connections?

What does the pg_prepared_xacts view show?

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com


Re: Dead rows not getting removed during vacuum

From
Haribabu Kommi
Date:
On Thu, Mar 20, 2014 at 11:27 PM, Granthana Biswas <granthana@zedo.com> wrote:
> Hello All,
>
> Has anyone ever faced the issue of dead rows not getting removed during
> vacuum even if there are no open transactions/connections?
>
> We have been facing this during every scheduled vacuum which is done after
> closing all other database connections:
>
> 119278 dead row versions cannot be removed yet.

These are the dead tuples which occurred after the vacuum operation is
started. These tuples may be visible to the other transactions,
because of this reason It cannot remove these tuples in this vacuum
scan. In the next vacuum these will be cleaned. You can observe the
same in the next vacuum.

Regards,
Hari Babu
Fujitsu Australia


Re: Dead rows not getting removed during vacuum

From
Granthana Biswas
Date:
Hi Venkata,

No there are no other connections to the DB during vacuum.

Regards,
Granthana


On Fri, Mar 21, 2014 at 3:12 AM, Venkata Balaji Nagothi <vbnpgc@gmail.com> wrote:
On Thu, Mar 20, 2014 at 5:27 AM, Granthana Biswas <granthana@zedo.com> wrote:

Hello All,

Has anyone ever faced the issue of dead rows not getting removed during vacuum even if there are no open transactions/connections?

We have been facing this during every scheduled vacuum which is done after closing all other database connections:

119278 dead row versions cannot be removed yet.
This means there are transactions in progress when the vacuum was running. Do you see any such situation ?

Please let us know, which version of PostgreSQL is this.

Venkata Balaji N

Sr. Database Administrator
Fujitsu Australia


Re: Dead rows not getting removed during vacuum

From
Granthana Biswas
Date:
Hi Sergey,

We don't have two-phase commit prepared transactions. During vacuum, along with this problem we are also facing streaming replication issue. Hundreds of xlogs are in ready state in pg_xlog/archive_status. Those are being copied manually after vacuum otherwise rsync happens very slowly and replication lag keeps increasing.


Regards,
Granthana


On Fri, Mar 21, 2014 at 3:36 AM, Sergey Konoplev <gray.ru@gmail.com> wrote:
On Thu, Mar 20, 2014 at 5:27 AM, Granthana Biswas <granthana@zedo.com> wrote:
> Has anyone ever faced the issue of dead rows not getting removed during
> vacuum even if there are no open transactions/connections?

What does the pg_prepared_xacts view show?

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com

Re: Dead rows not getting removed during vacuum

From
Sergey Konoplev
Date:
On Fri, Mar 21, 2014 at 2:46 AM, Granthana Biswas <granthana@zedo.com> wrote:
> We don't have two-phase commit prepared transactions. During vacuum, along with this problem we are also facing
streamingreplication issue. Hundreds of xlogs are in ready state in pg_xlog/archive_status. Those are being copied
manuallyafter vacuum otherwise rsync happens very slowly and replication lag keeps increasing. 

Okay, I see. Then next questions are:

What postgres version is it?
What max_standby_archive_delay and max_standby_streaming_delay are set to?
Are there any long running activity or idling in transaction backends
on your slaves?

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com


Re: Dead rows not getting removed during vacuum

From
Granthana Biswas
Date:
Sergey,

Version is PostgreSQL 9.1.4.

max_standby_archive_delay = 120s
max_standby_streaming_delay = 180s

Long running activity or idle in transactions are not there on either master or slave as we stop all connections to both the DBs during vacuum.


Regards,
Granthana


On Fri, Mar 21, 2014 at 8:51 PM, Sergey Konoplev <gray.ru@gmail.com> wrote:
On Fri, Mar 21, 2014 at 2:46 AM, Granthana Biswas <granthana@zedo.com> wrote:
> We don't have two-phase commit prepared transactions. During vacuum, along with this problem we are also facing streaming replication issue. Hundreds of xlogs are in ready state in pg_xlog/archive_status. Those are being copied manually after vacuum otherwise rsync happens very slowly and replication lag keeps increasing.

Okay, I see. Then next questions are:

What postgres version is it?
What max_standby_archive_delay and max_standby_streaming_delay are set to?
Are there any long running activity or idling in transaction backends
on your slaves?

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com

Re: Dead rows not getting removed during vacuum

From
Tom Lane
Date:
Granthana Biswas <granthana@zedo.com> writes:
> Version is PostgreSQL 9.1.4.

You do realize you're missing almost two years' worth of bug fixes?
The current release in that branch is 9.1.13, and a quick look through
the git history shows quite a number of replication-related fixes.

One that seems particularly notable in this connection is:

    commit 16222f32ed56d3ebc4136133662d932299188955
    Author: Simon Riggs <simon@2ndQuadrant.com>
    Date:   Thu Jun 7 19:24:47 2012 +0100

    Wake WALSender to reduce data loss at failover for async commit.
    WALSender now woken up after each background flush by WALwriter, avoiding
    multi-second replication delay for an all-async commit workload.
    Replication delay reduced from 7s with default settings to 200ms, allowing
    significantly reduced data loss at failover.

    Andres Freund and Simon Riggs

You wouldn't happen to be running with synchronous_commit off, would you?

Whether this is the explanation for your problem or not, it's really
irresponsible to still be running 9.1.4 at this point.  There are several
known data-loss-inducing bugs in it that will eat your data sooner or later.

            regards, tom lane


Re: Dead rows not getting removed during vacuum

From
Granthana Biswas
Date:
Thank you Tom. We will be upgrading soon.

Regards,
Granthana


On Mon, Mar 24, 2014 at 7:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Granthana Biswas <granthana@zedo.com> writes:
> Version is PostgreSQL 9.1.4.

You do realize you're missing almost two years' worth of bug fixes?
The current release in that branch is 9.1.13, and a quick look through
the git history shows quite a number of replication-related fixes.

One that seems particularly notable in this connection is:

    commit 16222f32ed56d3ebc4136133662d932299188955
    Author: Simon Riggs <simon@2ndQuadrant.com>
    Date:   Thu Jun 7 19:24:47 2012 +0100

    Wake WALSender to reduce data loss at failover for async commit.
    WALSender now woken up after each background flush by WALwriter, avoiding
    multi-second replication delay for an all-async commit workload.
    Replication delay reduced from 7s with default settings to 200ms, allowing
    significantly reduced data loss at failover.

    Andres Freund and Simon Riggs

You wouldn't happen to be running with synchronous_commit off, would you?

Whether this is the explanation for your problem or not, it's really
irresponsible to still be running 9.1.4 at this point.  There are several
known data-loss-inducing bugs in it that will eat your data sooner or later.

                        regards, tom lane