Re: database must be vacuumed with transactions - Mailing list pgsql-admin

From Dave Johansen
Subject Re: database must be vacuumed with transactions
Date
Msg-id CAAcYxUcK+=v_t_R_9XwP1Ju+MJONAbSTReEwFx3mSXXn=h8psQ@mail.gmail.com
Whole thread Raw
In response to Re: database must be vacuumed with transactions  (Keith <keith@keithf4.com>)
List pgsql-admin
On Thu, Jun 25, 2015 at 6:39 PM, Keith <keith@keithf4.com> wrote:
On Thu, Jun 25, 2015 at 8:57 PM, Dave Johansen <davejohansen@gmail.com> wrote:
On Thu, Jun 25, 2015 at 2:23 PM, Keith <keith@keithf4.com> wrote:
On Thu, Jun 25, 2015 at 2:42 PM, Dave Johansen <davejohansen@gmail.com> wrote:
On Thu, Jun 25, 2015 at 11:38 AM, John Scalia <jayknowsunix@gmail.com> wrote:
The pg_stat_activity table will show you which processes are doing what, and you'll be able to see their process ID from the O/S. That might help you.

Yes, but I need to see which process ID is doing a TON of small transactions which is causing the XID to increment at an unexpectedly high rate and so my question is "if there's a way for me to get the XID for every connection/query?".

Look at the log_line_prefix options. To get everything you're looking for to try and identify where this is coming from look at using at a minimum: %p, %r, %m, %x, %u, %d

http://www.postgresql.org/docs/8.4/static/runtime-config-logging.html#GUC-LOG-LINE-PREFIX

We found the source of the issue. It's a function that we're using to all a table to be populated from multiple processes. You can see the details here:
http://www.postgresql.org/message-id/CAAcYxUf9o3NpiKaHdeShO_iJjm92314eX6ko0JvJidmksutkJg@mail.gmail.com

Any ideas on how we could handle this with causing it to increment the XID?

Not sure on how to avoid the xid increment on this. I doubt this can be avoided, but hopefully someone else can respond to your other email with a more definitive answer.

I think it would be nice to make Postgres not increment the XID in this situation, if it hasn't been fixed already,
 
What I would recommend instead in the mean time to help deal with the XID issue is some close monitoring of your current txid vs the autovacuum_max_freeze_age. And get some sort of externally scheduled job to more aggressively vacuum the database based on that. And if you have any static, unwritten tables, you definitely want to VACUUM FREEZE these to avoid the txid building on them. The check_postgres script (https://bucardo.org/wiki/Check_postgres) has this monitoring available as an option. Or if you can't use it, you can at least look at its source code and get the queries it uses and incorporate it into your monitoring solution.

About a week before the problematic process started, we did a change in our schema that affected >75% of the rows in our database. So there was also a LARGE chunk of data that hadn't been frozen but that needed to and that just made the problem worse. From this we've learned, that doing appropriate VACUUM after a large import/re-write of data can help things out. We've also added monitoring as you've recommended to be sure and prevent this sort of issue.
 
And while this won't help you now, 9.5 has this "do nothing" upsert you're trying to do here built in

http://www.depesz.com/2015/05/10/waiting-for-9-5-add-support-for-insert-on-conflict-do-nothingupdate/

For now, though, the function is the only reliable way to do what you're doing.

Yes, sadly it looks like we're stuck with this solution for now.
 
You'll also have to make sure log_min_duration_statement is set to low enough to log the queries you're looking for. Beware setting disabling it or setting it too low, though, if you have a very high number of queries being run. This can quickly overload your IO and disk space.

This probably isn't feasible, because we run a lot of queries with most of them being very short duration.

Just want to clarify, I meant to say you don't want to set it to 0 since that would log everything. Disabling it by setting it to -1 would cause it to not log statements based on their runtime at all, which is the default.

We'll probably turn on logging when the query takes more than some sizeable fraction of a second just to monitor the health of the database and query usage in general.
 
 If you get the database running again, you can also look at the pg_stat_statements contrib module to log query counts and parameterized versions of all queries being run.

http://www.postgresql.org/docs/8.4/static/pgstatstatements.html

We got it up and we will be turning this module on to help diagnose issues like this in the future.
 
And I can't let this email go without say you REALLY need to look at upgrading your database. 8.4 is no longer being supported and receives no security or bug fixes.

We are looking into upgrading to RHEL 7 which comes with Postgres 9.2 but that's a SLOW process.

Recommend going straight to 9.4, especially with a "SLOW" major upgrading process. ;) Or, if the major upgrade is still really far off in reality, might want to just wait for 9.5 and to get the built in UPSERT. It's currently planned for an October release, but there's no guarantee there.

There are official repos for 9.4 on RHEL 7, and there will be ones for 9.5 as well when it hits.

http://www.postgresql.org/download/linux/redhat/
http://yum.postgresql.org/repopackages.php

The problem is just life cycle. Postgres is a lot better than most software out there and has a 5 year life cycle, but we honestly rely on RHEL's 10 year cycle a LOT. I've been pushing very hard to upgrade to RHEL 7 and we're barely getting to the point of being able to begin testing now that it's over 1 year after its release. So honestly, using the newer versions of Postgres provided on postgresql.org just isn't feasible because by the time we get everything approved and tested, there are only a few years of use left in the support life cycle and that makes it a deal breaker for us.

pgsql-admin by date:

Previous
From: Venkata Balaji N
Date:
Subject: Re: Moving pg_xlog folder
Next
From: Eoin O'Hehir
Date:
Subject: Re: Does VACUUM FULL need ANALYZE ?