Thread: query against pg_locks leads to large memory alloc

query against pg_locks leads to large memory alloc

From
Dave Owens
Date:
Hi,

I have a tool that collects stats from postgres (v9.1.13).  When it runs
the query below, postgres attempts to allocate more memory than is allowed:

SELECT mode, count(mode) AS count FROM pg_locks GROUP BY mode ORDER BY mode;
ERROR:  invalid memory alloc request size 1459291560

Memory-related configs from the server:

shared_buffers = 10000MB
work_mem = 15MB
maintenance_work_mem = 400MB
effective_cache_size = 50000MB
max_locks_per_transaction = 9000
max_pred_locks_per_transaction = 40000

The machine runs CentOS 6 on a 32-core AMD 6276 processor and is
configured with 64GB of memory.  Transparent Huge Pages are disabled
:-)

Thanks in advance for your time and expertise.

Dave Owens


Re: query against pg_locks leads to large memory alloc

From
Matheus de Oliveira
Date:

On Mon, Aug 18, 2014 at 6:01 PM, Dave Owens <dave@teamunify.com> wrote:
max_locks_per_transaction = 9000
max_pred_locks_per_transaction = 40000


Do you really need such large values? What is your max_connections value?

Regards,
--
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres

Re: query against pg_locks leads to large memory alloc

From
Merlin Moncure
Date:
On Mon, Aug 18, 2014 at 4:21 PM, Matheus de Oliveira
<matioli.matheus@gmail.com> wrote:
>
> On Mon, Aug 18, 2014 at 6:01 PM, Dave Owens <dave@teamunify.com> wrote:
>>
>> max_locks_per_transaction = 9000
>> max_pred_locks_per_transaction = 40000

performance of any query to pg_locks is proportional to the setting of
max_locks_per_transaction.  still, something is awry here.  can you
'explain' that query?  also, what's the answer you get when:

SELECT COUNT(*) from pg_locks;

?
merlin


Re: query against pg_locks leads to large memory alloc

From
Dave Owens
Date:
On Mon, Aug 18, 2014 at 2:21 PM, Matheus de Oliveira
<matioli.matheus@gmail.com> wrote:
> Do you really need such large values? What is your max_connections value?

max_connections = 450 ...we have found that we run out of shared
memory when max_pred_locks_per_transaction is less than 30k.

On Mon, Aug 18, 2014 at 2:29 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> performance of any query to pg_locks is proportional to the setting of
> max_locks_per_transaction.  still, something is awry here.  can you
> 'explain' that query?

tudb=# explain SELECT mode, count(mode) AS count FROM pg_locks GROUP
BY mode ORDER BY mode;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Sort  (cost=0.63..0.65 rows=200 width=32)
   Sort Key: l.mode
   ->  HashAggregate  (cost=0.30..0.32 rows=200 width=32)
         ->  Function Scan on pg_lock_status l  (cost=0.00..0.10 rows=1000 width=32)
(4 rows)


> SELECT COUNT(*) from pg_locks;

ERROR:  invalid memory alloc request size 1562436816


Re: query against pg_locks leads to large memory alloc

From
Kevin Grittner
Date:
Dave Owens <dave@teamunify.com> wrote:



> max_connections = 450 ...we have found that we run out of shared
> memory when max_pred_locks_per_transaction is less than 30k.

>> SELECT COUNT(*) from pg_locks;
>
> ERROR:  invalid memory alloc request size 1562436816

It gathers the information in memory to return for all those locks
(I think both the normal heavyweight locks and the predicate locks
do that).  450 * 30000 is 13.5 million predicate locks you could
have, so they don't need a very big structure per lock to start
adding up.  I guess we should refactor that to use a tuplestore, so
it can spill to disk when it gets to be more than work_mem.
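Just to put a rough number on that ceiling, something like this (a
back-of-the-envelope check that ignores prepared transactions and other
slop) shows what the current settings allow:

SELECT (SELECT setting::bigint FROM pg_settings WHERE name = 'max_connections')
     * (SELECT setting::bigint FROM pg_settings WHERE name = 'max_pred_locks_per_transaction')
       AS predicate_lock_ceiling;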

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: query against pg_locks leads to large memory alloc

From
Tom Lane
Date:
Kevin Grittner <kgrittn@ymail.com> writes:
> Dave Owens <dave@teamunify.com> wrote:
>> max_connections = 450 ...we have found that we run out of shared
>> memory when max_pred_locks_per_transaction is less than 30k.

> It gathers the information in memory to return for all those locks
> (I think both the normal heavyweight locks and the predicate locks
> do that).  450 * 30000 is 13.5 million predicate locks you could
> have, so they don't need a very big structure per lock to start
> adding up.  I guess we should refactor that to use a tuplestore, so
> it can spill to disk when it gets to be more than work_mem.

Seems to me the bigger issue is why does he need such a huge
max_pred_locks_per_transaction setting?  It's hard to believe that
performance wouldn't tank with 10 million predicate locks active.
Whether you can do "select * from pg_locks" seems pretty far down
the list of concerns about this setting.

            regards, tom lane


Re: query against pg_locks leads to large memory alloc

From
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Kevin Grittner <kgrittn@ymail.com> writes:
>> Dave Owens <dave@teamunify.com> wrote:
>>> max_connections = 450 ...we have found that we run out of shared
>>> memory when max_pred_locks_per_transaction is less than 30k.
>
>> It gathers the information in memory to return for all those locks
>> (I think both the normal heavyweight locks and the predicate locks
>> do that).  450 * 30000 is 13.5 million predicate locks you could
>> have, so they don't need a very big structure per lock to start
>> adding up.  I guess we should refactor that to use a tuplestore, so
>> it can spill to disk when it gets to be more than work_mem.
>
> Seems to me the bigger issue is why does he need such a huge
> max_pred_locks_per_transaction setting?  It's hard to believe that
> performance wouldn't tank with 10 million predicate locks active.
> Whether you can do "select * from pg_locks" seems pretty far down
> the list of concerns about this setting.

It would be interesting to know more about the workload which is
capable of that, but it would be a lot easier to analyze what's
going on if we could look at where those locks are being used (in
summary, of course -- nobody can make sense of 10 million detail
lines).  About all I can think to ask at this point is: how many
total tables and indexes are there in all databases in this cluster
(counting each partition of a partitioned table as a separate
table)?  With the promotion of finer-grained locks to coarser ones
this should be pretty hard to hit without a very large number of
tables.
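If it helps, something along these lines run in each database should
give the counts (untested, off the top of my head):

SELECT relkind, count(*)
  FROM pg_class
 WHERE relkind IN ('r', 'i')
 GROUP BY relkind;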

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: query against pg_locks leads to large memory alloc

From
Dave Owens
Date:
Hi Kevin,

Looking at pg_stat_all_tables and pg_stat_all_indexes on our four
databases we have:

1358 tables
1808 indexes

The above totals do not include template1, template0, or postgres
databases.  We do not use partitioned tables. Only one database has a
meaningful level of concurrency (New Relic reports about 30k calls per
minute, from our main webapp).  That database alone consists of 575
tables and 732 indexes.

Dave Owens


Re: query against pg_locks leads to large memory alloc

From
Kevin Grittner
Date:
Dave Owens <dave@teamunify.com> wrote:

> 1358 tables
> 1808 indexes

Hmm, that's not outrageous.  How about long-running transactions?
Please check pg_stat_activity and pg_prepared_xacts for xact_start
or prepared (respectively) values older than a few minutes.  Since
predicate locks may need to be kept until an overlapping
transaction completes, a single long-running transaction can bloat
the lock count.
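Something like this should surface them (column names as of 9.1; adjust
the interval to taste):

SELECT procpid, xact_start, current_query
  FROM pg_stat_activity
 WHERE xact_start < now() - interval '5 minutes'
 ORDER BY xact_start;

SELECT gid, prepared, owner, database
  FROM pg_prepared_xacts
 WHERE prepared < now() - interval '5 minutes';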

Also, could you show us the output from:

  SELECT version();

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: query against pg_locks leads to large memory alloc

From
Dave Owens
Date:
On Tue, Aug 19, 2014 at 9:40 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
> Hmm, that's not outrageous.  How about long-running transactions?
> Please check pg_stat_activity and pg_prepared_xacts for xact_start
> or prepared (respectively) values older than a few minutes.  Since
> predicate locks may need to be kept until an overlapping
> transaction completes, a single long-running transaction can bloat
> the lock count.

I do see a handful of backends that like to stay IDLE in transaction
for minutes at a time.  We are refactoring the application responsible
for these long IDLE times, which will hopefully reduce the duration of
their connections.

# select backend_start, xact_start, query_start, waiting,
current_query from pg_stat_activity where xact_start < now() -
interval '3 minutes';
         backend_start         |          xact_start           |          query_start          | waiting |     current_query
-------------------------------+-------------------------------+-------------------------------+---------+-----------------------
 2014-08-19 09:48:00.398498-07 | 2014-08-19 09:49:19.157478-07 | 2014-08-19 10:03:04.99303-07  | f       | <IDLE> in transaction
 2014-08-19 09:38:00.493924-07 | 2014-08-19 09:53:47.00614-07  | 2014-08-19 10:03:05.003496-07 | f       | <IDLE> in transaction
(2 rows)

... now() was 2014-08-19 10:03 in the above query.  I do not see
anything in pg_prepared_xacts; we do not use two-phase commit.


> Also, could you show use the output from?:
>
>   SELECT version();
                                                    version
---------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.1.13 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4), 64-bit
(1 row)


Re: query against pg_locks leads to large memory alloc

From
Andres Freund
Date:
On 2014-08-18 14:36:52 -0700, Dave Owens wrote:
> On Mon, Aug 18, 2014 at 2:21 PM, Matheus de Oliveira
> <matioli.matheus@gmail.com> wrote:
> > Do you really need such large values? What is your max_connections value?
>
> max_connections = 450 ...we have found that we run out of shared
> memory when max_pred_locks_per_transaction is less than 30k.

What was the precise error message when that happened?

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: query against pg_locks leads to large memory alloc

From
Dave Owens
Date:
Hi Andres,

On Tue, Aug 19, 2014 at 10:17 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> max_connections = 450 ...we have found that we run out of shared
>> memory when max_pred_locks_per_transaction is less than 30k.
>
> What was the precise error message when that happened?

2014-07-31 15:00:25 PDT 53dabbea.29c7ERROR:  53200: out of shared memory
2014-07-31 15:00:25 PDT 53dabbea.29c7HINT:  You might need to increase
max_pred_locks_per_transaction.
2014-07-31 15:00:25 PDT 53dabbea.29c7LOCATION:  CreatePredicateLock,
predicate.c:2247
2014-07-31 15:00:25 PDT 53dabbea.29c7STATEMENT:  SELECT member_id,
SUM(credit_quarters) FROM ondeck_tallies_x WHERE team_id = $1 AND
credit_quarters > 0 AND EXTRACT(day from current_timestamp -
dt_attendance_taken) <= $2 GROUP BY member_id


Re: query against pg_locks leads to large memory alloc

From
Dave Owens
Date:
I wonder if it would be helpful to restart the database, then begin
gathering information from pg_locks while it can still respond to queries.
I suspect this would work because the amount of memory needed to query
pg_locks keeps growing over time (it is around 1900MB now).

Dave Owens


Re: query against pg_locks leads to large memory alloc

From
Kevin Grittner
Date:
Dave Owens <dave@teamunify.com> wrote:

> I do see a handful of backends that like to stay IDLE in
> transaction for minutes at a time.  We are refactoring the
> application responsible for these long IDLE times, which will
> hopefully reduce the duration of their connections.

That may help some.  Other things to consider:

 - If you can use a connection pooler in transaction mode to reduce
the number of active connections you may be able to improve
performance all around, and dodge this problem in the process.
Very few systems can make efficient use of hundreds of concurrent
connections, but for various reasons fixing that with a connection
pooler is sometimes difficult.

 - If you have transactions (or SELECT statements that you run
outside of explicit transactions) which you know will not be
modifying any data, flagging them as READ ONLY will help contain
the number of predicate locks and will help overall performance.
(If the SELECT statements are not in explicit transactions, you may
have to put them in one to allow the READ ONLY property to be set,
or set default_transaction_read_only in the session to accomplish
this; see the sketch after this list.)

 - Due to the heuristics used for thresholds for combining
fine-grained locks into coarser ones, you might be able to work
around this by boosting max_connections above the number you are
going to use.  Normally when you increase
max_pred_locks_per_transaction it increases the number of page
locks it will allow in a table or index before it combines them
into a relation lock; increasing max_connections doesn't affect the
granularity promotion threshold, but it increases the total number
of predicate locks allowed, so if you boost that and reduce
max_pred_locks_per_transaction in proportion, you may be able to
dodge the problem.  It's an ugly workaround, but it might get you
into better shape.  If that does work, it's good evidence that we
should tweak those heuristics.
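To illustrate the READ ONLY point, the pattern is just the following
(the count(*) here is only a stand-in for whatever the real read-only
query is):

BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY;
SELECT count(*) FROM pg_class;  -- stand-in for a query that modifies nothing
COMMIT;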

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: query against pg_locks leads to large memory alloc

From
Kevin Grittner
Date:
Dave Owens <dave@teamunify.com> wrote:

> I wonder if it would be helpful to restart the database, then begin
> gathering information pg_locks while it can still respond to queries.
> I speculate that this is possible because the amount of memory needed
> to query pg_locks continues to grow (around 1900MB now).

If restart is an option, that sounds like a great idea.  If you
could capture the data into tables where we can summarize to
analyze it in a meaningful way, that would be ideal.  Something
like:

CREATE TABLE activity_snap_1 AS SELECT * FROM pg_stat_activity;

Of course, boost the number for each subsequent run.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: query against pg_locks leads to large memory alloc

From
Dave Owens
Date:
On Tue, Aug 19, 2014 at 11:01 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
> If restart is an option, that sounds like a great idea.  If you
> could capture the data into tables where we can summarize to
> analyze it in a meaningful way, that would be ideal.  Something
> like:
>
> CREATE TABLE activity_snap_1 AS SELECT * FROM pg_stat_activity;
>
> Of course, boost the number for each subsequent run.

Kevin -

Would you or the list be interested in snapshots of pg_locks as well?

I can take a restart tonight and get this going on a half-hourly basis
(unless you think more frequent snaps would be useful).


Re: query against pg_locks leads to large memory alloc

From
Kevin Grittner
Date:
Dave Owens <dave@teamunify.com> wrote:
> On Tue, Aug 19, 2014 at 11:01 AM, Kevin Grittner <kgrittn@ymail.com> wrote:

>> CREATE TABLE activity_snap_1 AS SELECT * FROM pg_stat_activity;

> Would the you or the list be interested in snapshots of pg_locks as well?

Most definitely!  I'm sorry I copied/pasted the pg_stat_activity
example; I was playing with both.  pg_locks is definitely the more
important one, but it might be useful to try to match some of these
locks up against process information as we drill down from the
summary to see examples of what makes up those numbers.
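For example, mirroring the pg_stat_activity pattern (and bumping the
number for each run):

CREATE TABLE locks_snap_1 AS SELECT * FROM pg_locks;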

> I can take a restart tonight and get this going on a half-hourly basis
> (unless you think more frequent snaps would be useful).

Each half-hour should be fine as long as that gives at least three
or four samples before you are unable to query pg_locks.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: query against pg_locks leads to large memory alloc

From
Dave Owens
Date:
I now have 8 hours worth of snapshots from pg_stat_activity and
pg_locks (16 snapshots from each table/view).  I have turned off
collection at this point, but I am still able to query pg_locks:

# SELECT mode, count(mode) AS count FROM pg_locks GROUP BY mode ORDER BY mode;
       mode       |  count
------------------+---------
 AccessShareLock  |     291
 ExclusiveLock    |      19
 RowExclusiveLock |       4
 RowShareLock     |       1
 SIReadLock       | 7287531
(5 rows)

SIReadLocks continue to grow.  It seems, in general, that our
application code overuses Serializable... we have produced a patch
that demotes some heavy-hitting queries down to Read Committed, and we
will see if this makes an impact on the number of SIReadLocks.

Is it interesting that only 101557 out of 7 million SIReadLocks have a
pid associated with them?

-Dave Owens


Re: query against pg_locks leads to large memory alloc

From
Kevin Grittner
Date:
Dave Owens <dave@teamunify.com> wrote:

> I now have 8 hours worth of snapshots from pg_stat_activity and
> pg_locks (16 snapshots from each table/view).  I have turned off
> collection at this point, but I am still able to query pg_locks

Could you take the earliest one after activity started, and the
latest one before you stopped collecting them, compress them, and
email them to me off-list, please?

> SIReadLocks continue to grow.  It seems, in general, that our
> application code over uses Serializable... we have produced a patch
> that demotes some heavy-hitting queries down to Read Committed, and we
> will see if this makes an impact on the number of SIReadLocks.

Do all of those modify data?  If not, you may get nearly the same
benefit from declaring them READ ONLY instead, and that would give
better protection against seeing transient invalid states.  One
example of that is here:

http://wiki.postgresql.org/wiki/SSI#Deposit_Report

> Is it interesting that only 101557 out of 7 million SIReadLocks have a
> pid associated with them?

I would need to double-check that I'm not forgetting another case,
but the two cases I can think of where the pid is NULL are if the
transaction is PREPARED (for two phase commit) or if committed
transactions are summarized (so they can be combined) to try to
limit RAM usage.  We might clear the pid if the connection is
closed, but (without having checked yet) I don't think we do that.
Since you don't use prepared transactions, they are probably from
the summarization.  But you would not normally accumulate much
there unless you have a long-running transaction which is not
flagged as READ ONLY.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: query against pg_locks leads to large memory alloc

From
Kevin Grittner
Date:
Kevin Grittner <kgrittn@ymail.com> wrote:
> Dave Owens <dave@teamunify.com> wrote:
>
>> I now have 8 hours worth of snapshots from pg_stat_activity and
>> pg_locks (16 snapshots from each table/view).  I have turned off
>> collection at this point, but I am still able to query pg_locks
>
> Could you take the earliest one after activity started, and the
> latest one before you stopped collecting them, compress them, and
> email them to me off-list, please?

Dave did this, off-list.  There is one transaction which has been
running for over 20 minutes, which seems to be the cause of the
accumulation.  I note that this query does not hold any of the
locks it would need to take before modifying data, and it has not
been assigned a transactionid -- both signs that it has (so far)
not modified any data.  If it is not going to modify any, it would
not have caused this accumulation of locks had it been flagged as
READ ONLY.  This is very important to do if you are using
serializable transactions in PostgreSQL.

To quantify that, I show the number of SIReadLocks in total:

test=# select count(*) from locks_snap_16 where mode = 'SIReadLock';
  count
---------
 3910257
(1 row)

... and the number of those which are only around because there is
an open overlapping transaction, not flagged as read only:

test=# select count(*) from locks_snap_16 l
test-#   where mode = 'SIReadLock'
test-#     and not exists (select * from locks_snap_16 a
test(#                       where a.locktype = 'virtualxid'
test(#                         and a.virtualxid = l.virtualtransaction);
  count
---------
 3565155
(1 row)

I can't stress enough how important it is that the advice near the
bottom of this section of the documentation is heeded:

http://www.postgresql.org/docs/9.2/interactive/transaction-iso.html#XACT-SERIALIZABLE

Those bullet-points are listed roughly in order of importance;
there is a reason this one is listed first:

 - Declare transactions as READ ONLY when possible.

In some shops using SERIALIZABLE transactions, I have seen them set
default_transaction_read_only = on, and explicitly set it off for
transactions which will (or might) modify data.
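The pattern there looks roughly like this (the comment marks where the
data-modifying statements would go):

SET default_transaction_read_only = on;  -- per session, per role, or in postgresql.conf

BEGIN;
SET TRANSACTION READ WRITE;  -- only for transactions that will modify data
-- ... INSERT/UPDATE/DELETE statements here ...
COMMIT;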

If you have a long-running report that might itself grab a lot of
predicate locks (a/k/a SIReadLocks), you can avoid that by
declaring the transaction as READ ONLY DEFERRABLE.  If you do that,
the transaction will wait to begin execution until it can acquire a
snapshot guaranteed not to show any anomalies (like the example
referenced in an earlier post can show).  It then runs without
acquiring any predicate locks, just like a REPEATABLE READ
transaction.  In fairly busy benchmarks, we never saw it take more
than six seconds to acquire such a snapshot, although the wait time
is not bounded.  Again, getting such a snapshot will be possible
sooner if you declare transactions as READ ONLY when possible.  :-)
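The DEFERRABLE variant is just one more keyword on the BEGIN; a sketch
(again with count(*) standing in for the real report query):

BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;
SELECT count(*) FROM pg_class;  -- stand-in for the long-running report
COMMIT;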

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company