Thread: problems with set_config, work_mem, maintenance_work_mem, and sorting

problems with set_config, work_mem, maintenance_work_mem, and sorting

From
Jon Nelson
Date:
I happened to be looking in the PostgreSQL logs (8.4.10, x86_64,
ScientificLinux 6.1) and noticed that an app was doing some sorting
(group by, order by, index creation) that ended up on disk rather than
staying in memory.
So I enabled trace_sort and restarted the app.
What followed confused me.

I know that the app is setting the work_mem and maintenance_work_mem
to 1GB, at the start of the session, with the following calls:

select set_config(work_mem, 1GB, False);
select set_config(maintenance_work_mem, 1GB, False);

By timestamps, I know that these statements take place before the next
log items, generated by PostgreSQL (note: I also log the PID of the
backend and all of these are from the same PID):

LOG:  00000: begin tuple sort: nkeys = 2, workMem = 1048576, randomAccess = f
LOG:  00000: begin tuple sort: nkeys = 1, workMem = 1048576, randomAccess = f
LOG:  00000: begin tuple sort: nkeys = 1, workMem = 1048576, randomAccess = f
LOG:  00000: begin tuple sort: nkeys = 1, workMem = 1048576, randomAccess = f
LOG:  00000: begin tuple sort: nkeys = 2, workMem = 1048576, randomAccess = f
^ these make sense

LOG:  00000: begin tuple sort: nkeys = 2, workMem = 131072, randomAccess = f
LOG:  00000: begin tuple sort: nkeys = 1, workMem = 131072, randomAccess = f
LOG:  00000: begin tuple sort: nkeys = 1, workMem = 131072, randomAccess = f
....
^^ these do not (but 128MB is the globally-configured work_mem value)

LOG:  00000: begin index sort: unique = t, workMem = 2097152, randomAccess = f
^ this kinda does (2GB is the globally-configured maintenance_work_mem value)

LOG:  00000: begin index sort: unique = f, workMem = 131072, randomAccess = f
LOG:  00000: begin tuple sort: nkeys = 2, workMem = 131072, randomAccess = f
..


The config shows 128MB for work_mem and 2GB for maintenance_work_mem.
Why does PostgreSQL /sometimes/ use the globally-configured values and
sometimes use the values that come from the connection?
Am I wrong in misunderstanding what 'session' variables are? I thought
that session (versus transaction) config items were set for /all/
transactions in a given backend, until changed or until that backend
terminates. Is that not so?

If I reconfigure the app to call out to set_config(item, value, True)
after each 'BEGIN' statement then workMem seems to be correct (at
least more of the time -- the process takes some time to run and I
haven't done an exhaustive check as yet).

--
Jon

Re: problems with set_config, work_mem, maintenance_work_mem, and sorting

From
Tom Lane
Date:
Jon Nelson <jnelson+pgsql@jamponi.net> writes:
> The config shows 128MB for work_mem and 2GB for maintenance_work_mem.
> Why does PostgreSQL /sometimes/ use the globally-configured values and
> sometimes use the values that come from the connection?

You sure those log entries are all from the same process?

            regards, tom lane

Re: problems with set_config, work_mem, maintenance_work_mem, and sorting

From
Jon Nelson
Date:
On Tue, Feb 28, 2012 at 1:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jon Nelson <jnelson+pgsql@jamponi.net> writes:
>> The config shows 128MB for work_mem and 2GB for maintenance_work_mem.
>> Why does PostgreSQL /sometimes/ use the globally-configured values and
>> sometimes use the values that come from the connection?
>
> You sure those log entries are all from the same process?

If I am understanding this correctly, yes. They all share the same pid.
The logline format is:

log_line_prefix = '%t %d %u [%p]'

and I believe %p represents the pid, and also that a pid corresponds
to a backend. Therefore, same pid == same backend == same connection
== same session. Many transactions within a session.


--
Jon

Re: problems with set_config, work_mem, maintenance_work_mem, and sorting

From
Tom Lane
Date:
Jon Nelson <jnelson+pgsql@jamponi.net> writes:
> On Tue, Feb 28, 2012 at 1:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Jon Nelson <jnelson+pgsql@jamponi.net> writes:
>>> Why does PostgreSQL /sometimes/ use the globally-configured values and
>>> sometimes use the values that come from the connection?

>> You sure those log entries are all from the same process?

> If I am understanding this correctly, yes. They all share the same pid.

Hmph ... does seem a bit weird.  Can you turn on log_statements and
identify which operations aren't using the session values?

            regards, tom lane

Re: problems with set_config, work_mem, maintenance_work_mem, and sorting

From
Jon Nelson
Date:
On Tue, Feb 28, 2012 at 2:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jon Nelson <jnelson+pgsql@jamponi.net> writes:
>> On Tue, Feb 28, 2012 at 1:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Jon Nelson <jnelson+pgsql@jamponi.net> writes:
>>>> Why does PostgreSQL /sometimes/ use the globally-configured values and
>>>> sometimes use the values that come from the connection?
>
>>> You sure those log entries are all from the same process?
>
>> If I am understanding this correctly, yes. They all share the same pid.
>
> Hmph ... does seem a bit weird.  Can you turn on log_statements and
> identify which operations aren't using the session values?

I had log_min_duration_statement = 1000.

An example:

LOG:  00000: begin tuple sort: nkeys = 3, workMem = 131072, randomAccess = f
LOCATION:  tuplesort_begin_heap, tuplesort.c:573
STATEMENT:  INSERT INTO (new table) SELECT (bunch of stuff here) FROM
.. ORDER BY ...

and also some CREATE TABLE ... statements:

LOG:  00000: begin index sort: unique = f, workMem = 131072, randomAccess = f
LOCATION:  tuplesort_begin_index_btree, tuplesort.c:642
STATEMENT:  CREATE TABLE <tablename> (LIKE some_other_tablename)

I also see this:

LOG:  00000: begin tuple sort: nkeys = 2, workMem = 131072, randomAccess = f
LOCATION:  tuplesort_begin_heap, tuplesort.c:573
STATEMENT: SELECT <bunch of stuff from system catalogs>

which is the ORM library (SQLAlchemy) doing a reflection of the
table(s) involved.
The statement is from the same backend (pid) and takes place
chronologically *after* the following:

LOG:  00000: begin tuple sort: nkeys = 2, workMem = 1048576, randomAccess = f
LOCATION:  tuplesort_begin_heap, tuplesort.c:573
STATEMENT: <more reflection stuff>

Is that useful?

If that's not enough, I can crank the logging up.
What would you like to see for 'log_statements' (if what I've provided
you above is not enough).

--
Jon

Re: problems with set_config, work_mem, maintenance_work_mem, and sorting

From
Kääriäinen Anssi
Date:
Quoting Jon Nelson:
"""
The config shows 128MB for work_mem and 2GB for maintenance_work_mem.
Why does PostgreSQL /sometimes/ use the globally-configured values and
sometimes use the values that come from the connection?
Am I wrong in misunderstanding what 'session' variables are? I thought
that session (versus transaction) config items were set for /all/
transactions in a given backend, until changed or until that backend
terminates. Is that not so?
"""

Could it be that the transaction which does the set_config is rolled back? If that is
the case, the set_config is rolled back, too. However, if the transaction commits,
then the set_config should be in effect for the whole session. It seems this is not
documented at all for set_config, just for SET SQL command.

I think it would be nice to have a way to force the connection to use the provided
settings even if the transaction in which they are done is rolled back. In single statement
if possible. Otherwise you might be forced to do a transaction just to be sure the SET
is actually in effect for the connection's life-time.

Django was bitten by this for example, it is now fixed by using this:
https://github.com/django/django/blob/master/django/db/backends/postgresql_psycopg2/base.py#L189

 - Anssi

Re: problems with set_config, work_mem, maintenance_work_mem, and sorting

From
Tom Lane
Date:
Jon Nelson <jnelson+pgsql@jamponi.net> writes:
> On Tue, Feb 28, 2012 at 2:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hmph ... does seem a bit weird.  Can you turn on log_statements and
>> identify which operations aren't using the session values?

> I had log_min_duration_statement = 1000.

That's not really going to prove much, as you won't be able to see any
commands that might be setting or resetting the work_mem parameters.

> ... which is the ORM library (SQLAlchemy) doing a reflection of the
> table(s) involved.

Oh, there's an ORM involved?  I'll bet a nickel it's doing something
surprising, like not issuing your SET until much later than you thought.

            regards, tom lane

Re: problems with set_config, work_mem, maintenance_work_mem, and sorting

From
Claudio Freire
Date:
On Tue, Feb 28, 2012 at 6:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> ... which is the ORM library (SQLAlchemy) doing a reflection of the
>> table(s) involved.
>
> Oh, there's an ORM involved?  I'll bet a nickel it's doing something
> surprising, like not issuing your SET until much later than you thought.

I'd rather go for an auto-rollback at some point within the
transaction that issued the set work_mem. SQLA tends to do that if,
for instance, an exception is risen within a transaction block (ie,
flushing).

You can issue the set work_mem in its own transaction, and commit it,
and in that way avoid that rollback.

Re: problems with set_config, work_mem, maintenance_work_mem, and sorting

From
Jon Nelson
Date:
On Tue, Feb 28, 2012 at 4:46 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Tue, Feb 28, 2012 at 6:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>>> ... which is the ORM library (SQLAlchemy) doing a reflection of the
>>> table(s) involved.
>>
>> Oh, there's an ORM involved?  I'll bet a nickel it's doing something
>> surprising, like not issuing your SET until much later than you thought.
>
> I'd rather go for an auto-rollback at some point within the
> transaction that issued the set work_mem. SQLA tends to do that if,
> for instance, an exception is risen within a transaction block (ie,
> flushing).
>
> You can issue the set work_mem in its own transaction, and commit it,
> and in that way avoid that rollback.

I cranked the logging /all/ the way up and isolated the server.
I suspect that your theory is correct.
I'll spend a bit more time investigating.


--
Jon

Re: problems with set_config, work_mem, maintenance_work_mem, and sorting

From
Claudio Freire
Date:
On Tue, Feb 28, 2012 at 8:00 PM, Jon Nelson <jnelson+pgsql@jamponi.net> wrote:
> I cranked the logging /all/ the way up and isolated the server.
> I suspect that your theory is correct.

Another option, depending on your SQLA version, when connections are
sent back to the pool, I seem to remember they were reset. That would
also reset the work_mem, you'd still see the same pid on PG logs, but
it's not the same session.

Re: problems with set_config, work_mem, maintenance_work_mem, and sorting

From
Ants Aasma
Date:


On Feb 29, 2012 1:44 AM, "Claudio Freire" <klaussfreire@gmail.com> wrote:
> Another option, depending on your SQLA version, when connections are
> sent back to the pool, I seem to remember they were reset. That would
> also reset the work_mem, you'd still see the same pid on PG logs, but
> it's not the same session.

Except that any open transactions are rolled back no other reset is done. The correct way to handle this would be to set the options and commit the transaction in Pool connect or checkout events. The event choice depends on whether application scope or request scope parameters are wanted.

--
Ants Aasma