Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold < - Mailing list pgsql-hackers

From Ants Aasma
Subject Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <
Msg-id CA+CSw_srK=BSBJhLtsPiMM8iZcFKv_0hT8Sf6X2O8VEyhDpsMg@mail.gmail.com
In response to Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <  (Kevin Grittner <kgrittn@gmail.com>)
Responses Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <  (Kevin Grittner <kgrittn@gmail.com>)
List pgsql-hackers
On Thu, Apr 21, 2016 at 5:16 PM, Kevin Grittner <kgrittn@gmail.com> wrote:
> On Wed, Apr 20, 2016 at 8:08 PM, Ants Aasma <ants.aasma@eesti.ee> wrote:
>
>> However, while checking out if my proof of concept patch actually
>> works I hit another issue. I couldn't get my test for the feature to
>> actually work. The test script I used is attached.
>
> Could you provide enough to make that a self-contained reproducible
> test case (i.e., that I don't need to infer or re-write any steps
> or guess how to call it)?  In previous cases people have given me
> where they felt that the feature wasn't working there have
> been valid reasons for it to behave as it was (e.g., a transaction
> with a transaction ID and an xmin which prevented cleanup from
> advancing).  I'll be happy to look at your case and see whether
> it's another such case or some bug, but it seems a waste to reverse
> engineer or rewrite parts of the test case to do so.

Just to be sure I didn't have anything screwy in my build environment,
I redid the test on a freshly installed Fedora 23 VM. Steps to
reproduce:

1. Build PostgreSQL from git. I used:

    ./configure --enable-debug --enable-cassert --prefix=/home/ants/pg-master

2. Set up database:

cat << EOF > test-settings.conf
old_snapshot_threshold = 1min

logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql.log'
log_line_prefix = '[%m] '
log_autovacuum_min_duration = 0
EOF

    pg-master/bin/initdb data/
    cat test-settings.conf >> data/postgresql.conf
    pg-master/bin/pg_ctl -D data/ start
    pg-master/bin/createdb

3. Install python-psycopg2 and get the test script from my earlier e-mail [1]
4. Run the test:

    python test_oldsnapshot.py "host=/tmp"

5. Observe that the table keeps growing even after the old snapshot
threshold is exceeded and autovacuum has run. Autovacuum log shows 0
tuples removed.

Only the write workload has an xid assigned; the other two backends
only hold a snapshot:

[ants@localhost ~]$ pg-master/bin/psql -c "SELECT application_name,
backend_xid, backend_xmin, NOW()-xact_start AS tx_age, state FROM
pg_stat_activity"
   application_name   | backend_xid | backend_xmin |     tx_age      |        state
----------------------+-------------+--------------+-----------------+---------------------
 write-workload       |       95637 |              | 00:00:00.009314 | active
 long-unrelated-query |             |         1806 | 00:04:33.914048 | active
 interfering-query    |             |         2444 | 00:04:32.910742 | idle in transaction
 psql                 |             |        95637 | 00:00:00        | active
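
To make the reading of that output explicit, here is a minimal Python
sketch that classifies the rows the same way. The Backend tuple and
both helper names are made up for illustration; they are not part of
the test script or of PostgreSQL. It also compares xids as plain
integers, which ignores wraparound, so treat it as a sketch only.

```python
from typing import List, NamedTuple, Optional


class Backend(NamedTuple):
    """One row of pg_stat_activity, reduced to the columns of interest."""
    application_name: str
    backend_xid: Optional[int]   # None when no transaction ID is assigned
    backend_xmin: Optional[int]  # None when no snapshot is held


def oldest_xmin_holder(backends: List[Backend]) -> Optional[Backend]:
    """Return the backend holding the smallest backend_xmin, if any.

    Assumption: plain integer comparison, no xid wraparound handling.
    """
    holders = [b for b in backends if b.backend_xmin is not None]
    if not holders:
        return None
    return min(holders, key=lambda b: b.backend_xmin)


def xid_and_xmin_holders(backends: List[Backend]) -> List[Backend]:
    """Return backends that have both an assigned xid and an xmin."""
    return [
        b for b in backends
        if b.backend_xid is not None and b.backend_xmin is not None
    ]


# Rows copied from the psql output above.
rows = [
    Backend("write-workload", 95637, None),
    Backend("long-unrelated-query", None, 1806),
    Backend("interfering-query", None, 2444),
    Backend("psql", None, 95637),
]

if __name__ == "__main__":
    print("oldest xmin:", oldest_xmin_holder(rows))
    print("xid and xmin:", xid_and_xmin_holders(rows))
```

For the rows above, no backend has both an xid and an xmin, and the
oldest xmin belongs to long-unrelated-query.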

The output from the test tool is attached. After killing the test tool
and the long-running query, autovacuum cleans up as expected.

I'm too tired right now to chase this down myself. The mental toll
that two small kids can take is pretty staggering. But I might find
the time to fire up a debugger sometime tomorrow.

Regards,
Ants Aasma

[1] http://www.postgresql.org/message-id/attachment/43859/test_oldsnapshot.py
