Thread: [BUGS] BUG #14768: CREATE INDEX CONCURRENTLY IF NOT EXISTS cancelsautovacuum even if the index already exists.
[BUGS] BUG #14768: CREATE INDEX CONCURRENTLY IF NOT EXISTS cancelsautovacuum even if the index already exists.
From
mba.ogolny@gmail.com
Date:
The following bug has been logged on the website: Bug reference: 14768 Logged by: Marcin Barczyński Email address: mba.ogolny@gmail.com PostgreSQL version: 9.6.3 Operating system: Ubuntu 14.04 but most probably it doesn't matter Description: I am perfectly aware of the fact that CREATE INDEX CONCURRENTLY on a table cancels a running autovacuum process on that table. But CREATE INDEX CONCURRENTLY IF NOT EXISTS should take ShareUpdateExclusiveLock only after checking that the index doesn't exist. -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #14768: CREATE INDEX CONCURRENTLY IF NOT EXISTScancels autovacuum even if the index already exists.
From
Michael Paquier
Date:
On Wed, Aug 2, 2017 at 12:39 PM, <mba.ogolny@gmail.com> wrote: > I am perfectly aware of the fact that CREATE INDEX CONCURRENTLY on a table > cancels a running autovacuum process on that table. > But CREATE INDEX CONCURRENTLY IF NOT EXISTS should take > ShareUpdateExclusiveLock only after checking that the index doesn't exist. Logically the checks in index_create could happen in DefineIndex() as there is no if_not_exists logic for toast indexes. But do we want to skip all the sanity checks done before that, particularly for exclusion constraints with concurrent creation? I am less sure, and this point deserves extra opinions as those pre-checks have value in themselves because the query defined is incorrectly shaped, so likely users want to know about that before being sure that the named relation already exists or not. -- Michael -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #14768: CREATE INDEX CONCURRENTLY IF NOT EXISTS cancels autovacuum even if the index already exists.
From
Tom Lane
Date:
Michael Paquier <michael.paquier@gmail.com> writes: > On Wed, Aug 2, 2017 at 12:39 PM, <mba.ogolny@gmail.com> wrote: >> I am perfectly aware of the fact that CREATE INDEX CONCURRENTLY on a table >> cancels a running autovacuum process on that table. >> But CREATE INDEX CONCURRENTLY IF NOT EXISTS should take >> ShareUpdateExclusiveLock only after checking that the index doesn't exist. > Logically the checks in index_create could happen in DefineIndex() as > there is no if_not_exists logic for toast indexes. But do we want to > skip all the sanity checks done before that, particularly for > exclusion constraints with concurrent creation? I'm afraid this complaint is just wishful/sloppy thinking. It's useless to perform an "index doesn't exist" check without holding a lock that's sufficient to prevent such an index from being created by a concurrent transaction. There is no lock level less than SHARE UPDATE EXCLUSIVE that would prevent that; and even if there was, taking that level to make the check and then upgrading to SHARE UPDATE EXCLUSIVE would constitute a deadlock risk in itself. Perhaps the OP's problem --- which he failed to state exactly, but I suppose can be written as "I wish a failed CREATE INDEX CONCURRENTLY didn't kill a concurrent autovacuum before failing" --- could be resolved by subdividing SHARE UPDATE EXCLUSIVE into more than one lock level. But that's not exactly a trivial change. And it's not very clear why this is such a big problem that we need to be making a delicate redesign of the locking logic to avoid it. Autovacuum cancels are pretty routine, while I'm having a hard time understanding why index builds would happen so often that they'd lock out autovacuum for problematic amounts of time. regards, tom lane -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #14768: CREATE INDEX CONCURRENTLY IF NOT EXISTScancels autovacuum even if the index already exists.
From
Marcin Barczyński
Date:
On Thu, Aug 24, 2017 at 7:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Is it the best approach?
--
Regards,
Marcin
I'm afraid this complaint is just wishful/sloppy thinking.
Cancelling autovacuum was a problem for two reasons:
1. We ran a bunch of CREATE INDEX CONCURRENTLY IF NOT EXISTS queries on our service start up. We thought it's a no-op if an index already exists, and wanted to have some kind of background migrations: the service works normally, but some operations are slower until the index is created.
2. Autovacuum takes days/weeks in our scale (billions of rows), so effectively it never completed due to service restarts. We investigated the problem, and it turned out that most of the time was spent on vacuuming a GIST index on timestamp range. I took a look at the code, and during vacuum GIST index is traversed in a logical order which translates into random disk accesses (function gistbulkdelete in gistvacuum.c). Our index has almost 800 GB, so random accesses affect us badly. On the other hand, btree indexes are vacuumed in physical order (function btvacuumscan in nbtree.c).
We replaced CREATE INDEX CONCURRENTLY IF NOT EXISTS with proper migrations.
By the way, CREATE INDEX IF NOT EXISTS also cancels autovacuum task. I get your point with locks, but from a user's perspective, I don't understand why I have to resort to the following code to avoid cancelling autovacuum:
DO $$
BEGIN
IF NOT EXISTS (
SELECT 1
FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'my_namespace' AND c.relname = 'my_index') THEN
CREATE INDEX my_index ...;
END IF;
END$$;
As for the long autovacuum, maybe I should report it as a separate bug. For now, I'm planning to replace all uses of 'contains' operator with the following function:
CREATE OR REPLACE FUNCTION tstzrange_contains(
range tstzrange,
ts timestamptz)
RETURNS bool AS
$$
SELECT (ts >= lower(range) AND (lower_inc(range) OR ts > lower(range)))
AND (ts <= upper(range) AND (upper_inc(range) OR ts < upper(range)))
$$ LANGUAGE SQL IMMUTABLE;
and create btree indexes on lower and upper bound:
CREATE INDEX my_table_time_range_lower_idx ON my_table (lower(time_range));
CREATE INDEX my_table_time_range_upper_idx ON my_table (upper(time_range));
--
Regards,
Marcin