Thread: Making table reloading easier
Hi all A very common operation that users perform is reloading tables. Sometimes as part of an ETL process. Sometimes as part of a dump and reload. Sometimes loading data from external DBs, etc. Right now users have to jump through a bunch of hoops to do this efficiently: BEGIN; TRUNCATE TABLE my_table; SELECT pg_get_indexdef(indexrelid::regclass) FROM pg_index WHERE indrelid = 'table_name'::regclass; -- Drop 'em all DROP INDEX ... ; COPY my_table FROM 'file'; -- Re-create indexes CREATE INDEX ...; COMMIT; This is pretty clunky. We already have support for disabling indexes, it's just not exposed to the user. So the simplest option would seem to be to expose it with something like: ALTER TABLE my_table DISABLE INDEX ALL; which would take an ACCESS EXCLUSIVE lock then set: indistready = 'f' indislive = 'f' indisvalid = 'f' on each index, or the named index if the user specifies one particular index. After loading the table, a REINDEX on the table would rebuild and re-enable the indexes. That changes the process to: BEGIN; TRUNCATE TABLE my_table; ALTER TABLE my_table DISABLE INDEX ALL; COPY ...; REINDEX TABLE my_table; COMMIT; It'd be even better to also add a REINDEX flag to COPY, where it disables indexes and re-creates them after it finishes. But that could be done separately. Thoughts? I'm not sure I can tackle this in the current dev cycle, but it looks simple enough that I can't help wondering what obvious thing I'm missing about why it hasn't been done yet. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
ALTER TABLE my_table
DISABLE INDEX ALL;
+1
This very thing came up in a conversation with PeterG early last year. I was in favor then and I was surprised that the only thing standing in the way was a lack of ALTER TABLE syntax.
Creating temporary data structures to mimic existing metadata structures is a pain.
It'd be even better to also add a REINDEX flag to COPY, where it
disables indexes and re-creates them after it finishes. But that could
be done separately.
I'm iffy on the COPY change. If we add index rebuilding, why not disabling as well? If the COPY fails, what state do we leave the indexes in?
I'm not sure I can tackle this in the current dev cycle,
I may have some spare cycles to devote to this, but it's unfamiliar territory. I'm happy to do the grunt work if I had some higher level guidance.
On Thu, Nov 3, 2016 at 11:37 PM, Corey Huinker <corey.huinker@gmail.com> wrote: >> ALTER TABLE my_table >> DISABLE INDEX ALL; > > +1 > This very thing came up in a conversation with PeterG early last year. I was > in favor then and I was surprised that the only thing standing in the way > was a lack of ALTER TABLE syntax. > Creating temporary data structures to mimic existing metadata structures is > a pain. As long as the exclusive lock is kept until the end of the transaction that would make it. I got scared when reading $subject that you were going to propose a ENABLE ALL. This command should be a no-opt outside a transaction block though. >> It'd be even better to also add a REINDEX flag to COPY, where it >> disables indexes and re-creates them after it finishes. But that could >> be done separately. > > I'm iffy on the COPY change. If we add index rebuilding, why not disabling > as well? If the COPY fails, what state do we leave the indexes in? Yeah, I am -1 on that. Leaving the indexes in an unclean state would force the users to REINDEX anyway afterwards. -- Michael