Thread: avoid lock conflict between SELECT and TRUNCATE
Once in a while, I have a report running a complex query such as this: BEGIN;declare "SQL_CUR0000000004919850" cursor with hold for SELECT "auths_with_trans"."user_id" AS "user_id (auths_with_trans)", MAX("auths_with_trans"."user_created") AS "TEMP(attr:user_created:ok)(2099950671)(0)", MIN("auths_with_trans"."user_created") AS "TEMP(attr:user_created:ok)(99676510)(0)", MIN("auths_with_trans"."trans_time") AS "usr:Calculation_6930907163324031:ok", MIN("auths_with_trans"."auth_created") AS "usr:Calculation_9410907163052141:ok" FROM "public"."users" "users" LEFT JOIN "public"."auths_with_trans" "auths_with_trans" ON ("users"."user_id" = "auths_with_trans"."user_id") GROUP BY 1;fetch 100 in "SQL_CUR0000000004919850" But it takes a long time to complete, and meanwhile a cron job tries to rebuild the users table by first doing "TRUNCATE TABLE users" and then repopulating it with data. Obviously, TRUNCATE is blocked until the long SELECT finishes. I'm looking for ways to avoid the conflict. One way would be to do incremental updates to the users table - that's not an option yet. What if I rename the users table to users_YYYYMMDD? Would that still be blocked by SELECT? If it's not blocked, then I could rename users out of the way, and then recreate it with fresh data as plain 'users'. Then I'd have a cron job dropping old users tables when they get too old. -- Florin Andrei http://florin.myip.org/
On Thu, Sep 10, 2015 at 3:54 AM, Florin Andrei <florin@andrei.myip.org> wrote:
Once in a while, I have a report running a complex query such as this:
BEGIN;declare "SQL_CUR0000000004919850" cursor with hold for SELECT "auths_with_trans"."user_id" AS "user_id (auths_with_trans)",
MAX("auths_with_trans"."user_created") AS "TEMP(attr:user_created:ok)(2099950671)(0)",
MIN("auths_with_trans"."user_created") AS "TEMP(attr:user_created:ok)(99676510)(0)",
MIN("auths_with_trans"."trans_time") AS "usr:Calculation_6930907163324031:ok",
MIN("auths_with_trans"."auth_created") AS "usr:Calculation_9410907163052141:ok"
FROM "public"."users" "users"
LEFT JOIN "public"."auths_with_trans" "auths_with_trans" ON ("users"."user_id" = "auths_with_trans"."user_id")
GROUP BY 1;fetch 100 in "SQL_CUR0000000004919850"
But it takes a long time to complete, and meanwhile a cron job tries to rebuild the users table by first doing "TRUNCATE TABLE users" and then repopulating it with data. Obviously, TRUNCATE is blocked until the long SELECT finishes.
I'm looking for ways to avoid the conflict. One way would be to do incremental updates to the users table - that's not an option yet.
What if I rename the users table to users_YYYYMMDD? Would that still be blocked by SELECT? If it's not blocked, then I could rename users out of the way, and then recreate it with fresh data as plain 'users'. Then I'd have a cron job dropping old users tables when they get too old.
Yes. Renaming the table would interrupt the ongoing SELECT. The best approach would be (if possible) to isolate the SELECT itself. You can consider renaming the "users" table before the SELECT starts (say renamed to users_orig) and then execute the SELECT on user_orig table and drop it (if the data is no more needed) after the SELECT finishes. Instead of TRUNCATE, you can consider re-creating the "users" table and populating the data. If you take this approach, you will need to be careful regarding privileges/grants and dependencies on the table.
Or the second approach would be --
Create a table called users_orig from the "users" table and execute SELECT on user_orig table and let the TRUNCATE/data-repopulation operation run on "users" table. This will be a problem if the data is huge. It might take up your hardware resources.
Third and simple approach would be to -
Execute SELECT and TRUNCATE at different times.
All of the above approaches are without considering data-size and other critical aspects of environment, which you need to worry about.
Regards,
Venkata Balaji N
Fujitsu Australia
as there is no option for incremental update/insert on user and renaming will have app query errors
I guess
1) creating temporary table (say temp_users) on table users with required data/columns-list and index on column user_id,
...this will be faster as there will be no joins with other tables
2) also need index on table auths_with_trans column user_id
3) replacing users with temp_users in BEGIN block
with this may reduce impact/dependency on regular crontab
Thanks
Sridhar BN
On Fri, Sep 11, 2015 at 10:52 AM, Venkata Balaji N <nag1010@gmail.com> wrote:
On Thu, Sep 10, 2015 at 3:54 AM, Florin Andrei <florin@andrei.myip.org> wrote:Once in a while, I have a report running a complex query such as this:
BEGIN;declare "SQL_CUR0000000004919850" cursor with hold for SELECT "auths_with_trans"."user_id" AS "user_id (auths_with_trans)",
MAX("auths_with_trans"."user_created") AS "TEMP(attr:user_created:ok)(2099950671)(0)",
MIN("auths_with_trans"."user_created") AS "TEMP(attr:user_created:ok)(99676510)(0)",
MIN("auths_with_trans"."trans_time") AS "usr:Calculation_6930907163324031:ok",
MIN("auths_with_trans"."auth_created") AS "usr:Calculation_9410907163052141:ok"
FROM "public"."users" "users"
LEFT JOIN "public"."auths_with_trans" "auths_with_trans" ON ("users"."user_id" = "auths_with_trans"."user_id")
GROUP BY 1;fetch 100 in "SQL_CUR0000000004919850"
But it takes a long time to complete, and meanwhile a cron job tries to rebuild the users table by first doing "TRUNCATE TABLE users" and then repopulating it with data. Obviously, TRUNCATE is blocked until the long SELECT finishes.
I'm looking for ways to avoid the conflict. One way would be to do incremental updates to the users table - that's not an option yet.
What if I rename the users table to users_YYYYMMDD? Would that still be blocked by SELECT? If it's not blocked, then I could rename users out of the way, and then recreate it with fresh data as plain 'users'. Then I'd have a cron job dropping old users tables when they get too old.Yes. Renaming the table would interrupt the ongoing SELECT. The best approach would be (if possible) to isolate the SELECT itself. You can consider renaming the "users" table before the SELECT starts (say renamed to users_orig) and then execute the SELECT on user_orig table and drop it (if the data is no more needed) after the SELECT finishes. Instead of TRUNCATE, you can consider re-creating the "users" table and populating the data. If you take this approach, you will need to be careful regarding privileges/grants and dependencies on the table.Or the second approach would be --Create a table called users_orig from the "users" table and execute SELECT on user_orig table and let the TRUNCATE/data-repopulation operation run on "users" table. This will be a problem if the data is huge. It might take up your hardware resources.Third and simple approach would be to -Execute SELECT and TRUNCATE at different times.All of the above approaches are without considering data-size and other critical aspects of environment, which you need to worry about.Regards,Venkata Balaji NFujitsu Australia
On 9/11/15 4:50 AM, Sridhar N Bamandlapally wrote: > 1) creating temporary table (say temp_users) on table users with > required data/columns-list and index on column user_id, > ...this will be faster as there will be no joins with other tables > > 2) also need index on table auths_with_trans column user_id > > 3) replacing users with temp_users in BEGIN block That's not really going to help unless the slow part is that you're using a cursor (which is why I don't like them...) FWIW, now that there's better transactional visibility support in the catalogs it might be possible to reduce the lock contention of trunkcate. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com