Re: Issue executing query from container - Mailing list pgsql-general

From:           Eudald Valcàrcel Lacasa
Subject:        Re: Issue executing query from container
Date:
Msg-id:         CANEx+AXSzuf4wPeNHTcVg29EFhHarwMEnMgYHbrbJgdmHTENoA@mail.gmail.com
In response to: Re: Issue executing query from container (Eudald Valcàrcel Lacasa <eudald.valcarcel@gmail.com>)
Responses:      Re: Issue executing query from container
                Re: Issue executing query from container
List:           pgsql-general
Hello Tom,

Sorry to bump this conversation, but I'm still dealing with this issue. I come back because I've found something different that throws away all the hypotheses (or at least that's what I think).

I've run the automated script several times, and the duration of the query varies widely. Using the same query plan, I have:

2020-10-27 11:08:39.027 CET [18671] cefron@kontriki LOG:  duration: 5.735 ms  plan:
        Query Text: UPDATE import_temp_256 AS tmp
                    SET status = 11
                    FROM blacklist_central blc
                    WHERE lower(tmp.email) = blc.value
                      AND tmp.status = 1
                      AND blc.reason = 'hardbounce'
        Update on import_temp_256 tmp  (cost=100.00..2878.74 rows=3 width=501)
          ->  Nested Loop  (cost=100.00..2878.74 rows=3 width=501)
                Join Filter: (lower((tmp.email)::text) = (blc.value)::text)
                ->  Seq Scan on import_temp_256 tmp  (cost=0.00..2750.96 rows=211 width=401)
                      Filter: (status = 1)
                ->  Materialize  (cost=100.00..116.71 rows=3 width=130)
                      ->  Foreign Scan on blacklist_central blc  (cost=100.00..116.70 rows=3 width=130)
2020-10-27 11:08:39.027 CET [18671] cefron@kontriki LOG:  duration: 6.122 ms

And, executed at 11:20:54 (but completed at 11:46:32), the following:

2020-10-27 11:46:32.714 CET [30340] cefron@kontriki LOG:  duration: 1538286.304 ms  plan:
        Query Text: UPDATE import_temp_257 AS tmp
                    SET status = 11
                    FROM blacklist_central blc
                    WHERE lower(tmp.email) = blc.value
                      AND tmp.status = 1
                      AND blc.reason = 'hardbounce'
        Update on import_temp_257 tmp  (cost=100.00..2878.74 rows=3 width=501)
          ->  Nested Loop  (cost=100.00..2878.74 rows=3 width=501)
                Join Filter: (lower((tmp.email)::text) = (blc.value)::text)
                ->  Seq Scan on import_temp_257 tmp  (cost=0.00..2750.96 rows=211 width=401)
                      Filter: (status = 1)
                ->  Materialize  (cost=100.00..116.71 rows=3 width=130)
                      ->  Foreign Scan on blacklist_central blc  (cost=100.00..116.70 rows=3 width=130)
2020-10-27 11:46:32.715 CET [30340] cefron@kontriki LOG:  duration: 1538287.737 ms

As you can see, the plans don't seem to differ from each other, but the durations are abysmally different. I have the database logging everything; what should I look for to help me understand this strange behavior?

Thank you,
Eudald

On Wed, Jul 15, 2020 at 18:37, Eudald Valcàrcel Lacasa (<eudald.valcarcel@gmail.com>) wrote:
>
> Hello Tom,
> I've noticed something that may be a factor in this problem.
>
> In the automated script, 5 or 6 seconds before this query happens, the
> foreign table is created.
> If I create the foreign table manually, skip its creation in the
> automated script, and then run the automated script,
> the query takes 5-6 seconds to execute instead of 2 hours.
>
> Hope this info is helpful to dig further.
> Thanks,
> Eudald
>
> On Wed, Jul 15, 2020 at 16:42, Tom Lane (<tgl@sss.pgh.pa.us>) wrote:
> >
> > Eudald Valcàrcel Lacasa <eudald.valcarcel@gmail.com> writes:
> > > After running the query both manually and with the script, I have the
> > > following logs:
> >
> > > MANUALLY:
> > > Update on import_temp_2 tmp  (cost=116.73..17352.10 rows=5557 width=293)
> > >   ->  Hash Join  (cost=116.73..17352.10 rows=5557 width=293)
> > >         Hash Cond: (lower((tmp.email)::text) = lower((bl.value)::text))
> > >         ->  Seq Scan on import_temp_2 tmp  (cost=0.00..14864.20 rows=370496 width=193)
> > >               Filter: (status = 1)
> > >         ->  Hash  (cost=116.70..116.70 rows=3 width=130)
> > >               Buckets: 32768 (originally 1024)  Batches: 2 (originally 1)  Memory Usage: 3841kB
> > >               ->  Foreign Scan on blacklist_central bl  (cost=100.00..116.70 rows=3 width=130)
> >
> > > AUTOMATED:
> > > Update on import_temp_2 tmp  (cost=100.00..13295.86 rows=15 width=500)
> > >   ->  Nested Loop  (cost=100.00..13295.86 rows=15 width=500)
> > >         Join Filter: (lower((tmp.email)::text) = lower((bl.value)::text))
> > >         ->  Seq Scan on import_temp_2 tmp  (cost=0.00..13118.74 rows=1007 width=400)
> > >               Filter: (status = 1)
> > >         ->  Materialize  (cost=100.00..116.71 rows=3 width=130)
> > >               ->  Foreign Scan on blacklist_central bl  (cost=100.00..116.70 rows=3 width=130)
> >
> > So the question is why you are getting an estimate of 370496 import_temp_2
> > rows with status = 1 in the first case, and only 1007 rows in the second.
> >
> > I suspect that the true number of rows is quite large, causing the
> > nested-loop plan to run slowly.  (Is the row estimate of 3 for the
> > foreign scan anywhere near reality, either?)
> >
> > You may need to insert a manual ANALYZE in your automated process to
> > ensure that import_temp_2 has up-to-date stats before you try to do
> > this step.  It seems somewhat likely that autovacuum takes care of
> > that for you in the "manual" case, but its reaction time is too slow
> > to fill the gap for the automated process.
> >
> >                         regards, tom lane
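[Editor's note: following Tom Lane's suggestion above, a minimal sketch of what the automated step could look like, assuming the script issues plain SQL; the table and column names are taken from the thread, and the exact table name in any given run would be the freshly created import_temp_NNN table.]

```sql
-- Refresh planner statistics for the freshly loaded temp table before
-- the UPDATE, so the join is costed from real row counts instead of the
-- default estimates a brand-new table gets before autovacuum catches up.
ANALYZE import_temp_256;

UPDATE import_temp_256 AS tmp
SET status = 11
FROM blacklist_central blc
WHERE lower(tmp.email) = blc.value
  AND tmp.status = 1
  AND blc.reason = 'hardbounce';
```

ANALYZE on a local table is cheap relative to a misplanned nested loop. Note it does not address the foreign-scan side: whether the estimate of 3 rows for blacklist_central is anywhere near reality (as Tom asks) would still need checking separately.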