OK, this sounds very interesting. We already know about many different problems with NULL values because we make heavy use of various GROUP BYs and window functions, so we replace NULLs everywhere with 'unknown' or similar.
But maybe there is some problem in the data import. I will check the data.
Josef Machytka <josef.machytka@gmail.com> writes:
> Yes, I am sorry, dblink is involved - I just did not see it as significant.
> We start several processes in parallel to speed up whole billing
> calculation otherwise it would take 10+ hours to calculate everything in
> serial.
> Ok, so at least "unknown error" is explained.

Well, we have a theory about where it came from, but still not enough information to improve the behavior. Did you look to see what happened on the remote server?

> But problem with "NOT IN" remains.
> When I replaced "NOT IN" with "NOT EXISTS" query ended after ~3 hours
> without any problems. Even over dblink.

You do know that NOT IN and NOT EXISTS behave quite differently with respect to nulls? I'm suspicious that the real problem here is that your query is just wrong when written with NOT IN, and it specifies some unreasonable amount of computation. Possibly something is running out of memory and not dealing with the case very well, leading to the unhelpful error message.
FWIW, just about every bug report I've ever seen about NOT IN boiled down to the complainant's subquery returning one or more nulls and the complainant not understanding what will happen if it does. Unfortunately, that's not a bug, it's the behavior required by the SQL standard.
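
To make the difference concrete, here is a minimal sketch of how a single null in the subquery changes the result. It is an illustration only: the tables `invoices` and `excluded` are hypothetical and not taken from the billing query in this thread, and it uses an in-memory SQLite database through Python's standard `sqlite3` module rather than PostgreSQL and dblink. Both engines should follow the same three-valued logic for this case, but treat that as an assumption to verify on your own server.

```python
import sqlite3

# Hypothetical tables for illustration; not from the original query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (customer_id INTEGER);
    CREATE TABLE excluded (customer_id INTEGER);
    INSERT INTO invoices VALUES (1), (2), (3);
    INSERT INTO excluded VALUES (2), (NULL);
""")

# NOT IN: "x NOT IN (2, NULL)" is either false or null, never true,
# so the WHERE clause rejects every row.
not_in_rows = conn.execute("""
    SELECT customer_id FROM invoices
    WHERE customer_id NOT IN (SELECT customer_id FROM excluded)
    ORDER BY customer_id
""").fetchall()

# NOT EXISTS: the null row in "excluded" never satisfies the equality,
# so it is ignored and customers 1 and 3 are returned.
not_exists_rows = conn.execute("""
    SELECT i.customer_id FROM invoices i
    WHERE NOT EXISTS (
        SELECT 1 FROM excluded e WHERE e.customer_id = i.customer_id
    )
    ORDER BY i.customer_id
""").fetchall()

print(not_in_rows)      # []
print(not_exists_rows)  # [(1,), (3,)]
```

If the subquery column can contain nulls and the NOT IN form must be kept, adding `WHERE customer_id IS NOT NULL` inside the subquery should make the two forms return the same rows in this sketch.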