Re: DBT-3 with SF=20 got failed - Mailing list pgsql-hackers
| From | Tomas Vondra |
|---|---|
| Subject | Re: DBT-3 with SF=20 got failed |
| Date | |
| Msg-id | 56054353.8070005@2ndquadrant.com |
| In response to | Re: DBT-3 with SF=20 got failed (Robert Haas <robertmhaas@gmail.com>) |
| Responses | Re: DBT-3 with SF=20 got failed |
| List | pgsql-hackers |
On 09/25/2015 02:54 AM, Robert Haas wrote:
> On Thu, Sep 24, 2015 at 1:58 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> Meh, you're right - I got the math wrong. It's 1.3% in both cases.
>>
>> However the question still stands - why should we handle the
>> over-estimate in one case and not the other? We're wasting the
>> same fraction of memory in both cases.
>
> Well, I think we're going around in circles here. It doesn't seem
> likely that either of us will convince the other.

Let's agree we disagree ;-) That's perfectly OK, no hard feelings.

> But for the record, I agree with you that in the scenario you lay
> out, it's about the same problem in both cases. I could argue
> that it's slightly different because of [ tedious and somewhat
> tenuous argument omitted ], but I'll spare you that.

OK, although that kinda prevents further discussion.

> However, consider the alternative scenario where, on the same
> machine, perhaps even in the same query, we perform two hash joins,
> one of which involves hashing a small table (say, 2MB) and one of
> which involves hashing a big table (say, 2GB). If the small query
> uses twice the intended amount of memory, probably nothing bad will
> happen. If the big query does the same thing, a bad outcome is much
> more likely. Say the machine has 16GB of RAM. Well, a 2MB
> over-allocation is not going to break the world. A 2GB
> over-allocation very well might.

I've asked about case A. You've presented case B and shown that indeed,
the limit seems to help there. I don't see how that makes any difference
in case A, which is what I asked about.

> I really don't see why this is a controversial proposition. It seems
> clear as daylight from here.

I wouldn't say controversial, but I do see the proposed solution as
misguided, because we're fixing A and claiming to also fix B. Not only
are we not really fixing B, we may actually make things needlessly
slower for people who don't have any problems with B at all.

We've run into a problem with allocating more than MaxAllocSize. The
proposed fix (imposing an arbitrary limit) is also supposedly fixing
over-estimation problems, but actually it is not (IMNSHO).

And I think my view is supported by the fact that solutions that seem to
actually fix the over-estimation properly have emerged - I mean the
"let's not build the buckets at all, until the very end" and "let's
start with nbatches=0" approaches discussed yesterday. (And I'm not
saying that just because I proposed those two things.)

Anyway, I think you're right that we're going in circles here. I think
we both presented all the arguments we had and we still disagree. I'm
not going to continue with this - I'm unlikely to win an argument
against two committers if that hasn't happened so far.

Thanks for the discussion though.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
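[Editor's note: for readers following the thread, here is a rough back-of-the-envelope sketch of the MaxAllocSize issue being debated. It is not PostgreSQL source; the row estimate, the one-pointer-per-tuple sizing, and the power-of-two rounding are simplified assumptions meant only to show how an inflated row estimate can push the hash-join bucket array past the 1 GB palloc limit.]

```c
/*
 * Illustrative sketch, not PostgreSQL executor code: why an over-estimated
 * row count can make the hash-join bucket array exceed MaxAllocSize.
 */
#include <stdio.h>
#include <stdint.h>

#define MAX_ALLOC_SIZE ((size_t) 0x3fffffff)   /* PostgreSQL's MaxAllocSize, ~1 GB */

int main(void)
{
    /* Assumed over-estimate of the inner relation's row count. */
    double estimated_rows = 300e6;

    /* Assume roughly one bucket per expected tuple, rounded up to a power of two. */
    size_t nbuckets = 1;
    while (nbuckets < (size_t) estimated_rows)
        nbuckets <<= 1;                         /* ends at 2^29, ~537M buckets */

    /* One pointer per bucket slot (8 bytes on a 64-bit build). */
    size_t bucket_bytes = nbuckets * sizeof(void *);

    printf("nbuckets = %zu, bucket array = %zu MB\n",
           nbuckets, bucket_bytes >> 20);
    printf("exceeds MaxAllocSize? %s\n",
           bucket_bytes > MAX_ALLOC_SIZE ? "yes" : "no");
    return 0;
}
```

With these assumptions the bucket array alone comes to about 4 GB, well past the 1 GB allocation limit, even though the tuples themselves may never materialize in that volume; this is the over-estimation scenario the proposals above ("build buckets at the very end", "start with nbatches=0") aim to address.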