On 24.2.2015 05:09, Andrew Gierth wrote:
>>>>>> "Tomas" == Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>
> Tomas> I believe the small regressions (1-10%) for small data sets,
> Tomas> might be caused by this 'random padding' effect, because that's
> Tomas> probably where L1/L2 cache is most important. For large datasets
> Tomas> the caches are probably not as efficient anyway, so the random
> Tomas> padding makes no difference,
>
> Except that _your own results_ show that this is not the case.
>
> In your first set of results you claimed a reduction in performance
> to 91% for a query which is _in no way whatsoever_ affected by any
> of the code changes. How is this not noise?
>
> I refer specifically to this case from your spreadsheet:
>
> query select * from (select * from stuff_text order by randtxt
> offset 100000000000) foo
> type text
> scale 5
> master 26.949
> datum 28.051
> numeric 29.734
> datum 96%
> numeric 91%
>
> This query does not invoke any code path touched by either the datum
> or numeric patches! The observed slowdown is therefore just noise
> (assuming here that your timings are correct).

I don't see how that contradicts what I said. Weren't you claiming that
changes / random padding in unrelated functions may impact other places
by introducing different patterns of cache misses etc.?

But yes, I do agree the first set of results is rather noisy, at least
partially due to running on a CPU (i7-4770R) with some power-management
features etc. That's why I did the follow-up tests on a different system
with CPUs that don't do that - at least the i5 does not, and the
results are much better IMHO.

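
For reference, the percentages in the quoted row are consistent with
master time divided by patched time. A minimal sketch, assuming that is
how the spreadsheet computes them (the helper name relative_speed is
only illustrative, the timings are copied from the row above):

```python
# Timings in seconds, copied from the quoted spreadsheet row (type text, scale 5).
timings = {"master": 26.949, "datum": 28.051, "numeric": 29.734}


def relative_speed(master: float, patched: float) -> int:
    """Return master/patched as a rounded percentage.

    Assumption: this is the formula behind the 'datum' and 'numeric'
    percentage columns; it reproduces the quoted values.
    """
    return round(100 * master / patched)


# Compare each patched build against master.
percentages = {
    name: relative_speed(timings["master"], seconds)
    for name, seconds in timings.items()
    if name != "master"
}

print(percentages)  # {'datum': 96, 'numeric': 91}
```

So the "91%" corresponds to a difference of under 3 seconds on a ~27
second run.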
> Whether that case can be improved by tweaking the _text_
> abbreviation code is another question, one which is not relevant to
> either of the patches currently in play.
I don't think I suggested that.
--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services