Locale timings - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Locale timings
Date
Msg-id Pine.LNX.4.30.0111261852030.612-100000@peter.localdomain
List pgsql-hackers
I did some "benchmarks" to check whether --enable-locale with LC_ALL=C is
just as fast as --disable-locale, to possibly justify making locale
support the default.  This test only covers locale-aware comparisons,
which seems to be the critical aspect for all intents and purposes.

I loaded a table of a single text column with 454240 rows of English
words.  The table had a size of 21.5 MB.  The values were explicitly
de-sorted, but the order was the same across all test runs.  Then I ran
SELECT * FROM test ORDER BY 1; and timed the wall-clock response time a
few times.  All configuration parameters were left at the default.

The averaged results follow.  Some logarithmic buffering cleverness
appeared to surface, but the results are still distinct enough to be
useful.

no locale:        58s
locale=C:         78s    (ca. 33% slower)
locale=en_US:    118s    (ca. 100% slower)

This confused me, because in my C library a strcoll() call with locale=C
is handed to strcmp() quite directly.  A look into varlena.c:varstr_cmp()
shows that the locale-aware path does some extra copying because there is
no strncoll() function we can use with non-terminated strings.
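
To illustrate the extra work, here is a minimal standalone sketch of that
copy-then-compare pattern.  It is not the actual varstr_cmp() code: the
function name counted_strcoll() is made up, and it uses malloc()/free()
where the backend would use palloc()/pfree().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not the real varstr_cmp()): compare two byte
 * strings that are NOT NUL-terminated under the current LC_COLLATE rules.
 * strcoll() only accepts NUL-terminated input, so each argument has to be
 * copied into a temporary buffer first.
 */
int
counted_strcoll(const char *a, size_t alen, const char *b, size_t blen)
{
    char   *abuf;
    char   *bbuf;
    int     result;

    assert(a != NULL && b != NULL);

    /* Two allocations per comparison: this is the overhead being measured. */
    abuf = malloc(alen + 1);
    bbuf = malloc(blen + 1);
    if (abuf == NULL || bbuf == NULL)
    {
        free(abuf);
        free(bbuf);
        abort();                /* sketch only: no real error handling */
    }

    /* Two copies per comparison, plus the terminators strcoll() needs. */
    memcpy(abuf, a, alen);
    abuf[alen] = '\0';
    memcpy(bbuf, b, blen);
    bbuf[blen] = '\0';

    result = strcoll(abuf, bbuf);

    free(abuf);
    free(bbuf);
    return result;
}
```

Every comparison a sort performs pays for two allocations and two copies
before strcoll() even runs, regardless of which locale is active.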

For testing's sake I replaced the two palloc() calls in that function with
alloca(), which is presumably the fastest possible memory allocator.
Result:

locale=C,alloca:    67s    (ca. 15% slower)

This shows that we're wasting quite a bit of time allocating memory --
probably not only in this place.  I'm pretty sure that the majority of the
rest of the gap comes from the memcpy() operations.  Not that there's a
whole lot we can do about either of these things.

However, I feel that we could reasonably cope with this situation by
replacing

#ifdef USE_LOCALE
/* locale-aware code */
#else
/* non-locale code */
#endif

with

if (locale_is_not_C)
{   /* locale-aware code */
}
else
{   /* non-locale code */
}
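
As a rough standalone sketch of what such a run-time test could look like:
the names locale_is_not_c() and text_cmp() are invented for illustration,
and querying setlocale() on every call is a simplification (the backend
would presumably determine the flag once and cache it).

```c
#include <assert.h>
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical check: is the current collation anything other than C/POSIX?
 * setlocale() with a NULL locale argument only queries the current setting.
 */
static bool
locale_is_not_c(void)
{
    const char *name = setlocale(LC_COLLATE, NULL);

    if (name == NULL)
        return true;            /* unknown: take the safe, locale-aware path */
    return strcmp(name, "C") != 0 && strcmp(name, "POSIX") != 0;
}

/* Hypothetical comparison of two counted (non-terminated) strings. */
int
text_cmp(const char *a, size_t alen, const char *b, size_t blen)
{
    assert(a != NULL && b != NULL);

    if (locale_is_not_c())
    {
        /* locale-aware code: copy to NUL-terminated buffers for strcoll() */
        char   *abuf = malloc(alen + 1);
        char   *bbuf = malloc(blen + 1);
        int     result;

        if (abuf == NULL || bbuf == NULL)
            abort();            /* sketch only: no real error handling */
        memcpy(abuf, a, alen);
        abuf[alen] = '\0';
        memcpy(bbuf, b, blen);
        bbuf[blen] = '\0';
        result = strcoll(abuf, bbuf);
        free(abuf);
        free(bbuf);
        return result;
    }
    else
    {
        /* non-locale code: bytewise compare, no allocation and no copying */
        size_t  minlen = (alen < blen) ? alen : blen;
        int     result = memcmp(a, b, minlen);

        if (result == 0)
            result = (alen > blen) - (alen < blen);   /* shorter sorts first */
        return result;
    }
}
```

In the C locale the else branch does no allocation and no copying, which is
the point of the run-time test.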

This run-time check should have a minuscule performance impact, and it's
probably the plan for the multibyte side of things as well.

-- 
Peter Eisentraut   peter_e@gmx.net


