Re: Memory-comparable Serialization of Data Types - Mailing list pgsql-hackers

From: Peter Geoghegan
Subject: Re: Memory-comparable Serialization of Data Types
Date:
Msg-id: CAH2-Wzn75vSM4SDWOVaYssndZfM3YNyC9U-vp6sbsirXAmw19Q@mail.gmail.com
In response to: Re: Memory-comparable Serialization of Data Types (Shichao Jin <jsc0218@gmail.com>)
Responses: Re: Memory-comparable Serialization of Data Types
List: pgsql-hackers
On Tue, Feb 11, 2020 at 12:19 PM Shichao Jin <jsc0218@gmail.com> wrote:
> Yes, this is exactly what I mean.

PostgreSQL doesn't have this capability. It might make sense to have
it for some specific data structures, such as tuples on internal
B-Tree pages -- those merely guide index scans, so some loss of
information relative to the native/base representation may be
acceptable there. However, that would be faster only because
memcmp() is generally cheaper than the underlying datatype's native
comparator, not because comparisons currently have to take place in
"the upper levels". There is some indirection/overhead involved in
going through SQL-callable operators, but not that much.
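
To illustrate what "memory-comparable" means in practice (just a
sketch in standard C, not anything that exists in PostgreSQL), an
int32 can be encoded so that a plain memcmp() on the encoded bytes
agrees with signed numeric order -- flip the sign bit, then store the
bytes big-endian:

#include <stdint.h>

/*
 * Hypothetical sketch: encode an int32 into a 4-byte key such that
 * memcmp() on two keys orders them the same way as the original
 * signed integers. Flipping the sign bit maps the signed range onto
 * the unsigned range; big-endian byte order makes bytewise
 * comparison match numeric comparison.
 */
static void
encode_int32_key(int32_t value, unsigned char key[4])
{
    uint32_t    u = (uint32_t) value ^ UINT32_C(0x80000000);

    key[0] = (unsigned char) (u >> 24);
    key[1] = (unsigned char) (u >> 16);
    key[2] = (unsigned char) (u >> 8);
    key[3] = (unsigned char) u;
}

/* memcmp(key_a, key_b, 4) now orders the keys the same way as
 * comparing the original integers would. */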

Note that such a representation has to lose information in at least
some cases. For example, case-insensitive collations would have to
lose information about the original case used (or else store the
original alongside the conditioned binary string). Note also that a
"one pass" representation that we can simply memcmp() will have to be
significantly larger in some cases, especially when collatable text
is involved. A strxfrm() blob is typically about 3.3x larger than the
original string, IIRC.
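
To make the size overhead concrete, here is a rough sketch of the
strxfrm() approach in standard C (again, not PostgreSQL code): the
transformed blob can be compared bytewise, but you first have to ask
strxfrm() how large the blob will be, and it is usually several times
larger than the original string:

#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main(void)
{
    const char *str = "serialization";  /* arbitrary example string */
    size_t      needed;
    char       *blob;

    setlocale(LC_COLLATE, "");  /* use the environment's collation */

    /* A zero-sized destination just reports how many bytes we need. */
    needed = strxfrm(NULL, str, 0);
    blob = malloc(needed + 1);
    if (blob == NULL)
        return 1;
    strxfrm(blob, str, needed + 1);

    /*
     * Two blobs built this way compare with strcmp() the same way the
     * original strings compare with strcoll() -- that's the property
     * that makes a bytewise comparison possible. Note the size
     * blow-up, though.
     */
    printf("original: %zu bytes, strxfrm() blob: %zu bytes\n",
           strlen(str), needed);

    free(blob);
    return 0;
}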

-- 
Peter Geoghegan


