Re: [HACKERS] lztext.c - Mailing list pgsql-hackers

From wieck@debis.com (Jan Wieck)
Subject Re: [HACKERS] lztext.c
Date
Msg-id m11qS8T-0003kGC@orion.SAPserv.Hamburg.dsh.de
In response to lztext.c  (Tatsuo Ishii <t-ishii@sra.co.jp>)
Responses Re: [HACKERS] lztext.c
List pgsql-hackers
Tatsuo Ishii wrote:

> I'm going to commit changes to make lztextlen() aware of
> multi-byte. While doing the work, I found that no POSITION() or
> SUBSTRING() for lztext has been implemented in the file.

    Thanks for that. I usually don't have multi-byte support
    compiled in, and it's surely better if you do the extension
    and tests.

    I know that a lot of functions are missing so far, especially
    comparison and the ones you mentioned. I planned to get back
    to it once the multi-byte support is in.

> BTW, does anybody work on making lztext indexable?  If no, I will take
> care of it with above additions.

    IMHO something questionable.

    A compressed data type is preferred to store large amounts of
    data.  Indexing large fields OTOH is something to prevent  by
    database  design.   The  new  type  at hand offers reasonable
    compression rates only above some size of input.

    OTOOH, it might get someone around the btree split problems
    some of us encountered, and which I was able to trigger with
    field contents above 2K already. In such a case it can be a
    last resort.

    I'd like to know what others think.

    Don't spend much effort on comparison and the SUBSTRING()
    things right now. I already have an additional, generalized
    decompressor in mind that can be used in the comparison, for
    example, to decompress two values on the fly and stop the
    comparison at the first difference, which usually happens
    early in two random datums.
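
    To illustrate the idea only, here is a minimal sketch in C. It
    does not use the lztext format: it assumes a toy run-length
    encoding of (count, byte) pairs, and the names rle_stream,
    rle_open, rle_next and rle_compare are hypothetical. The point
    is just that each value is decompressed one byte at a time, so
    the comparison can stop at the first difference.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical toy format: a sequence of (count, byte) pairs.
 * This is an assumption for illustration, NOT the lztext format.
 */
typedef struct
{
    const unsigned char *src;   /* next unread compressed byte */
    size_t      left;           /* compressed bytes remaining */
    unsigned char value;        /* byte emitted by the current run */
    unsigned int run;           /* bytes still to emit from the run */
} rle_stream;

/* Prepare a stream over len bytes of compressed input. */
void
rle_open(rle_stream *s, const unsigned char *data, size_t len)
{
    assert(len % 2 == 0);       /* input must be whole (count, byte) pairs */
    s->src = data;
    s->left = len;
    s->value = 0;
    s->run = 0;
}

/* Return the next decompressed byte (0..255), or -1 at end of data. */
int
rle_next(rle_stream *s)
{
    while (s->run == 0)         /* also skips runs with a zero count */
    {
        if (s->left < 2)
            return -1;
        s->run = s->src[0];
        s->value = s->src[1];
        s->src += 2;
        s->left -= 2;
    }
    s->run--;
    return s->value;
}

/*
 * Compare two compressed values bytewise without materializing
 * either one. Returns <0, 0 or >0 like memcmp(); a value that is
 * a prefix of the other sorts first.
 */
int
rle_compare(const unsigned char *a, size_t alen,
            const unsigned char *b, size_t blen)
{
    rle_stream  sa;
    rle_stream  sb;

    rle_open(&sa, a, alen);
    rle_open(&sb, b, blen);

    for (;;)
    {
        int         ca = rle_next(&sa);
        int         cb = rle_next(&sb);

        if (ca != cb)
            return (ca < cb) ? -1 : 1;  /* first difference: stop here */
        if (ca == -1)
            return 0;                   /* both ended together: equal */
    }
}
```

    With a real LZ-style decompressor the stream state would also
    have to carry the history window, but the compare loop would
    look the same.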

    Tell me when you have the multi-byte (and maybe Cyrillic?)
    stuff committed and I'll get my hands back on the code.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #
