Thread: lztext and parser

lztext and parser

From

wieck@debis.com (Jan Wieck)

Date:

24 November 1999, 22:19:40

Hi,

    I'm hacking in the operator functions for the lztext type and
    have a little question.

    With the generic per-byte decompressor I added, it  would  be
    very easy to produce functions like

        bool lztext_text_eq(lztext, text)
        bool text_lztext_eq(text, lztext)

    too.  Comparison between lztext and text already works,
    because there are lztext->text and text->lztext conversion
    functions available and the parser typecasts automatically.

    So would it be a win or a dead end to provide those
    functions?  Does the parser always look for a direct
    comparison function first?  If so, it would be a win, because
    it would never choose to compress the text item and then
    compare two lztext values (which would be terrible).
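
To illustrate the kind of direct comparison function discussed above, here is a minimal sketch of comparing a compressed value against plain text one byte at a time, without materializing the decompressed string. It is not the actual lztext or pg_compress code: the (count, byte) run-length format and all `toy_*` names are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-byte decompressor state for a toy format made of
 * (count, byte) pairs. This stands in for the real LZ decompressor. */
typedef struct {
    const unsigned char *data;  /* compressed input */
    size_t len;                 /* compressed length in bytes */
    size_t pos;                 /* read position in data */
    unsigned left;              /* bytes remaining in the current run */
    unsigned char cur;          /* byte value of the current run */
} ToyDecompState;

void toy_decomp_init(ToyDecompState *st, const unsigned char *data, size_t len)
{
    st->data = data;
    st->len = len;
    st->pos = 0;
    st->left = 0;
    st->cur = 0;
}

/* Return the next decompressed byte (0..255), or -1 at end of input. */
int toy_decomp_next(ToyDecompState *st)
{
    while (st->left == 0) {
        if (st->pos + 1 >= st->len)
            return -1;              /* no complete pair left */
        st->left = st->data[st->pos];
        st->cur = st->data[st->pos + 1];
        st->pos += 2;
    }
    st->left--;
    return st->cur;
}

/* Equality of a compressed value and a plain one. Stops at the first
 * mismatch, so the plain side is never compressed and the compressed
 * side is never fully expanded into a buffer. */
bool toy_compressed_text_eq(const unsigned char *comp, size_t comp_len,
                            const unsigned char *plain, size_t plain_len)
{
    ToyDecompState st;
    size_t i;

    assert(comp != NULL || comp_len == 0);
    toy_decomp_init(&st, comp, comp_len);
    for (i = 0; i < plain_len; i++) {
        if (toy_decomp_next(&st) != (int) plain[i])
            return false;
    }
    /* Equal only if the compressed side is exhausted as well. */
    return toy_decomp_next(&st) == -1;
}
```

A caller would pass the compressed bytes and the plain bytes with their lengths; whether the parser would actually pick such a function over the typecast path is exactly the open question above.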


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #

Re: [HACKERS] lztext and parser

From

Karel Zak - Zakkr

Date:

25 November 1999, 10:40:49

On Thu, 25 Nov 1999, Jan Wieck wrote:

> Hi,
> 
>     I'm hacking in the operator functions for the lztext type and
>     have a little question.

Hi,

I saw your lztext/pg_compress code in CVS and it is *very* nice.
I have a few small comments.

>     With the generic per-byte decompressor I added, it  would  be
>     very easy to produce functions like

It would be very good if the compression routines supported data
streams. I suppose you will add a per-byte compressor too?

Reasoning: which is faster,
  (1) reading standard data via I/O, or
  (2) reading compressed data via I/O and decompressing it
      (fast CPU vs. slow I/O)?
IMHO if (2) is faster, it is probably worth using the second way for
some data. But I don't know where this could be used in PgSQL :-)

>     too.  Comparison between lztext and text already works,
>     because there are lztext->text and text->lztext conversion
>     functions available and the parser typecasts automatically.
> 
>     So would it be a win or a dead end to provide those
>     functions?  Does the parser always look for a direct
>     comparison function first?  If so, it would be a win, because
>     it would never choose to compress the text item and then
>     compare two lztext values (which would be terrible).
> 

In lztext_eq(lztext *lz1, lztext *lz2) you use lztext_cmp().
I'm not sure that is good, because you must decompress first, but
you don't need the information about '<' or '>'; you only need '='.
Why not use memcmp() (or some other method) for this and compare
without decompression? Two equal strings are also equal in compressed
form, aren't they?

In some routines (datetime) in PgSQL I saw an internal cache. What
about adding one to lztext_cmp(), so that it does not decompress
everything each time?

Everything in this letter is just comments and suggestions; you can
always send it to /dev/null :-)
                        Karel

------------------------------------------------------------------------------
Karel Zak <zakkr@zf.jcu.cz>                      http://home.zf.jcu.cz/~zakkr/

Docs:         http://docs.linux.cz                          (big docs archive)    
Kim Project:  http://home.zf.jcu.cz/~zakkr/kim/              (process manager)
FTP:          ftp://ftp2.zf.jcu.cz/users/zakkr/              (C/ncurses/PgSQL)
------------------------------------------------------------------------------

Re: [HACKERS] lztext and parser

From

wieck@debis.com (Jan Wieck)

Date:

25 November 1999, 10:55:49

Karel wrote:

> In lztext_eq(lztext *lz1, lztext *lz2) you use lztext_cmp().
> I'm not sure that is good, because you must decompress first, but
> you don't need the information about '<' or '>'; you only need '='.
> Why not use memcmp() (or some other method) for this and compare
> without decompression? Two equal strings are also equal in compressed
> form, aren't they?

    For now, yes. But the current lztext implementation is still
    only a starting point. The final version might have some
    runtime-customizable parameters for the compression algorithm
    (good match size and minimum compression rate). Then, if some
    data is stored and the parameters are changed afterwards, this
    would no longer be true.
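
The point about changing parameters can be shown with a small sketch. It does not use the real lztext compressor: a toy run-length encoder stands in for it, and its `max_run` argument is a hypothetical tunable playing the role of the match-size and compression-rate settings mentioned above. The same input compressed under two settings yields different byte images, so a memcmp()-style equality test on the compressed form reports "not equal" for equal strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy run-length encoder, NOT the real lztext/pg_compress code.
 * Writes (count, byte) pairs to dst and returns the number of bytes
 * written. dst must have room for 2 * len bytes. max_run (1..255) is
 * a hypothetical runtime parameter limiting the length of one run. */
size_t toy_rle_compress(const unsigned char *src, size_t len,
                        unsigned char *dst, unsigned max_run)
{
    size_t out = 0;
    size_t i = 0;

    assert(max_run >= 1 && max_run <= 255);
    while (i < len) {
        unsigned char c = src[i];
        unsigned run = 1;

        /* Extend the run while the byte repeats and the limit allows. */
        while (i + run < len && src[i + run] == c && run < max_run)
            run++;
        dst[out++] = (unsigned char) run;
        dst[out++] = c;
        i += run;
    }
    return out;
}

/* Byte-wise equality of two compressed images. This is only a valid
 * string equality test if both images were produced with identical
 * compression parameters. */
int toy_compressed_memeq(const unsigned char *a, size_t alen,
                         const unsigned char *b, size_t blen)
{
    return alen == blen && memcmp(a, b, alen) == 0;
}
```

For example, compressing "aaaab" with a `max_run` of 255 gives the pairs (4,'a'),(1,'b'), while a `max_run` of 2 gives (2,'a'),(2,'a'),(1,'b'). Both decompress to the same string, yet the two images differ in length and content.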

> Everything in this letter is just comments and suggestions; you can
> always send it to /dev/null :-)

    I would never do so. All suggestions are welcome and might
    trigger another idea in someone else's head.


Jan
