LONG - Mailing list pgsql-hackers

From wieck@debis.com (Jan Wieck)
Subject LONG
Msg-id m11wf9W-0003kGC@orion.SAPserv.Hamburg.dsh.de
List pgsql-hackers
I thought about the huge variable-size text type a little
    more, and I think I could get the following implementation
    to work reliably for our upcoming release.

    For any relation having one or more LONG data type
    attributes, another relation (named pg_<something>) is
    created, accessible only to superusers (and internal access
    routines). All LONG data items are stored as a reference
    into that relation, split up automatically so the chunks fit
    into the installation-specific tuple size limit. Items are
    added/updated/removed totally transparently.
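
    The split-and-reference scheme above can be sketched roughly as
    follows. This is only an illustration of the idea, not actual
    PostgreSQL internals; the names (store_long, fetch_long, the
    side-relation dictionary) and the chunk size are all hypothetical.

```python
# Hypothetical sketch of the proposed out-of-line storage: a LONG
# value is split into chunks small enough to fit the installation's
# tuple size limit, stored in a side relation keyed by (value_id,
# seq), and reassembled transparently on access.

CHUNK_SIZE = 8000  # stand-in for the installation-specific tuple limit


def store_long(side_relation, value_id, data):
    """Split `data` into chunks and store them under `value_id`."""
    for seq, offset in enumerate(range(0, len(data), CHUNK_SIZE)):
        side_relation[(value_id, seq)] = data[offset:offset + CHUNK_SIZE]


def fetch_long(side_relation, value_id):
    """Reassemble the chunks for `value_id` in sequence order."""
    chunks = []
    seq = 0
    while (value_id, seq) in side_relation:
        chunks.append(side_relation[(value_id, seq)])
        seq += 1
    return b"".join(chunks)
```

    A 20001-byte value would land in three chunk tuples here, and a
    fetch concatenates them back into the original value.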

    It would not be indexable (jesus no!), and using it in a WHERE
    clause would be expensive. But whoever uses a WHERE clause on a
    non-indexable data type (possibly containing megabytes per item)
    is a silly fool who should get what he asked for: poor response
    times.

    I'd like to name it LONG, like Oracle's data type with its 2G
    maximum, even though I intend to restrict the data size to some
    megabytes for now. All the data must still be processable in
    memory, and there might be multiple instances of one item in
    memory at the same time, so a real 2G data type is impossible
    with this kind of approach. But isn't a 64MB #define'd limit
    enough for now? Even that could still blow away many
    installations due to limited memory and/or swap space. And we
    can adjust that #define in 2001 (an address space odyssey),
    when 64bit hardware and plenty of GB of real memory is the
    low-end standard *1).
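
    The #define'd ceiling would simply be enforced at insert time.
    A minimal sketch, with MAX_LONG_SIZE standing in for the
    hypothetical compile-time constant:

```python
# Sketch of the proposed compile-time cap on LONG values: since every
# item must still fit in memory (possibly several times over), inserts
# above the #define'd limit are rejected outright.  MAX_LONG_SIZE and
# check_long_size are illustrative names, not real PostgreSQL symbols.

MAX_LONG_SIZE = 64 * 1024 * 1024  # the proposed 64MB starting point


def check_long_size(data):
    """Reject a LONG value that exceeds the configured ceiling."""
    if len(data) > MAX_LONG_SIZE:
        raise ValueError("LONG value exceeds MAX_LONG_SIZE")
    return data
```

    Raising the ceiling later would then be a one-line change, as the
    mail suggests.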

    I already thought that the 8K default BLKSIZE is a little out
    of date for today's hardware standards. Two weeks ago I
    bought a PC for my kids. It's a 433MHz Celeron with 64MB RAM
    and a 6GB disk, and cost about $500 (exactly DM 999,-- at
    Media Markt). Given current on-disk-cache <-> memory and
    cache <-> surface transfer rates, the 8K size seems a little
    archaic to me.

    Thus, if we can get a LONG data type into 7.0, and maybe
    adjust the default BLKSIZE to something more up to date,
    wouldn't the long tuple problem quietly go away?

    Should I go ahead on this or not?


Jan

*1) Or will it be TB/PB?

    I hesitate to estimate, because it was only a short time ago
    that a 4G hard disk was high-end. Today, IBM offers a 3.5''
    disk with 72G formatted capacity, and 64M is the low end of
    real memory, so where's the limit?

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #
