Thread: 8k limit
Is there any chance of the 8k tuple limit going away in future releases of PostgreSQL? I was working on setting up a listserv archive with fields such as sentfrom, date, subject, and body, but the content of the body field would often exceed 8k because some people are just long-winded. I'd really rather not have to deal with the LO interface. Thanks. --Dan ----------------------------------------------------------------------- Daniel G. Delaney The Louisville Times Chorus Dionysos@Dionysia.org www.LouisvilleTimes.org www.Dionysia.org/~dionysos/ Dionysia Design ICQ Number: 8171285 www.Dionysia.com/design/ ----------------------------------------------------------------------- "Only two things are infinite: the universe and stupidity-- and I'm not sure about the former." --Albert Einstein
On Tue, 27 Oct 1998, Dan Delaney wrote:

> Is there any chance of the 8k tuple limit going away in
> future releases of PostgreSQL? I was working on setting up a
> listserv archive with fields such as sentfrom, date,
> subject, and body, but the content of the body field would
> often exceed 8k because some people are just long-winded.
> I'd really rather not have to deal with the LO interface.
> Thanks.
> --Dan

Put the body (and the header text) in large objects instead of storing
it as text. If there's a function that lets you search the contents of a
large object from SQL, I couldn't find it built in. You can either add
one as a C extension (if you do that, please share it), or let SQL
narrow the selections down and then do the search externally.

If all you were planning to offer was a full-text search, then you'd
probably be better off using something like a glimpse index than SQL.

<mike
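Mike's "let SQL narrow the selections down, then do the search externally" suggestion can be sketched like this. This is a hypothetical illustration, not anything from the thread: sqlite3 stands in for PostgreSQL so the sketch is self-contained, and the mail table layout and sample rows are invented.

```python
import re
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE mail"
           " (mesg_id INTEGER, sentfrom TEXT, subject TEXT, body TEXT)")
db.executemany("INSERT INTO mail VALUES (?, ?, ?, ?)", [
    (1, "dan@example.org",  "8k limit",     "some people are just long-winded"),
    (2, "mike@example.org", "Re: 8k limit", "use large objects instead"),
    (3, "dan@example.org",  "chorus news",  "rehearsal moved to Tuesday"),
])

# Step 1: let SQL narrow the candidates down on a header field
# (cheap for the server, and indexable)...
candidates = db.execute(
    "SELECT mesg_id, body FROM mail WHERE sentfrom = ?",
    ("dan@example.org",)).fetchall()

# Step 2: ...then do the finer-grained (here, regex) search externally,
# in the client, over the much smaller candidate set.
hits = [mid for mid, body in candidates if re.search(r"long-?winded", body)]
print(hits)
```

The trade-off is the usual one: the database does the coarse, indexed filtering, and the client pays only for scanning the rows that survive it.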
There are plans for doing what Bruce refers to as...what was his term?
It's basically 'row wrapping': the row limit is still 8k, but the server
is smart enough to link two rows together to give you larger than that.

It won't be in v6.4...but it (or something similar) should be in a
future release...

On Tue, 27 Oct 1998, Dan Delaney wrote:

> Is there any chance of the 8k tuple limit going away in
> future releases of PostgreSQL? I was working on setting up a
> listserv archive with fields such as sentfrom, date,
> subject, and body, but the content of the body field would
> often exceed 8k because some people are just long-winded.
> I'd really rather not have to deal with the LO interface.
> Thanks.
> --Dan

Marc G. Fournier                    Systems Administrator @ hub.org
primary: scrappy@hub.org     secondary: scrappy@{freebsd|postgresql}.org
At 4:15 +0200 on 28/10/98, The Hermit Hacker wrote:

> There are plans for doing what Bruce refers to as...what was his term?
> It's basically 'row wrapping': the row limit is still 8k, but the
> server is smart enough to link two rows together to give you larger
> than that.
>
> It won't be in v6.4...but it (or something similar) should be in a
> future release...

Here is an idea to solve the text problem.

Create a table for the bodies of the messages. Something like:

CREATE TABLE bodies (
    mesg_id   int4,
    chunk_no  int4,
    chunk     text
);

In the application, divide your text into chunks of, say, 6k. Supposing
your application is written in Perl, you'd have an array of chunks,
@chunks.

Then you write the headers of your message to the main mail table (the
one which has the headers). The message gets an ID from a sequence. You
get the value of that sequence.

Then you loop over the @chunks array, in each iteration inserting into
the bodies table the given message id, the loop iterator (as chunk_no),
and the content of $chunks[$i] itself.

Then, in your app, if you want to reconstruct a message, you just
retrieve its headers (including mesg_id). Then you:

SELECT chunk_no, chunk
  FROM bodies
 WHERE mesg_id = <your message>
 ORDER BY chunk_no;

Ignore the chunk_no in the returned set, and just concatenate all the
returned chunks in order. Then you have the body.

Advantages of this method over the large-objects interface:

(a) The text will be visible in psql in case you need to fix something.
(b) You will be able to do some rough searching in the form

    SELECT DISTINCT mesg_id
      FROM bodies
     WHERE chunk LIKE '%something%';

(c) pg_dump will dump the table, so you needn't come up with an
    elaborate backup scheme of your own (pg_dump DOES NOT back up LOBs).

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
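Herouth's scheme translates almost line-for-line into code. A minimal self-contained sketch, with two assumptions for illustration: sqlite3 stands in for PostgreSQL so the example runs anywhere, and the chunk size is shrunk to 16 bytes so the splitting is actually exercised. A real application would use a PostgreSQL client and chunks of ~6k to stay safely under the 8k row limit.

```python
import sqlite3

CHUNK = 16  # use ~6144 in practice to stay under the 8k row limit

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE bodies"
           " (mesg_id INTEGER, chunk_no INTEGER, chunk TEXT)")

def store_body(mesg_id, body):
    # Split the body into fixed-size chunks and insert one row per
    # chunk, numbered so the order can be reconstructed.
    for n, i in enumerate(range(0, len(body), CHUNK)):
        db.execute("INSERT INTO bodies VALUES (?, ?, ?)",
                   (mesg_id, n, body[i:i + CHUNK]))

def fetch_body(mesg_id):
    # Retrieve the chunks in order and concatenate them back together.
    rows = db.execute(
        "SELECT chunk FROM bodies WHERE mesg_id = ? ORDER BY chunk_no",
        (mesg_id,))
    return "".join(r[0] for r in rows)

store_body(1, "some people are just long-winded " * 4)
assert fetch_body(1) == "some people are just long-winded " * 4
```

The ORDER BY is the load-bearing part: without it the chunks can come back in any order, and the reassembled body is garbage.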
On Wed, 28 Oct 1998, Herouth Maoz wrote:

This is actually how I store the images in my photo album
(http://bleu.west.spy.net/~dustin/photo/). I Base64 encode all of the
image data, then extract it with something like this:

declare c cursor for
  select * from image_store
   where id = (select id from image_map where name='$img')
   order by line

I decode it on the fly, and cache it so's that it doesn't have to do
that every time. The speed isn't too bad when you index it right; it
just ends up taking some extra storage.

One problem I do have, though, which is a bug in Postgres, is that
sometimes it loses control of an index or something like that, and I get
all kinds of errors and it tells all the connected clients to go away,
so I get broken images. Here's an example of some of the errors:

Index images_id is not a btree
Index pg_class_oid_index is not a btree
OperatorObjectIdFillScanKeyEntry: unknown operator 676
SearchSysCache: Called while cache disabled
Index image_map_name is not a btree
fmgr_info: function 1043: cache lookup failed
Named portals may only be used in begin/end transaction blocks
FATAL 1: SetUserId: user "nobody" is not in "pg_shadow"

That's just a grep through my error logs. Most of those are incorrect,
and happen sporadically throughout the day. I'm hoping this is fixed in
6.4...

# Here is an idea to solve the text problem.
# [...]

--
SA, beyond.com                My girlfriend asked me which one I like better.
pub 1024/3CAE01D5 1994/11/03 Dustin Sallings <dustin@spy.net>
|    Key fingerprint =  87 02 57 08 02 D0 DA D6  C8 0F 3E 65 51 98 D8 BE
L_______________________ I hope the answer won't upset her. ____________
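The Base64 variant Dustin describes extends the chunking idea to binary data. A self-contained sketch, with the caveats that sqlite3 stands in for PostgreSQL, the line length is an assumption, and only the table and column names (image_store, image_map, line) come from his cursor example:

```python
import base64
import sqlite3

LINE_LEN = 60  # raw bytes per row; 60 bytes encode to 80 Base64 chars

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE image_map (id INTEGER, name TEXT)")
db.execute("CREATE TABLE image_store (id INTEGER, line INTEGER, data TEXT)")

def store_image(img_id, name, raw):
    # Base64-encode the binary image data and store it as ordered
    # text rows, each well under the 8k tuple limit.
    db.execute("INSERT INTO image_map VALUES (?, ?)", (img_id, name))
    for n, i in enumerate(range(0, len(raw), LINE_LEN)):
        encoded = base64.b64encode(raw[i:i + LINE_LEN]).decode("ascii")
        db.execute("INSERT INTO image_store VALUES (?, ?, ?)",
                   (img_id, n, encoded))

def fetch_image(name):
    # Same shape as the cursor in the mail: resolve the name to an id,
    # walk the rows in line order, and decode on the fly.
    rows = db.execute(
        "SELECT data FROM image_store"
        " WHERE id = (SELECT id FROM image_map WHERE name = ?)"
        " ORDER BY line", (name,))
    return b"".join(base64.b64decode(r[0]) for r in rows)

payload = bytes(range(256)) * 3
store_image(1, "test.gif", payload)
assert fetch_image("test.gif") == payload
```

Each row is encoded independently, so it decodes independently too; the per-row decode-and-join only works because the encoding was done per chunk, not over the whole image.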