Thread: 8k limit
Is there any chance of the 8k tuple limit going away in future releases of PostgreSQL? I was working on setting up a listserv archive with fields such as sentfrom, date, subject, and body, but the content of the body field would often exceed 8k because some people are just long-winded. I'd really rather not have to deal with the LO interface. Thanks. --Dan ----------------------------------------------------------------------- Daniel G. Delaney The Louisville Times Chorus Dionysos@Dionysia.org www.LouisvilleTimes.org www.Dionysia.org/~dionysos/ Dionysia Design ICQ Number: 8171285 www.Dionysia.com/design/ ----------------------------------------------------------------------- "Only two things are infinite: the universe and stupidity-- and I'm not sure about the former." --Albert Einstein
On Tue, 27 Oct 1998, Dan Delaney wrote:

> Is there any chance of the 8k tuple limit going away in
> future releases of PostgreSQL? I was working on setting up a
> listserv archive with fields such as sentfrom, date,
> subject, and body, but the content of the body field would
> often exceed 8k because some people are just long-winded.
> I'd really rather not have to deal with the LO interface.
> Thanks.
> --Dan

Put the body (and the header text) in large objects instead of storing
it as text. If there's a function that lets you search the contents of a
large object from SQL, I couldn't find it built in. You can either add
one as a C extension (if you do that, please share it), or let SQL
narrow the selections down and then do the search externally.

If all you were planning to offer was a full-text search, then you'd
probably be better off using something like a glimpse index than SQL.

<mike
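Mike's "let SQL narrow the selections down, then do the search externally" suggestion can be sketched like this. This is a hypothetical illustration, not anything from the thread: sqlite3 stands in for PostgreSQL so the sketch is self-contained, and the mail table layout and sample rows are invented.

```python
import re
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE mail"
           " (mesg_id INTEGER, sentfrom TEXT, subject TEXT, body TEXT)")
db.executemany("INSERT INTO mail VALUES (?, ?, ?, ?)", [
    (1, "dan@example.org",  "8k limit",     "some people are just long-winded"),
    (2, "mike@example.org", "Re: 8k limit", "use large objects instead"),
    (3, "dan@example.org",  "chorus news",  "rehearsal moved to Tuesday"),
])

# Step 1: let SQL narrow the candidates down on a header field
# (cheap for the server, and indexable)...
candidates = db.execute(
    "SELECT mesg_id, body FROM mail WHERE sentfrom = ?",
    ("dan@example.org",)).fetchall()

# Step 2: ...then do the finer-grained (here, regex) search externally,
# in the client, over the much smaller candidate set.
hits = [mid for mid, body in candidates if re.search(r"long-?winded", body)]
print(hits)
```

The trade-off is the usual one: the database does the coarse, indexed filtering, and the client pays only for scanning the rows that survive it.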
There are plans for doing what Bruce refers to as...what was his term?
It's basically 'row wrapping': the row limit is still 8k, but the server
is smart enough to link two rows together to give you larger than that.

It won't be in v6.4...but it (or something similar) should be in a
future release...

On Tue, 27 Oct 1998, Dan Delaney wrote:

> Is there any chance of the 8k tuple limit going away in
> future releases of PostgreSQL? I was working on setting up a
> listserv archive with fields such as sentfrom, date,
> subject, and body, but the content of the body field would
> often exceed 8k because some people are just long-winded.
> I'd really rather not have to deal with the LO interface.
> Thanks.
> --Dan

Marc G. Fournier                    Systems Administrator @ hub.org
primary: scrappy@hub.org     secondary: scrappy@{freebsd|postgresql}.org
At 4:15 +0200 on 28/10/98, The Hermit Hacker wrote:

> There are plans for doing what Bruce refers to as...what was his term?
> It's basically 'row wrapping': the row limit is still 8k, but the
> server is smart enough to link two rows together to give you larger
> than that.
>
> It won't be in v6.4...but it (or something similar) should be in a
> future release...

Here is an idea to solve the text problem.

Create a table for the bodies of the messages. Something like:

CREATE TABLE bodies (
    mesg_id   int4,
    chunk_no  int4,
    chunk     text
);

In the application, divide your text into chunks of, say, 6k. Supposing
your application is written in Perl, you'd have an array of chunks,
@chunks.

Then you write the headers of your message to the main mail table (the
one which has the headers). The message gets an ID from a sequence. You
get the value of that sequence.

Then you loop over the @chunks array, in each iteration inserting into
the bodies table the given message id, the loop iterator (as chunk_no),
and the content of $chunks[$i] itself.

Then, in your app, if you want to reconstruct a message, you just
retrieve its headers (including mesg_id). Then you:

SELECT chunk_no, chunk
  FROM bodies
 WHERE mesg_id = <your message>
 ORDER BY chunk_no;

Ignore the chunk_no in the returned set, and just concatenate all the
returned chunks in order. Then you have the body.

Advantages of this method over the large-objects interface:

(a) The text will be visible in psql in case you need to fix something.
(b) You will be able to do some rough searching in the form

    SELECT DISTINCT mesg_id
      FROM bodies
     WHERE chunk LIKE '%something%';

(c) pg_dump will dump the table, so you needn't come up with an
    elaborate backup scheme of your own (pg_dump DOES NOT back up LOBs).

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
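Herouth's scheme translates almost line-for-line into code. A minimal self-contained sketch, with two assumptions for illustration: sqlite3 stands in for PostgreSQL so the example runs anywhere, and the chunk size is shrunk to 16 bytes so the splitting is actually exercised. A real application would use a PostgreSQL client and chunks of ~6k to stay safely under the 8k row limit.

```python
import sqlite3

CHUNK = 16  # use ~6144 in practice to stay under the 8k row limit

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE bodies"
           " (mesg_id INTEGER, chunk_no INTEGER, chunk TEXT)")

def store_body(mesg_id, body):
    # Split the body into fixed-size chunks and insert one row per
    # chunk, numbered so the order can be reconstructed.
    for n, i in enumerate(range(0, len(body), CHUNK)):
        db.execute("INSERT INTO bodies VALUES (?, ?, ?)",
                   (mesg_id, n, body[i:i + CHUNK]))

def fetch_body(mesg_id):
    # Retrieve the chunks in order and concatenate them back together.
    rows = db.execute(
        "SELECT chunk FROM bodies WHERE mesg_id = ? ORDER BY chunk_no",
        (mesg_id,))
    return "".join(r[0] for r in rows)

store_body(1, "some people are just long-winded " * 4)
assert fetch_body(1) == "some people are just long-winded " * 4
```

The ORDER BY is the load-bearing part: without it the chunks can come back in any order, and the reassembled body is garbage.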
On Wed, 28 Oct 1998, Herouth Maoz wrote:

This is actually how I store the images in my photo album
(http://bleu.west.spy.net/~dustin/photo/). I Base64 encode all of the
image data, then extract it with something like this:

declare c cursor for
  select * from image_store
   where id = (select id from image_map where name='$img')
   order by line

I decode it on the fly, and cache it so's that it doesn't have to do
that every time. The speed isn't too bad when you index it right; it
just ends up taking some extra storage.

One problem I do have, though, which is a bug in Postgres, is that
sometimes it loses control of an index or something like that, and I get
all kinds of errors and it tells all the connected clients to go away,
so I get broken images. Here's an example of some of the errors:

Index images_id is not a btree
Index pg_class_oid_index is not a btree
OperatorObjectIdFillScanKeyEntry: unknown operator 676
SearchSysCache: Called while cache disabled
Index image_map_name is not a btree
fmgr_info: function 1043: cache lookup failed
Named portals may only be used in begin/end transaction blocks
FATAL 1: SetUserId: user "nobody" is not in "pg_shadow"

That's just a grep through my error logs. Most of those are incorrect,
and happen sporadically throughout the day. I'm hoping this is fixed in
6.4...

# Here is an idea to solve the text problem.
# [...]

--
SA, beyond.com                My girlfriend asked me which one I like better.
pub 1024/3CAE01D5 1994/11/03 Dustin Sallings <dustin@spy.net>
|    Key fingerprint =  87 02 57 08 02 D0 DA D6  C8 0F 3E 65 51 98 D8 BE
L_______________________ I hope the answer won't upset her. ____________
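The Base64 variant Dustin describes extends the chunking idea to binary data. A self-contained sketch, with the caveats that sqlite3 stands in for PostgreSQL, the line length is an assumption, and only the table and column names (image_store, image_map, line) come from his cursor example:

```python
import base64
import sqlite3

LINE_LEN = 60  # raw bytes per row; 60 bytes encode to 80 Base64 chars

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE image_map (id INTEGER, name TEXT)")
db.execute("CREATE TABLE image_store (id INTEGER, line INTEGER, data TEXT)")

def store_image(img_id, name, raw):
    # Base64-encode the binary image data and store it as ordered
    # text rows, each well under the 8k tuple limit.
    db.execute("INSERT INTO image_map VALUES (?, ?)", (img_id, name))
    for n, i in enumerate(range(0, len(raw), LINE_LEN)):
        encoded = base64.b64encode(raw[i:i + LINE_LEN]).decode("ascii")
        db.execute("INSERT INTO image_store VALUES (?, ?, ?)",
                   (img_id, n, encoded))

def fetch_image(name):
    # Same shape as the cursor in the mail: resolve the name to an id,
    # walk the rows in line order, and decode on the fly.
    rows = db.execute(
        "SELECT data FROM image_store"
        " WHERE id = (SELECT id FROM image_map WHERE name = ?)"
        " ORDER BY line", (name,))
    return b"".join(base64.b64decode(r[0]) for r in rows)

payload = bytes(range(256)) * 3
store_image(1, "test.gif", payload)
assert fetch_image("test.gif") == payload
```

Each row is encoded independently, so it decodes independently too; the per-row decode-and-join only works because the encoding was done per chunk, not over the whole image.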