Thread: text should be a blob field
Of course a first mail after a release must have the earned praise: Very nicely done, I like it all ;-) Reading the TODO, I see 'Allow text, char(), and varchar() overhead to be only 2 bytes, not 4 bytes' While this is very good for char and varchar, text is not a candidate since it is usually a blob datatype, without a length restriction (or a 2Gig limit). I think it should alternately read: Allow varchar() overhead to be only 2 bytes remove char() 4 byte overhead, use atttypmod instead make text a blob datatype (maybe storing <= 8k row with tuple, >=8k in blob tablespace) What do you think ? Andreas
> > Of course a first mail after a release must have the earned praise: > Very nicely done, I like it all ;-) > > Reading the TODO, I see 'Allow text, char(), and varchar() overhead to be > only 2 bytes, not 4 bytes' > While this is very good for char and varchar, text is not a candidate > since it is usually a blob datatype, without a length restriction (or a 2Gig > limit). It was an idea. I think I will remove it from the TODO list. I had considered it so I could save the defined length(atttypmod now) in there, but now that we have atttypmod, we don't need it. It will stay at 4 bytes. > > I think it should alternately read: > Allow varchar() overhead to be only 2 bytes > remove char() 4 byte overhead, use atttypmod instead Ooh, this is interesting. Yea, I guess we really don't need that for char() anymore. The only problem is that we would have to do some fancy stuff to track char() separately in the backend, and I am sure atttypmod is not available in all the places we need it. Don't think it is worth it. > make text a blob datatype (maybe storing <= 8k row with tuple, >=8k in blob > tablespace) Hmmm. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
> > make text a blob datatype (maybe storing <= 8k row with tuple, >=8k in blob > > tablespace) > > Hmmm. I know Informix has 2 BLOB-types "text" and "binary" but I do not think we should change the meaning of keyword "text" too much. An idea to think about: if text fits within a tuple make it varchar-alias (as it works now) if text is larger make it a blob. Or simply call text-BLOBs "textblob" of something like that. What does SQL-92 say about BLOBs anyway? regards, -- --------------------------------------------- Göran Thyni, sysadm, JMS Bildbasen, Kiruna
Attachment
> Or simply call text-BLOBs "textblob" of something like that. > What does SQL-92 say about BLOBs anyway? Nothing afaik. That is why you get different meanings and usages between database products. I'd like to keep "text" as a useful string type. Conventionally, generic blobs are just binary objects with not much backend support (e.g. no useful operators other than perhaps "="). Imo generic blobs make more sense in a system without the capability to add types; perhaps a solution for Postgres would look a little different. At the moment, the frontend/backend protocol is different for large objects and everything else, so it would be difficult to transparently introduce blobs which behave identically to types which fit within a normal tuple. - Tom
On Tue, 3 Mar 1998, Bruce Momjian wrote: > > make text a blob datatype (maybe storing <= 8k row with tuple, >=8k in blob > > tablespace) > > Hmmm. > There was some talk about this about a month ago. Although we now have blob support in the JDBC driver, there is one outstanding issue with them, that I was waiting for 6.3 to be released before starting on it (and almost certainly starting a discussion here about it). Allowing text to use blobs for values larger than the current block size would hit the same problem. Ok, here's what the problem is at the moment: The JDBC example ImageViewer uses a table to store the name of an image, and the OID of the associated blob. # create table images (imgname name,imgoid oid); Ok, we now create an entry in the table for an image with: # insert into images values ('test.gif',lo_import('/home/pmount/test.gif')); This is fine so far. Now say we delete that row with: # delete from images where name = 'test.gif'; Fine again, except that the blob is still in the database. To get round this, you would have to add extra statements to handle this, and for JDBC, there is no standard way to do this. What I was thinking of, was to create a new type 'blob' which would delete the associated large object when the row is deleted. However, here's the problems against this: 1. Is there a call made by the backend to each datatype when a row is deleted? I can't see one. 2. When we update a row, we don't want the overhead of copying a very large blob when a row is first copied, then the original deleted, etc. Anyhow, I'm thinking of various ways around this - just don't hold your breath ;-) -- Peter T Mount petermount@earthling.net or pmount@maidast.demon.co.uk Main Homepage: http://www.demon.co.uk/finder Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk
> outstanding issue with them, that I was waiting for 6.3 to be released > before starting on it (and almost certainly starting a discussion here > about it). > > Allowing text to use blobs for values larger than the current block size > would hit the same problem. > > Ok, here's what the problem is at the moment: > > The JDBC example ImageViewer uses a table to store the name of an image, > and the OID of the associated blob. > > # create table images (imgname name,imgoid oid); > > Ok, we now create an entry in the table for an image with: > > # insert into images values ('test.gif',lo_import('/home/pmount/test.gif')); > > This is fine so far. Now say we delete that row with: > > # delete from images where name = 'test.gif'; > > Fine again, except that the blob is still in the database. To get round > this, you would have to add extra statements to handle this, and for JDBC, > there is no standard way to do this. > > What I was thinking of, was to create a new type 'blob' which would delete > the associated large object when the row is deleted. However, here's the > problems against this: > > 1. Is there a call made by the backend to each datatype when a row is > deleted? I can't see one. Well, you could have a RULE that deletes the large object at row deletion time. However, if two rows point to the same large object, the first one deleting it would delete the large object for the other. The only solution to this is to have a separate large object table, and use reference counts so only the last user of the object deletes it. > > 2. When we update a row, we don't want the overhead of copying a very > large blob when a row is first copied, then the original deleted, etc. Again, a deletion-only rule, but if the update the row and change the large object, you would have to delete the old stuff. Seems very messy to me. Perhaps put all the large objects in a table, and have a process clean up all the unreferenced large objects. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
Bruce wrote: > > 1. Is there a call made by the backend to each datatype when a row is > > deleted? I can't see one. > > Well, you could have a RULE that deletes the large object at row > deletion time. However, if two rows point to the same large object, the > first one deleting it would delete the large object for the other. The > only solution to this is to have a separate large object table, and use > reference counts so only the last user of the object deletes it. I think triggers are more appropriate. On INSERT check that the large object referenced exists. On UPDATE if large object reference changes, check that new large object exists and check if old large object isn't referenced any more in which case drop the old large object. On DELETE check if large object isn't referenced any more ... Yes - I like triggers :-) Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #======================================== jwieck@debis.com (Jan Wieck) #
On Wed, 4 Mar 1998, Bruce Momjian wrote: > > 1. Is there a call made by the backend to each datatype when a row is > > deleted? I can't see one. > > Well, you could have a RULE that deletes the large object at row > deletion time. As I haven't yet played with Rules & Triggers, and now we have 6.3 out of the way, I'm going to start. > However, if two rows point to the same large object, the first one > deleting it would delete the large object for the other. The only > solution to this is to have a separate large object table, and use > reference counts so only the last user of the object deletes it. Ah, in this case, there would be a single large object per column/row. If the row is deleted, then so will the blob. > > 2. When we update a row, we don't want the overhead of copying a very > > large blob when a row is first copied, then the original deleted, etc. > > Again, a deletion-only rule, but if the update the row and change the > large object, you would have to delete the old stuff. That's true. > Seems very messy to me. Perhaps put all the large objects in a table, > and have a process clean up all the unreferenced large objects. I think that would be a last resort thing to use. -- Peter T Mount petermount@earthling.net or pmount@maidast.demon.co.uk Main Homepage: http://www.demon.co.uk/finder Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk
Peter T Mount wrote: > > On Tue, 3 Mar 1998, Bruce Momjian wrote: > > > > make text a blob datatype (maybe storing <= 8k row with tuple, >=8k in blob > > > tablespace) > > > > There was some talk about this about a month ago. > > Although we now have blob support in the JDBC driver, there is one > outstanding issue with them, that I was waiting for 6.3 to be released > before starting on it (and almost certainly starting a discussion here > about it). > > Allowing text to use blobs for values larger than the current block size > would hit the same problem. When I told about multi-representation feature I ment that applications will not be affected by how text field is stored - in tuple or somewhere else. Is this Ok for you ? Vadim
On Thu, 5 Mar 1998, Vadim B. Mikheev wrote: > Peter T Mount wrote: > > > > On Tue, 3 Mar 1998, Bruce Momjian wrote: > > > > > > make text a blob datatype (maybe storing <= 8k row with tuple, >=8k in blob > > > > tablespace) > > > > > > > There was some talk about this about a month ago. > > > > Although we now have blob support in the JDBC driver, there is one > > outstanding issue with them, that I was waiting for 6.3 to be released > > before starting on it (and almost certainly starting a discussion here > > about it). > > > > Allowing text to use blobs for values larger than the current block size > > would hit the same problem. > > When I told about multi-representation feature I ment that applications > will not be affected by how text field is stored - in tuple or somewhere > else. Is this Ok for you ? Yes. What I was meaning was if the "somewhere else" is in a blob, then we would have to keep track of it if the tuple is updated or deleted. -- Peter T Mount petermount@earthling.net or pmount@maidast.demon.co.uk Main Homepage: http://www.demon.co.uk/finder Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk
On Wed, 4 Mar 1998, Jan Wieck wrote: > Bruce wrote: > > > > 1. Is there a call made by the backend to each datatype when a row is > > > deleted? I can't see one. > > > > Well, you could have a RULE that deletes the large object at row > > deletion time. However, if two rows point to the same large object, the > > first one deleting it would delete the large object for the other. The > > only solution to this is to have a separate large object table, and use > > reference counts so only the last user of the object deletes it. > > I think triggers are more appropriate. > > On INSERT check that the large object referenced exists. > > On UPDATE if large object reference changes, check that new > large object exists and check if old large object isn't > referenced any more in which case drop the old large object. > > On DELETE check if large object isn't referenced any more ... > > Yes - I like triggers :-) I'm begining to agree with you here. So far, I've got the trigger to work, so if a row of a table is deleted, or an oid referencing a BLOB is updated, then the old BLOB is deleted. This removes the orphaned BLOB problem. The only problem I have now, is: How to get a trigger to be automatically created on a table when the table is created. This would be required, so the end user doesn't have to do this (normally from within an application). This would be required, esp. for expanding the text type (or memo, or whatever). Any Ideas? -- Peter T Mount petermount@earthling.net or pmount@maidast.demon.co.uk Main Homepage: http://www.demon.co.uk/finder Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk
Peter Mount wrote: > > On Wed, 4 Mar 1998, Jan Wieck wrote: > > > I think triggers are more appropriate. > > > > I'm begining to agree with you here. > > So far, I've got the trigger to work, so if a row of a table is deleted, > or an oid referencing a BLOB is updated, then the old BLOB is deleted. > This removes the orphaned BLOB problem. > > The only problem I have now, is: > > How to get a trigger to be automatically created on a table when the > table is created. This would be required, so the end user doesn't have > to do this (normally from within an application). > > This would be required, esp. for expanding the text type (or memo, or > whatever). So you think of a new type that automatically causes trigger definition if used in CREATE/ALTER TABLE. Agree - would be a nice feature. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #======================================== jwieck@debis.com (Jan Wieck) #
On Mon, 16 Mar 1998, Jan Wieck wrote: > > Peter Mount wrote: > > > > On Wed, 4 Mar 1998, Jan Wieck wrote: > > > > > I think triggers are more appropriate. > > > > > > > I'm begining to agree with you here. > > > > So far, I've got the trigger to work, so if a row of a table is deleted, > > or an oid referencing a BLOB is updated, then the old BLOB is deleted. > > This removes the orphaned BLOB problem. > > > > The only problem I have now, is: > > > > How to get a trigger to be automatically created on a table when the > > table is created. This would be required, so the end user doesn't have > > to do this (normally from within an application). > > > > This would be required, esp. for expanding the text type (or memo, or > > whatever). > > So you think of a new type that automatically causes trigger > definition if used in CREATE/ALTER TABLE. > > Agree - would be a nice feature. Exactly, it would be a nice feature. I'm about to look at rules to see if that's a way to do it, but seeing it took me about 30 mins to do this with Triggers (and thats when I've never used them before), then it would be nice to use these. -- Peter T Mount petermount@earthling.net or pmount@maidast.demon.co.uk Main Homepage: http://www.demon.co.uk/finder Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk
Would it be possible to have a slightly different interface in the frontend library which hides the fact that large objects are transfered 8kb at a time from the backend? Then the handling of text and large objects/blobs starts to look more alike... - Tom
On Tue, 17 Mar 1998, Thomas G. Lockhart wrote: > Would it be possible to have a slightly different interface in the > frontend library which hides the fact that large objects are transfered > 8kb at a time from the backend? Then the handling of text and large > objects/blobs starts to look more alike... The front end doesn't show the 8k limit... the storage manager handles splitting up the large object into 8k chunks - it may be that the examples show this because we know about it ourselves ;-) -- Peter T Mount petermount@earthling.net or pmount@maidast.demon.co.uk Main Homepage: http://www.demon.co.uk/finder Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk