Thread: text should be a blob field

text should be a blob field

From

Zeugswetter Andreas SARZ

Date:

03 March 1998, 08:42:08

Of course a first mail after a release must have the earned praise:
Very nicely done, I like it all ;-)

Reading the TODO, I see 'Allow text, char(), and varchar() overhead to be
only 2 bytes, not 4 bytes'
While this is very good for char and varchar, text is not a candidate
since it is usually a blob datatype, without a length restriction (or a 2Gig
limit).

I think it should alternately read:
Allow varchar() overhead to be only 2 bytes
remove char() 4 byte overhead, use atttypmod instead
make text a blob datatype (maybe storing <= 8k row with tuple, >=8k in blob
tablespace)

What do you think ?
Andreas

Re: [HACKERS] text should be a blob field

From

Bruce Momjian

Date:

03 March 1998, 09:34:57

>
> Of course a first mail after a release must have the earned praise:
> Very nicely done, I like it all ;-)
>
> Reading the TODO, I see 'Allow text, char(), and varchar() overhead to be
> only 2 bytes, not 4 bytes'
> While this is very good for char and varchar, text is not a candidate
> since it is usually a blob datatype, without a length restriction (or a 2Gig
> limit).

It was an idea.  I think I will remove it from the TODO list.  I had
considered it so I could save the defined length(atttypmod now) in
there, but now that we have atttypmod, we don't need it.  It will stay
at 4 bytes.

>
> I think it should alternately read:
> Allow varchar() overhead to be only 2 bytes
> remove char() 4 byte overhead, use atttypmod instead

Ooh, this is interesting.  Yea, I guess we really don't need that for
char() anymore.  The only problem is that we would have to do some fancy
stuff to track char() separately in the backend, and I am sure atttypmod
is not available in all the places we need it.  Don't think it is worth
it.

> make text a blob datatype (maybe storing <= 8k row with tuple, >=8k in blob
> tablespace)

Hmmm.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] text should be a blob field

From

Goran Thyni

Date:

03 March 1998, 11:29:35

> > make text a blob datatype (maybe storing <= 8k row with tuple, >=8k in blob
> > tablespace)
>
> Hmmm.

I know Informix has 2 BLOB-types "text" and "binary"
but I do not think we should change the meaning of keyword "text" too
much.

An idea to think about:
if text fits within a tuple make it varchar-alias (as it works now)
if text is larger make it a blob.

Or simply call text-BLOBs "textblob" of something like that.
What does SQL-92 say about BLOBs anyway?

    regards,
--
---------------------------------------------
Göran Thyni, sysadm, JMS Bildbasen, Kiruna

Attachment

smime.p7s

Re: [HACKERS] text should be a blob field

From

"Thomas G. Lockhart"

Date:

03 March 1998, 11:44:46

> Or simply call text-BLOBs "textblob" of something like that.
> What does SQL-92 say about BLOBs anyway?

Nothing afaik. That is why you get different meanings and usages between database
products. I'd like to keep "text" as a useful string type. Conventionally, generic
blobs are just binary objects with not much backend support (e.g. no useful
operators other than perhaps "=").

Imo generic blobs make more sense in a system without the capability to add types;
perhaps a solution for Postgres would look a little different. At the moment, the
frontend/backend protocol is different for large objects and everything else, so
it would be difficult to transparently introduce blobs which behave identically to
types which fit within a normal tuple.

                                                            - Tom

Re: [HACKERS] text should be a blob field

From

Peter T Mount

Date:

03 March 1998, 16:26:12

On Tue, 3 Mar 1998, Bruce Momjian wrote:

> > make text a blob datatype (maybe storing <= 8k row with tuple, >=8k in blob
> > tablespace)
>
> Hmmm.
>

There was some talk about this about a month ago.

Although we now have blob support in the JDBC driver, there is one
outstanding issue with them, that I was waiting for 6.3 to be released
before starting on it (and almost certainly starting a discussion here
about it).

Allowing text to use blobs for values larger than the current block size
would hit the same problem.

Ok, here's what the problem is at the moment:

The JDBC example ImageViewer uses a table to store the name of an image,
and the OID of the associated blob.

# create table images (imgname name,imgoid oid);

Ok, we now create an entry in the table for an image with:

# insert into images values ('test.gif',lo_import('/home/pmount/test.gif'));

This is fine so far. Now say we delete that row with:

# delete from images where name = 'test.gif';

Fine again, except that the blob is still in the database. To get round
this, you would have to add extra statements to handle this, and for JDBC,
there is no standard way to do this.

What I was thinking of, was to create a new type 'blob' which would delete
the associated large object when the row is deleted. However, here's the
problems against this:

1. Is there a call made by the backend to each datatype when a row is
   deleted? I can't see one.

2. When we update a row, we don't want the overhead of copying a very
   large blob when a row is first copied, then the original deleted, etc.

Anyhow, I'm thinking of various ways around this - just don't hold your
breath ;-)

--
Peter T Mount  petermount@earthling.net or pmount@maidast.demon.co.uk
Main Homepage: http://www.demon.co.uk/finder
Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk

Re: [QUESTIONS] Re: [HACKERS] text should be a blob field

From

Bruce Momjian

Date:

04 March 1998, 11:00:53

> outstanding issue with them, that I was waiting for 6.3 to be released
> before starting on it (and almost certainly starting a discussion here
> about it).
>
> Allowing text to use blobs for values larger than the current block size
> would hit the same problem.
>
> Ok, here's what the problem is at the moment:
>
> The JDBC example ImageViewer uses a table to store the name of an image,
> and the OID of the associated blob.
>
> # create table images (imgname name,imgoid oid);
>
> Ok, we now create an entry in the table for an image with:
>
> # insert into images values ('test.gif',lo_import('/home/pmount/test.gif'));
>
> This is fine so far. Now say we delete that row with:
>
> # delete from images where name = 'test.gif';
>
> Fine again, except that the blob is still in the database. To get round
> this, you would have to add extra statements to handle this, and for JDBC,
> there is no standard way to do this.
>
> What I was thinking of, was to create a new type 'blob' which would delete
> the associated large object when the row is deleted. However, here's the
> problems against this:
>
> 1. Is there a call made by the backend to each datatype when a row is
>    deleted? I can't see one.

Well, you could have a RULE that deletes the large object at row
deletion time.  However, if two rows point to the same large object, the
first one deleting it would delete the large object for the other.  The
only solution to this is to have a separate large object table, and use
reference counts so only the last user of the object deletes it.

>
> 2. When we update a row, we don't want the overhead of copying a very
>    large blob when a row is first copied, then the original deleted, etc.

Again, a deletion-only rule, but if the update the row and change the
large object, you would have to delete the old stuff.

Seems very messy to me.  Perhaps put all the large objects in a table,
and have a process clean up all the unreferenced large objects.


--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [QUESTIONS] Re: [HACKERS] text should be a blob field

From

jwieck@debis.com (Jan Wieck)

Date:

04 March 1998, 11:34:12

Bruce wrote:

> > 1. Is there a call made by the backend to each datatype when a row is
> >    deleted? I can't see one.
>
> Well, you could have a RULE that deletes the large object at row
> deletion time.  However, if two rows point to the same large object, the
> first one deleting it would delete the large object for the other.  The
> only solution to this is to have a separate large object table, and use
> reference counts so only the last user of the object deletes it.

    I think triggers are more appropriate.

    On INSERT check that the large object referenced exists.

    On  UPDATE  if large object reference changes, check that new
    large object exists and  check  if  old  large  object  isn't
    referenced  any more in which case drop the old large object.

    On DELETE check if large object isn't referenced any more ...

    Yes - I like triggers :-)


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [QUESTIONS] Re: [HACKERS] text should be a blob field

From

Peter T Mount

Date:

04 March 1998, 15:39:27

On Wed, 4 Mar 1998, Bruce Momjian wrote:

> > 1. Is there a call made by the backend to each datatype when a row is
> >    deleted? I can't see one.
>
> Well, you could have a RULE that deletes the large object at row
> deletion time.

As I haven't yet played with Rules & Triggers, and now we have 6.3 out of
the way, I'm going to start.

> However, if two rows point to the same large object, the first one
> deleting it would delete the large object for the other.  The only
> solution to this is to have a separate large object table, and use
> reference counts so only the last user of the object deletes it.

Ah, in this case, there would be a single large object per column/row. If
the row is deleted, then so will the blob.

> > 2. When we update a row, we don't want the overhead of copying a very
> >    large blob when a row is first copied, then the original deleted, etc.
>
> Again, a deletion-only rule, but if the update the row and change the
> large object, you would have to delete the old stuff.

That's true.

> Seems very messy to me.  Perhaps put all the large objects in a table,
> and have a process clean up all the unreferenced large objects.

I think that would be a last resort thing to use.

--
Peter T Mount  petermount@earthling.net or pmount@maidast.demon.co.uk
Main Homepage: http://www.demon.co.uk/finder
Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk

Re: [QUESTIONS] Re: [HACKERS] text should be a blob field

From

"Vadim B. Mikheev"

Date:

05 March 1998, 04:00:48

Peter T Mount wrote:
>
> On Tue, 3 Mar 1998, Bruce Momjian wrote:
>
> > > make text a blob datatype (maybe storing <= 8k row with tuple, >=8k in blob
> > > tablespace)
> >
>
> There was some talk about this about a month ago.
>
> Although we now have blob support in the JDBC driver, there is one
> outstanding issue with them, that I was waiting for 6.3 to be released
> before starting on it (and almost certainly starting a discussion here
> about it).
>
> Allowing text to use blobs for values larger than the current block size
> would hit the same problem.

When I told about multi-representation feature I ment that applications
will not be affected by how text field is stored - in tuple or somewhere
else. Is this Ok for you ?

Vadim

Re: [QUESTIONS] Re: [HACKERS] text should be a blob field

From

Peter T Mount

Date:

06 March 1998, 02:41:13

On Thu, 5 Mar 1998, Vadim B. Mikheev wrote:

> Peter T Mount wrote:
> >
> > On Tue, 3 Mar 1998, Bruce Momjian wrote:
> >
> > > > make text a blob datatype (maybe storing <= 8k row with tuple, >=8k in blob
> > > > tablespace)
> > >
> >
> > There was some talk about this about a month ago.
> >
> > Although we now have blob support in the JDBC driver, there is one
> > outstanding issue with them, that I was waiting for 6.3 to be released
> > before starting on it (and almost certainly starting a discussion here
> > about it).
> >
> > Allowing text to use blobs for values larger than the current block size
> > would hit the same problem.
>
> When I told about multi-representation feature I ment that applications
> will not be affected by how text field is stored - in tuple or somewhere
> else. Is this Ok for you ?

Yes. What I was meaning was if the "somewhere else" is in a blob, then we
would have to keep track of it if the tuple is updated or deleted.

--
Peter T Mount  petermount@earthling.net or pmount@maidast.demon.co.uk
Main Homepage: http://www.demon.co.uk/finder
Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk

Re: [QUESTIONS] Re: [HACKERS] text should be a blob field

From

Peter T Mount

Date:

15 March 1998, 08:59:07

On Wed, 4 Mar 1998, Jan Wieck wrote:

> Bruce wrote:
>
> > > 1. Is there a call made by the backend to each datatype when a row is
> > >    deleted? I can't see one.
> >
> > Well, you could have a RULE that deletes the large object at row
> > deletion time.  However, if two rows point to the same large object, the
> > first one deleting it would delete the large object for the other.  The
> > only solution to this is to have a separate large object table, and use
> > reference counts so only the last user of the object deletes it.
>
>     I think triggers are more appropriate.
>
>     On INSERT check that the large object referenced exists.
>
>     On  UPDATE  if large object reference changes, check that new
>     large object exists and  check  if  old  large  object  isn't
>     referenced  any more in which case drop the old large object.
>
>     On DELETE check if large object isn't referenced any more ...
>
>     Yes - I like triggers :-)

I'm begining to agree with you here.

So far, I've got the trigger to work, so if a row of a table is deleted,
or an oid referencing a BLOB is updated, then the old BLOB is deleted.
This removes the orphaned BLOB problem.

The only problem I have now, is:

  How to get a trigger to be automatically created on a table when the
  table is created. This would be required, so the end user doesn't have
  to do this (normally from within an application).

This would be required, esp. for expanding the text type (or memo, or
whatever).

Any Ideas?

--
Peter T Mount  petermount@earthling.net or pmount@maidast.demon.co.uk
Main Homepage: http://www.demon.co.uk/finder
Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk

Re: [QUESTIONS] Re: [HACKERS] text should be a blob field

From

jwieck@debis.com (Jan Wieck)

Date:

16 March 1998, 02:52:04

Peter Mount wrote:
>
> On Wed, 4 Mar 1998, Jan Wieck wrote:
>
> >     I think triggers are more appropriate.
> >
>
> I'm begining to agree with you here.
>
> So far, I've got the trigger to work, so if a row of a table is deleted,
> or an oid referencing a BLOB is updated, then the old BLOB is deleted.
> This removes the orphaned BLOB problem.
>
> The only problem I have now, is:
>
>   How to get a trigger to be automatically created on a table when the
>   table is created. This would be required, so the end user doesn't have
>   to do this (normally from within an application).
>
> This would be required, esp. for expanding the text type (or memo, or
> whatever).

    So  you think of a new type that automatically causes trigger
    definition if used in CREATE/ALTER TABLE.

    Agree - would be a nice feature.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [QUESTIONS] Re: [HACKERS] text should be a blob field

From

Peter T Mount

Date:

16 March 1998, 15:07:44

On Mon, 16 Mar 1998, Jan Wieck wrote:

>
> Peter Mount wrote:
> >
> > On Wed, 4 Mar 1998, Jan Wieck wrote:
> >
> > >     I think triggers are more appropriate.
> > >
> >
> > I'm begining to agree with you here.
> >
> > So far, I've got the trigger to work, so if a row of a table is deleted,
> > or an oid referencing a BLOB is updated, then the old BLOB is deleted.
> > This removes the orphaned BLOB problem.
> >
> > The only problem I have now, is:
> >
> >   How to get a trigger to be automatically created on a table when the
> >   table is created. This would be required, so the end user doesn't have
> >   to do this (normally from within an application).
> >
> > This would be required, esp. for expanding the text type (or memo, or
> > whatever).
>
>     So  you think of a new type that automatically causes trigger
>     definition if used in CREATE/ALTER TABLE.
>
>     Agree - would be a nice feature.

Exactly, it would be a nice feature.

I'm about to look at rules to see if that's a way to do it, but seeing it
took me about 30 mins to do this with Triggers (and thats when I've never
used them before), then it would be nice to use these.

--
Peter T Mount  petermount@earthling.net or pmount@maidast.demon.co.uk
Main Homepage: http://www.demon.co.uk/finder
Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk

Re: [QUESTIONS] Re: [HACKERS] text should be a blob field

From

"Thomas G. Lockhart"

Date:

16 March 1998, 22:19:14

Would it be possible to have a slightly different interface in the
frontend library which hides the fact that large objects are transfered
8kb at a time from the backend? Then the handling of text and large
objects/blobs starts to look more alike...

                        - Tom

Re: [QUESTIONS] Re: [HACKERS] text should be a blob field

From

Peter T Mount

Date:

17 March 1998, 02:50:48

On Tue, 17 Mar 1998, Thomas G. Lockhart wrote:

> Would it be possible to have a slightly different interface in the
> frontend library which hides the fact that large objects are transfered
> 8kb at a time from the backend? Then the handling of text and large
> objects/blobs starts to look more alike...

The front end doesn't show the 8k limit... the storage manager handles
splitting up the large object into 8k chunks - it may be that the examples
show this because we know about it ourselves ;-)

--
Peter T Mount  petermount@earthling.net or pmount@maidast.demon.co.uk
Main Homepage: http://www.demon.co.uk/finder
Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk