Thread: bytea and text

bytea and text

From
"Jean-Yves F. Barbier"
Date:
Hi list,

I'm asking myself what solution for storing pictures is the best:

* using a BYTEA column type, and having no intrinsic compression gain
   because a picture it almost every time already compressed,

* using a TEXT column type, and store a Base64(picture) in it; it should
   take benefits of intrinsic compression (?).

JY
--

Re: bytea and text

From
Brian Modra
Date:
2009/11/26 Jean-Yves F. Barbier <12ukwn@gmail.com>:
> Hi list,
>
> I'm asking myself what solution for storing pictures is the best:
>
> * using a BYTEA column type, and having no intrinsic compression gain
>   because a picture it almost every time already compressed,

I have not looked into what the internal representation is with BYTEA,
but if it is binary, then it will be more efficient than text and
base64.

> * using a TEXT column type, and store a Base64(picture) in it; it should
>   take benefits of intrinsic compression (?).

One disadvantage of BYTEA is that if you use the escaped format to
insert, then you can hit the limit of SQL size. This limit would not
be quite as tight if you use base64.

If this is the case though, I wonder why base64 isn't an option for
the escaped syntax for BYTEA?

>
> JY
> --
>
> --
> Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-novice
>



--
Brian Modra   Land line: +27 23 5411 462
Mobile: +27 79 69 77 082
5 Jan Louw Str, Prince Albert, 6930
Postal: P.O. Box 2, Prince Albert 6930
South Africa
http://www.zwartberg.com/

Re: bytea and text

From
Tom Lane
Date:
"Jean-Yves F. Barbier" <12ukwn@gmail.com> writes:
> I'm asking myself what solution for storing pictures is the best:

> * using a BYTEA column type, and having no intrinsic compression gain
>    because a picture it almost every time already compressed,

> * using a TEXT column type, and store a Base64(picture) in it; it should
>    take benefits of intrinsic compression (?).

At best, the compression would get rid of the overhead you added by
converting to base64.  It probably wouldn't completely succeed at that,
though, meaning the second alternative is always a loser.

            regards, tom lane

Re: bytea and text

From
"Jean-Yves F. Barbier"
Date:
Tom Lane a écrit :
> "Jean-Yves F. Barbier" <12ukwn@gmail.com> writes:
>> I'm asking myself what solution for storing pictures is the best:
>
>> * using a BYTEA column type, and having no intrinsic compression gain
>>    because a picture it almost every time already compressed,
>
>> * using a TEXT column type, and store a Base64(picture) in it; it should
>>    take benefits of intrinsic compression (?).
>
> At best, the compression would get rid of the overhead you added by
> converting to base64.  It probably wouldn't completely succeed at that,
> though, meaning the second alternative is always a loser.
>
>             regards, tom lane

Yeah I saw that in a simple test: zipping a Base64 file from a picture
(I guess the LZ algo's the same from zip to PG) only crunch it by 25%,
which make the result always bigger than a BYTEA.

Thanks
--
Doubt isn't the opposite of faith; it is an element of faith.
        -- Paul Tillich, German theologian and historian

Re: bytea and text

From
Jasen Betts
Date:
On 2009-11-26, Jean-Yves F. Barbier <12ukwn@gmail.com> wrote:
> Hi list,
>
> I'm asking myself what solution for storing pictures is the best:
>
> * using a BYTEA column type, and having no intrinsic compression gain
>    because a picture it almost every time already compressed,
>
> * using a TEXT column type, and store a Base64(picture) in it; it should
>    take benefits of intrinsic compression (?).

use bytea and convert to/from base64 if you can't use the binary
interface and find the text interface inconvenient.

this will make the data over 1/3 bigger.


Re: bytea and text

From
"Jean-Yves F. Barbier"
Date:
Jasen Betts a écrit :
> On 2009-11-26, Jean-Yves F. Barbier <12ukwn@gmail.com> wrote:
>> Hi list,
>>
>> I'm asking myself what solution for storing pictures is the best:
>>
>> * using a BYTEA column type, and having no intrinsic compression gain
>>    because a picture it almost every time already compressed,
>>
>> * using a TEXT column type, and store a Base64(picture) in it; it should
>>    take benefits of intrinsic compression (?).
>
> use bytea and convert to/from base64 if you can't use the binary
> interface and find the text interface inconvenient.

that's what I do now

> this will make the data over 1/3 bigger.

uuu, na: it comes as Base64 (to avoid a 0x89 unknown utf8 char error), but it
is converted to BYTEA as I now use decode('inparm', 'base64') before insert,
as back to Base64 when I extract it (with encode).

--
* JHM wonders what Joey did to earn "I'd just like to say, for the record,
  that Joey rules."
        -- Seen on #Debian

Re: bytea and text

From
richard terry
Date:
On Friday 27 November 2009 02:58:01 Tom Lane wrote:
> "Jean-Yves F. Barbier" <12ukwn@gmail.com> writes:
> > I'm asking myself what solution for storing pictures is the best:
> >
> > * using a BYTEA column type, and having no intrinsic compression gain
> >    because a picture it almost every time already compressed,
> >
> > * using a TEXT column type, and store a Base64(picture) in it; it should
> >    take benefits of intrinsic compression (?).
>
> At best, the compression would get rid of the overhead you added by
> converting to base64.  It probably wouldn't completely succeed at that,
> though, meaning the second alternative is always a loser.
>
>             regards, tom lane
>

tom, I wonder if you could give us a sample of using client side lo_creat ,
insert  functions to insert a blob into postgres - in an sql statement.

Despite reading the docs's I'm totally in the dark and can't understand the
syntax.

Regards

richard (using gambas basic)

Re: bytea and text

From
Syan Tan
Date:
Not sure if decode works for COPY statements though, I thought copy is faster
than insert for bulk loading lots of data
than using a client call in whatever language postgres driver. But then its only
one field for binary data which is a lot bigger
than the other fields in the tuple, so I'm not sure what is the rate-limiting
factor here.

On Fri 27/11/09 14:18 , "Jean-Yves F. Barbier" 12ukwn@gmail.com sent:
> Jasen Betts a écrit :
> > On 2009-11-26, Jean-Yves F. Barbier <12ukwn@gmail
> .com> wrote:>> Hi list,
> >>
> >> I'm asking myself what solution for storing
> pictures is the best:>>
> >> * using a BYTEA column type, and having no
> intrinsic compression gain >>    because a picture it almost every time
> already compressed,>>
> >> * using a TEXT column type, and store a
> Base64(picture) in it; it should>>    take benefits of intrinsic compression
> (?).>
> > use bytea and convert to/from base64 if you
> can't use the binary> interface and find the text interface
> inconvenient.
> that's what I do now
>
> > this will make the data over 1/3
> bigger.
> uuu, na: it comes as Base64 (to avoid a 0x89 unknown utf8 char error), but
> itis converted to BYTEA as I now use decode('inparm', 'base64') before
> insert,as back to Base64 when I extract it (with encode).
>
> --
> * JHM wonders what Joey did to earn "I'd just like to say, for the
> record,that Joey rules."
> -- Seen on #Debian
>
> --
> Sent via pgsql-novice mailing list (p
> gsql-novice@postgresql.org)To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-novice
>
>


Re: bytea and text

From
"Jean-Yves F. Barbier"
Date:
richard terry a écrit :
...
> tom, I wonder if you could give us a sample of using client side lo_creat ,
> insert  functions to insert a blob into postgres - in an sql statement.
>
> Despite reading the docs's I'm totally in the dark and can't understand the
> syntax.

Hi Richard, I modified the functions I sent you: they now store into
BINARY (BYTEA) format, not anymore into BASE64:

CREATE OR REPLACE FUNCTION ucommon.testbytea_ins(Pstring TEXT) RETURNS oid AS $$
DECLARE
        NewId      OID;
        NewBytea   BYTEA;
BEGIN
        NewBytea = decode(Pstring, 'base64');
        INSERT INTO common.testbytea VALUES(default, NewBytea);
        SELECT id INTO NewId FROM common.testbytea WHERE id = (SELECT currval('testbytea_id_seq'));
        RETURN NewId;
END;
$$ LANGUAGE PLPGSQL STRICT SECURITY DEFINER;
REVOKE ALL ON FUNCTION ucommon.testbytea_ins(TEXT) FROM PUBLIC;
---------------------------------------------------
CREATE OR REPLACE FUNCTION ucommon.testbytea_sel(Pid OID) RETURNS TEXT AS $$
DECLARE
        MyPic       BYTEA;
        MyString    TEXT;
BEGIN
        SELECT pic INTO MyPic FROM common.testbytea WHERE id = Pid;
        MyString = encode(MyPic, 'base64');
        RETURN MyString;
END;
$$ LANGUAGE PLPGSQL STRICT SECURITY DEFINER;
REVOKE ALL ON FUNCTION ucommon.testbytea_sel(OID) FROM PUBLIC;

HIWH

JY
--
Objects in mirror may be closer than they appear.

Re: bytea and text

From
Syan Tan
Date:
this is a wrapper for one parameter 'base64' , I thought what he wanted was a simple
way of uploading files by filename into a postgresql database field via a gambase
client interface.

On Sat 28/11/09 00:08 , "Jean-Yves F. Barbier" 12ukwn@gmail.com sent:
> richard terry a écrit :
> ...
> > tom, I wonder if you could give us a sample of
> using client side lo_creat , > insert  functions to insert a blob into postgres
> - in an sql statement.>
> > Despite reading the docs's I'm totally in the
> dark and can't understand the > syntax.
>
> Hi Richard, I modified the functions I sent you: they now store into
> BINARY (BYTEA) format, not anymore into BASE64:
>
> CREATE OR REPLACE FUNCTION ucommon.testbytea_ins(Pstring TEXT) RETURNS oid
> AS $$DECLARE
> NewId      OID;
> NewBytea   BYTEA;
> BEGIN
> NewBytea = decode(Pstring, 'base64');
> INSERT INTO common.testbytea VALUES(default, NewBytea);
> SELECT id INTO NewId FROM common.testbytea WHERE id = (SELECT
> currval('testbytea_id_seq'));RETURN NewId;
> END;
> $$ LANGUAGE PLPGSQL STRICT SECURITY DEFINER;
> REVOKE ALL ON FUNCTION ucommon.testbytea_ins(TEXT) FROM PUBLIC;
> ---------------------------------------------------
> CREATE OR REPLACE FUNCTION ucommon.testbytea_sel(Pid OID) RETURNS TEXT AS
> $$DECLARE
> MyPic       BYTEA;
> MyString    TEXT;
> BEGIN
> SELECT pic INTO MyPic FROM common.testbytea WHERE id = Pid;
> MyString = encode(MyPic, 'base64');
> RETURN MyString;
> END;
> $$ LANGUAGE PLPGSQL STRICT SECURITY DEFINER;
> REVOKE ALL ON FUNCTION ucommon.testbytea_sel(OID) FROM PUBLIC;
>
> HIWH
>
> JY
> --
> Objects in mirror may be closer than they appear.
>
> --
> Sent via pgsql-novice mailing list (p
> gsql-novice@postgresql.org)To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-novice
>
>


Re: bytea and text

From
Tom Lane
Date:
richard terry <rterry@pacific.net.au> writes:
> tom, I wonder if you could give us a sample of using client side lo_creat ,
> insert  functions to insert a blob into postgres - in an sql statement.

There's an example program in the docs:
http://www.postgresql.org/docs/8.4/static/lo-examplesect.html

            regards, tom lane

Re: bytea and text

From
Syan Tan
Date:
Is there a gambas specific wrapper for the lo_export/lo_import function ?

On Sat 28/11/09 01:29 , "Tom Lane" tgl@sss.pgh.pa.us sent:
> richard terry <rterry@
> pacific.net.au> writes:> tom, I wonder if you could give us a sample of
> using client side lo_creat , > insert  functions to insert a blob into postgres
> - in an sql statement.
> There's an example program in the docs:
> http://www.postgresql.org/docs/8.4/static/lo-examplesect.html
> regards, tom lane
>
> --
> Sent via pgsql-novice mailing list (p
> gsql-novice@postgresql.org)To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-novice
>
>


Re: bytea and text

From
Syan Tan
Date:
here's a quote from the postgresql documentation

Backslash characters (\) can be used in the COPY data to quote data characters that might otherwise be taken as
row or column delimiters. In particular, the following characters must be preceded by a backslash if they appear
as part of a column value: backslash itself, newline, carriage return, and the current delimiter character.

So if I scan a field value ( say a big buffer to hold an entire image), and have a second buffer to copy
an escaped image string to , and I read one of the 4 characters mentioned above, and insert a backslash
into the destination buffer, and then continue copying into the second buffer, I should end up with a string
that I can put inside a comma-separated text file ( so comma = the current delimiter character ), and then
I can bulk load a huge text file containing thousands of images without problems using COPY..FROM , is that
correct ?


On Sat 28/11/09 01:29 , "Tom Lane" tgl@sss.pgh.pa.us sent:
> richard terry <rterry@
> pacific.net.au> writes:> tom, I wonder if you could give us a sample of
> using client side lo_creat , > insert  functions to insert a blob into postgres
> - in an sql statement.
> There's an example program in the docs:
> http://www.postgresql.org/docs/8.4/static/lo-examplesect.html
> regards, tom lane
>
> --
> Sent via pgsql-novice mailing list (p
> gsql-novice@postgresql.org)To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-novice
>
>


Re: bytea and text

From
Bob McConnell
Date:
Syan Tan wrote:
> here's a quote from the postgresql documentation
>
> Backslash characters (\) can be used in the COPY data to quote data
> characters that might otherwise be taken as row or column delimiters.
> In particular, the following characters must be preceded by a
> backslash if they appear as part of a column value: backslash itself,
> newline, carriage return, and the current delimiter character.
>
> So if I scan a field value ( say a big buffer to hold an entire
> image), and have a second buffer to copy an escaped image string to ,
> and I read one of the 4 characters mentioned above, and insert a
> backslash into the destination buffer, and then continue copying into
> the second buffer, I should end up with a string that I can put
> inside a comma-separated text file ( so comma = the current delimiter
> character ), and then I can bulk load a huge text file containing
> thousands of images without problems using COPY..FROM , is that
> correct ?
>
>
> On Sat 28/11/09 01:29 , "Tom Lane" tgl@sss.pgh.pa.us sent:
>> richard terry <rterry@ pacific.net.au> writes:> tom, I wonder if
>> you could give us a sample of using client side lo_creat , > insert
>> functions to insert a blob into postgres - in an sql statement.
>> There's an example program in the docs:
>> http://www.postgresql.org/docs/8.4/static/lo-examplesect.html
>> regards, tom lane

It is my understanding that the current SQL standard defines the single
quote as the official escape character. While the DB engine accepts
others, wouldn't it be best to use the official character?

Bob McConnell
N2SPP


Re: bytea and text

From
Didier Gasser-Morlay
Date:
I hope you won't mind if I had my 2 cents to this conversation; aside from the actual format, this question comes up with regularity on various database lists.

I have never understood the need for storing images inside a database (and to some extend blob data) because:
  - being a blog you do not query it just store and retrieve, you rarely update, so the need for a fancy SQL and DB engine is somewhat limited;
 
  - storing images is heavy, makes the database grow, thus makes backup/restore that bit more painful and possibly that bit less stable. ..

For years I designed system for picture libraries where numbers like 10s of thousands to millions of images are common place, we would never have contemplated to store images on anything more complex than a file system. using the database to store a URL to that resource.

Hope this helps

Didier

Re: bytea and text

From
"Jean-Yves F. Barbier"
Date:
Didier Gasser-Morlay a écrit :
> I hope you won't mind if I had my 2 cents to this conversation; aside
> from the actual format, this question comes up with regularity on
> various database lists.
>
> I have never understood the need for storing images inside a database
> (and to some extend blob data) because:
>   - being a blog you do not query it just store and retrieve, you rarely
> update, so the need for a fancy SQL and DB engine is somewhat limited;

this is not a blog, but pictures aren't updated

>   - storing images is heavy, makes the database grow, thus makes
> backup/restore that bit more painful and possibly that bit less stable. ..
                                                    ^^^^^^^^^^^^^^^^^^^^
?

> For years I designed system for picture libraries where numbers like 10s
> of thousands to millions of images are common place, we would never have
> contemplated to store images on anything more complex than a file
> system. using the database to store a URL to that resource.

hmmmm, just a one shot backup for "just users".

I'm not going to feed this troll, this is MY choice.

--
Being stoned on marijuana isn't very different from being stoned on gin.
        -- Ralph Nader

Re: bytea and text

From
Didier Gasser-Morlay
Date:
> this is not a blog, but pictures aren't updated
Sorry I meant "blob", I guess it was not obvious,

>hmmmm, just a one shot backup for "just users".
hmmmm never heard of backup scripts ?

>I'm not going to feed this troll, this is MY choice.
Not trolling, trying to help by sharing experience, which I thought was the purpose of this list.
I do apologize for interrupting your thought process

Re: bytea and text

From
richard terry
Date:
On Sunday 29 November 2009 17:50:10 Didier Gasser-Morlay wrote:
> I hope you won't mind if I had my 2 cents to this conversation; aside from
> the actual format, this question comes up with regularity on various
> database lists.
>
> I have never understood the need for storing images inside a database (and
> to some extend blob data) because:
>   - being a blog you do not query it just store and retrieve, you rarely
> update, so the need for a fancy SQL and DB engine is somewhat limited;
>
>   - storing images is heavy, makes the database grow, thus makes
> backup/restore that bit more painful and possibly that bit less stable. ..
>
> For years I designed system for picture libraries where numbers like 10s of
> thousands to millions of images are common place, we would never have
> contemplated to store images on anything more complex than a file system.
> using the database to store a URL to that resource.
>
> Hope this helps
>
> Didier
>
this is **mission critical** for me (alone perhaps) as the images (usually
very small  (eg many could be 15K for a small photo, 3k - 16K for a small
diagram of a body part) are part of some medical record software I'm writing
for myself, and sit in the progress notes when displayed.  Just alteration of
a line drawn on a body part for an injured patient could end you up in court
or de-registered.

Medico-legal issues abound.

Yes, one can easily backup files in a directory, but my current feeling is its
easier to do an entire DB dump - could be wrong, stand to be corrected.

Regards

Richard

Re: bytea and text

From
Jasen Betts
Date:
On 2009-11-29, Didier Gasser-Morlay <didiergm@gmail.com> wrote:
> --0016e6dab093a91c6704797cefd5
> Content-Type: text/plain; charset=ISO-8859-1
>
> I hope you won't mind if I had my 2 cents to this conversation; aside from
> the actual format, this question comes up with regularity on various
> database lists.
>
> I have never understood the need for storing images inside a database (and
> to some extend blob data) because:
>   - being a blog you do not query it just store and retrieve, you rarely
> update, so the need for a fancy SQL and DB engine is somewhat limited;
>
>   - storing images is heavy, makes the database grow, thus makes
> backup/restore that bit more painful and possibly that bit less stable. ..
>
> For years I designed system for picture libraries where numbers like 10s of
> thousands to millions of images are common place, we would never have
> contemplated to store images on anything more complex than a file system.
> using the database to store a URL to that resource.

ON DELETE CASCADE doesn't work for disk files.

all the applications that need the images need to ba able to
contact the server for the files.

only need to backup one thing.


Re: bytea and text

From
Kris Kewley
Date:
Some interesting points were made. Depending on the roadmap for you
application consideration should be given to storing the image files
outside of the db.

IMO the driving factors here are:
1. Scalability
2. Version control of images
3. Backup
4. Performance

Including images in your db will accerlate it's growth. You noted
legal concerns, version control is typically an important aspect of
this. Back up and restore time will increase as db size is larger, and
will be more costly (assume db on more expensive disk than file
server). Finally performance.. Blobs typically increase disk io on
queries (at least they do in oracle) so should be used judicially in
heavily accessed tables.

Realize that I have made some wild assumptions about your app here,
but this is my experience.

Consider a link in your db and using cvs for storing the images, or
filesystem depending on the extent of to your needs.

I too stand to be corrected :-)
Kris


On 29-Nov-09, at 3:30, richard terry <rterry@pacific.net.au> wrote:

> On Sunday 29 November 2009 17:50:10 Didier Gasser-Morlay wrote:
>> I hope you won't mind if I had my 2 cents to this conversation;
>> aside from
>> the actual format, this question comes up with regularity on various
>> database lists.
>>
>> I have never understood the need for storing images inside a
>> database (and
>> to some extend blob data) because:
>>  - being a blog you do not query it just store and retrieve, you
>> rarely
>> update, so the need for a fancy SQL and DB engine is somewhat
>> limited;
>>
>>  - storing images is heavy, makes the database grow, thus makes
>> backup/restore that bit more painful and possibly that bit less
>> stable. ..
>>
>> For years I designed system for picture libraries where numbers
>> like 10s of
>> thousands to millions of images are common place, we would never have
>> contemplated to store images on anything more complex than a file
>> system.
>> using the database to store a URL to that resource.
>>
>> Hope this helps
>>
>> Didier
>>
> this is **mission critical** for me (alone perhaps) as the images
> (usually
> very small  (eg many could be 15K for a small photo, 3k - 16K for a
> small
> diagram of a body part) are part of some medical record software I'm
> writing
> for myself, and sit in the progress notes when displayed.  Just
> alteration of
> a line drawn on a body part for an injured patient could end you up
> in court
> or de-registered.
>
> Medico-legal issues abound.
>
> Yes, one can easily backup files in a directory, but my current
> feeling is its
> easier to do an entire DB dump - could be wrong, stand to be
> corrected.
>
> Regards
>
> Richard
>
> --
> Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-novice

Re: bytea and text

From
"Jean-Yves F. Barbier"
Date:
Kris Kewley a écrit :
> Some interesting points were made. Depending on the roadmap for you
> application consideration should be given to storing the image files
> outside of the db.
>
> IMO the driving factors here are:
> 1. Scalability
> 2. Version control of images
> 3. Backup
> 4. Performance
>
> Including images in your db will accerlate it's growth. You noted legal
> concerns, version control is typically an important aspect of this. Back
> up and restore time will increase as db size is larger, and will be more
> costly (assume db on more expensive disk than file server). Finally
> performance.. Blobs typically increase disk io on queries (at least they
> do in oracle) so should be used judicially in heavily accessed tables.

These tables are not heavily accessed: they're linked to heavily accessed
other tables.
They only contains an id, a RI, a pic_nb and a bytea.
Pictures are practically never upgraded and jettisoned when obsolete.

> Realize that I have made some wild assumptions about your app here, but
> this is my experience.
>
> Consider a link in your db and using cvs for storing the images, or
> filesystem depending on the extent of to your needs.

NO, I want only one backup solution, not one for each case; and perfs are
only needed for other tables.

> I too stand to be corrected :-)

I see, you're doing BDSM :D

JY
--
Leave no stone unturned.
        -- Euripides