Thread: images in database

images in database

From
"alex b."
Date:
well, as many of  you might have noticed, I am new to sql... :>

I know of the possibility in mysql to store images or other files in the
database.

so: is there a similar thing with pgsql?
if so, then how do you realize this?

thanks in advance


Re: images in database

From
Jeff Davis
Date:
> I know of the possibility in mysql to store images or other files in the
> database.
>
> so: is there a similar thing with pgsql?
> if so, then how do you realize this?

look into the "bytea" datatype.

Jeff


Re: images in database

From
Jan Wieck
Date:
"alex b." wrote:
>
> well, as many of  you might have noticed, I am new to sql... :>
>
> I know of the possibility in mysql to store images or other files in the
> database.
>
> so: is there a similar thing with pgsql?
> if so, then how do you realize this?

I never get why people are so eager to use knowingly proprietary
features. Aren't all the other differences in SQL dialects enough to
make porting hard?

What about B64 encoding and storing as strings?


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: images in database

From
"Nigel J. Andrews"
Date:
On Thu, 3 Apr 2003, Jan Wieck wrote:

> "alex b." wrote:
> >
> > well, as many of  you might have noticed, I am new to sql... :>
> >
> > I know of the possibility in mysql to store images or other files in the
> > database.
> >
> > so: is there a similar thing with pgsql?
> > if so, then how do you realize this?
>
> I never get why people are so eager to use knowingly proprietary
> features. Aren't all the other differences in SQL dialects enough to
> make porting hard?
>
> What about B64 encoding and storing as strings?

Well depends on what people want. I mentioned Base64 encoding and storing
numerous times on a current project and mostly was met with blank stares, no
comment or the same questions I'd answered several times before.

In the end, we went with large objects because my tests showed them faster than
bytea and there could be large sized objects, a B64 encoding would bloat the
size (encode/decode time and size bloat were what slowed the bytea down I
think) and the large objects let us extract the file in portions for streaming
should we wish.

B64 would still require a bytea storage type to avoid any character encoding
issue between client and server though, right?


--
Nigel J. Andrews


Re: images in database

From
Shridhar Daithankar
Date:
On Thursday 03 April 2003 17:38, you wrote:
>
> In the end, we went with large objects because my tests showed them faster
> than bytea and there could be large sized objects, a B64 encoding would
> bloat the size (encode/decode time and size bloat were what slowed the
> bytea down I think) and the large objects let us extract the file in
> portions for streaming should we wish.


Just wondering, how does it exactly differ from storing an image in a file and
storing the filename in the database?

 Shridhar


Re: images in database

From
Jan Wieck
Date:
"Nigel J. Andrews" wrote:
> B64 would still require a bytea storage type to avoid any character encoding
> issue between client and server though, right?

I don't think so, since it uses characters from the 7-Bit ASCII set
exclusively and IIRC those map to the same codes in almost every
encoding except EBCDIC (though, I'm not an encoding expert).

What might have slowed down bytea as it would text is TOAST's attempt to
compress the data. As image data usually is compressed already (if not
using PPM or such), this will be a waste of time. For that reason alone,
TOAST offers you to disable the compression attempt on a per attribute
base.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: images in database

From
"Nigel J. Andrews"
Date:
On Thu, 3 Apr 2003, Jan Wieck wrote:

> "Nigel J. Andrews" wrote:
> > B64 would still require a bytea storage type to avoid any character encoding
> > issue between client and server though, right?
>
> I don't think so, since it uses characters from the 7-Bit ASCII set
> exclusively and IIRC those map to the same codes in almost every
> encoding except EBCDIC (though, I'm not an encoding expert).
>
> What might have slowed down bytea as it would text is TOAST's attempt to
> compress the data. As image data usually is compressed already (if not
> using PPM or such), this will be a waste of time. For that reason alone,
> TOAST offers you to disable the compression attempt on a per attribute
> base.
>

Now that is interesting, I didn't try that. I might have another test run to
see what difference it makes, if I find the time.

There was a definite relationship between data size increase through escaping
and speed. Some of which would be down to the escape/unescape process itself,
some down to the transmission between client and server and then there'd be
some impact from the server's operations as well. I didn't quantify the split.


--
Nigel J. Andrews


Re: images in database

From
"Nigel J. Andrews"
Date:
On Thu, 3 Apr 2003, Shridhar Daithankar wrote:

> On Thursday 03 April 2003 17:38, you wrote:
> >
> > In the end, we went with large objects because my tests showed them faster
> > than bytea and there could be large sized objects, a B64 encoding would
> > bloat the size (encode/decode time and size bloat were what slowed the
> > bytea down I think) and the large objects let us extract the file in
> > portions for streaming should we wish.
>
>
> Just wondering, how does it exactly differ from storing an image in a file and
> storing the filename in the database?
>

Well it's in the database is the principal difference. There's a central
storage location, the client doesn't need to run on the same machine, the
client doesn't need to manage a filesystem, pg_dump can back up everything
(although I haven't actually tested this yet and I seem to remember someone
reporting a restore problem).

The file system option was one we considered.


--
Nigel J. Andrews


Re: images in database

From
Jan Wieck
Date:
Shridhar Daithankar wrote:
>
> On Thursday 03 April 2003 17:38, you wrote:
> >
> > In the end, we went with large objects because my tests showed them faster
> > than bytea and there could be large sized objects, a B64 encoding would
> > bloat the size (encode/decode time and size bloat were what slowed the
> > bytea down I think) and the large objects let us extract the file in
> > portions for streaming should we wish.
>
> Just wondering, how does it exactly differ from storing an image in a file and
> storing the filename in the database?

The image data would not be included in a database dump. So you loose
the ability to save a consistent snapshot.

The image data would not be accessible remotely except if made available
via network filesystems. Then you get permission and path problems.

The image data does not follow the same transactional semantics as all
other data. Due to MVCC, some process might actually have a write lock
on the row containing the filename and is right now writing to the file
while another process reads the old row and then the (intermediate mess
in the) file.

Rolling back a database transaction in case of whatever error will not
rollback changes to the image data.

Process (or system) crash in the middle of an update is not covered by
the databases crash recovery capability.


Enough reasons to forget about filename storage?


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: images in database

From
Arjen van der Meijden
Date:
> Jan Wieck wrote:
>
> The image data would not be included in a database dump. So
> you loose the ability to save a consistent snapshot.
With the penalty of (much?) larger and slower database dumps.

> The image data would not be accessible remotely except if
> made available via network filesystems. Then you get
> permission and path problems.
True, but they are all solvable ;)

> The image data does not follow the same transactional
> semantics as all other data. Due to MVCC, some process might
> actually have a write lock on the row containing the filename
> and is right now writing to the file while another process
> reads the old row and then the (intermediate mess in the) file.
>
> Rolling back a database transaction in case of whatever error
> will not rollback changes to the image data.
>
> Process (or system) crash in the middle of an update is not
> covered by the databases crash recovery capability.
All three nice reasons, although it is sometimes simply "not thát
important" and in that case you're running into the main reason not to
do this.
The performance is quite a lot lower for the database-stored version, I
think I recall seeing 5-10 times slower file load times (ie from
retrieving the filename/location from the database and then reading the
file from the FS vs retrieving it from a bytea field).
I did not consider the compression stuff of TOAST, but I did compare the
same functionality with mysql vs FS and postgres vs FS. These tests were
not at all well performed, but just to get an idea on how the
performance was, all done on my local system with connections to the
local databases and directly reading from file.

Perhaps my increased understanding of how things work could make a
difference (like only 4-7 times slower orso).
Another disadvantage is that processing time for the database-processes
are much longer and in that case you load the database-server with extra
load it wouldn't have when you'd use the FS.
How postgres' LOB functionality fits into this, I don't now. It probably
fits in between the two in terms of performance (and ease of use).

> Enough reasons to forget about filename storage?
Imho it entirely depends on the requirements, your reasons to not go for
the local filesystem are quite good, but the main reason to go for the
FS is also a very important one in some situations.

So it becomes a "correctness and ease of use vs raw performance and
system load (including distribution of tasks)" comparison, I guess.

Regards,

Arjen


Re: images in database

From
Jan Wieck
Date:
Arjen van der Meijden wrote:
> > Enough reasons to forget about filename storage?
> Imho it entirely depends on the requirements, your reasons to not go for
> the local filesystem are quite good, but the main reason to go for the
> FS is also a very important one in some situations.
>
> So it becomes a "correctness and ease of use vs raw performance and
> system load (including distribution of tasks)" comparison, I guess.

Sure, the "correctness" of the pix isn't that important, speed counts
... except maybe if "naked" is only important because of photo technical
issues - like for Xray?

You're right, it depends on the requirements. The problem is that people
tend to do again and again what they've done before (cut'n'paste
programmers) without really reevaluating how the solution fit's into the
new requirements.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: images in database

From
Arjen van der Meijden
Date:
> Jan Wieck wrote
>
> Sure, the "correctness" of the pix isn't that important,
> speed counts ... except maybe if "naked" is only important
> because of photo technical issues - like for Xray?
Indeed, there are a large number of examples where correctness really
counts.
And there are a large number of examples where it doesn't matter 'that
much', while speed is more important.

For instance, we offer our users a 'photoalbum', it's quite nice, works
well, performs well etc.
But I really don't care if an image doesn't get uploaded correctly,
forcing the user to do upload it again. Or that a user has a rare chance
of getting wrong data at his screen.
What I do care about is that such features don't overload our already
'nicely loaded' databases :)

In other areas it might be the other way around, your Xray images
example is a nice one.

> You're right, it depends on the requirements. The problem is
> that people tend to do again and again what they've done
> before (cut'n'paste
> programmers) without really reevaluating how the solution
> fit's into the new requirements.
Too bad they do :(
It's all the OO-paradigma problem! ;) Reuse, reuse and reuse.

Anyway, that is an entirely different discussion :)

Regards,

Arjen


Re: images in database

From
"alex b."
Date:
Nigel J. Andrews wrote:
> On Thu, 3 Apr 2003, Jan Wieck wrote:
>
>
>>"Nigel J. Andrews" wrote:
>>
>>>B64 would still require a bytea storage type to avoid any character encoding
>>>issue between client and server though, right?
>>
>>I don't think so, since it uses characters from the 7-Bit ASCII set
>>exclusively and IIRC those map to the same codes in almost every
>>encoding except EBCDIC (though, I'm not an encoding expert).
>>
>>What might have slowed down bytea as it would text is TOAST's attempt to
>>compress the data. As image data usually is compressed already (if not
>>using PPM or such), this will be a waste of time. For that reason alone,
>>TOAST offers you to disable the compression attempt on a per attribute
>>base.
>>
>
>
> Now that is interesting, I didn't try that. I might have another test run to
> see what difference it makes, if I find the time.
>
> There was a definite relationship between data size increase through escaping
> and speed. Some of which would be down to the escape/unescape process itself,
> some down to the transmission between client and server and then there'd be
> some impact from the server's operations as well. I didn't quantify the split.
>
>



well, I thank all of you guys for answering my question!! ;)
I appreciate it very much!

cheers,
  alex