Thread: How to store files into the DB with PHP. (bytea ?)

How to store files into the DB with PHP. (bytea ?)

From

"François Delpierre"

Date:

02 February 2009, 06:24:21

Hi,

I know this question came quite often, but I never found a clear answer in the
mailing list.

My problem :
- I need to store files (pdf, png, jpg) from a web page in PHP.
- I have a strong need of security and high availability. I must minimize the
risk of having these documents stolen if the web server is compromised.

My questions :
- Which datatype ? (once again, sorry.).  (Text, bytea, lo)  ?
- Most of samples redirect to lo (large objects), and PHP has a set of
dedicated functions for them. But in the documentation we can read :

All large objects are placed in a single system table called pg_largeobject.
PostgreSQL also supports a storage system called "TOAST" that automatically
stores values larger than a single database page into a secondary storage area
per table. This makes the large object facility partially obsolete.

Why obsolete ? Is it now because of TOAST that we can store up to 1GB into
bytea ? If bytea is the easiest solution, why do so many people redirect to LO
?
LO stores the datas within only one table, pg_largeobject, I really prefer to
store the big datas within my custom table stored on a dedicated tablespace.

Due to the high security level of my web application, all the datas must be
accessed through stored procedures. Does it change something to the choice ?

Where can I find samples of files stored through PHP & stored procedures ?

Regards,

--
François Delpierre

Re: How to store files into the DB with PHP. (bytea ?)

From

tv@fuzzy.cz

Date:

02 February 2009, 07:55:15

> Hi,
>
> I know this question came quite often, but I never found a clear answer in
> the
> mailing list.
>
> My problem :
> - I need to store files (pdf, png, jpg) from a web page in PHP.
> - I have a strong need of security and high availability. I must minimize
> the
> risk of having these documents stolen if the web server is compromised.

If the web server is compromised, the attacker will have full access to
the database just as the web application (the connection details are
stored in some configuration file, and he may load all the data he wants
using a custom PHP script uploaded to the server).

You may increase the security by encrypting the files, but the password
must not be saved in the database. For example if the files are not shared
by multiple users, each file may be encrypted using the user password.

> My questions :
> - Which datatype ? (once again, sorry.).  (Text, bytea, lo)  ?
> - Most of samples redirect to lo (large objects), and PHP has a set of
> dedicated functions for them. But in the documentation we can read :
>
> All large objects are placed in a single system table called
> pg_largeobject.
> PostgreSQL also supports a storage system called "TOAST" that
> automatically
> stores values larger than a single database page into a secondary storage
> area
> per table. This makes the large object facility partially obsolete.
>
> Why obsolete ? Is it now because of TOAST that we can store up to 1GB into
> bytea ? If bytea is the easiest solution, why do so many people redirect
> to LO
> ?
> LO stores the datas within only one table, pg_largeobject, I really prefer
> to
> store the big datas within my custom table stored on a dedicated
> tablespace.

Choice between BYTEA and LOB is mostly about memory requirements - when
loading / storing BYTEA columns, you need enough memory to keep the whole
object (data stored in the column). So when constructing the INSERT
statement you have to place the (escaped) data into it, and that may
exceed the memory_limit. With LOB objects this is not a problem thanks to
streaming (you may open the LOB as a stream and read / write).

On the other side, LOB columns do not have some interesting features
(foreign keys, etc.)

> Due to the high security level of my web application, all the datas must
> be
> accessed through stored procedures. Does it change something to the choice
> ?
>
> Where can I find samples of files stored through PHP & stored procedures ?

Sorry, but you have some custom layer of stored procedures, so we may only
guess. But LOBs are identified using an OID, so the procedures might
return OIDs.

regards
Tomas

Re: How to store files into the DB with PHP. (bytea ?)

From

François Delpierre

Date:

02 February 2009, 08:55:43

On Monday 02 February 2009 12:55:04 tv@fuzzy.cz wrote:
> > Hi,
> >
> > I know this question came quite often, but I never found a clear answer
> > in the
> > mailing list.
> >
> > My problem :
> > - I need to store files (pdf, png, jpg) from a web page in PHP.
> > - I have a strong need of security and high availability. I must minimize
> > the
> > risk of having these documents stolen if the web server is compromised.
>
> If the web server is compromised, the attacker will have full access to
> the database just as the web application (the connection details are
> stored in some configuration file, and he may load all the data he wants
> using a custom PHP script uploaded to the server).
>
> You may increase the security by encrypting the files, but the password
> must not be saved in the database. For example if the files are not shared
> by multiple users, each file may be encrypted using the user password.
I do not agree, this is in fact the reason why stored procedures are mandatory
here.

If you use stored procedures for security, the credentials that are stored on
your web server are restricted to execute procedures, and eventually have a
read only access to some tables with no critical informations (ie
translations), all other tables are simply forbidden for the web server user.

For this security level to work, it means that you use a "login" stored
procedure that returns a "ticket" to the web server and/or web browser, valid
for a certain amount of time and for the source IP. Every other procedures
will require a valid "ticket" in the arguments before it returns data.)

This way, the sessions are handled within the database, and the user/password
stored in the front end web server cannot read any data from the database
without going through a stored procedure that enforce the security.

An other option maybe would be to authenticate users on the database directly,
and only use a stored procedure (with security definer) to create new users. I
do not know exactly the pro/cons to work this way. I would prefer to keep my
users within an application table instead of the DB (and in a system table),
but this may be the easiest solutions as the stored procedures would no more
be required.

>
> > My questions :
> > - Which datatype ? (once again, sorry.).  (Text, bytea, lo)  ?
> > - Most of samples redirect to lo (large objects), and PHP has a set of
> > dedicated functions for them. But in the documentation we can read :
> >
> > All large objects are placed in a single system table called
> > pg_largeobject.
> > PostgreSQL also supports a storage system called "TOAST" that
> > automatically
> > stores values larger than a single database page into a secondary storage
> > area
> > per table. This makes the large object facility partially obsolete.
> >
> > Why obsolete ? Is it now because of TOAST that we can store up to 1GB
> > into bytea ? If bytea is the easiest solution, why do so many people
> > redirect to LO
> > ?
> > LO stores the datas within only one table, pg_largeobject, I really
> > prefer to
> > store the big datas within my custom table stored on a dedicated
> > tablespace.
>
> Choice between BYTEA and LOB is mostly about memory requirements - when
> loading / storing BYTEA columns, you need enough memory to keep the whole
> object (data stored in the column). So when constructing the INSERT
> statement you have to place the (escaped) data into it, and that may
> exceed the memory_limit. With LOB objects this is not a problem thanks to
> streaming (you may open the LOB as a stream and read / write).
So, there is no way to push binary directly into BYTEA without the need to
escape ?
I assume that you're talking about memory_limit parameter of PHP on the web
server.
Can we still use streaming through a stored procedure ?

>
> On the other side, LOB columns do not have some interesting features
> (foreign keys, etc.)
>
> > Due to the high security level of my web application, all the datas must
> > be
> > accessed through stored procedures. Does it change something to the
> > choice ?
> >
> > Where can I find samples of files stored through PHP & stored procedures
> > ?
>
> Sorry, but you have some custom layer of stored procedures, so we may only
> guess. But LOBs are identified using an OID, so the procedures might
> return OIDs.
The web server user cannot read (or write) any table directly, so it's forced
to use a stored procedure to read or write data. So, all the PHP fuctions on
LOB are useless I guess.

>
> regards
> Tomas
Regards,

--
François Delpierre

Re: How to store files into the DB with PHP. (bytea ?)

From

tv@fuzzy.cz

Date:

02 February 2009, 09:17:45

>> Choice between BYTEA and LOB is mostly about memory requirements - when
>> loading / storing BYTEA columns, you need enough memory to keep the
>> whole
>> object (data stored in the column). So when constructing the INSERT
>> statement you have to place the (escaped) data into it, and that may
>> exceed the memory_limit. With LOB objects this is not a problem thanks
>> to
>> streaming (you may open the LOB as a stream and read / write).
> So, there is no way to push binary directly into BYTEA without the need to
> escape ?

AFAIK no (at least in PHP) - you have to escape the data before using them
in SQL statement.

> I assume that you're talking about memory_limit parameter of PHP on the
> web
> server.
> Can we still use streaming through a stored procedure ?

The LOB principle is quite simple - each LOB is identified by OID, split
into multiple segments and stored in a table pg_largeobject. I am just
guessing but there are functions used to read / store the LOBs, so the
streaming might work (unless you want to use the tickets).

But I am just guessing - about a month ago I wrote an article about ways
to store files in a PostgreSQL database:

http://www.fuzzy.cz/en/articles/storing-files-in-a-postgresql-database/

The site is still under construction, so a lot of stuff is missing there
(sorry), but the last section might give you a hint how to mimic LOBs, get
rid of some of the BYTEA / LOB disadvantages.

>> Sorry, but you have some custom layer of stored procedures, so we may
>> only
>> guess. But LOBs are identified using an OID, so the procedures might
>> return OIDs.
> The web server user cannot read (or write) any table directly, so it's
> forced
> to use a stored procedure to read or write data. So, all the PHP fuctions
> on
> LOB are useless I guess.

Re: How to store files into the DB with PHP. (bytea ?)

From

François Delpierre

Date:

02 February 2009, 10:13:12

On Monday 02 February 2009 14:17:38 tv@fuzzy.cz wrote:
>
> But I am just guessing - about a month ago I wrote an article about ways
> to store files in a PostgreSQL database:
>
> http://www.fuzzy.cz/en/articles/storing-files-in-a-postgresql-database/
>
> The site is still under construction, so a lot of stuff is missing there
> (sorry), but the last section might give you a hint how to mimic LOBs, get
> rid of some of the BYTEA / LOB disadvantages.
Thanks for this address, this clearly expose the problem. It looks like there
is no perfect solution. As I can live with a file size limit of 15Mo for a
memory_limit of 64Mo in PHP, I'll certainly go to BYTEA. Anyway, if I use
stored procedures to enforce security, the PHP pg_lo_xxx functions will be
useless.

Thanks a lot,

François Delpierre

Re: How to store files into the DB with PHP. (bytea ?)

From

Andrew McMillan

Date:

02 February 2009, 17:18:10

On Mon, 2009-02-02 at 13:55 +0100, François Delpierre wrote:
> On Monday 02 February 2009 12:55:04 tv@fuzzy.cz wrote:
> >
> > If the web server is compromised, the attacker will have full access to
> > the database just as the web application (the connection details are
> > stored in some configuration file, and he may load all the data he wants
> > using a custom PHP script uploaded to the server).
> >
> > You may increase the security by encrypting the files, but the password
> > must not be saved in the database. For example if the files are not shared
> > by multiple users, each file may be encrypted using the user password.
> I do not agree, this is in fact the reason why stored procedures are mandatory
> here.

I don't see that this changes things.  Whether you use stored
procedures, authenticate against the database, or whatever, your web
application layer has access to the information on the way through and
any compromise of your webserver will necessarily involve having a 'man
in the middle' possibility.

> If you use stored procedures for security, the credentials that are stored on
> your web server are restricted to execute procedures, and eventually have a
> read only access to some tables with no critical informations (ie
> translations), all other tables are simply forbidden for the web server user.

> For this security level to work, it means that you use a "login" stored
> procedure that returns a "ticket" to the web server and/or web browser, valid
> for a certain amount of time and for the source IP. Every other procedures
> will require a valid "ticket" in the arguments before it returns data.)

So an attacker would (e.g.) log the user's credentials as they pass
through and then happily generate their own tickets to use to extract
the data.

> This way, the sessions are handled within the database, and the user/password
> stored in the front end web server cannot read any data from the database
> without going through a stored procedure that enforce the security.
>
> An other option maybe would be to authenticate users on the database directly,
> and only use a stored procedure (with security definer) to create new users. I
> do not know exactly the pro/cons to work this way. I would prefer to keep my
> users within an application table instead of the DB (and in a system table),
> but this may be the easiest solutions as the stored procedures would no more
> be required.

So in this case an attacker would (e.g.) log the user's credentials on
the way through and use them for subsequent connections to the
database...

That's not to say that you aren't adding some security: you are.  Making
an intruder understand how your application works before being able to
steal your data will naturally give you a longer window of opportunity
to catch them.  Probably you would have raised the bar sufficiently
above the interest level of an opportunistic script kiddy.

Odds are, though, that the level of difference between someone who would
be deliberately cracking your server to pull the data from your
database, and someone who would be deliberately cracking your server,
understanding your application and then pulling the data from your
database isn't that great.

If you take your paranoia one step further, do the user authentication
in the database using some kind of one-time password device then *maybe*
you will be getting a more repeatable level of security.  In that case
someone would only be able to sniff the files as they were accessed -
and would not be able to extract them from the database.

Cheers,
                    Andrew McMillan.

------------------------------------------------------------------------
andrew (AT) morphoss (DOT) com                            +64(272)DEBIAN
        Open Source: the difference between trust and antitrust
------------------------------------------------------------------------

Re: How to store files into the DB with PHP. (bytea ?)

From

François Delpierre

Date:

03 February 2009, 12:16:49

Hi Andrew,

> I don't see that this changes things.  Whether you use stored
> procedures, authenticate against the database, or whatever, your web
> application layer has access to the information on the way through and
> any compromise of your webserver will necessarily involve having a 'man
> in the middle' possibility.
You're right, authenticating against the DB will not change anything, my
mistake. As far as the user can read a table, he can read all records.


> So an attacker would (e.g.) log the user's credentials as they pass
> through and then happily generate their own tickets to use to extract
> the data.
Totally agree, the attacker will be able to access the files of the users that
are connecting from the time he put the sniffer in place BUT NOT dump the whole
content with thousands of documents from the previous months from users that
did not connect recently. So, this limit the impact.

To go back to the initial subject of this post, I'm now able to store/read
files from the DB up to 20MB without problem. Without using stored procedures
yet. (Maybe I can post the code here.) Only an annoying warning about escaping
that I can't figure out yet.

François.