Thread: Secure DB Systems - How to
I was wondering if it is possible to create a secure database system usingPostgreSQL/PHP combination? I have the following in mind: I wanted to store all my( and my brothers and sisters) important document information such as birth certificate, SSN, passport number, travel documents, insurance(car, home, etc) document, and other important documents imagined in the database. The data will be entered either manually and/or scanned(with OCR). I need to be able to search on all the fields in the database. We have 10 computers(5bros, 4sisters, and myself) plus 1 server with I maintained. The data should be synchronize/replicate between those computers. Well, so far it is easy, isn't it? Here's my question: a) How can I make sure that it secure so only authorized person can modify/add/delete the information? Beside transaction logs, are there any other method to trace any transaction(kind of paper trail)? Assuming there are 3 step process to one enter the info e.g: - One who enter the info (me) - One who verify the info(the owner of info) - One who verify and then commit the change! How can I implement such a process in PostgreSQL and/or PHP or any other web language? b) How can I make sure that no one can tap the info while we are entering the data in the computer? (our family are scattered within US and Canada) c) Is it possible to securely synchronize/replicate between our computers using VPN? Does PostgreSQL has this functionality by default? d) Other secure method that I have not yet mentioned. Anyone has good ideas on how to implement such a systems? Thanks
On Thu, Jul 08, 2004 at 11:49:36 -0400, Sarah Tanembaum <sarahtanembaum@yahoo.com> wrote: > I was wondering if it is possible to create a secure database system > usingPostgreSQL/PHP combination? > > I have the following in mind: > > I wanted to store all my( and my brothers and sisters) important document > information such as birth certificate, SSN, passport number, travel > documents, insurance(car, home, etc) document, and other important documents > imagined in the database. > > The data will be entered either manually and/or scanned(with OCR). I need to > be able to search on all the fields in the database. > > We have 10 computers(5bros, 4sisters, and myself) plus 1 server with I > maintained. The data should be synchronize/replicate between those > computers. > > Well, so far it is easy, isn't it? > > Here's my question: > > a) How can I make sure that it secure so only authorized person can > modify/add/delete the information? Beside transaction logs, are there any > other method to trace any transaction(kind of paper trail)? Keeping the system administrator from seeing the data while making it searchable is difficult. To do this you need to encrypt the data on the client side using a key the client has (and this key has to be protected from loss) and the only searches you can do are equality searches using a hash or encrypted value. The system administrator can always delete the data. If you store which user has access to a row in the row, you can use views to control access to the rows for people other than the system administrator. > b) How can I make sure that no one can tap the info while we are entering > the data in the computer? (our family are scattered within US and Canada) Use SSL. > c) Is it possible to securely synchronize/replicate between our computers > using VPN? Does PostgreSQL has this functionality by default? Probably the best thing here is to run one live server and make backups of the system that you store at your relatives along with instructions for recovering them if something happens to you. Probably the backups should be encrypted with either the keys in your safe deposit box or using a system where something like 3 out of 5 keys can be used to recover the backup files. Be sure to test the backup recovery.
> Keeping the system administrator from seeing the data while making it > searchable is difficult. To do this you need to encrypt the data on > the client side using a key the client has (and this key has to be > protected from loss) and the only searches you can do are equality > searches using a hash or encrypted value. You can also perform regex searches. Here is an example to get you started: CREATE TABLE crypto ( id SERIAL PRIMARY KEY, title VARCHAR(50), crypted_content BYTEA ); INSERT INTO crypto VALUES (1,'test1',encrypt_iv('daniel','fooz',decode('9MlPeZtpuxKo5m4O4+pd4g==','base64'),'aes')); INSERT INTO crypto VALUES (2,'test2',encrypt_iv('test','fooz',decode('9MlPeZtpuxKo5m4O4+pd4g==','base64'),'aes')); INSERT INTO crypto VALUES (3,'test3',encrypt_iv('struck','fooz',decode('9MlPeZtpuxKo5m4O4+pd4g==','base64'),'aes')); SELECT *,decrypt_iv(crypted_content, 'fooz',decode('9MlPeZtpuxKo5m4O4+pd4g==','base64'), 'aes') FROM crypto; -- equality search SELECT *,decrypt_iv(crypted_content,'fooz',decode('9MlPeZtpuxKo5m4O4+pd4g==','base64'),'aes') FROM crypto WHERE decrypt_iv(crypted_content, 'fooz',decode('9MlPeZtpuxKo5m4O4+pd4g==','base64'), 'aes')='struck'; -- regex search SELECT *,decrypt_iv(crypted_content,'fooz',decode('9MlPeZtpuxKo5m4O4+pd4g==','base64'),'aes') FROM crypto WHERE encode(decrypt_iv(crypted_content, 'fooz',decode('9MlPeZtpuxKo5m4O4+pd4g==','base64'), 'aes'),'escape') ~* 'daniel'; "fooz" is the password and "9MlPeZtpuxKo5m4O4+pd4g==" is the IV (initialization vector) stored in base64 format. I choosebase64 because it is more convenient to create queries with it. In the real database I do use a different IV for every row, so I do also store the IV with the row. In my case I do generate the IV by PHP with /dev/urandom as a random source. Greetings, Daniel Struck -- Retrovirology Laboratory Luxembourg Centre Hospitalier de Luxembourg 4, rue E. Barblé L-1210 Luxembourg phone: +352-44116105 fax: +352-44116113 web: http://www.retrovirology.lu e-mail: struck.d@retrovirology.lu
On Tue, Jul 13, 2004 at 11:35:57 +0200, Daniel Struck <struck.d@retrovirology.lu> wrote: > > Keeping the system administrator from seeing the data while making it > > searchable is difficult. To do this you need to encrypt the data on > > the client side using a key the client has (and this key has to be > > protected from loss) and the only searches you can do are equality > > searches using a hash or encrypted value. > > You can also perform regex searches. If you decrypt the data on the database, the sysadmin can see it. If you are willing to take that chance (e.g. if you primary concern is some third party getting a snapshot of the DB), then you can do lots of things.
> If you decrypt the data on the database, the sysadmin can see it. Hm, you are right. If one does decrypt the data on the database you have to sent the password to postgresql and so a administratorof the database could easily grasb the password. So the only way to go, would be to perform en/decryption on the client side? > If you are willing to take that chance (e.g. if you primary concern is > some third party getting a snapshot of the DB), then you can do lots of > things. I wonder now; if somebody could achieve to get a snapshot of the database, they could also be able to get the log-file ofpostgresql. So one would also have to make attention that the information like sql statements don't leak that way. Are there other places where this kind of information could leak? Greetings, Daniel Struck -- Retrovirology Laboratory Luxembourg Centre Hospitalier de Luxembourg 4, rue E. Barblé L-1210 Luxembourg phone: +352-44116105 fax: +352-44116113 web: http://www.retrovirology.lu e-mail: struck.d@retrovirology.lu
Daniel Struck wrote: >>If you decrypt the data on the database, the sysadmin can see it. >> >> > >Hm, you are right. If one does decrypt the data on the database you have to sent the password to postgresql and so a administratorof the database could easily grasb the password. > >So the only way to go, would be to perform en/decryption on the client side? > > Exactly. That is the only way to ensure that the data is never decrypted within the database or database server. Now, the 'client' IMHO is the PHP application. If the key for your encryption is stored in the user's session (on the webserver temporarily) then there is no log of that key or data (unless you store the session data in the database, then you got problems, see below). I'm starting an article on doing just this for International PHP Magazine, and of course will use PostgreSQL as the back-end ;-) >I wonder now; if somebody could achieve to get a snapshot of the database, they could also be able to get the log-file ofpostgresql. >So one would also have to make attention that the information like sql statements don't leak that way. >Are there other places where this kind of information could leak? > > If you wanted the key based on each user, then you could use their password (which is typically stored in an MD5 hash) as the key for encryption/decryption, and put it in the session for use by the application while they are logged in. This is the easiest (and most effective, IMHO) way to keep the encryption at the user level, and keep the data in the database encrypted at all times. Basically every record would be encrypted with the key for the user associated with that record, and there's a lot of work for anyone with a snapshot who is working on brute forcing all that data row by row... :) The only time that data is not encrypted is on the webserver, and only during transmission of that data back to the client. SSL would be the most common approach to solving this problem. My absolute favourite DB layer for PHP is ADOdb, which also has a class that transparently stores your session data in the database (if desired). This is crucial for sites that have multiple webservers and load balancers, as your session data needs to be accessible from the webserver that you are currently at. The problem here is that the key for each user would also be stored in the database if this method were used, rendering your efforts pointless! OTOH, using the default storage of session data would put all the user keys in temp files on the hard drives of the webservers. Not only does this not scale well (as you have to tell pound or LocalDirector or whatever load balancer you use to stick each user to the primary server), but some would consider this absolutely not-acceptable, as any administrator on the servers could see that data. So I suppose you have to pick your poison on this one. I gotta figure this out so I can start writing ;^P -- Mitch
> Exactly. That is the only way to ensure that the data is never > decrypted within the database or database server. But then one will also lose the ability to make sql-queries with the encrypted columns . Something I wouldn't want to miss,or else it doesn't make sense to store this data in a database if one can't work with it. I do send queries with the password within it, so I disabled the logging of sql-statements in postgresql to prevent the leakageof the password. But the problem exists that the password is the available at several places. As I am the only admin of the machine this shouldn't be such a big problem. > Now, the 'client' IMHO is the PHP application. If the key for your > encryption is stored in the user's session (on the webserver > temporarily) then there is no log of that key or data (unless you store > the session data in the database, then you got problems, see below). I simply store a sha1 hash of the password in a configuration file, to verify that the user did enter the correct password,or else he would only see garbage ;-) > I'm starting an article on doing just this for International PHP > Magazine, and of course will use PostgreSQL as the back-end ;-) Interesting, good that one can also order this editions online ;-) > the data in the database encrypted at all times. Basically every record > would be encrypted with the key for the user associated with that > record, and there's a lot of work for anyone with a snapshot who is > working on brute forcing all that data row by row... :) I have a different scenario; a couple of common sensitive information several users must have access to. So I have to gowith only one password. (At last their is a second layer of protection if someone steals the server or hard drives: I will encrypt the postgres fileswith dm_crypt. Right now I am testing it, and it seems to work fine so far.) > during transmission of that data back to the client. SSL would be the > most common approach to solving this problem. I have tested two-way ssl authentication. The certificates for the clients are stored on smartcards accessible to the mozillabrowser. (I have written a small howto, you can find it on http://www.opensc.org/files/doc/apache-client-authentication(v0.5.1).pdf,but I haven't yet tested it with the latest releaseof opensc) > the database if this method were used, rendering your efforts pointless! True, if you find the solution I hope to find it in your article ;-) Daniel -- Retrovirology Laboratory Luxembourg Centre Hospitalier de Luxembourg 4, rue E. Barblé L-1210 Luxembourg phone: +352-44116105 fax: +352-44116113 web: http://www.retrovirology.lu e-mail: struck.d@retrovirology.lu
On Tue, Jul 27, 2004 at 13:41:33 +0200, Daniel Struck <struck.d@retrovirology.lu> wrote: > > Exactly. That is the only way to ensure that the data is never > > decrypted within the database or database server. > > But then one will also lose the ability to make sql-queries with the encrypted columns . Something I wouldn't want to miss,or else it doesn't make sense to store this data in a database if one can't work with it. That depends on the kind of queries. Searching for exact matches should work fine. Some other things can be done in special cases. If you are interested in this topic you may want to pick up a copy of Translucent Databases.
Bruno Wolff III <bruno@wolff.to> writes: > That depends on the kind of queries. Searching for exact matches should work > fine. Some other things can be done in special cases. If searching for exact matches works then you're using a naive encryption system. The problem is that it also means your database is vulnerable to dictionary attacks. Good encryption systems will include random padding to ensure that you can't attack it by merely guessing many possible plaintexts and verifying to see if any match. -- greg
> If searching for exact matches works then you're using a naive encryption > system. The problem is that it also means your database is vulnerable to > dictionary attacks. Good encryption systems will include random padding to > ensure that you can't attack it by merely guessing many possible plaintexts > and verifying to see if any match. To prevent this, I use in my implementation for every encrypted value a corresponding IV to prevent that the same value willgive the same crypted text. This is a reason, why I must do pattern searches in postgresql itself, because I do need to include the IV-column in thesql statement. Daniel -- Retrovirology Laboratory Luxembourg Centre Hospitalier de Luxembourg 4, rue E. Barblé L-1210 Luxembourg phone: +352-44116105 fax: +352-44116113 web: http://www.retrovirology.lu e-mail: struck.d@retrovirology.lu
On Wed, Jul 28, 2004 at 16:16:10 -0400, Greg Stark <gsstark@mit.edu> wrote: > > Bruno Wolff III <bruno@wolff.to> writes: > > > That depends on the kind of queries. Searching for exact matches should work > > fine. Some other things can be done in special cases. > > If searching for exact matches works then you're using a naive encryption > system. The problem is that it also means your database is vulnerable to > dictionary attacks. Good encryption systems will include random padding to > ensure that you can't attack it by merely guessing many possible plaintexts > and verifying to see if any match. IVs act to make the key appear longer. This is especially useful when humans are picking passphrases that are used to generate the key. If you control what the actual keys are, then you can make dictionary attacks impractical. However, there still would be the problem that identical items in the database would be identical. Which, depnding on your application, might be a problem because of information leakage.
> IVs act to make the key appear longer. This is especially useful when humans > are picking passphrases that are used to generate the key. If you control > what the actual keys are, then you can make dictionary attacks impractical. > However, there still would be the problem that identical items in the > database would be identical. Which, depnding on your application, might > be a problem because of information leakage. I don't think this is true for CBC mode. Here the first block of plaintext is XORed with the IV then encrypted with a blockciper like AES for example. The next plaintext block is than XORed with the previous cyphertext and so on. So IV in CBC is used to make the encrypted cyphertext unique. You can find some info in chapter 9.3 of "Appplied Cryptopgraphy" from Bruce Schneier. In my application I do use a different IV for every encrypted plaintext: $td = mcrypt_module_open('rijndael-128', '', 'cbc', ''); $iv = base64_encode(mcrypt_create_iv(mcrypt_enc_get_iv_size($td), MCRYPT_DEV_URANDOM)); I do use base64 as it is easier to store it in the database. The plaintext is then encrypted using: $ciphertext = "encrypt_iv('" . $plaintext . "','" . $password] . "',decode('$iv','base64'),'aes')"; The IV is then stored together with the ciphertext in the database. Daniel -- Retrovirology Laboratory Luxembourg Centre Hospitalier de Luxembourg 4, rue E. Barblé L-1210 Luxembourg phone: +352-44116105 fax: +352-44116113 web: http://www.retrovirology.lu e-mail: struck.d@retrovirology.lu
On Fri, Jul 30, 2004 at 11:43:31 +0200, Daniel Struck <struck.d@retrovirology.lu> wrote: > > IVs act to make the key appear longer. This is especially useful when humans > > are picking passphrases that are used to generate the key. If you control > > what the actual keys are, then you can make dictionary attacks impractical. > > However, there still would be the problem that identical items in the > > database would be identical. Which, depnding on your application, might > > be a problem because of information leakage. I should have stated the above a bit differently. IVs are probably more used to prevent the information leakage from have the same plain text encode to the same cipher text when using the same, then for protecting users from using poorly chosen keys. > I don't think this is true for CBC mode. Here the first block of plaintext is XORed with the IV then encrypted with a blockciper like AES for example. The next plaintext block is than XORed with the previous cyphertext and so on. In the context of records in a database you aren't (at least not normally) going to treat multiple records as part of the same cipher stream, so that using various types of block chaining over say a whole table isn't going to happen. > In my application I do use a different IV for every encrypted plaintext: That is certainly a reasonable approach, though there may be cases where it is useful to trade some information leakage for the ability to use indexes while doing all decryption on the client.