Thread: Supporting Encryption in Postgresql
For our research project, I need to implement encryption support for PostgreSQL. In the current phase, I need to at least support page-level encryption. In other words, each page that belongs to a certain sensitive table will be stored encrypted on the hard disk. Since we are trying to have a new design that can start the decryption even before we see the data, I need some kind of thread support.

My questions are: in order to support page-level encryption (i.e., encrypt each page before writing it back to disk and decrypt each page after reading it from disk), which parts of the code should be changed? Or, more simply, is /src/backend/storage/smgr/md.c the only code that does the disk access?

Since our design requires thread support (we will do some of the decryption before we see the data, so we need to continue decrypting during the disk access), can you suggest a good thread library that you think will work well with PostgreSQL?

Thanks for your help.

Murat Kantarcioglu, PhD Candidate, Computer Science Department, Purdue University
On Fri, 2004-09-10 at 00:03, Murat Kantarcioglu wrote:
> My questions are in order to support page level encryption (i.e. encrypt
> each page before writing back to disk and decrypt each page after we
> read from disk.) which parts of the code should be changed?
> Or more simply, is /src/backend/storage/smgr/md.c the only code
> that does the disk access?
>
> Since our design requires thread support (we will do some of the
> decryption, before we see the data, therefore during disk access, we
> need to continue decryption)

Why do you think that you need threads to do the (en/de)cryption?

Why is it not sufficient to just replace the page read/write functions with ones supporting encryption?

Or just use an encrypted filesystem ;) <evil grin>

-------------
Hannu
Murat,

> For our research project, I need to implement encryption support for
> PostgreSQL. At this current phase, I need to at least support page
> level encryption. In other words, each page that belongs to a certain
> sensitive table will be stored encrypted on the hard disk.

Are you planning on doing the decryption on the back-end, or on the client? It certainly seems to me that doing it on the client would make more sense; if the data is decrypted on the back-end, then you will still need the overhead of an SSL connection.

In any case, I'm glad that you're looking into this; encryption-on-disk is one of those "missing features" that we might never have gotten around to as a project ...

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco
Given that the client does not write pages to the disk, this would be back-end encryption. Just out of curiosity, what threat model does this sort of encryption protect against? Surely any attacker who can read the files off the disk can also get the password used to encrypt them. Or would this be provided by the client and kept in RAM only?

Paul Tillotson
Paul Tillotson <pntil@shentel.net> writes:
> Given that the client does not write pages to the disk, this would be
> back-end encryption. Just out of curiosity, what threat model does
> this sort of encryption protect against? Surely any attacker who can
> read the files off the disk can also get the password used to encrypt
> them. Or would this be provided by the client and kept in RAM only?

If I have root- or postgres-level access to the machine, I can snarf the encryption key out of RAM even if it's never written to disk.

I don't see what this (backend page-level encryption) would buy you over just using an encrypted partition, which is already available on most OSs...

-Doug
--
Let us cross over the river, and rest under the shade of the trees.
   --T. J. Jackson, 1863
Murat Kantarcioglu wrote:
> For our research project, I need to implement encryption support for
> PostgreSQL. At this current phase, I need to at least support page
> level encryption. In other words, each page that belongs to a certain
> sensitive table will be stored encrypted on the hard disk.
> Since we are trying to have a new design that can start the decryption
> before even we see the data, I need to have some kind of thread support.

I have to say that this is becoming an important problem in the European market. In Italy, for example, the law requires that if you store personal data about your customers, this information must be stored in encrypted form.

The best way I have found to accomplish this is to use an encrypted file system. Google for "cryptoloop" or, if you are brave enough, look for StegFS.

Regards
Gaetano Mendola
Thanks for the comments.

This piece will be part of a bigger design, and the problems mentioned are very real. In the future, our goal is to design a database system where the processing is done in a "secure coprocessor" (i.e., no one will be able to see what is inside) and the small code inside the coprocessor is verified using formal methods. Therefore, the problems you have mentioned will not be an issue in our general case. We are even considering what could be revealed just by watching the disk access. An initial technical report can be found at (http://www.cs.purdue.edu/homes/kanmurat/technical.ps).

Can you suggest a way to do this in the PostgreSQL backend?

I am assuming that somewhere in the code you are calling a function like getPage(Page_id) to retrieve the page (I am trying to change the backend). All I need to do (I am not sure yet) is replace such code with the following (of course, I also need to change the writePage part):

    getPage(Page_id)
    {
        ctr = Hash_Table(Page_id)   // return some value needed for decryption
        Thread_Read(Page_id)        // will call the original read code
        Thread_Encryption.start(ctr, length);
        when both threads are done, finish the encryption
    }
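The getPage() outline above appears to assume a counter-style scheme, in which the keystream depends only on the key and the page id, so it can be generated while the disk read is still in flight. A minimal standalone sketch of that overlap with POSIX threads follows. Everything in it is an assumption made for illustration: the names ks_job, ks_worker() and read_page_overlapped() are hypothetical, toy_mix() is a placeholder that is not a secure cipher, and the code reads an ordinary file with pread() rather than going through PostgreSQL. A PostgreSQL backend runs as a single-threaded process per connection, so whether this pattern can be used inside the backend is a separate question.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define PAGE_SIZE 8192 /* PostgreSQL's default block size */

/* Work item handed to the keystream thread (hypothetical layout). */
struct ks_job {
    uint64_t key;
    uint32_t page_no;
    unsigned char stream[PAGE_SIZE];
};

/* Placeholder mixing function: NOT cryptographically secure. */
static uint64_t toy_mix(uint64_t x)
{
    x ^= x >> 33;
    x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33;
    x *= 0xc4ceb9fe1a85ec53ULL;
    x ^= x >> 33;
    return x;
}

/* Thread body: fill job->stream using only the key and the page number,
 * so it does not need the page contents at all. */
static void *ks_worker(void *arg)
{
    struct ks_job *job = arg;

    for (size_t i = 0; i < PAGE_SIZE; i += 8) {
        uint64_t ks = toy_mix(job->key ^ (((uint64_t)job->page_no << 32) | i));
        for (size_t j = 0; j < 8; j++)
            job->stream[i + j] = (unsigned char)(ks >> (8 * j));
    }
    return NULL;
}

/* Start keystream generation, read the page in the calling thread, then
 * join and XOR. Returns 0 on success, -1 on failure. */
int read_page_overlapped(int fd, uint32_t page_no, uint64_t key, unsigned char *out)
{
    struct ks_job job = { .key = key, .page_no = page_no };
    pthread_t tid;

    assert(out != NULL);
    if (pthread_create(&tid, NULL, ks_worker, &job) != 0)
        return -1;

    /* The disk read proceeds while the worker computes the keystream. */
    ssize_t n = pread(fd, out, PAGE_SIZE, (off_t)page_no * PAGE_SIZE);

    pthread_join(tid, NULL); /* always join before job goes out of scope */
    if (n != PAGE_SIZE) {
        fprintf(stderr, "short read on page %u\n", (unsigned)page_no);
        return -1;
    }
    for (size_t i = 0; i < PAGE_SIZE; i++)
        out[i] ^= job.stream[i];
    return 0;
}
```

Only the final XOR has to wait for the data; with a real cipher in place of toy_mix(), that is the step the design would try to keep cheap.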
On Fri, Sep 10, 2004 at 11:52:26AM -0500, Murat Kantarcioglu wrote:
> Can you suggest me a solution to how to do
> this on Postgresql backend?
>
> I am assuming that somewhere in the code, you are calling a function like
> getPage(Page_id)
> to retrieve the page (I am trying to change backend)

Probably the code you want to modify is in src/backend/storage/smgr. Maybe you want to add a different storage manager (they are pluggable, sort of).

> getPage(Page_id)
> {
> ctr=Hash_Table(Page_id) //return some value needed for decryption
> Thread_Read(Page_id) // will call the original read code
> Thread_Encryption.start(ctr, length);
> when both threads are done finish the encryption
> }

I think it would need extensive, painful and unwelcome modifications to use threads to do the job. You could just as well do it sequentially, like in

    encryptedPage = getPage(page_id);
    clearPage = unencrypt(encryptedPage);
    return clearPage;

And the reverse for storage. This may only need modifications to mdread() and mdwrite() ... unless your encryption scheme returns a different length than the original.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"I agree with you that absolute truth does not exist... The problem is that lies do exist, and you are lying" (G. Lama)
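The sequential approach suggested in this reply can be sketched outside the backend as follows. This is only an illustration under stated assumptions: it uses plain POSIX pread() and pwrite() on a file descriptor instead of PostgreSQL's mdread() and mdwrite(), the names read_page_decrypted(), write_page_encrypted() and toy_xor_page() are hypothetical, and the XOR keystream stands in for a real length-preserving cipher and is not secure.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define PAGE_SIZE 8192 /* PostgreSQL's default block size */

/* Hypothetical stand-in for a real cipher: a 64-bit mixing function.
 * It is NOT cryptographically secure and only illustrates the data flow. */
static uint64_t toy_mix(uint64_t x)
{
    x ^= x >> 33;
    x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33;
    x *= 0xc4ceb9fe1a85ec53ULL;
    x ^= x >> 33;
    return x;
}

/* XOR the page in place with a keystream derived from key, page number and
 * offset. Applying it twice restores the input; the length never changes. */
static void toy_xor_page(unsigned char *page, uint64_t key, uint32_t page_no)
{
    for (size_t i = 0; i < PAGE_SIZE; i += 8) {
        uint64_t ks = toy_mix(key ^ (((uint64_t)page_no << 32) | i));
        for (size_t j = 0; j < 8; j++)
            page[i + j] ^= (unsigned char)(ks >> (8 * j));
    }
}

/* Read one page, then decrypt it. Returns 0 on success, -1 on failure. */
int read_page_decrypted(int fd, uint32_t page_no, uint64_t key, unsigned char *out)
{
    assert(out != NULL);
    ssize_t n = pread(fd, out, PAGE_SIZE, (off_t)page_no * PAGE_SIZE);
    if (n != PAGE_SIZE) {
        fprintf(stderr, "short read on page %u\n", (unsigned)page_no);
        return -1;
    }
    toy_xor_page(out, key, page_no);
    return 0;
}

/* The reverse for storage: encrypt a private copy, then write it out. */
int write_page_encrypted(int fd, uint32_t page_no, uint64_t key, const unsigned char *in)
{
    unsigned char buf[PAGE_SIZE];

    assert(in != NULL);
    memcpy(buf, in, PAGE_SIZE);
    toy_xor_page(buf, key, page_no);
    ssize_t n = pwrite(fd, buf, PAGE_SIZE, (off_t)page_no * PAGE_SIZE);
    if (n != PAGE_SIZE) {
        fprintf(stderr, "short write on page %u\n", (unsigned)page_no);
        return -1;
    }
    return 0;
}
```

Because the stand-in cipher never changes the page length, page offsets on disk stay where they were, which is the condition the reply gives for keeping the change confined to the read and write paths.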
Murat Kantarcioglu <kanmurat@cs.purdue.edu> writes:
> This piece will be a part of a bigger design and the problems
> mentioned are very real. In the future, our goal is to design a database
> system where the processing is done in a "secure coprocessor"(i.e no one
> will be able to see what is inside) and
> the small code inside the co-processor is verified using formal methods.

[ raised eyebrow... ] You think a SQL database is small code you can verify using formal methods? I don't really see how you can expect that the decrypted data can be held entirely within a small secured area and still get any useful work done.

regards, tom lane
Our basic claim is that we can do most of the encryption work while we are reading the page. That is the reason I need threads. Any suggestions about threads are welcome.

Thanks.

Murat

Alvaro Herrera wrote:
> I think it would need extensive, painful and unwelcome modifications to
> use threads to do the job. You could just as well do it sequentially,
> like in
>
> encryptedPage = getPage(page_id);
> clearPage = unencrypt(encryptedPage);
> return clearPage;
>
> And the reverse for storage. This may only need modifications to
> mdread() and mdwrite() ... unless your encryption scheme returns a
> different length than the original.
It is really hard to describe the whole project in a few e-mails. Obviously, we will not try to run the entire database software in that secure hardware. Also, memory limitations are not important. For example, please check the research on "oblivious RAM" to see how even a small memory on such hardware can be leveraged to execute programs with huge memory requirements. Also, please check the "Practical Private Information Retrieval" work to see how such hardware is used for solving the PIR problem.

Anyway, I totally understand your reservations, but we are trying to come up with a solution that answers your concerns and much more.

Thanks for the interest.

Murat

Tom Lane wrote:
> [ raised eyebrow... ] You think a SQL database is small code you can
> verify using formal methods? I don't really see how you can expect that
> the decrypted data can be held entirely within a small secured area and
> still get any useful work done.
>
> regards, tom lane
Centuries ago, Nostradamus foresaw when Murat Kantarcioglu <kanmurat@cs.purdue.edu> would write:
> For our research project, I need to implement encryption support
> for PostgreSQL. At this current phase, I need to at least support
> page level encryption. In other words, each page that belongs to a
> certain sensitive table will be stored encrypted on the hard disk.
> Since we are trying to have a new design that can start the decryption
> before even we see the data, I need to have some kind of thread
> support.
>
> My questions are in order to support page level encryption (i.e.
> encrypt each page before writing back to disk and decrypt each page
> after we read from disk.) which parts of the code should be changed?
> Or more simply, is /src/backend/storage/smgr/md.c the only code
> that does the disk access?
>
> Since our design requires thread support (we will do some of the
> decryption, before we see the data, therefore during disk access, we
> need to continue decryption) Can you suggest me a good thread lib
> you think will work fine with postgresql ?
>
> Thanks for your help.

You'd better step back to your threat model, and figure out what encryption will actually get you.

I don't see any reason to think that you can actually gain _anything_ from page level encryption. If you think you do, then you ought to either:

 a) Show how you gain it using something like Linux's capability to use encrypted loopback filesystems, which would not require touching PostgreSQL at all, or

 b) Demonstrate what are the attacks that page level encryption would protect against, and how.

The problem with any such mechanisms is essentially the same, namely that the encryption key has got to sit in memory in either the database server process or in the kernel's memory. As such, the key is vulnerable to anyone with root access that can access /proc or its equivalent and get at process memory.

The only way for the encryption key NOT to be vulnerable in this fashion is if the encryption key is communicated neither to the database server nor to the OS kernel.

I'd suggest you avail yourself of the book _Translucent Databases_ by Peter Wayner; it involves a model where the database engine is not entrusted with cryptography at all. Instead, cryptography is all done within the client.

--
(reverse (concatenate 'string "gro.gultn" "@" "enworbbc"))
http://www.ntlug.org/~cbbrowne/unix.html
In case you weren't aware, "ad homineum" is not latin for "the user of this technique is a fine debater." -- Thomas F. Burdick