Thread: making postgresql "source" its data feed from external csv file ...
or keeping data processed in a DBMS in synch with local csv file ~ I kind of like to use a fast in memory DB for coding cycles (either HSQLDB or H2) and then when I have the need to stress test I use postgresql, so I keep all my data as csv files which I then "source" to tables, after running the DDL statements to create them, simply by going: ~ SET TABLE "<table_name>" SOURCE "<file_name.csv>;ignore_first=false;" ~ As I understand postgresql "COPY FROM" does in effect copy the data to the DBMS, but what I need is for the DBMS to "source" its data feed with the outer file and synchronize it as it changes without having to run "COPY TO". Is this possible? This way I can keep the data in a way that I can try on any DBMS ~ Is there an equivalent statement in postgresql or some other way to achieve the same result? ~ Thanks lbrtchx
On 01/04/2011 09:55 AM, Albretch Mueller wrote: > As I understand postgresql "COPY FROM" does in effect copy the data > to the DBMS, but what I need is for the DBMS to "source" its data feed > with the outer file and synchronize it as it changes without having to > run "COPY TO". Is this possible? This way I can keep the data in a way > that I can try on any DBMS No, that's not currently possible. What you describe is covered by SQL/MED. There's ongoing work to implement SQL/MED for 9.1 / 9.2, but there is no support for it in any current release version of PostgreSQL. SQL/MED is intended for a lot more than just accessing flat files, but should be capable of that. There's a reason relational databases don't store their backing data as CSV files - because they're *SLOW* to work with, in particular to update. You can expect horribly slow write performance with any scheme like this even if you do get it working. -- Craig Ringer
If your tables are read-only (which it kind of sounds like), you could abstract out the reload mechanism, use the current implementation for the in-memory databases, and use a combination of watching the CSV files (e.g., through inotify on Linux) and TRUNCATE / COPY in PostgreSQL. I think how well this will work will really depend on what you're doing with the system, but it could be worth a shot. --- Maciek Sakrejda | System Architect | Truviso 1065 E. Hillsdale Blvd., Suite 215 Foster City, CA 94404 (650) 242-3500 Main www.truviso.com
On Tue, Jan 4, 2011 at 12:14 AM, Craig Ringer <craig@postnewspapers.com.au> wrote: > On 01/04/2011 09:55 AM, Albretch Mueller wrote: > >> As I understand postgresql "COPY FROM" does in effect copy the data >> to the DBMS, but what I need is for the DBMS to "source" its data feed >> with the outer file and synchronize it as it changes without having to >> run "COPY TO". Is this possible? This way I can keep the data in a way >> that I can try on any DBMS > > No, that's not currently possible. What you describe is covered by SQL/MED. > There's ongoing work to implement SQL/MED for 9.1 / 9.2, but there is no > support for it in any current release version of PostgreSQL. SQL/MED is > intended for a lot more than just accessing flat files, but should be > capable of that. > > There's a reason relational databases don't store their backing data as CSV > files - because they're *SLOW* to work with, in particular to update. You > can expect horribly slow write performance with any scheme like this even if > you do get it working. ~ Well, yes, writing to an inmem db while coding should be fast enough I guess ;-) ~ I thought of something else a few moments ago. Is there a way to register to postgresql events like you do with java.awt., javax.swing. and j2ee management events and listen to them? ~ If you could somehow do that with jdbc, you could register to "COPY FROM" events to certain files and regularly synchronize the data. Certain schedule based on a measurable degree of dirtiness, time stamps (back up incremental deltas every Tuesdays night ...) and data semantics (if the increment or the total amount of monies reach some value …) would help ~ I think I will have to do with "COPY [FROM|TO]" for now until SQL/MED ~ http://wiki.postgresql.org/wiki/SQL/MED ~ http://wiki.postgresql.org/images/4/4c/SQLMED-FOSDEM2009.pdf ~ becomes a viable technology ~ Thanks lbrtchx