Thread: How to just "link" to some data feed
Hi,
~
I have some CSV files which I need to link to from within PG.
~
From http://darkavngr.blogspot.com/2007/06/importar-datos-externos-nuestra-base-de.html
~
Say you have the following data feed in /tmp/codigos_postales.csv:
~
"01000","San Angel","Colonia","Alvaro Obregon","Distrito Federal"
"01010","Los Alpes","Colonia","Alvaro Obregon","Distrito Federal"
"01020","Guadalupe Inn","Colonia","Alvaro Obregon","Distrito Federal"
"01028","Secretaria de Contraloria y Desarrollo Administrativo","Gran usuario","Alvaro Obregon","Distrito Federal"
"01029","Infonavit","Gran usuario","Alvaro Obregon","Distrito Federal"
"01030","Axotla","Pueblo","Alvaro Obregon","Distrito Federal"
"01030","Florida","Colonia","Alvaro Obregon","Distrito Federal"
"01040","Campestre","Colonia","Alvaro Obregon","Distrito Federal"
~
Then you can define the DB data model as:
~
CREATE TABLE codigos_postales(
  cp char(5),
  asentamiento varchar(120),
  tipo_asentamiento varchar(120),
  municipio varchar(120),
  estado varchar(120)
);
~
and copy all values into the DB like this:
~
COPY codigos_postales FROM '/tmp/codigos_postales.csv' DELIMITERS ',' CSV;
~
But how do you just link to the data feed, effectively keeping it in CSV file format rather than internal to PG?
~
Also, I need all changes to the data to (of course!) be propagated to the data back end.
~
Then you could link to the same data feed from some other engine that reads CSV (I think they all do, or should).
~
How do you link to a CSV file using PG?
~
Thank you
lbrtchx
On 3-Jun-08, at 8:04 PM, Albretch Mueller wrote:

> But how do you just link to the data feed, effectively keeping it in
> CSV file format rather than internal to PG?
> ~
> How do you link to a CSV file using PG?

You can't. You have to insert the data into a table, so you need some code that reads the CSV and inserts it into the table.

Dave
On Tue, Jun 3, 2008 at 8:24 PM, Dave Cramer <pg@fastcrypt.com> wrote:
. . .
> You can't.

Hmm! Doesn't PG have a way to do something like this, say, in MySQL:

load data local infile 'uniq.csv' into table tblUniq
  fields terminated by ','
  enclosed by '"'
  lines terminated by '\n'
  (uniqName, uniqCity, uniqComments)

and even in low-end (not real) DBs like MS Access?

Is there a technical reason for that, or should I file an RFE?

Thank you very much
lbrtchx
Albretch Mueller wrote:
> On Tue, Jun 3, 2008 at 8:24 PM, Dave Cramer <pg@fastcrypt.com> wrote:
> . . .
>> You can't.
>
> Hmm! Doesn't PG have a way to do something like this, say, in MySQL:
>
> load data local infile 'uniq.csv' into table tblUniq

That's essentially the same as the COPY you quoted in your original email, isn't it? So... what exactly is it you want to do that COPY doesn't do?

Anyway, it's not really JDBC related.

-O
On Tue, Jun 3, 2008 at 11:03 PM, Oliver Jowett <oliver@opencloud.com> wrote:
> That's essentially the same as the COPY you quoted in your original
> email, isn't it? So... what exactly is it you want to do that COPY
> doesn't do?
~
Well, actually, not exactly. Based on:
~
http://postgresql.com.cn/docs/8.3/static/sql-copy.html
~
COPY <table_name> [FROM|TO] <data_feed> <OPTIONS>
~
imports/exports the data into/out of PG, so you would essentially be duplicating the data and then have to keep the copies in sync. This is exactly what I am trying to avoid; I would like PG to handle the data right from the data feed.
~
> Anyway, it's not really JDBC related.
~
Well, no, but I was hoping to get an answer here because I mostly access PG through JDBC, and also because Java developers would generally be more inclined toward these kinds of DB-independent data format, reuse, and transfer issues.
~
lbrtchx
Albretch Mueller wrote:
> imports/exports the data into/out of PG, so you would essentially be
> duplicating the data and then have to keep the copies in sync. This is
> exactly what I am trying to avoid; I would like PG to handle the data
> right from the data feed.

As Dave said, PG won't magically keep the data up to date for you; you will need some external process to do the synchronization with the feed. That process could use COPY if it wanted.

Then you said:

> Hmm! Doesn't PG have a way to do something like this, say, in MySQL:
>
> load data local infile 'uniq.csv' into table tblUniq
>   fields terminated by ','
>   enclosed by '"'
>   lines terminated by '\n'
>   (uniqName, uniqCity, uniqComments)
>
> and even in low-end (not real) DBs like MS Access?

But isn't this doing exactly what PG's COPY does: loading data, once, from a local file, with no ongoing synchronization?

> Is there a technical reason for that, or should I file an RFE?

Personally, I don't see this sort of synchronization as something you want the core DB to be doing anyway. The rules for how you get the data, how often you check for updates, how you merge the updates, and so on are very application-specific.

-O
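P.S. For what it's worth, here is a minimal sketch of what such an external reload could look like, as a plain SQL script run via psql from cron (or any scheduler) whenever the feed changes. The truncate-and-reload strategy is just one illustrative option, and assumes the whole feed fits comfortably in one transaction:

BEGIN;
-- Throw away the old copy and reload the current file contents.
TRUNCATE codigos_postales;
COPY codigos_postales FROM '/tmp/codigos_postales.csv' DELIMITER ',' CSV;
COMMIT;

Concurrent readers block on the TRUNCATE's exclusive lock until the reload commits, so they see either the old contents or the fully reloaded table.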
Oliver Jowett <oliver@opencloud.com> writes:
> That's essentially the same as the COPY you quoted in your original
> email, isn't it? So... what exactly is it you want to do that COPY
> doesn't do?

The OP seems to be laboring under the delusion that we might consider storing "live" data in a flat text file. I'd be interested to know how such a representation could handle MVCC updates, or indexing... even presuming that it was a good idea for any other reason, which he has not provided.

> Anyway, it's not really JDBC related.

Definitely not.

			regards, tom lane
Albretch Mueller wrote:
> imports/exports the data into/out of PG, so you would essentially be
> duplicating the data and then have to keep the copies in sync. This is
> exactly what I am trying to avoid; I would like PG to handle the data
> right from the data feed.

You could write a set-returning function that reads the file, and perhaps a view on top of that. As the file is read on every invocation, that's only practical for small tables, though.

It's likely a better idea to just COPY the data into the database periodically, or perhaps write an external script that watches the modification timestamp on the file and triggers a reload whenever it changes.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
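P.S. In case it helps, a minimal sketch of the set-returning-function idea, against the table definition from the original post. The function, scratch-table, and view names are made up for illustration; RETURN QUERY needs 8.3, and a server-side COPY from a file needs superuser rights:

CREATE OR REPLACE FUNCTION read_codigos_postales()
RETURNS SETOF codigos_postales AS $$
BEGIN
    -- Re-read the file on every call: load it into a session-local
    -- scratch table, then stream the rows back out.
    DROP TABLE IF EXISTS _cp_scratch;
    CREATE TEMP TABLE _cp_scratch (LIKE codigos_postales);
    COPY _cp_scratch FROM '/tmp/codigos_postales.csv' DELIMITER ',' CSV;
    RETURN QUERY SELECT * FROM _cp_scratch;
END;
$$ LANGUAGE plpgsql;

-- A view on top, so clients can query it like an ordinary table:
CREATE VIEW codigos_postales_live AS
    SELECT * FROM read_codigos_postales();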
Hello.

I'd recommend you use an intermediate (temporary?) table to load the data into, and then sync the data over to the main table with a simple INSERT statement. The call list would look like:

BEGIN;
COPY temptbl FROM '/tmp/codigos_postales.csv' DELIMITERS ',' CSV;
INSERT INTO maintbl SELECT * FROM temptbl WHERE cp NOT IN (SELECT cp FROM maintbl);
DELETE FROM temptbl;
COMMIT;

P.S. The last step (the DELETE) is not needed if you use a temporary table.