Re: CSV mode option for pg_dump - Mailing list pgsql-hackers
From | PFC |
---|---|
Subject | Re: CSV mode option for pg_dump |
Date | |
Msg-id | op.ta3crgb9cigqcu@apollo13 Whole thread Raw |
In response to | Re: CSV mode option for pg_dump (Volkan YAZICI <yazicivo@ttnet.net.tr>) |
Responses |
Re: CSV mode option for pg_dump
|
List | pgsql-hackers |
From what I gather, the CSV format dump would only contain data.I think pg_dump is the friend of pg_restore. It dumps everything including user defined functions, types, schemas etc. CSV does not fit with this. Besides, people will probably want to dump into CSV the result of any query, to load it into excel, not just the full contents of a table. So, why not create a separate tool, someone suggested pg_query for that, I second it.This tool would take a query and format options, and would output a file in whatever format chosen by the user (CSV, COPY format, xml, whatever) A script language (python) can be used, which will significantly shorten development times and allow easy modularity, as it is easier to add a module to a python program than a C program.I would vote for Python because I love it and it has a very good postgres adapter (psycopg2) which knows how to convers every postgres type to a native language type (yes, even multidimensional arrays of BOX get converted). And it's really fast at retrieving large volumes of data. So you have a stable, fast tool for backup and restore (pg_dump) and a rapidly evolving, user-friendly and extendable tool for exporting data, and everyone is happy. Mr Momijan talks about adding modular functionality to pg_dump. Is it really necessary ? What is the objective ? Is it to reuse code in pg_dump ? I guess not ; if a user wants to dump, for instance, all the tables in a schema, implementing this logic in python is only a few lines of code (select from information_schema...) To be realistic, output format modules should be written in script languages. Noone sane is eager to do string manipulation in C. Thus these modules would have to somehow fit with pg_dump, maybe with a pipe or something. This means designing another protocol. Reimplementing in a scripting langage the parts of pg_dump which will be reused by this project (mainly, enumerating tables and stuff) will be far easier. Just look. Python 2.4.2 (#1, Mar 30 2006, 14:34:35) [GCC 3.4.4 (Gentoo 3.4.4-r1, ssp-3.4.4-1.0, pie-8.7.8)] on linux2 Type "help", "copyright", "credits" or "license" for more information. ...opens a db connection... >>> c.execute( "SELECT * FROM test.csv" ) >>> data = c.fetchall() >>> data [[1, datetime.date(2006, 6, 13), 'this\tcontains\ttabulations'], [2, datetime.date(2006, 6, 13), "this'contains'quotes"], [3, datetime.date(2006, 6, 13), 'this"contains"double quotes']] >>> import csv, sys >>> c = csv.writer( sys.stdout, dialect = csv.excel ) >>> c.writerows( data ) 1,2006-06-13,this contains tabulations 2,2006-06-13,this'contains'quotes 3,2006-06-13,"this""contains""double quotes"
pgsql-hackers by date: