
WIP Patch: Selective binary conversion of CSV file foreign tables - Mailing list pgsql-hackers

From	Etsuro Fujita
Subject	WIP Patch: Selective binary conversion of CSV file foreign tables
Date	May 8, 2012 08:22:41
Msg-id	001801cd2d0d$59dec990$0d9c5cb0$@lab.ntt.co.jp
List	pgsql-hackers


I would like to propose improving the parsing efficiency of
contrib/file_fdw with the selective parsing technique proposed by Alagiannis
et al. [1]: for a CSV/TEXT file foreign table, file_fdw performs binary
conversion only for the columns needed for query processing.  Attached is a
WIP patch implementing the feature.
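
To illustrate the idea, here is a minimal sketch in Python; it is not the C
code of the patch.  The names COLUMNS, convert_selected and scan are
hypothetical, the column layout is only modelled on the pgbench history
table, and Python callables stand in for PostgreSQL's type input functions.
Columns that the query does not need are left as None without being
converted.

```python
import csv
import io
from datetime import datetime

# Hypothetical column layout, modelled on the pgbench history table.
# Each entry pairs a column name with a stand-in "input function".
COLUMNS = [
    ("tid", int),
    ("bid", int),
    ("aid", int),
    ("delta", int),
    ("mtime", datetime.fromisoformat),
    ("filler", str),
]


def convert_selected(fields, needed):
    """Convert only the columns named in `needed`; leave the rest as None."""
    row = []
    for (name, convert), raw in zip(COLUMNS, fields):
        row.append(convert(raw) if name in needed else None)
    return row


def scan(csv_text, needed):
    """Split each CSV line into raw fields, then convert selectively."""
    for fields in csv.reader(io.StringIO(csv_text)):
        yield convert_selected(fields, needed)


data = (
    "1,1,42,-100,2012-05-08 08:22:41,x\n"
    "2,1,43,250,2012-05-08 08:22:42,y\n"
)

# A count(*)-style scan needs no column values, so nothing is converted.
print(sum(1 for _ in scan(data, needed=set())))

# A query that reads only delta converts that single column per row.
print([row[3] for row in scan(data, needed={"delta"})])
```

In this sketch the line is still split into fields for every row; only the
per-column conversion is skipped, which is the cost the proposal targets.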

I evaluated the patch by running SELECT count(*) on a CSV file foreign table
of 5,000,000 records with the same definition as the pgbench history table.
The run below was done on a single core of a 3.00GHz Intel Xeon CPU with 8GB
of RAM, with all configuration settings left at their defaults.

w/o the patch: 7255.898 ms
w/  the patch: 3363.297 ms

Having reflected on [2], I think it would be better to disable this feature
when the validation option is set to 'true'.  In that case file_fdw converts
all columns to binary representation, and so verifies that each tuple
conforms to all column data types as well as all kinds of constraints.
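
The rule proposed above could be sketched as follows, again in Python and
with hypothetical names (columns_to_convert, validate, ALL_COLUMNS); it is an
illustration of the intended behaviour, not code taken from the patch.

```python
def columns_to_convert(all_columns, needed, validate):
    """Return the set of columns to convert to binary form for one scan."""
    if validate:
        # Validation has to check every value, so selective conversion
        # is switched off and all columns are converted.
        return set(all_columns)
    # Otherwise convert only the columns the query actually needs.
    return set(needed) & set(all_columns)


# Hypothetical column list, modelled on the pgbench history table.
ALL_COLUMNS = ["tid", "bid", "aid", "delta", "mtime", "filler"]

print(sorted(columns_to_convert(ALL_COLUMNS, {"delta"}, validate=False)))
print(sorted(columns_to_convert(ALL_COLUMNS, {"delta"}, validate=True)))
```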

I would appreciate your comments.

Best regards,
Etsuro Fujita

[1] http://homepages.cwi.nl/~idreos/NoDBsigmod2012.pdf
[2] https://commitfest.postgresql.org/action/patch_view?id=822
