Thread: csv_populate_recordset and csv_agg
Hello hackers,
The `json_populate_recordset` and `json_agg` functions allow systems to process/generate JSON directly in the database. This "cuts out the middle tier"[1] and notably reduces the complexity of web applications.
CSV processing is also a common use case, and PostgreSQL has the COPY .. FROM .. CSV form, but COPY is not compatible with libpq pipeline mode and its interface is clunkier to use.
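For context, the existing route looks roughly like this (table and column names here are placeholders, not from any real schema):

```sql
-- Ingesting CSV today requires the COPY machinery, which cannot be
-- issued inside a libpq pipeline:
COPY widgets (id, name) FROM STDIN WITH (FORMAT csv, HEADER true);
```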
I propose to include two new functions:
- csv_populate_recordset ( base anyelement, from_csv text )
- csv_agg ( anyelement )
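For illustration, usage could mirror the existing `json_populate_recordset`/`json_agg` pair. This is a hypothetical sketch only; the exact signatures, the column-definition-list syntax, and the (option-free) dialect behavior are all assumptions:

```sql
-- Hypothetical: parse CSV text into a set of records.
SELECT *
FROM csv_populate_recordset(null::record, $$id,name
1,foo
2,bar$$) AS t(id int, name text);

-- Hypothetical: aggregate rows back into CSV text.
SELECT csv_agg(t)
FROM (VALUES (1, 'foo'), (2, 'bar')) AS t(id, name);
```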
I would gladly implement these if it sounds like a good idea.
I see there's already some code that deals with CSV in:
- src/backend/commands/copyfromparse.c (CopyReadAttributesCSV)
- src/fe_utils/print.c (csv_print_field)
- src/backend/utils/error/csvlog.c (write_csvlog)
So perhaps a new csv module could benefit the codebase as well.
Best regards,
Steve
Steve Chavez <steve@supabase.io> writes:
> CSV processing is also a common use case and PostgreSQL has the COPY ..
> FROM .. CSV form but COPY is not compatible with libpq pipeline mode and
> the interface is clunkier to use.
> I propose to include two new functions:
> - csv_populate_recordset ( base anyelement, from_csv text )
> - csv_agg ( anyelement )

The trouble with CSV is there are so many mildly-incompatible versions of it. I'm okay with supporting it in COPY, where we have the freedom to add random sub-options (QUOTE, ESCAPE, FORCE_QUOTE, yadda yadda) to cope with those variants. I don't see a nice way to handle that issue in the functions you propose --- you'd have to assume that there is One True CSV, which sadly ain't so, or else complicate the functions beyond usability.

Also, in the end CSV is a surface presentation layer, and as such it's not terribly well suited as the calculation representation for aggregates and other functions. I think these proposed functions would have pretty terrible performance as a consequence of the need to constantly re-parse the surface format. The same point could be made about JSON ... which is why we prefer to implement processing functions with JSONB.

			regards, tom lane
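The dialect-drift point is easy to demonstrate outside PostgreSQL, e.g. with Python's csv module (an illustration only; the delimiter and quoting choices below are assumptions picked to show the divergence, not anything from this thread):

```python
import csv
import io

# One logical record serialized under two common "CSV" conventions.
row = ["widget", 'say "hi"', "1,5"]

# RFC 4180 style: comma delimiter, minimal double-quoting.
rfc4180 = io.StringIO()
csv.writer(rfc4180, delimiter=",", quotechar='"',
           quoting=csv.QUOTE_MINIMAL).writerow(row)

# Semicolon convention common in locales where ',' is the decimal separator.
euro = io.StringIO()
csv.writer(euro, delimiter=";", quotechar='"',
           quoting=csv.QUOTE_MINIMAL).writerow(row)

print(rfc4180.getvalue())  # widget,"say ""hi""","1,5"
print(euro.getvalue())     # widget;"say ""hi""";1,5

# Reading one dialect with the other's parser silently misparses the record:
fields = next(csv.reader(io.StringIO(euro.getvalue()), delimiter=","))
print(fields)  # 2 mangled fields instead of the original 3
```

A parameter-free `csv_populate_recordset` would have to hard-wire exactly one of these conventions, which is Tom's objection: COPY can absorb such variants via sub-options, but a plain function signature cannot.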