Review: Patch FORCE_NULL option for copy COPY in CSV mode - Mailing list pgsql-hackers

From Payal Singh
Subject Review: Patch FORCE_NULL option for copy COPY in CSV mode
Date
Msg-id CANUg7LDW0WDrDpMhVHce9EDrOpR4O2_op6ySU3k0JoSdxrmHFA@mail.gmail.com
Whole thread Raw
Responses Re: Review: Patch FORCE_NULL option for copy COPY in CSV mode  (Ian Lawrence Barwick <barwick@gmail.com>)
List pgsql-hackers
The post was made before I subscribed to the mailing list, so posting my review separately. The link to the original patch mail is http://www.postgresql.org/message-id/CAB8KJ=jS-Um4TGwenS5wLUfJK6K4rNOm_V6GRUj+tcKekL2=GQ@mail.gmail.com


Hi,

This patch implements the following TODO item:
  Allow COPY in CSV mode to control whether a quoted zero-length  string is treated as NULL
      Currently this is always treated as a zero-length string,      which generates an error when loading into an integer column
        Re: [PATCHES] allow CSV quote in NULL        http://archives.postgresql.org/pgsql-hackers/2007-07/msg00905.php

  http://wiki.postgresql.org/wiki/Todo#COPY


I had a very definite use-case for this functionality recently while importing
CSV files generated by Oracle, and was somewhat frustrated by the existence
of a FORCE_NOT_NULL option for specific columns, but not one for
FORCE_NULL.

I'll add this to the November commitfest.


Regards

Ian Barwick

==================
Contents & Purpose
==================

This patch introduces a new 'FORCE_NULL' option which has the opposite functionality of the already present 'FORCE_NOT_NULL' option for the COPY command. Prior to this option there was no way to convert a zeroed-length quoted value to a NULL value when COPY FROM is used to import data from CSV formatted files.

==================
Submission Review
==================

The patch is in the correct context diff format. It includes changes to the documentation as well as additional regression tests. The description has been discussed and defined in the previous mails leading to this patch.

=====================
Functionality Review
=====================

CORRECTION NEEDED - Due to a minor error in code (details in 'Code Review' section below), force_null option is not limited to COPY FROM, and works even when COPY TO is used. This should instead give an error message.

The updated documentation describes the added functionality clearly.

All regression tests passed successfully. 

Code compilation after including patch was successful. No warnings either.

Manually tested COPY FROM with FORCE_NULL for a number of scenarios, all with expected outputs. No issues.

Been testing the patch for a few days, no crashes or weird behavior observed.

=============================================
Code Formatting Review (Needs Improvement)
=============================================

Looks good, the tabs between variable declaration and accompanying comment can be improved. 

=================================
Code Review (Needs Improvement)
=================================

1. There is a " NOTE: force_not_null option are not applied to the returned fields." before COPY FROM block. Similar note should be added for force_null option too.

2. One of the conditions to check and give an error if force_null is true and copy from is false is wrong, cstate->force_null should be checked instead of cstate->force_notnull:

        /* Check force_notnull */
        if (!cstate->csv_mode && cstate->force_notnull != NIL)
                ereport(ERROR,
                                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                                 errmsg("COPY force not null available only in CSV mode")));
        if (cstate->force_notnull != NIL && !is_from)
                ereport(ERROR,
                                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                          errmsg("COPY force not null only available using COPY FROM")));

        /* Check force_null */
        if (!cstate->csv_mode && cstate->force_null != NIL)
                ereport(ERROR,
                                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                                 errmsg("COPY force null available only in CSV mode")));

==>  if (cstate->force_notnull != NIL && !is_from) 
                ereport(ERROR,
                                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                          errmsg("COPY force null only available using COPY FROM")));
                          
===============================
Suggested Changes & Conclusion
===============================

The Above mentioned error condition should be corrected. Minor comments and tab changes are upto the author. 

In all, suggested modifications aside, the patch works well and in my opinion, would be a useful addition to the COPY command.

 
Payal Singh,
OmniTi Computer Consulting Inc.
Junior Database Architect

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?
Next
From: Kevin Grittner
Date:
Subject: Re: Record comparison compiler warning