Re: Using the database to validate data

From: Adrian Klaver
Subject: Re: Using the database to validate data
Msg-id: 55B15324.4000205@aklaver.com
In response to: Using the database to validate data (JPLapham <lapham@jandr.org>)
Responses: Re: Using the database to validate data (Tim Clarke <tim.clarke@manifest.co.uk>)
List: pgsql-general

On 07/23/2015 12:04 PM, Jon Lapham wrote:
> On 07/23/2015 03:02 PM, Adrian Klaver wrote:
>> http://pgloader.io/
>
> Ok, thanks, I'll look into pgloader's data validation abilities.
>
> However, my naive understanding of pgloader is that it is used to
> quickly load data into a database, which is not what I am looking to do.
> I want to validate data integrity *before* putting it into the database.
> If there is a problem with any part of the data, I don't want any of it
> in the database.

I misunderstood; I thought you just wanted information on the rows that
did not get in. pgloader does this by writing the rejected data to *.dat
files and the PostgreSQL errors explaining why each row was rejected to
*.log files.
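
For example, a run might look roughly like this (names and paths are
illustrative; pgloader collects the reject files under its root
directory, /tmp/pgloader by default, grouped by database and table):

  pgloader data.csv postgresql:///mydb?tablename=people
  ls /tmp/pgloader/mydb/
  people.dat    <- the rejected rows themselves
  people.log    <- the PostgreSQL error for each rejected row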

<Thinking out loud, not tested>

I could still see making use of this via the --before <file_name>
option, where file_name contains a CREATE TEMPORARY TABLE some_table
script that mimics the permanent table. pgloader would then load
against the temporary table, write out any errors, and the table would
be dropped at the end of the session (rough sketch below). This would
not put the data into the permanent table on complete success, though.
That would require some magic in AFTER LOAD EXECUTE that I have not
come up with yet. :)

</Thinking out loud, not tested>
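
A minimal sketch of that idea, equally untested, with all names
(before.sql, data.csv, mydb, people) made up for illustration. The
before-script shadows the permanent table with a temporary table of the
same name; since the temp schema is searched first, the load then
exercises the same column types and constraints without touching the
permanent data:

  -- before.sql: shadow the permanent table with a temporary copy.
  -- LIKE ... INCLUDING ALL copies column definitions, NOT NULL and
  -- CHECK constraints, defaults and indexes, but not foreign keys.
  CREATE TEMPORARY TABLE people (LIKE public.people INCLUDING ALL);

and then something like:

  pgloader --before before.sql data.csv postgresql:///mydb?tablename=people

One caveat: this only works if pgloader runs the --before script on the
same connection it loads with. If it uses separate worker connections,
the temporary table would not be visible to them, and a plain staging
table dropped afterwards would be the fallback.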
>
> -Jon
>


--
Adrian Klaver
adrian.klaver@aklaver.com

