Re: Using the database to validate data - Mailing list pgsql-general

From Tim Clarke
Subject Re: Using the database to validate data
Date
Msg-id 55B1556D.7010807@manifest.co.uk
In response to Re: Using the database to validate data  (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses Re: Using the database to validate data  (林士博 <lin@repica.co.jp>)
Re: Using the database to validate data  (JPLapham <lapham@jandr.org>)
List pgsql-general
It shouldn't be too difficult to import those new rows into one table,
then write a procedure that inserts them into the real table one by one
and logs any validation failure, committing good rows and rolling back
bad ones. In fact, you could then write the failures to a third table
with completely relaxed (or no) validation.
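
A rough sketch of that idea, untested against a real PostgreSQL server:
it uses Python's built-in sqlite3 module purely as a stand-in so it can
run anywhere. The table names (staging, items, rejects), the columns and
the function name load_staged_rows are all made up for illustration. In
PostgreSQL the same loop could presumably live in a PL/pgSQL procedure
with a BEGIN ... EXCEPTION block around each insert.

```python
import sqlite3


def load_staged_rows(conn):
    """Move rows from staging into items one at a time.

    A good row is committed. A row that violates a constraint is rolled
    back and copied into rejects together with the error message.
    Returns (loaded_count, rejected_count).
    """
    staged = conn.execute(
        "SELECT id, name, qty FROM staging ORDER BY id"
    ).fetchall()
    loaded = rejected = 0
    for row_id, name, qty in staged:
        try:
            conn.execute(
                "INSERT INTO items (id, name, qty) VALUES (?, ?, ?)",
                (row_id, name, qty),
            )
            conn.commit()  # keep the good row
            loaded += 1
        except sqlite3.IntegrityError as exc:
            conn.rollback()  # discard the failed insert
            conn.execute(
                "INSERT INTO rejects (id, name, qty, reason) VALUES (?, ?, ?, ?)",
                (row_id, name, qty, str(exc)),
            )
            conn.commit()  # the rejects table has no constraints
            rejected += 1
    return loaded, rejected


# Hypothetical schema: staging and rejects are unconstrained,
# items carries the real validation rules.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE staging (id, name, qty);
    CREATE TABLE items (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        qty  INTEGER NOT NULL CHECK (qty >= 0)
    );
    CREATE TABLE rejects (id, name, qty, reason TEXT);
    INSERT INTO staging VALUES
        (1, 'bolt', 10), (2, NULL, 5), (3, 'nut', -1), (4, 'washer', 7);
    """
)

result = load_staged_rows(conn)
print("loaded, rejected:", result)
for reject in conn.execute("SELECT id, reason FROM rejects ORDER BY id"):
    print("rejected row:", reject)
```

Note that this commits the good rows as it goes, so it does not give the
all-or-nothing behaviour Jon asked for. For that, the rejects table could
be checked first and the real insert only run when it is empty.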

Tim Clarke

On 23/07/15 21:48, Adrian Klaver wrote:
> On 07/23/2015 12:04 PM, Jon Lapham wrote:
>> On 07/23/2015 03:02 PM, Adrian Klaver wrote:
>>> http://pgloader.io/
>>
>> Ok, thanks, I'll look into pgloader's data validation abilities.
>>
>> However, my naive understanding of pgloader is that it is used to
>> quickly load data into a database, which is not what I am looking to do.
>> I want to validate data integrity *before* putting it into the database.
>> If there is a problem with any part of the data, I don't want any of it
>> in the database.
>
> I misunderstood; I thought you just wanted information on the rows
> that did not get in. pgloader does this by including the rejected data
> in *.dat and the Postgres log of why it was rejected in *.log.
>
> <Thinking out loud, not tested>
>
> I could still see making use of this by using the --before
> <file_name>, where file_name contains a CREATE TEMPORARY TABLE
> some_table script that mimics the permanent table. Then it would load
> against the temporary table, write out any errors and then drop the
> table at the end. This would not put data into the permanent table on
> complete success though. That would require some magic in AFTER LOAD
> EXECUTE that I have not come up with yet:)
>
> <Thinking out loud, not tested>
>>
>> -Jon
>>
>
>


