
Re: Duplicates Processing - Mailing list pgsql-sql

From	Tim Landscheidt
Subject	Re: Duplicates Processing
Date	October 8, 2010 15:42:44
Msg-id	m3iq1cr0fi.fsf@passepartout.tim-landscheidt.de
In response to	Duplicates Processing (Gary Chambers <gwchamb@gmail.com>)
Responses	Re: Duplicates Processing
List	pgsql-sql


Gary Chambers <gwchamb@gmail.com> wrote:

> I've been provided a CSV file of parts that contains duplicates of
> properties (e.g. resistors have a wattage, tolerance, and temperature
> coefficient property) of those parts that differ by a manufacturer
> part number.  What I'd like to do is to process this file and, upon
> encountering one of the duplicates, take that part with its new part
> number and move it to a part substitutes table.  It seems like it
> should be pretty simple, but I can't seem to generate a query or a
> function to accomplish it.  I'd greatly appreciate any insight or
> assistance with solving this problem.  Thank you very much in advance.

You can, for example, number the rows within each group of
identical properties with ROW_NUMBER() and then process every
row after the first in its group (untested):

| INSERT INTO substitutes ([...])
|   SELECT [...] FROM
|     (SELECT *,
|             ROW_NUMBER() OVER (PARTITION BY wattage, tolerance, temperature
|                                ORDER BY part_number) AS RN
|      FROM parts) AS SubQuery
|   WHERE RN > 1;

| DELETE FROM parts
| WHERE primary_key IN
|   (SELECT primary_key FROM
|     (SELECT *,
|             ROW_NUMBER() OVER (PARTITION BY wattage, tolerance, temperature
|                                ORDER BY part_number) AS RN
|      FROM parts) AS SubQuery
|    WHERE RN > 1);
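
To try the two statements above on sample data, here is a minimal
sketch that runs them against an in-memory SQLite database from
Python instead of PostgreSQL. The table layouts, the sample part
numbers and the helper name move_duplicates are assumptions made
up for illustration; only the ROW_NUMBER() pattern comes from the
queries above. It needs an SQLite build with window functions
(3.25 or later).

```python
import sqlite3

# Hypothetical schema and sample rows, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parts (
        primary_key INTEGER PRIMARY KEY,
        part_number TEXT NOT NULL,
        wattage     REAL,
        tolerance   REAL,
        temperature REAL
    );
    CREATE TABLE substitutes (
        part_number TEXT NOT NULL,
        wattage     REAL,
        tolerance   REAL,
        temperature REAL
    );
    INSERT INTO parts VALUES
        (1, 'R-100', 0.25, 1.0, 100),
        (2, 'R-200', 0.25, 1.0, 100),
        (3, 'R-300', 0.50, 5.0, 200),
        (4, 'R-050', 0.25, 1.0, 100);
""")

# Number the rows within each group of identical properties.
# primary_key is added as a tie-breaker (an assumption, not in the
# original) so both statements number the rows the same way.
NUMBERED = """
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY wattage, tolerance, temperature
                              ORDER BY part_number, primary_key) AS rn
    FROM parts
"""


def move_duplicates(conn):
    """Copy every row after the first of its group, then delete it."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(f"""
            INSERT INTO substitutes (part_number, wattage, tolerance, temperature)
            SELECT part_number, wattage, tolerance, temperature
            FROM ({NUMBERED}) AS numbered
            WHERE rn > 1
        """)
        conn.execute(f"""
            DELETE FROM parts
            WHERE primary_key IN
              (SELECT primary_key FROM ({NUMBERED}) AS numbered WHERE rn > 1)
        """)


move_duplicates(conn)

kept = [r[0] for r in conn.execute(
    "SELECT part_number FROM parts ORDER BY part_number")]
moved = [r[0] for r in conn.execute(
    "SELECT part_number FROM substitutes ORDER BY part_number")]
print("kept: ", kept)
print("moved:", moved)
```

Running both statements in one transaction, with a deterministic
ORDER BY, keeps the DELETE from removing rows that the INSERT did
not copy. In each group the row with the lowest part_number stays
in parts.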

Tim

