Re: Find duplicates in a column then print Information to a file - Mailing list pgsql-general

From Francisco Olarte
Subject Re: Find duplicates in a column then print Information to a file
Date
Msg-id CA+bJJbxN+nEiE7T02OpR4P_z8b7=tfVw+FUjLkO5icDuev-TKQ@mail.gmail.com
Whole thread Raw
In response to Find duplicates in a column then print Information to a file  (Sherman Willden <operasopranos@gmail.com>)
Responses Re: Find duplicates in a column then print Information to a file  (Sherman Willden <operasopranos@gmail.com>)
List pgsql-general
Mmmmm, I notice a faint homework smell here ;-> , but the question is
nicely asked so:

On Sun, Dec 31, 2017 at 7:19 PM, Sherman Willden
<operasopranos@gmail.com> wrote:
...
> SELECT aria FROM aria_precis WHERE aria IN (SELECT aria FROM aria_precis
> GROUP BY aria  HAVING COUNT(aria)>1);

The outer select is fully redundant. Just use the inner one. In fact,
not redundnat, harmful, as it will spit many copies of aria for the
duplicated ones. If you add id (and/or artist) to the outer query then
it is useful again.


> When I perform the following I get (0 rows):
> SELECT aria FROM aria_precis WHERE aria IN (SELECT aria FROM aria_precis
> GROUP BY id, aria  HAVING COUNT(aria)>1);

As expected. See the inner query, you are grouping by ID which is the
primary key, by PK definition all counts are going to be one, so no
results, nothing goes to the outer query.

Use the first query, but adding id and artist to the OUTER level.

> After I get the above information how do I write that information to a file?
> The only thing I know writes the entire database to a file:
>  \COPY aria_precis TO '/home/sherman/aria_precis_2.csv' WITH DELIMITER '#,'
> CSV HEADER;

If copy format is ok to you, knowing \copy look at the docs, looking
at https://www.postgresql.org/docs/9.6/static/app-psql.html you will
read:

>>>
\copy { table [ ( column_list ) ] | ( query ) } { from | to } {
'filename' | program 'command' | stdin | stdout | pstdin | pstdout } [
[ with ] ( option [, ...] ) ]

Performs a frontend (client) copy. This is an operation that runs an
SQL COPY command, but instead of the server reading or writing the
specified file, psql reads or writes the file and routes the data
between the server and the local file system. This means that file
accessibility and privileges are those of the local user, not the
server, and no SQL superuser privileges are required.
<<<

Notice the (query) option? Your copy is using the table+optional
column list format, (aria_precis), just change it to the "( query )"
format ( NOTICE THE MANDATORY PARENS, this is how copy knows what to
do. "t" => table, "t(c1)"=>table+columns, "(xx)" => query ( nothing
before the opening parens ). )

Happy new year.

Francisco Olarte.


pgsql-general by date:

Previous
From: Martin Moore
Date:
Subject: Re: Find duplicates in a column then print Information to a file
Next
From: Sherman Willden
Date:
Subject: Re: Find duplicates in a column then print Information to a file