How to do this in Postgres - Mailing list pgsql-general

From Holger Klawitter
Subject How to do this in Postgres
Date
Msg-id 383BAE18.B7F4D49C@klawitter.de
Whole thread Raw
List pgsql-general
Thanks for all the hints I've got!

What I've learned from this question is, that not everything
which seems to be a database problem should be solved using
databases.

(select * from data where <<prio is minimal per id pair>>)

The problem wasn't that there is no statement to get this data,
the problem was the large number of rows. Everything involving
a join of data with itself took plainly too long. Nested
selects only seem to be a syntactical escape.

However, there is one solution (PostgreSQL specific) using the
feature SELECT DISTINCT ON extending the SQL standard. This
select retrieves only the first occurrence obeying the implied order.
However, one has to be distinct on one *single* column.
(Feature request: SELECT DISTINCT ON (id1,id2) ... ):

    alter table data add column combined text;
        update data set combined = ( id1 || '|' ) || id2;
        select
                distinct on combined
                id1,id2,<<data>>
                from data
                order by prio

But, the update already takes more than 30 min.

The following perl script (paraphrased) solved the same
problem in 40 sec (not counting "copy into" which I had
to do anyhow):

while( <> ) {
    my( $id1, $id2, $prio, $data ) = split( /\t/, $_ );
    my( $key ) = "$id1\t$id2";
    my( $entry ) = $table{$key};
    if( $entry eq "" ) {
        $table{$key} = "$prio\t$data";
    } else {
        my( $oldprio, $junk ) = split( /\t/, $entry );
        if( $prio < $oldprio ) {
            $table{$key} = "$prio\t$data";
        }
    }
}
foreach $key (keys %table) {
    printf "%s\t%s", $key, $table{$key};
}

Regards,
    Holger Klawitter
--
Holger Klawitter                                     +49 (0)251 484 0637
holger@klawitter.de                             http://www.klawitter.de/



pgsql-general by date:

Previous
From: Elmar Haneke
Date:
Subject: Re: [GENERAL] Re: Is PostgreSQL ready for mission critical applications?
Next
From: Andrzej Mazurkiewicz
Date:
Subject: Referencial integrity - Re: Is PostgreSQL ready for mission criti calapplications?