Keeping top N records of a group - Mailing list pgsql-general

From Alex Magnum
Subject Keeping top N records of a group
Date
Msg-id CA+cR4zey3qZC2TA1cEYAnhzA70q5TchxairB1JV89G0eUqYNVw@mail.gmail.com
Whole thread Raw
Responses Re: Keeping top N records of a group  (Andreas Kretschmer <akretschmer@spamfence.net>)
List pgsql-general
Hi,
i want to archive data and am looking for a query to replace a rather slow function i am currently using.

The idea is that for every sym and doc_key I want to keep the records of the top 2 sources. Eg. in below table I want to archive 
sym 1022 of source 3000 but keep the 4 records from sources 4045 and 4081.

Any help on this would be appreciated.


 source    | sym          | doc_id  | doc_key
-----------+--------------+---------+--------------
      3000 | 1022         |  551008 | 23834363  <-- Archive
      3000 | 1022         |  551008 | 23834363  <-- Archive
      4045 | 1022         | 3699290 | 23834363
      4045 | 1022         | 3699290 | 23834363
      4081 | 1022         | 3811196 | 23834363
      4081 | 1022         | 3811196 | 23834363
      3000 | 1028         |  550997 | 23834363
      3468 | 1085         | 1740526 | 23834363
      3000 | 1149         |  551017 | 23834363
      4045 | 1149         | 3699291 | 23834363
      3000 | 12           |  551015 | 23834363
      3000 | 12           |  551015 | 23834363
      3951 | 12           | 3147700 | 23834363
      3951 | 12           | 3147700 | 23834363
      3000 | 13           |  551000 | 23834363
      4045 | 13           | 3699283 | 23834363
      3000 | 1327         |  551010 | 23834363
      3971 | 1327         | 3394469 | 23834363
      3000 | 15           |  551001 | 23834363
      4045 | 15           | 3699284 | 23834363
      3000 | 16           |  551002 | 23834363
      4045 | 16           | 3699285 | 23834363
      3000 | 18           |  551013 | 23834363
      4045 | 18           | 3699286 | 23834363
      3000 | 257          |  551005 | 23834363
      3951 | 257          | 3147701 | 23834363
      3000 | 2795         |  551011 | 23834363 <-- Archive
      3459 | 2795         | 1710571 | 23834363 <-- Archive
      3905 | 2795         | 2994791 | 23834363
      4045 | 2795         | 3699292 | 23834363
      3060 | 2913         |  856199 | 23834363 
      3000 | 2954         |  551012 | 23834363 <-- Archive
      3971 | 2954         | 3394470 | 23834363
      4212 | 2954         | 4650870 | 23834363
      3183 | 3427         | 1055492 | 23834363 <-- Archive
      3971 | 3427         | 3394471 | 23834363
      4248 | 3427         | 4763105 | 23834363
      3188 | 594          | 1062642 | 23834363 <-- Archive
      3188 | 594          | 1062642 | 23834363 <-- Archive
      3191 | 594          | 1067501 | 23834363 <-- Archive
      3191 | 594          | 1067501 | 23834363 <-- Archive
      3192 | 594          | 1068391 | 23834363 <-- Archive
      3192 | 594          | 1068391 | 23834363 <-- Archive
      3199 | 594          | 1096070 | 23834363 <-- Archive
      3199 | 594          | 1096070 | 23834363 <-- Archive
      3303 | 594          | 1305467 | 23834363
      3303 | 594          | 1305467 | 23834363
      4117 | 594          | 4000987 | 23834363

Thanks a lot for any ideas.
Alex

pgsql-general by date:

Previous
From: Venkata Balaji N
Date:
Subject: Re: Streaming replication, master recycling
Next
From: Andreas Kretschmer
Date:
Subject: Re: Keeping top N records of a group