Re: SELECT question - Mailing list pgsql-general

From Michael Glaesemann
Subject Re: SELECT question
Date
Msg-id BDDE38DE-B070-463E-9E98-E93E939984B4@seespotcode.net
Whole thread Raw
In response to SELECT question  (Michelle Konzack <linux4michelle@freenet.de>)
Responses Re: SELECT question  (Michelle Konzack <linux4michelle@freenet.de>)
List pgsql-general
On Aug 17, 2007, at 7:27 , Michelle Konzack wrote:

> *********************************************************************
> * Do not Cc: me, because I am on THIS list, if I write here.        *

You might want to consider changing your mailing list subscription
settings to "eliminatecc", e.g., send email to
majordomo@postgresql.org (not the list address!) with body

set pgsql-general eliminatecc

This should prevent the mailing list from sending you a second copy.

> I have a very huge table of references from Websites (html, pics, ...)
> where the elements stored on a fileserver using sha384.
>
> Now the indextable hold all filenames and download dates but now I
> like
> to get a "snapshoot on a paticular day.
>
> How must I create the SELCT statement to get ALL files valid on a
> particular day?
>
> Note:  There can be every time a new index.html for example but images
>        change only once a month...
>
>        So I need all elements valable on the paticular day which mean,
>        I need to select that LAST version of the elements...

I think what you want is something like:

SELECT DISTINCT ON (website_reference) website_reference,
download_date, file_path
FROM indextable
WHERE download_date <= ? -- whatever date you're interested in
ORDER BY website_reference, download_date DESC;

This should return the most recent website_reference and its
download_date that's earlier than the download_date specified in the
WHERE clause.

DISTINCT ON is a (very helpful) PostgreSQL extension. You can get
similar results using a subquery;

SELECT website_reference, download_date, file_path
FROM indextable
NATURAL JOIN (
     SELECT website_reference, max(download_date) as download_date
     FROM indextable
     WHERE download_date <= ?
     GROUP BY website_reference
     ) most_recent_versions;

This may return more than one row per website_reference if the
website_reference has more than on file_path for a particular
download_date.

Does this help? If not, could you give a bit more of a concrete example?

(Is is just me or have there been a lot of queries that can be solved
using DISTINCT ON recently?)

Michael Glaesemann
grzm seespotcode net


Attachment

pgsql-general by date:

Previous
From: "Joshua D. Drake"
Date:
Subject: Re: Enterprise Wide Deployment
Next
From: Michael Glaesemann
Date:
Subject: Re: FInding "corrupt" values in UTF-8 tables (regexp question, I think)