Hello,
I need a bit of help with some SQL.
I have two tables, call them Page and Bookmark.
Each row in Page can have many Bookmarks pointing to it, and
they are joined via a FK (Page.id = Bookmark.page_id).
Page has a 'url' column: Page.url, which has a unique index on it.
My Page.url column got a little dirty, and I need to clean it up,
and that's what I need help with.
Here is an example of dirtiness:
Page:
id=1 url = 'http://example.com/'
id=2 url = 'http://example.com/#' -- dirty
id=3 url = 'http://example.com/#foo' -- dirty
The last two rows are dirty. Normally I normalize URLs before
inserting them, but these got in, and now I need to clean them.
The problem is that rows in Bookmark table may point to dirty
rows in Page, so I can't just remove the dirty rows, and I can't
just update 'url' column in Page to 'http://example.com/',
because that column is unique.
Is there some fancy SQL that I can use them to find the dirty
rows in page (... where url like '%#%') and then find rows in
Bookmark table that point to them, then point those rows to
good rows in Page (e.g. id=1 row above), and finally remove the
dirty rows from Page?
Any help would be greatly appreciated.
I'm using Pg 8.0.3
Thanks,
Otis