web archiving - Mailing list pgsql-novice

From: Matt Price
Subject: web archiving
Date:
Msg-id: 1026334740.699.82.camel@anarres
List: pgsql-novice
Hi there,

I've just moved up from non-free OSes to Debian Linux, and installed
PostgreSQL, with the hope of getting started on some projects I've been
thinking about.  Several of these projects involve web archives.  The
idea is that a URL is entered along with a bunch of bibliographic-type
data in other fields (keywords, author, date, etc.).  The HTML (and,
hopefully, the accompanying images, CSS files, etc.) is then grabbed
using curl and archived in a PostgreSQL database.  A web or other GUI
front end then provides fully searchable access to the archive for
later use.
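
To make the idea a bit more concrete, here is a rough sketch of that
flow (fetch a page, store it with its bibliographic fields, search it
later).  Everything in it is an assumption for illustration only: it is
written in Python rather than PHP, it uses Python's built-in sqlite3
module as a stand-in for PostgreSQL so it can run without a server, it
fetches with urllib instead of curl, and the table, column, and
function names (archive, archive_page, search, and so on) are made up.

```python
import sqlite3
import urllib.request
from datetime import date

# Hypothetical schema: one row per archived page plus its bibliographic data.
SCHEMA = """
CREATE TABLE IF NOT EXISTS archive (
    id       INTEGER PRIMARY KEY,
    url      TEXT NOT NULL,
    author   TEXT,
    keywords TEXT,
    pub_date TEXT,
    html     TEXT NOT NULL
)
"""


def open_archive(path=":memory:"):
    """Open (or create) the archive database; in-memory by default."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def fetch_html(url):
    """Download a page body over HTTP; stands in for the curl step."""
    with urllib.request.urlopen(url) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def archive_page(conn, url, author, keywords, pub_date, fetch=fetch_html):
    """Fetch a URL and store its HTML with the supplied metadata.

    `fetch` is injectable so the storage logic can be tried offline.
    Returns the id of the new row.
    """
    html = fetch(url)
    cur = conn.execute(
        "INSERT INTO archive (url, author, keywords, pub_date, html) "
        "VALUES (?, ?, ?, ?, ?)",
        (url, author, ",".join(keywords), pub_date.isoformat(), html),
    )
    conn.commit()
    return cur.lastrowid


def search(conn, term):
    """Return URLs whose stored HTML or keywords contain `term`."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT url FROM archive "
        "WHERE html LIKE ? OR keywords LIKE ? ORDER BY id",
        (pattern, pattern),
    )
    return [row[0] for row in rows]
```

A substring match with LIKE is only the simplest possible search; a real
archive in PostgreSQL would presumably want a proper text index, and
images and stylesheets would need their own storage.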

So my question: does anyone know of a similar tool that already
exists?  I'm a complete novice at database programming (and at PHP, too,
which is what I figured I'd use as the scripting language, though I'd
consider learning Perl or Java if folks think that's a much better
idea), and I'd rather work with some pre-existing code than start from
the ground up.  Any suggestions?  Is this the right list to be asking
this question on?

Thanks loads,
Matt

