Thread: New redirector

New redirector

From
Magnus Hagander
Date:
FYI - I've committed a new version of the URL redirector for downloads.

The old version was being used for linkfilter-breakthrough to distribute
viruses :-(

Since I was hacking around that code anyway, I didn't just add a filter
to it, but changed around how it works a bit. Apart from it no longer
being possible to use it to break through stupid linkblockers, it has
also made the URLs easier to read and copy/paste, and we're also storing
the logging information in a way that's much easier to analyze than before.

Do keep your eyes open for bugs, of course :-)

//Magnus
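
A minimal sketch of the idea described above: a redirector that only sends clients to known mirrors, so it can't be used to bounce people to arbitrary URLs. This is not the committed code. The mirror table, the function name, and the path checks are all assumptions made for illustration; only the notion of a per-mirror id in the URL comes from this thread.

```python
from urllib.parse import urlsplit

# Hypothetical mirror table: id -> base URL (not the real schema or hosts).
MIRRORS = {
    1: "https://ftp.example.org/pub/postgresql/",
    2: "https://mirror.example.net/postgresql/",
}


def resolve_redirect(mirror_id, path):
    """Return the redirect target, or None if the request should be refused."""
    base = MIRRORS.get(mirror_id)
    if base is None:
        # Unknown mirror: refuse instead of redirecting somewhere arbitrary.
        return None
    # Refuse paths that could leave the mirror's base URL.
    if path.startswith("/") or "//" in path or ".." in path.split("/"):
        return None
    # Refuse anything that parses as a full URL (scheme or host present).
    parts = urlsplit(path)
    if parts.scheme or parts.netloc:
        return None
    return base + path
```

The point of the design is that the client never supplies the destination host. It only picks a mirror id and a relative path, so a link through the redirector can't point outside the mirror list.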


Re: New redirector

From
Magnus Hagander
Date:
Magnus Hagander wrote:
> FYI - I've committed a new version of the URL redirector for downloads.
> 
> The old version was being used for linkfilter-breakthrough to distribute
> viruses :-(
> 
> Since I was hacking around that code anyway, I didn't just add a filter
> to it, but changed around how it works a bit. Apart from it no longer
> being possible to use it to break through stupid linkblockers, it has
> also made the URLs easier to read and copy/paste, and we're also storing
> the logging information in a way that's much easier to analyze than before.
> 
> Do keep your eyes open for bugs, of course :-)

I have reverted the part of this that changes the format for logging,
because it turned out that it was impossible to wrestle the stackbuilder
traffic logging onto that format - since stackbuilder uses the
redirector to log arbitrary downloads, and not just things coming off
our mirror network. Also it seems that the mirror id primary key can
change around, and should not be used for logging.

I was not aware of these things, my apologies.

I think we're fine just losing the info on the roughly 500 downloads that
were recorded in the new logging table. We could reconstruct the old format
from it, but I don't think it's worth it.


There should be no end-user visible changes in this revert, only the
backend logging.

//Magnus


Re: New redirector

From
Magnus Hagander
Date:
Magnus Hagander wrote:
> Magnus Hagander wrote:
>> FYI - I've committed a new version of the URL redirector for downloads.
>>
>> The old version was being used for linkfilter-breakthrough to distribute
>> viruses :-(
>>
>> Since I was hacking around that code anyway, I didn't just add a filter
>> to it, but changed around how it works a bit. Apart from it no longer
>> being possible to use it to break through stupid linkblockers, it has
>> also made the URLs easier to read and copy/paste, and we're also storing
>> the logging information in a way that's much easier to analyze than before.
>>
>> Do keep your eyes open for bugs, of course :-)
> 
> I have reverted the part of this that changes the format for logging,
> because it turned out that it was impossible to wrestle the stackbuilder
> traffic logging onto that format - since stackbuilder uses the
> redirector to log arbitrary downloads, and not just things coming off
> our mirror network. Also it seems that the mirror id primary key can
> change around, and should not be used for logging.

Meh, another misunderstanding there. The primary key doesn't change,
only the mirror index does. I got them mixed up.

//Magnus



Re: New redirector

From
"Dave Page"
Date:
On Sat, Dec 20, 2008 at 5:05 PM, Magnus Hagander <magnus@hagander.net> wrote:
> FYI - I've committed a new version of the URL redirector for downloads.
>
> The old version was being used for linkfilter-breakthrough to distribute
> viruses :-(

FYI, I just removed 242769 log records from the clickthrus table. Some
were created by this issue, a few were mis-parsed URLs and such.


-- 
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com


Re: New redirector

From
Stefan Kaltenbrunner
Date:
Magnus Hagander wrote:
> FYI - I've committed a new version of the URL redirector for downloads.
> 
> The old version was being used for linkfilter-breakthrough to distribute
> viruses :-(
> 
> Since I was hacking around that code anyway, I didn't just add a filter
> to it, but changed around how it works a bit. Apart from it no longer
> being possible to use it to break through stupid linkblockers, it has
> also made the URLs easier to read and copy/paste, and we're also storing
> the logging information in a way that's much easier to analyze than before.
> 
> Do keep your eyes open for bugs, of course :-)

This change broke most of the website replication code and is close to
running some of the website mirrors out of disk space. It seems that
the mirror script is now copying tons of /redir/<mirrorid> directories
to the slaves, and some of them contain individual copies of the full
source tarball for all active releases.
This causes both disk-usage related issues and very long
sync times between wwwmaster and the slaves...
I don't have time to look into that more closely now, so it would be good
if somebody else could.


Stefan


Re: New redirector

From
Magnus Hagander
Date:
On 24 dec 2008, at 10.24, Stefan Kaltenbrunner  
<stefan@kaltenbrunner.cc> wrote:

> Magnus Hagander wrote:
>> FYI - I've committed a new version of the URL redirector for downloads.
>>
>> The old version was being used for linkfilter-breakthrough to distribute
>> viruses :-(
>>
>> Since I was hacking around that code anyway, I didn't just add a filter
>> to it, but changed around how it works a bit. Apart from it no longer
>> being possible to use it to break through stupid linkblockers, it has
>> also made the URLs easier to read and copy/paste, and we're also storing
>> the logging information in a way that's much easier to analyze than before.
>>
>> Do keep your eyes open for bugs, of course :-)
>
> This change broke most of the website replication code and is close to
> running some of the website mirrors out of disk space. It seems that
> the mirror script is now copying tons of /redir/<mirrorid> directories
> to the slaves, and some of them contain individual copies of the full
> source tarball for all active releases.
> This causes both disk-usage related issues and very long
> sync times between wwwmaster and the slaves...
> I don't have time to look into that more closely now, so it would be good
> if somebody else could.

Oh shit...

It shouldn't crawl explicit links to wwwmaster, I thought :( Perhaps
some place is forgetting to make it explicit?
If not, then just making it exclude everything under /redir/ when
mirroring should do the trick.

Unfortunately it'll be a while before I can look at it, so I'd
appreciate it if someone else could!

/Magnus



Re: New redirector

From
"Dave Page"
Date:
On Wed, Dec 24, 2008 at 10:08 AM, Magnus Hagander <magnus@hagander.net> wrote:
> Oh shit...
>
> It shouldn't crawl explicit links to wwwmaster, I thought :( perhaps some
> place is forgetting to make it explicit?

Nice work :-)

> If not, then just making it exclude everything under /redir/ when mirroring
> should do the trick.

Yeah - the exclude list had redir\? but not redir/. Added now.

Hmmm...

Update has been running since 2008-12-23T22:00:00+00:00 (12 hours 25
minutes 41 seconds).

Killed that as well, rm -rf'd the static/redir/ directory and
requested a sync. I've gotta leave to do Christmas now... have a good
one :-)

-- 
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com
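
To illustrate the exclude-list fix mentioned above, here is a rough sketch of why a pattern for redir\? misses the new path-style URLs. It assumes the mirror script treats its exclude entries as regular expressions searched against each URL, which this thread doesn't confirm. The example URLs and the helper name are made up.

```python
import re

# Exclude list before and after the fix (patterns as quoted in the thread).
OLD_EXCLUDES = [r"redir\?"]
NEW_EXCLUDES = [r"redir\?", r"redir/"]


def is_excluded(url, patterns):
    """True if any exclude pattern matches somewhere in the URL (assumed semantics)."""
    return any(re.search(p, url) for p in patterns)


# Hypothetical URLs: an old query-string style link and a new path-style link.
old_style = "/redir?example=1"
new_style = "/redir/42/source/example.tar.gz"

print(is_excluded(old_style, OLD_EXCLUDES))  # old links were skipped
print(is_excluded(new_style, OLD_EXCLUDES))  # new links slipped through
print(is_excluded(new_style, NEW_EXCLUDES))  # skipped once redir/ is listed
```

With only the old pattern, the crawler would follow every /redir/<mirrorid>/ link and store whatever comes back, which fits the disk usage Stefan reported.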