Re: Interesting question - Mailing list pgsql-hackers

From Larry Rosenman
Subject Re: Interesting question
Date
Msg-id 20010518214400.A19196@lerami.lerctr.org
Whole thread Raw
In response to Re: Interesting question  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Interesting question  (Larry Rosenman <ler@lerctr.org>)
List pgsql-hackers
* Tom Lane <tgl@sss.pgh.pa.us> [010518 21:09]:
> Larry Rosenman <ler@lerctr.org> writes:
> >  EXPLAIN
> >  SELECT ...
> >  FROM attack_db   
> >  WHERE (start_time >= now() - '02:00:00'::interval OR   
> >         end_time >= now() - '02:00:00'::interval)   
> >         AND host(src_ip) NOT IN (select host(ip) from exempt_ips)    
> >         AND host(dst_ip) NOT IN (select host(ip) from exempt_ips)    
> >  ORDER BY bytes DESC;
> 
> > NOTICE:  QUERY PLAN:
> 
> > Sort  (cost=10870.77..10870.77 rows=5259 width=120)
> >   ->  Seq Scan on attack_db  (cost=0.00..10358.95 rows=5259 width=120)
> >         SubPlan
> >           ->  Seq Scan on exempt_ips  (cost=0.00..1.04 rows=4 width=12)
> >           ->  Seq Scan on exempt_ips  (cost=0.00..1.04 rows=4 width=12)
> 
> 
> Making use of the indexes on start_time and end_time would be a good
> thing.  The reason it's not doing that is it doesn't think that the
> expressions "now() - '02:00:00'::interval" reduce to constants.  We
> may have a proper solution for that by the time 7.2 comes out, but
> in the meantime you could fake it with a function that hides the
> noncacheable function and operator --- see previous discussions of
> this identical issue in the archives.
OK. What would you suggest for the function?  I'd like the
'02:00:00'::interval to be a variable somehow to change the 
interval we're searching.    What fills the table is a daemon that is
looking at the netflow data, and when a packet that matches one of the
attack profiles comes along, it does an insert into attack_db. 


> 
> The NOT INs are pretty ugly too (and do you need the host() conversion
> there?  Seems like a waste of cycles...).  You might be able to live
> with that if the timestamp condition will always be pretty restrictive,
> but otherwise they'll be a no go.  Consider NOT EXISTS with an index
> on exempt_ips(ip).
Yes, because the masks will probably be different each time (this is
from netflow data from my cisco's).  The exempt IP's table is, at the
moment 4 ip's, so that's quick anyway. 

> 
>             regards, tom lane

-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Interesting question
Next
From: Tatsuo Ishii
Date:
Subject: Re: External search engine, advice