Thread: match an IP address

match an IP address

From
Joao Ferreira gmail
Date:
hello all,

I'm unable to build a LIKE or SIMILAR TO expression for matching and ip
address

192.168.90.3
10.3.2.1

any help please...


thanks
joao



Re: match an IP address

From
hubert depesz lubaczewski
Date:
On Mon, Sep 22, 2008 at 05:59:25PM +0100, Joao Ferreira gmail wrote:
> I'm unable to build a LIKE or SIMILAR TO expression for matching and ip
> address
> 192.168.90.3
> 10.3.2.1
> any help please...

any reason not to use standard inet datatype? which does the validation.

Best regards,

depesz

--
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

Re: match an IP address

From
Raymond O'Donnell
Date:
On 22/09/2008 17:59, Joao Ferreira gmail wrote:

> I'm unable to build a LIKE or SIMILAR TO expression for matching and ip
> address

There are built-in types in PG for handling IP addresses - are they any
use to you?

If not, there's a useful site here which may get you started:

    http://regexlib.com/default.aspx


Ray.

------------------------------------------------------------------
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
rod@iol.ie
Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals
------------------------------------------------------------------


Re: match an IP address

From
"Scott Marlowe"
Date:
On Mon, Sep 22, 2008 at 10:59 AM, Joao Ferreira gmail
<joao.miguel.c.ferreira@gmail.com> wrote:
> hello all,
>
> I'm unable to build a LIKE or SIMILAR TO expression for matching and ip
> address
>
> 192.168.90.3
> 10.3.2.1

As already mentioned inet / cidr types should work.  Example:

postgres=# create table inettest (a inet);
CREATE TABLE
postgres=# insert into inettest values
('192.168.0.1'),('192.168.1.1'),('10.0.0.1');
INSERT 0 3
postgres=# select a from inettest where '192.168.0.1/16' >> a;
      a
-------------
 192.168.0.1
 192.168.1.1
(2 rows)
postgres=# select a from inettest where '192.168.0.1/24' >> a;
      a
-------------
 192.168.0.1
(1 row)
postgres=# select a from inettest where '192.168.0.1/0' >> a;
      a
-------------
 192.168.0.1
 192.168.1.1
 10.0.0.1
(3 rows)

Re: match an IP address

From
Joao Ferreira gmail
Date:
well...

my IP addresses are stored in a TEXT type field. that field can actually
contain usernames like 'joao' or 'scott' and it can contain IP
addresses....



:(

joao


On Mon, 2008-09-22 at 11:13 -0600, Scott Marlowe wrote:
> On Mon, Sep 22, 2008 at 10:59 AM, Joao Ferreira gmail
> <joao.miguel.c.ferreira@gmail.com> wrote:
> > hello all,
> >
> > I'm unable to build a LIKE or SIMILAR TO expression for matching and ip
> > address
> >
> > 192.168.90.3
> > 10.3.2.1
>
> As already mentioned inet / cidr types should work.  Example:
>
> postgres=# create table inettest (a inet);
> CREATE TABLE
> postgres=# insert into inettest values
> ('192.168.0.1'),('192.168.1.1'),('10.0.0.1');
> INSERT 0 3
> postgres=# select a from inettest where '192.168.0.1/16' >> a;
>       a
> -------------
>  192.168.0.1
>  192.168.1.1
> (2 rows)
> postgres=# select a from inettest where '192.168.0.1/24' >> a;
>       a
> -------------
>  192.168.0.1
> (1 row)
> postgres=# select a from inettest where '192.168.0.1/0' >> a;
>       a
> -------------
>  192.168.0.1
>  192.168.1.1
>  10.0.0.1
> (3 rows)
>


Re: match an IP address

From
"Scott Marlowe"
Date:
On Mon, Sep 22, 2008 at 11:16 AM, Joao Ferreira gmail
<joao.miguel.c.ferreira@gmail.com> wrote:
> well...
>
> my IP addresses are stored in a TEXT type field. that field can actually
> contain usernames like 'joao' or 'scott' and it can contain IP
> addresses....

Then cast them to inet and use the method I showed above:

postgres=# create table b as select a::text from inettest ;
SELECT
postgres=# select * from b;
       a
----------------
 192.168.0.1/32
 192.168.1.1/32
 10.0.0.1/32
(3 rows)

postgres=# select a from b where '192.168.0.1/0' >> a::inet;
       a
----------------
 192.168.0.1/32
 192.168.1.1/32
 10.0.0.1/32
(3 rows)

postgres=# select a from b where '192.168.0.1/24' >> a::inet;
       a
----------------
 192.168.0.1/32

Re: match an IP address

From
Tino Wildenhain
Date:
Hi,

Joao Ferreira gmail wrote:
> well...
>
> my IP addresses are stored in a TEXT type field. that field can actually
> contain usernames like 'joao' or 'scott' and it can contain IP
> addresses....

Any reason not to change this in the first place?

For a quick fix you could use regex to find the records
likely containing an ip-address notation.

Tino

Attachment

Re: match an IP address

From
"Phoenix Kiula"
Date:
> > my IP addresses are stored in a TEXT type field. that field can actually
> > contain usernames like 'joao' or 'scott' and it can contain IP
> > addresses....



I think this is common DB design on many websites that have registered
user IDs.

My humble suggestion would be to make another column in the table
called "user_registered" or something. Make this an int(1). If the
user is registered, value is 1, otherwise 0.

To update your existing data onetime, run an SQL update query looking
for IP pattern with 3 dots (which is likely IP address and not user
id) or by the INET conditions in previous suggestions.

Next, include the columns "user_id" and "user_registered" in the same
index. This won't increase the size of the index too much, as the
user_registered field is a small INT number. But from now on your SQL
queries:

   ...WHERE user_id = 'testuser' AND user_registered = 1

will return much faster. I have found this to the most convenient and
fastest solution in PGSQL instead of haivng INET in the WHERE clauses.

Re: match an IP address

From
Tino Wildenhain
Date:
Phoenix Kiula wrote:
>>> my IP addresses are stored in a TEXT type field. that field can actually
>>> contain usernames like 'joao' or 'scott' and it can contain IP
>>> addresses....
>
>
>
> I think this is common DB design on many websites that have registered
> user IDs.

Is it? Name one! Sounds like crappy design to me.

> My humble suggestion would be to make another column in the table
> called "user_registered" or something. Make this an int(1). If the
> user is registered, value is 1, otherwise 0.

Maybe "registration" (aka authentication) is mixed up with simple
session handling?

> To update your existing data onetime, run an SQL update query looking
> for IP pattern with 3 dots (which is likely IP address and not user
> id) or by the INET conditions in previous suggestions.

It could also just be a fdn (hostname) - still I'm a bit puzzled
how a username and an IP-address can get into the same field.

> Next, include the columns "user_id" and "user_registered" in the same
> index. This won't increase the size of the index too much, as the
> user_registered field is a small INT number. But from now on your SQL
> queries:
>
>    ...WHERE user_id = 'testuser' AND user_registered = 1
>
> will return much faster. I have found this to the most convenient and
> fastest solution in PGSQL instead of haivng INET in the WHERE clauses.
>

Can you give example on where the inet entry is going to be used?

Cheers
Tino

Attachment

Re: match an IP address

From
"Phoenix Kiula"
Date:
> > I think this is common DB design on many websites that have registered
 > > user IDs.
 > >
 >
 >  Is it? Name one! Sounds like crappy design to me.
 >




It might sound crappy design to you, but for websites that allow users
 to do something while they are registered OR unregistered, will choose
 to have this for query speed. (Registered user goes in as "testuser"
 while an unregistered one goes as his IP address--some websites also
 use cookies but they're less reliable.)

 We can make this very relationally sound and whatnot, but having one
 field to go to, whether you are registered or not, makes it much
 simpler.

 Most websites have to allow for dots in their user ID these days as
 people prefer to have their email address as user ID. Which means that
 the dot checking of an IP address may not work to distinguish IP
 addresses (unregistered user) from registered user IDs.

 In this scenario, for query speed, again, if there is a column that
 tells us whether this user is registered or not it helps a great deal.
 The INET match condition is not good enough for speed for most modern
 websites with any sizeable traffic. I even wrote a function that
 converts IP to INET integer and vice versa, but no great boost in
 query speed that could compare to an indexed query on user_id and
 user_reg.

 Welcome your thoughts on how you would do it.

Re: match an IP address

From
Craig Ringer
Date:
Tino Wildenhain wrote:
> Phoenix Kiula wrote:
>>>> my IP addresses are stored in a TEXT type field. that field can
>>>> actually
>>>> contain usernames like 'joao' or 'scott' and it can contain IP
>>>> addresses....
>>
>>
>>
>> I think this is common DB design on many websites that have registered
>> user IDs.
>
> Is it? Name one! Sounds like crappy design to me.
>
>> My humble suggestion would be to make another column in the table
>> called "user_registered" or something. Make this an int(1). If the
>> user is registered, value is 1, otherwise 0.
>
> Maybe "registration" (aka authentication) is mixed up with simple
> session handling?
>
>> To update your existing data onetime, run an SQL update query looking
>> for IP pattern with 3 dots (which is likely IP address and not user
>> id) or by the INET conditions in previous suggestions.
>
> It could also just be a fdn (hostname) - still I'm a bit puzzled
> how a username and an IP-address can get into the same field.

Dodgy forum software. Lots of it uses an IP address as a fake username
for unregistered users, rather than doing the sensible thing and
tracking both IP address and (if defined) username.

How I'd want to do this if I was designing the setup from scratch would
probably be:

--
-- Track user identies
--
CREATE TABLE user (
   user_id    SERIAL PRIMARY KEY,
   user_name    VARCHAR NOT NULL,
   user_retired  BOOLEAN
   -- and whatever else you want to keep track of about them
);

-- This index enforces unique user names across active users,
-- permitting re-use of usernames for since deleted users without
-- losing information about the previous user of that name's activity
-- or losing the inability to differentiate between the old and new
-- owners of that name.
--
-- I'd revoke DELETE rights on `user' or use a rule that rewrote DELETE
-- to an UPDATE user SET user_retired = 't'.
--

CREATE UNIQUE INDEX user_active_name
ON user(user_name)
WHERE (NOT user_retired)

-- Now, say your problem described in your post is with user activity
-- logging:

CREATE TABLE access_log (
   -- blah blah
   user_id    INTEGER REFERENCES user(user_id),
   access_ip    cidr NOT NULL
);

CREATE INDEX access_log_ip ON access_log(access_ip);

CREATE INDEX access_user
ON access_log(user_id)
WHERE (user_id NOT NULL);



In other words: always store the IP address, and if the user involved
was a registered user store a reference to their user ID as well. Store
a reference to a globally unique user identity number rather than the
user name, and permit reuse of user names without losing information
about distinct username owners.

If you wanted you could use string user IDs and do away with the
synthetic "user_id" key I've used above, but I suspect you'd regret it
down the track.

If you wanted to look up activity that might be identifed by IP address
or by username, a query like this would do the trick and would simulate
the behaviour your forum software is used to, including the ability of a
user to create a username that's an IP address to throw the whole thing
into chaos:

SELECT * FROM access_user
WHERE COALESCE(user_id, to_char(access_ip)) = matchstring;

--
Craig Ringer

Re: match an IP address

From
"Phoenix Kiula"
Date:
My post at the bottom.



 On 9/23/08, Craig Ringer <craig@postnewspapers.com.au> wrote:

 >
 >  Dodgy forum software. Lots of it uses an IP address as a fake username for
 > unregistered users, rather than doing the sensible thing and tracking both
 > IP address and (if defined) username.
 >
 >  How I'd want to do this if I was designing the setup from scratch would
 > probably be:
 >
 >  --
 >  -- Track user identies
 >  --
 >  CREATE TABLE user (
 >   user_id       SERIAL PRIMARY KEY,
 >   user_name     VARCHAR NOT NULL,
 >   user_retired  BOOLEAN
 >   -- and whatever else you want to keep track of about them
 >  );
 >
 >  -- This index enforces unique user names across active users,
 >  -- permitting re-use of usernames for since deleted users without
 >  -- losing information about the previous user of that name's activity
 >  -- or losing the inability to differentiate between the old and new
 >  -- owners of that name.
 >  --
 >  -- I'd revoke DELETE rights on `user' or use a rule that rewrote DELETE
 >  -- to an UPDATE user SET user_retired = 't'.
 >  --
 >
 >  CREATE UNIQUE INDEX user_active_name
 >  ON user(user_name)
 >  WHERE (NOT user_retired)
 >
 >  -- Now, say your problem described in your post is with user activity
 >  -- logging:
 >
 >  CREATE TABLE access_log (
 >   -- blah blah
 >   user_id       INTEGER REFERENCES user(user_id),
 >   access_ip     cidr NOT NULL
 >  );
 >
 >  CREATE INDEX access_log_ip ON access_log(access_ip);
 >
 >  CREATE INDEX access_user
 >  ON access_log(user_id)
 >  WHERE (user_id NOT NULL);
 >
 >
 >
 >  In other words: always store the IP address, and if the user involved was a
 > registered user store a reference to their user ID as well. Store a
 > reference to a globally unique user identity number rather than the user
 > name, and permit reuse of user names without losing information about
 > distinct username owners.
 >
 >  If you wanted you could use string user IDs and do away with the synthetic
 > "user_id" key I've used above, but I suspect you'd regret it down the track.
 >
 >  If you wanted to look up activity that might be identifed by IP address or
 > by username, a query like this would do the trick and would simulate the
 > behaviour your forum software is used to, including the ability of a user to
 > create a username that's an IP address to throw the whole thing into chaos:
 >
 >  SELECT * FROM access_user
 >  WHERE COALESCE(user_id, to_char(access_ip)) = matchstring;
 >
 >  --
 >  Craig Ringer
 >





No. You have no idea what the design is for. Not forum crap.

 What happens when you need to store in a table the activity log?

   ACTIVITY_ID
   USER_STAMP  (currently user_id or ip for registered and unregistered resp.)

 The idea of storing IP for users is already being done. So what?
 Everytime they "do" something, you do not store their IP. Why would
 you? Just store their user id. For unregistered ones however, we store
 the IP because there is nothing else. There is no user ID for them.
 What's your logic for getting a user ID for unregistered guys --
 invent one automagically?

 Finally, this SQL:


    WHERE COALESCE(user_id, to_char(access_ip)) = matchstring;


Ever tried this crap on a table of 10 million records on a live
 website, where this query is happening at 3000 times per second? No
 such function schtick will match the raw speed of a simpler indexed
 query. Or did you mean my index should contain the COALESCE already?

Tino, I wasn't talking about sessions. This is not about session IDs.
A session ID is useless the next time a certain IP address "does
something" on the site. You want a commonality for non-registered
users across many different sessions. (Apart from the fact that
session IDs are usually long hashes which take up space in the table
and in the index)

Re: match an IP address

From
Tino Wildenhain
Date:
Phoenix Kiula wrote:
> My post at the bottom.
>
...
>
> No. You have no idea what the design is for. Not forum crap.
>
>  What happens when you need to store in a table the activity log?
>
>    ACTIVITY_ID
>    USER_STAMP  (currently user_id or ip for registered and unregistered resp.)

And here it gets wrong. Obviously you would store the session id
or if you have a lot of relations, use a sequence generated
key for session_id (compare with my design in the other post,
in this case session_id would be serial and you'd have a field
session_key text with the index for the cookies in the sessions
table instead)

>  The idea of storing IP for users is already being done. So what?

Abandon this idea I'd say. Its based on the wrong asumption IP
addresses map to users in 1:1 relation.

>  Everytime they "do" something, you do not store their IP. Why would
>  you? Just store their user id. For unregistered ones however, we store
>  the IP because there is nothing else. There is no user ID for them.
>  What's your logic for getting a user ID for unregistered guys --
>  invent one automagically?
>
>  Finally, this SQL:
>
>
>     WHERE COALESCE(user_id, to_char(access_ip)) = matchstring;
>
>
> Ever tried this crap on a table of 10 million records on a live
>  website, where this query is happening at 3000 times per second? No

You have 10 million people active the same time in your site?


>  such function schtick will match the raw speed of a simpler indexed
>  query. Or did you mean my index should contain the COALESCE already?
>
> Tino, I wasn't talking about sessions. This is not about session IDs.

Well actually this is. You are just naming it differently.

> A session ID is useless the next time a certain IP address "does
> something" on the site. You want a commonality for non-registered
> users across many different sessions. (Apart from the fact that
> session IDs are usually long hashes which take up space in the table
> and in the index)

Yes but only active ones.

btw, given IP is in every request, where is your username coming from?
Apart from basic auth, there is no way of having a userid tied to
the request directly, so how are you doing this?

Tino

Attachment

Re: match an IP address

From
hubert depesz lubaczewski
Date:
On Mon, Sep 22, 2008 at 05:59:25PM +0100, Joao Ferreira gmail wrote:
> I'm unable to build a LIKE or SIMILAR TO expression for matching and ip
> address
> 192.168.90.3
> 10.3.2.1
> any help please...

use this regular expression:

'^[0-9]{1,3}(.[0-9]{1,3}){3}$'

warning: do not use "like" or "similar to".
proper way to use it:

select * from table where field ~ '^[0-9]{1,3}(.[0-9]{1,3}){3}$';

this regexp is not 100% fault proof - it will happily return rows like:
'300.999.998.7'

but for most of the cases it should be enough. if you need it to match
only ips, and filter out things like '300.999.999.999' - let me know.

Best regards,

depesz

--
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

Re: match an IP address

From
Craig Ringer
Date:
Phoenix Kiula wrote:

> Ever tried this crap on a table of 10 million records on a live
>  website, where this query is happening at 3000 times per second? No
>  such function schtick will match the raw speed of a simpler indexed
>  query. Or did you mean my index should contain the COALESCE already?

Please forgive my attempt to help you based on a woefully insufficient
description of your problem and situation. I will not make any attempt
to do so again.

--
Craig Ringer

Re: match an IP address

From
Joao Ferreira gmail
Date:
thank you depesz

it seems a pretty good fix for my problem. Actually yestreday I came up
with something similar but your's is better.

cheers

joao

On Tue, 2008-09-23 at 09:26 +0200, hubert depesz lubaczewski wrote:
> On Mon, Sep 22, 2008 at 05:59:25PM +0100, Joao Ferreira gmail wrote:
> > I'm unable to build a LIKE or SIMILAR TO expression for matching and ip
> > address
> > 192.168.90.3
> > 10.3.2.1
> > any help please...
>
> use this regular expression:
>
> '^[0-9]{1,3}(.[0-9]{1,3}){3}$'
>
> warning: do not use "like" or "similar to".
> proper way to use it:
>
> select * from table where field ~ '^[0-9]{1,3}(.[0-9]{1,3}){3}$';
>
> this regexp is not 100% fault proof - it will happily return rows like:
> '300.999.998.7'
>
> but for most of the cases it should be enough. if you need it to match
> only ips, and filter out things like '300.999.999.999' - let me know.
>
> Best regards,
>
> depesz
>


Re: match an IP address

From
"Phoenix Kiula"
Date:
>  Please forgive my attempt to help you based on a woefully insufficient
> description of your problem and situation. I will not make any attempt to do
> so again.



Actually it was not my problem, this is a thread started by some one
else. I use Gmail so I see the entire thread as a "conversation" and
the context is maintained. You should try it. Anyway, sorry that you
feel bad.

To others: thanks for your suggestions, but this issue is not one of
session IDs, nor is it solved by storing IP addresses separately
(which does not assume 1:1 correlation between user and IP). We'll let
that be.

Let's just say that in *many* online situations it is vital for
querying speed to have the same column that stores users -- both
registered and unregistered. A query in SQL that matches against an IP
address regexp to identify the unregistered ones may work for some
with smaller databases, which is great, and if it doesn't (the "~"
match is simply not practical for large busy websites), then consider
a small separate column that stores the registration status as a flag.

Thanks.

Re: match an IP address

From
Craig Ringer
Date:
Phoenix Kiula wrote:

> Ever tried this crap on a table of 10 million records on a live
>  website, where this query is happening at 3000 times per second? No
>  such function schtick will match the raw speed of a simpler indexed
>  query. Or did you mean my index should contain the COALESCE already?

Hmm. My previous response may have been overly grumpy.

The point I was *trying* to make is that shoving a username/id and an IP
address into a single field is probably not ideal. At least in my
experience you pay for this sort of optimisation (if it even works out
as an optimisation in the first place) down the track. I have the
misfortunate to have to administrate a system full of such multi-use
fields, and have developed a real loathing for the approach.

If you don't want to store IPs for registered users, I'd use:

user_id INTEGER,
ip cidr,
CONSTRAINT must_have_userstamp
CHECK ( user_id IS NOT NULL OR ip IS NOT NULL)

... and yes, I'd use a functional index to look it up, or even a
trigger-maintained cache of the text representation if I had to. Then
again, I guess I'm lucky enough to work in environments where data
integrity and correctness is a priority and the resources available are
a good fit to the tasks the database needs to do.

--
Craig Ringe

Re: match an IP address

From
"Phoenix Kiula"
Date:
>  If you don't want to store IPs for registered users, I'd use:
>
>  user_id INTEGER,
>  ip cidr,
>  CONSTRAINT must_have_userstamp
>  CHECK ( user_id IS NOT NULL OR ip IS NOT NULL)
>
>  ... and yes, I'd use a functional index to look it up, or even a
>  trigger-maintained cache of the text representation if I had to. Then


Ok, this is an idea. And I do agree that multifunction fields are a
potential pain in the distant future.

My questions:

1. What extra tax will this constraint levy on an INSERT or UPDATE on
this table? There are about 100,000 inserts a day, and over three
times as many UPDATES. The concurrency is pretty high -- I mean
sometimes 1,000 users at the same time but no more than that. If the
additional cost of insertion/updating is not too heavy, I suppose this
could be a nice approach.

2. Why not have an INET field...why a CIDR? What's the benefit? It
stores those pesky ".../8" type additional data which one has to mask
with functions. Would INET work just as well?

3. Storage wise does this add significantly? How much space does an
INET field take as opposed to, say, a VARCHAR field?

4. Most importantly, how would you structure the index for this? I
would much rather have a fast "=" in my sql's WHERE clause. No "OR"
etc. Any thoughts?

Thanks

Re: match an IP address

From
Tino Wildenhain
Date:
Hi,

Phoenix Kiula wrote:
>>  Please forgive my attempt to help you based on a woefully insufficient
>> description of your problem and situation. I will not make any attempt to do
>> so again.
>
> To others: thanks for your suggestions, but this issue is not one of
> session IDs, nor is it solved by storing IP addresses separately
> (which does not assume 1:1 correlation between user and IP). We'll let
> that be.
>
> Let's just say that in *many* online situations it is vital for
> querying speed to have the same column that stores users -- both
> registered and unregistered. A query in SQL that matches against an IP

if not registered, where is the user coming from? The IP is clearly not
an identifier for a user. You (and the OP) should disregard that idea.

> address regexp to identify the unregistered ones may work for some
> with smaller databases, which is great, and if it doesn't (the "~"
> match is simply not practical for large busy websites), then consider
> a small separate column that stores the registration status as a flag.

The user id itself would serve as that flag. If non NULL -> user known,
otherwise unknown. Sounds easy, no? No regex at all! :)


> Thanks.
>
Thx ;)

Tino

Attachment

Re: match an IP address

From
Marcus Engene
Date:
Phoenix Kiula wrote:
>>  If you don't want to store IPs for registered users, I'd use:
>>
>>  user_id INTEGER,
>>  ip cidr,
>>  CONSTRAINT must_have_userstamp
>>  CHECK ( user_id IS NOT NULL OR ip IS NOT NULL)
>>
>>  ... and yes, I'd use a functional index to look it up, or even a
>>  trigger-maintained cache of the text representation if I had to. Then
>>
>
>
> Ok, this is an idea. And I do agree that multifunction fields are a
> potential pain in the distant future.
>
> My questions:
>
> 1. What extra tax will this constraint levy on an INSERT or UPDATE on
> this table? There are about 100,000 inserts a day, and over three
> times as many UPDATES. The concurrency is pretty high -- I mean
> sometimes 1,000 users at the same time but no more than that. If the
> additional cost of insertion/updating is not too heavy, I suppose this
> could be a nice approach.
>
> 2. Why not have an INET field...why a CIDR? What's the benefit? It
> stores those pesky ".../8" type additional data which one has to mask
> with functions. Would INET work just as well?
>
> 3. Storage wise does this add significantly? How much space does an
> INET field take as opposed to, say, a VARCHAR field?
>
> 4. Most importantly, how would you structure the index for this? I
> would much rather have a fast "=" in my sql's WHERE clause. No "OR"
> etc. Any thoughts?
>
> Thanks
>
>
Use the best of two worlds - consider memcached and use the db only when
you create/update an entry so that you can restore it if memcached
(perhaps as a consequence of a server reboot) gets restarted.
http://www.slideshare.net/vishnu/livejournals-backend-a-history-of-scaling

best regards,
Marcus


Re: match an IP address

From
Craig Ringer
Date:
Phoenix Kiula wrote:

> 1. What extra tax will this constraint levy on an INSERT or UPDATE on
> this table? There are about 100,000 inserts a day, and over three
> times as many UPDATES. The concurrency is pretty high -- I mean
> sometimes 1,000 users at the same time but no more than that. If the
> additional cost of insertion/updating is not too heavy, I suppose this
> could be a nice approach.

The best answer there is to do some testing. I wouldn't expect much of a
cost, but would recommend testing it to be sure.

> 2. Why not have an INET field...why a CIDR? What's the benefit? It
> stores those pesky ".../8" type additional data which one has to mask
> with functions. Would INET work just as well?

Yes, it would. I was just getting my types muddled.

> 3. Storage wise does this add significantly? How much space does an
> INET field take as opposed to, say, a VARCHAR field?

AFAIK nulls are not stored, they're just flagged in the null bitmap. As
such, there should be no or almost no storage cost.

> 4. Most importantly, how would you structure the index for this? I
> would much rather have a fast "=" in my sql's WHERE clause. No "OR"
> etc. Any thoughts?

I'd try a functional index first. If that didn't do the job, I'd use a
trigger-maintained column _purely_ as an optimisation (ie I could drop
it and lose no data) that stored text representations of the data.
Honestly, though, I expect the functional index would be more than good
enough and probably wouldn't have much of an INSERT/UPDATE cost.

Again, of course, I'd test before setting anything in stone.

--
Craig Ringer

--
Craig Ringer

Re: match an IP address

From
Tino Wildenhain
Date:
Craig Ringer wrote:
> Phoenix Kiula wrote:
>
>> 1. What extra tax will this constraint levy on an INSERT or UPDATE on
>> this table? There are about 100,000 inserts a day, and over three
>> times as many UPDATES. The concurrency is pretty high -- I mean
>> sometimes 1,000 users at the same time but no more than that. If the
>> additional cost of insertion/updating is not too heavy, I suppose this
>> could be a nice approach.
>
...
> I'd try a functional index first. If that didn't do the job, I'd use a
> trigger-maintained column _purely_ as an optimisation (ie I could drop
> it and lose no data) that stored text representations of the data.
> Honestly, though, I expect the functional index would be more than good
> enough and probably wouldn't have much of an INSERT/UPDATE cost.
>
> Again, of course, I'd test before setting anything in stone.

And do not forget you know beforehand if you are going to lookup a user
or an IP. So I fail to see what problem needs to be solved here :(

Maybe we can get an enlightenment on where every data comes from and
whats going to be its usage? Maybe we can then work out a better
solution at all? Peephole optimizing is great but sometimes the effect
is better if you just open the door :-)

Tino

Attachment

Re: match an IP address

From
Steve Atkins
Date:
On Sep 23, 2008, at 12:26 AM, hubert depesz lubaczewski wrote:

> On Mon, Sep 22, 2008 at 05:59:25PM +0100, Joao Ferreira gmail wrote:
>> I'm unable to build a LIKE or SIMILAR TO expression for matching
>> and ip
>> address
>> 192.168.90.3
>> 10.3.2.1
>> any help please...
>
> use this regular expression:
>
> '^[0-9]{1,3}(.[0-9]{1,3}){3}$'
>
> warning: do not use "like" or "similar to".
> proper way to use it:
>
> select * from table where field ~ '^[0-9]{1,3}(.[0-9]{1,3}){3}$';
>
> this regexp is not 100% fault proof - it will happily return rows
> like:
> '300.999.998.7'

It'll also return rows like 3L33T0o7 - which normally I'd not be too
bothered about, but on a forum that's a fairly typical username. :)

Gotta escape that period:

^[0-9]{1,3}(?:\.[0-9]{1,3}){3}$


Cheers,
   Steve


Re: match an IP address

From
"Phoenix Kiula"
Date:
<...snip...>
>
> I'd try a functional index first. If that didn't do the job, I'd use a
>  trigger-maintained column _purely_ as an optimisation (ie I could drop
>  it and lose no data) that stored text representations of the data.
>  Honestly, though, I expect the functional index would be more than good
>  enough and probably wouldn't have much of an INSERT/UPDATE cost.
<.../snip...>


I made a test table and tried a functional index (coalescing the two
columns). Works!

Now let me try that on a mirror on my full real table with millions of
rows and report back. Sounds like a cleaner design so I may be on to
something!

Hope this helps the original poster with some ideas too.