Thread: int4 vs varchar to store ip addr
Hello, I have an authorization table that associates 1 customer IP to a service IP to determine a TTL (used by a radius server). table auth client varchar(15); service varchar(15); ttl int4; client and service are both ip addr. The number of distinct clients can be rather large (say around 4 million) and the number of distinct service around 1000. table auth can contain between 10 M and 20 M lines. there's a double index on ( client , service ). Since I would like to maximize the chance to have the whole table cached by the OS (linux), I'd like to reduce the size of the table by replacing the varchar by another data type more suited to store ip addr. I could use PG internal inet/cidr type to store the ip addrs, which would take 12 bytes per IP, thus gaining a few bytes per row. Apart from gaining some bytes, would the btree index scan be faster with this data type compared to plain varchar ? Also, in my case, I don't need the mask provided by inet/cidr ; is there a way to store an IPV4 addr directly into an INT4 but using the same syntax as varchar or inet/cidr (that is I could use '192.12.18.1' for example), or should I create my own data type and develop the corresponding function to convert from a text input to an int4 storage ? This would really reduce the size of the table, since it would need 3 int4 for client/service/ttl and I guess index scan would be faster with int4 data that with varchar(15) ? Thanks for any input. Nicolas
* Pomarede Nicolas: > I could use PG internal inet/cidr type to store the ip addrs, which > would take 12 bytes per IP, thus gaining a few bytes per row. I thought it's down to 8 bytes in PostgreSQL 8.2, but I could be mistaken. > Apart from gaining some bytes, would the btree index scan be faster > with this data type compared to plain varchar ? It will be faster because less I/O is involved. For purposes like yours, there is a special ip4 type in a contributed package which brings down the byte count to 4. I'm not sure if it's been ported to PostgreSQL 8.2 yet. -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99
Nicolas wrote: > I have an authorization table that associates 1 customer IP to a service > IP to determine a TTL (used by a radius server). > > table auth > client varchar(15); > service varchar(15); > ttl int4; > client and service are both ip addr. > > The number of distinct clients can be rather large (say around 4 > million) and the number of distinct service around 1000. > > there's a double index on ( client , service ). It comes to mind another solution... I don't know if it is better or worse, but you could give it a try. Store IP addresses as 4 distinct columns, like the following: CREATE TABLE auth ( client_ip1 shortint, client_ip2 shortint, client_ip3 shortint, client_ip4 shortint, service varchar(15), ttl int4, ); And then index by client_ip4/3/2/1, then service. CREATE INDEX auth_i1 ON auth (client_ip4, client_ip3, client_ip2, client_ip1); or: CREATE INDEX auth_i1 ON auth (client_ip4, client_ip3, client_ip2, client_ip1, service); I'm curious to know from pg internals experts if this could be a valid idea or is totally non-sense. Probably the builtin ip4 type is better suited for these tasks? -- Cosimo
On Mon, 29 Jan 2007, Florian Weimer wrote: > * Pomarede Nicolas: > >> I could use PG internal inet/cidr type to store the ip addrs, which >> would take 12 bytes per IP, thus gaining a few bytes per row. > > I thought it's down to 8 bytes in PostgreSQL 8.2, but I could be > mistaken. > >> Apart from gaining some bytes, would the btree index scan be faster >> with this data type compared to plain varchar ? > > It will be faster because less I/O is involved. > > For purposes like yours, there is a special ip4 type in a contributed > package which brings down the byte count to 4. I'm not sure if it's > been ported to PostgreSQL 8.2 yet. Yes thanks for this reference, ip4r package seems to be a nice addition to postgres for what I'd like to do. Does someone here have some real life experience with it (regarding performance and stability) ? Also, is it possible that this package functionalities' might be merged into postgres one day, I think the benefit of using 4 bytes to store an ipv4 addr could be really interesting for some case ? thanks, ---------------- Nicolas Pomarede e-mail: npomarede@corp.free.fr "In a world without walls and fences, who needs windows and gates ?"
On 1/30/07, Pomarede Nicolas <npomarede@corp.free.fr> wrote: > On Mon, 29 Jan 2007, Florian Weimer wrote: > > > * Pomarede Nicolas: > > > >> I could use PG internal inet/cidr type to store the ip addrs, which > >> would take 12 bytes per IP, thus gaining a few bytes per row. > > > > I thought it's down to 8 bytes in PostgreSQL 8.2, but I could be > > mistaken. > > > >> Apart from gaining some bytes, would the btree index scan be faster > >> with this data type compared to plain varchar ? > > > > It will be faster because less I/O is involved. > > > > For purposes like yours, there is a special ip4 type in a contributed > > package which brings down the byte count to 4. I'm not sure if it's > > been ported to PostgreSQL 8.2 yet. > > Yes thanks for this reference, ip4r package seems to be a nice addition to > postgres for what I'd like to do. Does someone here have some real life > experience with it (regarding performance and stability) ? I'm using IP4 and have not had a problem with it in 8.2 (or 8.1) in terms of stability. As I designed my DB using it, I don't really have any comparisons to inet and/or varchar. One of the most useful things for me is the ability to create a GIST index to support determination of range inclusion (i.e. 192.168.23.1 is in the 192.168/16 network range), although it doesn't sound like this would be useful to you. -Mike