Thread: MACADDR types NULL value (undocumented?)
hdb=> explain select * from host where mac = '00:00:00:00:00:00'; Seq Scan on host (cost=0.00..460.62 rows=1793 width=62) hdb=> explain select * from host where mac = '00:00:00:00:00:01'; Index Scan using host_mac on host (cost=0.00..145.07 rows=179 width=62) I take it that: 1) 00:00:00:00:00:00 is interpreted as NULL for MACADDR types (grr...) 2) NULL tests aren't optimised by indices (I know this already) This should be added to the documentation. I'm going to have to use a different value for my NULL, since I will need to be indexing those... Regards, Phil +----------------------------------+ | Phil Mayers, Network Support | | Centre for Computing Services | | Imperial College | +----------------------------------+
Judging by the row estimate drop from 1793 to 179 between explains, it looks as if '00:00:00:00:00:00' is the most common mac value in the host table. The second row count is roughly 1/10, or the default selectivity that postgres uses -- I belive -- for a value that is not the most common value (guesses row count by taking row count for most common val and multiplying by 1/10 -- which can be changed). In this situation, postgres beilieves that the '00:00:00:00:00:00' value occurs in your database enough to warrant a sequential scan over the table, instead of an index scan (which, when doing a large number of tuples, can be slower than a seq scan due to disk thrashing to get to the correct tuple location). If the query is faster when run as an index scan, just preceed the query with: SET ENABLE_SEQSCAN = 'OFF'; explain select * from host where mac = '00:00:00:00:00:00'; (You can also run these same two commands and compare the total cost from the explains, which would show that postgres belives the seq scan is cheaper) Here is more info on explain: http://www.postgresql.org/users-lounge/docs/7.0/user/c4884.htm And, to answer your other question, Postgres would not intrepret '00:00:00:00:00:00' as null. Michael Fork - CCNA - MCP - A+ Network Support - Toledo Internet Access - Toledo Ohio On Wed, 21 Mar 2001, Mayers, Philip J wrote: > hdb=> explain select * from host where mac = '00:00:00:00:00:00'; > > Seq Scan on host (cost=0.00..460.62 rows=1793 width=62) > > hdb=> explain select * from host where mac = '00:00:00:00:00:01'; > > Index Scan using host_mac on host (cost=0.00..145.07 rows=179 width=62) > > > > I take it that: > > 1) 00:00:00:00:00:00 is interpreted as NULL for MACADDR types (grr...) > 2) NULL tests aren't optimised by indices (I know this already) > > This should be added to the documentation. I'm going to have to use a > different value for my NULL, since I will need to be indexing those... > > Regards, > Phil > > +----------------------------------+ > | Phil Mayers, Network Support | > | Centre for Computing Services | > | Imperial College | > +----------------------------------+ >
Oops, yes, you're quite correct: There are some 17700 entries in the table, with 1793 being 00:00:00:00:00:00 - and that warrants a sequential scan, you're correct. Damn :o) Sorry all! Regards, Phil +----------------------------------+ | Phil Mayers, Network Support | | Centre for Computing Services | | Imperial College | +----------------------------------+ -----Original Message----- From: Michael Fork [mailto:mfork@toledolink.com] Sent: 21 March 2001 15:05 To: Mayers, Philip J Cc: 'pgsql-general@postgresql.org' Subject: Re: [GENERAL] MACADDR types NULL value (undocumented?) <snip concise, correct reply>
"Mayers, Philip J" <p.mayers@ic.ac.uk> writes: > Oops, yes, you're quite correct: There are some 17700 entries in the table, > with 1793 being 00:00:00:00:00:00 - and that warrants a sequential scan, > you're correct. Damn :o) If these all-zero entries (a) are really dummy values ("don't know" or "not applicable"), and (b) outnumber any specific real entry, then it would be worth your while to replace them with NULLs. The statistics stuff accounts for NULLs separately from not-nulls, so after a vacuum analyze you'd find the planner more able to make an intelligent choice about seq vs index scan on this table. regards, tom lane
I was under the impression that indices aren't used for "IS NULL" - I will need to be scanning for these moderately frequently (but less frequently than the others). In actual fact, it would be better (from an architectural point of view) for me for them to be NULL. Regards, Phil +----------------------------------+ | Phil Mayers, Network Support | | Centre for Computing Services | | Imperial College | +----------------------------------+ -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: 26 March 2001 15:45 To: Mayers, Philip J Cc: 'Michael Fork'; 'pgsql-general@postgresql.org' Subject: Re: [GENERAL] MACADDR types NULL value (undocumented?) "Mayers, Philip J" <p.mayers@ic.ac.uk> writes: > Oops, yes, you're quite correct: There are some 17700 entries in the table, > with 1793 being 00:00:00:00:00:00 - and that warrants a sequential scan, > you're correct. Damn :o) If these all-zero entries (a) are really dummy values ("don't know" or "not applicable"), and (b) outnumber any specific real entry, then it would be worth your while to replace them with NULLs. The statistics stuff accounts for NULLs separately from not-nulls, so after a vacuum analyze you'd find the planner more able to make an intelligent choice about seq vs index scan on this table. regards, tom lane
"Mayers, Philip J" <p.mayers@ic.ac.uk> writes: > I was under the impression that indices aren't used for "IS NULL" They're not (at present). But given those numbers you wouldn't want an indexscan when looking for nulls anyway, so where's the downside? regards, tom lane
Good point - thanks for the advice (Tom - where *do* you get the time to answer everyones questions? :o) Regards, Phil +----------------------------------+ | Phil Mayers, Network Support | | Centre for Computing Services | | Imperial College | +----------------------------------+ -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: 26 March 2001 16:05 To: Mayers, Philip J Cc: 'pgsql-general@postgresql.org' Subject: Re: [GENERAL] MACADDR types NULL value (undocumented?) "Mayers, Philip J" <p.mayers@ic.ac.uk> writes: > I was under the impression that indices aren't used for "IS NULL" They're not (at present). But given those numbers you wouldn't want an indexscan when looking for nulls anyway, so where's the downside? regards, tom lane