Thread: Null comparisons and the transform_null_equals run-time parameter

Null comparisons and the transform_null_equals run-time parameter

From
"Ken Winter"
Date:

When the run-time parameter transform_null_equals is on, shouldn’t two variables with NULL values evaluate as equal?  They don’t seem to. 

 

At the bottom of this message is a little test function.  It tries all comparisons of NULL-valued variables and NULL constants, both before and after turning transform_null_equals on.  Here’s what it returns:

 

transform_null_equals OFF: NULL = NULL -> Unknown

transform_null_equals OFF: v1 = NULL -> Unknown

transform_null_equals OFF: NULL = v2 -> Unknown

transform_null_equals OFF: v1 = v2 -> Unknown

transform_null_equals ON: NULL = NULL -> True

transform_null_equals ON: v1 = NULL -> True

transform_null_equals ON: NULL = v2 -> True

transform_null_equals ON: v1 = v2 -> Unknown

 

My problem is in the last line: Comparing two NULL variables produces an unknown result.  I need it to evaluate as True, like the preceding three comparisons.

 

Any suggestions?

 

~ TIA

~ Ken

 

 

CREATE OR REPLACE FUNCTION test() RETURNS varchar AS

$BODY$

DECLARE

      v1 VARCHAR;

      v2 VARCHAR;

      s VARCHAR := '';

BEGIN

      v1 := Null;

      v2 := Null;

      IF NULL = NULL THEN

            s := s || 'transform_null_equals OFF: NULL = NULL -> True ';

      ELSIF NOT (NULL = NULL) THEN

            s := s || 'transform_null_equals OFF: NULL = NULL -> False ';

      ELSE

            s := s || 'transform_null_equals OFF: NULL = NULL -> Unknown ';

      END IF;

      s := s || chr(10);

      IF v1 = NULL THEN

            s := s || 'transform_null_equals OFF: v1 = NULL -> True ';

      ELSIF NOT (v1 = NULL) THEN

            s := s || 'transform_null_equals OFF: v1 = NULL -> False ';

      ELSE

            s := s || 'transform_null_equals OFF: v1 = NULL -> Unknown ';

      END IF;

      s := s || chr(10);

      IF NULL = v2 THEN

            s := s || 'transform_null_equals OFF: NULL = v2 -> True ';

      ELSIF NOT (NULL = v2) THEN

            s := s || 'transform_null_equals OFF: NULL = v2 -> False ';

      ELSE

            s := s || 'transform_null_equals OFF: NULL = v2 -> Unknown ';

      END IF;

      s := s || chr(10);

      IF v1 = v2 THEN

            s := s || 'transform_null_equals OFF: v1 = v2 - > True ';

      ELSIF NOT v1 = v2 THEN

            s := s || 'transform_null_equals OFF: v1 = v2 -> False ';

      ELSE

            s := s || 'transform_null_equals OFF: v1 = v2 -> Unknown ';

      END IF;

      s := s || chr(10);

 

      SET LOCAL transform_null_equals TO ON;

 

      IF NULL = NULL THEN

            s := s || 'transform_null_equals ON: NULL = NULL -> True ';

      ELSIF NOT (NULL = NULL) THEN

            s := s || 'transform_null_equals ON: NULL = NULL -> False ';

      ELSE

            s := s || 'transform_null_equals ON: NULL = NULL -> Unknown ';

      END IF;

      s := s || chr(10);

      IF v1 = NULL THEN

            s := s || 'transform_null_equals ON: v1 = NULL -> True ';

      ELSIF NOT (v1 = NULL) THEN

            s := s || 'transform_null_equals ON: v1 = NULL -> False ';

      ELSE

            s := s || 'transform_null_equals ON: v1 = NULL -> Unknown ';

      END IF;

      s := s || chr(10);

      IF NULL = v2 THEN

            s := s || 'transform_null_equals ON: NULL = v2 -> True ';

      ELSIF NOT (NULL = v2) THEN

            s := s || 'transform_null_equals ON: NULL = v2 -> False ';

      ELSE

            s := s || 'transform_null_equals ON: NULL = v2 -> Unknown ';

      END IF;

      s := s || chr(10);

      IF v1 = v2 THEN

            s := s || 'transform_null_equals ON: v1 = v2 -> True ';

      ELSIF NOT v1 = v2 THEN

            s := s || 'transform_null_equals ON: v1 = v2 -> False ';

      ELSE

            s := s || 'transform_null_equals ON: v1 = v2 -> Unknown ';

      END IF;

     

      RETURN s;  

END;

$BODY$

LANGUAGE plpgsql VOLATILE;

 

 

SELECT test();

Re: Null comparisons and the transform_null_equals run-time parameter

From
Tom Lane
Date:
"Ken Winter" <ken@sunward.org> writes:
> When the run-time parameter transform_null_equals is on, shouldn't two
> variables with NULL values evaluate as equal?

No.  That setting does not change the runtime behavior of comparison.
The only thing it does is change the literal syntax "something = NULL"
to "something IS NULL".

            regards, tom lane

Re: Null comparisons and the transform_null_equals run-time parameter

From
"Ken Winter"
Date:
I get it.  Thanks, Tom.

In case anybody else needs this functionality, let me offer a family of
functions that do comparisons that treat NULL as a real value (rather than
as "unknown").  For example:

CREATE OR REPLACE FUNCTION eqnull(varchar, varchar) RETURNS boolean AS
$BODY$
/*
Return True if both args have the same non-NULL values
or both args are NULL; otherwise False.
*/
DECLARE
    v1 ALIAS FOR $1;
    v2 ALIAS FOR $2;
BEGIN
    -- NULL = NULL
    IF v1 IS NULL AND v2 IS NULL THEN
        RETURN True;
    -- NULL != Any non-NULL value
    ELSIF v1 IS NULL AND v2 IS NOT NULL THEN
        RETURN False;
    -- Any non-NULL value != NULL
    ELSIF v1 IS NOT NULL AND v2 IS NULL THEN
        RETURN False;
    -- Non-NULL value = non-NULL value
    ELSIF v1 = v2 THEN
        RETURN True;
    -- Non-NULL value != non-NULL value
    ELSE
        RETURN False;
    END IF;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
;

You need a separate function for each data type you want to compare; the
only difference among these functions is their argument types.  I've
implemented variants for VARCHAR, NUMERIC, TIMESTAMP, and BOOLEAN.

The reason I need this is that I'm writing functions to test my database
programming (triggers, rules, etc), and these tests sometimes need to treat
Null as just another value in doing test comparisons.

~ Ken


> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Saturday, June 05, 2010 9:41 AM
> To: Ken Winter
> Cc: 'PostgreSQL pg-general List'
> Subject: Re: [GENERAL] Null comparisons and the transform_null_equals run-
> time parameter
>
> "Ken Winter" <ken@sunward.org> writes:
> > When the run-time parameter transform_null_equals is on, shouldn't two
> > variables with NULL values evaluate as equal?
>
> No.  That setting does not change the runtime behavior of comparison.
> The only thing it does is change the literal syntax "something = NULL"
> to "something IS NULL".
>
>             regards, tom lane


Re: Null comparisons and the transform_null_equals run-time parameter

From
Steve Atkins
Date:
On Jun 5, 2010, at 9:46 AM, Ken Winter wrote:

> I get it.  Thanks, Tom.
>
> In case anybody else needs this functionality, let me offer a family of
> functions that do comparisons that treat NULL as a real value (rather than
> as "unknown").  For example:

You should take a look at "is not distinct from".

>
> The reason I need this is that I'm writing functions to test my database
> programming (triggers, rules, etc), and these tests sometimes need to treat
> Null as just another value in doing test comparisons.

You also might find http://pgtap.org/ useful.

Cheers,
  Steve


Re: Null comparisons and the transform_null_equals run-time parameter

From
"Ken Winter"
Date:
Steve ~

Thanks for the great tips.  See comments below.

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
> owner@postgresql.org] On Behalf Of Steve Atkins
> Sent: Saturday, June 05, 2010 1:04 PM
> To: PostgreSQL pg-general List
> Subject: Re: [GENERAL] Null comparisons and the transform_null_equals run-
> time parameter
>
>
> On Jun 5, 2010, at 9:46 AM, Ken Winter wrote:

> > In case anybody else needs this functionality, let me offer a family of
> > functions that do comparisons that treat NULL as a real value (rather
> than
> > as "unknown").  For example:
>
> You should take a look at "is not distinct from".

Wow, that's the same wheel I just reinvented!  Works fine in my test
functions.  Guess I can scrap my home-brewed functions that do the same
thing.

> > The reason I need this is that I'm writing functions to test my database
> > programming (triggers, rules, etc), and these tests sometimes need to
> treat
> > Null as just another value in doing test comparisons.
>
> You also might find http://pgtap.org/  useful.

Yes indeed, looks very promising.

I see there are also test frameworks available at
http://en.dklab.ru/lib/dklab_pgunit/ and http://www.epictest.org/ .

This will take me a while to digest, but hopefully I'll be able to scrap, or
greatly simplify, my elaborate homemade test functions too.  It's always a
joy to be able to replace my own code with somebody else's.

~ Ken


Re: Null comparisons and the transform_null_equals run-time parameter

From
Tom Lane
Date:
"Ken Winter" <ken@sunward.org> writes:
> In case anybody else needs this functionality, let me offer a family of
> functions that do comparisons that treat NULL as a real value (rather than
> as "unknown").  For example:

Er ... this'd be a lot shorter using IS [NOT] DISTINCT FROM.

            regards, tom lane

Dell Poweredge server and Postgres

From
u235sentinel
Date:
I'm curious if anyone has had any experiences (good and bad) using
Postgres on Dell PowerEdge servers.

My manager and I are looking at replacing a Sun x4540 server with a Dell
server connected to a disk subsystem (or two).  We're looking at the
R710 servers connected to an MD1220 I believe (I'd have to look again at
the quote).

The concern we have is our 4540 has a 2TB database which is working
great.  The server has 48 hard drives (250 gig drives) in RAID 10 across
6 disk controllers.  A couple HBA controllers connected to a couple
dozen disks may be slower (though dell assures us it will be faster than
our Sun box).

I thought I'd toss this out and see if anyone has any thoughts on this.
I'm inclined to try it.  The drives quoted are 15k drives and the PERC
controller has cache vs. the Sun controllers have no cache AFAIK.

BTW, in the next few months I believe we're be hitting the 2.5-3 TB size
for our database.  The tables are partitioned which helps a lot.
Performance would be a problem otherwise with that much data I think :-)

Thanks!

Re: Dell Poweredge server and Postgres

From
John R Pierce
Date:
u235sentinel wrote:
> I'm curious if anyone has had any experiences (good and bad) using
> Postgres on Dell PowerEdge servers.

Poweredge is a brandname that describes every single server Dell has
sold since about 15 years ago, maybe more.   I had a Poweredge that was
a dual 600Mhz pentium-III w/ 1GB ram and 4 x 72gb scsi disks.

> My manager and I are looking at replacing a Sun x4540 server with a
> Dell server connected to a disk subsystem (or two).  We're looking at
> the R710 servers connected to an MD1220 I believe (I'd have to look
> again at the quote).
>

why are you looking at replacing it?  you go onto say its working great.


> The concern we have is our 4540 has a 2TB database which is working
> great.  The server has 48 hard drives (250 gig drives) in RAID 10
> across 6 disk controllers.  A couple HBA controllers connected to a
> couple dozen disks may be slower (though dell assures us it will be
> faster than our Sun box).

of course they do, they want to sell you their iorn...    do they have
benchmarks of their proposed configuration vs a x4540 doing the sorts of
tasks you require to back up their claim?     The MD1220 is connected to
the host with a single X4 SAS cable.

btw, the x4540 has each disk on its own SATA channel, 8 channels to the
SATA chip, and each SATA chip on a PCI-E x4 bus, so it has sufficient IO
backplane bandwidth to keep all 48 disks busy at once.      ZFS is
extremely good at this.    The Sun Thumpers have been benchmarked with
rather high IOPS numbers, that few other sorts of systems can sustain in
real world tests.

you can make ZFS on a thumper even faster by using a couple SSDs for the
ZIL logs.

> I thought I'd toss this out and see if anyone has any thoughts on
> this.  I'm inclined to try it.  The drives quoted are 15k drives and
> the PERC controller has cache vs. the Sun controllers have no cache
> AFAIK.

Solaris and ZFS use main memory as the cache.   main memory is faster
than any memory out on an IO controller.

>
> BTW, in the next few months I believe we're be hitting the 2.5-3 TB
> size for our database.  The tables are partitioned which helps a lot.
> Performance would be a problem otherwise with that much data I think :-)

upgrade your thumper to 1TB drives and go to town.    They -should- be
Sun approved drives, however, as there is all sorts of room for
sketchiness using the wrong SATA disks.

Re: Dell Poweredge server and Postgres

From
Scott Marlowe
Date:
On Sat, Jun 5, 2010 at 5:27 PM, u235sentinel <u235sentinel@gmail.com> wrote:
> I'm curious if anyone has had any experiences (good and bad) using Postgres
> on Dell PowerEdge servers.

I've had lots of experience with Dell, most of it poor.  Wrong upgrade
CPUs, use of non-buffered memory meaning I can't max out my machine's
RAM, mediocre performance from most PERC RAID controllers.  I just got
a quote for a 4 CPU / 48 core AMD Magny Cours with 32 15k6 seagates
and 128Gig ram from my favorite white box supplier, with a 5 year
warranty and great support for $25k including shipping.  If Dell can
deliver that much horsepower for that price let me know.

Re: Dell Poweredge server and Postgres

From
Greg Smith
Date:
u235sentinel wrote:
> The concern we have is our 4540 has a 2TB database which is working
> great.  The server has 48 hard drives (250 gig drives) in RAID 10
> across 6 disk controllers.  A couple HBA controllers connected to a
> couple dozen disks may be slower (though dell assures us it will be
> faster than our Sun box).

A Dell system running a PERC with battery-backed write controller will
be faster on database writes than your 4540.  Those Sun boxes are
terrible at OLTP style workloads in particular, the types of writes
PostgreSQL does can't be cached by the hard drives in the Sun Thumper
systems.  It's possible to bottleneck at ~100 write transactions/second
on them given a particularly incompatible application (I wrote one once,
learned the hard way).

The flip side is that you can absolutely approach 2GB/s on reads on your
Sun system, and I'd expect the Dell one will bottleneck somewhere in the
500MB-1GB/s range no matter how many controllers or drives you put into
it.  If your workload is so read heavy that you won't see any advantage
from the write cache you're missing in your Sun box, it's absolutely
possible for the Dell system to be a step backwards.  A big Thumper box
will chug away doing big reads against a 2TB database like it's no
problem at all, as you already know.

While I generally dislike Dell, on the hard drive side of things the
current PERC controllers are rebranded LSI Logic ones, and those seem to
have the best performance while keeping reasonable reliability trade-off
available right now.  Dell just happens to have one of the best known
formulas for building this sort of server right now.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: Dell Poweredge server and Postgres

From
u235sentinel
Date:
On 06/05/2010 07:20 PM, John R Pierce wrote:
>
>> My manager and I are looking at replacing a Sun x4540 server with a
>> Dell server connected to a disk subsystem (or two).  We're looking at
>> the R710 servers connected to an MD1220 I believe (I'd have to look
>> again at the quote).
>>
>
> why are you looking at replacing it?  you go onto say its working great.

The server is close to it's EOL.  We're looking at extending the support
contract and warranty for another 3 years.  Even if we do that, we're
looking at purchasing at least one more system and replicating the data
to a DR site.  With 2 TB of data it would take us a couple days to
recover from a backup.  Much longer than we want the database down in
case of failure.

The new Oracle/Sun company has changed the config for the 4540.  The
smallest drives you can get are 1TB drives which more than doubles the
price of the server.  So we're looking at Dell, IBM, HP and so on.

I figured their sales / marketing would say they are faster than the Sun
systems.  I don't know a single sales geek who wouldn't say that.  I
have asked for comparisons and will be following up on it tomorrow.  I'd
like to see their numbers.  I figured I'd ask for opinions.  Guess I
received a few :-)

thanks!


Re: Dell Poweredge server and Postgres

From
u235sentinel
Date:
On 06/06/2010 02:04 AM, Greg Smith wrote:
> u235sentinel wrote:
>
> A Dell system running a PERC with battery-backed write controller will
> be faster on database writes than your 4540.  Those Sun boxes are
> terrible at OLTP style workloads in particular, the types of writes
> PostgreSQL does can't be cached by the hard drives in the Sun Thumper
> systems.  It's possible to bottleneck at ~100 write
> transactions/second on them given a particularly incompatible
> application (I wrote one once, learned the hard way).

Hmmm.. thanks for the wake up.  I completely spaced on this. Cache for
writing may be faster but cache for reading the data?  Especially with
as much as we're pulling off.  Their controllers can do read caching but
I don't think it will make up the difference between the two.  I'm
guessing it will be slower even in a raid 10 configuration (the 4540 was
setup with a software raid 10 with the ZFS filesystem).

>
> The flip side is that you can absolutely approach 2GB/s on reads on
> your Sun system, and I'd expect the Dell one will bottleneck somewhere
> in the 500MB-1GB/s range no matter how many controllers or drives you
> put into it.  If your workload is so read heavy that you won't see any
> advantage from the write cache you're missing in your Sun box, it's
> absolutely possible for the Dell system to be a step backwards.  A big
> Thumper box will chug away doing big reads against a 2TB database like
> it's no problem at all, as you already know.
Makes sense.  The database is only now 2TB and growing still.  Currently
we can grow up to around 5 TB.  I'm leery of doing this but we'll be
asking for their comparison reports.  I'm very curious.

Thanks!


Re: Dell Poweredge server and Postgres

From
Vick Khera
Date:
On Sat, Jun 5, 2010 at 11:38 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Sat, Jun 5, 2010 at 5:27 PM, u235sentinel <u235sentinel@gmail.com> wrote:
>> I'm curious if anyone has had any experiences (good and bad) using Postgres
>> on Dell PowerEdge servers.
>
> I've had lots of experience with Dell, most of it poor.  Wrong upgrade
> CPUs, use of non-buffered memory meaning I can't max out my machine's
> RAM, mediocre performance from most PERC RAID controllers.  I just got

Ditto.  Of late I'm buying HPs, but I haven't yet put one into
database service.  Our DB servers are all currently Sun with fibre
channel cards to external RAID systems.

Re: Dell Poweredge server and Postgres

From
u235sentinel
Date:
On 06/07/2010 12:13 PM, Vick Khera wrote:
>
> Ditto.  Of late I'm buying HPs, but I haven't yet put one into
> database service.  Our DB servers are all currently Sun with fibre
> channel cards to external RAID systems.
>
>

What kind of external RAID systems do you connect your Sun servers to?
I've talked to Oracle/Sun and haven't been able to get a solution even
similar to the 4540 systems.  I'm hoping to find something that will
allow a couple disk controllers to a subsystem.  Even one would be an
improvement.  That way if I have to I can setup a ZFS pool in whatever
RAID config I want across multiple controllers and disks.  I'm figuring
a 2TB database will choke if I only have one controller handling more
than a dozen or so disks.

Dell's solution doesn't sound right to me.  We've looked at HP.  They
are more expensive with similar hardware to what Dell is offering.

Re: Dell Poweredge server and Postgres

From
Scott Marlowe
Date:
On Mon, Jun 7, 2010 at 7:43 PM, u235sentinel <u235sentinel@gmail.com> wrote:
> On 06/07/2010 12:13 PM, Vick Khera wrote:
>>
>> Ditto.  Of late I'm buying HPs, but I haven't yet put one into
>> database service.  Our DB servers are all currently Sun with fibre
>> channel cards to external RAID systems.
>>
>>
>
> What kind of external RAID systems do you connect your Sun servers to?  I've
> talked to Oracle/Sun and haven't been able to get a solution even similar to
> the 4540 systems.  I'm hoping to find something that will allow a couple
> disk controllers to a subsystem.  Even one would be an improvement.  That
> way if I have to I can setup a ZFS pool in whatever RAID config I want
> across multiple controllers and disks.  I'm figuring a 2TB database will
> choke if I only have one controller handling more than a dozen or so disks.

Where I work we use these:

http://www.pc-pitstop.com/sata_enclosures/scsas16rm.asp

for when we need lots of throughput (file servers).  They allow four
SAS connectors instead of the typical one or two.

and will be using these:

http://www.aberdeeninc.com/abcatg/kitjbod-1003.htm

for our database servers, where IOPS is far more important than seq speed.

My experience has been that the number of RAID controllers is no where
near as important as the speed of said RAID controllers.  I'd rather
have a very fast RAID controller handling 16 disks at once, than 4
mediocre ones handling 4 disks each.  The optimal is to have two RAID
controllers, so you have redundancy.  Most decent RAID controllers can
run RAID-1 across the two and then RAID-0 over those RAID-1 pairs
(either software or hardware, depending on OS and hardware
performance).
>
> Dell's solution doesn't sound right to me.  We've looked at HP.  They are
> more expensive with similar hardware to what Dell is offering.

Re: Dell Poweredge server and Postgres

From
Vick Khera
Date:
On Mon, Jun 7, 2010 at 9:43 PM, u235sentinel <u235sentinel@gmail.com> wrote:
> What kind of external RAID systems do you connect your Sun servers to?  I've
> talked to Oracle/Sun and haven't been able to get a solution even similar to
> the 4540 systems.  I'm hoping to find something that will allow a couple
> disk

Our primary DB server is a Sun X4200 M2 with 20Gb RAM with a "Dual
LSILogic FC7X04X 4Gb/s FC PCI-Express Adapter" fibre channel card
plugged into it, directly connected to a SurfRAID Triton 16 array from
Partners Data Systems.  The SurfRAID does the RAID using its internal
controlled.  It has 16 SATA drives and 2GB of cache.  I bought the
whole system configured and burn-in tested from Partners Data Systems.
 I highly recommend them.

Our backup DB server is almost identical.  The only difference is that
it is a Sun X4200 and uses the PCI-X version of the LSI card.

If you're looking to do ZFS you can set this disk system to be in JBOD
mode and it will easily handle all the data for you; I don't think
you'll need multiple controllers.

Re: Dell Poweredge server and Postgres

From
u235sentinel
Date:
On 06/07/2010 08:01 PM, Scott Marlowe wrote:
>
> Where I work we use these:
>
> http://www.pc-pitstop.com/sata_enclosures/scsas16rm.asp
>
> for when we need lots of throughput (file servers).  They allow four
> SAS connectors instead of the typical one or two.
>
> and will be using these:
>
> http://www.aberdeeninc.com/abcatg/kitjbod-1003.htm
>
> for our database servers, where IOPS is far more important than seq speed.
>
> My experience has been that the number of RAID controllers is no where
> near as important as the speed of said RAID controllers.  I'd rather
> have a very fast RAID controller handling 16 disks at once, than 4
> mediocre ones handling 4 disks each.  The optimal is to have two RAID
> controllers, so you have redundancy.  Most decent RAID controllers can
> run RAID-1 across the two and then RAID-0 over those RAID-1 pairs
> (either software or hardware, depending on OS and hardware
> performance).
>

I appreciate it.  I'll chat with management about these.  Thanks for the tip

Re: Dell Poweredge server and Postgres

From
u235sentinel
Date:
On 06/07/2010 08:08 PM, Vick Khera wrote:
>
> Our primary DB server is a Sun X4200 M2 with 20Gb RAM with a "Dual
> LSILogic FC7X04X 4Gb/s FC PCI-Express Adapter" fibre channel card
> plugged into it, directly connected to a SurfRAID Triton 16 array from
> Partners Data Systems.  The SurfRAID does the RAID using its internal
> controlled.  It has 16 SATA drives and 2GB of cache.  I bought the
> whole system configured and burn-in tested from Partners Data Systems.
>   I highly recommend them.
>
> Our backup DB server is almost identical.  The only difference is that
> it is a Sun X4200 and uses the PCI-X version of the LSI card.
>
> If you're looking to do ZFS you can set this disk system to be in JBOD
> mode and it will easily handle all the data for you; I don't think
> you'll need multiple controllers.
>
>

Thanks for the help.  Sounds like we have a few things to talk about
tomorrow in our management review :-)

thanks again!