Thread: Performance of DOMAINs

Performance of DOMAINs

From
David Wheeler
Date:
Howdy,

Didn't see anything in the archives, so I thought I'd ask: has anyone
done any work to gauge the performance penalty of using DOMAINs? I'm
thinking of something like Elein's email DOMAIN:

   http://www.varlena.com/GeneralBits/

I figured that most simple domains that have a constraint check are
no faster or slower than tables with constraints that validate a
particular column. Is that the case?

But I'm also interested in how Elein made the email domain case-
insensitive, since I'd like to have/create a truly case-insensitive
text type (ITEXT anyone?). The functions for the operator class there
were mainly written in SQL, and if it adds a significant overhead,
I'm not sure it'd be a good idea to use that approach for a case-
insensitive text type, since I use it quite a lot in my apps, and
often do LIKE queries against text data. Thoughts?

Many TIA,

David

Re: Performance of DOMAINs

From
"Jim C. Nasby"
Date:
On Wed, Jun 21, 2006 at 11:26:16AM -0700, David Wheeler wrote:
> Howdy,
>
> Didn't see anything in the archives, so I thought I'd ask: has anyone
> done any work to gauge the performance penalty of using DOMAINs? I'm
> thinking of something like Elein's email DOMAIN:
>
>   http://www.varlena.com/GeneralBits/
>
> I figured that most simple domains that have a constraint check are
> no faster or slower than tables with constraints that validate a
> particular column. Is that the case?

Probably. Only thing that might pose a difference is if you're doing a
lot of manipulating of the domain that didn't involve table access;
presumably PostgreSQL will perform the checks every time you cast
something to a domain.

> But I'm also interested in how Elein made the email domain case-
> insensitive, since I'd like to have/create a truly case-insensitive
> text type (ITEXT anyone?). The functions for the operator class there

http://gborg.postgresql.org/project/citext/projdisplay.php

> were mainly written in SQL, and if it adds a significant overhead,
> I'm not sure it'd be a good idea to use that approach for a case-
> insensitive text type, since I use it quite a lot in my apps, and
> often do LIKE queries against text data. Thoughts?
>
> Many TIA,
>
> David
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>       choose an index scan if your joining column's datatypes do not
>       match
>

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Performance of DOMAINs

From
Tom Lane
Date:
David Wheeler <david@kineticode.com> writes:
> Didn't see anything in the archives, so I thought I'd ask: has anyone
> done any work to gauge the performance penalty of using DOMAINs?

There are some reports in the archives of particular usage patterns
where they pretty much suck, because GetDomainConstraints() searches
pg_constraint every time it's called.  We do what we can to avoid
calling that multiple times per query, but for something like a simple
INSERT ... VALUES into a domain column, the setup overhead is still bad.

I've been intending to try to fix things so that the search result can
be cached by typcache.c, but not gotten round to it.  (The hard part,
if anyone wants to tackle it, is figuring out a way to clear the cache
entry when needed.)

            regards, tom lane

Re: Performance of DOMAINs

From
Josh Berkus
Date:
David,

> But I'm also interested in how Elein made the email domain case-
> insensitive, since I'd like to have/create a truly case-insensitive
> text type (ITEXT anyone?). The functions for the operator class there
> were mainly written in SQL, and if it adds a significant overhead,
> I'm not sure it'd be a good idea to use that approach for a case-
> insensitive text type, since I use it quite a lot in my apps, and
> often do LIKE queries against text data. Thoughts?

Well, current case-insensitivity hacks definitely aren't compatible with
LIKE as far as "begins with" indexes are concerned.   Of course, floating
LIKEs (%value%) are going to suck no matter what data type you're using.

I created an operator for CI equality ... =~ ... which performs well on
indexed columns.   But it doesn't do "begins with".

ITEXT is a TODO, but there are reasons why it's harder than it looks.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

Re: Performance of DOMAINs

From
Michael Glaesemann
Date:
> since I'd like to have/create a truly case-insensitive
> text type (ITEXT anyone?).

I haven't seen it mentioned in this thread yet, but have you looked
at citext?

http://gborg.postgresql.org/project/citext/projdisplay.php

I don't have any experience with it, but perhaps it can do what
you're looking for.

Michael Glaesemann
grzm seespotcode net




Re: Performance of DOMAINs

From
David Wheeler
Date:
On Jun 21, 2006, at 13:08, Tom Lane wrote:

> There are some reports in the archives of particular usage patterns
> where they pretty much suck, because GetDomainConstraints() searches
> pg_constraint every time it's called.  We do what we can to avoid
> calling that multiple times per query, but for something like a simple
> INSERT ... VALUES into a domain column, the setup overhead is still
> bad.

I assume that there's no domain thingy that you already have that
could cache it, eh?

Sorry, I ask this as someone who knows no C and less about
PostgreSQL's internals.

Best,

David


Re: Performance of DOMAINs

From
David Wheeler
Date:
On Jun 21, 2006, at 18:19, Josh Berkus wrote:

> Well, current case-insensitivity hacks definitely aren't compatible
> with
> LIKE as far as "begins with" indexes are concerned.

Yes, currently I use LOWER() for my indexes and for all LIKE, =, etc.
queries. This works well, but ORDER by of course isn't what I'd like.
That's one of the things that Elein's email domain addresses, albeit
with a USING keyword, which is unfortunate.

> Of course, floating
> LIKEs (%value%) are going to suck no matter what data type you're
> using.

Yes, I know that. :-) I avoid that.

> I created an operator for CI equality ... =~ ... which performs
> well on
> indexed columns.   But it doesn't do "begins with".

Oops. So how could it perform well on indexed columns?

> ITEXT is a TODO, but there are reasons why it's harder than it looks.

I'm sure. I should bug potential future SoC students about it. ;-)

Best,

David

Re: Performance of DOMAINs

From
David Wheeler
Date:
On Jun 21, 2006, at 19:24, Michael Glaesemann wrote:

> I haven't seen it mentioned in this thread yet, but have you looked
> at citext?
>
> http://gborg.postgresql.org/project/citext/projdisplay.php
>
> I don't have any experience with it, but perhaps it can do what
> you're looking for.

Yes, I've seen it. I haven't tried it, either. It'd be nice if it had
a compatible license with PostgreSQL, though.

Best,

David