RESOLVED: varchar comparison and trim() - Mailing list pgsql-novice

From ta@lavabit.com
Subject RESOLVED: varchar comparison and trim()
Date
Msg-id 55752.87.252.128.110.1356083656.squirrel@lavabit.com
Whole thread Raw
Responses Re: RESOLVED: varchar comparison and trim()  (Michael Wood <esiotrot@gmail.com>)
List pgsql-novice
Considering Kevin’s answer I have chosen to stick with “the Postgres way”
of resolving this, but here are some of my other rumblings that someone
might find useful:

This all is about converting part of (and interoperate with PG/database) a
legacy app that uses ISAM-like fixed length data.
So, data and queries come in containing a lot of trailing spaces.

Possible solutions for this “trailing space” problem that I’ve come to:

1.Use char(n) for all indexed/searchable columns
(+) simple as it can be, imposes no compatibility risk
(-) will cause data pages and already somewhat oversized btree indexes to
increase in size slightly – as PG does not feature compressed indexes yet,
and that would mean more disk pages to read in…

2. Redefine (in public schema) “varchar to varchar” operators to conform
to current “char to varchar” and “char to char” behavior
(+)  easy and transparent. Both char and varchar columns are actually
meant to store human-readable text where no trailing space has any
significance.  It’s highly unlikely this change would interfere with any
other software as programmers in general (being humans) are not supposed
to consider trailing spaces “significant” – if one wants to store trailing
spaces, char(n) and char arrays might be much better suited to
(-) “LANGUAGE internal” operators are supposed to run faster (have done no
timings) – so, slight slowdown in execution speed might be expected as
this comparison operations are performed quite often
(-) one has to make sure that “public” schema is always included in
search_path
(-) opposes “the standard”

3. Introduce a new datatype extension with its own set of operators (like
citext is)
(+) would not interfere with anything
(-) requires some knowledge and skill and is doubtful whether it would be
accepted by the community

4. Modify client application logic to prevent excess trailing spaces ever
reaching the database. (One can put constraints/domains or triggers to
enforce Rtrim() on data, but main problem remains:  the query arguments!)
(+) does not require any possibly incompatible database tweaks
(-) application logic might be too difficult to patch (ORM-s  etc…)

5. Use another database (MySql supports it out of the box – actually it’s
the default and only way it works - so this situation might arise as well
if you are converting a MySql application to Postgres, and MSSQL supports
it via ANSI_PADDING setting – it was their default at the beginning too).
(+) just switch to
(-) would miss some nice PG features (like search_path for instance)

I finally opted for “client app” solution, inserting a custom made query
parser in app=>database comm process, preventing all trailing spaces ever
leaving the application unless specifically instructed to.

Tom





pgsql-novice by date:

Previous
From: "Kevin Grittner"
Date:
Subject: Re: varchar comparison and trim()
Next
From: Michael Wood
Date:
Subject: Re: RESOLVED: varchar comparison and trim()