Thread: A simple question about text fields

A simple question about text fields

From
Martin Mueller
Date:
Are there performance issues with the choice of 'text' vs. varchar and some character limit?  For instance, if I have a
tablewith ten million records and text fields that may range in length from 15 to 150, can I expect a measurable
improvementin response time for using varchar(150) or will text    do just or nearly as well. 
 

If the latter is the case, using text across the board is a simpler choice  
    


Re: A simple question about text fields

From
Tom Lane
Date:
Martin Mueller <martinmueller@northwestern.edu> writes:
> Are there performance issues with the choice of 'text' vs. varchar and some character limit?  For instance, if I have
atable with ten million records and text fields that may range in length from 15 to 150, can I expect a measurable
improvementin response time for using varchar(150) or will text    do just or nearly as well.  

There is no situation where varchar outperforms text in Postgres.
If you need to apply a length constraint for application semantic
reasons, do so ... otherwise, text is the native type.  It's
useful to think of varchar as being a domain over text, though
for various reasons it's not implemented quite that way.

            regards, tom lane



Re: A simple question about text fields

From
Gavan Schneider
Date:
On 17 Jun 2021, at 1:08, Tom Lane wrote:

> Martin Mueller <martinmueller@northwestern.edu> writes:
>
>> Are there performance issues with the choice of 'text' vs. varchar and some character limit?  For instance, if I
havea table with ten million records and text fields that may range in length from 15 to 150, can I expect a measurable
improvementin response time for using varchar(150) or will text    do just or nearly as well. 
>
>  There is no situation where varchar outperforms text in Postgres.
>  If you need to apply a length constraint for application semantic
>  reasons, do so ... otherwise, text is the native type.  It's
>  useful to think of varchar as being a domain over text, though
>  for various reasons it's not implemented quite that way.
>
This reminds of my days converting from MySQL to PostgreSQL. MySQL, along with other databases, seemed to have a strong
preferencefor setting a length on character strings. And all this from before the advent of UTF encoding which has made
theconcept of string ‘length’ very messy. 

Database guru and SQL author Joe Celko asserts in his ’SQL for Smarties’ that if he finds a text field without a length
limithe will input the Heart Sutra (presumably in ASCII :) to demonstrate the design error. (Of course he is ignoring
thepotential for this input to help the database achieve inner consistency. :) . But taking Joe’s central point there
doseem to be grounds for restricting user input text fields to a reasonable length according to the business need… if
onlyto limit the damage of a cat sitting on the keyboard. 

My approach is to define such fields as ‘text’ and set a constraint using char_length(). This allows PG to do the
businesswith the text in native form, and only imposes the cost of any length check when the field is updated… best of
bothworlds. 

Gavan Schneider
——
Gavan Schneider, Sodwalls, NSW, Australia
Explanations exist; they have existed for all time; there is always a well-known solution to every human problem —
neat,plausible, and wrong. 
— H. L. Mencken, 1920



Re: A simple question about text fields

From
"David G. Johnston"
Date:
On Thursday, June 17, 2021, Gavan Schneider <list.pg.gavan@pendari.org> wrote:

My approach is to define such fields as ‘text’ and set a constraint using char_length(). This allows PG to do the business with the text in native form, and only imposes the cost of any length check when the field is updated… best of both worlds.

Those are basically the same world…your alternative probably is strictly worse than varchar(n) because of its novel way of implementing the same functionality.

For most text fields any good constraint is going be done in the form of a regular expression, one that at minimum prevents non-printable characters (linefeed and carriage return being obvious examples).

David J.

Re: A simple question about text fields

From
Gavan Schneider
Date:
On 18 Jun 2021, at 9:34, David G. Johnston wrote:

> On Thursday, June 17, 2021, Gavan Schneider <list.pg.gavan@pendari.org>
> wrote:
>
>>
>> My approach is to define such fields as ‘text’ and set a constraint using
>> char_length(). This allows PG to do the business with the text in native
>> form, and only imposes the cost of any length check when the field is
>> updated… best of both worlds.
>>
>
> Those are basically the same world…your alternative probably is strictly
> worse than varchar(n) because of its novel way of implementing the same
> functionality.
>
Not sure if this is strictly true. Novelty per se is not always worse. :)
The design advantage is in all text fields being defined the same — no built in length.
When it becomes apparent a length constraint is needed it can be added for the relevant field(s), e.g., when the system
doesnot impose proper restraint at the input stage. 

> For most text fields any good constraint is going be done in the form of a
> regular expression, one that at minimum prevents non-printable characters
> (linefeed and carriage return being obvious examples).
>
Agree. If the business rules need some additional restrictions they can go here as well.
Not so obvious that newlines, etc. are unwelcome. Consider the need for human readable text in comment fields (or other
annotationwhere the readability is enhanced by such layout). There is always the consideration of SQL injection (but
it’smostly too late if we’re leaving this to a constraint) and other toxic string sequences. But this is all business
logic.The database just needs to handle the ‘text’ and we need to design the restraint around the content. 

Gavan Schneider
——
Gavan Schneider, Sodwalls, NSW, Australia
Explanations exist; they have existed for all time; there is always a well-known solution to every human problem —
neat,plausible, and wrong. 
— H. L. Mencken, 1920



Re: A simple question about text fields

From
Laurenz Albe
Date:
On Fri, 2021-06-18 at 10:28 +1000, Gavan Schneider wrote:
> On 18 Jun 2021, at 9:34, David G. Johnston wrote:
> > On Thursday, June 17, 2021, Gavan Schneider <list.pg.gavan@pendari.org> wrote:
> > 
> > > My approach is to define such fields as ‘text’ and set a constraint using
> > > char_length(). This allows PG to do the business with the text in native
> > > form, and only imposes the cost of any length check when the field is
> > > updated… best of both worlds.
> > > 
> > 
> > Those are basically the same world…your alternative probably is strictly
> > worse than varchar(n) because of its novel way of implementing the same
> > functionality.
> 
> Not sure if this is strictly true. Novelty per se is not always worse. :)

True in general, but not in this case.

There is no advantage in a "text" with a check constraint on the length,
that is, no added functionality.

And it is worse for these reasons:

- the performance will be worse (big reason)

- the length limit is less obvious if you look at the table definition
  (small reason)

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com