Thread: Performance differences using varchar, char and text

Performance differences using varchar, char and text

From

Yusuf

Date:

19 June 2003, 13:33:00

In the Postgres 7.3.3. User Guide section 5.3, it says that there's no performance difference between the three type.
  But in 'PostgreSQL Database Performance Tuning' by Jean-Paul ARGUDO (section 5: use correct datatype), it says there
is a performance difference because of you might need more I/O to read the data (which makes sense).

So, is there a performance difference?

Re: Performance differences using varchar, char and text

From

Andrew Ayers

Date:

19 June 2003, 14:19:14

Yusuf wrote:
> In the Postgres 7.3.3. User Guide section 5.3, it says that there's no
> performance difference between the three type.  But in 'PostgreSQL
> Database Performance Tuning' by Jean-Paul ARGUDO (section 5: use correct
> datatype), it says there is a performance difference because of you
> might need more I/O to read the data (which makes sense).
>
> So, is there a performance difference?

Yusuf,

Here is what I noticed - I am using Postgres 7.3.2 (on a Sun box, not
sure what OS version) with the latest ODBC driver on a Windows XP Pro
box. I am in the process of conversion of a legacy VB app from using
Access 97 to PostgreSQL via an ODBC connection (DSN-less).

I had in my Access DB several tables which utilized Memo-type fields to
store data in a "multi-value" delimited format (will never do that
again). I tried to first convert them to TEXT type fields on the
PostgreSQL tables. These didn't work right (more on that later), so I
converted them to large VARCHAR field (ie, VARCHAR(100000), and larger,
in some cases).

I noticed when doing selects (via the ODBC driver) that any accesses to
these fields caused MASSIVE slowdowns on the select - whether I was
selecting for them, or if the field was part of the WHERE clause of the
SQL statement. I found that if I decreased the size of the field, the
speed would increase.

One thing I did note was that in using psql, selects on these fields
didn't matter - they seemed fast. I tend to think it was the ODBC driver
in some manner.

Later, I determined that the reason I was having troubles with the TEXT
fields was because I was using DAO. Switching the code to use ADO calls
instead fixed the issue, and I switched to using the TEXT type on the
fields. However, by then I had installed many "workarounds" to avoid
those fields as much as possible in my code that I don't know if the
selects on them would be the same, faster, or slower...

Hope this helps a little...

Andrew

-- CONFIDENTIALITY NOTICE --

This message is intended for the sole use of the individual and entity to whom it is addressed, and may contain
informationthat is privileged, confidential and exempt from disclosure under applicable law. If you are not the
intendedaddressee, nor authorized to receive for the intended addressee, you are hereby notified that you may not use,
copy,disclose or distribute to anyone the message or any information contained in the message. If you have received
thismessage in error, please immediately advise the sender by reply email, and delete the message. Thank you.

Re: Performance differences using varchar, char and text

From

Tom Lane

Date:

19 June 2003, 15:18:25

Andrew Ayers <aayers@eldocomp.com> writes:
> I had in my Access DB several tables which utilized Memo-type fields to
> store data in a "multi-value" delimited format (will never do that
> again). I tried to first convert them to TEXT type fields on the
> PostgreSQL tables. These didn't work right (more on that later), so I
> converted them to large VARCHAR field (ie, VARCHAR(100000), and larger,
> in some cases).

> I noticed when doing selects (via the ODBC driver) that any accesses to
> these fields caused MASSIVE slowdowns on the select - whether I was
> selecting for them, or if the field was part of the WHERE clause of the
> SQL statement. I found that if I decreased the size of the field, the
> speed would increase.

I believe what you're reporting here is problems on the Access side, not
problems in the underlying database.  (That doesn't make them any less
of a real problem if you're using Access, of course.)  Access doesn't
work very well with datatypes that aren't found in MS SQL Server...

I think if you look in the pgsql-odbc list archives you will find some
discussion of workarounds for Access with TEXT fields.

As far as the original question goes: there is no reason within Postgres
to choose one of these three types on performance grounds; you should
make the choice based on the semantics you want.  Do you really want
every value blank-padded to exactly N characters?  Use char(N).  If you
don't want padding, but do want a specific upper limit on the field
width, use varchar(N).  If you haven't got any specific upper limit in
mind (and if you're putting in numbers like 100000 then you don't ;-))
then use text.  The performance differences that exist come directly
from the cycles expended to add padding blanks, check that the width
limit is not exceeded, etc.

            regards, tom lane

Re: Performance differences using varchar, char and text

From

Andrew Ayers

Date:

19 June 2003, 15:49:46

Tom Lane wrote:
> I believe what you're reporting here is problems on the Access side, not
> problems in the underlying database.  (That doesn't make them any less
> of a real problem if you're using Access, of course.)  Access doesn't
> work very well with datatypes that aren't found in MS SQL Server...

No - I had converted an Access 97 DB over to Postgres, and my VB app was
updating the PG database via the ODBC driver. My application is written
in VB (version 6, Enterprise Edition) - it isn't a case of an Access/VB
for Applications front-end to PG backend.

> I think if you look in the pgsql-odbc list archives you will find some
> discussion of workarounds for Access with TEXT fields.

I believe my problems extended from using DAO, instead of ADO - once I
switched to using ADO, my problems went away (of course, other problems
cropped up, but that was due to code differences between DAO and ADO
usage). I could use TEXT datatypes on the PG database/tables without
problems.

Honestly, most of my issues was my bad decision to "bastardize"
relational DBs - I have a lot of background in PICK/MultiValue
databases, where you can store and retrieve from data-cubes, and I
wanted this functionality for my application. I have since learned that
this approach is far from correct (and I probably have Codd or Cobb or
whatever his name is spinning in his new grave - RIP). In the process of
converting to PG, a few of the tables I *had* to normalize and split
into proper relational tables in order for it to work properly and
quickly when I was using VARCHAR fields - so some good has come out of this.

> As far as the original question goes: there is no reason within Postgres
> to choose one of these three types on performance grounds; you should
> make the choice based on the semantics you want.  Do you really want
> every value blank-padded to exactly N characters?  Use char(N).  If you
> don't want padding, but do want a specific upper limit on the field
> width, use varchar(N).  If you haven't got any specific upper limit in
> mind (and if you're putting in numbers like 100000 then you don't ;-))
> then use text.  The performance differences that exist come directly
> from the cycles expended to add padding blanks, check that the width
> limit is not exceeded, etc.

I agree in practice, when you are creating and maintaining a proper
relational database, and accessing it properly.

However, because of DAO I was hitting a big problem on updates of TEXT
fields, which I didn't have in regards to VARCHAR, or any other datatype
field in PG - only the TEXT datatype. Furthermore as I have noted, many
of these large fields were being used in a manner inconsistant with the
rules of relational database structure. I shouldn't have done that.

I first thought it was the ODBC driver, but after much discussion and
such, I finally tried using ADO instead of DAO, and the update issue
with TEXT fields went away. I don't know if the performance issue is
still there, though. I kinda suspect it is. However, I have since coded
around it (when I was using VARCHAR fields), so the problem isn't a
problem for me anymore (I had fields that were very large holding
descriptions, and I wanted to do a "keyword" LIKE search on them - I
ended up with a different strategy, which keeps me away from TEXT and
large VARCHAR fields, and have a keyword table for allowances of lexical
stats and such now).

Andrew

-- CONFIDENTIALITY NOTICE --

This message is intended for the sole use of the individual and entity to whom it is addressed, and may contain
informationthat is privileged, confidential and exempt from disclosure under applicable law. If you are not the
intendedaddressee, nor authorized to receive for the intended addressee, you are hereby notified that you may not use,
copy,disclose or distribute to anyone the message or any information contained in the message. If you have received
thismessage in error, please immediately advise the sender by reply email, and delete the message. Thank you.