RE: text fields and performance for ETL - Mailing list pgsql-docs

From Grega Jesih
Subject RE: text fields and performance for ETL
Date
Msg-id 70cf9fe110c246afb4eab054caab0c04@actual-it.si
Whole thread Raw
In response to Re: text fields and performance for ETL  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: text fields and performance for ETL  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-docs

David,

 

> Can you demonstrate, with actual numbers, using today's implementation, a situation where defining a column as char(3) or varchar(3) instead of text has a significant performance improvement?

Sure I can.


But first, what am I to prove ?  A performance in dataflow from server A to server B.

 

What is that performance change based on ? It is based on a fact that the tool that pumps the data may calculate block size and thus work with several

rows as a time instead of a row at a time.

 

Is this Postgres performance related ? NO. It is model design related.


So why would it then be published ? Because inexperienced programmers take your statement that "it's the same performance" from a wrong perspective, so it would be fair to note, that the remark "it's the same performance" is meant "within any operation inside Postgres database". In the moment we want to take this data elsewhere,

the problematic lack of model design comes out.

Now for the demo: I can make a video to see it, but here is a column in context that talks about the thing:

https://dba.stackexchange.com/questions/102830/avoiding-row-by-row-fetch-method-when-dealing-with-source-lob-columns

 

If this contribution from stackexchange isn't enough, let me know.

BR
Grega

 





 

From: David G. Johnston <david.g.johnston@gmail.com>
Sent: Wednesday, November 3, 2021 7:40 PM
To: Grega Jesih <Grega.Jesih@actual-it.si>
Cc: Tom Lane <tgl@sss.pgh.pa.us>; grega.jesih@gmail.com; Pg Docs <pgsql-docs@lists.postgresql.org>
Subject: Re: text fields and performance for ETL

 

On Wed, Nov 3, 2021 at 11:09 AM Grega Jesih <Grega.Jesih@actual-it.si> wrote:

The new architectures include more and more data exchange among databases. 
Now when you deal with bigger data sizes that go from millions to billions, this fixed size vs of text - undefined size becomes very  relevant. 

Can you demonstrate, with actual numbers, using today's implementation, a situation where defining a column as char(3) or varchar(3) instead of text has a significant performance improvement?  Without a concrete example to examine I'm unable to be convinced to move away from the status quo.

 

You also need to convince me as to why constraints are an insufficient feature.  i.e., why is char(3) better than (check length(val) = 3)?

 

Even with all that I'd probably still not do anything beyond reviewing a proposed patch (i.e, I wouldn't try to write one myself from scratch...I don't have authority to commit regardless).

 

David J.

 


NOTICE - NOT TO BE REMOVED.
This e-mail and any attachments are confidential and may contain legally privileged information and/or copyright material of Actual I.T. or third parties. If you are not an authorised recipient of this e-mail, please contact Actual I.T. immediately by return email or by telephone or facsimile on the above numbers.
You should not read, print, re-transmit, store or act in reliance on this email or any attachments and you should destroy all copies of them.

pgsql-docs by date:

Previous
From: Tom Lane
Date:
Subject: Re: text fields and performance for ETL
Next
From: "David G. Johnston"
Date:
Subject: Re: text fields and performance for ETL