Thread: field error on refreshed materialized view
I'm in the process of updating a server from postgresql 10.4 on Centos 7 to 16.1 on AlmaLInux 9, and am reminded of the old line about when you're up to your necks in alligators, it is difficult to remember that your goal was to drain the swamp. We have several materialized views that are refreshed from an external mysql/mariadb server. On the 10.4 server this materialized view works, but on the 16.1 server it fails: uscf=# refresh materialized view uscf_vip; refresh materialized view uscf_vip; ERROR: value too long for type character varying(255) Both materialized view tables appear to be identical. (We did a restore from a recent database dump to populate the new server.) I don't see anything in the log files to help me determine which field or row is causing the error. As far as I can tell, encoding is the same on both versions, but there could be issues in how the mysql server is coding non-ASCII characters. Is this something that has to be tracked down on the mysql/mariadb server or is there anything I can do to help narrow down the problem on my 16.1 server? -- Mike Nolan htfoot@gmail.com
On 1/25/24 1:29 PM, Michael Nolan wrote:
I'm in the process of updating a server from postgresql 10.4 on Centos 7 to 16.1 on AlmaLInux 9, and am reminded of the old line about when you're up to your necks in alligators, it is difficult to remember that your goal was to drain the swamp. We have several materialized views that are refreshed from an external mysql/mariadb server.
Define "... refreshed from an external mysql/mariadb server"?
On the 10.4 server this materialized view works, but on the 16.1 server it fails: uscf=# refresh materialized view uscf_vip; refresh materialized view uscf_vip; ERROR: value too long for type character varying(255)
Is there more to the error, that shows the actual value?
Or have you looked in server log to see if there is more information?
Both materialized view tables appear to be identical. (We did a restore from a recent database dump to populate the new server.) I don't see anything in the log files to help me determine which field or row is causing the error. As far as I can tell, encoding is the same on both versions, but there could be issues in how the mysql server is coding non-ASCII characters. Is this something that has to be tracked down on the mysql/mariadb server or is there anything I can do to help narrow down the problem on my 16.1 server? -- Mike Nolan htfoot@gmail.com
Michael Nolan <htfoot@gmail.com> writes: > On the 10.4 server this materialized view works, but on the 16.1 > server it fails: > uscf=# refresh materialized view uscf_vip; > refresh materialized view uscf_vip; > ERROR: value too long for type character varying(255) > Both materialized view tables appear to be identical. (We did a > restore from a recent database dump to populate the new server.) > I don't see anything in the log files to help me determine which field > or row is causing the error. As far as I can tell, encoding is the > same on both versions, but there could be issues in how the mysql > server is coding non-ASCII characters. Yeah, an encoding discrepancy could explain this, by causing the server to count characters differently than before. (Note that we consider varchar(255) to allow 255 characters, not 255 bytes.) One idea for tracking it down could be to remove or increase all the varchar limits in the matview declaration, load the data, and then probe for bad entries with something like "select * from matview where length(some_varchar_field) > 255". regards, tom lane
On Thu, Jan 25, 2024 at 3:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Michael Nolan <htfoot@gmail.com> writes: > > On the 10.4 server this materialized view works, but on the 16.1 > > server it fails: > > > uscf=# refresh materialized view uscf_vip; > > refresh materialized view uscf_vip; > > ERROR: value too long for type character varying(255) > > > Both materialized view tables appear to be identical. (We did a > > restore from a recent database dump to populate the new server.) > > > I don't see anything in the log files to help me determine which field > > or row is causing the error. As far as I can tell, encoding is the > > same on both versions, but there could be issues in how the mysql > > server is coding non-ASCII characters. > > Yeah, an encoding discrepancy could explain this, by causing the > server to count characters differently than before. (Note that > we consider varchar(255) to allow 255 characters, not 255 bytes.) Thanks, Tom. Both postgresql databases are using SQL-ASCII and C LC values, I don't know what the mariadb is using, but I've asked the people who maintain that server for that information. I suspect it may be UTF-8 or something else that supports multi-byte characters. I don't know if changing the encoding on the new database server would introduce more issues than it would solve, but as we're still in testing mode I could test that. > One idea for tracking it down could be to remove or increase all the > varchar limits in the matview declaration, load the data, and then > probe for bad entries with something like "select * from matview where > length(some_varchar_field) > 255". The reasons for the 255 are mostly historical (the original server this data came from was a COBOL system.) Increasing or removing the 255 character limits may not cause any new problems. == Mike Nolan
Followup: The problem turned out to be a field in the mysql server that was not size-limited to 255 characters and had several rows with as many as 299 characters in them. Apparently when using an FDW and materialized view in postgresql 10.4, field size limits aren't checked, because a query on the matview in the 10.4 system finds those records with more than 255 characters in that field. So, not a character encoding issue at all. -- Mike Nolan
Michael Nolan <htfoot@gmail.com> writes: > Followup: The problem turned out to be a field in the mysql server > that was not size-limited to 255 characters and had several rows with > as many as 299 characters in them. Apparently when using an FDW and > materialized view in postgresql 10.4, field size limits aren't > checked, because a query on the matview in the 10.4 system finds those > records with more than 255 characters in that field. > So, not a character encoding issue at all. Hm ... if the foreign table's field was declared as varchar(255), older PG might've believed that and not rechecked the length while storing into the matview. Not sure if this behavior change was intentional or an accidental product of refactoring. regards, tom lane