Thread: BUG #16953: OOB access while converting "interval" to char
The following bug has been logged on the website: Bug reference: 16953 Logged by: Theodor Arsenij Larionov-Trichkin Email address: t.larionov@postgrespro.ru PostgreSQL version: 13.2 Operating system: Ubuntu 20.04.2 LTS Description: Hello! How to reproduce: 1. mkdir -p ./installation/databases 2. git clone --single-branch --depth=1 --branch=REL_13_2 https://github.com/postgres/postgres postgres_src 3. cd postgres_src 4. ./configure --prefix=`pwd`/../installation/pgbuild 5. make -j20 && make install && cd .. 6. ./installation/pgbuild/bin/initdb -U username -D ./installation/databases/db_clean 7. ./installation/pgbuild/bin/postgres -D ./installation/databases/db_clean/ 8. ./installation/pgbuild/bin/psql -h 127.0.0.1 -p 5432 -U username postgres 9. Performing this query will result in OOB access of rm_months_lower array and as a result crash: SELECT * from TO_CHAR(interval '-1Mon', 'rm'); Output: 2021-04-07 12:07:27.060 MSK [33887] LOG: starting PostgreSQL 13.2 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit 2021-04-07 12:07:27.060 MSK [33887] LOG: listening on IPv4 address "127.0.0.1", port 5432 2021-04-07 12:07:27.065 MSK [33887] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2021-04-07 12:07:27.069 MSK [33888] LOG: database system was shut down at 2021-04-07 12:07:22 MSK 2021-04-07 12:07:27.071 MSK [33887] LOG: database system is ready to accept connections 2021-04-07 12:08:01.013 MSK [33887] LOG: server process (PID 34113) was terminated by signal 11: Segmentation fault 2021-04-07 12:08:01.013 MSK [33887] DETAIL: Failed process was running: SELECT * from TO_CHAR(interval '-1Mon', 'rm'); 2021-04-07 12:08:01.013 MSK [33887] LOG: terminating any other active server processes 2021-04-07 12:08:01.013 MSK [33892] WARNING: terminating connection because of crash of another server process 2021-04-07 12:08:01.013 MSK [33892] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2021-04-07 12:08:01.013 MSK [33892] HINT: In a moment you should be able to reconnect to the database and repeat your command. 2021-04-07 12:08:01.013 MSK [35036] FATAL: the database system is in recovery mode 2021-04-07 12:08:01.014 MSK [33887] LOG: all server processes terminated; reinitializing 2021-04-07 12:08:01.027 MSK [35038] LOG: database system was interrupted; last known up at 2021-04-07 12:07:27 MSK 2021-04-07 12:08:01.248 MSK [35038] LOG: database system was not properly shut down; automatic recovery in progress 2021-04-07 12:08:01.249 MSK [35038] LOG: redo starts at 0/1559798 2021-04-07 12:08:01.249 MSK [35038] LOG: invalid record length at 0/15597D0: wanted 24, got 0 2021-04-07 12:08:01.249 MSK [35038] LOG: redo done at 0/1559798 2021-04-07 12:08:01.256 MSK [33887] LOG: database system is ready to accept connections Postgres version: PostgreSQL 13.2 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
Hi, On Wed, Apr 07, 2021 at 09:09:25AM +0000, PG Bug reporting form wrote: > The following bug has been logged on the website: > > Bug reference: 16953 > Logged by: Theodor Arsenij Larionov-Trichkin > Email address: t.larionov@postgrespro.ru > PostgreSQL version: 13.2 > Operating system: Ubuntu 20.04.2 LTS > Description: > > 9. Performing this query will result in OOB access of rm_months_lower array > and as a result crash: SELECT * from TO_CHAR(interval '-1Mon', 'rm'); > > Output: > [...] > terminated by signal 11: Segmentation fault > 2021-04-07 12:08:01.013 MSK [33887] DETAIL: Failed process was running: > SELECT * from TO_CHAR(interval '-1Mon', 'rm'); Indeed, thanks a lot for the report! It's because rm/RM are computed in a way that doesn't play nice with negative values: sprintf(s, "%*s", S_FM(n->suffix) ? 0 : -4, rm_months_lower[MONTHS_PER_YEAR - tm->tm_mon]); PFA a naive patch to fix this problem with some regression tests. I'm assuming that -1 month should be january and not december. I had a quick look at the rest of formatting.c and didn't spot any similar problem, but another pair of eyes wouldn't hurt.
Attachment
On Wed, Apr 07, 2021 at 08:08:56PM +0800, Julien Rouhaud wrote: > PFA a naive patch to fix this problem with some regression tests. I'm assuming > that -1 month should be january and not december. I had a quick look at the > rest of formatting.c and didn't spot any similar problem, but another pair of > eyes wouldn't hurt. Assuming an absolute number is not really intuitive when it comes to a negative number here, while counting backward feels more natural, so I would vote for making -1 be December, -2 November, etc. Let's also make the tests more extended. I would suggest a single query with generate_series() from say -12 to 12, that checks the output of both RM and rm for the full range of values supported by tm_mon. -- Michael
Attachment
On Thu, Apr 08, 2021 at 07:17:50PM +0900, Michael Paquier wrote: > On Wed, Apr 07, 2021 at 08:08:56PM +0800, Julien Rouhaud wrote: > > PFA a naive patch to fix this problem with some regression tests. I'm assuming > > that -1 month should be january and not december. I had a quick look at the > > rest of formatting.c and didn't spot any similar problem, but another pair of > > eyes wouldn't hurt. > > Assuming an absolute number is not really intuitive when it comes to a > negative number here, while counting backward feels more natural, so I > would vote for making -1 be December, -2 November, etc. Honestly I might as well have flipped a coin here, so I'm fine either way. > Let's also make the tests more extended. I would suggest a single > query with generate_series() from say -12 to 12, that checks the > output of both RM and rm for the full range of values supported by > tm_mon. I'm fine with it too, although I'd probably go with [-13, 13] just to make sure that there's isn't silly off-by-one mistake. I'll just wait a bit to see if anyone else has any opinion on whether -1 month should be January or December.
On Thu, Apr 08, 2021 at 07:04:03PM +0800, Julien Rouhaud wrote: > I'm fine with it too, although I'd probably go with [-13, 13] just to make sure > that there's isn't silly off-by-one mistake. Also, I guess that you'd just want to compile twice a modulo based on MONTHS_PER_YEAR to get the correct positive position in each array. > I'll just wait a bit to see if anyone else has any opinion on whether -1 month > should be January or December. Sure. If you can send an updated patch, that would be great. -- Michael
Attachment
On Thu, Apr 08, 2021 at 08:37:21PM +0900, Michael Paquier wrote: > On Thu, Apr 08, 2021 at 07:04:03PM +0800, Julien Rouhaud wrote: > > I'm fine with it too, although I'd probably go with [-13, 13] just to make sure > > that there's isn't silly off-by-one mistake. > > Also, I guess that you'd just want to compile twice a modulo based on > MONTHS_PER_YEAR to get the correct positive position in each array. I'm not sure what you mean by that. We receive a pg_tm struct which can't contain more than 12 in tm_mon right? And actually for intervals it will reduce "12 months" to "1 year 0 month", so to_char previously didn't report anything for 12 months either. > > > I'll just wait a bit to see if anyone else has any opinion on whether -1 month > > should be January or December. > > Sure. If you can send an updated patch, that would be great. Hearing no other opinion I went with -1 -> december and so on in attached v2. I also fixed the "[-]12 months" case and updated the regression tests. Given the extra code needed to deduce the correct array position I factorized DCH_RM and DCH_rm.
Attachment
On Fri, Apr 09, 2021 at 06:42:46PM +0800, Julien Rouhaud wrote: > I'm not sure what you mean by that. We receive a pg_tm struct which can't > contain more than 12 in tm_mon right? And actually for intervals it will > reduce "12 months" to "1 year 0 month", so to_char previously didn't report > anything for 12 months either. I did not take the time to look in details, but for reference I just imagined that a formula like this one would give pretty much the position in rm_months_upper: M_PER_Y - ((tm_mon % M_PER_Y) + M_PER_Y) % M_PER_Y > Hearing no other opinion I went with -1 -> december and so on in attached v2. > I also fixed the "[-]12 months" case and updated the regression tests. Given > the extra code needed to deduce the correct array position I factorized DCH_RM > and DCH_rm. Yep. The regression tests show what I would expect. I'll check in details later. -- Michael
Attachment
On Fri, Apr 09, 2021 at 07:58:51PM +0900, Michael Paquier wrote: > Yep. The regression tests show what I would expect. I'll check in > details later. I have spent some time on that today and applied this patch down to 9.6, after adding more comments and simplifying a bit the calculation method used. Instead of using the "mon" variable to store an intermediate result, I have simplified things so as this only uses tm->tm_mon, and "mon" for the position in the roman-numeral array is adjusted only once. -- Michael
Attachment
On Mon, Apr 12, 2021 at 12:04:23PM +0900, Michael Paquier wrote: > On Fri, Apr 09, 2021 at 07:58:51PM +0900, Michael Paquier wrote: > > Yep. The regression tests show what I would expect. I'll check in > > details later. > > I have spent some time on that today and applied this patch down to > 9.6, after adding more comments and simplifying a bit the calculation > method used. Instead of using the "mon" variable to store an > intermediate result, I have simplified things so as this only uses > tm->tm_mon, and "mon" for the position in the roman-numeral array is > adjusted only once. Thanks Michael! I thought we'd want to keep the original calculation for consistency, but directly computing the needed offset is clearly simpler to read!