BUG #16953: OOB access while converting "interval" to char
The following bug has been logged on the website:
Bug reference: 16953
Logged by: Theodor Arsenij Larionov-Trichkin
Email address: t.larionov@postgrespro.ru
PostgreSQL version: 13.2
Operating system: Ubuntu 20.04.2 LTS
Description:
Hello!
How to reproduce:
1. mkdir -p ./installation/databases
2. git clone --single-branch --depth=1 --branch=REL_13_2
https://github.com/postgres/postgres postgres_src
3. cd postgres_src
4. ./configure --prefix=`pwd`/../installation/pgbuild
5. make -j20 && make install && cd ..
6. ./installation/pgbuild/bin/initdb -U username -D
./installation/databases/db_clean
7. ./installation/pgbuild/bin/postgres -D
./installation/databases/db_clean/
8. ./installation/pgbuild/bin/psql -h 127.0.0.1 -p 5432 -U username
postgres
9. Performing this query will result in OOB access of rm_months_lower array
and as a result crash: SELECT * from TO_CHAR(interval '-1Mon', 'rm');
Output:
2021-04-07 12:07:27.060 MSK [33887] LOG: starting PostgreSQL 13.2 on
x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0,
64-bit
2021-04-07 12:07:27.060 MSK [33887] LOG: listening on IPv4 address
"127.0.0.1", port 5432
2021-04-07 12:07:27.065 MSK [33887] LOG: listening on Unix socket
"/tmp/.s.PGSQL.5432"
2021-04-07 12:07:27.069 MSK [33888] LOG: database system was shut down at
2021-04-07 12:07:22 MSK
2021-04-07 12:07:27.071 MSK [33887] LOG: database system is ready to accept
connections
2021-04-07 12:08:01.013 MSK [33887] LOG: server process (PID 34113) was
terminated by signal 11: Segmentation fault
2021-04-07 12:08:01.013 MSK [33887] DETAIL: Failed process was running:
SELECT * from TO_CHAR(interval '-1Mon', 'rm');
2021-04-07 12:08:01.013 MSK [33887] LOG: terminating any other active
server processes
2021-04-07 12:08:01.013 MSK [33892] WARNING: terminating connection because
of crash of another server process
2021-04-07 12:08:01.013 MSK [33892] DETAIL: The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2021-04-07 12:08:01.013 MSK [33892] HINT: In a moment you should be able to
reconnect to the database and repeat your command.
2021-04-07 12:08:01.013 MSK [35036] FATAL: the database system is in
recovery mode
2021-04-07 12:08:01.014 MSK [33887] LOG: all server processes terminated;
reinitializing
2021-04-07 12:08:01.027 MSK [35038] LOG: database system was interrupted;
last known up at 2021-04-07 12:07:27 MSK
2021-04-07 12:08:01.248 MSK [35038] LOG: database system was not properly
shut down; automatic recovery in progress
2021-04-07 12:08:01.249 MSK [35038] LOG: redo starts at 0/1559798
2021-04-07 12:08:01.249 MSK [35038] LOG: invalid record length at
0/15597D0: wanted 24, got 0
2021-04-07 12:08:01.249 MSK [35038] LOG: redo done at 0/1559798
2021-04-07 12:08:01.256 MSK [33887] LOG: database system is ready to accept
connections
Postgres version:
PostgreSQL 13.2 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
Hi,
On Wed, Apr 07, 2021 at 09:09:25AM +0000, PG Bug reporting form wrote:
The following bug has been logged on the website:
Bug reference: 16953
Logged by: Theodor Arsenij Larionov-Trichkin
Email address: t.larionov@postgrespro.ru
PostgreSQL version: 13.2
Operating system: Ubuntu 20.04.2 LTS
Description:9. Performing this query will result in OOB access of rm_months_lower array
and as a result crash: SELECT * from TO_CHAR(interval '-1Mon', 'rm');Output:
[...]
terminated by signal 11: Segmentation fault
2021-04-07 12:08:01.013 MSK [33887] DETAIL: Failed process was running:
SELECT * from TO_CHAR(interval '-1Mon', 'rm');
Indeed, thanks a lot for the report!
It's because rm/RM are computed in a way that doesn't play nice with negative
values:
sprintf(s, "%*s", S_FM(n->suffix) ? 0 : -4,
rm_months_lower[MONTHS_PER_YEAR - tm->tm_mon]);
PFA a naive patch to fix this problem with some regression tests. I'm assuming
that -1 month should be january and not december. I had a quick look at the
rest of formatting.c and didn't spot any similar problem, but another pair of
eyes wouldn't hurt.
Attachments:
v1-fix_rm.difftext/plain; charset=us-asciiDownload+18-2
On Wed, Apr 07, 2021 at 08:08:56PM +0800, Julien Rouhaud wrote:
PFA a naive patch to fix this problem with some regression tests. I'm assuming
that -1 month should be january and not december. I had a quick look at the
rest of formatting.c and didn't spot any similar problem, but another pair of
eyes wouldn't hurt.
Assuming an absolute number is not really intuitive when it comes to a
negative number here, while counting backward feels more natural, so I
would vote for making -1 be December, -2 November, etc.
Let's also make the tests more extended. I would suggest a single
query with generate_series() from say -12 to 12, that checks the
output of both RM and rm for the full range of values supported by
tm_mon.
--
Michael
On Thu, Apr 08, 2021 at 07:17:50PM +0900, Michael Paquier wrote:
On Wed, Apr 07, 2021 at 08:08:56PM +0800, Julien Rouhaud wrote:
PFA a naive patch to fix this problem with some regression tests. I'm assuming
that -1 month should be january and not december. I had a quick look at the
rest of formatting.c and didn't spot any similar problem, but another pair of
eyes wouldn't hurt.Assuming an absolute number is not really intuitive when it comes to a
negative number here, while counting backward feels more natural, so I
would vote for making -1 be December, -2 November, etc.
Honestly I might as well have flipped a coin here, so I'm fine either way.
Let's also make the tests more extended. I would suggest a single
query with generate_series() from say -12 to 12, that checks the
output of both RM and rm for the full range of values supported by
tm_mon.
I'm fine with it too, although I'd probably go with [-13, 13] just to make sure
that there's isn't silly off-by-one mistake.
I'll just wait a bit to see if anyone else has any opinion on whether -1 month
should be January or December.
On Thu, Apr 08, 2021 at 07:04:03PM +0800, Julien Rouhaud wrote:
I'm fine with it too, although I'd probably go with [-13, 13] just to make sure
that there's isn't silly off-by-one mistake.
Also, I guess that you'd just want to compile twice a modulo based on
MONTHS_PER_YEAR to get the correct positive position in each array.
I'll just wait a bit to see if anyone else has any opinion on whether -1 month
should be January or December.
Sure. If you can send an updated patch, that would be great.
--
Michael
On Thu, Apr 08, 2021 at 08:37:21PM +0900, Michael Paquier wrote:
On Thu, Apr 08, 2021 at 07:04:03PM +0800, Julien Rouhaud wrote:
I'm fine with it too, although I'd probably go with [-13, 13] just to make sure
that there's isn't silly off-by-one mistake.Also, I guess that you'd just want to compile twice a modulo based on
MONTHS_PER_YEAR to get the correct positive position in each array.
I'm not sure what you mean by that. We receive a pg_tm struct which can't
contain more than 12 in tm_mon right? And actually for intervals it will
reduce "12 months" to "1 year 0 month", so to_char previously didn't report
anything for 12 months either.
I'll just wait a bit to see if anyone else has any opinion on whether -1 month
should be January or December.Sure. If you can send an updated patch, that would be great.
Hearing no other opinion I went with -1 -> december and so on in attached v2.
I also fixed the "[-]12 months" case and updated the regression tests. Given
the extra code needed to deduce the correct array position I factorized DCH_RM
and DCH_rm.
Attachments:
v2-fix_rm.difftext/plain; charset=us-asciiDownload+64-10
On Fri, Apr 09, 2021 at 06:42:46PM +0800, Julien Rouhaud wrote:
I'm not sure what you mean by that. We receive a pg_tm struct which can't
contain more than 12 in tm_mon right? And actually for intervals it will
reduce "12 months" to "1 year 0 month", so to_char previously didn't report
anything for 12 months either.
I did not take the time to look in details, but for reference I just
imagined that a formula like this one would give pretty much the
position in rm_months_upper:
M_PER_Y - ((tm_mon % M_PER_Y) + M_PER_Y) % M_PER_Y
Hearing no other opinion I went with -1 -> december and so on in attached v2.
I also fixed the "[-]12 months" case and updated the regression tests. Given
the extra code needed to deduce the correct array position I factorized DCH_RM
and DCH_rm.
Yep. The regression tests show what I would expect. I'll check in
details later.
--
Michael
On Fri, Apr 09, 2021 at 07:58:51PM +0900, Michael Paquier wrote:
Yep. The regression tests show what I would expect. I'll check in
details later.
I have spent some time on that today and applied this patch down to
9.6, after adding more comments and simplifying a bit the calculation
method used. Instead of using the "mon" variable to store an
intermediate result, I have simplified things so as this only uses
tm->tm_mon, and "mon" for the position in the roman-numeral array is
adjusted only once.
--
Michael
On Mon, Apr 12, 2021 at 12:04:23PM +0900, Michael Paquier wrote:
On Fri, Apr 09, 2021 at 07:58:51PM +0900, Michael Paquier wrote:
Yep. The regression tests show what I would expect. I'll check in
details later.I have spent some time on that today and applied this patch down to
9.6, after adding more comments and simplifying a bit the calculation
method used. Instead of using the "mon" variable to store an
intermediate result, I have simplified things so as this only uses
tm->tm_mon, and "mon" for the position in the roman-numeral array is
adjusted only once.
Thanks Michael! I thought we'd want to keep the original calculation for
consistency, but directly computing the needed offset is clearly simpler to
read!