Not deleted mentions of the cleared path
Hi, hackers!
I have already written about the problem of InvalidPath [0].ptr_value)->subpath)->path->parent->pathlist->elements[1].ptr_value) $19 = {type = T_Invalid, pathtype = T_Invalid, parent = 0x7f7f7f7f7f7f7f7f, pathtarget = 0x7f7f7f7f7f7f7f7f, param_info = 0x7f7f7f7f7f7f7f7f, parallel_aware = 127, parallel_safe = 127, parallel_workers = 2139062143, rows = 1.3824172084878715e+306, startup_cost = 1.3824172084878715e+306, total_cost = 1.3824172084878715e+306, pathkeys = 0x7f7f7f7f7f7f7f7f} (gdb) appearing. I
investigated this and found an error in the add_path() function, when we
reject a path, we free up the memory of the path, but do not delete
various mentions of it (for example, in the ancestor of relation, as in
the example below).
Thus, we may face the problem of accessing the freed memory.
I demonstrated this below using gdb when I call a query after running a
test in test/regression/sql/create_misc.sql:
*Query:*
:-- That ALTER TABLE should have added TOAST tables.
SELECT relname, reltoastrelid <> 0 AS has_toast_table
FROM pg_class
WHERE oid::regclass IN ('a_star', 'c_star')
ORDER BY 1;
--UPDATE b_star*
-- SET a = text 'gazpacho'
-- WHERE aa > 4;
SELECT class, aa, a FROM a_star*;
*gdb:
*
0x00007ff3f7325fda in epoll_wait (epfd=5, events=0x55bf9ee75c38,
maxevents=1, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
30 ../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.
(gdb) b /home/alena/postgrespro_3/src/backend/optimizer/util/pathnode.c:620
Breakpoint 1 at 0x55bf9cfe4c65: file pathnode.c, line 621.
(gdb) c
Continuing.
Breakpoint 1, add_path (parent_rel=0x55bf9ef7f5c0,
new_path=0x55bf9ef7f4e0) at pathnode.c:621
621 if (!IsA(new_path, IndexPath))
(gdb) n
622 pfree(new_path);
(gdb) n
624 }
(gdb) p *new_path
$1 = {type = T_Invalid, pathtype = T_Invalid, parent =
0x7f7f7f7f7f7f7f7f, pathtarget = 0x7f7f7f7f7f7f7f7f,
param_info = 0x7f7f7f7f7f7f7f7f, parallel_aware = 127, parallel_safe
= 127, parallel_workers = 2139062143,
rows = 1.3824172084878715e+306, startup_cost =
1.3824172084878715e+306, total_cost = 1.3824172084878715e+306,
pathkeys = 0x7f7f7f7f7f7f7f7f}
*(gdb) p new_path
$2 = (Path *) 0x55bf9ef7f4e0*
(gdb) p ((ProjectionPath *)((SortPath*)parent_rel->pathlist->elements
[0]: .ptr_value)->subpath)->path->parent->pathlist->elements[1].ptr_value) $19 = {type = T_Invalid, pathtype = T_Invalid, parent = 0x7f7f7f7f7f7f7f7f, pathtarget = 0x7f7f7f7f7f7f7f7f, param_info = 0x7f7f7f7f7f7f7f7f, parallel_aware = 127, parallel_safe = 127, parallel_workers = 2139062143, rows = 1.3824172084878715e+306, startup_cost = 1.3824172084878715e+306, total_cost = 1.3824172084878715e+306, pathkeys = 0x7f7f7f7f7f7f7f7f} (gdb)
*$20 = (struct Path *) 0x55bf9ef7f4e0*
(gdb) p *((ProjectionPath *)((SortPath*)parent_rel->pathlist->elements
[0]: .ptr_value)->subpath)->path->parent->pathlist->elements[1].ptr_value) $19 = {type = T_Invalid, pathtype = T_Invalid, parent = 0x7f7f7f7f7f7f7f7f, pathtarget = 0x7f7f7f7f7f7f7f7f, param_info = 0x7f7f7f7f7f7f7f7f, parallel_aware = 127, parallel_safe = 127, parallel_workers = 2139062143, rows = 1.3824172084878715e+306, startup_cost = 1.3824172084878715e+306, total_cost = 1.3824172084878715e+306, pathkeys = 0x7f7f7f7f7f7f7f7f} (gdb)
$17 = {type = T_Invalid, pathtype = T_Invalid, parent =
0x7f7f7f7f7f7f7f7f, pathtarget = 0x7f7f7f7f7f7f7f7f,
param_info = 0x7f7f7f7f7f7f7f7f, parallel_aware = 127, parallel_safe
= 127, parallel_workers = 2139062143,
rows = 1.3824172084878715e+306, startup_cost =
1.3824172084878715e+306, total_cost = 1.3824172084878715e+306,
pathkeys = 0x7f7f7f7f7f7f7f7f}
(gdb) p (Path*)(((ProjectionPath
*)((SortPath*)parent_rel->pathlist->elements
[0]: .ptr_value)->subpath)->path->parent->pathlist->elements[1].ptr_value) $19 = {type = T_Invalid, pathtype = T_Invalid, parent = 0x7f7f7f7f7f7f7f7f, pathtarget = 0x7f7f7f7f7f7f7f7f, param_info = 0x7f7f7f7f7f7f7f7f, parallel_aware = 127, parallel_safe = 127, parallel_workers = 2139062143, rows = 1.3824172084878715e+306, startup_cost = 1.3824172084878715e+306, total_cost = 1.3824172084878715e+306, pathkeys = 0x7f7f7f7f7f7f7f7f} (gdb)
*$21 = (Path *) 0x55bf9ef7f4e0*
(gdb) p *(Path*)(((ProjectionPath
*)((SortPath*)parent_rel->pathlist->elements
[0]: .ptr_value)->subpath)->path->parent->pathlist->elements[1].ptr_value) $19 = {type = T_Invalid, pathtype = T_Invalid, parent = 0x7f7f7f7f7f7f7f7f, pathtarget = 0x7f7f7f7f7f7f7f7f, param_info = 0x7f7f7f7f7f7f7f7f, parallel_aware = 127, parallel_safe = 127, parallel_workers = 2139062143, rows = 1.3824172084878715e+306, startup_cost = 1.3824172084878715e+306, total_cost = 1.3824172084878715e+306, pathkeys = 0x7f7f7f7f7f7f7f7f} (gdb)
$19 = {type = T_Invalid, pathtype = T_Invalid, parent =
0x7f7f7f7f7f7f7f7f, pathtarget = 0x7f7f7f7f7f7f7f7f,
param_info = 0x7f7f7f7f7f7f7f7f, parallel_aware = 127, parallel_safe
= 127, parallel_workers = 2139062143,
rows = 1.3824172084878715e+306, startup_cost =
1.3824172084878715e+306, total_cost = 1.3824172084878715e+306,
pathkeys = 0x7f7f7f7f7f7f7f7f}
(gdb)
The same problem may be in the add_partial_path() function.
Unfortunately, I have not yet been able to find a problematic query with
the described case, but I have prepared a patch to fix this problem.
What do you think?
0.
/messages/by-id/01aa76cc-93e1-4de9-ab34-b3a890bf76b3@postgrespro.ru
--
Regards,
Alena Rybakina
Postgres Professional
Attachments:
v0-Fix-reject-released-path-we-shouldn-t-only-clean-pat.patchtext/x-patch; charset=UTF-8; name=v0-Fix-reject-released-path-we-shouldn-t-only-clean-pat.patchDownload
From b9483800eb9d8dc9bafb114c602c63d5cf596ee7 Mon Sep 17 00:00:00 2001
From: Alena Rybakina <a.rybakina@postgrespro.ru>
Date: Mon, 30 Oct 2023 14:22:21 +0300
Subject: [PATCH] Fix reject released path: we shouldn't only clean path, but
we should delete all mentions ottherwice we will face the problem access to
released memory.
---
src/backend/optimizer/util/pathnode.c | 26 ++++++++++++++++++++++++--
1 file changed, 24 insertions(+), 2 deletions(-)
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 0b1d17b9d33..4df9d34c9f1 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -619,7 +619,19 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
{
/* Reject and recycle the new path */
if (!IsA(new_path, IndexPath))
- pfree(new_path);
+ {
+ if(new_path == new_path->parent->cheapest_startup_path)
+ new_path->parent->cheapest_startup_path = NULL;
+ if(new_path == new_path->parent->cheapest_total_path)
+ new_path->parent->cheapest_total_path = NULL;
+ foreach(p1, new_path->parent->pathlist)
+ {
+ Path *path = (Path *) lfirst(p1);
+
+ if (path == new_path)
+ new_path->parent->pathlist = foreach_delete_current(new_path->parent->pathlist, p1);
+ }
+ }
}
}
@@ -849,7 +861,17 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path)
else
{
/* Reject and recycle the new path */
- pfree(new_path);
+ if(new_path == new_path->parent->cheapest_startup_path)
+ new_path->parent->cheapest_startup_path = NULL;
+ if(new_path == new_path->parent->cheapest_total_path)
+ new_path->parent->cheapest_total_path = NULL;
+ foreach(p1, new_path->parent->pathlist)
+ {
+ Path *path = (Path *) lfirst(p1);
+
+ if (path == new_path)
+ new_path->parent->pathlist = foreach_delete_current(new_path->parent->pathlist, p1);
+ }
}
}
--
2.34.1
On Mon, Oct 30, 2023 at 5:01 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
Hi, hackers!
I have already written about the problem of InvalidPath [0] appearing. I investigated this and found an error in the add_path() function, when we reject a path, we free up the memory of the path, but do not delete various mentions of it (for example, in the ancestor of relation, as in the example below).
Thus, we may face the problem of accessing the freed memory.
I demonstrated this below using gdb when I call a query after running a test in test/regression/sql/create_misc.sql:
Query:
:-- That ALTER TABLE should have added TOAST tables.
SELECT relname, reltoastrelid <> 0 AS has_toast_table
FROM pg_class
WHERE oid::regclass IN ('a_star', 'c_star')
ORDER BY 1;--UPDATE b_star*
-- SET a = text 'gazpacho'
-- WHERE aa > 4;SELECT class, aa, a FROM a_star*;
gdb:
0x00007ff3f7325fda in epoll_wait (epfd=5, events=0x55bf9ee75c38, maxevents=1, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
30 ../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.
(gdb) b /home/alena/postgrespro_3/src/backend/optimizer/util/pathnode.c:620
Breakpoint 1 at 0x55bf9cfe4c65: file pathnode.c, line 621.
(gdb) c
Continuing.Breakpoint 1, add_path (parent_rel=0x55bf9ef7f5c0, new_path=0x55bf9ef7f4e0) at pathnode.c:621
621 if (!IsA(new_path, IndexPath))
(gdb) n
622 pfree(new_path);
(gdb) n
624 }
(gdb) p *new_path
$1 = {type = T_Invalid, pathtype = T_Invalid, parent = 0x7f7f7f7f7f7f7f7f, pathtarget = 0x7f7f7f7f7f7f7f7f,
param_info = 0x7f7f7f7f7f7f7f7f, parallel_aware = 127, parallel_safe = 127, parallel_workers = 2139062143,
rows = 1.3824172084878715e+306, startup_cost = 1.3824172084878715e+306, total_cost = 1.3824172084878715e+306,
pathkeys = 0x7f7f7f7f7f7f7f7f}
(gdb) p new_path
$2 = (Path *) 0x55bf9ef7f4e0
At this point the new_path has not been added to the parent_rel. We do
not set the cheapest* paths while paths are being added. The stack
trace will give you an idea where this is happening.
(gdb) p ((ProjectionPath *)((SortPath*)parent_rel->pathlist->elements [0].ptr_value)->subpath)->path->parent->cheapest_startup_path
$20 = (struct Path *) 0x55bf9ef7f4e0
This looks familiar though. There was some nearby thread where Tom
Lane, if my memory serves well, provided a case where a path from
lower rel was added to an upper rel without copying or changing its
parent. This very much looks like that case.
--
Best Wishes,
Ashutosh Bapat
On Mon, Oct 30, 2023 at 7:31 PM Alena Rybakina <a.rybakina@postgrespro.ru>
wrote:
I have already written about the problem of InvalidPath [0] appearing. I
investigated this and found an error in the add_path() function, when we
reject a path, we free up the memory of the path, but do not delete various
mentions of it (for example, in the ancestor of relation, as in the example
below).
I agree that what you observed is true - add_path() may free a path
while it's still referenced from some lower rels. For instance, when
creating ordered paths, we may use the input path unchanged without
copying if it's already well ordered, and it might be freed afterwards
if it fails when competing in add_path().
But this doesn't seem to be a problem in practice. We will not access
these references from the lower rels.
I'm not sure if this is an issue that we need to fix, or we need to live
with. But I do think it deserves some explanation in the comment of
add_path().
Thanks
Richard
Hi! Thank you for the interest to this issue.
On 31.10.2023 06:25, Richard Guo wrote:
On Mon, Oct 30, 2023 at 7:31 PM Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
I have already written about the problem of InvalidPath [0]
appearing. I investigated this and found an error in the add_path()
function, when we reject a path, we free up the memory of the path,
but do not delete various mentions of it (for example, in the
ancestor of relation, as in the example below).I agree that what you observed is true - add_path() may free a path
while it's still referenced from some lower rels. For instance, when
creating ordered paths, we may use the input path unchanged without
copying if it's already well ordered, and it might be freed afterwards
if it fails when competing in add_path().
But this doesn't seem to be a problem in practice. We will not access
these references from the lower rels.
I'm not sure if this is an issue that we need to fix, or we need to live
with. But I do think it deserves some explanation in the comment of
add_path().
I agree that the code looks like an error, but without a real request,
it is still difficult to identify it as a bug. I'll try to reproduce it.
And yes, at least a comment is required here, and to be honest, I have
already faced this problem myself.
On 30.10.2023 17:36, Ashutosh Bapat wrote:
On Mon, Oct 30, 2023 at 5:01 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:Hi, hackers!
I have already written about the problem of InvalidPath [0] appearing. I investigated this and found an error in the add_path() function, when we reject a path, we free up the memory of the path, but do not delete various mentions of it (for example, in the ancestor of relation, as in the example below).
Thus, we may face the problem of accessing the freed memory.
I demonstrated this below using gdb when I call a query after running a test in test/regression/sql/create_misc.sql:
Query:
:-- That ALTER TABLE should have added TOAST tables.
SELECT relname, reltoastrelid <> 0 AS has_toast_table
FROM pg_class
WHERE oid::regclass IN ('a_star', 'c_star')
ORDER BY 1;--UPDATE b_star*
-- SET a = text 'gazpacho'
-- WHERE aa > 4;SELECT class, aa, a FROM a_star*;
gdb:
0x00007ff3f7325fda in epoll_wait (epfd=5, events=0x55bf9ee75c38, maxevents=1, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
30 ../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.
(gdb) b /home/alena/postgrespro_3/src/backend/optimizer/util/pathnode.c:620
Breakpoint 1 at 0x55bf9cfe4c65: file pathnode.c, line 621.
(gdb) c
Continuing.Breakpoint 1, add_path (parent_rel=0x55bf9ef7f5c0, new_path=0x55bf9ef7f4e0) at pathnode.c:621
621 if (!IsA(new_path, IndexPath))
(gdb) n
622 pfree(new_path);
(gdb) n
624 }
(gdb) p *new_path
$1 = {type = T_Invalid, pathtype = T_Invalid, parent = 0x7f7f7f7f7f7f7f7f, pathtarget = 0x7f7f7f7f7f7f7f7f,
param_info = 0x7f7f7f7f7f7f7f7f, parallel_aware = 127, parallel_safe = 127, parallel_workers = 2139062143,
rows = 1.3824172084878715e+306, startup_cost = 1.3824172084878715e+306, total_cost = 1.3824172084878715e+306,
pathkeys = 0x7f7f7f7f7f7f7f7f}
(gdb) p new_path
$2 = (Path *) 0x55bf9ef7f4e0At this point the new_path has not been added to the parent_rel. We do
not set the cheapest* paths while paths are being added. The stack
trace will give you an idea where this is happening.(gdb) p ((ProjectionPath *)((SortPath*)parent_rel->pathlist->elements [0].ptr_value)->subpath)->path->parent->cheapest_startup_path
$20 = (struct Path *) 0x55bf9ef7f4e0This looks familiar though. There was some nearby thread where Tom
Lane, if my memory serves well, provided a case where a path from
lower rel was added to an upper rel without copying or changing its
parent. This very much looks like that case.
Thank you, I think this might help me to find a query to reproduce it.