Calling PrepareTempTablespaces in BufFileCreateTemp
Hi,
PrepareTempTablespaces is called by most callers of BufFileCreateTemp, so I
was
wondering if there is a reason not to call it inside BufFileCreateTemp.
As a developer using BufFileCreateTemp to write code that will create spill
files, it was easy to forget the extra step of checking the temp_tablespaces
GUC to ensure I create the spill files there if it is set.
Thanks,
Melanie Plageman
On Mon, Apr 22, 2019 at 3:12 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
PrepareTempTablespaces is called by most callers of BufFileCreateTemp, so I was
wondering if there is a reason not to call it inside BufFileCreateTemp.
The best answer I can think of is that a BufFileCreateTemp() caller
might not want to do catalog access. Perhaps the contortions within
assign_temp_tablespaces() are something that callers ought to opt in
to explicitly.
That doesn't seem like a particularly good or complete answer, though.
Perhaps it should simply be called within BufFileCreateTemp(). The
BufFile/fd.c layering is confusing in a number of ways IMV.
--
Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes:
On Mon, Apr 22, 2019 at 3:12 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:PrepareTempTablespaces is called by most callers of BufFileCreateTemp, so I was
wondering if there is a reason not to call it inside BufFileCreateTemp.
The best answer I can think of is that a BufFileCreateTemp() caller
might not want to do catalog access. Perhaps the contortions within
assign_temp_tablespaces() are something that callers ought to opt in
to explicitly.
It's kind of hard to see a reason to call it outside a transaction,
and even if we did, there are provisions for it not to go boom.
That doesn't seem like a particularly good or complete answer, though.
Perhaps it should simply be called within BufFileCreateTemp(). The
BufFile/fd.c layering is confusing in a number of ways IMV.
I don't actually see why BufFileCreateTemp should do it; if
we're to add a call, seems like OpenTemporaryFile is the place,
as that's what is really concerned with the temp tablespace(s).
I'm in favor of doing this, I think, as it sure looks to me like
gistInitBuildBuffers() is calling BufFileCreateTemp without any
closely preceding PrepareTempTablespaces. So we already have an
instance of Melanie's bug in core. It'd be difficult to notice
because of the silent-fallback-to-default-tablespace behavior.
regards, tom lane
I wrote:
Peter Geoghegan <pg@bowt.ie> writes:
That doesn't seem like a particularly good or complete answer, though.
Perhaps it should simply be called within BufFileCreateTemp(). The
BufFile/fd.c layering is confusing in a number of ways IMV.
I don't actually see why BufFileCreateTemp should do it; if
we're to add a call, seems like OpenTemporaryFile is the place,
as that's what is really concerned with the temp tablespace(s).
I'm in favor of doing this, I think, as it sure looks to me like
gistInitBuildBuffers() is calling BufFileCreateTemp without any
closely preceding PrepareTempTablespaces. So we already have an
instance of Melanie's bug in core. It'd be difficult to notice
because of the silent-fallback-to-default-tablespace behavior.
Here's a draft patch for that.
It's slightly ugly that this adds a dependency on commands/tablespace
to fd.c, which is a pretty low-level module. I think wanting to avoid
that layering violation might've been the reason for doing things the
way they are. However, this gets rid of tablespace dependencies in
some other files that are only marginally higher-level, like
tuplesort.c, so I'm not sure how strong that objection is.
There are three functions in fd.c that have a dependency on the
temp tablespace info having been set up:
OpenTemporaryFile
GetTempTablespaces
GetNextTempTableSpace
This patch makes the first of those automatically set up the info
if it's not done yet. The second one has always had an assertion
that the caller did it already, and now the third one does too.
An about equally plausible change would be to make all three
call PrepareTempTablespaces, but there are so few callers of the
second and third that I'm not sure that'd be better. Thoughts?
regards, tom lane
Attachments:
move-temp-tablespace-setup-calls-1.patchtext/x-diff; charset=us-ascii; name=move-temp-tablespace-setup-calls-1.patchDownload+14-14
On Tue, Apr 23, 2019 at 1:06 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
There are three functions in fd.c that have a dependency on the
temp tablespace info having been set up:
OpenTemporaryFile
GetTempTablespaces
GetNextTempTableSpace
This patch makes the first of those automatically set up the info
if it's not done yet. The second one has always had an assertion
that the caller did it already, and now the third one does too.
An about equally plausible change would be to make all three
call PrepareTempTablespaces, but there are so few callers of the
second and third that I'm not sure that'd be better. Thoughts?
I think an assertion is sufficiently clear for GetNextTempTableSpace based
on
what it does and its current callers. The same is probably true for
GetTempTableSpaces.
--
Melanie Plageman
I wrote:
Here's a draft patch for that.
It's slightly ugly that this adds a dependency on commands/tablespace
to fd.c, which is a pretty low-level module. I think wanting to avoid
that layering violation might've been the reason for doing things the
way they are. However, this gets rid of tablespace dependencies in
some other files that are only marginally higher-level, like
tuplesort.c, so I'm not sure how strong that objection is.There are three functions in fd.c that have a dependency on the
temp tablespace info having been set up:
OpenTemporaryFile
GetTempTablespaces
GetNextTempTableSpace
This patch makes the first of those automatically set up the info
if it's not done yet. The second one has always had an assertion
that the caller did it already, and now the third one does too.
After a bit more thought it seemed like another answer would be to
make all three of those functions assert that the caller did the
right thing, as per attached. This addresses the layering-violation
complaint, but might be more of a pain in the rear for developers.
Not really sure which way I like better.
regards, tom lane
Attachments:
assert-that-temp-tablespaces-are-prepared.patchtext/x-diff; charset=us-ascii; name=assert-that-temp-tablespaces-are-prepared.patchDownload+8-1
On Wed, Apr 24, 2019 at 12:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
After a bit more thought it seemed like another answer would be to
make all three of those functions assert that the caller did the
right thing, as per attached. This addresses the layering-violation
complaint, but might be more of a pain in the rear for developers.
In what sense is it not already a layering violation to call
PrepareTempTablespaces() as often as we do? PrepareTempTablespaces()
parses and validates the GUC variable and passes it to fd.c, but to me
that seems almost the same as calling the fd.c function
SetTempTablespaces() directly. PrepareTempTablespaces() allocates
memory that it won't free itself within TopTransactionContext. I'm not
seeing why the context that the PrepareTempTablespaces() catalog
access occurs in actually matters.
Like you, I find it hard to prefer one of the approaches over the
other, though I don't really know how to assess this layering
business. I'm glad that either approach will prevent oversights,
though.
--
Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes:
... I'm not
seeing why the context that the PrepareTempTablespaces() catalog
access occurs in actually matters.
The point there is that a catalog access might leak some amount of
memory. Probably not enough to be a big deal, but ...
regards, tom lane
On Wed, Apr 24, 2019 at 5:48 PM Peter Geoghegan <pg@bowt.ie> wrote:
On Wed, Apr 24, 2019 at 12:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
After a bit more thought it seemed like another answer would be to
make all three of those functions assert that the caller did the
right thing, as per attached. This addresses the layering-violation
complaint, but might be more of a pain in the rear for developers.In what sense is it not already a layering violation to call
PrepareTempTablespaces() as often as we do? PrepareTempTablespaces()
parses and validates the GUC variable and passes it to fd.c, but to me
that seems almost the same as calling the fd.c function
SetTempTablespaces() directly. PrepareTempTablespaces() allocates
memory that it won't free itself within TopTransactionContext. I'm not
seeing why the context that the PrepareTempTablespaces() catalog
access occurs in actually matters.Like you, I find it hard to prefer one of the approaches over the
other, though I don't really know how to assess this layering
business. I'm glad that either approach will prevent oversights,
though.
Just to provide my opinion, since we are at intersection and can go
either way on this. Second approach (just adding assert) only helps
if the code path for ALL future callers gets excersied and test exist for
the
same, to expose potential breakage. But with first approach fixes the issue
for current and future users, plus excersicing the same just with a single
test
already tests it for future callers as well. So, that way first approach
sounds
more promising if we are fetch between the two.
Ashwin Agrawal <aagrawal@pivotal.io> writes:
On Wed, Apr 24, 2019 at 5:48 PM Peter Geoghegan <pg@bowt.ie> wrote:
Like you, I find it hard to prefer one of the approaches over the
other, though I don't really know how to assess this layering
business. I'm glad that either approach will prevent oversights,
though.
Just to provide my opinion, since we are at intersection and can go
either way on this. Second approach (just adding assert) only helps
if the code path for ALL future callers gets excersied and test exist for
the same, to expose potential breakage.
In view of the fact that the existing regression tests fail to expose the
need for gistInitBuildBuffers to worry about this [1]/messages/by-id/24954.1556130678@sss.pgh.pa.us, that's a rather
strong point. It's hard to believe that somebody writing new code would
fail to notice such an assertion, but it's more plausible that later
rearrangements could break things and not notice due to lack of coverage.
However, by that argument we should change all 3 of these functions to
set up the data. If we're eating the layering violation to the extent
of letting OpenTemporaryFile call into commands/tablespace, then there's
little reason for the other 2 not to do likewise.
I still remain concerned that invoking catalog lookups from fd.c is a darn
bad idea, even if we have a fallback for it to work (for some value of
"work") in non-transactional states. It's not really hard to envision
that kind of thing leading to infinite recursion. I think it's safe
right now, because catalog fetches shouldn't lead to any temp-file
access, but that's sort of a rickety assumption isn't it?
regards, tom lane
On Thu, Apr 25, 2019 at 9:19 AM Ashwin Agrawal <aagrawal@pivotal.io> wrote:
Just to provide my opinion, since we are at intersection and can go
either way on this. Second approach (just adding assert) only helps
if the code path for ALL future callers gets excersied and test exist for
the
same, to expose potential breakage. But with first approach fixes the issue
for current and future users, plus excersicing the same just with a single
test
already tests it for future callers as well. So, that way first approach
sounds
more promising if we are fetch between the two.Would an existing test cover the code after moving PrepareTempTablespaces
into
OpenTemporaryFile?
--
Melanie Plageman
On Thu, Apr 25, 2019 at 9:45 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
However, by that argument we should change all 3 of these functions to
set up the data. If we're eating the layering violation to the extent
of letting OpenTemporaryFile call into commands/tablespace, then there's
little reason for the other 2 not to do likewise.
I agree to that point, same logic should be used for all three calls
irrespective of the approach we pick.
I still remain concerned that invoking catalog lookups from fd.c is a darn
bad idea, even if we have a fallback for it to work (for some value of
"work") in non-transactional states. It's not really hard to envision
that kind of thing leading to infinite recursion. I think it's safe
right now, because catalog fetches shouldn't lead to any temp-file
access, but that's sort of a rickety assumption isn't it?
Is there (easy) way to assert for that assumption? If yes, then can add the
same and make it not rickety.
Though I agree any exceptions/violations coded generally bites in long run
somewhere later.
On Thu, Apr 25, 2019 at 12:45:03PM -0400, Tom Lane wrote:
I still remain concerned that invoking catalog lookups from fd.c is a darn
bad idea, even if we have a fallback for it to work (for some value of
"work") in non-transactional states. It's not really hard to envision
that kind of thing leading to infinite recursion. I think it's safe
right now, because catalog fetches shouldn't lead to any temp-file
access, but that's sort of a rickety assumption isn't it?
Introducing catalog lookups into fd.c which is not a layer designed
for that is a choice that I find strange, and I fear that it may bite
in the future. I think that the choice proposed upthread to add
an assertion on TempTablespacesAreSet() when calling a function
working on temporary data is just but fine, and that we should just
make sure that the gist code calls PrepareTempTablespaces()
correctly. So [1]/messages/by-id/11777.1556133426@sss.pgh.pa.us -- Michael is a proposal I find much more acceptable than the
other one.
I think that one piece is missing from the patch. Wouldn't it be
better to add an assertion at the beginning of OpenTemporaryFile() to
make sure that PrepareTempTablespaces() has been called when interXact
is true? We could just go with that:
Assert(!interXact || TempTablespacesAreSet());
And this gives me the attached.
[1]: /messages/by-id/11777.1556133426@sss.pgh.pa.us -- Michael
--
Michael
Attachments:
assert-that-temp-tablespaces-are-prepared-v2.patchtext/x-diff; charset=us-asciiDownload+11-1
On Thu, Apr 25, 2019 at 10:40:01AM -0700, Ashwin Agrawal wrote:
Is there (easy) way to assert for that assumption? If yes, then can add the
same and make it not rickety.
IsTransactionState() would be enough?
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
I think that one piece is missing from the patch. Wouldn't it be
better to add an assertion at the beginning of OpenTemporaryFile() to
make sure that PrepareTempTablespaces() has been called when interXact
is true? We could just go with that:
Assert(!interXact || TempTablespacesAreSet());
The version that I posted left it to GetNextTempTableSpace to assert
that. That seemed cleaner to me than an Assert that has to depend
on interXact.
regards, tom lane
On Fri, Apr 26, 2019 at 11:05:11AM -0400, Tom Lane wrote:
Michael Paquier <michael@paquier.xyz> writes:
The version that I posted left it to GetNextTempTableSpace to assert
that. That seemed cleaner to me than an Assert that has to depend
on interXact.
Okay, no objections for that approach as well. Are you planning to do
something about this thread for v12? It seems like the direction to take
is pretty clear, at least from my perspective.
--
Michael
On Thu, Apr 25, 2019 at 11:53 PM Michael Paquier <michael@paquier.xyz> wrote:
On Thu, Apr 25, 2019 at 12:45:03PM -0400, Tom Lane wrote:
I still remain concerned that invoking catalog lookups from fd.c is a darn
bad idea, even if we have a fallback for it to work (for some value of
"work") in non-transactional states. It's not really hard to envision
that kind of thing leading to infinite recursion. I think it's safe
right now, because catalog fetches shouldn't lead to any temp-file
access, but that's sort of a rickety assumption isn't it?Introducing catalog lookups into fd.c which is not a layer designed
for that is a choice that I find strange, and I fear that it may bite
in the future. I think that the choice proposed upthread to add
an assertion on TempTablespacesAreSet() when calling a function
working on temporary data is just but fine, and that we should just
make sure that the gist code calls PrepareTempTablespaces()
correctly. So [1] is a proposal I find much more acceptable than the
other one.
Well the one thing I wish to point out explicitly is just taking fd.c
changes from [1]/messages/by-id/11777.1556133426@sss.pgh.pa.us, and running make check hits no assertions and
doesn't flag issue exist for gistbuildbuffers.c. Means its missing
coverage and in future same can happen as well.
On Mon, Apr 29, 2019 at 12:31 PM Ashwin Agrawal <aagrawal@pivotal.io> wrote:
Well the one thing I wish to point out explicitly is just taking fd.c
changes from [1], and running make check hits no assertions and
doesn't flag issue exist for gistbuildbuffers.c. Means its missing
coverage and in future same can happen as well.
I believe that the test coverage of GiST index builds is something
that is being actively worked on right now. It's a recognized problem
[1]: /messages/by-id/24954.1556130678@sss.pgh.pa.us -- Peter Geoghegan
[1]: /messages/by-id/24954.1556130678@sss.pgh.pa.us -- Peter Geoghegan
--
Peter Geoghegan
On Fri, Apr 26, 2019 at 8:05 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
The version that I posted left it to GetNextTempTableSpace to assert
that. That seemed cleaner to me than an Assert that has to depend
on interXact.Running `make check` with [1] applied and one of the calls to
PrepareTempTablespaces commented out, I felt like I deserved more as a
developer than the assertion in this case.
Assertions are especially good to protect against regressions, but, in this
case, I'm just trying to use an API that is being provided.
Assertions don't give me a nice, easy-to-understand test failure. I see that
there was a crash halfway through make check and now I have to figure out
why.
If that is the default way for developers to find out that they are missing
something when using the API, it would be nice if it gave me some sort of
understandable diff or error message.
I also think that if there is a step that a caller should always take before
calling a function, then there needs to be a very compelling reason not to
move
that step into the function itself.
So, just to make sure I understand this case:
PrepareTempTablespaces should not be called in BufFileCreateTemp because it
is
not concerned with temp tablespaces.
OpenTemporaryFile is concerned with temp tablespaces, so any reference to
those
should be there.
However, PrepareTempTablespaces should not be called in OpenTemporaryFile
because it is in fd.c and no functions that make up part of the file
descriptor
API should do catalog lookups.
Is this correct?
[1]: /messages/by-id/11777.1556133426@sss.pgh.pa.us
--
Melanie Plageman
Melanie Plageman <melanieplageman@gmail.com> writes:
I also think that if there is a step that a caller should always take before
calling a function, then there needs to be a very compelling reason not to
move that step into the function itself.
Fair complaint.
PrepareTempTablespaces should not be called in BufFileCreateTemp because
it is not concerned with temp tablespaces.
Actually, my reason for thinking that was mostly "that won't fix the
problem, because what about other callers of OpenTemporaryFile?"
However, looking around, there aren't any others --- buffile.c is it.
So maybe a reasonable compromise is to add the Assert(s) in fd.c as
per previous patch, but *also* add PrepareTempTablespaces in
BufFileCreateTemp, so that at least users of buffile.c are insulated
from the issue. buffile.c is still kind of low-level, but it's not
part of core infrastructure in the same way as fd.c, so probably I could
hold my nose for this solution from the system-structural standpoint.
regards, tom lane