Removal of currtid()/currtid2() and some table AM cleanup

Started by Michael Paquieralmost 6 years ago29 messageshackers
Jump to latest
#1Michael Paquier
michael@paquier.xyz

Hi all,

I have been looking at the ODBC driver and the need for currtid() as
well as currtid2(), and as mentioned already in [1]/messages/by-id/20200529005559.jl2gsolomyro4l4n@alap3.anarazel.de -- Michael, matching with my
lookup of things, these are actually not needed by the driver as long
as we connect to a server newer than 8.2 able to support RETURNING. I
am adding in CC of this thread Saito-san and Inoue-san who are the
two main maintainers of the driver for comments. It is worth noting
that on its latest HEAD the ODBC driver requires libpq from at least
9.2.

I would like to remove those two functions and the surrounding code
for v14, leading to some cleanup:
6 files changed, 326 deletions(-)

While on it, I have noticed that heap_get_latest_tid() is still
located within heapam.c, but we can just move it within
heapam_handler.c.

Attached are two patches to address both points. Comments are
welcome.

Thanks,

[1]: /messages/by-id/20200529005559.jl2gsolomyro4l4n@alap3.anarazel.de -- Michael
--
Michael

Attachments:

0001-Remove-currtid-and-currtid2.patchtext/x-diff; charset=us-asciiDownload+0-303
0002-Move-heap_get_latest_tid-within-the-heap-AM-handler.patchtext/x-diff; charset=us-asciiDownload+117-120
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#1)
Re: Removal of currtid()/currtid2() and some table AM cleanup

Michael Paquier <michael@paquier.xyz> writes:

I would like to remove those two functions and the surrounding code
for v14, leading to some cleanup:

+1

While on it, I have noticed that heap_get_latest_tid() is still
located within heapam.c, but we can just move it within
heapam_handler.c.

It looks like table_beginscan_tid wouldn't need to be exported anymore
either.

regards, tom lane

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
Re: Removal of currtid()/currtid2() and some table AM cleanup

I wrote:

It looks like table_beginscan_tid wouldn't need to be exported anymore
either.

Ah, scratch that, I misread it.

regards, tom lane

#4Inoue, Hiroshi
h-inoue@dream.email.ne.jp
In reply to: Michael Paquier (#1)
Re: Removal of currtid()/currtid2() and some table AM cleanup

Hi,

On 2020/06/03 11:14, Michael Paquier wrote:

Hi all,

I have been looking at the ODBC driver and the need for currtid() as
well as currtid2(), and as mentioned already in [1], matching with my
lookup of things, these are actually not needed by the driver as long
as we connect to a server newer than 8.2 able to support RETURNING.

Though currtid2() is necessary even for servers which support RETURNING,
I don't object to remove it.

regards,
Hiroshi Inoue

Show quoted text

I
am adding in CC of this thread Saito-san and Inoue-san who are the
two main maintainers of the driver for comments. It is worth noting
that on its latest HEAD the ODBC driver requires libpq from at least
9.2.

I would like to remove those two functions and the surrounding code
for v14, leading to some cleanup:
6 files changed, 326 deletions(-)

While on it, I have noticed that heap_get_latest_tid() is still
located within heapam.c, but we can just move it within
heapam_handler.c.

Attached are two patches to address both points. Comments are
welcome.

Thanks,

[1]: /messages/by-id/20200529005559.jl2gsolomyro4l4n@alap3.anarazel.de
--
Michael

#5Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#1)
Re: Removal of currtid()/currtid2() and some table AM cleanup

Hi,

On 2020-06-03 11:14:48 +0900, Michael Paquier wrote:

I would like to remove those two functions and the surrounding code
for v14, leading to some cleanup:
6 files changed, 326 deletions(-)

+1

While on it, I have noticed that heap_get_latest_tid() is still
located within heapam.c, but we can just move it within
heapam_handler.c.

What's the point of that change? I think the differentiation between
heapam_handler.c and heapam.c could be clearer, but if anything, I'd
argue that heap_get_latest_tid is sufficiently low-level that it'd
belong in heapam.c.

Greetings,

Andres Freund

#6Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#5)
Re: Removal of currtid()/currtid2() and some table AM cleanup

On Thu, Jun 04, 2020 at 10:59:05AM -0700, Andres Freund wrote:

What's the point of that change? I think the differentiation between
heapam_handler.c and heapam.c could be clearer, but if anything, I'd
argue that heap_get_latest_tid is sufficiently low-level that it'd
belong in heapam.c.

Well, heap_get_latest_tid() is only called in heapam_handler.c if
anything, as it is not used elsewhere and not publish it. And IMO we
should try to encourage using table_get_latest_tid() instead if some
plugins need that. Anyway, if you are opposed to this change, I won't
push hard for it either.
--
Michael

#7Michael Paquier
michael@paquier.xyz
In reply to: Inoue, Hiroshi (#4)
Re: Removal of currtid()/currtid2() and some table AM cleanup

On Wed, Jun 03, 2020 at 10:10:21PM +0900, Inoue, Hiroshi wrote:

On 2020/06/03 11:14, Michael Paquier wrote:

I have been looking at the ODBC driver and the need for currtid() as
well as currtid2(), and as mentioned already in [1], matching with my
lookup of things, these are actually not needed by the driver as long
as we connect to a server newer than 8.2 able to support RETURNING.

Though currtid2() is necessary even for servers which support RETURNING,
I don't object to remove it.

In which cases is it getting used then? From what I can see there is
zero coverage for that part in the tests. And based on a rough read
of the code, this would get called with LATEST_TUPLE_LOAD being set,
where there is some kind of bulk deletion involved. Couldn't that be
a problem?
--
Michael

#8Inoue, Hiroshi
h-inoue@dream.email.ne.jp
In reply to: Michael Paquier (#7)
Re: Removal of currtid()/currtid2() and some table AM cleanup

On 2020/06/05 15:22, Michael Paquier wrote:

On Wed, Jun 03, 2020 at 10:10:21PM +0900, Inoue, Hiroshi wrote:

On 2020/06/03 11:14, Michael Paquier wrote:

I have been looking at the ODBC driver and the need for currtid() as
well as currtid2(), and as mentioned already in [1], matching with my
lookup of things, these are actually not needed by the driver as long
as we connect to a server newer than 8.2 able to support RETURNING.

Though currtid2() is necessary even for servers which support RETURNING,
I don't object to remove it.

In which cases is it getting used then?

Keyset-driven cursors always detect changes made by other applications
(and themselves). currtid() is necessary to detect the changes.
CTIDs are changed by updates unfortunately.

regards,
Hiroshi Inoue

Show quoted text

From what I can see there is
zero coverage for that part in the tests. And based on a rough read
of the code, this would get called with LATEST_TUPLE_LOAD being set,
where there is some kind of bulk deletion involved. Couldn't that be
a problem?
--
Michael

#9Michael Paquier
michael@paquier.xyz
In reply to: Inoue, Hiroshi (#8)
Re: Removal of currtid()/currtid2() and some table AM cleanup

On Fri, Jun 05, 2020 at 10:25:00PM +0900, Inoue, Hiroshi wrote:

Keyset-driven cursors always detect changes made by other applications
(and themselves). currtid() is necessary to detect the changes.
CTIDs are changed by updates unfortunately.

You mean currtid2() here and not currtid(), right? We have two
problems here then:
1) We cannot actually really remove currtid2() from the backend yet
without removing the dependency in the driver, or that may break some
users.
2) The driver does not include tests for that stuff yet.
--
Michael

#10Inoue, Hiroshi
h-inoue@dream.email.ne.jp
In reply to: Michael Paquier (#9)
Re: Removal of currtid()/currtid2() and some table AM cleanup

Sorry for the reply.

On 2020/06/08 15:52, Michael Paquier wrote:

On Fri, Jun 05, 2020 at 10:25:00PM +0900, Inoue, Hiroshi wrote:

Keyset-driven cursors always detect changes made by other applications
(and themselves). currtid() is necessary to detect the changes.
CTIDs are changed by updates unfortunately.

You mean currtid2() here and not currtid(), right?

Yes.

We have two
problems here then:
1) We cannot actually really remove currtid2() from the backend yet
without removing the dependency in the driver, or that may break some
users.

I think only ODBC driver uses currtid2().

2) The driver does not include tests for that stuff yet.

SQLSetPos(.., .., SQL_REFRESH, ..) call in positioned-update-test passes
the stuff
�when 'Use Declare/Fetch' option is turned off. In other words,
keyset-driven cursor
is not supported when 'Use Declare/Fetch' option is turned on. Probably
keyset-driven
cursor support would be lost regardless of 'Use Declare/Fetch' option
after the
removal of currtid2().

Show quoted text

--
Michael

#11Michael Paquier
michael@paquier.xyz
In reply to: Inoue, Hiroshi (#10)
Re: Removal of currtid()/currtid2() and some table AM cleanup

On Mon, Jun 15, 2020 at 08:50:23PM +0900, Inoue, Hiroshi wrote:

Sorry for the reply.

No problem, thanks for taking the time.

On 2020/06/08 15:52, Michael Paquier wrote:

On Fri, Jun 05, 2020 at 10:25:00PM +0900, Inoue, Hiroshi wrote:
We have two
problems here then:
1) We cannot actually really remove currtid2() from the backend yet
without removing the dependency in the driver, or that may break some
users.

I think only ODBC driver uses currtid2().

Check. I think so too.

2) The driver does not include tests for that stuff yet.

SQLSetPos(.., .., SQL_REFRESH, ..) call in positioned-update-test passes the
stuff
 when 'Use Declare/Fetch' option is turned off. In other words,
keyset-driven cursor
is not supported when 'Use Declare/Fetch' option is turned on. Probably
keyset-driven
cursor support would be lost regardless of 'Use Declare/Fetch' option after
the removal of currtid2().

Sorry, but I am not quite sure what is the relationship between
UseDeclareFetch and currtid2()? Is that related to the use of
SQL_CURSOR_KEYSET_DRIVEN? The only thing I can be sure of here is
that we never call currtid2() in any of the regression tests present
in the ODBC code for any of the scenarios covered by installcheck-all,
so that does not really bring any confidence that removing currtid2()
is a wise thing to do, because we may silently break stuff. If the
function is used, it would be good to close the gap with a test to
stress that at least in the driver.

currtid(), on the contrary, would be fine as far as I understand
because the ODBC code relies on a RETURNING ctid instead, and that's
supported for ages in the Postgres backend.
--
Michael

#12Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#11)
Re: Removal of currtid()/currtid2() and some table AM cleanup

On Tue, Jun 23, 2020 at 01:29:06PM +0900, Michael Paquier wrote:

Sorry, but I am not quite sure what is the relationship between
UseDeclareFetch and currtid2()? Is that related to the use of
SQL_CURSOR_KEYSET_DRIVEN? The only thing I can be sure of here is
that we never call currtid2() in any of the regression tests present
in the ODBC code for any of the scenarios covered by installcheck-all,
so that does not really bring any confidence that removing currtid2()
is a wise thing to do, because we may silently break stuff. If the
function is used, it would be good to close the gap with a test to
stress that at least in the driver.

Actually, while reviewing the code, the only code path where we use
currtid2() involves positioned_load() and LATEST_TUPLE_LOAD. And the
only location where this happens is in SC_pos_reload_with_key(), where
I don't actually see how it would be possible to not have a keyset and
still use a CTID, which would led to LATEST_TUPLE_LOAD being used. So
could it be possible that the code paths of currtid2() are actually
just dead code?
--
Michael

#13Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#12)
Re: Removal of currtid()/currtid2() and some table AM cleanup

On Tue, Jun 23, 2020 at 02:02:33PM +0900, Michael Paquier wrote:

Actually, while reviewing the code, the only code path where we use
currtid2() involves positioned_load() and LATEST_TUPLE_LOAD. And the
only location where this happens is in SC_pos_reload_with_key(), where
I don't actually see how it would be possible to not have a keyset and
still use a CTID, which would led to LATEST_TUPLE_LOAD being used. So
could it be possible that the code paths of currtid2() are actually
just dead code?

I have dug more into this one, and we actually stressed this code path
quite a lot up to commit d9cb23f in the ODBC driver, with tests
cursor-block-delete, positioned-update and bulkoperations particularly
when calling SQLSetPos(). However, 86e2e7a has reworked the code in
such a way that we visibly don't use anymore CTIDs if we don't have a
keyset, and that combinations of various options like UseDeclareFetch
or UpdatableCursors don't trigger this code path anymore. In short,
currtid2() does not get used. Inoue-san, Saito-san, what do you
think? I am adding also Tsunakawa-san in CC who has some experience
in this area.
--
Michael

#14Inoue, Hiroshi
h-inoue@dream.email.ne.jp
In reply to: Michael Paquier (#13)
Re: Removal of currtid()/currtid2() and some table AM cleanup

Hi Michael,

Where do you test, on Windows or on *nix?
How do you test there?

regards,
Hiroshi Inoue

Show quoted text

On 2020/06/24 11:11, Michael Paquier wrote:

On Tue, Jun 23, 2020 at 02:02:33PM +0900, Michael Paquier wrote:

Actually, while reviewing the code, the only code path where we use
currtid2() involves positioned_load() and LATEST_TUPLE_LOAD. And the
only location where this happens is in SC_pos_reload_with_key(), where
I don't actually see how it would be possible to not have a keyset and
still use a CTID, which would led to LATEST_TUPLE_LOAD being used. So
could it be possible that the code paths of currtid2() are actually
just dead code?

I have dug more into this one, and we actually stressed this code path
quite a lot up to commit d9cb23f in the ODBC driver, with tests
cursor-block-delete, positioned-update and bulkoperations particularly
when calling SQLSetPos(). However, 86e2e7a has reworked the code in
such a way that we visibly don't use anymore CTIDs if we don't have a
keyset, and that combinations of various options like UseDeclareFetch
or UpdatableCursors don't trigger this code path anymore. In short,
currtid2() does not get used. Inoue-san, Saito-san, what do you
think? I am adding also Tsunakawa-san in CC who has some experience
in this area.
--
Michael

#15Michael Paquier
michael@paquier.xyz
In reply to: Inoue, Hiroshi (#14)
Re: Removal of currtid()/currtid2() and some table AM cleanup

Hi Inoue-san,

On Wed, Jun 24, 2020 at 05:20:42PM +0900, Inoue, Hiroshi wrote:

Where do you test, on Windows or on *nix?
How do you test there?

I have been testing the driver on macos only, with various backend
versions, from 11 to 14.

Thanks,
--
Michael

#16Inoue, Hiroshi
h-inoue@dream.email.ne.jp
In reply to: Michael Paquier (#13)
Re: Removal of currtid()/currtid2() and some table AM cleanup

Hi,

I seem to have invalidated KEYSET-DRIVEN cursors used in
positioned-update test .
It was introduced by the commit 4a272fd but was invalidated by the
commit 2be35a6.

I don't object to the removal of currtid(2) because keyset-driven
cursors in psqlodbc are changed into static cursors in many cases and
I've hardly ever heard a complaint about it.

regards,
Hiroshi Inoue

Show quoted text

On 2020/06/24 11:11, Michael Paquier wrote:

On Tue, Jun 23, 2020 at 02:02:33PM +0900, Michael Paquier wrote:

Actually, while reviewing the code, the only code path where we use
currtid2() involves positioned_load() and LATEST_TUPLE_LOAD. And the
only location where this happens is in SC_pos_reload_with_key(), where
I don't actually see how it would be possible to not have a keyset and
still use a CTID, which would led to LATEST_TUPLE_LOAD being used. So
could it be possible that the code paths of currtid2() are actually
just dead code?

I have dug more into this one, and we actually stressed this code path
quite a lot up to commit d9cb23f in the ODBC driver, with tests
cursor-block-delete, positioned-update and bulkoperations particularly
when calling SQLSetPos(). However, 86e2e7a has reworked the code in
such a way that we visibly don't use anymore CTIDs if we don't have a
keyset, and that combinations of various options like UseDeclareFetch
or UpdatableCursors don't trigger this code path anymore. In short,
currtid2() does not get used. Inoue-san, Saito-san, what do you
think? I am adding also Tsunakawa-san in CC who has some experience
in this area.
--
Michael

#17Michael Paquier
michael@paquier.xyz
In reply to: Inoue, Hiroshi (#16)
Re: Removal of currtid()/currtid2() and some table AM cleanup

On Thu, Jun 25, 2020 at 10:14:00PM +0900, Inoue, Hiroshi wrote:

I seem to have invalidated KEYSET-DRIVEN cursors used in positioned-update
test. It was introduced by the commit 4a272fd but was invalidated by the
commit 2be35a6.

I don't object to the removal of currtid(2) because keyset-driven cursors in
psqlodbc are changed into static cursors in many cases and I've hardly ever
heard a complaint about it.

Hmm. I am not sure that this completely answers my original concern
though. In short, don't we still have corner cases where
keyset-driven cursors are not changed into static cursors, meaning
that currtid2() could get used? The removal of the in-core functions
would hurt applications using that, meaning that we should at least
provide an equivalent of currtid2() in the worse case as a contrib
module, no? If the code paths of currtid2() are reachable, shouldn't
we also make sure that they are still reached in the regression tests
of the driver, meaning that the driver code needs more coverage? I
have been looking at the tests and tried to tweak them using
SQLSetPos() so as the code paths involving currtid2() get reached, but
I am not really able to do so. It does not mean that that currtid2()
never gets reached, it just means that I am not able to be sure that
this part can be safely removed from the Postgres backend code :(

From what I can see on this thread, we could just remove currtid() per
the arguments of the RETURNING ctid clause supported since PG 8.2, but
it would make more sense to me to just remove both currtid/currtid2()
at once.
--
Michael

#18Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#17)
Re: Removal of currtid()/currtid2() and some table AM cleanup

On Fri, Jun 26, 2020 at 01:11:55PM +0900, Michael Paquier wrote:

From what I can see on this thread, we could just remove currtid() per
the arguments of the RETURNING ctid clause supported since PG 8.2, but
it would make more sense to me to just remove both currtid/currtid2()
at once.

The CF bot is complaining, so here is a rebase for the main patch.
Opinions are welcome about the arguments of upthread.
--
Michael

Attachments:

v2-0001-Remove-currtid-and-currtid2.patchtext/x-diff; charset=us-asciiDownload+0-303
#19Peter Eisentraut
peter_e@gmx.net
In reply to: Michael Paquier (#18)
Re: Removal of currtid()/currtid2() and some table AM cleanup

On 2020-09-03 12:14, Michael Paquier wrote:

On Fri, Jun 26, 2020 at 01:11:55PM +0900, Michael Paquier wrote:

From what I can see on this thread, we could just remove currtid() per
the arguments of the RETURNING ctid clause supported since PG 8.2, but
it would make more sense to me to just remove both currtid/currtid2()
at once.

The CF bot is complaining, so here is a rebase for the main patch.
Opinions are welcome about the arguments of upthread.

It appears that currtid2() is still used, so we ought to keep it.

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#19)
Re: Removal of currtid()/currtid2() and some table AM cleanup

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 2020-09-03 12:14, Michael Paquier wrote:

Opinions are welcome about the arguments of upthread.

It appears that currtid2() is still used, so we ought to keep it.

Yeah, if pgODBC were not using it at all then I think it'd be fine
to get rid of, but if it still contains calls then we cannot.
The suggestion upthread that those calls might be unreachable
is interesting, but it seems unproven.

regards, tom lane

#21Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#20)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#21)
#23Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#22)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#23)
#25Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#24)
#26Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#24)
#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#26)
#28Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#27)
#29Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#28)