Combine pg_walinspect till_end_of_wal functions with others

Started by Bharath Rupireddyabout 3 years ago35 messageshackers
Jump to latest
#1Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com

Hi,

In a recent discussion [1]/messages/by-id/CALj2ACV-WBN=EUgUPyYOGitp+rn163vMnQd=HcWrnKt-uqFYFA@mail.gmail.com, Michael Paquier asked if we can combine
pg_walinspect till_end_of_wal functions with other functions
pg_get_wal_records_info and pg_get_wal_stats. The code currently looks
much duplicated and the number of functions that pg_walinspect exposes
to the users is bloated. The point was that the till_end_of_wal
functions determine the end LSN and everything else that they do is
the same as their counterpart functions. Well, the idea then was to
keep things simple, not clutter the APIs, have better and consistent
user-inputted end_lsn validations at the cost of usability and code
redundancy. However, now I tend to agree with the feedback received.

I'm attaching a patch doing the $subject with the following behavior:
1. If start_lsn is NULL, error out/return NULL.
2. If end_lsn isn't specified, default to NULL, then determine the end_lsn.
3. If end_lsn is specified as NULL, then determine the end_lsn.
4. If end_lsn is specified as non-NULL, then determine if it is
greater than start_lsn if yes, go ahead do the job, otherwise error
out.

Another idea is to convert till_end_of_wal flavors to SQL-only
functions and remove the c code from pg_walinspect.c. However, I
prefer $subject and completely remove till_end_of_wal flavors for
better usability in the long term.

Thoughts?

[1]: /messages/by-id/CALj2ACV-WBN=EUgUPyYOGitp+rn163vMnQd=HcWrnKt-uqFYFA@mail.gmail.com

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v1-0001-Reduce-pg_walinspect-s-functions-without-losing-f.patchapplication/x-patch; name=v1-0001-Reduce-pg_walinspect-s-functions-without-losing-f.patchDownload+336-134
#2Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#1)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Wed, Mar 1, 2023 at 1:00 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

Hi,

In a recent discussion [1], Michael Paquier asked if we can combine
pg_walinspect till_end_of_wal functions with other functions
pg_get_wal_records_info and pg_get_wal_stats. The code currently looks
much duplicated and the number of functions that pg_walinspect exposes
to the users is bloated. The point was that the till_end_of_wal
functions determine the end LSN and everything else that they do is
the same as their counterpart functions. Well, the idea then was to
keep things simple, not clutter the APIs, have better and consistent
user-inputted end_lsn validations at the cost of usability and code
redundancy. However, now I tend to agree with the feedback received.

I'm attaching a patch doing the $subject with the following behavior:
1. If start_lsn is NULL, error out/return NULL.
2. If end_lsn isn't specified, default to NULL, then determine the end_lsn.
3. If end_lsn is specified as NULL, then determine the end_lsn.
4. If end_lsn is specified as non-NULL, then determine if it is
greater than start_lsn if yes, go ahead do the job, otherwise error
out.

Another idea is to convert till_end_of_wal flavors to SQL-only
functions and remove the c code from pg_walinspect.c. However, I
prefer $subject and completely remove till_end_of_wal flavors for
better usability in the long term.

Thoughts?

[1] /messages/by-id/CALj2ACV-WBN=EUgUPyYOGitp+rn163vMnQd=HcWrnKt-uqFYFA@mail.gmail.com

Needed a rebase due to 019f8624664dbf1e25e2bd721c7e99822812d109.
Attaching v2 patch. Sorry for the noise.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v2-0001-Reduce-pg_walinspect-s-functions-without-losing-f.patchapplication/x-patch; name=v2-0001-Reduce-pg_walinspect-s-functions-without-losing-f.patchDownload+336-134
#3Julien Rouhaud
rjuju123@gmail.com
In reply to: Bharath Rupireddy (#2)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Wed, Mar 01, 2023 at 08:30:00PM +0530, Bharath Rupireddy wrote:

On Wed, Mar 1, 2023 at 1:00 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

In a recent discussion [1], Michael Paquier asked if we can combine
pg_walinspect till_end_of_wal functions with other functions
pg_get_wal_records_info and pg_get_wal_stats. The code currently looks
much duplicated and the number of functions that pg_walinspect exposes
to the users is bloated. The point was that the till_end_of_wal
functions determine the end LSN and everything else that they do is
the same as their counterpart functions. Well, the idea then was to
keep things simple, not clutter the APIs, have better and consistent
user-inputted end_lsn validations at the cost of usability and code
redundancy. However, now I tend to agree with the feedback received.

+1, especially since I really don't like the use of "till" in the function
names.

I'm attaching a patch doing the $subject with the following behavior:
1. If start_lsn is NULL, error out/return NULL.

Maybe naive and unrelated question, but is that really helpful? If for some
reason I want to see information about *all available WAL*, I have to manually
dig for a suitable LSN. The same action with pg_waldump is easier as I just
need to use the oldest available WAL that's present on disk.

Another idea is to convert till_end_of_wal flavors to SQL-only
functions and remove the c code from pg_walinspect.c. However, I
prefer $subject and completely remove till_end_of_wal flavors for
better usability in the long term.

I agree that using default arguments is a way better API.

Nitpicking:

Maybe we could group the kept unused exported C function at the end of the
file?

Also:

/*
- * Get info and data of all WAL records from start LSN till end of WAL.
+ * NB: This function does nothing and stays here for backward compatibility.
+ * Without it, the extension fails to install.
  *
- * This function emits an error if a future start i.e. WAL LSN the database
- * system doesn't know about is specified.
+ * Try using pg_get_wal_records_info() for the same till_end_of_wal
+ * functionaility.
  */
 Datum
 pg_get_wal_records_info_till_end_of_wal(PG_FUNCTION_ARGS)
 {
-   XLogRecPtr  start_lsn;
-   XLogRecPtr  end_lsn = InvalidXLogRecPtr;
-
-   start_lsn = PG_GETARG_LSN(0);
-
-   end_lsn = ValidateInputLSNs(true, start_lsn, end_lsn);
-
-   GetWALRecordsInfo(fcinfo, start_lsn, end_lsn);
-
-   PG_RETURN_VOID();
+   PG_RETURN_NULL();
 }

I don't like much this chunk (same for the other kept function). Apart from
the obvious typo in "functionaility", I don't think that the comment is really
accurate.

Also, are we actually helping users if we simply return NULL there? It's quite
possible that people will start to use the new shared lib while still having
the 1.1 SQL definition of the extension installed. In that case, they will
simply retrieve a NULL row and may spend some time wondering why until they
eventually realize that their only option is to upgrade the extension first and
then use another function. Why not make their life easier and explicity raise
a suitable error at the SQL level if users try to use those functions?

#4Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Julien Rouhaud (#3)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Mon, Mar 6, 2023 at 2:22 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

I'm attaching a patch doing the $subject with the following behavior:
1. If start_lsn is NULL, error out/return NULL.

Maybe naive and unrelated question, but is that really helpful? If for some
reason I want to see information about *all available WAL*, I have to manually
dig for a suitable LSN. The same action with pg_waldump is easier as I just
need to use the oldest available WAL that's present on disk.

Are you saying that the pg_walinspect functions should figure out the
oldest available WAL file and LSN, and start from there if start_lsn
specified as NULL or invalid? Note that pg_waldump requires either
explicit startlsn and/or startseg (WAL file name), it can't search for
the oldest WAL file available and start from there automatically.

If the user wants to figure it out, they can do something like below:

ostgres=# select * from pg_ls_waldir() order by name;
name | size | modification
--------------------------+----------+------------------------
000000010000000000000001 | 16777216 | 2023-03-06 14:54:55+00
000000010000000000000002 | 16777216 | 2023-03-06 14:54:55+00

If we try to make these functions figure out the oldest WAl file and
start from there, then it'll unnecessarily complicate the APIs and
functions. If we still think we need a better function for the users
to figure out the oldest WAL file, perhaps, add a SQL-only
view/function to pg_walinspect that returns "select name from
pg_ls_waldir() order by name limit 1;", but honestly, that's so
trivial.

Another idea is to convert till_end_of_wal flavors to SQL-only
functions and remove the c code from pg_walinspect.c. However, I
prefer $subject and completely remove till_end_of_wal flavors for
better usability in the long term.

I agree that using default arguments is a way better API.

Thanks. Yes, that's true.

Nitpicking:

Maybe we could group the kept unused exported C function at the end of the
file?

Will do.

Also:

/*
- * Get info and data of all WAL records from start LSN till end of WAL.
+ * NB: This function does nothing and stays here for backward compatibility.
+ * Without it, the extension fails to install.
*
- * This function emits an error if a future start i.e. WAL LSN the database
- * system doesn't know about is specified.
+ * Try using pg_get_wal_records_info() for the same till_end_of_wal
+ * functionaility.

I don't like much this chunk (same for the other kept function). Apart from
the obvious typo in "functionaility", I don't think that the comment is really
accurate.

Can you be more specific what's inaccurate about the comment?

Also, are we actually helping users if we simply return NULL there? It's quite
possible that people will start to use the new shared lib while still having
the 1.1 SQL definition of the extension installed. In that case, they will
simply retrieve a NULL row and may spend some time wondering why until they
eventually realize that their only option is to upgrade the extension first and
then use another function. Why not make their life easier and explicity raise
a suitable error at the SQL level if users try to use those functions?

I thought about it initially, but wanted to avoid more errors. An
error would make them use the new version easily. I will change it
that way.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#5Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Bharath Rupireddy (#4)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Mon, 6 Mar 2023 at 16:06, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

If we try to make these functions figure out the oldest WAl file and
start from there, then it'll unnecessarily complicate the APIs and
functions. If we still think we need a better function for the users
to figure out the oldest WAL file, perhaps, add a SQL-only
view/function to pg_walinspect that returns "select name from
pg_ls_waldir() order by name limit 1;", but honestly, that's so
trivial.

That "order by name limit 1" has subtle bugs when you're working on a
system that has experienced timeline switches. It is entirely possible
that the first file (as sorted by the default collation) is not the
first record you can inspect, or even in your timeline's history.

Kind regards,

Matthias van de Meent

#6Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Matthias van de Meent (#5)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Mon, Mar 6, 2023 at 8:52 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

On Mon, 6 Mar 2023 at 16:06, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

If we try to make these functions figure out the oldest WAl file and
start from there, then it'll unnecessarily complicate the APIs and
functions. If we still think we need a better function for the users
to figure out the oldest WAL file, perhaps, add a SQL-only
view/function to pg_walinspect that returns "select name from
pg_ls_waldir() order by name limit 1;", but honestly, that's so
trivial.

That "order by name limit 1" has subtle bugs when you're working on a
system that has experienced timeline switches. It is entirely possible
that the first file (as sorted by the default collation) is not the
first record you can inspect, or even in your timeline's history.

Hm. Note that pg_walinspect currently searches WAL on insertion
timeline; it doesn't care about the older timelines. The idea of
making it look at WAL on an older timeline was discussed, but for the
sake of simplicity we kept the functions simple. If needed, I can try
adding the timeline as input parameters to all the functions (with
default -1 meaning current insertion timeline; if specified, look for
WAL on that timeline).

Are you saying that a pg_walinspect function that traverses the pg_wal
directory and figures out the old valid WAL on a given timeline is
still useful? Or make the functions look for older WAL if start_lsn is
given as NULL or invalid?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#7Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Bharath Rupireddy (#6)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Mon, 6 Mar 2023 at 16:37, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Mon, Mar 6, 2023 at 8:52 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

On Mon, 6 Mar 2023 at 16:06, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

If we try to make these functions figure out the oldest WAl file and
start from there, then it'll unnecessarily complicate the APIs and
functions. If we still think we need a better function for the users
to figure out the oldest WAL file, perhaps, add a SQL-only
view/function to pg_walinspect that returns "select name from
pg_ls_waldir() order by name limit 1;", but honestly, that's so
trivial.

That "order by name limit 1" has subtle bugs when you're working on a
system that has experienced timeline switches. It is entirely possible
that the first file (as sorted by the default collation) is not the
first record you can inspect, or even in your timeline's history.

Hm. Note that pg_walinspect currently searches WAL on insertion
timeline; it doesn't care about the older timelines. The idea of
making it look at WAL on an older timeline was discussed, but for the
sake of simplicity we kept the functions simple. If needed, I can try
adding the timeline as input parameters to all the functions (with
default -1 meaning current insertion timeline; if specified, look for
WAL on that timeline).

Are you saying that a pg_walinspect function that traverses the pg_wal
directory and figures out the old valid WAL on a given timeline is
still useful? Or make the functions look for older WAL if start_lsn is
given as NULL or invalid?

The specific comment I made was only regarding the following issue: An
instance may still have WAL segments from before the latest timeline
switch. These segments may have a higher LSN and lower timeline number
than your current running timeline+LSN (because of e.g. pg_rewind).
This will then result in unwanted behaviour when you sort the segments
numerically/alphabetically and then assume that the first file's LSN
is valid (or available) in your current timeline.

That is why "order by name limit 1" isn't a good solution, and that's
what I was commenting on: you need to parse the timeline hierarchy to
determine which timelines you can use which WAL segments of.

To answer your question on whether I'd like us to traverse timeline
switches: Yes, I'd really like it if we were able to decode the
current timeline's hierarchical WAL of a PG instance in one go, from
the start at (iirc) 0x10000 all the way to the current LSN, assuming
the segments are available.

Kind regards,

Matthias van de Meent

#8Julien Rouhaud
rjuju123@gmail.com
In reply to: Bharath Rupireddy (#4)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Mon, Mar 06, 2023 at 08:36:17PM +0530, Bharath Rupireddy wrote:

On Mon, Mar 6, 2023 at 2:22 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

Also:

/*
- * Get info and data of all WAL records from start LSN till end of WAL.
+ * NB: This function does nothing and stays here for backward compatibility.
+ * Without it, the extension fails to install.
*
- * This function emits an error if a future start i.e. WAL LSN the database
- * system doesn't know about is specified.
+ * Try using pg_get_wal_records_info() for the same till_end_of_wal
+ * functionaility.

I don't like much this chunk (same for the other kept function). Apart from
the obvious typo in "functionaility", I don't think that the comment is really
accurate.

Can you be more specific what's inaccurate about the comment?

It's problematic to install the extension if we rely on upgrade scripts only.
We could also provide a pg_walinspect--1.2.sql file and it would just work, and
that may have been a good idea if there wasn't also the problem of people still
having the version 1.1 locally installed, as we don't want them to see random
failures like "could not find function ... in file ...", or keeping the ability
to install the former 1.1 version (with those functions bypassed).

#9Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#8)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Tue, Mar 07, 2023 at 09:13:46AM +0800, Julien Rouhaud wrote:

It's problematic to install the extension if we rely on upgrade scripts only.
We could also provide a pg_walinspect--1.2.sql file and it would just work, and
that may have been a good idea if there wasn't also the problem of people still
having the version 1.1 locally installed, as we don't want them to see random
failures like "could not find function ... in file ...", or keeping the ability
to install the former 1.1 version (with those functions bypassed).

Why would we need a 1.2? HEAD is the only branch with pg_walinspect
1.1, and it has not been released yet.
--
Michael

#10Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#9)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Tue, 7 Mar 2023, 12:36 Michael Paquier, <michael@paquier.xyz> wrote:

On Tue, Mar 07, 2023 at 09:13:46AM +0800, Julien Rouhaud wrote:

It's problematic to install the extension if we rely on upgrade scripts

only.

We could also provide a pg_walinspect--1.2.sql file and it would just

work, and

that may have been a good idea if there wasn't also the problem of

people still

having the version 1.1 locally installed, as we don't want them to see

random

failures like "could not find function ... in file ...", or keeping the

ability

to install the former 1.1 version (with those functions bypassed).

Why would we need a 1.2? HEAD is the only branch with pg_walinspect
1.1, and it has not been released yet.

ah right I should have checked. but the same ABI compatibility concern
still exists for version 1.0 of the extension.

Show quoted text
#11Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#10)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Tue, Mar 07, 2023 at 12:42:20PM +0800, Julien Rouhaud wrote:

ah right I should have checked. but the same ABI compatibility concern
still exists for version 1.0 of the extension.

Yes, we'd better make sure that the past code is able to run, at
least. Now I am not really convinced that we have the need to enforce
an error with the new code even if 1.0 is still installed, so as it is
possible to remove all the traces of the code that triggers errors if
an end LSN is higher than the current insert LSN for primaries or
replayed LSN for standbys.
--
Michael

#12Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#11)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Tue, Mar 07, 2023 at 01:56:24PM +0900, Michael Paquier wrote:

On Tue, Mar 07, 2023 at 12:42:20PM +0800, Julien Rouhaud wrote:

ah right I should have checked. but the same ABI compatibility concern
still exists for version 1.0 of the extension.

Yes, we'd better make sure that the past code is able to run, at
least. Now I am not really convinced that we have the need to enforce
an error with the new code even if 1.0 is still installed,

So keep this "deprecated" C function working, as it would only be a few lines
of code?

so as it is
possible to remove all the traces of the code that triggers errors if
an end LSN is higher than the current insert LSN for primaries or
replayed LSN for standbys.

+1 for that

#13Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Julien Rouhaud (#12)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Tue, Mar 7, 2023 at 11:17 AM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Tue, Mar 07, 2023 at 01:56:24PM +0900, Michael Paquier wrote:

On Tue, Mar 07, 2023 at 12:42:20PM +0800, Julien Rouhaud wrote:

ah right I should have checked. but the same ABI compatibility concern
still exists for version 1.0 of the extension.

Yes, we'd better make sure that the past code is able to run, at
least. Now I am not really convinced that we have the need to enforce
an error with the new code even if 1.0 is still installed,

So keep this "deprecated" C function working, as it would only be a few lines
of code?

so as it is
possible to remove all the traces of the code that triggers errors if
an end LSN is higher than the current insert LSN for primaries or
replayed LSN for standbys.

+1 for that

I understand that we want to keep till_end_of_wal functions defined
around in .c file so that if someone does CREATE EXTENSION
pg_walinspect WITH VERSION '1.0'; on the latest extension shared
library (with 1.1 version), the till_end_of_wal functions should work
for them.

Also, I noticed that there's some improvement needed for the input
validations, especially for the end_lsn.

Here I'm with the v3 patch addressing the above comments. Please
review it further.

1. When start_lsn is NULL or invalid ('0/0'), emit an error. There was
a comment on the functions automatically determining start_lsn to be
the oldest WAL LSN. I'm not implementing this change now, as it
requires extra work to traverse the pg_wal directory. I'm planning to
do it in the next set of improvements where I'm planning to make the
functions timeline-aware, introduce functions for inspecting
wal_buffers and so on.
2. When end_lsn is NULL or invalid ('0/0') IOW end_lsn is not
specified, deduce end_lsn to be the current flush LSN when not in
recovery, current replayed LSN when in recovery. This is the main
change that avoids till_end_of_wal functions in version 1.1.
3. When end_lsn is specified but greater than or equal to the
start_lsn, return NULL. Given the above review comments on more errors
being reported, I chose to return NULL for better usability.
4. When end_lsn is specified but less than the start_lsn, get
info/stats up until end_lsn.
5. Retained pg_get_wal_records_info_till_end_of_wal and
pg_get_wal_stats_till_end_of_wal for backward compatibility.
6. Piggybacked these functions and behaviour under the new HEAD-only
extension version 1.1 introduced recently, instead of bumping to 1.2.
When PG16 is out, users will have 1.1 with all of these new
functionality.
7. Added tests to verify the extension update path in
oldextversions.sql similar to other extensions'. (suggested by Michael
Paquier).
8. Added a note in the pg_walinspect documentation about removal of
pg_get_wal_records_info_till_end_of_wal and
pg_get_wal_stats_till_end_of_wal in version 1.1 and how the other
functions can be used to achieve the same functionality and how these
till_end_of_wal functions can work if extension is installed
explicitly with version 1.0.
9. Refactored the tests according to the new behaviours.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v3-0001-Rework-pg_walinspect-functions.patchapplication/x-patch; name=v3-0001-Rework-pg_walinspect-functions.patchDownload+442-154
#14Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#12)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Tue, Mar 07, 2023 at 01:47:01PM +0800, Julien Rouhaud wrote:

So keep this "deprecated" C function working, as it would only be a few lines
of code?

Yes, I guess that this would be the final picture, moving forward I'd
like to think that we should just remove the SQL declaration of the
till_end_of_wal() functions to keep a clean interface.
--
Michael

#15Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#13)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Wed, Mar 08, 2023 at 01:40:46PM +0530, Bharath Rupireddy wrote:

1. When start_lsn is NULL or invalid ('0/0'), emit an error. There was
a comment on the functions automatically determining start_lsn to be
the oldest WAL LSN. I'm not implementing this change now, as it
requires extra work to traverse the pg_wal directory. I'm planning to
do it in the next set of improvements where I'm planning to make the
functions timeline-aware, introduce functions for inspecting
wal_buffers and so on.

[.. long description ..]

9. Refactored the tests according to the new behaviours.

Hmm. I think this patch ought to have a result simpler than what's
proposed here.

First, do we really have to begin marking the functions as non-STRICT
to abide with the treatment of NULL as a special value? The part that
I've found personally the most annoying with these functions is that
an incorrect bound leads to a random failure, particularly when such
queries are used for monitoring. I would simplify the whole with two
simple rules, as of:
- Keeping all the functions strict.
- When end_lsn is a LSN in the future of the current LSN inserted or
replayed, adjust its value to be the exactly GetXLogReplayRecPtr() or
GetFlushRecPtr(). This way, monitoring tools can use a value ahead,
at will.
- Failing if start_lsn > end_lsn.
- Failing if start_lsn refers to a position older than what exists is
still fine by me.

I would also choose to remove
pg_get_wal_records_info_till_end_of_wal() from the SQL interface in
1.1 to limit the confusion arount it, but keep a few lines of code so
as we are still able to use it when pg_walinspect 1.0 is the version
enabled with CREATE EXTENSION.

In short, pg_get_wal_records_info_till_end_of_wal() should be able to
use exactly the same code as pg_get_wal_records_info(), still you need
to keep *two* functions for their prosrc with PG_FUNCTION_ARGS as
arguments so as 1.0 would work when dropped in place. The result, it
seems to me, mostly comes to simplify ValidateInputLSNs() and remove
its till_end_of_wal argument.

+-- Removed function
+SELECT pg_get_functiondef('pg_get_wal_records_info_till_end_of_wal'::regproc);
+ERROR:  function "pg_get_wal_records_info_till_end_of_wal" does not exist
+LINE 1: SELECT pg_get_functiondef('pg_get_wal_records_info_till_end_...

It seems to me that you should just replace all that and anything
depending on pg_get_functiondef() with a \dx+ pg_walinspect, that
would list all the objects part of the extension for the specific
version you want to test. Not sure that there is a need to list the
full function definitions, either. That just bloats the tests.

I think, however, that it is critical to test in oldextversions.out
the *executions* of the functions, so as we make sure that they don't
crash. The patch is missing that.

+-- Invalid input LSNs
+SELECT * FROM pg_get_wal_record_info('0/0'); -- ERROR
+ERROR:  invalid input LSN
--
Michael
#16Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#13)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Wed, Mar 8, 2023 at 1:40 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Tue, Mar 7, 2023 at 11:17 AM Julien Rouhaud <rjuju123@gmail.com> wrote:

Here I'm with the v3 patch addressing the above comments. Please
review it further.

Needed a rebase. v4 patch is attached. I'll address the latest review
comments in a bit.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v4-0001-Rework-pg_walinspect-functions.patchapplication/octet-stream; name=v4-0001-Rework-pg_walinspect-functions.patchDownload+428-187
#17Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#15)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Fri, Mar 10, 2023 at 12:24 PM Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Mar 08, 2023 at 01:40:46PM +0530, Bharath Rupireddy wrote:

1. When start_lsn is NULL or invalid ('0/0'), emit an error. There was
a comment on the functions automatically determining start_lsn to be
the oldest WAL LSN. I'm not implementing this change now, as it
requires extra work to traverse the pg_wal directory. I'm planning to
do it in the next set of improvements where I'm planning to make the
functions timeline-aware, introduce functions for inspecting
wal_buffers and so on.

[.. long description ..]

9. Refactored the tests according to the new behaviours.

Hmm. I think this patch ought to have a result simpler than what's
proposed here.

First, do we really have to begin marking the functions as non-STRICT
to abide with the treatment of NULL as a special value? The part that
I've found personally the most annoying with these functions is that
an incorrect bound leads to a random failure, particularly when such
queries are used for monitoring.

As long as we provide a sensible default value (so I guess '0/0' to
mean "no upper bound") and that we therefore don't have to manually
specify an upper bound if we don't want one I'm fine with keeping the
functions marked as STRICT.

I would simplify the whole with two
simple rules, as of:
- Keeping all the functions strict.
- When end_lsn is a LSN in the future of the current LSN inserted or
replayed, adjust its value to be the exactly GetXLogReplayRecPtr() or
GetFlushRecPtr(). This way, monitoring tools can use a value ahead,
at will.
- Failing if start_lsn > end_lsn.
- Failing if start_lsn refers to a position older than what exists is
still fine by me.

+1

I would also choose to remove
pg_get_wal_records_info_till_end_of_wal() from the SQL interface in
1.1 to limit the confusion arount it, but keep a few lines of code so
as we are still able to use it when pg_walinspect 1.0 is the version
enabled with CREATE EXTENSION.

Yeah the SQL function should be removed no matter what.

#18Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#17)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Fri, Mar 10, 2023 at 04:04:15PM +0800, Julien Rouhaud wrote:

As long as we provide a sensible default value (so I guess '0/0' to
mean "no upper bound") and that we therefore don't have to manually
specify an upper bound if we don't want one I'm fine with keeping the
functions marked as STRICT.

FWIW, using also InvalidXLogRecPtr as a shortcut to say "Don't fail,
just do the job" is fine by me. Something like a FFF/FFFFFFFF should
just mean the same on a fresh cluster, still it gets risky the longer
the WAL is generated.
--
Michael

#19Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#18)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Fri, 10 Mar 2023, 16:14 Michael Paquier, <michael@paquier.xyz> wrote:

On Fri, Mar 10, 2023 at 04:04:15PM +0800, Julien Rouhaud wrote:

As long as we provide a sensible default value (so I guess '0/0' to
mean "no upper bound") and that we therefore don't have to manually
specify an upper bound if we don't want one I'm fine with keeping the
functions marked as STRICT.

FWIW, using also InvalidXLogRecPtr as a shortcut to say "Don't fail,
just do the job" is fine by me.

isn't '0/0' the same as InvalidXLogRecPtr? but my point is that we
shouldn't require to spell it explicitly, just rely on the default value.

Something like a FFF/FFFFFFFF should

just mean the same on a fresh cluster, still it gets risky the longer
the WAL is generated.

yeah, it would be handy to accept 'infinity' in that context.

Show quoted text
#20Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#19)
Re: Combine pg_walinspect till_end_of_wal functions with others

On Fri, Mar 10, 2023 at 04:37:23PM +0800, Julien Rouhaud wrote:

isn't '0/0' the same as InvalidXLogRecPtr? but my point is that we
shouldn't require to spell it explicitly, just rely on the default value.

Perhaps. Still the addition of a DEFAULT to the function definitions
and its value looks like a second patch to me. The first should just
lift the bound restrictions currently in place while cleaning up the
till_* functions.
--
Michael

#21Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#15)
#22Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#21)
#23Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#22)
#24Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#23)
#25Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#24)
#26Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#24)
#27Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#26)
#28Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#25)
#29Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#27)
#30Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#29)
#31Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#30)
#32Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#31)
#33Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#32)
#34Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#33)
#35Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#34)