pg_walinspect - a new extension to get raw WAL data and WAL stats

Started by Bharath Rupireddyover 4 years ago90 messageshackers
Jump to latest
#1Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com

Hi,

While working on one of the internal features, we found that it is a
bit difficult to run pg_waldump for a normal user to know WAL info and
stats of a running postgres database instance in the cloud. Many a
times users or DBAs or developers would want to get and analyze
following:
1) raw WAL record associated with an LSN or raw WAL records between a
start LSN and end LSN for feeding to some other functionality
2) WAL statistics associated with an LSN or between start LSN and end
LSN for debugging or analytical purposes. The WAL stats are the number
of inserts, updates, deletes, index inserts, commits, checkpoints,
aborts, wal record sizes, FPI (Full Page Image) count etc. which are
basically everything that we get with pg_waldump --stats option plus
some other information as we may feel will be useful.

An available option is to use pg_waldump, a standalone program
emitting human readable WAL info into a standard output/file. This
works well when users have access to the system on which postgres is
running. But for a postgres database instance running in the cloud
environments, starting the pg_waldump, fetching and presenting its
output to the users in a structured way may be a bit hard to do.

How about we create a new extension, called pg_walinspect (synonymous
to pageinspect extension) with a bunch of SQL-callable functions that
get the raw WAL records or stats out of a running postgres database
instance in a more structured way that is easily consumable by all the
users or DBAs or developers? We can also provide these functionalities
into the core postgres (in xlogfuncs.c) instead of a new extension,
but we would like it to be pluggable so that the functions will be
used only if required.

[1]: a) bytea pg_get_wal_record(pg_lsn lsn); and bytea pg_get_wal_record(pg_lsn lsn, text wal_dir); - Returns a single row of raw WAL record of bytea type. WAL data is read from pg_wal or specified wal_dir directory.
extension can provide. These are not exhaustive; we can
add/remove/modify as we move further.

We would like to invite more thoughts from the hackers.

Credits: Thanks to Satya Narlapuram, Chen Liang (for some initial
work), Tianyu Zhang and Ashutosh Sharma (copied in cc) for internal
discussions.

[1]: a) bytea pg_get_wal_record(pg_lsn lsn); and bytea pg_get_wal_record(pg_lsn lsn, text wal_dir); - Returns a single row of raw WAL record of bytea type. WAL data is read from pg_wal or specified wal_dir directory.
a) bytea pg_get_wal_record(pg_lsn lsn); and bytea
pg_get_wal_record(pg_lsn lsn, text wal_dir); - Returns a single row of
raw WAL record of bytea type. WAL data is read from pg_wal or
specified wal_dir directory.

b) bytea[] pg_get_wal_record(pg_lsn start_lsn, pg_lsn end_lsn); and
bytea[] pg_get_wal_record(pg_lsn start_lsn, pg_lsn end_lsn, text
wal_dir); - Returns multiple rows of raw WAL records of bytea type,
one row per each WAL record. WAL data is read from pg_wal or specified
wal_dir directory.

CREATE TYPE walinspect_stats_type AS (stat1, stat2, stat3 …. statN);
c) walinspect_stats_type pg_get_wal_stats(pg_lsn lsn); and
walinspect_stats_type pg_get_wal_stats(pg_lsn lsn, text wal_dir); -
Returns a single row of WAL record’s stats of walinspect_stats_type
type. WAL data is read from pg_wal or specified wal_dir directory.

d) walinspect_stats_type[] pg_get_wal_stats(pg_lsn start_lsn, pg_lsn
end_lsn); and walinspect_stats_type[] pg_get_wal_stats(pg_lsn
start_lsn, pg_lsn end_lsn, text wal_dir); - Returns multiple rows of
WAL record stats of walinspect_stats_type type, one row per each WAL
record. WAL data is read from pg_wal or specified wal_dir directory.

e) walinspect_stats_type pg_get_wal_stats(bytea wal_record); -
Returns a single row of provided WAL record (wal_record) stats.

f) walinspect_stats_type pg_get_wal_stats_aggr(pg_lsn start_lsn,
pg_lsn end_lsn); and walinspect_stats_type
pg_get_wal_stats_aggr(pg_lsn start_lsn, pg_lsn end_lsn, text wal_dir);
- Returns a single row of aggregated stats of all the WAL records
between start_lsn and end_lsn. WAL data is read from pg_wal or
specified wal_dir directory.

CREATE TYPE walinspect_lsn_range_type AS (pg_lsn start_lsn, pg_lsn end_lsn);
g) walinspect_lsn_range_type walinspect_get_lsn_range(text
wal_dir); - Returns a single row of start LSN and end LSN of the WAL
records available under pg_wal or specified wal_dir directory.

Regards,
Bharath Rupireddy.

#2Nathan Bossart
nathandbossart@gmail.com
In reply to: Bharath Rupireddy (#1)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On 9/8/21, 6:49 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:

How about we create a new extension, called pg_walinspect (synonymous
to pageinspect extension) with a bunch of SQL-callable functions that
get the raw WAL records or stats out of a running postgres database
instance in a more structured way that is easily consumable by all the
users or DBAs or developers? We can also provide these functionalities
into the core postgres (in xlogfuncs.c) instead of a new extension,
but we would like it to be pluggable so that the functions will be
used only if required.

+1

Nathan

#3Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#2)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On Thu, Sep 09, 2021 at 10:49:46PM +0000, Bossart, Nathan wrote:

+1

A backend approach has the advantage that you can use the proper locks
to make sure that a segment is not recycled or removed by a concurrent
checkpoint, so that would be reliable.
--
Michael

#4Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Nathan Bossart (#2)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On 9/10/21 12:49 AM, Bossart, Nathan wrote:

On 9/8/21, 6:49 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:

How about we create a new extension, called pg_walinspect (synonymous
to pageinspect extension) with a bunch of SQL-callable functions that
get the raw WAL records or stats out of a running postgres database
instance in a more structured way that is easily consumable by all the
users or DBAs or developers? We can also provide these functionalities
into the core postgres (in xlogfuncs.c) instead of a new extension,
but we would like it to be pluggable so that the functions will be
used only if required.

+1

Nathan

+1

Bertrand

#5Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#3)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On Fri, Sep 10, 2021 at 7:21 AM Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Sep 09, 2021 at 10:49:46PM +0000, Bossart, Nathan wrote:

+1

A backend approach has the advantage that you can use the proper locks
to make sure that a segment is not recycled or removed by a concurrent
checkpoint, so that would be reliable.

Thanks for sharing your thoughts. IMO, using locks for showing WAL
stats isn't a good way, because these new functions may block the
checkpointer from removing/recycling the WAL files. We don't want to
do that. If at all, user has asked stats of an LSN/range of LSNs if
it is/they are available in the pg_wal directory, we provide the info
otherwise we can throw warnings/errors. This behaviour is pretty much
in sycn with what pg_waldump does right now.

And, some users may not need these new functions at all, so in such
cases going with an extension way makes it more usable.

Regards,
Bharath Rupireddy.

#6Bruce Momjian
bruce@momjian.us
In reply to: Bharath Rupireddy (#1)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On Wed, Sep 8, 2021 at 07:18:08PM +0530, Bharath Rupireddy wrote:

Hi,

While working on one of the internal features, we found that it is a
bit difficult to run pg_waldump for a normal user to know WAL info and
stats of a running postgres database instance in the cloud. Many a
times users or DBAs or developers would want to get and analyze
following:

Uh, are we going to implement everything that is only available at the
command line as an extension just for people who are using managed cloud
services where the command line is not available and the cloud provider
has not made that information accessible? Seems like this might lead to
a lot of duplicated effort.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

#7Chapman Flack
chap@anastigmatix.net
In reply to: Bruce Momjian (#6)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On 10/05/21 18:07, Bruce Momjian wrote:

Uh, are we going to implement everything that is only available at the
command line as an extension just for people who are using managed cloud
services

One extension that runs a curated menu of command-line tools for you
and returns their stdout?

Regards,
-Chap

#8Jeremy Schneider
schneider@ardentperf.com
In reply to: Bruce Momjian (#6)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On 10/5/21 15:07, Bruce Momjian wrote:

On Wed, Sep 8, 2021 at 07:18:08PM +0530, Bharath Rupireddy wrote:

While working on one of the internal features, we found that it is a
bit difficult to run pg_waldump for a normal user to know WAL info and
stats of a running postgres database instance in the cloud. Many a
times users or DBAs or developers would want to get and analyze
following:

Uh, are we going to implement everything that is only available at the
command line as an extension just for people who are using managed cloud
services where the command line is not available and the cloud provider
has not made that information accessible? Seems like this might lead to
a lot of duplicated effort.

No. For most command line utilities, there's no reason to expose them in
SQL or they already are exposed in SQL. For example, everything in
pg_controldata is already available via SQL functions.

Specifically exposing pg_waldump functionality in SQL could speed up
finding bugs in the PG logical replication code. We found and fixed a
few over this past year, but there are more lurking out there.

Having the extension in core is actually the only way to avoid
duplicated effort, by having shared code which the pg_dump binary and
the extension both wrap or call.

-Jeremy

--
http://about.me/jeremy_schneider

#9Bruce Momjian
bruce@momjian.us
In reply to: Chapman Flack (#7)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On Tue, Oct 5, 2021 at 06:22:19PM -0400, Chapman Flack wrote:

On 10/05/21 18:07, Bruce Momjian wrote:

Uh, are we going to implement everything that is only available at the
command line as an extension just for people who are using managed cloud
services

One extension that runs a curated menu of command-line tools for you
and returns their stdout?

Yes, that would make sense, and something the cloud service providers
would write.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

#10Bruce Momjian
bruce@momjian.us
In reply to: Jeremy Schneider (#8)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On Tue, Oct 5, 2021 at 03:30:07PM -0700, Jeremy Schneider wrote:

On 10/5/21 15:07, Bruce Momjian wrote:

On Wed, Sep 8, 2021 at 07:18:08PM +0530, Bharath Rupireddy wrote:

While working on one of the internal features, we found that it is a
bit difficult to run pg_waldump for a normal user to know WAL info and
stats of a running postgres database instance in the cloud. Many a
times users or DBAs or developers would want to get and analyze
following:

Uh, are we going to implement everything that is only available at the
command line as an extension just for people who are using managed cloud
services where the command line is not available and the cloud provider
has not made that information accessible? Seems like this might lead to
a lot of duplicated effort.

No. For most command line utilities, there's no reason to expose them in
SQL or they already are exposed in SQL. For example, everything in
pg_controldata is already available via SQL functions.

That's a good example.

Specifically exposing pg_waldump functionality in SQL could speed up
finding bugs in the PG logical replication code. We found and fixed a
few over this past year, but there are more lurking out there.

Uh, why is pg_waldump more important than other command line tool
information?

Having the extension in core is actually the only way to avoid
duplicated effort, by having shared code which the pg_dump binary and
the extension both wrap or call.

Uh, aren't you duplicating code by having pg_waldump as a command-line
tool and an extension?

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

#11Jeremy Schneider
schneider@ardentperf.com
In reply to: Bruce Momjian (#10)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On 10/5/21 17:43, Bruce Momjian wrote:

On Tue, Oct 5, 2021 at 03:30:07PM -0700, Jeremy Schneider wrote:

Specifically exposing pg_waldump functionality in SQL could speed up
finding bugs in the PG logical replication code. We found and fixed a
few over this past year, but there are more lurking out there.

Uh, why is pg_waldump more important than other command line tool
information?

Going down the list of all other utilities in src/bin:

* pg_amcheck, pg_config, pg_controldata: already available in SQL
* psql, pgbench, pg_dump: already available as client-side apps
* initdb, pg_archivecleanup, pg_basebackup, pg_checksums, pg_ctl,
pg_resetwal, pg_rewind, pg_upgrade, pg_verifybackup: can't think of any
possible use case outside server OS access, most of these are too low
level and don't even make sense in SQL
* pg_test_fsync, pg_test_timing: marginally interesting ideas in SQL,
don't feel any deep interest myself

Speaking selfishly, there are a few reasons I would be specifically
interested in pg_waldump (the only remaining one on the list).

.

First, to better support existing features around logical replication
and decoding.

In particular, it seems inconsistent to me that all the replication
management SQL functions take LSNs as arguments - and yet there's no
SQL-based way to find the LSNs that you are supposed to pass into these
functions.

https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-REPLICATION

Over the past few years I've been pulled in to help several large PG
users who ran into these bugs, and it's very painful - because the only
real remediation is to drop and recreate the replication slot, which
means either re-copying all the data to the downstream system or
figuring out a way to resync it. Some PG users have 3rd party tools like
HVR which can do row-by-row resync (IIUC), but no matter how you slice
it, we're talking about a lot of pain for people replicating large data
sets between multiple systems. In most cases, the only/best option even
with very large tables is to just make a fresh copy of the data - which
can translate to a business outage of hours or even days.

My favorite example is the SQL function pg_replication_slot_advance() -
this could really help PG users find less painful solutions to broken
decoding, however it's not really possible to /know/ an LSN to advance
to without inspecting WAL. ISTM there's a strong use case here for a SQL
interface on WAL inspection.

.

Second: debugging and troubleshooting logical replication and decoding bugs.

I helped track down a few logical replication bugs and get fixed into
code at postgresql.org this past year. (But I give credit to others who
are much better at C than I am, and who did a lot more work than I did
on these bugs!)

Logical decoding bugs are some of the hardest to fix - because all you
have is a WAL stream, but you don't know the SQL or workload patterns or
PG code paths which created that WAL stream.

Dumping the WAL - knowing which objects and which types of operations
are involved and stats like number of changes or number of
subtransactions - this helps identify which transaction and what SQL in
the application triggered the failure, which can help find short-term
workarounds. Businesses need those short-term workarounds so they can
keep going while we work on finding and fixing bugs, which can take some
time. This is another place where I think a SQL interface to WAL would
be helpful to PG users. Especially the ability to filter and trace a
single transaction through a large number of WAL files in the directory.

.

Third: statistics on WAL - especially full page writes. Giving users the
full power of SQL allows much more sophisticated analysis of the WAL
records. Personally, I'd probably find myself importing all the WAL
stats into the DB anyway because SQL is my preferred data manipulation
language.

Having the extension in core is actually the only way to avoid
duplicated effort, by having shared code which the pg_dump binary and
the extension both wrap or call.

Uh, aren't you duplicating code by having pg_waldump as a command-line
tool and an extension?

Well this whole conversation is just theoretical anyway until the code
is shared. :) But if Bharath is writing functions to decode WAL, then
wouldn't we just have pg_waldump use these same functions in order to
avoid duplicating code?

Bharath - was some code already posted and I just missed it?

Looking at the proposed API from the initial email, I like that there's
both stats functionality and WAL record inspection functionality
(similar to pg_waldump). I like that there's the ability to pull the
records as raw bytea data, however I think we're also going to want an
ability in v1 of the patch to decode it (similar to pageinspect
heap_page_item_attrs, etc).

Another feature that might be interesting down the road would be the
ability to provide filtering of WAL records for security purposes. For
example, allowing a user to only dump raw WAL records for one particular
database, or maybe excluding WAL records that change system catalogs or
the like. But I probably wouldn't start here, personally.

Now then.... as Blaise Pascal said in 1657 (and as was also said by
Winston Churchill, Mark Twain, etc).... "I'm sorry I wrote you such a
long letter; I didn't have time to write a short one."

-Jeremy

--
http://about.me/jeremy_schneider

#12Bruce Momjian
bruce@momjian.us
In reply to: Jeremy Schneider (#11)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On Wed, Oct 6, 2021 at 09:56:34AM -0700, Jeremy Schneider wrote:

On 10/5/21 17:43, Bruce Momjian wrote:

On Tue, Oct 5, 2021 at 03:30:07PM -0700, Jeremy Schneider wrote:

Specifically exposing pg_waldump functionality in SQL could speed up
finding bugs in the PG logical replication code. We found and fixed a
few over this past year, but there are more lurking out there.

Uh, why is pg_waldump more important than other command line tool
information?

Going down the list of all other utilities in src/bin:

* pg_amcheck, pg_config, pg_controldata: already available in SQL
* psql, pgbench, pg_dump: already available as client-side apps
* initdb, pg_archivecleanup, pg_basebackup, pg_checksums, pg_ctl,
pg_resetwal, pg_rewind, pg_upgrade, pg_verifybackup: can't think of any
possible use case outside server OS access, most of these are too low
level and don't even make sense in SQL
* pg_test_fsync, pg_test_timing: marginally interesting ideas in SQL,
don't feel any deep interest myself

Speaking selfishly, there are a few reasons I would be specifically
interested in pg_waldump (the only remaining one on the list).

This is the analysis I was looking for to understand if copying the
features of command-line tools in extensions was a wise direction.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

#13Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Jeremy Schneider (#11)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On 2021-Oct-06, Jeremy Schneider wrote:

Well this whole conversation is just theoretical anyway until the code
is shared. :) But if Bharath is writing functions to decode WAL, then
wouldn't we just have pg_waldump use these same functions in order to
avoid duplicating code?

Actually, a lot of the code is already shared, since the rmgrdesc
routines are in src/backend. Keep in mind that it was there before
pg_xlogdump existed, to support WAL_DEBUG. When pg_xlogdump was added,
what we did was allow that backend-only code be compilable in a frontend
environment. Also, we already have xlogreader.

So pg_waldump itself is mostly scaffolding to let the frontend
environment get argument values to pass to backend-enabled code. The
only really interesting, novel thing is the --stats mode ... and I bet
you can write that with some SQL-level aggregation of the raw data, no
need for any C code.

--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/

#14Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Jeremy Schneider (#11)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On Wed, Oct 6, 2021 at 10:26 PM Jeremy Schneider
<schneider@ardentperf.com> wrote:

On 10/5/21 17:43, Bruce Momjian wrote:

On Tue, Oct 5, 2021 at 03:30:07PM -0700, Jeremy Schneider wrote:

Specifically exposing pg_waldump functionality in SQL could speed up
finding bugs in the PG logical replication code. We found and fixed a
few over this past year, but there are more lurking out there.

Uh, why is pg_waldump more important than other command line tool
information?

Going down the list of all other utilities in src/bin:

* pg_amcheck, pg_config, pg_controldata: already available in SQL
* psql, pgbench, pg_dump: already available as client-side apps
* initdb, pg_archivecleanup, pg_basebackup, pg_checksums, pg_ctl,
pg_resetwal, pg_rewind, pg_upgrade, pg_verifybackup: can't think of any
possible use case outside server OS access, most of these are too low
level and don't even make sense in SQL
* pg_test_fsync, pg_test_timing: marginally interesting ideas in SQL,
don't feel any deep interest myself

Speaking selfishly, there are a few reasons I would be specifically
interested in pg_waldump (the only remaining one on the list).

Thanks Jeremy for the analysis.

First, to better support existing features around logical replication
and decoding.

In particular, it seems inconsistent to me that all the replication
management SQL functions take LSNs as arguments - and yet there's no
SQL-based way to find the LSNs that you are supposed to pass into these
functions.

https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-REPLICATION

Over the past few years I've been pulled in to help several large PG
users who ran into these bugs, and it's very painful - because the only
real remediation is to drop and recreate the replication slot, which
means either re-copying all the data to the downstream system or
figuring out a way to resync it. Some PG users have 3rd party tools like
HVR which can do row-by-row resync (IIUC), but no matter how you slice
it, we're talking about a lot of pain for people replicating large data
sets between multiple systems. In most cases, the only/best option even
with very large tables is to just make a fresh copy of the data - which
can translate to a business outage of hours or even days.

My favorite example is the SQL function pg_replication_slot_advance() -
this could really help PG users find less painful solutions to broken
decoding, however it's not really possible to /know/ an LSN to advance
to without inspecting WAL. ISTM there's a strong use case here for a SQL
interface on WAL inspection.

Second: debugging and troubleshooting logical replication and decoding bugs.

I helped track down a few logical replication bugs and get fixed into
code at postgresql.org this past year. (But I give credit to others who
are much better at C than I am, and who did a lot more work than I did
on these bugs!)

Logical decoding bugs are some of the hardest to fix - because all you
have is a WAL stream, but you don't know the SQL or workload patterns or
PG code paths which created that WAL stream.

Dumping the WAL - knowing which objects and which types of operations
are involved and stats like number of changes or number of
subtransactions - this helps identify which transaction and what SQL in
the application triggered the failure, which can help find short-term
workarounds. Businesses need those short-term workarounds so they can
keep going while we work on finding and fixing bugs, which can take some
time. This is another place where I think a SQL interface to WAL would
be helpful to PG users. Especially the ability to filter and trace a
single transaction through a large number of WAL files in the directory.

Third: statistics on WAL - especially full page writes. Giving users the
full power of SQL allows much more sophisticated analysis of the WAL
records. Personally, I'd probably find myself importing all the WAL
stats into the DB anyway because SQL is my preferred data manipulation
language.

Just to add to the above points, with the new extension pg_walinspect
we will have following advantages:
1) Usability - SQL callable functions will be easier to use for the
users/admins/developers.
2) Access Control - we can provide better access control for the WAL data/stats.
3) Emitting the actual WAL data(as bytea structure) and stats via SQL
callable functions will help to analyze and answer questions like how
much WAL data is being generated in the system, what kind of WAL data
it is, how many FPWs are happening and so on. Jermey has already given
more realistic use cases.
4) I came across this - there's a similar capability in SQL server -
https://www.mssqltips.com/sqlservertip/3076/how-to-read-the-sql-server-database-transaction-log/

Having the extension in core is actually the only way to avoid
duplicated effort, by having shared code which the pg_dump binary and
the extension both wrap or call.

Uh, aren't you duplicating code by having pg_waldump as a command-line
tool and an extension?

Well this whole conversation is just theoretical anyway until the code
is shared. :) But if Bharath is writing functions to decode WAL, then
wouldn't we just have pg_waldump use these same functions in order to
avoid duplicating code?

Bharath - was some code already posted and I just missed it?

Looking at the proposed API from the initial email, I like that there's
both stats functionality and WAL record inspection functionality
(similar to pg_waldump). I like that there's the ability to pull the
records as raw bytea data, however I think we're also going to want an
ability in v1 of the patch to decode it (similar to pageinspect
heap_page_item_attrs, etc).

I'm yet to start working on the patch. I will be doing it soon.

Another feature that might be interesting down the road would be the
ability to provide filtering of WAL records for security purposes. For
example, allowing a user to only dump raw WAL records for one particular
database, or maybe excluding WAL records that change system catalogs or
the like. But I probably wouldn't start here, personally.

+1.

Regards,
Bharath Rupireddy.

#15Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#14)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On Thu, Oct 7, 2021 at 10:43 AM Bharath Rupireddy <
bharath.rupireddyforpostgres@gmail.com> wrote:

Looking at the proposed API from the initial email, I like that there's
both stats functionality and WAL record inspection functionality
(similar to pg_waldump). I like that there's the ability to pull the
records as raw bytea data, however I think we're also going to want an
ability in v1 of the patch to decode it (similar to pageinspect
heap_page_item_attrs, etc).

I'm yet to start working on the patch. I will be doing it soon.

Thanks all. Here's the v1 patch set for the new extension pg_walinspect.
Note that I didn't include the documentation part now, I will be doing it a
bit later.

Please feel free to review and provide your thoughts.

Regards,
Bharath Rupireddy.

Attachments:

v1-0001-pg_walinspect.patchapplication/x-patch; name=v1-0001-pg_walinspect.patchDownload+950-10
v1-0002-pg_walinspect-tests.patchapplication/x-patch; name=v1-0002-pg_walinspect-tests.patchDownload+70-1
#16Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#15)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On Thu, Nov 18, 2021 at 6:43 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Thu, Oct 7, 2021 at 10:43 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:

Looking at the proposed API from the initial email, I like that there's
both stats functionality and WAL record inspection functionality
(similar to pg_waldump). I like that there's the ability to pull the
records as raw bytea data, however I think we're also going to want an
ability in v1 of the patch to decode it (similar to pageinspect
heap_page_item_attrs, etc).

I'm yet to start working on the patch. I will be doing it soon.

Thanks all. Here's the v1 patch set for the new extension pg_walinspect. Note that I didn't include the documentation part now, I will be doing it a bit later.

Please feel free to review and provide your thoughts.

The v1 patch set was failing to compile on Windows. Here's the v2
patch set fixing that.

Regards,
Bharath Rupireddy.

Attachments:

v2-0001-pg_walinspect.patchapplication/octet-stream; name=v2-0001-pg_walinspect.patchDownload+960-12
v2-0002-pg_walinspect-tests.patchapplication/octet-stream; name=v2-0002-pg_walinspect-tests.patchDownload+70-1
#17Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#16)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On Thu, Nov 25, 2021 at 3:49 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

Thanks all. Here's the v1 patch set for the new extension pg_walinspect. Note that I didn't include the documentation part now, I will be doing it a bit later.

Please feel free to review and provide your thoughts.

The v1 patch set was failing to compile on Windows. Here's the v2
patch set fixing that.

I forgot to specify this: the v1 patch set was failing to compile on
Windows with errors shown at [1](Link target) -> pg_walinspect.obj : error LNK2001: unresolved external symbol forkNames [C:\Users\bhara\postgres\pg_walinspect.vcxproj] pg_walinspect.obj : error LNK2001: unresolved external symbol pg_comp_crc32c [C:\Users\bhara\postgres\pg_walinspect.vcxproj] pg_walinspect.obj : error LNK2001: unresolved external symbol wal_segment_size [C:\Users\bhara\postgres\pg_walinspect.vcxproj] pg_walinspect.obj : error LNK2001: unresolved external symbol RmgrTable [C:\Users\bhara\postgres\pg_walinspect.vcxproj] .\Release\pg_walinspect\pg_walinspect.dll : fatal error LNK1120: 4 unresolved externals [C:\Users\bhara\postgres\pg_walinspect.vcxproj]. Thanks to Julien Rouhaud who
suggested to use PGDLLIMPORT in an off-list discussion.

[1]: (Link target) -> pg_walinspect.obj : error LNK2001: unresolved external symbol forkNames [C:\Users\bhara\postgres\pg_walinspect.vcxproj] pg_walinspect.obj : error LNK2001: unresolved external symbol pg_comp_crc32c [C:\Users\bhara\postgres\pg_walinspect.vcxproj] pg_walinspect.obj : error LNK2001: unresolved external symbol wal_segment_size [C:\Users\bhara\postgres\pg_walinspect.vcxproj] pg_walinspect.obj : error LNK2001: unresolved external symbol RmgrTable [C:\Users\bhara\postgres\pg_walinspect.vcxproj] .\Release\pg_walinspect\pg_walinspect.dll : fatal error LNK1120: 4 unresolved externals [C:\Users\bhara\postgres\pg_walinspect.vcxproj]
pg_walinspect.obj : error LNK2001: unresolved external symbol
forkNames [C:\Users\bhara\postgres\pg_walinspect.vcxproj]
pg_walinspect.obj : error LNK2001: unresolved external symbol
pg_comp_crc32c [C:\Users\bhara\postgres\pg_walinspect.vcxproj]
pg_walinspect.obj : error LNK2001: unresolved external symbol
wal_segment_size [C:\Users\bhara\postgres\pg_walinspect.vcxproj]
pg_walinspect.obj : error LNK2001: unresolved external symbol
RmgrTable [C:\Users\bhara\postgres\pg_walinspect.vcxproj]
.\Release\pg_walinspect\pg_walinspect.dll : fatal error LNK1120: 4
unresolved externals [C:\Users\bhara\postgres\pg_walinspect.vcxproj]

5 Error(s)

Regards,
Bharath Rupireddy.

#18Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#17)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On Thu, Nov 25, 2021 at 5:54 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Thu, Nov 25, 2021 at 3:49 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

Thanks all. Here's the v1 patch set for the new extension pg_walinspect. Note that I didn't include the documentation part now, I will be doing it a bit later.

Please feel free to review and provide your thoughts.

The v1 patch set was failing to compile on Windows. Here's the v2
patch set fixing that.

I forgot to specify this: the v1 patch set was failing to compile on
Windows with errors shown at [1]. Thanks to Julien Rouhaud who
suggested to use PGDLLIMPORT in an off-list discussion.

[1] (Link target) ->
pg_walinspect.obj : error LNK2001: unresolved external symbol
forkNames [C:\Users\bhara\postgres\pg_walinspect.vcxproj]
pg_walinspect.obj : error LNK2001: unresolved external symbol
pg_comp_crc32c [C:\Users\bhara\postgres\pg_walinspect.vcxproj]
pg_walinspect.obj : error LNK2001: unresolved external symbol
wal_segment_size [C:\Users\bhara\postgres\pg_walinspect.vcxproj]
pg_walinspect.obj : error LNK2001: unresolved external symbol
RmgrTable [C:\Users\bhara\postgres\pg_walinspect.vcxproj]
.\Release\pg_walinspect\pg_walinspect.dll : fatal error LNK1120: 4
unresolved externals [C:\Users\bhara\postgres\pg_walinspect.vcxproj]

5 Error(s)

Here's the v3 patch-set with fixes for the compiler warnings reported
in the cf bot at
https://cirrus-ci.com/task/4979131497578496?logs=gcc_warning#L506.

Please review.

Regards,
Bharath Rupireddy.

Attachments:

v3-0001-pg_walinspect.patchapplication/octet-stream; name=v3-0001-pg_walinspect.patchDownload+951-12
v3-0001-pg_walinspect-tests.patchapplication/octet-stream; name=v3-0001-pg_walinspect-tests.patchDownload+70-1
#19Bruce Momjian
bruce@momjian.us
In reply to: Bharath Rupireddy (#18)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

So I looked at this patch and I have the same basic question as Bruce.
Do we really want to expose every binary tool associated with Postgres
as an extension? Obviously this is tempting for cloud provider users
which is not an unreasonable argument. But it does have consequences.

1) Some things like pg_waldump are running code that is not normally
under user control. This could have security issues or reliability
issues.

On that front I'm especially concerned that pg_verify_raw_wal_record()
for example would let an attacker feed arbitrary hand crafted xlog
records into the parser which is not normally something a user can do.
If they feed it something it's not expecting it might be easy to cause
a crash and server restart.

There's also a bit of concern about data retention. Generally in
Postgres when rows are deleted there's very weak guarantees about the
data really being wiped. We definitely don't wipe it from disk of
course. And things like pageinspect could expose it long after it's
been deleted. But one might imagine after pageinspect shows it's gone
and/or after a vacuum full the data is actually purged. But then
something like pg_walinspect would make even that insufficient.

2) There's no documentation. I'm guessing you hesitated to write
documentation until the interface is settled but actually sometimes
writing documentation helps expose things in the interface that look
strange when you try to explain them.

3) And the interface does look a bit strange. Like what's the deal
with pg_get_wal_record_info_2() ? I gather it's just a SRF version of
pg_get_wal_record_info() but that's a strange name. And then what's
the point of pg_get_wal_record_info() at all? Why wouldn't the SRF be
sufficient even for the specific case of a single record?

And the stats functions seem a bit out of place to me. If the SRF
returned the data in the right format the user should be able to do
aggregate queries to generate these stats easily enough. If anything a
simple SQL function to do the aggregations could be provided.

Now this is starting to get into the realm of bikeshedding but... Some
of the code is taken straight from pg_waldump and does things like:

+ appendStringInfo(&rec_blk_ref, "blkref #%u: rel %u/%u/%u fork %s blk %u",

But that's kind of out of place for an SQL interface. It makes it hard
to write queries since things like the relid, block number etc are in
the string. If I wanted to use these functions I would expect to be
doing something like "select all the decoded records pertaining to
block n".

All that said, I don't want to gatekeep based on this kind of
criticism. The existing code is based on pg_waldump and if we want an
extension to expose that then that's a reasonable place to start. We
can work on a better format for the data later it doesn't mean we
shouldn't start with something we have today.

4) This isn't really an issue with your patch at all but why on earth
do we have a bitvector for WAL compression methods?! Like, what does
it mean to have multiple compression methods set? That should just be
a separate field with values for each type of compression surely?

I suppose this raises the issue of what happens if someone fixes that.
They'll now have to update pg_waldump *and* pg_walinspect. I don't
think that would actually be a lot of work but it's definitely more
than just one. Also, perhaps they should be in the same contrib
directory so at least people won't forget there are two.

#20Justin Pryzby
pryzby@telsasoft.com
In reply to: Bruce Momjian (#19)
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

On Mon, Jan 31, 2022 at 04:40:09PM -0500, Greg Stark wrote:

4) This isn't really an issue with your patch at all but why on earth
do we have a bitvector for WAL compression methods?! Like, what does
it mean to have multiple compression methods set? That should just be
a separate field with values for each type of compression surely?

I don't have an answer to your question, but the discussion was here.

In the versions of the patches I sent on Mar 15, Mar 21, May 18, May 24, Jun
13, I avoided "one bit per compression method", but Michael thought this was
simpler.

/messages/by-id/20210622031358.GF29179@telsasoft.com
On Mon, Jun 21, 2021 at 10:13:58PM -0500, Justin Pryzby wrote:

+/* compression methods supported */
+#define BKPIMAGE_COMPRESS_PGLZ 0x04
+#define BKPIMAGE_COMPRESS_ZLIB 0x08
+#define BKPIMAGE_COMPRESS_LZ4  0x10
+#define BKPIMAGE_COMPRESS_ZSTD 0x20
+#define        BKPIMAGE_IS_COMPRESSED(info) \
+       ((info & (BKPIMAGE_COMPRESS_PGLZ | BKPIMAGE_COMPRESS_ZLIB | \
+                         BKPIMAGE_COMPRESS_LZ4 | BKPIMAGE_COMPRESS_ZSTD)) != 0)

You encouraged saving bits here, so I'm surprised to see that your patches
use one bit per compression method: 2 bits to support no/pglz/lz4, 3 to add
zstd, and the previous patch used 4 bits to also support zlib.

There are spare bits available for that, but now there can be an inconsistency
if two bits are set. Also, 2 bits could support 4 methods (including "no").

On Tue, Jun 22, 2021 at 12:53:46PM +0900, Michael Paquier wrote:

Yeah, I know. I have just finished with that to get something
readable for the sake of the tests. As you say, the point is moot
just we add one new method, anyway, as we need just one new bit.
And that's what I would like to do for v15 with LZ4 as the resulting
patch is simple. It would be an idea to discuss more compression
methods here once we hear more from users when this is released in the
field, re-considering at this point if more is necessary or not.

#21Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#19)
#22Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bruce Momjian (#19)
#23Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#19)
#24Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Robert Haas (#23)
In reply to: Robert Haas (#23)
#26Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#23)
#27Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Andrew Dunstan (#26)
#28Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: Bharath Rupireddy (#27)
#29Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Ashutosh Sharma (#28)
#30Robert Haas
robertmhaas@gmail.com
In reply to: Bharath Rupireddy (#29)
#31Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Robert Haas (#30)
#32Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: Bharath Rupireddy (#29)
#33Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Ashutosh Sharma (#32)
#34Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: Bharath Rupireddy (#33)
#35Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Ashutosh Sharma (#34)
#36Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: Bharath Rupireddy (#35)
#37Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Bharath Rupireddy (#35)
#38Nitin Jadhav
nitinjadhavpostgres@gmail.com
In reply to: Kyotaro Horiguchi (#37)
#39Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: Kyotaro Horiguchi (#37)
#40Robert Haas
robertmhaas@gmail.com
In reply to: Bharath Rupireddy (#33)
#41Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Robert Haas (#40)
#42Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: Bharath Rupireddy (#41)
#43Jeff Davis
pgsql@j-davis.com
In reply to: Bharath Rupireddy (#35)
#44Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Jeff Davis (#43)
#45Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Davis (#43)
#46Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#45)
#47Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Bharath Rupireddy (#44)
#48Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#47)
#49Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: Kyotaro Horiguchi (#48)
#50Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: Kyotaro Horiguchi (#48)
#51Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Kyotaro Horiguchi (#47)
#52Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: Bharath Rupireddy (#44)
#53Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Kyotaro Horiguchi (#48)
#54Robert Haas
robertmhaas@gmail.com
In reply to: Kyotaro Horiguchi (#47)
#55Jeff Davis
pgsql@j-davis.com
In reply to: Stephen Frost (#46)
#56Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Ashutosh Sharma (#52)
#57Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Robert Haas (#54)
#58Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Robert Haas (#54)
#59Stephen Frost
sfrost@snowman.net
In reply to: Jeff Davis (#55)
#60Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Stephen Frost (#59)
#61Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#60)
#62Stephen Frost
sfrost@snowman.net
In reply to: Bharath Rupireddy (#61)
#63Ashutosh Sharma
ashu.coek88@gmail.com
In reply to: Stephen Frost (#62)
#64Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Ashutosh Sharma (#63)
#65Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Ashutosh Sharma (#63)
#66Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Kyotaro Horiguchi (#64)
#67Nitin Jadhav
nitinjadhavpostgres@gmail.com
In reply to: Bharath Rupireddy (#66)
#68Andres Freund
andres@anarazel.de
In reply to: Bharath Rupireddy (#65)
#69Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Nitin Jadhav (#67)
#70Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Andres Freund (#68)
#71Andres Freund
andres@anarazel.de
In reply to: Bharath Rupireddy (#70)
#72Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Andres Freund (#71)
#73Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#72)
#74Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Andres Freund (#71)
#75Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Andres Freund (#71)
#76Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Bharath Rupireddy (#75)
#77Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Kyotaro Horiguchi (#76)
#78Andres Freund
andres@anarazel.de
In reply to: Bharath Rupireddy (#77)
#79Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Andres Freund (#78)
#80RKN Sai Krishna
rknsaiforpostgres@gmail.com
In reply to: Bharath Rupireddy (#79)
#81Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: RKN Sai Krishna (#80)
#82Jeff Davis
pgsql@j-davis.com
In reply to: Bharath Rupireddy (#81)
#83Andres Freund
andres@anarazel.de
In reply to: Jeff Davis (#82)
#84Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Jeff Davis (#82)
#85Jeff Davis
pgsql@j-davis.com
In reply to: Bharath Rupireddy (#84)
#86Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Jeff Davis (#85)
#87Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#86)
#88Amit Kapila
amit.kapila16@gmail.com
In reply to: Bharath Rupireddy (#87)
#89Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Amit Kapila (#88)
#90Amit Kapila
amit.kapila16@gmail.com
In reply to: Bharath Rupireddy (#89)