Switch PgStat_HashKey.objoid from Oid to uint64
Hi all,
While working more on the cumulative pgstats and its interactions with
pg_stat_statements, one thing that I have been annoyed with is that
the dshash key for variable-numbered stats uses a pair of (Oid dboid,
Oid objoid), mostly to stick with the fact that most of the stats are
dealing with system objects.
That's not completely true, though, as statistics can also implement
their own index numbering without storing these numbers to disk, by
defining {from,to}_serialized_name. Replication slots do that, so we
are already considering as OIDs numbers that are not that.
For pg_stat_statements, one issue with the current pgstats is that we
want to use the query ID as hash key, which is 8 bytes, while also
having some knowledge of the database OID because we want to be able
to clean up stats entries about specific databases.
Please find attached a patch switching PgStat_HashKey.objoid from an
Oid to uint64 to be able to handle cases of stats that want more
space. The size of PgStat_HashKey is increased from 12 to 16 bytes,
but with alignment the size of PgStatShared_HashEntry (what's stored
in the dshash) is unchanged at 32 bytes.
Perhaps what's proposed here is a bad idea for a good reason, and we
could just leave with storing 4 bytes of the query ID in the dshash
instead of 8. Anyway, we make a lot of efforts to use 8 bytes to
reduce conflicts with different statements.
Another thing to note is the change for xl_xact_stats_item, requiring
a bump of XLOG_PAGE_MAGIC. A second thing is pg_stat_have_stats that
needs to use a different argument than an OID for the object,
requiring a catversion bump.
An interesting thing is that I have seen ubsan complain about this
patch, due to the way WAL records xl_xact_commit are built with
XACT_XINFO_HAS_DROPPED_STATS and parsed as xl_xact_stats_item requires
an 8-byte alignment now (see pg_waldump TAP reports when using the
attached), but we don't enforce anything as the data of such WAL
records is added with a simple XLogRegisterData(), like:
# xactdesc.c:91:28: runtime error: member access within misaligned
address 0x5651e996b86c for type 'struct xl_xact_stats_items', which
requires 8 byte alignment # 0x5651e996b86c: note: pointer points here
TBH, I've looked at that for quite a bit, thinking about the addition
of some "dummy" member to some of the parsed structures to force some
padding, or play with the alignment macros, or for some alignment when
inserting the record, or looked at pg_attribute_aligned().
First I'm surprised that it did not show up as an issue yet in this
area. Second, I could not get down to something "nice", but perhaps
there are preferred approaches when it comes to that and somebody has
a fancier idea? Or perhaps the problem is bigger than that due to
the way the record is designed and built? It also feels that I'm
missing something obvious, not sure what TBH. Still I'm OK to paint
some more MAXALIGN()s to make sure that all these deparsing pointers
have a correct alignment with some more TYPEALIGN()s or similar,
because this deparsing stuff is about that, but I'm also wondering if
there is an argument for forcing that for the record itself? I'll
think more about that next week or so.
Anyway, I'm attaching that to the next CF for discussion for now, as
there could be objections about this whole idea, as well.
Thoughts or comments?
--
Michael
Attachments:
0001-Bump-PgStat_HashKey.objoid-to-be-8-bytes.patchtext/x-diff; charset=us-asciiDownload+82-71
On 26/08/2024 03:58, Michael Paquier wrote:
An interesting thing is that I have seen ubsan complain about this
patch, due to the way WAL records xl_xact_commit are built with
XACT_XINFO_HAS_DROPPED_STATS and parsed as xl_xact_stats_item requires
an 8-byte alignment now (see pg_waldump TAP reports when using the
attached), but we don't enforce anything as the data of such WAL
records is added with a simple XLogRegisterData(), like:
# xactdesc.c:91:28: runtime error: member access within misaligned
address 0x5651e996b86c for type 'struct xl_xact_stats_items', which
requires 8 byte alignment # 0x5651e996b86c: note: pointer points hereTBH, I've looked at that for quite a bit, thinking about the addition
of some "dummy" member to some of the parsed structures to force some
padding, or play with the alignment macros, or for some alignment when
inserting the record, or looked at pg_attribute_aligned().First I'm surprised that it did not show up as an issue yet in this
area. Second, I could not get down to something "nice", but perhaps
there are preferred approaches when it comes to that and somebody has
a fancier idea? Or perhaps the problem is bigger than that due to
the way the record is designed and built? It also feels that I'm
missing something obvious, not sure what TBH. Still I'm OK to paint
some more MAXALIGN()s to make sure that all these deparsing pointers
have a correct alignment with some more TYPEALIGN()s or similar,
because this deparsing stuff is about that, but I'm also wondering if
there is an argument for forcing that for the record itself? I'll
think more about that next week or so.
Currently, we rely on the fact that all the xl_xact_* structs require
sizeof(int) alignment. See comment above struct xl_xact_xinfo.
One idea is to store the uint64 as two uint32's.
--
Heikki Linnakangas
Neon (https://neon.tech)
On Mon, Aug 26, 2024 at 09:32:54AM +0300, Heikki Linnakangas wrote:
Currently, we rely on the fact that all the xl_xact_* structs require
sizeof(int) alignment. See comment above struct xl_xact_xinfo.
Thanks, I have missed this part. So that explains the alignment I'd
better use in the record.
One idea is to store the uint64 as two uint32's.
Nice, we could just do that. This idea makes me feel much better than
sticking more aligment macros in the paths where the record is built.
Attached is an updated patch doing that. ubsan is silent with that.
--
Michael
Attachments:
v2-0001-Bump-PgStat_HashKey.objoid-to-be-8-bytes.patchtext/x-diff; charset=us-asciiDownload+95-71
Hi,
On Thu, Aug 29, 2024 at 08:56:59AM +0900, Michael Paquier wrote:
On Mon, Aug 26, 2024 at 09:32:54AM +0300, Heikki Linnakangas wrote:
Currently, we rely on the fact that all the xl_xact_* structs require
sizeof(int) alignment. See comment above struct xl_xact_xinfo.Thanks, I have missed this part. So that explains the alignment I'd
better use in the record.One idea is to store the uint64 as two uint32's.
Nice, we could just do that. This idea makes me feel much better than
sticking more aligment macros in the paths where the record is built.Attached is an updated patch doing that. ubsan is silent with that.
Thanks!
Yeah, indeed, with "COPT=-fsanitize=alignment -fno-sanitize-recover=all", then
"make -C src/test/modules/test_custom_rmgrs check":
- Is fine on master
- Fails with v1 applied due to things like:
xactdesc.c:91:28: runtime error: member access within misaligned address 0x5d29d22cc13c for type 'struct xl_xact_stats_items', which requires 8 byte alignment
0x5d29d22cc13c: note: pointer points here
7f 06 00 00 02 00 00 00 fe 7f 00 00 03 00 00 00 05 00 00 00 05 40 00 00 00 00 00 00 03 00 00 00
- Is fine with v2
So v2 does fix the alignment issue and I also think that's a nice way to fix it.
Lot of stuff that this patch does is mechanical changes:
- replace "objoid" by "objid" in *stats* files
- change the related type from Oid to uint64
- make use of hash_bytes_extended() instead of hash_bytes when needed
and I don't see any issues here.
There is also some manipulation around the 2 new uint32 fields (objid_hi and
objid_lo) in the xactdesc.c and pgstat_xact.c files that look good to me.
But now we end up having functions that accept Oid as parameters to call
functions that accept uint64 as parameter (for the exact same parameter), for
example:
"
void
pgstat_create_function(Oid proid)
{
pgstat_create_transactional(PGSTAT_KIND_FUNCTION,
MyDatabaseId,
proid);
}
"
as pgstat_create_transactional is now:
-pgstat_create_transactional(PgStat_Kind kind, Oid dboid, Oid objoid)
+pgstat_create_transactional(PgStat_Kind kind, Oid dboid, uint64 objid)
That's not an issue as both are unsigned and as we do those calls in that
order (Oid -> uint64).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Sep 12, 2024 at 01:37:52PM +0000, Bertrand Drouvot wrote:
There is also some manipulation around the 2 new uint32 fields (objid_hi and
objid_lo) in the xactdesc.c and pgstat_xact.c files that look good to me.
Thanks for the reviews. The high and low manipulations are still kind
of OK to me as a solution for the record constructions.
But now we end up having functions that accept Oid as parameters to call
functions that accept uint64 as parameter (for the exact same parameter), for
example:"
void
pgstat_create_function(Oid proid)
{
pgstat_create_transactional(PGSTAT_KIND_FUNCTION,
MyDatabaseId,
proid);
}
"as pgstat_create_transactional is now:
-pgstat_create_transactional(PgStat_Kind kind, Oid dboid, Oid objoid) +pgstat_create_transactional(PgStat_Kind kind, Oid dboid, uint64 objid)That's not an issue as both are unsigned and as we do those calls in that
order (Oid -> uint64).
Yes, that's intentional. All the pgstats routines associated to a
particular object that depends on an OID should still use an OID, and
anything that's generic enough to be used for all stats kinds had
better use a uint64. I was wondering if it would be better hiding
that behind a dedicated type, but decided to stick with uint64.
--
Michael
Hi,
On Fri, Sep 13, 2024 at 07:34:21AM +0900, Michael Paquier wrote:
On Thu, Sep 12, 2024 at 01:37:52PM +0000, Bertrand Drouvot wrote:
There is also some manipulation around the 2 new uint32 fields (objid_hi and
objid_lo) in the xactdesc.c and pgstat_xact.c files that look good to me.Thanks for the reviews. The high and low manipulations are still kind
of OK to me as a solution for the record constructions.
Agree.
But now we end up having functions that accept Oid as parameters to call
functions that accept uint64 as parameter (for the exact same parameter), for
example:"
void
pgstat_create_function(Oid proid)
{
pgstat_create_transactional(PGSTAT_KIND_FUNCTION,
MyDatabaseId,
proid);
}
"as pgstat_create_transactional is now:
-pgstat_create_transactional(PgStat_Kind kind, Oid dboid, Oid objoid) +pgstat_create_transactional(PgStat_Kind kind, Oid dboid, uint64 objid)That's not an issue as both are unsigned and as we do those calls in that
order (Oid -> uint64).Yes, that's intentional. All the pgstats routines associated to a
particular object that depends on an OID should still use an OID, and
anything that's generic enough to be used for all stats kinds had
better use a uint64.
Yeah, that sounds good to me.
I was wondering if it would be better hiding
that behind a dedicated type, but decided to stick with uint64.
That makes sense to me.
Overall, the patch LGTM.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com