int64 support in List API
I was working on an extension, pg_block_queries [1], that needed to manage a list of query IDs.
Query ID is internally of type uint64 (struct Query's member named queryId),
but it is exposed to the SQL layer as signed int64 (e.g.,
pg_stat_activity.query_id is of type bigint).
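To illustrate the signedness mismatch described above, here is a minimal standalone sketch. The helper name query_id_as_bigint is hypothetical and is not part of PostgreSQL. It only shows how the same 64 bits read as unsigned internally and as a signed bigint at the SQL level.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a PostgreSQL function): reinterpret the bits of an
 * unsigned 64-bit query ID as the signed 64-bit value that SQL shows as bigint.
 * memcpy() is used so the result does not depend on implementation-defined
 * integer conversion rules.
 */
static int64_t
query_id_as_bigint(uint64_t query_id)
{
    int64_t result;

    assert(sizeof(result) == sizeof(query_id));
    memcpy(&result, &query_id, sizeof(result));
    return result;
}
```

Any query ID with the top bit set shows up as a negative bigint, which is why a list that stores such IDs needs a full 64-bit cell rather than an int cell.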
I wanted to use the List API from pg_list.h. It has special implementations for
int, oid, pointer, and xid types, which help with lower code overhead (no need
to create structures whose sole member is of one of these types) and better
performance. So I was wondering if there's any interest in having a similar API
for int64 type, as well. I am not sure if there are any candidates in Postgres
core that'd benefit from this, but it sure would've helped when I was
developing the extension.
Please see attached a minimal patch that I wrote while developing my
extension. The patch is by no means complete, but if there's interest in a List
API for the int64 type, I can complete it and make the new API match the
current API for the int type.
[1]: pg_block_queries https://github.com/gurjeet/pg_block_queries/
Best regards,
Gurjeet
http://Gurje.et
Attachments:
int64_list_api.patch (application/octet-stream; +56 -5)
Gurjeet Singh <gurjeet@singh.im> writes:
> I wanted to use the list api from pg_list.h. It has special implementations for
> int, oid, pointer, and xid types, which help with lower code overhead (no need
> to create structures whose sole member is of one of these types) and better
> performance. So I was wondering if there's any interest in having a similar API
> for int64 type, as well.
This has been discussed before, and we've felt that it wasn't worth
the additional code duplication. I would not favor approaching this
with the mindset of lets-copy-and-paste-all-the-code.
However: it might be interesting to think about having just two
underlying implementations, one for 32-bit datums and one for 64-bits,
with the existing APIs becoming macros-with-casts wrappers around the
appropriate one of those. That line of attack might lead to
physically less code not more. The devil's in the details though.
regards, tom lane
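The "macros-with-casts" idea above might look roughly like the following standalone sketch. It is not taken from the patch or from pg_list.h: the List64 type and every function and macro name here are hypothetical, and malloc/realloc stand in for palloc. Only the 64-bit variant is shown; a 32-bit one would be analogous.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical list whose cells are always 64 bits wide. */
typedef struct List64
{
    int       length;
    int       capacity;
    uint64_t *cells;
} List64;

/* The single underlying implementation: append one 64-bit cell. */
static List64 *
list64_append(List64 *list, uint64_t value)
{
    if (list == NULL)
    {
        list = calloc(1, sizeof(List64));
        if (list == NULL)
            abort();
    }
    if (list->length == list->capacity)
    {
        int       new_capacity = list->capacity ? list->capacity * 2 : 4;
        uint64_t *cells = realloc(list->cells,
                                  (size_t) new_capacity * sizeof(uint64_t));

        if (cells == NULL)
            abort();
        list->cells = cells;
        list->capacity = new_capacity;
    }
    list->cells[list->length++] = value;
    return list;
}

/* The single underlying implementation: fetch the n'th cell. */
static uint64_t
list64_nth(const List64 *list, int n)
{
    assert(list != NULL && n >= 0 && n < list->length);
    return list->cells[n];
}

/*
 * Typed APIs become thin wrappers that only cast. The uint64 -> int64 cast
 * assumes the usual two's-complement conversion, and the pointer casts assume
 * that pointers fit in 64 bits.
 */
#define lappend_i64(list, v)    list64_append((list), (uint64_t) (int64_t) (v))
#define list_nth_i64(list, n)   ((int64_t) list64_nth((list), (n)))
#define lappend_ptr64(list, p)  list64_append((list), (uint64_t) (uintptr_t) (p))
#define list_nth_ptr64(list, n) ((void *) (uintptr_t) list64_nth((list), (n)))
```

With this shape, adding another 64-bit element type costs two macros rather than another copy of every list function, which is the sense in which it could lead to less code rather than more.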
On 20.01.2025 07:36, Tom Lane wrote:
> Gurjeet Singh <gurjeet@singh.im> writes:
>> I wanted to use the list api from pg_list.h. It has special implementations for
>> int, oid, pointer, and xid types, which help with lower code overhead (no need
>> to create structures whose sole member is of one of these types) and better
>> performance. So I was wondering if there's any interest in having a similar API
>> for int64 type, as well.
>
> This has been discussed before, and we've felt that it wasn't worth
> the additional code duplication. I would not favor approaching this
> with the mindset of lets-copy-and-paste-all-the-code.
>
> However: it might be interesting to think about having just two
> underlying implementations, one for 32-bit datums and one for 64-bits,
> with the existing APIs becoming macros-with-casts wrappers around the
> appropriate one of those. That line of attack might lead to
> physically less code not more. The devil's in the details though.
There's a masterpiece of a type-safe, standard-C-compliant macros+functions
implementation of a vector:
It wraps any struct with "T *data; int length; int capacity" fields, and
uses `sizeof(*(v)->data)` to pass the element size to the wrapped
allocation/move functions.
Although it could not be directly adapted to List*, and it is less
sophisticated with regard to "mutation during iteration", it could be
really useful in the many places where List* is used not as a Node, but just
as a dynamic array.
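The technique described above can be sketched as follows. This is not the implementation being referred to; vec_grow, vec_push and Int64Vec are hypothetical names, and realloc stands in for repalloc. It only shows how `sizeof(*(v)->data)` lets one untyped growth function serve any element type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical untyped helper: double the buffer (or start at 4 elements).
 * The element size is supplied by the macro below.
 */
static void *
vec_grow(void *data, int *capacity, size_t elem_size)
{
    int   new_capacity = *capacity ? *capacity * 2 : 4;
    void *p;

    assert(elem_size > 0);
    p = realloc(data, (size_t) new_capacity * elem_size);
    if (p == NULL)
        abort();
    *capacity = new_capacity;
    return p;
}

/*
 * Works on a pointer to any struct with "T *data; int length; int capacity".
 * The assignment to data[] is checked against T by the compiler.
 * Note that (v) is evaluated more than once, so it must have no side effects.
 */
#define vec_push(v, value) \
    do { \
        if ((v)->length == (v)->capacity) \
            (v)->data = vec_grow((v)->data, &(v)->capacity, \
                                 sizeof(*(v)->data)); \
        (v)->data[(v)->length++] = (value); \
    } while (0)

/* One possible instantiation: a growable array of int64 values. */
typedef struct Int64Vec
{
    int64_t *data;
    int      length;
    int      capacity;
} Int64Vec;
```

A zero-initialized `Int64Vec v = {0};` is a valid empty vector under these assumptions, and `vec_push(&v, some_query_id)` stores the value inline, with no per-element allocation.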
On Sun, Jan 19, 2025 at 9:12 PM Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> 20.01.2025 07:36, Tom Lane wrote:
> ...
>> This has been discussed before, and we've felt that it wasn't worth
>> the additional code duplication. I would not favor approaching this
>> with the mindset of lets-copy-and-paste-all-the-code.
>>
>> However: it might be interesting to think about having just two
>> underlying implementations, one for 32-bit datums and one for 64-bits,
>> with the existing APIs becoming macros-with-casts wrappers around the
>> appropriate one of those. That line of attack might lead to
>> physically less code not more. The devil's in the details though.
>
> There's masterpiece typesafe std-C compliant macros+functions
> implementation of vector:
>
> It wraps any struct with "T *data; int length; int capacity" fields, and
> uses `sizeof(*(v)->data)` to instruct wrapped allocation/move functions.
>
> Although it could not be directly adapted to List*, and it is less
> sophisticated considering "mutation during iteration", it could be
> really useful in many places, where List* used not as a Node, but just
> as dynamic array.
+1 to adopting something like this ^ for std::vector-like resizable
arrays of fixed-size values.
The overhead of pg_list is not just the cost of an extra pointer
dereference (because you have to store the List as a T* array, rather
than a T array), but also the palloc() overhead, since every T* must
point to a T, and so that T must live somewhere on the heap.
I sometimes see palloc() showing up in perf reports, so the allocation
cost of T* vs. T seems non-zero to me.
In one case, I want to store a struct of four 4-byte fields in some sort of
list / expandable array. From a human-readability point of view, it's awkward
to split the struct into four IntLists; from a CPU point of view, it's
awkward to spend an 8-byte pointer on pointing to a 16-byte struct
allocated on the heap.
James
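The allocation argument above can be made concrete with a small standalone sketch. The Quad struct, the counting allocator and both builder functions are hypothetical, and malloc stands in for palloc. It compares storing n 16-byte structs inline in one array against storing n pointers to individually allocated structs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical 16-byte element: four 4-byte fields. */
typedef struct Quad
{
    int32_t a, b, c, d;
} Quad;

/* Counts how many separate heap allocations were made. */
static size_t allocation_count = 0;

static void *
counted_alloc(size_t size)
{
    void *p = malloc(size);

    if (p == NULL)
        abort();
    allocation_count++;
    return p;
}

/* "T array": all elements live in one contiguous allocation. */
static Quad *
build_inline(int n)
{
    Quad *items;

    assert(n > 0);
    items = counted_alloc((size_t) n * sizeof(Quad));
    for (int i = 0; i < n; i++)
        items[i] = (Quad) {i, i, i, i};
    return items;
}

/* "T* array": one allocation for the pointers plus one per element. */
static Quad **
build_pointers(int n)
{
    Quad **items;

    assert(n > 0);
    items = counted_alloc((size_t) n * sizeof(Quad *));
    for (int i = 0; i < n; i++)
    {
        items[i] = counted_alloc(sizeof(Quad));
        *items[i] = (Quad) {i, i, i, i};
    }
    return items;
}
```

For n elements the inline layout makes 1 allocation and the pointer layout makes n + 1, in addition to the pointer itself and any per-chunk allocator overhead for each element.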