Large writable variables

Started by Andres Freundover 7 years ago51 messageshackers
Jump to latest
#1Andres Freund
andres@anarazel.de

Hi,

while briefly thinking about the per-process overhead of postgres, I
looked at the section sizes of a modern postgres. In particular which
memory areas are *not* shared between processes.

If you look at the section sizes that are mapped read-write:

$ size --format=sysv src/backend/postgres
src/backend/postgres :
section size addr
.rodata 1593617 5730304 (read-only, for reference)
.data.rel.ro 134944 8039904 (read only after start)
.data 53888 8178912 (read-write, initialized)
.bss 510416 8232800 (read-write, uninitialized)
Total 52417197

So we have 500kb of not-initialized memory mapped into every
process. That's, uh, not nothing.

top unitialized allocations:
$ nm -t d --size-sort -r -S src/backend/postgres|grep '\b[bB]\b'|head
0000000008251872 0000000000131144 b LagTracker
0000000008585088 0000000000131104 b hist_entries
0000000008435040 0000000000085280 b DCHCache
0000000008391168 0000000000043840 b NUMCache
0000000008560224 0000000000023440 b tzdefrules_s
0000000008536704 0000000000023440 b gmtmem.7009
0000000008716192 0000000000016384 b hist_start
0000000008238336 0000000000008192 b PqRecvBuffer
0000000008734208 0000000000005136 B BackendWritebackContext
0000000008386368 0000000000003200 b held_lwlocks

So we have a two variables sized 130kb. Yikes.

struct {
XLogRecPtr last_lsn; /* 0 8 */
WalTimeSample buffer[8192]; /* 8 131072 */
/* --- cacheline 2048 boundary (131072 bytes) was 8 bytes ago --- */
int write_head; /* 131080 4 */
int read_heads[3]; /* 131084 12 */
WalTimeSample last_read[3]; /* 131096 48 */
/* --- cacheline 2049 boundary (131136 bytes) was 8 bytes ago --- */

/* size: 131144, cachelines: 2050, members: 5 */
/* last cacheline: 8 bytes */
};

that's not actually used very often, nor in all processes... Thomas?

hist_entries is pg_lzcompress; DCHCache,NUMCache are formatting.c.

top itialized allocations:
$ nm -t d --size-sort -r -S src/backend/postgres|grep '\b[dD]\b'|head
0000000008086944 0000000000087904 D fmgr_builtins
0000000008201120 0000000000017280 d ConfigureNamesInt
0000000008218400 0000000000013680 d ConfigureNamesBool
0000000008189248 0000000000008512 d ConfigureNamesString
0000000008077344 0000000000007040 D ScanKeywords
0000000008184928 0000000000004320 d ConfigureNamesEnum
0000000008197760 0000000000003360 d ConfigureNamesReal
0000000008062976 0000000000002304 d DCH_keywords
0000000008069952 0000000000002016 D pg_wchar_table
0000000008075552 0000000000001776 d encoding_match_list

fmgr_builtins isn't readonly even though declared a const - I assume
because it's full of addresses that will be mapped differently from
execution to execution.

ConfigureNames* isn't marked as const, because we update them:
/* Rather than requiring vartype to be filled in by hand, do this: */
conf->gen.vartype = PGC_BOOL;

I'm unclear as to why ScanKeywords, DCH_keywords aren't in a readonly
section.

Greetings,

Andres Freund

#2Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#1)
Re: Large writable variables

Hi,

On 2018-10-15 13:07:54 -0700, Andres Freund wrote:

top itialized allocations:
$ nm -t d --size-sort -r -S src/backend/postgres|grep '\b[dD]\b'|head
0000000008086944 0000000000087904 D fmgr_builtins
0000000008201120 0000000000017280 d ConfigureNamesInt
0000000008218400 0000000000013680 d ConfigureNamesBool
0000000008189248 0000000000008512 d ConfigureNamesString
0000000008077344 0000000000007040 D ScanKeywords
0000000008184928 0000000000004320 d ConfigureNamesEnum
0000000008197760 0000000000003360 d ConfigureNamesReal
0000000008062976 0000000000002304 d DCH_keywords
0000000008069952 0000000000002016 D pg_wchar_table
0000000008075552 0000000000001776 d encoding_match_list

fmgr_builtins isn't readonly even though declared a const - I assume
because it's full of addresses that will be mapped differently from
execution to execution.

ConfigureNames* isn't marked as const, because we update them:
/* Rather than requiring vartype to be filled in by hand, do this: */
conf->gen.vartype = PGC_BOOL;

I'm unclear as to why ScanKeywords, DCH_keywords aren't in a readonly
section.

It's because they contain pointers to strings, which are affected by
relocations (and position independent executables force everything to be
relocatable). They do go into .data.rel.ro however:

$ objdump -t ~/build/postgres/dev-optimize/vpath/src/backend/postgres|grep -E '\b(ScanKeywords|fmgr_builtins|DCH_keywords|pg_wchar_table)\b'
00000000007b0800 l O .data.rel.ro 0000000000000900 DCH_keywords
00000000007b4020 g O .data.rel.ro 0000000000001b80 ScanKeywords
00000000007b65a0 g O .data.rel.ro 0000000000015760 fmgr_builtins
00000000007b2340 g O .data.rel.ro 00000000000007e0 pg_wchar_table

as long as we're using forking rather than EXEC_BACKEND, that's
perfectly fine. I don't quite know how windows handles any of this, so
I can't say whether it's a problem there.

Greetings,

Andres Freund

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#1)
Re: Large writable variables

Andres Freund <andres@anarazel.de> writes:

So we have 500kb of not-initialized memory mapped into every
process. That's, uh, not nothing.

Bleah.

0000000008585088 0000000000131104 b hist_entries
0000000008716192 0000000000016384 b hist_start

I'm unsure what fraction of processes would have use for these.

0000000008435040 0000000000085280 b DCHCache
0000000008391168 0000000000043840 b NUMCache

We could surely allocate these on first use.

0000000008560224 0000000000023440 b tzdefrules_s
0000000008536704 0000000000023440 b gmtmem.7009

I think that tzdefrules_s is not used in common cases (though I could be
wrong about that), so we could win by alloc-on-first-use. The same might
be true for gmtmem, but there's a sticking point: there is no provision
for failure there, so I'm unsure how we avoid crashing on OOM.

0000000008238336 0000000000008192 b PqRecvBuffer
0000000008734208 0000000000005136 B BackendWritebackContext
0000000008386368 0000000000003200 b held_lwlocks

These are below my personal threshold of pain.

fmgr_builtins isn't readonly even though declared a const - I assume
because it's full of addresses that will be mapped differently from
execution to execution.

Check.

I'm unclear as to why ScanKeywords, DCH_keywords aren't in a readonly
section.

I think it's the same problem: pointers can't be truly const because
they have to be changed if one relocates the executable.

We could possibly fix these by changing the data structure so that
what's in a ScanKeywords entry is an offset into some giant string
constant somewhere. No idea how that would affect performance, but
I do notice that we could reduce the sizeof(ScanKeyword), which can't
hurt.

regards, tom lane

#4Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#3)
Re: Large writable variables

Hi,

On 2018-10-15 16:36:26 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

So we have 500kb of not-initialized memory mapped into every
process. That's, uh, not nothing.

Bleah.

Yea...

0000000008585088 0000000000131104 b hist_entries
0000000008716192 0000000000016384 b hist_start

I'm unsure what fraction of processes would have use for these.

Yea, I'm not sure either. Although I suspect that given the cost of
compression having an "allocate on first use" check should be quite
doable.

0000000008435040 0000000000085280 b DCHCache
0000000008391168 0000000000043840 b NUMCache

We could surely allocate these on first use.

yep.

0000000008560224 0000000000023440 b tzdefrules_s
0000000008536704 0000000000023440 b gmtmem.7009

I think that tzdefrules_s is not used in common cases (though I could be
wrong about that), so we could win by alloc-on-first-use. The same might
be true for gmtmem, but there's a sticking point: there is no provision
for failure there, so I'm unsure how we avoid crashing on OOM.

I guess we could return false / NULL to the caller. Not perfect, but
there's not that many callers. We could wrap them in wrappers that throw
errors...

0000000008238336 0000000000008192 b PqRecvBuffer
0000000008734208 0000000000005136 B BackendWritebackContext
0000000008386368 0000000000003200 b held_lwlocks

These are below my personal threshold of pain.

Yea, agreed. PqRecvBuffer and held_lwlocks are common enough that it
makes sense to pre-allocate them anyway. I guess you could argue that
BackendWritebackContext should be dynamically allocated.

I'm unclear as to why ScanKeywords, DCH_keywords aren't in a readonly
section.

I think it's the same problem: pointers can't be truly const because
they have to be changed if one relocates the executable.

Right. It's, as I noticed when looking via objdupm, in .data.rel.ro, so
I think it's not that bad.

We could possibly fix these by changing the data structure so that
what's in a ScanKeywords entry is an offset into some giant string
constant somewhere. No idea how that would affect performance, but
I do notice that we could reduce the sizeof(ScanKeyword), which can't
hurt.

Yea, that might even help performancewise. Alternatively we could change
ScanKeyword to store the keyword name inline, but that'd be a measurable
size increase...

Greetings,

Andres Freund

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#4)
Re: Large writable variables

Andres Freund <andres@anarazel.de> writes:

On 2018-10-15 16:36:26 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

0000000008585088 0000000000131104 b hist_entries
0000000008716192 0000000000016384 b hist_start

I'm unsure what fraction of processes would have use for these.

Yea, I'm not sure either. Although I suspect that given the cost of
compression having an "allocate on first use" check should be quite
doable.

I don't think the extra check would be a problem, but if we end up
needing the space in most processes, we're not really buying anything.
It'd need some investigation.

We could possibly fix these by changing the data structure so that
what's in a ScanKeywords entry is an offset into some giant string
constant somewhere. No idea how that would affect performance, but
I do notice that we could reduce the sizeof(ScanKeyword), which can't
hurt.

Yea, that might even help performancewise. Alternatively we could change
ScanKeyword to store the keyword name inline, but that'd be a measurable
size increase...

Yeah. It also seems like doing it this way would improve locality of
access: the pieces of the giant string would presumably be in the same
order as the ScanKeywords entries, whereas with the current setup,
who knows where the compiler has put 'em or in what order.

We'd need some tooling to generate the constants that way, though;
I can't see how to make it directly from kwlist.h.

regards, tom lane

#6Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#5)
Re: Large writable variables

Hi,

On 2018-10-15 16:54:53 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2018-10-15 16:36:26 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

0000000008585088 0000000000131104 b hist_entries
0000000008716192 0000000000016384 b hist_start

I'm unsure what fraction of processes would have use for these.

Yea, I'm not sure either. Although I suspect that given the cost of
compression having an "allocate on first use" check should be quite
doable.

I don't think the extra check would be a problem, but if we end up
needing the space in most processes, we're not really buying anything.
It'd need some investigation.

I don't think it's particularly uncommon to have processes that don't
use any toasted datums. I'm not sure however how to put numbers on
that? After all, it'll be drastically workload dependant.

We could possibly fix these by changing the data structure so that
what's in a ScanKeywords entry is an offset into some giant string
constant somewhere. No idea how that would affect performance, but
I do notice that we could reduce the sizeof(ScanKeyword), which can't
hurt.

Yea, that might even help performancewise. Alternatively we could change
ScanKeyword to store the keyword name inline, but that'd be a measurable
size increase...

Yeah. It also seems like doing it this way would improve locality of
access: the pieces of the giant string would presumably be in the same
order as the ScanKeywords entries, whereas with the current setup,
who knows where the compiler has put 'em or in what order.

I assume you're talking about the offset approach. Performancewise I
assume that my suggestion of inlining the names into the struct would be
faster. Are there many realistic cases where performance matters enough
to warrant the size increase?

We'd need some tooling to generate the constants that way, though;
I can't see how to make it directly from kwlist.h.

I guess we could create a struct with all the strings as members. But
that seems fairly nasty.

Greetings,

Andres Freund

#7Thomas Munro
thomas.munro@gmail.com
In reply to: Andres Freund (#1)
Re: Large writable variables

On Tue, Oct 16, 2018 at 9:07 AM Andres Freund <andres@anarazel.de> wrote:

$ nm -t d --size-sort -r -S src/backend/postgres|grep '\b[bB]\b'|head
0000000008251872 0000000000131144 b LagTracker

...

So we have a two variables sized 130kb. Yikes.

...

that's not actually used very often, nor in all processes... Thomas?

Yeah, here's a patch to move it in to the heap.

--
Thomas Munro
http://www.enterprisedb.com

Attachments:

0001-Move-the-replication-lag-tracker-into-heap-memory.patchapplication/octet-stream; name=0001-Move-the-replication-lag-tracker-into-heap-memory.patchDownload+29-28
#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#6)
Re: Large writable variables

Andres Freund <andres@anarazel.de> writes:

On 2018-10-15 16:54:53 -0400, Tom Lane wrote:

Yeah. It also seems like doing it this way would improve locality of
access: the pieces of the giant string would presumably be in the same
order as the ScanKeywords entries, whereas with the current setup,
who knows where the compiler has put 'em or in what order.

I assume you're talking about the offset approach. Performancewise I
assume that my suggestion of inlining the names into the struct would be
faster. Are there many realistic cases where performance matters enough
to warrant the size increase?

Doubt it, because there'd be an awful lot of wasted space due to the need
to set the struct size large enough for the longest keyword. (Plus it
would likely not come out to be a power-of-2 size, slowing array
indexing.) If you want this to be cache-friendly, I'd think the smaller
the better.

regards, tom lane

#9Andres Freund
andres@anarazel.de
In reply to: Thomas Munro (#7)
Re: Large writable variables

On 2018-10-16 10:16:34 +1300, Thomas Munro wrote:

On Tue, Oct 16, 2018 at 9:07 AM Andres Freund <andres@anarazel.de> wrote:

$ nm -t d --size-sort -r -S src/backend/postgres|grep '\b[bB]\b'|head
0000000008251872 0000000000131144 b LagTracker

...

So we have a two variables sized 130kb. Yikes.

...

that's not actually used very often, nor in all processes... Thomas?

Yeah, here's a patch to move it in to the heap.

--
Thomas Munro
http://www.enterprisedb.com

From f3fb64e67c1e86e11dfafc8a712e9750f650f60b Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@enterprisedb.com>
Date: Tue, 16 Oct 2018 10:08:57 +1300
Subject: [PATCH] Move the replication lag tracker into heap memory.

Andres Freund complained about the 128KB of .bss occupied by LagTracker.
It's only needed in the walsender process, so allocate it in heap
memory there.

Cool, let's do that.

Greetings,

Andres Freund

#10Thomas Munro
thomas.munro@gmail.com
In reply to: Andres Freund (#9)
Re: Large writable variables

On Tue, Oct 16, 2018 at 10:57 AM Andres Freund <andres@anarazel.de> wrote:

Subject: [PATCH] Move the replication lag tracker into heap memory.

Andres Freund complained about the 128KB of .bss occupied by LagTracker.
It's only needed in the walsender process, so allocate it in heap
memory there.

Cool, let's do that.

Pushed.

--
Thomas Munro
http://www.enterprisedb.com

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#3)
Re: Large writable variables

I wrote:

Andres Freund <andres@anarazel.de> writes:

0000000008560224 0000000000023440 b tzdefrules_s
0000000008536704 0000000000023440 b gmtmem.7009

I think that tzdefrules_s is not used in common cases (though I could be
wrong about that), so we could win by alloc-on-first-use. The same might
be true for gmtmem, but there's a sticking point: there is no provision
for failure there, so I'm unsure how we avoid crashing on OOM.

So my intuition was exactly backwards on this.

1. tzdefrules_s is filled in the postmaster, and if it's not filled there
it'd be filled by every child process, so there's zero point in converting
it to malloc-on-the-fly style. This is because tzparse() always wants
it filled. That's actually a tad annoying, because it looks to me like
with many timezone settings the data will not get used, but I'm hesitant
to mess with the IANA code's logic enough to improve that. Maybe I'll try
submitting an upstream patch and see what they think of it first.

2. gmtmem, on the other hand, is only used if somebody calls pg_gmtime,
which already has a perfectly good error-report convention, cf gmtime(3).
We have exactly one caller, which was not bothering to test for error :-(
and would dump core on failure. And that caller isn't reached in most
processes, at least not in the regression tests. So this side of it is
easy to improve.

Hence I propose the attached patch for point 2, which I'd want to
backpatch to keep our copies of the IANA code in sync. Point 1
maybe can be addressed in future.

regards, tom lane

Attachments:

allocate-gmtmem-only-when-needed.patchtext/x-diff; charset=us-ascii; name=allocate-gmtmem-only-when-needed.patchDownload+14-11
#12Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#11)
Re: Large writable variables

Hi,

I've a patch making .data variables constant, fwiw. Will post in a
bit. Just so we don't duplicate work too much.

On 2018-10-15 19:41:49 -0400, Tom Lane wrote:

I wrote:

Andres Freund <andres@anarazel.de> writes:

0000000008560224 0000000000023440 b tzdefrules_s
0000000008536704 0000000000023440 b gmtmem.7009

I think that tzdefrules_s is not used in common cases (though I could be
wrong about that), so we could win by alloc-on-first-use. The same might
be true for gmtmem, but there's a sticking point: there is no provision
for failure there, so I'm unsure how we avoid crashing on OOM.

So my intuition was exactly backwards on this.

1. tzdefrules_s is filled in the postmaster, and if it's not filled there
it'd be filled by every child process, so there's zero point in converting
it to malloc-on-the-fly style. This is because tzparse() always wants
it filled. That's actually a tad annoying, because it looks to me like
with many timezone settings the data will not get used, but I'm hesitant
to mess with the IANA code's logic enough to improve that. Maybe I'll try
submitting an upstream patch and see what they think of it first.

Ok, that makes sense.

2. gmtmem, on the other hand, is only used if somebody calls pg_gmtime,
which already has a perfectly good error-report convention, cf gmtime(3).
We have exactly one caller, which was not bothering to test for error :-(
and would dump core on failure. And that caller isn't reached in most
processes, at least not in the regression tests. So this side of it is
easy to improve.

Hence I propose the attached patch for point 2, which I'd want to
backpatch to keep our copies of the IANA code in sync.

That makes sense, too.

Greetings,

Andres Freund

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#1)
Re: Large writable variables

Andres Freund <andres@anarazel.de> writes:

top unitialized allocations:
0000000008435040 0000000000085280 b DCHCache
0000000008391168 0000000000043840 b NUMCache

Here's a patch to improve that situation.

regards, tom lane

Attachments:

alloc-formatting-cache-only-as-needed.patchtext/x-diff; charset=us-ascii; name=alloc-formatting-cache-only-as-needed.patchDownload+33-28
#14Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#12)
Re: Large writable variables

Hi,

On 2018-10-15 16:45:03 -0700, Andres Freund wrote:

I've a patch making .data variables constant, fwiw. Will post in a
bit. Just so we don't duplicate work too much.

I pushed a few very obvious ones.

An additional fix, for the system attributes in heap.c, is attached. But
that's a bit more invasive, so I wanted to get some input.

I think all of the changes are fairly obvious and the new const
attributes just document what already had to be the case. But to
correctly propagate the const through typedef-the-pointer-away typedefs,
I had to use the underlying type (e.g. use const NameData *n2 instad of
Name n2). To me that's the correct fix, but then I hate these typedefs
with a passion anyway. An alternative fix is to add additional
typedefs like ConstName, but I don't find that particularly
advantageous.

Comments?

Arguably we should use consts more aggresively in signatures anyway, but
I think that's a separate effort.

Greetings,

Andres Freund

Attachments:

0001-Correct-constness-of-system-attributes-in-heap.c-pre.patchtext/x-diff; charset=us-asciiDownload+32-32
#15Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#1)
Re: Large writable variables

Hi,

After the last few changes (including a proposed one), we now have in
the .data segment (i.e. mutable initialized variables):

$ objdump -j .data -t ~/build/postgres/dev-optimize/vpath/src/backend/postgres|awk '{print $4, $5, $6}'|sort -r|lessGreetings,
.data 0000000000004380 ConfigureNamesInt
.data 0000000000003570 ConfigureNamesBool
.data 0000000000002140 ConfigureNamesString
.data 00000000000010e0 ConfigureNamesEnum
.data 0000000000000d20 ConfigureNamesReal

These are by far the largest chunk of the .data segment (like 47152
bytes out of 51168 bytes). It's not entirely trivial to fix. While we
can slap consts onto large parts of the functions taking a struct
config_generic*, there's a few places where *do* modify them.

ISTM, to actually fix, we'd have to split
struct config_generic
{
/* constant fields, must be set correctly in initial value: */
const char *name; /* name of variable - MUST BE FIRST */
GucContext context; /* context required to set the variable */
enum config_group group; /* to help organize variables by function */
const char *short_desc; /* short desc. of this variable's purpose */
const char *long_desc; /* long desc. of this variable's purpose */
int flags; /* flag bits, see guc.h */
/* variable fields, initialized at runtime: */
enum config_type vartype; /* type of variable (set only at startup) */
int status; /* status bits, see below */
GucSource source; /* source of the current actual value */
GucSource reset_source; /* source of the reset_value */
GucContext scontext; /* context that set the current value */
GucContext reset_scontext; /* context that set the reset value */
GucStack *stack; /* stacked prior values */
void *extra; /* "extra" pointer for current actual value */
char *sourcefile; /* file current setting is from (NULL if not
* set in config file) */
int sourceline; /* line in source file */
};

into two different arrays. Namely the already denoted 'constant fields'
and 'variable fields. But it's a bit more complicated than that: The
size doesn't come just from the base config_struct, but also from the
config_{bool,int,real,string,enum} arrays. Where we again have separate
'constant fields' and 'variable fields'.

So we'd have to refactor more heavily than just splitting the above
array into two.

I kind of wonder, if we shouldn't remove the separate ConfigureNamesInt*
int arrays and change things so we have one

struct config_def
{
const char *name; /* name of variable - MUST BE FIRST */
GucContext context; /* context required to set the variable */
enum config_group group; /* to help organize variables by function */
const char *short_desc; /* short desc. of this variable's purpose */
const char *long_desc; /* long desc. of this variable's purpose */
int flags; /* flag bits, see guc.h */
enum config_type vartype; /* type of variable (set only at startup) */

union
{
struct
{
bool *variable;
bool boot_val;
GucBoolCheckHook check_hook;
GucBoolAssignHook assign_hook;
GucShowHook show_hook;
} config_bool;

struct
{
int *variable;
int boot_val;
int min;
int max;
GucIntCheckHook check_hook;
GucIntAssignHook assign_hook;
GucShowHook show_hook;
} config_int

...
} pertype;
}

and then a corresponding struct config_val with a similar union.

Besides reducing modified memory, it'd also have the benefit of making
the grouping within the various arrays much more sensible, because
related fields could be grouped together.

.data 0000000000000420 intRelOpts
.data 00000000000001c0 realRelOpts
.data 0000000000000118 boolRelOpts
.data 00000000000000a8 stringRelOpts

these should be readonly, but the code doesn't make that easy. There's
pending refactorings, I don't know how they effect this.

.data 0000000000000068 SnapshotSelfData
.data 0000000000000068 SnapshotAnyData
.data 0000000000000068 SecondarySnapshotData
.data 0000000000000068 CurrentSnapshotData
.data 0000000000000068 CatalogSnapshotData

The obviously actually are modifyable.

.data 0000000000000028 spi_printtupDR
.data 0000000000000028 printsimpleDR
.data 0000000000000028 donothingDR
.data 0000000000000028 debugtupDR

These we could actually make constant, but CreateDestReceiver() as an
API makes that inconvenient. They also are pretty darn small... There's
a security benefit in making them constant and casting the constness
away - I think that might not be insane.

- Andres

#16Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#15)
Re: Large writable variables

Hi,

On 2018-10-15 21:50:51 -0700, Andres Freund wrote:

.data 0000000000000028 spi_printtupDR
.data 0000000000000028 printsimpleDR
.data 0000000000000028 donothingDR
.data 0000000000000028 debugtupDR

These we could actually make constant, but CreateDestReceiver() as an
API makes that inconvenient. They also are pretty darn small... There's
a security benefit in making them constant and casting the constness
away - I think that might not be insane.

I.e. do something like the attached.

Greetings,

Andres Freund

Attachments:

const-dest.difftext/x-diff; charset=us-asciiDownload+21-11
#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#15)
Re: Large writable variables

Andres Freund <andres@anarazel.de> writes:

[ let's rearrange the GUC structs ]

I find it hard to believe that the API breaks you'll cause for extensions
are a good trade for a few kB reduction in the size of the .data segment.

regards, tom lane

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#13)
Re: Large writable variables

I wrote:

Andres Freund <andres@anarazel.de> writes:

top unitialized allocations:
0000000008435040 0000000000085280 b DCHCache
0000000008391168 0000000000043840 b NUMCache

Here's a patch to improve that situation.

Hm, looking at that more closely, there's a problem with the idea of
allocating the cache slots one-at-a-time. Currently,
sizeof(DCHCacheEntry) and sizeof(NUMCacheEntry) are each just a bit more
than a power of 2, which would cause palloc to waste nearly 50% of the
allocation it makes for them.

We could forget the one-at-a-time idea and just allocate the whole
array on first use, but I feel like that's probably not a good answer.
I'm inclined to instead reduce DCH_CACHE_SIZE and NUM_CACHE_SIZE enough
to make the structs just less than powers of 2 instead of just more ---
AFAICS, both of those numbers were picked out of the air, and so there's
no reason not to pick a different number out of the air.

Also, I noticed that the biggest part of those structs are arrays of
FormatNode, which has been designed with complete lack of thought about
size or padding issues. We can very easily cut it in half on 64-bit
machines.

The attached patch does both of those things. It's independent of
the other one but is important to make that one efficient.

regards, tom lane

Attachments:

formatting-shave-bits.patchtext/x-diff; charset=us-ascii; name=formatting-shave-bits.patchDownload+20-16
#19Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#17)
Re: Large writable variables

Hi,

On 2018-10-16 01:49:22 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

[ let's rearrange the GUC structs ]

I find it hard to believe that the API breaks you'll cause for extensions
are a good trade for a few kB reduction in the size of the .data segment.

I'm doubtful that it's worth it too. But more because it seems like a
fair bit of work. I don't think that many extensions use guc_tables.h -
shouldn't most use guc.h, which wouldn't be affected?

Greetings,

Andres Freund

#20Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#16)
Re: Large writable variables

On 2018-10-15 22:02:00 -0700, Andres Freund wrote:

Hi,

On 2018-10-15 21:50:51 -0700, Andres Freund wrote:

.data 0000000000000028 spi_printtupDR
.data 0000000000000028 printsimpleDR
.data 0000000000000028 donothingDR
.data 0000000000000028 debugtupDR

These we could actually make constant, but CreateDestReceiver() as an
API makes that inconvenient. They also are pretty darn small... There's
a security benefit in making them constant and casting the constness
away - I think that might not be insane.

I.e. do something like the attached.

This just reminded me that a couple times I wanted a cast that casts
away const, but otherwise makes sure the type stays the same. I don't
think there's a way to do that in C, but we can write one that verifies
the cast doesn't do something bad if gcc is used:

#if defined(HAVE__BUILTIN_TYPES_COMPATIBLE_P)
#define unconstify(cst, var) StaticAssertExpr(__builtin_types_compatible_p(__typeof(var), const cst), "wrong cast"), (cst) (var)
#else
#define unconstify(cst, var) ((cst) (var))
#endif

Does anybody besides me see value in adding a cleaned up version of
that?

Greetings,

Andres Freund

#21Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#18)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#21)
#23Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#20)
#24Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#23)
#25Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#24)
#26Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#25)
#27Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#26)
#28Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#27)
#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#28)
#30Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#1)
#31Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#1)
#32Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#30)
#33Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#22)
#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#33)
#35Gavin Flower
GavinFlower@archidevsys.co.nz
In reply to: Tom Lane (#34)
#36Andres Freund
andres@anarazel.de
In reply to: Gavin Flower (#35)
#37Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#34)
#38Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#11)
#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#32)
#40Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#39)
#41Peter Eisentraut
peter_e@gmx.net
In reply to: Andres Freund (#20)
#42Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#40)
#43Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#42)
#44Andres Freund
andres@anarazel.de
In reply to: Peter Eisentraut (#41)
#45Peter Eisentraut
peter_e@gmx.net
In reply to: Andres Freund (#44)
#46Andres Freund
andres@anarazel.de
In reply to: Peter Eisentraut (#45)
#47Peter Eisentraut
peter_e@gmx.net
In reply to: Andres Freund (#46)
#48Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#47)
#49Andres Freund
andres@anarazel.de
In reply to: Peter Eisentraut (#48)
#50Andres Freund
andres@anarazel.de
In reply to: Peter Eisentraut (#47)
#51Peter Eisentraut
peter_e@gmx.net
In reply to: Andres Freund (#50)