GUC with units, details

Started by Peter Eisentraut over 19 years ago, 49 messages, hackers
#1Peter Eisentraut
peter_e@gmx.net

It seems everyone likes the units, so here are some details of the
implementation I have prepared.

Memory units are kB, MB, GB. The factor is 1024.

Time units are ms, s, min, h, d.

I intentionally did not support microseconds because that would make the
coding extremely overflow-risky, and the only candidate, commit_delay,
isn't used much. This can be added once int64 support is required.
For similar reasons, the unit "byte" is not supported.

The full list of candidates then is:

post_auth_delay s
deadlock_timeout ms
vacuum_cost_delay ms
autovacuum_vacuum_cost_delay ms
statement_timeout ms
authentication_timeout s
pre_auth_delay s
checkpoint_timeout s
log_min_duration_statement ms
bgwriter_delay ms
log_rotation_age min
autovacuum_naptime s
tcp_keepalives_idle s
tcp_keepalives_interval s

shared_buffers 8kB
temp_buffers 8kB
work_mem kB
maintenance_work_mem kB
log_rotation_size kB
effective_cache_size kB (pending switch to int)

Units can be specified with or without a space after the number. In the
configuration file, a value written with a space after the number must
be quoted in its entirety; a value without a space needs no quotes. With
SET, of course, you have to quote anyway.

If you don't specify any unit, you get the behavior from before.
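
The input rules above can be sketched in Python as follows. This is only an illustration, not the patch itself: the function name parse_setting, the tables, the exact-case matching, and the choice to reject values that are not a whole multiple of the parameter's base unit are all assumptions made for the example.

```python
import re

# Multipliers relative to the smallest unit of each family (kB and ms).
MEMORY_KB = {"kB": 1, "MB": 1024, "GB": 1024 * 1024}
TIME_MS = {"ms": 1, "s": 1000, "min": 60_000, "h": 3_600_000, "d": 86_400_000}

# Base units from the candidate list: (unit family, size of the base unit).
BASE_UNITS = {
    "kB": (MEMORY_KB, 1),
    "8kB": (MEMORY_KB, 8),
    "ms": (TIME_MS, 1),
    "s": (TIME_MS, 1000),
    "min": (TIME_MS, 60_000),
}

# An integer, optional whitespace, then an optional unit name.
_VALUE_RE = re.compile(r"^\s*(-?\d+)\s*([A-Za-z]*)\s*$")


def parse_setting(text, base_unit):
    """Return the integer value of `text` expressed in `base_unit`."""
    family, base_size = BASE_UNITS[base_unit]
    match = _VALUE_RE.match(text)
    if match is None:
        raise ValueError(f"invalid value: {text!r}")
    number, unit = int(match.group(1)), match.group(2)
    if unit == "":
        # No unit given: keep the old behavior, the number is in base units.
        return number
    if unit not in family:
        raise ValueError(f"unknown unit {unit!r} for base unit {base_unit!r}")
    scaled = number * family[unit]
    if scaled % base_size != 0:
        # Assumption of this sketch: reject rather than round.
        raise ValueError(f"{text!r} is not a whole number of {base_unit}")
    return scaled // base_size
```

For example, parse_setting("10 MB", "kB") and parse_setting("10MB", "kB") both give 10240, while parse_setting("100", "8kB") stays 100.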

Output from SHOW uses the largest unit that fits as long as the number
is an integer. (I guess you could change that later to some more
complex scheme, but I feel that this is better than what we have.) If
the value is zero or negative, no unit is used. (-1 sometimes
means "off".)
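
A minimal Python sketch of the SHOW rule described above: pick the largest unit for which the value is still an integer, and print zero or negative values without a unit. The name show_setting and the tables are hypothetical and only illustrate the rule as stated.

```python
# Units from largest to smallest, sized in the smallest unit of the family.
MEMORY_UNITS = [("GB", 1024 * 1024), ("MB", 1024), ("kB", 1)]
TIME_UNITS = [("d", 86_400_000), ("h", 3_600_000), ("min", 60_000),
              ("s", 1000), ("ms", 1)]

# Base units from the candidate list: (unit family, size of the base unit).
BASE_UNITS = {
    "kB": (MEMORY_UNITS, 1),
    "8kB": (MEMORY_UNITS, 8),
    "ms": (TIME_UNITS, 1),
    "s": (TIME_UNITS, 1000),
    "min": (TIME_UNITS, 60_000),
}


def show_setting(value, base_unit):
    """Format `value` (in `base_unit`) with the largest unit that fits."""
    if value <= 0:
        # Zero or negative (for example -1 meaning "off"): no unit.
        return str(value)
    units, base_size = BASE_UNITS[base_unit]
    total = value * base_size
    for name, size in units:
        if total % size == 0:
            return f"{total // size}{name}"
    return str(value)  # not reached: the smallest unit always divides
```

With these tables, show_setting(10240, "kB") yields "10MB", but show_setting(1000, "kB") stays "1000kB" because 1000 kB is not a whole number of MB.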

The error messages complaining about range violations and similar things
should perhaps also be changed to use units.

I'm a bit afraid of removing all references to the true internal units
of these parameters, because before long someone will see a message "I
don't like value 123" and they won't know what unit it is. We'll have
to deal with those as we go along I guess.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#1)
Re: GUC with units, details

Peter Eisentraut <peter_e@gmx.net> writes:

Output from SHOW uses the largest unit that fits as long as the number
is an integer.

That seems OK for SHOW, which is mainly intended for human consumption,
but what will you do with pg_settings? For programmatic use I think
we want more predictable behavior.

I'm inclined to suggest adding a column "native units" to pg_settings,
which shows what the underlying units are (ie, the existing
interpretations) and then always show the value of a given variable
in its native unit.

regards, tom lane

#3Michael Glaesemann
grzm@seespotcode.net
In reply to: Peter Eisentraut (#1)
Re: GUC with units, details

On Jul 26, 2006, at 6:56 , Peter Eisentraut wrote:

Memory units are kB, MB, GB. The factor is 1024.

Time units are ms, s, min, h, d.

Are units case-sensitive? I've noticed you've been consistent in your
capitalization in these posts, so I'm wondering if you're enforcing
the same case in postgresql.conf.

Michael Glaesemann
grzm seespotcode net

#4Michael Glaesemann
grzm@seespotcode.net
In reply to: Michael Glaesemann (#3)
Re: GUC with units, details

On Jul 26, 2006, at 7:12 , Michael Glaesemann wrote:

On Jul 26, 2006, at 6:56 , Peter Eisentraut wrote:

Memory units are kB, MB, GB. The factor is 1024.

Time units are ms, s, min, h, d.

Are units case-sensitive? I've noticed you've been consistent in
your capitalization in these posts, so I'm wondering if you're
enforcing the same case in postgresql.conf.

Sorry for the noise. Didn't see that this was answered previously.

Michael Glaesemann
grzm seespotcode net

#5Bort, Paul
pbort@tmwsystems.com
In reply to: Peter Eisentraut (#1)
Re: GUC with units, details

Peter Eisentraut wrote:

Memory units are kB, MB, GB. The factor is 1024.

Then shouldn't the factor be 1000? If the factor is 1024, then the units
should be KiB, MiB, GiB per IEEE 1541
(http://en.wikipedia.org/wiki/IEEE_1541) and others.

I'm not trying to be pedantic, but the general approach with -hackers
seems to be towards compliance where practical.

Regards,
Paul Bort

#6Peter Eisentraut
peter_e@gmx.net
In reply to: Bort, Paul (#5)
Re: GUC with units, details

Bort, Paul wrote:

I'm not trying to be pedantic, but the general approach with -hackers
seems to be towards compliance where practical.

But in this case it's not practical.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#6)
Re: GUC with units, details

Peter Eisentraut <peter_e@gmx.net> writes:

Bort, Paul wrote:

[ 1000 vs 1024 ]

But in this case it's not practical.

Maybe I'm missing something, but I thought it was fairly common to use
"k" for 1000, "K" for 1024, etc (mnemonic: upper case for the larger
multiplier). So I'd vote for accepting "KB" rather than "kB" ...

regards, tom lane

#8Neil Conway
neilc@samurai.com
In reply to: Tom Lane (#7)
Re: GUC with units, details

On Tue, 2006-07-25 at 19:00 -0400, Tom Lane wrote:

Maybe I'm missing something, but I thought it was fairly common to use
"k" for 1000, "K" for 1024, etc (mnemonic: upper case for the larger
multiplier).

Well, that only works for K vs. k: the SI prefix for mega is M (meaning
10^6), not "m". Similarly for "G".

Why is it "impractical" to use the IEC prefixes?

-Neil

#9Peter Eisentraut
peter_e@gmx.net
In reply to: Neil Conway (#8)
Re: GUC with units, details

Neil Conway wrote:

On Tue, 2006-07-25 at 19:00 -0400, Tom Lane wrote:

Maybe I'm missing something, but I thought it was fairly common to
use "k" for 1000, "K" for 1024, etc (mnemonic: upper case for the
larger multiplier).

Well, that only works for K vs. k: the SI prefix for mega is M
(meaning 10^6), not "m". Similarly for "G".

Indeed. The k vs K idea is an excuse for not wanting to side with
either camp, but it does not scale.

Why is it "impractical" to use the IEC prefixes?

I'd imagine that one of the first things someone will want to try is
something like SET work_mem TO '10MB', which will fail or misbehave
because 10000000 bytes do not divide up into chunks of 1024 bytes. Who
wants to explain to users that they have to write '10MiB'?
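
The arithmetic behind this objection can be checked with a few lines of Python. The variable names are illustrative only.

```python
CHUNK = 1024  # bytes per kB as used by the memory settings

decimal_10mb = 10 * 1000 * 1000   # '10MB' read with SI (decimal) prefixes
binary_10mb = 10 * 1024 * 1024    # '10MB' read with factor 1024

# The decimal reading leaves a remainder, so it is not a whole number of kB.
decimal_remainder = decimal_10mb % CHUNK   # 640
# The binary reading divides evenly into 1024-byte chunks.
binary_remainder = binary_10mb % CHUNK     # 0
binary_kb = binary_10mb // CHUNK           # 10240
```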

Since about forever, PostgreSQL has used kB, MB, GB to describe memory
allocation. If we want to change that, we ought to do it across the
board. But that's a big board.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#10Simon Riggs
simon@2ndQuadrant.com
In reply to: Peter Eisentraut (#9)
Re: GUC with units, details

On Wed, 2006-07-26 at 08:12 +0200, Peter Eisentraut wrote:

Neil Conway wrote:

On Tue, 2006-07-25 at 19:00 -0400, Tom Lane wrote:

Maybe I'm missing something, but I thought it was fairly common to
use "k" for 1000, "K" for 1024, etc (mnemonic: upper case for the
larger multiplier).

Well, that only works for K vs. k: the SI prefix for mega is M
(meaning 10^6), not "m". Similarly for "G".

Indeed. The k vs K idea is an excuse for not wanting to side with
either camp, but it does not scale.

Why is it "impractical" to use the IEC prefixes?

I'd imagine that one of the first things someone will want to try is
something like SET work_mem TO '10MB', which will fail or misbehave
because 10000000 bytes do not divide up into chunks of 1024 bytes. Who
wants to explain to users that they have to write '10MiB'?

Since about forever, PostgreSQL has used kB, MB, GB to describe memory
allocation. If we want to change that, we ought to do it across the
board. But that's a big board.

Neil is right: K, M, G are the correct SI terms, however, I don't see
any value in using that here. Nobody is suggesting we encourage or even
allow people to write max_fsm_pages = 10M rather than 10000000, so we
don't ever need to say that K = 1000, AFAICS. I think we are safe to
assume that

kB = KB = kb = Kb = 1024 bytes

mB = MB = mb = Mb = 1024 * 1024 bytes

gB = GB = gb = Gb = 1024 * 1024 * 1024 bytes

There's no value in forcing the use of specific case and it will be just
confusing for people.
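
A sketch of what case-insensitive acceptance could look like, in Python. The table and the function unit_to_bytes are hypothetical, and the sketch deliberately sets aside the bits-versus-bytes distinction between "b" and "B".

```python
# Factor-1024 multipliers, keyed by the lower-cased unit name.
BYTES_PER_UNIT = {"kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def unit_to_bytes(unit):
    """Look up a memory unit while ignoring letter case."""
    return BYTES_PER_UNIT[unit.lower()]
```

Under this scheme "kB", "KB", "kb" and "Kb" all map to 1024 bytes.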

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#11Thomas Hallgren
thhal@mailblocks.com
In reply to: Simon Riggs (#10)
Re: GUC with units, details

Simon Riggs wrote:

don't ever need to say that K = 1000, AFAICS. I think we are safe to
assume that

kB = KB = kb = Kb = 1024 bytes

mB = MB = mb = Mb = 1024 * 1024 bytes

gB = GB = gb = Gb = 1024 * 1024 * 1024 bytes

There's no value in forcing the use of specific case and it will be just
confusing for people.

It's fairly common to use 'b' for 'bits' and 'B' for 'bytes'. My suggestion would be to be
much more restrictive and avoid lower case:

KB = 1024 bytes
MB = 1024 KB
GB = 1024 MB
TB = 1024 GB

Although I don't expect to see bit-rates or fractions ('m' == 'milli') in GUC, it might be
good to use consistent units everywhere.

Regards,
Thomas Hallgren

#12Peter Eisentraut
peter_e@gmx.net
In reply to: Simon Riggs (#10)
Re: GUC with units, details

Simon Riggs wrote:

There's no value in forcing the use of specific case and it will be
just confusing for people.

The issue was not the case of the units, but people were suggesting that
we should enforce the use of KiB, MiB, and GiB.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#13Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#2)
Re: GUC with units, details

Tom Lane wrote:

That seems OK for SHOW, which is mainly intended for human
consumption, but what will you do with pg_settings? For programmatic
use I think we want more predictable behavior.

I'd think that a program would not care. Or do you want a units-free
display that can be parsed as integer?

Do we want to introduce a difference between pg_settings and SHOW ALL?
(Is there one already?)

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#13)
Re: GUC with units, details

Peter Eisentraut <peter_e@gmx.net> writes:

Tom Lane wrote:

That seems OK for SHOW, which is mainly intended for human
consumption, but what will you do with pg_settings? For programmatic
use I think we want more predictable behavior.

I'd think that a program would not care. Or do you want a units-free
display that can be parsed as integer?

Yeah. If the value might be shown as either "99kB" or "9MB" then a
program *must* have a pretty complete understanding of the units system
to make sense of it at all. Furthermore this is not backwards
compatible --- it'll break any existing code that inspects pg_settings
values. I suggest that the values column should continue to display
exactly as it does today (ie, the integer value in the var's native
units) and we just add a column saying what the native units are.
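
To illustrate the point for client code, here is a small Python sketch. The rows and the three-column layout (name, value, unit) are made up for the example; they stand in for a pg_settings that keeps the plain integer value and adds a separate column for the native unit.

```python
# Hypothetical rows: (name, value, unit); unit is None for unitless settings.
ROWS = [
    ("work_mem", "1024", "kB"),
    ("shared_buffers", "1000", "8kB"),
    ("max_connections", "100", None),
]


def read_settings(rows):
    """Parse every value as a plain integer; the unit rides along separately."""
    return {name: (int(value), unit) for name, value, unit in rows}


def parses_as_int(text):
    """Report whether existing integer-parsing client code would accept `text`."""
    try:
        int(text)
    except ValueError:
        return False
    return True
```

A value such as "1024" still parses as before, whereas a display form such as "99kB" would make existing integer-parsing code fail.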

Do we want to introduce a difference between pg_settings and SHOW ALL?

Yup, I think that's the lesser of the evils.

regards, tom lane

#15Bort, Paul
pbort@tmwsystems.com
In reply to: Peter Eisentraut (#9)
Re: GUC with units, details

Peter Eisentraut wrote:

I'd imagine that one of the first things someone will want to try is
something like SET work_mem TO '10MB', which will fail or misbehave
because 10000000 bytes do not divide up into chunks of 1024 bytes. Who
wants to explain to users that they have to write '10MiB'?

How about this:

INFO: Your setting was converted to IEC standard binary units. Use KiB,
MiB, and GiB to avoid this warning.

Since about forever, PostgreSQL has used kB, MB, GB to describe memory
allocation. If we want to change that, we ought to do it across the
board. But that's a big board.

The standard hasn't been around forever; some incarnation of PostgreSQL
certainly pre-dates it. But it was created to reduce confusion between
binary and decimal units.

The Linux kernel changed to the standard years ago. And that's just a
few more lines of code than PostgreSQL. ( http://kerneltrap.org/node/340
and others )

Regards,
Paul Bort

#16Peter Eisentraut
peter_e@gmx.net
In reply to: Bort, Paul (#15)
Re: GUC with units, details

Bort, Paul wrote:

The Linux kernel changed to the standard years ago. And that's just a
few more lines of code than PostgreSQL. (
http://kerneltrap.org/node/340 and others )

For your entertainment, here are the usage numbers from the linux-2.6.17
kernel:

kilobyte (-i) 82
kibibyte (-i) 2
megabyte (-i) 98
mebibyte (-i) 0
gigabyte (-i) 32
gibibyte (-i) 0

KB 1151
kB 407
KiB 181
MB 3830
MiB 298
GB 815
GiB 17

So I remain unconvinced.

Of course, your general point is a good one. If there are actually
systems using this, it might be worth considering. But if not, then
we're just going to confuse people.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#17Bort, Paul
pbort@tmwsystems.com
In reply to: Peter Eisentraut (#16)
Re: GUC with units, details

Peter Eisentraut politely corrected:

For your entertainment, here are the usage numbers from the linux-2.6.17
kernel:

kilobyte (-i) 82
kibibyte (-i) 2
megabyte (-i) 98
mebibyte (-i) 0
gigabyte (-i) 32
gibibyte (-i) 0

KB 1151
kB 407
KiB 181
MB 3830
MiB 298
GB 815
GiB 17

Thanks for the info. I had seen several articles on it, and shot my
mouth off without double-checking. My apologies.

I still think it would be a good idea to use the standard, and that this
is an opportunity to do so.

Regards,
Paul Bort

#18Andreas Pflug
pgadmin@pse-consulting.de
In reply to: Peter Eisentraut (#16)
Re: GUC with units, details

Peter Eisentraut wrote:

Bort, Paul wrote:

The Linux kernel changed to the standard years ago. And that's just a
few more lines of code than PostgreSQL. (
http://kerneltrap.org/node/340 and others )

For your entertainment, here are the usage numbers from the linux-2.6.17
kernel:

kilobyte (-i) 82
kibibyte (-i) 2
megabyte (-i) 98
mebibyte (-i) 0
gigabyte (-i) 32
gibibyte (-i) 0

KB 1151
kB 407
KiB 181
MB 3830
MiB 298
GB 815
GiB 17

So I remain unconvinced.

Of course, your general point is a good one. If there are actually
systems using this, it might be worth considering. But if not, then
we're just going to confuse people.

Is it worth bothering about the small deviation, if 10000 was meant, but
10k gives 10240 buffers? Isn't it quite common that systems round config
values to the next sensible value anyway?

Regards,
Andreas

#19Martijn van Oosterhout
kleptog@svana.org
In reply to: Bort, Paul (#15)
Re: GUC with units, details

On Wed, Jul 26, 2006 at 12:17:00PM -0400, Bort, Paul wrote:

Peter Eisentraut wrote:

I'd imagine that one of the first things someone will want to try is
something like SET work_mem TO '10MB', which will fail or misbehave
because 10000000 bytes do not divide up into chunks of 1024 bytes. Who
wants to explain to users that they have to write '10MiB'?

How about this:

INFO: Your setting was converted to IEC standard binary units. Use KiB,
MiB, and GiB to avoid this warning.

That's silly. If you're going to treat KB as 1024 bytes anyway,
complaining about it is just being pedantic.

The thing is, most memory sizes in postgres need to be some multiple of
a page size. You can't have a shared buffers of exactly 100000 bytes,
while 102400 bytes is possible. When someone has a GB of memory, they
really mean a GiB, but no-one bothers to correct them.

Is there anywhere in postgres where using K=1000 would be significantly
clearer than K=1024?

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

From each according to his ability. To each according to his ability to litigate.

#20Bort, Paul
pbort@tmwsystems.com
In reply to: Martijn van Oosterhout (#19)
Re: GUC with units, details

Martijn van Oosterhout wrote:

How about this:

INFO: Your setting was converted to IEC standard binary units. Use KiB,
MiB, and GiB to avoid this warning.

That's silly. If you're going to treat KB as 1024 bytes anyway,
complaining about it is just being pedantic.

But after a version or two with warnings, we have grounds to make it an
error. I'd rather just go with the standard from day 1 and reject
decimal units where they don't make sense, but that seems unlikely.

The thing is, most memory sizes in postgres need to be some multiple of
a page size. You can't have a shared buffers of exactly 100000 bytes,
while 102400 bytes is possible. When someone has a GB of memory, they
really mean a GiB, but no-one bothers to correct them.

And hard drives are just the opposite: a 250GB drive does not have
268,435,456,000 bytes of unformatted space.
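
The drive example works out as follows in Python; the variable names are illustrative only.

```python
# A "250GB" drive under the two readings of the prefix.
binary_bytes = 250 * 1024 ** 3    # factor 1024: 268,435,456,000 bytes
decimal_bytes = 250 * 1000 ** 3   # factor 1000: 250,000,000,000 bytes

# Gap between the two readings, in bytes.
difference = binary_bytes - decimal_bytes  # 18,435,456,000
```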

Is there anywhere in postgres where using K=1000 would be significantly
clearer than K=1024?

If the unit for a setting is pages, then a value of '1K' could cause
some confusion as to whether that's 1,000 or 1,024 pages.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

From each according to his ability. To each according to his ability to litigate.

#21Michael Glaesemann
grzm@seespotcode.net
In reply to: Martijn van Oosterhout (#19)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Glaesemann (#21)
#23Michael Glaesemann
grzm@seespotcode.net
In reply to: Tom Lane (#22)
#24Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#22)
#25Peter Eisentraut
peter_e@gmx.net
In reply to: Bort, Paul (#17)
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#24)
#27Bort, Paul
pbort@tmwsystems.com
In reply to: Peter Eisentraut (#25)
#28Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bort, Paul (#27)
#29Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Bort, Paul (#27)
#30Csaba Nagy
nagy@ecircle-ag.com
In reply to: Jim Nasby (#29)
#31Florian Pflug
fgp@phlo.org
In reply to: Tom Lane (#26)
#32Peter Eisentraut
peter_e@gmx.net
In reply to: Jim Nasby (#29)
#33Peter Eisentraut
peter_e@gmx.net
In reply to: Florian Pflug (#31)
#34Florian Pflug
fgp@phlo.org
In reply to: Peter Eisentraut (#33)
#35Csaba Nagy
nagy@ecircle-ag.com
In reply to: Peter Eisentraut (#33)
#36Bort, Paul
pbort@tmwsystems.com
In reply to: Peter Eisentraut (#32)
#37Peter Eisentraut
peter_e@gmx.net
In reply to: Bort, Paul (#36)
#38Bort, Paul
pbort@tmwsystems.com
In reply to: Peter Eisentraut (#37)
#39Ron Mayer
rm_pg@cheapcomplexdevices.com
In reply to: Peter Eisentraut (#32)
#40Martijn van Oosterhout
kleptog@svana.org
In reply to: Peter Eisentraut (#32)
#41Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Nasby (#29)
#42Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#32)
#43Bruce Momjian
bruce@momjian.us
In reply to: Martijn van Oosterhout (#40)
#44Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#41)
#45Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#44)
#46Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#45)
#47Hannu Krosing
hannu@tm.ee
In reply to: Tom Lane (#22)
#48Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#46)
#49Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Peter Eisentraut (#44)