pg_controldata gobbledygook

Started by Peter Eisentraut over 12 years ago, 13 messages
#1 Peter Eisentraut
peter_e@gmx.net

I'm not sure who is supposed to be able to read this sort of stuff:

Latest checkpoint's NextXID: 0/7575
Latest checkpoint's NextOID: 49152
Latest checkpoint's NextMultiXactId: 7
Latest checkpoint's NextMultiOffset: 13
Latest checkpoint's oldestXID: 1265
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 0
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1

Note that these symbols don't even correspond to the actual symbols used
in the source code in some cases.

The comments in the pg_control.h header file use much more pleasant
terms, which when put to use would lead to output similar to this:

Latest checkpoint's next free transaction ID: 0/7575
Latest checkpoint's next free OID: 49152
Latest checkpoint's next free MultiXactId: 7
Latest checkpoint's next free MultiXact offset: 13
Latest checkpoint's cluster-wide minimum datfrozenxid: 1265
Latest checkpoint's database with cluster-wide minimum datfrozenxid: 1
Latest checkpoint's oldest transaction ID still running: 0
Latest checkpoint's cluster-wide minimum datminmxid: 1
Latest checkpoint's database with cluster-wide minimum datminmxid: 1

One could even rearrange the layout a little bit like this:

Control data as of latest checkpoint:
next free transaction ID: 0/7575
next free OID: 49152
etc.

Comments?

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#1)
Re: pg_controldata gobbledygook

Peter Eisentraut <peter_e@gmx.net> writes:

The comments in the pg_control.h header file use much more pleasant
terms, which when put to use would lead to output similar to this:

Latest checkpoint's next free transaction ID: 0/7575
Latest checkpoint's next free OID: 49152
Latest checkpoint's next free MultiXactId: 7
Latest checkpoint's next free MultiXact offset: 13
Latest checkpoint's cluster-wide minimum datfrozenxid: 1265
Latest checkpoint's database with cluster-wide minimum datfrozenxid: 1
Latest checkpoint's oldest transaction ID still running: 0
Latest checkpoint's cluster-wide minimum datminmxid: 1
Latest checkpoint's database with cluster-wide minimum datminmxid: 1

One could even rearrange the layout a little bit like this:

Control data as of latest checkpoint:
next free transaction ID: 0/7575
next free OID: 49152
etc.

Comments?

I think I've heard of scripts grepping the output of pg_controldata for
this that or the other. Any rewording of the labels would break that.
While I'm not opposed to improving the labels, I would vote against your
second, abbreviated scheme because it would make things ambiguous for
simple grep-based scripts.

regards, tom lane

#3 Peter Geoghegan
pg@heroku.com
In reply to: Peter Eisentraut (#1)
Re: pg_controldata gobbledygook

On Thu, Apr 25, 2013 at 8:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote:

Comments?

+1 from me.

I don't think that these particular changes would break WAL-E,
Heroku's continuous archiving tool, which has a class called
PgControlDataParser. However, it's possible to imagine someone being
affected in a similar way. So I'd be sure to document it clearly, and
to perhaps preserve the old label names to avoid breaking scripts.

--
Peter Geoghegan

#4 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: pg_controldata gobbledygook

Tom Lane wrote:

I think I've heard of scripts grepping the output of pg_controldata for
this that or the other. Any rewording of the labels would break that.
While I'm not opposed to improving the labels, I would vote against your
second, abbreviated scheme because it would make things ambiguous for
simple grep-based scripts.

We could provide two alternative outputs, one for human consumption with
the proposed format and something else that uses, say, shell assignment
syntax. (I did propose this years ago and I might have an unfinished
patch still lingering about somewhere.)
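
For illustration, a minimal Python sketch of what shell-assignment output could look like when derived from the current "Label: value" lines. The function name `to_shell_assignments` and the variable-naming scheme are assumptions made here; they are not part of pg_controldata or of any patch mentioned in this thread.

```python
import re
import shlex


def to_shell_assignments(output):
    """Turn 'Label: value' lines into NAME=value lines (hypothetical format)."""
    lines = []
    for line in output.splitlines():
        # Split on the first colon only, so values containing colons survive.
        label, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'Label: value' line
        # Assumed naming scheme: runs of non-alphanumerics become one underscore.
        name = re.sub(r"[^A-Za-z0-9]+", "_", label.strip()).strip("_").upper()
        # shlex.quote keeps the value safe to source from a POSIX shell.
        lines.append(f"{name}={shlex.quote(value.strip())}")
    return "\n".join(lines)


# Sample lines copied from the first message in this thread.
sample = (
    "Latest checkpoint's NextXID: 0/7575\n"
    "Latest checkpoint's NextOID: 49152\n"
)
print(to_shell_assignments(sample))
```

A label that starts with a digit would not yield a valid shell variable name under this scheme, so a real implementation would need fixed, documented names instead.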

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#5 Fabrízio de Royes Mello
fabriziomello@gmail.com
In reply to: Peter Geoghegan (#3)
Re: pg_controldata gobbledygook

On Fri, Apr 26, 2013 at 12:22 AM, Peter Geoghegan <pg@heroku.com> wrote:

On Thu, Apr 25, 2013 at 8:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote:

Comments?

+1 from me.

I don't think that these particular changes would break WAL-E,
Heroku's continuous archiving tool, which has a class called
PgControlDataParser. However, it's possible to imagine someone being
affected in a similar way. So I'd be sure to document it clearly, and
to perhaps preserve the old label names to avoid breaking scripts.

Why don't we add options to pg_controldata to output the info in several
other formats, like JSON, YAML, XML, or another one?
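
As a rough sketch of the JSON variant, here is a small Python wrapper that converts the existing "Label: value" text into a JSON object. The function name `controldata_to_json` is hypothetical, and keeping the current labels as keys and all values as strings is an assumption; pg_controldata itself has no such option in this discussion.

```python
import json


def controldata_to_json(output):
    """Convert 'Label: value' lines into a JSON object keyed by label."""
    fields = {}
    for line in output.splitlines():
        # Split on the first colon only, so values containing colons survive.
        label, sep, value = line.partition(":")
        if sep:
            fields[label.strip()] = value.strip()
    return json.dumps(fields, indent=2)


# Sample lines copied from the first message in this thread.
sample = (
    "Latest checkpoint's NextMultiXactId: 7\n"
    "Latest checkpoint's NextMultiOffset: 13\n"
)
print(controldata_to_json(sample))
```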

Best regards,

--
Fabrízio de Royes Mello
PostgreSQL Consulting/Coaching

Blog about IT: http://fabriziomello.blogspot.com
LinkedIn profile: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#4)
Re: pg_controldata gobbledygook

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Tom Lane wrote:

I think I've heard of scripts grepping the output of pg_controldata for
this that or the other. Any rewording of the labels would break that.
While I'm not opposed to improving the labels, I would vote against your
second, abbreviated scheme because it would make things ambiguous for
simple grep-based scripts.

We could provide two alternative outputs, one for human consumption with
the proposed format and something else that uses, say, shell assignment
syntax. (I did propose this years ago and I might have an unfinished
patch still lingering about somewhere.)

And a script would use that how? "pg_controldata --machine-friendly"
would fail outright on older versions. I think it's okay to ask script
writers to write
pg_controldata | grep -E 'old label|new label'
but not okay to ask them to deal with anything as complicated as trying
a switch to see if it works or not.
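
The same either-label lookup, sketched in Python for scripts that parse the output rather than grep it. This is only an illustration: the function name `find_field` is hypothetical, and the "new" label is taken from the wording proposed earlier in this thread, not from any released version.

```python
def find_field(output, labels):
    """Return the value of the first line whose label is in `labels`, else None."""
    for line in output.splitlines():
        # Split on the first colon only, so values containing colons survive.
        label, sep, value = line.partition(":")
        if sep and label.strip() in labels:
            return value.strip()
    return None


# Accept both the current label and the proposed rewording.
next_oid_labels = {
    "Latest checkpoint's NextOID",
    "Latest checkpoint's next free OID",
}

old_style = "Latest checkpoint's NextOID: 49152\n"
new_style = "Latest checkpoint's next free OID: 49152\n"
print(find_field(old_style, next_oid_labels))
print(find_field(new_style, next_oid_labels))
```

Matching the whole label rather than a substring avoids accidental hits such as "oldestXID" also matching "oldestXID's DB".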

regards, tom lane

#7 Daniel Farina
daniel@heroku.com
In reply to: Tom Lane (#6)
Re: pg_controldata gobbledygook

On Thu, Apr 25, 2013 at 9:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Tom Lane wrote:

I think I've heard of scripts grepping the output of pg_controldata for
this that or the other. Any rewording of the labels would break that.
While I'm not opposed to improving the labels, I would vote against your
second, abbreviated scheme because it would make things ambiguous for
simple grep-based scripts.

We could provide two alternative outputs, one for human consumption with
the proposed format and something else that uses, say, shell assignment
syntax. (I did propose this years ago and I might have an unfinished
patch still lingering about somewhere.)

And a script would use that how? "pg_controldata --machine-friendly"
would fail outright on older versions. I think it's okay to ask script
writers to write
pg_controldata | grep -E 'old label|new label'
but not okay to ask them to deal with anything as complicated as trying
a switch to see if it works or not.

From what I'm reading, it seems like the main benefit of the changes
is to make things easier for humans to skim over. Automated programs
that care about precise meanings of each field are awkwardly but
otherwise well-served by the precise output as rendered right now.

What about doing something similar but different from the
--machine-readable proposal, such as adding an option for the
*human*-readable variant that is guaranteed to mercilessly change as
human readers and hackers see fit on a whim? It's a bit of a kludge that
this is not the default, but would prevent having to serve two quite
different masters with the same output.

Although I'm not seriously proposing explicitly "-h" (as seen in some
GNU programs in rendering byte sizes and the like...yet could be
confused for 'help'), something like that may serve as prior art.

#8 Gavin Flower
GavinFlower@archidevsys.co.nz
In reply to: Daniel Farina (#7)
Re: pg_controldata gobbledygook

On 26/04/13 18:53, Daniel Farina wrote:

On Thu, Apr 25, 2013 at 9:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Tom Lane wrote:

I think I've heard of scripts grepping the output of pg_controldata for
this that or the other. Any rewording of the labels would break that.
While I'm not opposed to improving the labels, I would vote against your
second, abbreviated scheme because it would make things ambiguous for
simple grep-based scripts.

We could provide two alternative outputs, one for human consumption with
the proposed format and something else that uses, say, shell assignment
syntax. (I did propose this years ago and I might have an unfinished
patch still lingering about somewhere.)

And a script would use that how? "pg_controldata --machine-friendly"
would fail outright on older versions. I think it's okay to ask script
writers to write
pg_controldata | grep -E 'old label|new label'
but not okay to ask them to deal with anything as complicated as trying
a switch to see if it works or not.

From what I'm reading, it seems like the main benefit of the changes
is to make things easier for humans to skim over. Automated programs
that care about precise meanings of each field are awkwardly but
otherwise well-served by the precise output as rendered right now.

What about doing something similar but different from the
--machine-readable proposal, such as adding an option for the
*human*-readable variant that is guaranteed to mercilessly change as
human readers and hackers see fit on a whim? It's a bit of a kludge that
this is not the default, but would prevent having to serve two quite
different masters with the same output.

Although I'm not seriously proposing explicitly "-h" (as seen in some
GNU programs in rendering byte sizes and the like...yet could be
confused for 'help'), something like that may serve as prior art.

I think the current format should remain the default, as Daniel suggests,
but a '--human-readable' (or suitable abbreviation) flag could be added.

The 'ls' command on Linux does something similar when listing directory
details:

(Below, *Y* = 1024 * 1024 * 1024 * 1024 * 1024 * 1024 * 1024 * 1024
bytes = 2^80 bytes.)

man ls
[...]
       -h, --human-readable
              with -l, print sizes in human readable format (e.g., 1K 234M 2G)
[...]
       SIZE may be (or may be an integer optionally followed by) one of
       following: KB 1000, K 1024, MB 1000*1000, M 1024*1024, and so on
       for G, T, P, E, Z, Y.
[...]

Cheers,
Gavin

#9 Andres Freund
andres@2ndquadrant.com
In reply to: Peter Eisentraut (#1)
Re: pg_controldata gobbledygook

On 2013-04-25 23:07:02 -0400, Peter Eisentraut wrote:

I'm not sure who is supposed to be able to read this sort of stuff:

Latest checkpoint's NextXID: 0/7575
Latest checkpoint's NextOID: 49152
Latest checkpoint's NextMultiXactId: 7
Latest checkpoint's NextMultiOffset: 13
Latest checkpoint's oldestXID: 1265
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 0
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1

Note that these symbols don't even correspond to the actual symbols used
in the source code in some cases.

The comments in the pg_control.h header file use much more pleasant
terms, which when put to use would lead to output similar to this:

Latest checkpoint's next free transaction ID: 0/7575
Latest checkpoint's next free OID: 49152
Latest checkpoint's next free MultiXactId: 7
Latest checkpoint's next free MultiXact offset: 13
Latest checkpoint's cluster-wide minimum datfrozenxid: 1265
Latest checkpoint's database with cluster-wide minimum datfrozenxid: 1
Latest checkpoint's oldest transaction ID still running: 0
Latest checkpoint's cluster-wide minimum datminmxid: 1
Latest checkpoint's database with cluster-wide minimum datminmxid: 1

One could even rearrange the layout a little bit like this:

Control data as of latest checkpoint:
next free transaction ID: 0/7575
next free OID: 49152

I have to admit I don't see the point. None of those values is particularly
interesting to anybody without implementation level knowledge and those
will likely deal with them just fine. And I find the version with the
shorter names far quicker to read.
The clarity win here doesn't seem to be worth the price of potentially
breaking some tools.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#10 Bernd Helmle
mailings@oopsware.de
In reply to: Tom Lane (#2)
Re: pg_controldata gobbledygook

--On 25. April 2013 23:19:14 -0400 Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think I've heard of scripts grepping the output of pg_controldata for
this that or the other. Any rewording of the labels would break that.
While I'm not opposed to improving the labels, I would vote against your
second, abbreviated scheme because it would make things ambiguous for
simple grep-based scripts.

I had exactly this kind of discussion just a few days ago with a customer,
who wants to use the output in their scripts and was a little worried about
the compatibility between major versions.

I don't think we explicitly guarantee any output format compatibility for
corresponding symbols between major versions, but given that
pg_controldata seems to have a broad use case here, maybe we should
document somewhere whether we discourage or encourage people to rely on
it?

--
Thanks

Bernd

#11 Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#9)
Re: pg_controldata gobbledygook

On Fri, Apr 26, 2013 at 5:08 AM, Andres Freund <andres@2ndquadrant.com> wrote:

I have to admit I don't see the point. None of those values is particularly
interesting to anybody without implementation level knowledge and those
will likely deal with them just fine. And I find the version with the
shorter names far quicker to read.
The clarity win here doesn't seem to be worth the price of potentially
breaking some tools.

+1.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#12 Jeff Janes
jeff.janes@gmail.com
In reply to: Andres Freund (#9)
Re: pg_controldata gobbledygook

On Fri, Apr 26, 2013 at 2:08 AM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2013-04-25 23:07:02 -0400, Peter Eisentraut wrote:

I'm not sure who is supposed to be able to read this sort of stuff:

Latest checkpoint's NextXID: 0/7575
Latest checkpoint's NextOID: 49152
Latest checkpoint's NextMultiXactId: 7
Latest checkpoint's NextMultiOffset: 13
Latest checkpoint's oldestXID: 1265
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 0
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1

Note that these symbols don't even correspond to the actual symbols used
in the source code in some cases.

The comments in the pg_control.h header file use much more pleasant
terms, which when put to use would lead to output similar to this:

Latest checkpoint's next free transaction ID: 0/7575
Latest checkpoint's next free OID: 49152
Latest checkpoint's next free MultiXactId: 7
Latest checkpoint's next free MultiXact offset: 13
Latest checkpoint's cluster-wide minimum datfrozenxid: 1265
Latest checkpoint's database with cluster-wide minimum datfrozenxid: 1
Latest checkpoint's oldest transaction ID still running: 0
Latest checkpoint's cluster-wide minimum datminmxid: 1
Latest checkpoint's database with cluster-wide minimum datminmxid: 1

One could even rearrange the layout a little bit like this:

Control data as of latest checkpoint:
next free transaction ID: 0/7575
next free OID: 49152

I have to admit I don't see the point. None of those values is particularly
interesting to anybody without implementation level knowledge and those
will likely deal with them just fine. And I find the version with the
shorter names far quicker to read.

I agree. For the ones I didn't know the meaning of, I still don't know the
meaning of them based on the long form, either. While a tutorial on what
these things mean might be useful, embedding the tutorial into the output
of pg_controldata probably isn't the right place.

Cheers,

Jeff

#13 Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#11)
Re: pg_controldata gobbledygook

On Fri, Apr 26, 2013 at 08:51:23AM -0400, Robert Haas wrote:

On Fri, Apr 26, 2013 at 5:08 AM, Andres Freund <andres@2ndquadrant.com> wrote:

I have to admit I don't see the point. None of those values is particularly
interesting to anybody without implementation level knowledge and those
will likely deal with them just fine. And I find the version with the
shorter names far quicker to read.
The clarity win here doesn't seem to be worth the price of potentially
breaking some tools.

+1.

FYI, pg_upgrade would certainly have to be updated to handle this
change.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
