Wanted: jsonb on-disk representation documentation

Started by Heikki Linnakangasabout 12 years ago36 messageshackers

heikki.linnakangas@enterprisedb.com

about 12 years ago

I'm reading the new jsonb code, trying to understand the on-disk
representation. And I cannot make heads or tails of it.

My first entry point was jsonb.h. Jsonb struct is the on-disk
representation, so I looked at the comments above that. No help, the
comments are useless for getting an overview picture.

Help, anyone?

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Heikki Linnakangas (#1)

Re: Wanted: jsonb on-disk representation documentation

On May 6, 2014 9:30:15 PM CEST, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

I'm reading the new jsonb code, trying to understand the on-disk
representation. And I cannot make heads or tails of it.

My first entry point was jsonb.h. Jsonb struct is the on-disk
representation, so I looked at the comments above that. No help, the
comments are useless for getting an overview picture.

Help, anyone?

Enthusiatically seconded. I've asked for that about three times without much success. If it had been my decision the patch wouldn't have been merged without that and other adjustments.

Andres

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Bruce Momjian

bruce@momjian.us

about 12 years ago

In reply to: Andres Freund (#2)

Re: Wanted: jsonb on-disk representation documentation

On Tue, May 6, 2014 at 09:48:04PM +0200, Andres Freund wrote:

On May 6, 2014 9:30:15 PM CEST, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

I'm reading the new jsonb code, trying to understand the on-disk
representation. And I cannot make heads or tails of it.

My first entry point was jsonb.h. Jsonb struct is the on-disk
representation, so I looked at the comments above that. No help, the
comments are useless for getting an overview picture.

Help, anyone?

Enthusiatically seconded. I've asked for that about three times without much success. If it had been my decision the patch wouldn't have been merged without that and other adjustments.

I also would like to know what the index-everything hash ops does? Does
it index the keys, values, or both?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Peter Geoghegan

pg@bowt.ie

about 12 years ago

In reply to: Bruce Momjian (#3)

Re: Wanted: jsonb on-disk representation documentation

On Tue, May 6, 2014 at 1:06 PM, Bruce Momjian <bruce@momjian.us> wrote:

I also would like to know what the index-everything hash ops does? Does
it index the keys, values, or both?

It indexes both, but it isn't possible to test existence (of a key)
with the hash GIN opclass.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Peter Geoghegan

pg@bowt.ie

about 12 years ago

In reply to: Andres Freund (#2)

Re: Wanted: jsonb on-disk representation documentation

On Tue, May 6, 2014 at 12:48 PM, Andres Freund <andres@anarazel.de> wrote:

Enthusiatically seconded. I've asked for that about three times without much success. If it had been my decision the patch wouldn't have been merged without that and other adjustments.

I'm almost certain that the only feedback of yours that I didn't
incorporate was that I didn't change the name of JsonbValue, a
decision I stand by, and also that I didn't add ascii art to
illustrate the on-disk format. I can write a patch that adds the
latter soon.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Oleg Bartunov

oleg@sai.msu.su

about 12 years ago

In reply to: Peter Geoghegan (#4)

Re: Wanted: jsonb on-disk representation documentation

FYI,
http://obartunov.livejournal.com/178495.html

This is hash based gin opclass for hstore with all operators support.
It's pity we had no time to do the same for jsonb, but we may include
it and couple of other opclasses to contrib/jsonx.

Oleg

On Wed, May 7, 2014 at 12:18 AM, Peter Geoghegan <pg@heroku.com> wrote:

On Tue, May 6, 2014 at 1:06 PM, Bruce Momjian <bruce@momjian.us> wrote:

I also would like to know what the index-everything hash ops does? Does
it index the keys, values, or both?

It indexes both, but it isn't possible to test existence (of a key)
with the hash GIN opclass.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Peter Geoghegan (#5)

Re: Wanted: jsonb on-disk representation documentation

On 2014-05-06 13:30:26 -0700, Peter Geoghegan wrote:

On Tue, May 6, 2014 at 12:48 PM, Andres Freund <andres@anarazel.de> wrote:

Enthusiatically seconded. I've asked for that about three times without much success. If it had been my decision the patch wouldn't have been merged without that and other adjustments.

I'm almost certain that the only feedback of yours that I didn't
incorporate was that I didn't change the name of JsonbValue, a
decision I stand by, and also that I didn't add ascii art to
illustrate the on-disk format. I can write a patch that adds the
latter soon.

That might or might not be true. I don't really remember. Documentation
about the on-disk format is the one thing I am sure about that's not
done.

The reviews I did were really cursory reviews, nothing thorough. There's
large parts of the code (e.g. jsonb_gin.c) I didn't even look at. And
others I don't really understand. I also didn't have time to look at the
later versions. The code did improve, don't get me wrong. Otherwise I'd
have been very vocal about this when committed.
But it's still pretty hard to read/understand code. Which imo is
problematic for a feature touted being absolutely critical for postgres'
success. If other's want a taste, take a peek at
findJsonbValueFromSuperHeader()'s code.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Peter Geoghegan

pg@bowt.ie

about 12 years ago

In reply to: Andres Freund (#7)

Re: Wanted: jsonb on-disk representation documentation

On Tue, May 6, 2014 at 3:37 PM, Andres Freund <andres@anarazel.de> wrote:

That might or might not be true. I don't really remember. Documentation
about the on-disk format is the one thing I am sure about that's not
done.

I think it would be best to do that with reference to a concrete
example. As I said, I'll work on a patch.

The reviews I did were really cursory reviews, nothing thorough. There's
large parts of the code (e.g. jsonb_gin.c) I didn't even look at. And
others I don't really understand. I also didn't have time to look at the
later versions. The code did improve, don't get me wrong. Otherwise I'd
have been very vocal about this when committed.
But it's still pretty hard to read/understand code. Which imo is
problematic for a feature touted being absolutely critical for postgres'
success. If other's want a taste, take a peek at
findJsonbValueFromSuperHeader()'s code.

I don't really know what to say to that. Lots of code in Postgres is
complicated, especially if you look at one particular function without
some wider context. Is your objection that the complexity is
incidental rather than essential? If so, how?

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Peter Geoghegan (#8)

Re: Wanted: jsonb on-disk representation documentation

On 2014-05-06 15:45:39 -0700, Peter Geoghegan wrote:

I don't really know what to say to that. Lots of code in Postgres is
complicated, especially if you look at one particular function without
some wider context.
Is your objection that the complexity is incidental rather than
essential?

Yes.

If so, how?

If you think the following is a solution of essential complexity in
*new* code for navigating one level down a relatively simple *new*
datastructure - then we have a disconnect that's larger than I am
willing to argue about.
I can live with the argument that this code is what we have; but calling
this only having the "essential complexity" is absurd.

JsonbValue *
findJsonbValueFromSuperHeader(JsonbSuperHeader sheader, uint32 flags,
uint32 *lowbound, JsonbValue *key)
{
uint32 superheader = *(uint32 *) sheader;
JEntry *array = (JEntry *) (sheader + sizeof(uint32));
int count = (superheader & JB_CMASK);
JsonbValue *result = palloc(sizeof(JsonbValue));

Assert((flags & ~(JB_FARRAY | JB_FOBJECT)) == 0);

if (flags & JB_FARRAY & superheader)
{
char *data = (char *) (array + (superheader & JB_CMASK));
int i;

for (i = 0; i < count; i++)
{
JEntry *e = array + i;

if (JBE_ISNULL(*e) && key->type == jbvNull)
{
result->type = jbvNull;
result->estSize = sizeof(JEntry);
}
else if (JBE_ISSTRING(*e) && key->type == jbvString)
{
result->type = jbvString;
result->val.string.val = data + JBE_OFF(*e);
result->val.string.len = JBE_LEN(*e);
result->estSize = sizeof(JEntry) + result->val.string.len;
}
else if (JBE_ISNUMERIC(*e) && key->type == jbvNumeric)
{
result->type = jbvNumeric;
result->val.numeric = (Numeric) (data + INTALIGN(JBE_OFF(*e)));

result->estSize = 2 * sizeof(JEntry) +
VARSIZE_ANY(result->val.numeric);
}
else if (JBE_ISBOOL(*e) && key->type == jbvBool)
{
result->type = jbvBool;
result->val.boolean = JBE_ISBOOL_TRUE(*e) != 0;
result->estSize = sizeof(JEntry);
}
else
continue;

if (compareJsonbScalarValue(key, result) == 0)
return result;
}
}
else if (flags & JB_FOBJECT & superheader)
{
/* Since this is an object, account for *Pairs* of Jentrys */
char *data = (char *) (array + (superheader & JB_CMASK) * 2);
uint32 stopLow = lowbound ? *lowbound : 0,
stopMiddle;

/* Object key past by caller must be a string */
Assert(key->type == jbvString);
...

I am not calling for a revert. I am just saying that it's imo below
project standards.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

Peter Geoghegan

pg@bowt.ie

about 12 years ago

In reply to: Andres Freund (#9)

Re: Wanted: jsonb on-disk representation documentation

On Tue, May 6, 2014 at 5:13 PM, Andres Freund <andres@2ndquadrant.com> wrote:

If you think the following is a solution of essential complexity in
*new* code for navigating one level down a relatively simple *new*
datastructure - then we have a disconnect that's larger than I am
willing to argue about

You omitted the 40 lines of comments above the function.

I can live with the argument that this code is what we have; but calling
this only having the "essential complexity" is absurd.

I did not say that it only had essential complexity; just that your
criticism was vague. What's wrong with this particular code,
precisely? What complexity is incidental? Why?

I think what you're missing here is that
findJsonbValueFromSuperHeader() is useful for testing "existence" -
that's mostly what it does (serve as a worker function for the 3
existence-type operators). It's also used once to do a binary search
for a key when testing containment, ahead of testing a corresponding
value in a pair (a pair within a rhs that we're testing for
containment within an lhs "this" value). Finally, it's also used once
with arrays when testing containment.

Why do I just match the key within findJsonbValueFromSuperHeader() for
objects, and not the key/value pair you ask? Because that's not what
existence is, and it's easier and clearer to have containment of
nested objects and arrays handling by the higher level containment
function (once we find the value from the pair, to pass back to it).
There is a number of things in tension here. A further factor is the
desire to avoid redundant code. Now, I guess you could make the case
that the handling of the JB_ARRAY and JB_OBJECT cases could be broken
out, but that isn't obviously true, since that creates redundancy for
the majority of callers that only care about existence.

If you're suggesting that the JB_ARRAY and JB_OBJECT cases within that
function are redundant, well, they're not; I'm iterating element-wise
for the former and pairwise for the latter. I'm also returning the
value for the former, and the element (which in a certain sense is
equivalent -- the equivalent of an object/pair "value") for the
latter. Note the user-visible definition of existence if you don't
know what I mean.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

about 12 years ago

In reply to: Peter Geoghegan (#5)

Re: Wanted: jsonb on-disk representation documentation

On 05/06/2014 11:30 PM, Peter Geoghegan wrote:

On Tue, May 6, 2014 at 12:48 PM, Andres Freund <andres@anarazel.de> wrote:

Enthusiatically seconded. I've asked for that about three times without much success. If it had been my decision the patch wouldn't have been merged without that and other adjustments.

I'm almost certain that the only feedback of yours that I didn't
incorporate was that I didn't change the name of JsonbValue, a
decision I stand by, and also that I didn't add ascii art to
illustrate the on-disk format. I can write a patch that adds the
latter soon.

That would be great.

I found the serialization routine, convertJsonb() to be a bit of a mess.
It's maintaining a custom stack of levels, which can be handy if you
need to avoid recursion, but it's also relying on the native stack. And
I didn't understand the point of splitting it into the "walk" and "put"
functions; the division of labor between the two was far from clear
IMHO. I started refactoring that, and ended up with the attached.

One detail that I found scary is that the estSize field in JsonbValue is
not just any rough estimate. It's used ín the allocation of the output
buffer for convertJsonb(), so it has to be large enough or you hit an
assertion or buffer overflow. I believe it was correct as it was, but
that kind of programming is always scary. I refactored the
convertJsonb() function to use a StringInfo buffer instead, and removed
estSize altogether.

This is still work-in-progress, but I thought I'd post this now to let
people know I'm working on it. For example, the StringInfo isn't
actually very well suited for this purpose, it might be better to have a
custom buffer that's enlarged when needed.

For my own sanity, I started writing some docs on the on-disk format.
See the comments in jsonb.h for my understanding of it. I moved around
the structs a bit in jsonb.h, to make the format clearer, but the actual
on-disk format is unchanged.

- Heikki

#12

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

about 12 years ago

In reply to: Heikki Linnakangas (#11)

Re: Wanted: jsonb on-disk representation documentation

Continuing the review, I don't like the "superheader" terminology. To
me, "super" implies that there's some other kind of header involved, and
the superheader somehow includes or the parent of that. But it actually
seems to refer to the header field in the beginning of an array or
object value. In essence, "superheader" is used as the common term to
refer to an object that can be an array or an object. I propose that we
change that to "container".

Noticed something funny while looking at the convertJsonb function:

postgres=# select substr(((('["' ||repeat('x', 268435455) || '", "' ||
repeat('y', 2) || '"]')::jsonb)::text), 268435455);
substr

------------------------------------------------------------------------------
-----------------------------------------------------------
xxx",
0.000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000]
(1 row)

Somehow the second string element in the array, "yy", gets turned into a
numeric. The reason is that although we check that the length of a
single string doesn't exceed the maximum of 2^28 that can be stored in
the space reserved for the length in a Jentry, there are no length
checks for the end offset stored there in an array. So if the total size
of elements in an array exceed 2^28, funny things like above happen.

Attached is a WIP patch that fixes the above, renames "superheader" to
"container", and includes the refactorings and cleanup that I posted
earlier today.

- Heikki

#13

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

about 12 years ago

In reply to: Heikki Linnakangas (#12)

Re: Wanted: jsonb on-disk representation documentation

So, apart from cleaning up the code, we really need to take a close look
at the on-disk format now. The code can be cleaned up later, too, but
we're going to be stuck with the on-disk format forever, so it's
critical to get that right.

First, a few observations:

* JENTRY_ISFIRST is redundant. Whenever you deal with the Jentry struct,
you know from the context which element in the array it is.

* JENTRY_ISNEST is set but never used.

* JENTRY_ISBOOL is defined as (JENTRY_ISNUMERIC | JENTRY_ISNEST), which
seems confusing.

I'm going to proceed refactoring those things, which will change the
on-disk format. It's late in the release cycle - these things really
should've been cleaned up earlier - but it's important to get the
on-disk format right. Shout if you have any objections.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14

Michael Paquier

michael@paquier.xyz

about 12 years ago

In reply to: Heikki Linnakangas (#13)

Re: Wanted: jsonb on-disk representation documentation

On Wed, May 7, 2014 at 8:20 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:

So, apart from cleaning up the code, we really need to take a close look at
the on-disk format now. The code can be cleaned up later, too, but we're
going to be stuck with the on-disk format forever, so it's critical to get
that right.

First, a few observations:

* JENTRY_ISFIRST is redundant. Whenever you deal with the Jentry struct, you
know from the context which element in the array it is.

* JENTRY_ISNEST is set but never used.

* JENTRY_ISBOOL is defined as (JENTRY_ISNUMERIC | JENTRY_ISNEST), which
seems confusing.

I'm going to proceed refactoring those things, which will change the on-disk
format. It's late in the release cycle - these things really should've been
cleaned up earlier - but it's important to get the on-disk format right.
Shout if you have any objections.

+1. It is saner to do that now than never.
-- 
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Heikki Linnakangas (#13)

Re: Wanted: jsonb on-disk representation documentation

Hi,

On 2014-05-07 14:20:19 +0300, Heikki Linnakangas wrote:

So, apart from cleaning up the code, we really need to take a close look at
the on-disk format now. The code can be cleaned up later, too, but we're
going to be stuck with the on-disk format forever, so it's critical to get
that right.

First, a few observations:

Agreed.

I'd like to add that:
* Imo we need space in jsonb ondisk values to indicate a format
version. We won't fully get this right.
* The jentry representation should be changed so it's possible to get the type
of a entry without checking individual types. That'll make code like
findJsonbValueFromSuperHeader() (well, whatever you've renamed it to)
much easier to read. Preferrably so it an have the same values (after
shifting/masking) ask the JBE variants. And it needs space for futher
types (or representations thereof).
* I wonder if the hash/object pair representation is wise and if it
shouldn't be keys combined with offsets first, and then the
values. That will make access much faster. And that's what jsonb
essentially is about.
* I think both arrays and hashes should grow individual structs. With
space for additional flags.

* I have doubts of the wisdom of allowing to embed jbvBinary values in
JsonbValues. Although that can be changed later since it's not on disk.

I'm going to proceed refactoring those things, which will change the on-disk
format. It's late in the release cycle - these things really should've been
cleaned up earlier - but it's important to get the on-disk format right.
Shout if you have any objections.

I don't think it's likely that beta1 will be binary compatible with the
final version at this point.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16

Robert Haas

robertmhaas@gmail.com

about 12 years ago

In reply to: Andres Freund (#15)

Re: Wanted: jsonb on-disk representation documentation

On Wed, May 7, 2014 at 7:40 AM, Andres Freund <andres@anarazel.de> wrote:

I'm going to proceed refactoring those things, which will change the on-disk
format. It's late in the release cycle - these things really should've been
cleaned up earlier - but it's important to get the on-disk format right.
Shout if you have any objections.

+1.

I don't think it's likely that beta1 will be binary compatible with the
final version at this point.

I rather think we're not ready for beta1 at this point (but I expect
to lose that argument).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17

Magnus Hagander

magnus@hagander.net

about 12 years ago

In reply to: Heikki Linnakangas (#13)

Re: Wanted: jsonb on-disk representation documentation

On Wed, May 7, 2014 at 1:20 PM, Heikki Linnakangas
<hlinnakangas@vmware.com>wrote:

So, apart from cleaning up the code, we really need to take a close look
at the on-disk format now. The code can be cleaned up later, too, but we're
going to be stuck with the on-disk format forever, so it's critical to get
that right.

First, a few observations:

* JENTRY_ISFIRST is redundant. Whenever you deal with the Jentry struct,
you know from the context which element in the array it is.

* JENTRY_ISNEST is set but never used.

* JENTRY_ISBOOL is defined as (JENTRY_ISNUMERIC | JENTRY_ISNEST), which
seems confusing.

I'm going to proceed refactoring those things, which will change the
on-disk format. It's late in the release cycle - these things really
should've been cleaned up earlier - but it's important to get the on-disk
format right. Shout if you have any objections.

+1. It's now or never. If the on-disk format needs changing, not doing it
now is going to leave us with a "jsonc" in the future... Better bite the
bullet now.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#18

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Robert Haas (#16)

Re: Wanted: jsonb on-disk representation documentation

On 2014-05-07 08:50:33 -0400, Robert Haas wrote:

On Wed, May 7, 2014 at 7:40 AM, Andres Freund <andres@anarazel.de> wrote:

I don't think it's likely that beta1 will be binary compatible with the
final version at this point.

I rather think we're not ready for beta1 at this point (but I expect
to lose that argument).

Well, I guess it depends on what we define 'beta1' to be. Imo evaluating
problematic pieces of new code, locating unfinished pieces is part of
that. I don't see much point in forbidding incompatible changes in beta1
personally. That robs th the development cycle of the only period where
users can actually test the new version in a halfway sane manner and
report back with things that apparently broken.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19

Magnus Hagander

magnus@hagander.net

about 12 years ago

In reply to: Andres Freund (#18)

Re: Wanted: jsonb on-disk representation documentation

On Wed, May 7, 2014 at 2:56 PM, Andres Freund <andres@2ndquadrant.com>wrote:

On 2014-05-07 08:50:33 -0400, Robert Haas wrote:

On Wed, May 7, 2014 at 7:40 AM, Andres Freund <andres@anarazel.de>

wrote:

I don't think it's likely that beta1 will be binary compatible with the
final version at this point.

I rather think we're not ready for beta1 at this point (but I expect
to lose that argument).

Well, I guess it depends on what we define 'beta1' to be. Imo evaluating
problematic pieces of new code, locating unfinished pieces is part of
that. I don't see much point in forbidding incompatible changes in beta1
personally. That robs th the development cycle of the only period where
users can actually test the new version in a halfway sane manner and
report back with things that apparently broken.

We need to be very careful to tell people about it though. Preferrably if
we *know* a dump/reload will be needed to go beta1->beta2, we should
actually document that in the releasenotes of beta1 already. So people can
make proper plans..

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#20

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Magnus Hagander (#19)

Re: Wanted: jsonb on-disk representation documentation

On 2014-05-07 15:00:01 +0200, Magnus Hagander wrote:

On Wed, May 7, 2014 at 2:56 PM, Andres Freund <andres@2ndquadrant.com>wrote:

On 2014-05-07 08:50:33 -0400, Robert Haas wrote:

On Wed, May 7, 2014 at 7:40 AM, Andres Freund <andres@anarazel.de>

wrote:

I don't think it's likely that beta1 will be binary compatible with the
final version at this point.

I rather think we're not ready for beta1 at this point (but I expect
to lose that argument).

Well, I guess it depends on what we define 'beta1' to be. Imo evaluating
problematic pieces of new code, locating unfinished pieces is part of
that. I don't see much point in forbidding incompatible changes in beta1
personally. That robs th the development cycle of the only period where
users can actually test the new version in a halfway sane manner and
report back with things that apparently broken.

We need to be very careful to tell people about it though. Preferrably if
we *know* a dump/reload will be needed to go beta1->beta2, we should
actually document that in the releasenotes of beta1 already. So people can
make proper plans..

Yes, I think it actually makes sense to add that to *all* beta release
notes. Even in beta2, although slightly weakened.
That's not a new thing btw. E.g. 9.3 has had a catversion bump between
beta1/2:
git diff 09bd2acbe5ac866ce9..817a89423f429a6a8b -- src/include/catalog/catversion.h

The more interesting note probably is that there quite possibly won't be
pg_upgrade'ability...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21

Magnus Hagander

magnus@hagander.net

about 12 years ago

In reply to: Andres Freund (#20)

#22