JSON Function Bike Shedding

Started by David E. Wheeler almost 13 years ago (47 messages)
#1 David E. Wheeler
david@justatheory.com

Hello Hackers,

If you dislike bike-shedding (and who does?), delete this email and the ensuing thread right now. You have been warned!

I have been playing with Andrew’s JSON enhancements and really enjoying them. I am already using them in code I’m developing for production deployment in a month or two. Kudos!

However, I am not so keen on the function names. They all start with json_! This mostly feels redundant to me, since the types of the parameters are part of the function signature.

Therefore, I would like to propose different names:

Existing Name               Proposed Name
--------------------------  ----------------------------------------
json_array_length()         array_length() or length() or size()
json_each()                 each_json()
json_each_as_text()         each_text()
json_get()                  get_json()
json_get_as_text()          get_text()
json_get_path()             get_json()
json_get_path_as_text()     get_text()
json_object_keys()          get_keys()
json_populate_record()      record() or row()
json_populate_recordset()   records() or rows()
json_unnest()               get_values()
json_agg()                  collect_json()

Note that I have given json_get() and json_get_path() the same names, as it seems to me that the former is the same as the latter, with only one parameter. Same for json_get_as_text() and json_get_path_as_text().

One nice thing about get_values() as opposed to json_unnest(), is that it could be used to fetch the values from a JSON object as well as an array. (BTW, I think unnest is not a good name at all, since unlike the SQL unnest() function, it doesn't actually unnest (flatten) the entire array).
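
To illustrate the difference (json_unnest()'s behavior here is as described in this thread, so treat it as a sketch):

select unnest(array[array[1,2],array[3,4]]);  -- SQL unnest flattens fully: 1, 2, 3, 4
-- json_unnest('[[1,2],[3,4]]') would instead return the two inner
-- arrays as json values, unwrapping one level only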

As for the operators, as previously discussed, I'm happy with either -> or ~> (and ->> or ~>>, of course). But I'm wondering if the same operator couldn't be used when an array is on the RHS. I mean, having #> so that the array doesn't have to be cast is nice, too, but I think it'd be nice if an array would work with -> and ->>, as well.
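
For concreteness, a sketch of the two operator families (spellings per this thread; exact behavior assumed from the patch):

select '{"a": [1,2,3]}'::json -> 'a';        -- text key on the RHS
select '[1,2,3]'::json -> 0;                 -- integer index on the RHS
select '{"a": {"b": 1}}'::json #> '{a,b}';   -- path array on the RHS via #>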

As for #> and #>>, what about @> and @>> instead? Or am I just too much the Perl hacker for thinking that @ is a nice mnemonic for "array"?

And finally, a couple of feature requests, which can be taken with a shaker of salt -- or as ideas for 9.4 -- and are mostly stolen from hstore:

* An exists() function (and ? operator) similar to hstore
* A defined() function
* A delete() function
* A slice() function
* A concatenation function and operator
* union, intersect, and except operators and/or functions
* Perhaps some set-returning functions (select_keys(), select_values())

Even if nothing changes before release, I'm happy with the functionality Andrew has added. As I said, this is pure bike shedding, but I believe naming things is important, so it's a discussion worth having.

Best,

David


#2 Josh Berkus
josh@agliodbs.com
In reply to: David E. Wheeler (#1)
Re: JSON Function Bike Shedding

David,

However, I am not so keen on the function names. They all start with
json_! This mostly feels redundant to me, since the types of the
parameters are part of the function signature.

I have no opinion about starting the function names with json_ or not.
If we decide not, I agree that in general your proposed names are
reasonable. Except:

json_populate_record()      record() or row()
json_populate_recordset()   records() or rows()

Given that row() is already a type-agnostic function, and RECORD is a
stored procedure return meta-type, I think the above names would be a
mistake. I'd suggest instead:

json_to_record() and json_to_recordset()
or:

to_record(json) and to_recordset(json)

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


#3 David E. Wheeler
david@justatheory.com
In reply to: Josh Berkus (#2)
Re: JSON Function Bike Shedding

On Feb 12, 2013, at 2:01 PM, Josh Berkus <josh@agliodbs.com> wrote:

Given that row() is already a type-agnostic function, and RECORD is a
stored procedure return meta-type, I think the above names would be a
mistake. I'd suggest instead:

json_to_record() and json_to_recordset()
or:

to_record(json) and to_recordset(json)

I like these last two a lot, actually. Thanks!

David


#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#2)
Re: JSON Function Bike Shedding

Josh Berkus <josh@agliodbs.com> writes:

David,

However, I am not so keen on the function names. They all start with
json_! This mostly feels redundant to me, since the types of the
parameters are part of the function signature.

I have no opinion about starting the function names with json_ or not.

+1 for removing that where possible. We generally have avoided such
names at SQL level. (The C-level function names need such prefixes to
be unique, but the SQL names don't.)

In the cases where one or more arguments are anyelement, however, we may
need to be more specific to avoid ambiguity problems in future. I agree
with Josh's objections to record(), row() etc. to_record() and
to_recordset() might be OK.
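
For illustration, a sketch of the template-argument pattern under discussion (my_type is hypothetical; the signature is as described in this thread):

select * from json_populate_record(null::my_type, '{"a": 1, "b": "two"}');
-- the null::my_type argument fixes the result type; a generic record()
-- or row() would claim that name for every type wanting such a conversion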

regards, tom lane


#5 Merlin Moncure
mmoncure@gmail.com
In reply to: Tom Lane (#4)
Re: JSON Function Bike Shedding

On Tue, Feb 12, 2013 at 6:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Josh Berkus <josh@agliodbs.com> writes:

David,

However, I am not so keen on the function names. They all start with
json_! This mostly feels redundant to me, since the types of the
parameters are part of the function signature.

I have no opinion about starting the function names with json_ or not.

+1 for removing that where possible. We generally have avoided such
names at SQL level. (The C-level function names need such prefixes to
be unique, but the SQL names don't.)

In the cases where one or more arguments are anyelement, however, we may
need to be more specific to avoid ambiguity problems in future. I agree
with Josh's objections to record(), row() etc. to_record() and
to_recordset() might be OK.

!

merlin


#6 David E. Wheeler
david@kineticode.com
In reply to: Merlin Moncure (#5)
Re: JSON Function Bike Shedding

On Feb 12, 2013, at 8:00 PM, Merlin Moncure <mmoncure@gmail.com> wrote:

+1 for removing that where possible. We generally have avoided such
names at SQL level. (The C-level function names need such prefixes to
be unique, but the SQL names don't.)

In the cases where one or more arguments are anyelement, however, we may
need to be more specific to avoid ambiguity problems in future. I agree
with Josh's objections to record(), row() etc. to_record() and
to_recordset() might be OK.

Agreed on all counts. (Wow!)

!

Not sure this would make a useful operator. Maybe for exists()? :-O

David


#7 Merlin Moncure
mmoncure@gmail.com
In reply to: David E. Wheeler (#1)
Re: JSON Function Bike Shedding

On Tue, Feb 12, 2013 at 1:18 PM, David E. Wheeler <david@justatheory.com> wrote:

A couple of other suggestions:

Existing Name               Proposed Name
--------------------------  ----------------------------------------
json_array_length()         array_length() or length() or size()

very much prefer without 'array_' prefix as this leads to semantic
confusion with our (generally badly named) array manipulation API.
So, length() -- also I see no reason why this can't be made to run if
the outermost container is an object but that's an acceptable
implementation detail.
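
A sketch of what that would mean (array behavior per the patch; the object case is a suggestion, not implemented):

select json_array_length('[1,2,3]');  -- 3
-- suggestion: length('{"a": 1, "b": 2}') could return 2, the key count,
-- when the outermost container is an object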

json_each() each_json()

why not each()? The assumption that it returns json is reasonable and
doesn't need decoration, IMO.

json_each_as_text() each_text()

like this

json_get() get_json()

prefer get()

json_get_as_text() get_text()

like this

json_get_path() get_json()

get() please

json_get_path_as_text() get_text()

like this

json_object_keys() get_keys()

like this

json_populate_record() record() or row()

to_record()

json_populate_recordset() records() or rows()

to_recordset()

json_unnest() get_values()

greatly prefer unwrap()

json_agg() collect_json()

prefer to leave as json_agg() -- we have string_agg, array_agg, etc.

merlin


#8 Andrew Dunstan
andrew@dunslane.net
In reply to: David E. Wheeler (#1)
Re: JSON Function Bike Shedding

On 02/12/2013 02:18 PM, David E. Wheeler wrote:

Hello Hackers,

If you dislike bike-shedding (and who does?), delete this email and the ensuing thread right now. You have been warned!

I have been playing with Andrew’s JSON enhancements and really enjoying them. I am already using them in code I’m developing for production deployment in a month or two. Kudos!

However, I am not so keen on the function names. They all start with json_! This mostly feels redundant to me, since the types of the parameters are part of the function signature.

I don't have any problem getting rid of the json_ prefixes, except for
json_agg which I think should keep it (c.f. string_agg, array_agg).

Therefore, I would like to propose different names:

Existing Name               Proposed Name
--------------------------  ----------------------------------------
json_array_length()         array_length() or length() or size()
json_each()                 each_json()
json_each_as_text()         each_text()
json_get()                  get_json()
json_get_as_text()          get_text()
json_get_path()             get_json()
json_get_path_as_text()     get_text()
json_object_keys()          get_keys()
json_populate_record()      record() or row()
json_populate_recordset()   records() or rows()
json_unnest()               get_values()
json_agg()                  collect_json()

Note that I have given json_get() and json_get_path() the same names, as it seems to me that the former is the same as the latter, with only one parameter. Same for json_get_as_text() and json_get_path_as_text().

I will take some of this under advisement. Note that
json_populate_record's name was taken from hstore's populate_record, so
if we're trying to use similar names then it should possibly be just
populate_record. Or if that's still a bit long I would accept to_record.

One nice thing about get_values() as opposed to json_unnest(), is that it could be used to fetch the values from a JSON object as well as an array. (BTW, I think unnest is not a good name at all, since unlike the SQL unnest() function, it doesn't actually unnest (flatten) the entire array).

I think Merlin's suggestion of unwrap might be good. Or simply
"elements()" might work.

As for the operators, as previously discussed, I'm happy with either -> or ~> (and ->> or ~>>, of course). But I'm wondering if the same operator couldn't be used when an array is on the RHS. I mean, having #> so that the array doesn't have to be cast is nice, too, but I think it'd be nice if an array would work with -> and ->>, as well.

The point of using different operator names is that if there's an array
literal postgres will convert it to an array. If the operator names are
the same it will treat it as a text key instead. Being able to type

my_json #>> '{f1,0,f2,3,f3}'

is nice. Of course, we could duplicate the operators, but I generally
prefer not to do that.
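
A sketch of the distinction (the ->> line shows what would happen if the operators were merged):

select my_json #>> '{f1,0,f2,3,f3}';  -- literal is parsed as a path array
select my_json ->> '{f1,0,f2,3,f3}';  -- same literal is treated as a single
                                      -- text key, not a path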

As for #> and #>>, what about @> and @>> instead? Or am I just too much the Perl hacker for thinking that @ is a nice mnemonic for "array"?

Probably. I deliberately avoided @> because it's used elsewhere to mean
"contains" and using it for something quite different here might be
confusing.

And finally, a couple of feature requests, which can be taken with a shaker of salt -- or as ideas for 9.4 -- and are mostly stolen from hstore:

* An exists() function (and ? operator) similar to hstore
* A defined() function
* A delete() function
* A slice() function
* A concatenation function and operator
* union, intersect, and except operators and/or functions
* Perhaps some set-returning functions (select_keys(), select_values())

I think this is beyond bikeshedding. Apparently you have missed the
existence of json_object_keys().

The new API makes many or all of these things possible to do with
relative ease as extensions (see my possibly upcoming talk on the subject).

cheers

andrew


#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#8)
Re: JSON Function Bike Shedding

Andrew Dunstan <andrew@dunslane.net> writes:

I will take some of this under advisement. Note that
json_populate_record's name was taken from hstore's populate_record, so
if we're trying to use similar names then it should possibly be just
populate_record. Or if that's still a bit long I would accept to_record.

+1 for following precedent whenever there is some --- so let's go with
populate_record.

regards, tom lane


#10 David E. Wheeler
david@justatheory.com
In reply to: Andrew Dunstan (#8)
Re: JSON Function Bike Shedding

On Feb 13, 2013, at 8:36 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

I don't have any problem getting rid of the json_ prefixes, except for json_agg which I think should keep it (c.f. string_agg, array_agg).

I think that's an unfortunate naming convention forced on us by the SQL standard, and it doesn't mean we have to use it anyway.

I will take some of this under advisement. Note that json_populate_record's name was taken from hstore's populate_record, so if we're trying to use similar names then it should possibly be just populate_record. Or if that's still a bit long I would accept to_record.

to_record()++

I think Merlin's suggestion of unwrap might be good. Or simply "elements()" might work.

Perhaps unwrap() returns a set and elements() returns an array?

As for #> and #>>, what about @> and @>> instead? Or am I just too much the Perl hacker for thinking that @ is a nice mnemonic for "array"?

Probably. I deliberately avoided @> because it's used elsewhere to mean "contains" and using it for something quite different here might be confusing.

I can see that, especially if you end up adding exists(): @> could be its operator.

I think this is beyond bikeshedding. Apparently you have missed the existence of json_object_keys().

Oh, I forgot it returned a set rather than an array. So I suggest:

values() - Returns an array
keys() - Returns an array

And:

unwrap() - Returns a set
skeys() - Returns a set

Er, okay, so skeys() sucks alongside the others here. If we were to steal from hstore, these would be:

svals() - Returns a set
skeys() - Returns a set
avals() - Returns an array
akeys() - Returns an array

I don’t love those, but if we want to follow precedent…

The new API makes many or all of these things possible to do with relative ease as extensions (See my possibly upcoming talk on the subject.)

I’ll be there, yo!

David


#11 Pavel Stehule
pavel.stehule@gmail.com
In reply to: David E. Wheeler (#10)
Re: JSON Function Bike Shedding

2013/2/13 David E. Wheeler <david@justatheory.com>:

On Feb 13, 2013, at 8:36 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

I don't have any problem getting rid of the json_ prefixes, except for json_agg which I think should keep it (c.f. string_agg, array_agg).

I think that's an unfortunate naming convention forced on us by the SQL standard, and it doesn't mean we have to use it anyway.

I will take some of this under advisement. Note that json_populate_record's name was taken from hstore's populate_record, so if we're trying to use similar names then it should possibly be just populate_record. Or if that's still a bit long I would accept to_record.

to_record()++

I think Merlin's suggestion of unwrap might be good. Or simply "elements()" might work.

Perhaps unwrap() returns a set and elements() returns an array?

As for #> and #>>, what about @> and @>> instead? Or am I just too much the Perl hacker for thinking that @ is a nice mnemonic for "array"?

Probably. I deliberately avoided @> because it's used elsewhere to mean "contains" and using it for something quite different here might be confusing.

I can see that, especially if you end up adding exists(): @> could be its operator.

I think this is beyond bikeshedding. Apparently you have missed the existence of json_object_keys().

Oh, I forgot it returned a set rather than an array. So I suggest:

values() - Returns an array
keys() - Returns an array

"values" is keyword and "keys" is relative high risk too

Regards

Pavel

And:

unwrap() - Returns a set
skeys() - Returns a set

Er, okay, so skeys() sucks alongside the others here. If we were to steal from hstore, these would be:

svals() - Returns a set
skeys() - Returns a set
avals() - Returns an array
akeys() - Returns an array

I don’t love those, but if we want to follow precedent…

The new API makes many or all of these things possible to do with relative ease as extensions (See my possibly upcoming talk on the subject.)

I’ll be there, yo!

David


#12 Andrew Dunstan
andrew@dunslane.net
In reply to: David E. Wheeler (#10)
Re: JSON Function Bike Shedding

On 02/13/2013 12:07 PM, David E. Wheeler wrote:

On Feb 13, 2013, at 8:36 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

I don't have any problem getting rid of the json_ prefixes, except for json_agg which I think should keep it (c.f. string_agg, array_agg).

I think that's an unfortunate naming convention forced on us by the SQL standard, and it doesn't mean we have to use it anyway.

Regardless of that, I'd prefer to be consistent.

I think Merlin's suggestion of unwrap might be good. Or simply "elements()" might work.

Perhaps unwrap() returns a set and elements() returns an array?

Now you're adding functionality. Let's just keep this to the question of
names.

I think this is beyond bikeshedding. Apparently you have missed the existence of json_object_keys().

Oh, I forgot it returned a set rather than an array. So I suggest:

values() - Returns an array
keys() - Returns an array

And:

unwrap() - Returns a set
skeys() - Returns a set

Er, okay, so skeys() sucks alongside the others here. If we were to steal from hstore, these would be:

svals() - Returns a set
skeys() - Returns a set
avals() - Returns an array
akeys() - Returns an array

I don’t love those, but if we want to follow precedent…

Ditto. I think we're a bit late to be adding functionality.

cheers

andrew


#13 David E. Wheeler
david@justatheory.com
In reply to: Andrew Dunstan (#12)
Re: JSON Function Bike Shedding

On Feb 13, 2013, at 9:31 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

I don’t love those, but if we want to follow precedent…

Ditto. I think we're a bit late to be adding functionality.

Well, how about having just keys() and vals() return arrays? Then one can just wrap them in unnest() to get sets.
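
For example (hypothetical names, assuming keys(json) returns text[]):

select unnest(keys('{"a": 1, "b": 2}'::json));  -- two rows: 'a', 'b'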

Best,

David


#14 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andrew Dunstan (#12)
Re: JSON Function Bike Shedding

Andrew Dunstan wrote:

On 02/13/2013 12:07 PM, David E. Wheeler wrote:

On Feb 13, 2013, at 8:36 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

I think Merlin's suggestion of unwrap might be good. Or simply "elements()" might work.

Perhaps unwrap() returns a set and elements() returns an array?

Now you're adding functionality. Let's just keep this to the
question of names.

I agree with that, but it seems a good idea to leave names available for
future functionality, where reasonable.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#15 Merlin Moncure
mmoncure@gmail.com
In reply to: Alvaro Herrera (#14)
Re: JSON Function Bike Shedding

On Wed, Feb 13, 2013 at 11:40 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Andrew Dunstan wrote:

On 02/13/2013 12:07 PM, David E. Wheeler wrote:

On Feb 13, 2013, at 8:36 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

I think Merlin's suggestion of unwrap might be good. Or simply "elements()" might work.

Perhaps unwrap() returns a set and elements() returns an array?

Now you're adding functionality. Let's just keep this to the
question of names.

I agree with that, but it seems a good idea to leave names available for
future functionality, where reasonable.

Given all that, is there consensus on the names so that Andrew knows
what changes to make?

merlin


#16 Robert Haas
robertmhaas@gmail.com
In reply to: David E. Wheeler (#1)
Re: JSON Function Bike Shedding

On Tue, Feb 12, 2013 at 2:18 PM, David E. Wheeler <david@justatheory.com> wrote:

Hello Hackers,

If you dislike bike-shedding (and who does?), delete this email and the ensuing thread right now. You have been warned!

I have been playing with Andrew’s JSON enhancements and really enjoying them. I am already using them in code I’m developing for production deployment in a month or two. Kudos!

However, I am not so keen on the function names. They all start with json_! This mostly feels redundant to me, since the types of the parameters are part of the function signature.

Therefore, I would like to propose different names:

Existing Name               Proposed Name
--------------------------  ----------------------------------------
json_array_length()         array_length() or length() or size()
json_each()                 each_json()
json_each_as_text()         each_text()
json_get()                  get_json()
json_get_as_text()          get_text()
json_get_path()             get_json()
json_get_path_as_text()     get_text()
json_object_keys()          get_keys()
json_populate_record()      record() or row()
json_populate_recordset()   records() or rows()
json_unnest()               get_values()
json_agg()                  collect_json()

Note that I have given json_get() and json_get_path() the same names, as it seems to me that the former is the same as the latter, with only one parameter. Same for json_get_as_text() and json_get_path_as_text().

I realize I'm in the minority here, but -1 from me on all of this.
Should we also rename xml_is_well_formed() to just is_well_formed()?
string_agg() to agg()? Eventually we will have more data types, and
some of them will have functions that could also be called rows() or
get_values(), but it's unlikely that they'll have exactly the same
behavior, which will start to make things confusing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#17 David E. Wheeler
david@justatheory.com
In reply to: Robert Haas (#16)
Re: JSON Function Bike Shedding

On Feb 15, 2013, at 9:25 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I realize I'm in the minority here, but -1 from me on all of this.
Should we also rename xml_is_well_formed() to just is_well_formed()?

That would be nice, but I think that ship done sunk.

string_agg() to agg()?

Would love a different name, but IIRC that followed array_agg(), which was dictated by the SQL standard, in its infinite wisdom. See also =>.

Eventually we will have more data types, and
some of them will have functions that could also be called rows() or
get_values(), but it's unlikely that they'll have exactly the same
behavior, which will start to make things confusing.

Well, they will have to take account of *this* precedent and act accordingly. Much easier for them to look back at what has been done here than for us to look forward to something that today exists as no more than a twinkle in your eye.

Best,

David


#18 Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#8)
Re: JSON Function Bike Shedding

On 02/13/2013 11:36 AM, Andrew Dunstan wrote:

Therefore, I would like to propose different names:

Existing Name               Proposed Name
--------------------------  ----------------------------------------
json_array_length()         array_length() or length() or size()
json_each()                 each_json()
json_each_as_text()         each_text()
json_get()                  get_json()
json_get_as_text()          get_text()
json_get_path()             get_json()
json_get_path_as_text()     get_text()
json_object_keys()          get_keys()
json_populate_record()      record() or row()
json_populate_recordset()   records() or rows()
json_unnest()               get_values()
json_agg()                  collect_json()

Note that I have given json_get() and json_get_path() the same names,
as it seems to me that the former is the same as the latter, with
only one parameter. Same for json_get_as_text() and
json_get_path_as_text().

I will take some of this under advisement. Note that
json_populate_record's name was taken from hstore's populate_record,
so if we're trying to use similar names then it should possibly be
just populate_record. Or if that's still a bit long I would accept
to_record.

I have had a look at doing something like this with the json_get
functions. The trouble is that the best way to do it is to have json_get
take "variadic any", but then string literals come in as unknown rather
than as text, which makes things fairly ugly. If we force people to cast
path elements to text then I think the cure is worse than the disease. I
think the best we can do here is possibly to provide json_get and
json_get_text taking either a single int or variadic text[], and
json_get_path and json_get_path_text taking non-variadic text[].
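
A sketch of that signature set (names per this paragraph; exact types assumed):

select json_get(myjson, 0);                 -- single int index
select json_get(myjson, 'f1', '2', 'f3');   -- variadic text[] path
select json_get_path(myjson, '{f1,2,f3}');  -- non-variadic text[] path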

cheers

andrew


#19 David E. Wheeler
david@justatheory.com
In reply to: Andrew Dunstan (#18)
Re: JSON Function Bike Shedding

On Feb 16, 2013, at 8:57 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

I have had a look at doing something like this with the json_get functions. The trouble is that the best way to do it is to have json_get take "variadic any", but then string literals come in as unknown rather than as text, which makes things fairly ugly. If we force people to cast path elements to text then I think the cure is worse than the disease. I think the best we can do here is possibly to provide json_get and json_get_text taking either a single int or variadic text[], and json_get_path and json_get_path_text taking non-variadic text[].

Why not also one taking a single text?

get(text)
get(int)
get(variadic text[])

?

David


#20 Andres Freund
andres@2ndquadrant.com
In reply to: David E. Wheeler (#19)
Re: JSON Function Bike Shedding

On 2013-02-16 11:55:26 -0800, David E. Wheeler wrote:

On Feb 16, 2013, at 8:57 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

I have had a look at doing something like this with the json_get functions. The trouble is that the best way to do it is to have json_get take "variadic any", but then string literals come in as unknown rather than as text, which makes things fairly ugly. If we force people to cast path elements to text then I think the cure is worse than the disease. I think the best we can do here is possibly to provide json_get and json_get_text taking either a single int or variadic text[], and json_get_path and json_get_path_text taking non-variadic text[].

Why not also one taking a single text?

get(text)
get(int)
get(variadic text[])

Those aren't differentiable by their argument types. Why should json be
able to claim that namespace and not other datatypes?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#21 Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#20)
Re: JSON Function Bike Shedding

On 02/16/2013 03:05 PM, Andres Freund wrote:

On 2013-02-16 11:55:26 -0800, David E. Wheeler wrote:

On Feb 16, 2013, at 8:57 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

I have had a look at doing something like this with the json_get functions. The trouble is that the best way to do it is to have json_get take "variadic any", but then string literals come in as unknown rather than as text, which makes things fairly ugly. If we force people to cast path elements to text then I think the cure is worse than the disease. I think the best we can do here is possibly to provide json_get and json_get_text taking either a single int or variadic text[], and json_get_path and json_get_path_text taking non-variadic text[].

Why not also one taking a single text?

get(text)
get(int)
get(variadic text[])

Those aren't differentiable by their argument types. Why should json be
able to claim that namespace and not other datatypes?

Well, of course the calls would be

get(json, ...)

although I'm still waiting to see if anyone else agrees with Robert
about the naming of the functions.

To answer David's point, there is no point in having both

get(json,text)
get(json, variadic text[])

since the second can encompass the first, and having both would make
calls ambiguous.
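
That is, with the hypothetical pair both defined:

-- get(j json, key text)
-- get(j json, variadic path text[])
select get(myjson, 'f1');  -- a one-key call could satisfy either signature,
                           -- which is the ambiguity meant here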

cheers

andrew


#22 David E. Wheeler
david@justatheory.com
In reply to: Andrew Dunstan (#21)
Re: JSON Function Bike Shedding

On Feb 16, 2013, at 12:47 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

To answer David's point, there is no point in having both

get(json,text)
get(json, variadic text[])

since the second can encompass the first, and having both would make calls ambiguous.

Oh. Well then how about

get(json, int)
get(json, text)
get(json, text[])

?

David


#23 Andrew Dunstan
andrew@dunslane.net
In reply to: David E. Wheeler (#22)
Re: JSON Function Bike Shedding

On 02/16/2013 07:50 PM, David E. Wheeler wrote:

On Feb 16, 2013, at 12:47 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

To answer David's point, there is no point in having both

get(json,text)
get(json, variadic text[])

since the second can encompass the first, and having both would make calls ambiguous.

Oh. Well then how about

get(json, int)
get(json, text)
get(json, text[])

?

No, then we don't have a variadic version. You are going to have to
accept that we can't make one function name cover all of this.

cheers

andrew


#24 David E. Wheeler
david@justatheory.com
In reply to: Andrew Dunstan (#23)
Re: JSON Function Bike Shedding

On Feb 17, 2013, at 6:33 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

No, then we don't have a variadic version. You are going to have to accept that we can't make one function name cover all of this.

Well, for me, I would rather specify an array than call a function with a different name. But it's six of one, half a dozen of the other, really, as long as it all works.

D


#25 Andrew Dunstan
andrew@dunslane.net
In reply to: David E. Wheeler (#24)
Re: JSON Function Bike Shedding

On 02/17/2013 01:19 PM, David E. Wheeler wrote:

On Feb 17, 2013, at 6:33 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

No, then we don't have a variadic version. You are going to have to accept that we can't make one function name cover all of this.

Well, for me, I would rather specify an array than call a function with a different name. But it’s six of one, half-dozen of another, really, as long as it all works.

I am going to go the way that involves the least amount of explicit
casting or array construction. So get_path() stays, but becomes
non-variadic. get() can take an int or variadic text[], so you can do:

get(myjson,0)
get(myjson,'f1')
get(myjson,'f1','2','f3')
get_path(myjson,'{f1,2,f3}')

cheers

andrew


#26 Merlin Moncure
mmoncure@gmail.com
In reply to: Robert Haas (#16)
Re: JSON Function Bike Shedding

On Fri, Feb 15, 2013 at 11:25 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Note that I have given json_get() and json_get_path() the same names, as it seems to me that the former is the same as the latter, with only one parameter. Same for json_get_as_text() and json_get_path_as_text().

I realize I'm in the minority here, but -1 from me on all of this.
Should we also rename xml_is_well_formed() to just is_well_formed()?
string_agg() to agg()? Eventually we will have more data types, and
some of them will have functions that could also be called rows() or
get_values(), but it's unlikely that they'll have exactly the same
behavior, which will start to make things confusing.

It's a little late, but I'd like to rebut this point:

string_agg() to agg()?

This is not germane to the discussion. string_agg means you are
aggregating *to* a string, not from one, which is a completely
different thing. This also applies to to_char, to_date, etc. If you
wanted to do just 'agg()', you'd have to supply the output type somehow --
the only way to do that in postgres is to use the hstore null::foo trick
(which is not an improvement, obviously).

xml_is_well_formed() to just is_well_formed()?

Again, this is not the same thing. It does not work on the xml type,
but on text, so you'd have to supply a hint to select the specific behavior
if you wanted to abstract the type out of the function. Because the returned
type is unambiguously boolean though, you can get away with:

validate(format text, data text);

select validate('json', <json string>);
select validate('xml', <xml string>);
etc.

if you wanted to. And yes, I absolutely think this is superior to
cluttering the public namespace with xml specific verbiage, and could
be extended to other formats. Look at it the other way: we currently
have encode(stuff bytea, format text). Would we be better off with
hex_encode(bytea), escape_encode(bytea), etc.?
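
For reference, the generic core function in use:

select encode('\x1234'::bytea, 'hex');     -- '1234'
select encode('\x1234'::bytea, 'base64');  -- 'EjQ='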

The argument for removing json_ prefix is that when function behaviors
are unambiguously controlled by the arguments, decorating the function
name to match the input argument is unnecessary verbosity.

merlin


#27 Robert Haas
robertmhaas@gmail.com
In reply to: Merlin Moncure (#26)
Re: JSON Function Bike Shedding

On Mon, Feb 18, 2013 at 10:42 AM, Merlin Moncure <mmoncure@gmail.com> wrote:

if you wanted to. And yes, I absolutely think this is superior to
cluttering the public namespace with xml specific verbiage, and could
be extended to other formats. Look at it the other way: we currently
have encode(stuff bytea, format text). Would we be better off with
hex_encode(bytea), escape_encode(bytea), etc.?

Probably not, but that's not what I proposed either.

The argument for removing json_ prefix is that when function behaviors
are unambiguously controlled by the arguments, decorating the function
name to match the input argument is unnecessary verbosity.

I've come to value greppability of source code pretty highly. I think
that some of the points you raise are valid, but in my (minority)
opinion overloading creates more problems than it solves. You're not
going to convince me that get() is *ever* a good name for a function -
you might as well call it thing() or foo() for all the useful
information that name conveys.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#28 Petr Jelinek
pjmodos@pjmodos.net
In reply to: Robert Haas (#27)
Re: JSON Function Bike Shedding

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Robert Haas
Sent: 19 February 2013 15:05
To: Merlin Moncure
Cc: David E. Wheeler; PostgreSQL-development Hackers

The argument for removing json_ prefix is that when function behaviors
are unambiguously controlled by the arguments, decorating the function
name to match the input argument is unnecessary verbosity.

I've come to value greppability of source code pretty highly. I think
that some of the points you raise are valid, but in my (minority) opinion
overloading creates more problems than it solves. You're not going to
convince me that get() is *ever* a good name for a function - you might as
well call it thing() or foo() for all the useful information that name
conveys.

Let me join the minority here, +1

Regards
Petr Jelinek


#29 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Petr Jelinek (#28)
Re: JSON Function Bike Shedding

2013/2/19 Petr Jelinek <pjmodos@pjmodos.net>:

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Robert Haas
Sent: 19 February 2013 15:05
To: Merlin Moncure
Cc: David E. Wheeler; PostgreSQL-development Hackers

The argument for removing json_ prefix is that when function behaviors
are unambiguously controlled by the arguments, decorating the function
name to match the input argument is unnecessary verbosity.

I've come to value greppability of source code pretty highly. I think
that some of the points you raise are valid, but in my (minority) opinion
overloading creates more problems than it solves. You're not going to
convince me that get() is *ever* a good name for a function - you might as
well call it thing() or foo() for all the useful information that name
conveys.

Let me join the minority here, +1

me too +1

Pavel

Regards
Petr Jelinek


#30 Merlin Moncure
mmoncure@gmail.com
In reply to: Robert Haas (#27)
Re: JSON Function Bike Shedding

On Tue, Feb 19, 2013 at 8:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

The argument for removing json_ prefix is that when function behaviors
are unambiguously controlled by the arguments, decorating the function
name to match the input argument is unnecessary verbosity.

I've come to value greppability of source code pretty highly.

Hm, good point. Counter argument: use better editing tools.
Counter-counter argument: postgresql's fast moving boutique syntax is
not well understood by any editor I'm aware of.

The editor I use is Source Insight, a $$$ Windows tool, and I use it
because it's basically a source code indexer with a full Java and C
parser. It can do SQL also, but it's limited to what you can do with
regex for non-fully-parsed languages, so if I have the code:

select get(j, '...') from foo;

It doesn't know that j is json, and as such I can't look for all
instances of "get() as pertains to json generally or the json field j".

Interesting aside: another language that is essentially immune to good
tooling, JavaScript, is exploding in use -- even on the server side.

Anyways, as to overloading in general, well, SQL is heavily
overloaded. We don't have int_max, float_max, etc., and it would be a
usability reduction if we did. But that's not even the point; the
driving philosophy of SQL is that your data structures (and types) are
to be strongly decoupled from the manipulation you do -- this keeps
the language very general. That philosophy, while not perfect, should
be adhered to when possible.

merlin


#31 David E. Wheeler
david@justatheory.com
In reply to: Petr Jelinek (#28)
Re: JSON Function Bike Shedding

On Feb 19, 2013, at 6:11 AM, Petr Jelinek <pjmodos@pjmodos.net> wrote:

some of the points you raise are valid, but in my (minority) opinion
overloading creates more problems than it solves. You're not going to
convince me that get() is *ever* a good name for a function - you might as
well call it thing() or foo() for all the useful information that name
conveys.

Let me join the minority here, +1

Well, that's why I called them get_json() and get_text(). Basically, I don't mind that the function name says something about the return type.

Best,

David


#32 Josh Berkus
josh@agliodbs.com
In reply to: Petr Jelinek (#28)
Re: JSON Function Bike Shedding

I've come to value greppability of source code pretty highly. I think
that some of the points you raise are valid, but in my (minority) opinion
overloading creates more problems than it solves. You're not going to
convince me that get() is *ever* a good name for a function - you might as
well call it thing() or foo() for all the useful information that name
conveys.

What about extract()? That's consistent with the function we already
use for timestamps and intervals, and is more clear than get().

On the other hand, to_string() seems like a GREAT name for an overloaded
function. You know that it takes some other type as an argument,
possibly several other types, and will always output a string.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


#33 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Josh Berkus (#32)
Re: JSON Function Bike Shedding

2013/2/19 Josh Berkus <josh@agliodbs.com>:

I've come to value greppability of source code pretty highly. I think
that some of the points you raise are valid, but in my (minority) opinion
overloading creates more problems than it solves. You're not going to
convince me that get() is *ever* a good name for a function - you might as
well call it thing() or foo() for all the useful information that name
conveys.

What about extract()? That's consistent with the function we already
use for timestamps and intervals, and is more clear than get().

"extract" is not usual function, it is supported by parser, and in
this time nobody knows datatypes, so result can be some ugly error
messages

Regards

Pavel

On the other hand, to_string() seems like a GREAT name for an overloaded
function. You know that it takes some other type as an argument,
possibly several other types, and will always output a string.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


#34 Robert Haas
robertmhaas@gmail.com
In reply to: Merlin Moncure (#30)
Re: JSON Function Bike Shedding

On Tue, Feb 19, 2013 at 10:00 AM, Merlin Moncure <mmoncure@gmail.com> wrote:

Anyways, as to overloading in general, well, SQL is heavily
overloaded. We don't have int_max, float_max, etc., and it would be a
usability reduction if we did.

That's true, but max(int) and max(float) are doing pretty much the
same logical operation - they are taking the maximum of a group of
numbers. Overloading in cases where the semantics vary - e.g. + for
both integer addition and string concatenation - is something else
altogether, and I have not generally observed it to be a very good
idea. Sometimes it works in cases where it's part of the core
language design, but we don't have the luxury of knowing what other
data types we'll want to add in the future, and I'm very wary of
allowing JSON to engage in uncontrolled namespace pollution.

But that's not even the point; the
driving philosophy of SQL is that your data structures (and types) are
to be strongly decoupled from the manipulation you do -- this keeps
the language very general. That philosophy, while not perfect, should
be adhered to when possible.

Perhaps, but that goal seems unlikely to be met in this case. The
JSON functions and operators are being named by one group of people
with one set of sensibilities, and the hstore functions and operators
were named by a different group of people with a different set of
sensibilities (and therefore don't match), and the next type that
comes along will be named according to yet another group of people
with another set of sensibilities. So we're unlikely to end up with a
coherent set of primitives that operate on underlying data of a
variety of types; we are instead likely to end up with an incoherent
jumble.

Although we now have a JSON type in core, we should not pretend that
it's in the same league as text or int4. If those data types claim
common function names like max and abs and common operator names like
+ and ||, it can be justified on the grounds that the appeal of those
data types is pretty near universal. JSON is a very popular format
right now and I completely support adding more support for it, but I
cheerfully submit that if you think it falls into the same category as
"text" or "int4", you've gotten way too caught up in the hype. It's
completely appropriate to apply stricter criteria for namespace
pollution to JSON than to a basic data type whose semantics are
dictated by the SQL standard, the behavior of other database products,
and fourth-grade math class.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#35 Merlin Moncure
mmoncure@gmail.com
In reply to: Robert Haas (#34)
Re: JSON Function Bike Shedding

On Wed, Feb 20, 2013 at 9:42 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Feb 19, 2013 at 10:00 AM, Merlin Moncure <mmoncure@gmail.com> wrote:

Anyways, as to overloading in general, well, SQL is heavily
overloaded. We don't have int_max, float_max, etc., and it would be a
usability reduction if we did.

That's true, but max(int) and max(float) are doing pretty much the
same logical operation - they are taking the maximum of a group of
numbers. Overloading in cases where the semantics vary - e.g. + for
both integer addition and string concatenation - is something else
altogether, and I have not generally observed it to be a very good
idea. Sometimes it works in cases where it's part of the core
language design, but we don't have the luxury of knowing what other
data types we'll want to add in the future, and I'm very wary of
allowing JSON to engage in uncontrolled namespace pollution.

Sure: but that's another straw man: abuse of the + operator is a case of
combining arbitrarily different behaviors (concatenation and
arithmetic addition) into uniform syntax. This is bad, but a
different thing. The right way to do it is to globally define the
behavior and map it to types if and only if it makes sense. Again,
you want clean separation of 'what you're doing' vs 'what you're doing
it over'.

But that's not even the point; the
driving philosophy of SQL is that your data structures (and types) are
to be strongly decoupled from the manipulation you do -- this keeps
the language very general. That philosophy, while not perfect, should
be adhered to when possible.

Perhaps, but that goal seems unlikely to be met in this case. The
JSON functions and operators are being named by one group of people
with one set of sensibilities, and the hstore functions and operators
were named by a different group of people with a different set of
sensibilities (and therefore don't match), and the next type that
comes along will be named according to yet another group of people
with another set of sensibilities. So we're unlikely to end up with a
coherent set of primitives that operate on underlying data of a
variety of types; we are instead likely to end up with an incoherent
jumble.

json and hstore have overlap in the sense that you can use them to
define a tuple that is independent of the database type system and
therefore free from its restrictions (this is why 9.0+ hstore was a
complete game changer for trigger development). Also, a json object is
for all intents and purposes an hstore++ -- json is more general, and
if json gets the ability to be manipulated it would probably displace
hstore for most usages.

So I'm not buying that: if the truly overlapping behaviors were
syntactically equivalent then you would be able to swap out the
implementation changing only the type without refactoring all your
code. C++ STL works this way and that principle, at least, is good
despite all the C++ baggage headaches.

Although we now have a JSON type in core, we should not pretend that
it's in the same league as text or int4. If those data types claim
common function names like max and abs and common operator names like
+ and ||, it can be justified on the grounds that the appeal of those
data types is pretty near universal. JSON is a very popular format
right now and I completely support adding more support for it, but I
cheerfully submit that if you think it falls into the same category as
"text" or "int4", you've gotten way too caught up in the hype. It's
completely appropriate to apply stricter criteria for namespace
pollution to JSON than to a basic data type whose semantics are
dictated by the SQL standard, the behavior of other database products,
and fourth-grade math class.

I'm not buying into the hype at all. I've been arguing (without much
success) for years that throwing arcane type specific functions into
the public namespace is incoherent, not the other way around.
array_upper()? How about length() or count()?

Well, we need to decide what to do here -- I'll call the vote about
even, and there are plausible arguments to do it either way -- so how do
we resolve this?

merlin


#36 Robert Haas
robertmhaas@gmail.com
In reply to: Merlin Moncure (#35)
Re: JSON Function Bike Shedding

On Thu, Feb 21, 2013 at 10:51 AM, Merlin Moncure <mmoncure@gmail.com> wrote:

Sure: but that's another straw man: abuse of the + operator is a case of
combining arbitrarily different behaviors (concatenation and
arithmetic addition) into uniform syntax. This is bad, but a
different thing. The right way to do it is to globally define the
behavior and map it to types if and only if it makes sense. Again,
you want clean separation of 'what you're doing' vs 'what you're doing
it over'.

I'll buy that. So what's the globally defined behavior of a ~>
operator or a function called get() or even vals()? The problem is
that I don't know how we can be sure any definition we choose now
based on one example will be forward-compatible with things we want to
do later, perhaps involving completely unrelated data types with very
different semantics. It's not like there are an infinite number of
short, high-quality operator/function names.

I'm not buying into the hype at all. I've been arguing (without much
success) for years that throwing arcane type specific functions into
the public namespace is incoherent, not the other way around.
array_upper()? How about length() or count()?

Not sure I follow. array_upper() is annoying because its semantics
are kinda confusing and idiosyncratic, but that's more the fault of
the type itself than the accessor function. length() and count() are
admittedly very common English words, but it's hard to imagine what
we'd want to use those names for that would be more common/important
than what they're used for already. It's not at all hard to imagine
that with some of the other names that have been proposed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#37 Merlin Moncure
mmoncure@gmail.com
In reply to: Robert Haas (#36)
Re: JSON Function Bike Shedding

On Thu, Feb 21, 2013 at 11:02 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Feb 21, 2013 at 10:51 AM, Merlin Moncure <mmoncure@gmail.com> wrote:

Sure: but that's another straw man: abuse of the + operator is a case of
combining arbitrarily different behaviors (concatenation and
arithmetic addition) into uniform syntax. This is bad, but a
different thing. The right way to do it is to globally define the
behavior and map it to types if and only if it makes sense. Again,
you want clean separation of 'what you're doing' vs 'what you're doing
it over'.

I'll buy that. So what's the globally defined behavior of a ~>
operator or a function called get() or even vals()? The problem is
that I don't know how we can be sure any definition we choose now
based on one example will be forward-compatible with things we want to
do later, perhaps involving completely unrelated data types with very
different semantics. It's not like there are an infinite number of
short, high-quality operator/function names.

Well, in the case of an operator, it means whatever we reserve it to mean.
Very much agree on the limitations of symbolic representation of behaviors
(especially since some of the best ones were reserved by SQL or other
actors), so I think there is growing consensus that such things
should get moved to functions. But functions are a lot less terse
than operators so functions describing clearly defined behaviors are
appreciated.

So, get() means whatever we *define it to mean*, but the definition should be
consistent. If it's shorthand for "get from some multiple key/value
container" then fine. If get() is just not specific enough -- let's
at least try and go for something behavior specific (such as getMember
or some such) before punting and resorting to type specific function
names.

In fact, an awful lot of this proposal's behaviors are in fact direct
proxies for hstore behaviors, and a superficial think suggests
that around 90% of the hstore API would make sense in JSON terms (even
though Andrew didn't implement all those behaviors and we're not going
to ask him to). That to me suggests that tuple manipulation is a
pretty general problem (hstore AKA tuple) and json only brings a
couple of things to the table that aren't already covered there.
Isn't it nice that you can document functions like avals/svals ONCE
and not have to rewrite your triggers when you swap out hstore for
json to get a couple extra behavior bits?

I'm not buying into the hype at all. I've been arguing (without much
success) for years that throwing arcane type specific functions into
the public namespace is incoherent, not the other way around.
array_upper()? How about length() or count()?

Not sure I follow. array_upper() is annoying because its semantics
are kinda confusing and idiosyncratic, but that's more the fault of
the type itself than the accessor function. length() and count() are
admittedly very common English words, but it's hard to imagine what
we'd want to use those names for that would be more common/important
than what they're used for already. It's not at all hard to imagine
that with some of the other names that have been proposed.

yeah.

merlin


#38Robert Haas
robertmhaas@gmail.com
In reply to: Merlin Moncure (#37)
Re: JSON Function Bike Shedding

On Thu, Feb 21, 2013 at 1:16 PM, Merlin Moncure <mmoncure@gmail.com> wrote:

Well, for the case of the operator, it means whatever we reserve it to
mean. Very much agree on the limitations of symbolic representation of
behaviors (especially since some of the best symbols were reserved by SQL
or other actors), so I think there is growing consensus that such things
should get moved to functions. But functions are a lot less terse than
operators, so functions describing clearly defined behaviors are
appreciated.

So, get() means whatever *we define it to mean*, but the definition should
be consistent. If it's shorthand for "get from some multiple key/value
container" then fine. If get() is just not specific enough -- let's at
least try to go for something behavior-specific (such as getMember or some
such) before punting and resorting to type-specific function names.

In fact, an awful lot of the proposal's behaviors are direct proxies for
hstore behaviors, and a superficial look suggests that around 90% of the
hstore API would make sense in JSON terms (even though Andrew didn't
implement all those behaviors and we're not going to ask him to). That
suggests to me that tuple manipulation is a pretty general problem (hstore
is, after all, a tuple) and json only brings a couple of things to the
table that aren't already covered there. Isn't it nice that you can
document functions like avals/svals ONCE and not have to rewrite your
triggers when you swap out hstore for json to get a couple of extra
behavior bits?

Naming the JSON stuff the same way we've already named the hstore
stuff is a somewhat promising idea, but it's hard for me to believe
we'd truly resist the urge to tinker. avals and svals are completely
opaque to me; without reading the manual I have no idea what those
things mean. If they had longer, more descriptive names it would be
more tempting. Still, if the behaviors line up closely enough for
government work and we want to match the names up as well, I think
that'd be tolerable.

What I think is NOT tolerable is choosing a set of short but arbitrary
names which are different from anything that we have now and
pretending that we'll want to use those again for the next data type
that comes along. That's just wishful thinking. Programmers who
believe that their decisions will act as precedent for all future code
are almost inevitably disappointed. Precedent grows organically out
of what happens; it's very hard to create it ex nihilo, especially
since we have no clear idea what future data types we'll likely want
to add. Sure, if we add something that's just like JSON but with a
few extra features, we'll be able to reuse the names no problem. But
that's unlikely, because we typically resist the urge to add things
that are too much like what we already have. The main reason we're
adding JSON when we already have hstore is because JSON has become
something of a standard. We probably WILL add more "container" types
in the future, but I'd guess that they are likely to be as different
from JSON as JSON is from XML, or from arrays. I'm not convinced we
can define a set of semantics that are going to sweep that broadly.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#39David E. Wheeler
david@justatheory.com
In reply to: Robert Haas (#38)
Re: JSON Function Bike Shedding

On Feb 22, 2013, at 9:37 AM, Robert Haas <robertmhaas@gmail.com> wrote:

What I think is NOT tolerable is choosing a set of short but arbitrary
names which are different from anything that we have now and
pretending that we'll want to use those again for the next data type
that comes along. That's just wishful thinking. Programmers who
believe that their decisions will act as precedent for all future code
are almost inevitably disappointed. Precedent grows organically out
of what happens; it's very hard to create it ex nihilo, especially
since we have no clear idea what future data types we'll likely want
to add. Sure, if we add something that's just like JSON but with a
few extra features, we'll be able to reuse the names no problem. But
that's unlikely, because we typically resist the urge to add things
that are too much like what we already have. The main reason we're
adding JSON when we already have hstore is because JSON has become
something of a standard. We probably WILL add more "container" types
in the future, but I'd guess that they are likely to be as different
from JSON as JSON is from XML, or from arrays. I'm not convinced we
can define a set of semantics that are going to sweep that broadly.

Maybe. I would argue, however, that a key/value-oriented data type will always call those things "keys" and "values". So keys() and vals() (or get_keys() and get_vals()) seem pretty reasonable to me.

Anyway, back to practicalities, Andrew last posted:

I am going to go the way that involves the least amount of explicit casting or array construction. So get_path() stays, but becomes non-variadic. get() can take an int or variadic text[], so you can do:

get(myjson,0)
get(myjson,'f1')
get(myjson,'f1','2','f3')
get_path(myjson,'{f1,2,f3}')

I would change these to mention the return types:

get_json(myjson,0)
get_json(myjson,'f1')
get_json(myjson,'f1','2','f3')
get_path_json(myjson,'{f1,2,f3}')

And then the complementary text-returning versions:

get_text(myjson,0)
get_text(myjson,'f1')
get_text(myjson,'f1','2','f3')
get_path_text(myjson,'{f1,2,f3}')
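For instance, given myjson = '{"f1": [10, 20, {"f3": "x"}]}', these would
behave as follows (illustrative only -- the names are the proposal above,
not committed functions):

    SELECT get_json(myjson, 'f1', '2', 'f3');    -- json: "x"
    SELECT get_text(myjson, 'f1', '2', 'f3');    -- text: x
    SELECT get_path_text(myjson, '{f1,2,f3}');   -- same value, via a text[] path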

I do think that something like length() has pretty good semantics across data types, though. So, taking the discussion into account, I now propose these updated names:

Existing Name Proposed Name
-------------------------- -------------------
json_array_length() length()
json_each() each_json()
json_each_as_text() each_text()
json_get() get_json()
json_get_as_text() get_text()
json_get_path() get_path_json()
json_get_path_as_text() get_path_text()
json_object_keys() get_keys()
json_populate_record() to_record()
json_populate_recordset() to_records()
json_unnest() get_values()
json_agg() json_agg()

I still prefer to_record() and to_records() to populate_record(). It just feels more like a cast to me. I dislike json_agg(), but assume we're stuck with it.

But at this point, I’m happy to leave Andrew to it. The functionality is awesome.

Best,

David


#40Merlin Moncure
mmoncure@gmail.com
In reply to: David E. Wheeler (#39)
Re: JSON Function Bike Shedding

On Fri, Feb 22, 2013 at 11:50 AM, David E. Wheeler
<david@justatheory.com> wrote:

On Feb 22, 2013, at 9:37 AM, Robert Haas <robertmhaas@gmail.com> wrote:

What I think is NOT tolerable is choosing a set of short but arbitrary
names which are different from anything that we have now [...]

Maybe. I would argue, however, that a key/value-oriented data type will always call those things "keys" and "values". So keys() and vals() (or get_keys() and get_vals()) seem pretty reasonable to me.

Anyway, back to practicalities, Andrew last posted:

I am going to go the way that involves the least amount of explicit casting or array construction. So get_path() stays, but becomes non-variadic. get() can take an int or variadic text[], so you can do:

get(myjson,0)
get(myjson,'f1')
get(myjson,'f1','2','f3')
get_path(myjson,'{f1,2,f3}')

I would change these to mention the return types:

get_json(myjson,0)
get_json(myjson,'f1')
get_json(myjson,'f1','2','f3')
get_path_json(myjson,'{f1,2,f3}')

And then the complementary text-returning versions:

get_text(myjson,0)
get_text(myjson,'f1')
get_text(myjson,'f1','2','f3')
get_path_text(myjson,'{f1,2,f3}')

I do think that something like length() has pretty good semantics across data types, though. So, taking the discussion into account, I now propose these updated names:

Existing Name Proposed Name
-------------------------- -------------------
json_array_length() length()
json_each() each_json()
json_each_as_text() each_text()
json_get() get_json()
json_get_as_text() get_text()
json_get_path() get_path_json()
json_get_path_as_text() get_path_text()
json_object_keys() get_keys()
json_populate_record() to_record()
json_populate_recordset() to_records()
json_unnest() get_values()
json_agg() json_agg()

I still prefer to_record() and to_records() to populate_record(). It just feels more like a cast to me. I dislike json_agg(), but assume we're stuck with it.

But at this point, I’m happy to leave Andrew to it. The functionality is awesome.

Agreed: +1 to your thoughts here. But also +1 to the originals, and +1
to Robert's point of view. This feature is of huge strategic
importance to the project and we need to lock this down and commit it.
There is a huge difference between "I slightly prefer some different
names" and "the feature has issues".

So, I think the various positions are clear: this is one argument I'd
be happy to lose (or win).

merlin


#41Andrew Dunstan
andrew@dunslane.net
In reply to: Merlin Moncure (#40)
Re: JSON Function Bike Shedding

On 03/01/2013 11:09 AM, Merlin Moncure wrote:

On Fri, Feb 22, 2013 at 11:50 AM, David E. Wheeler
<david@justatheory.com> wrote:

[...]

Agreed: +1 to your thoughts here. But also +1 to the originals, and +1
to Robert's point of view. This feature is of huge strategic
importance to the project and we need to lock this down and commit it.
There is a huge difference between "I slightly prefer some different
names" and "the feature has issues".

So, I think the various positions are clear: this is one argument I'd
be happy to lose (or win).

I've been sitting here for a while mulling none too happily over the
debate on the names for the proposed JSON extraction functions. I
haven't really been happy with any of the suggestions much, not least
my own original function names, which were really only intended as
placeholders. Last night in the still watches I decided I just couldn't
go with a function name as almost totally content-free as get(), or even
get_text(). And I don't think prepending "json_" to the name helps much
either.

Just concentrating to start with on those get() functions, in the simple
case we really don't need them at all. hstore has the "->" operator
without documenting the underlying function ("fetchval"). So maybe we
should just do that. We could have documented, simply:

myjson -> 'fname'
myjson -> 1
myjson ->> 'fname'
myjson ->> 1
myjson #> '{fname,1}'
myjson #>> '{fname,1}'

and leave the underlying functions undocumented.
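For concreteness, that operator-only surface reads like this against
literal documents (a sketch of the proposed behavior):

    SELECT '[1,2,3]'::json -> 2;               -- 3 (as json)
    SELECT '{"a":1,"b":2}'::json ->> 'b';      -- 2 (as text)
    SELECT '{"a":[1,2,3]}'::json #> '{a,2}';   -- 3 (as json, via a path)
    SELECT '{"a":[1,2,3]}'::json #>> '{a,2}';  -- 3 (as text, via a path)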

One wrinkle in this picture is the variadic forms of extraction which
don't lend themselves nicely to use with an operator. We could decide to
do away with those altogether, or come up with a better name. I'm loath
to use "json_path" since it's a name used for something similar but
different elsewhere. I do think it's valuable to have the variadic form,
though, and I'd be sad to see it go.

Regarding the remaining functions,

* I'd be inclined to stick with json_array_length() and
json_object_keys() - I think they describe pretty well what they do.
hstore's skeys() does more or less the same as json_object_keys(),
so we could use that if we want to be consistent. I don't think it's
a terribly good name though.
* json_unnest() should certainly be renamed. Alternatives that come to
   mind are json_unfold() or json_elements() or json_array_elements().
   (A sketch below shows why the unnest name misleads.)
 * json_each(), json_each_as_text(), json_populate_record() and
   json_populate_recordset() - to be consistent with hstore we could
   remove the "json_". We probably should remove the "_as_" from
   json_each_as_text().
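To illustrate why the unnest name misleads, a sketch assuming the
json_array_elements() name wins (this matches the behavior documented in
the patch further down):

    SELECT value FROM json_array_elements('[1, true, [2, false]]');
    --   value
    -- -----------
    --  1
    --  true
    --  [2,false]     <- nested arrays stay nested, unlike SQL's unnest()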

cheers

andrew


#42Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#41)
Re: JSON Function Bike Shedding

Andrew Dunstan <andrew@dunslane.net> writes:

I've been sitting here for a while mulling none too happily over the
debate on the names for the proposed JSON extraction functions. I
haven't really been happy with any of the suggestions much, not least
my own original function names, which were really only intended as
placeholders. Last night in the still watches I decided I just couldn't
go with a function name as almost totally content-free as get(), or even
get_text(). And I don't think prepending "json_" to the name helps much
either.

Agreed.

Just concentrating to start with on those get() functions, in the simple
case we really don't need them at all. hstore has the "->" operator
without documenting the underlying function ("fetchval"). So maybe we
should just do that.

Well, not documenting the underlying function does not relieve you from
having to name it in a reasonably sane fashion. It still wouldn't do
to call it "get()".
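(Indeed, the name leaks out through the catalogs whether or not it is
documented. For example, with hstore installed:

    SELECT oprcode FROM pg_operator
     WHERE oprname = '->'
       AND oprleft = 'hstore'::regtype
       AND oprright = 'text'::regtype;
    --  oprcode
    -- ----------
    --  fetchval

so a content-free name like get() would still be user-visible.)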

* I'd be inclined to stick with json_array_length() and
json_object_keys() - I think they describe pretty well what they do.
hstore's skeys() does more or less the same as json_object_keys(),
so we could use that if we want to be consistent. I don't think it's
a terribly good name though.
* json_unnest() should certainly be renamed. Alternatives that come to
mind are json_unfold() or json_elements() or json_array_elements().
* json_each(), json_each_as_text(), json_populate_record() and
json_populate_recordset() - to be consistent with hstore we could
remove the "json_". We probably should remove the "_as_ from
json_each_as_text().

I don't particularly have a dog in this fight, but do we really want
some of these to have a json_ prefix and others not?

regards, tom lane


#43Josh Berkus
josh@agliodbs.com
In reply to: Andrew Dunstan (#41)
Re: JSON Function Bike Shedding

On 03/18/2013 12:29 PM, Andrew Dunstan wrote:

One wrinkle in this picture is the variadic forms of extraction which
don't lend themselves nicely to use with an operator. We could decide to
do away with those altogether, or come up with a better name. I'm loath
to use "json_path" since it's a name used for something similar but
different elsewhere. I do think it's valuable to have the variadic form,
though, and I'd be sad to see it go.

Given that the variadic form is meant to be the foundation of future
tree-based indexing of JSON values, I really don't want to do without
it. Plus, I'd be forced to reimplement it in my own code.

But the name does need work. json_tree? Hmmm, no good ...

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


#44Merlin Moncure
mmoncure@gmail.com
In reply to: Tom Lane (#42)
Re: JSON Function Bike Shedding

On Mon, Mar 18, 2013 at 3:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

I've been sitting here for a while mulling none too happily over the
debate on the names for the proposed JSON extraction functions. [...]

Agreed.

Just concentrating to start with on those get() functions, in the simple
case we really don't need them at all. hstore has the "->" operator
without documenting the underlying function ("fetchval"). So maybe we
should just do that.

Well, not documenting the underlying function does not relieve you from
having to name it in a reasonably sane fashion. It still wouldn't do
to call it "get()".

How about 'fetch'? Or is that just skirting the content-free aspect?
Agree that 'path' is out (as unnest is out due to unfortunate semantic
confusion). At the end of the day, 'get()' is simply referencing an
array (either associative or not). Most languages do this with an
operator, but I think fetch is a pretty solid term.

* I'd be inclined to stick with json_array_length() and
json_object_keys() - I think they describe pretty well what they do.
hstore's skeys() does more or less the same as json_object_keys(),
so we could use that if we want to be consistent. I don't think it's
a terribly good name though.
* json_unnest() should certainly be renamed. Alternatives that come to
mind are json_unfold() or json_elements() or json_array_elements().
* json_each(), json_each_as_text(), json_populate_record() and
json_populate_recordset() - to be consistent with hstore we could
remove the "json_". We probably should remove the "_as_ from
json_each_as_text().

I don't particularly have a dog in this fight, but do we really want
some of these to have a json_ prefix and others not?

That's already baked in, because the 9.2 json functions have the prefix. I'm
still partial to json_unwrap for unnest, but out of Andrew's
suggestions I like json_elements the best. I like removing _as_, too.

merlin


#45Andrew Dunstan
andrew@dunslane.net
In reply to: Merlin Moncure (#44)
Re: JSON Function Bike Shedding

On 03/22/2013 09:29 AM, Merlin Moncure wrote:

[...]

That's already baked in, because the 9.2 json functions have the prefix.

I have finally decided my position on this. I think we have lots of good
precedents for using type names in function names: array functions, xml
functions and enum functions, for example. I think these are the
precedents to follow, rather than hstore. Some people will be unhappy
that this means more typing, but SQL is somewhat verbose anyway, and
whatever we do will make someone unhappy :-)
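A few of those precedents, for reference (the enum example assumes a
hypothetical user-defined enum type called mood):

    SELECT array_length(ARRAY[1,2,3], 1);        -- array_* functions
    SELECT xmlconcat('<a/>'::xml, '<b/>'::xml);  -- xml* functions (needs --with-libxml)
    SELECT enum_first(NULL::mood);               -- enum_* functions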

I'm
still partial to json_unwrap for unnest, but out of Andrew's
suggestions I like json_elements the best. I like removing _as_, too.

OK. I can live with that.

New version forthcoming soon.

cheers

andrew


#46Merlin Moncure
mmoncure@gmail.com
In reply to: Andrew Dunstan (#45)
Re: JSON Function Bike Shedding

On Fri, Mar 22, 2013 at 8:58 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

[...]

New version forthcoming soon.

Thanks for that! I'm super duper busy lately, but I'd still like to
beef up the docs a little bit, so if I can find the time I'd like to
squeeze some in before we lock in the beta, if that's all right.

merlin


#47Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#45)
1 attachment(s)
Re: JSON Function Bike Shedding

On 03/22/2013 09:58 AM, Andrew Dunstan wrote:

New version forthcoming soon.

And here it is. I think this is just about baked now.

cheers

andrew

Attachments:

jsonapi9xx.patch (text/x-patch)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 490d710..11270d7 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -9842,17 +9842,75 @@ table2-mapping
     <secondary>Functions and operators</secondary>
   </indexterm>
 
+   <para>
+   <xref linkend="functions-json-op-table"> shows the operators that are
+   available for use with JSON (see <xref linkend="datatype-json">) data.
+  </para>
+
+  <table id="functions-json-op-table">
+     <title>JSON Operators</title>
+     <tgroup cols="4">
+      <thead>
+       <row>
+        <entry>Operator</entry>
+        <entry>Right Operand Type</entry>
+        <entry>Description</entry>
+        <entry>Example</entry>
+       </row>
+      </thead>
+      <tbody>
+       <row>
+        <entry><literal>-&gt;</literal></entry>
+        <entry>int</entry>
+        <entry>Get JSON array element</entry>
+        <entry><literal>'[1,2,3]'::json-&gt;2</literal></entry>
+       </row>
+       <row>
+        <entry><literal>-&gt;</literal></entry>
+        <entry>text</entry>
+        <entry>Get JSON object field</entry>
+        <entry><literal>'{"a":1,"b":2}'::json-&gt;'b'</literal></entry>
+       </row>
+        <row>
+        <entry><literal>-&gt;&gt;</literal></entry>
+        <entry>int</entry>
+        <entry>Get JSON array element as text</entry>
+        <entry><literal>'[1,2,3]'::json-&gt;&gt;2</literal></entry>
+       </row>
+       <row>
+        <entry><literal>-&gt;&gt;</literal></entry>
+        <entry>text</entry>
+        <entry>Get JSON object field as text</entry>
+        <entry><literal>'{"a":1,"b":2}'::json-&gt;&gt;'b'</literal></entry>
+       </row>
+       <row>
+        <entry><literal>#&gt;</literal></entry>
+        <entry>array of text</entry>
+        <entry>Get JSON object at specified path</entry>
+        <entry><literal>'{"a":[1,2,3],"b":[4,5,6]}'::json#&gt;'{a,2}'</literal></entry>
+       </row>
+       <row>
+        <entry><literal>#&gt;&gt;</literal></entry>
+        <entry>array of text</entry>
+        <entry>Get JSON object at specified path as text</entry>
+        <entry><literal>'{"a":[1,2,3],"b":[4,5,6]}'::json#&gt;&gt;'{a,2}'</literal></entry>
+       </row>
+      </tbody>
+     </tgroup>
+   </table>
+
   <para>
    <xref linkend="functions-json-table"> shows the functions that are available
-   for creating JSON (see <xref linkend="datatype-json">) data.
+   for creating and manipulating JSON (see <xref linkend="datatype-json">) data.
   </para>
 
   <table id="functions-json-table">
     <title>JSON Support Functions</title>
-    <tgroup cols="4">
+    <tgroup cols="5">
      <thead>
       <row>
        <entry>Function</entry>
+       <entry>Return Type</entry>
        <entry>Description</entry>
        <entry>Example</entry>
        <entry>Example Result</entry>
@@ -9866,6 +9924,7 @@ table2-mapping
          </indexterm>
          <literal>array_to_json(anyarray [, pretty_bool])</literal>
        </entry>
+       <entry>json</entry>
        <entry>
          Returns the array as JSON. A PostgreSQL multidimensional array
          becomes a JSON array of arrays. Line feeds will be added between
@@ -9881,6 +9940,7 @@ table2-mapping
          </indexterm>
          <literal>row_to_json(record [, pretty_bool])</literal>
        </entry>
+       <entry>json</entry>
        <entry>
          Returns the row as JSON. Line feeds will be added between level
          1 elements if <parameter>pretty_bool</parameter> is true.
@@ -9895,6 +9955,7 @@ table2-mapping
          </indexterm>
          <literal>to_json(anyelement)</literal>
        </entry>
+       <entry>json</entry>
        <entry>
          Returns the value as JSON. If the data type is not builtin, and there
          is a cast from the type to json, the cast function will be used to
@@ -9905,6 +9966,182 @@ table2-mapping
       <entry><literal>to_json('Fred said "Hi."')</literal></entry>
        <entry><literal>"Fred said \"Hi.\""</literal></entry>
       </row>
+      <row>
+	   <entry>
+         <indexterm>
+          <primary>json_array_length</primary>
+         </indexterm>
+         <literal>json_array_length(json)</literal>
+       </entry>
+       <entry>int</entry>
+       <entry>
+         Returns the number of elements in the outermost json array.
+       </entry>
+       <entry><literal>json_array_length('[1,2,3,{"f1":1,"f2":[5,6]},4]')</literal></entry>
+       <entry><literal>5</literal></entry>
+      </row>
+      <row>
+       <entry>
+         <indexterm>
+          <primary>json_each</primary>
+         </indexterm>
+         <literal>json_each(json)</literal>
+       </entry>
+       <entry>SETOF key text, value json</entry>
+       <entry>
+         Expands the outermost json object into a set of key/value pairs.
+       </entry>
+       <entry><literal>select * from json_each('{"a":"foo", "b":"bar"}')</literal></entry>
+       <entry>
+<programlisting>
+ key | value
+-----+-------
+ a   | "foo"
+ b   | "bar"
+ </programlisting>
+       </entry>
+      </row>
+      <row>
+       <entry>
+         <indexterm>
+          <primary>json_each_text</primary>
+         </indexterm>
+         <literal>json_each_text(from_json json)</literal>
+       </entry>
+       <entry>SETOF key text, value text</entry>
+       <entry>
+         Expands the outermost json object into a set of key/value pairs. The
+         returned value will be of type text.
+       </entry>
+       <entry><literal>select * from json_each_text('{"a":"foo", "b":"bar"}')</literal></entry>
+       <entry>
+<programlisting>
+ key | value
+-----+-------
+ a   | foo
+ b   | bar
+ </programlisting>
+       </entry>
+      </row>
+      <row>
+       <entry>
+         <indexterm>
+          <primary>json_extract_path</primary>
+         </indexterm>
+         <literal>json_extract_path(from_json json, VARIADIC path_elems text[])</literal>
+       </entry>
+       <entry>json</entry>
+       <entry>
+         Returns the json value pointed to by <parameter>path_elems</parameter>.
+       </entry>
+       <entry><literal>json_extract_path('{"f2":{"f3":1},"f4":{"f5":99,"f6":"foo"}}','f4')</literal></entry>
+       <entry><literal>{"f5":99,"f6":"foo"}</literal></entry>
+      </row>
+      <row>
+       <entry>
+         <indexterm>
+          <primary>json_extract_path_text</primary>
+         </indexterm>
+         <literal>json_extract_path_text(from_json json, VARIADIC path_elems text[])</literal>
+       </entry>
+       <entry>text</entry>
+       <entry>
+         Returns the value pointed to by <parameter>path_elems</parameter> as text.
+       </entry>
+       <entry><literal>json_extract_path_text('{"f2":{"f3":1},"f4":{"f5":99,"f6":"foo"}}','f4', 'f6')</literal></entry>
+       <entry><literal>foo</literal></entry>
+      </row>
+      <row>
+       <entry>
+         <indexterm>
+          <primary>json_object_keys</primary>
+         </indexterm>
+         <literal>json_object_keys(json)</literal>
+       </entry>
+       <entry>SETOF text</entry>
+       <entry>
+          Returns set of keys in the json object.  Only the "outer" object will be displayed.
+       </entry>
+       <entry><literal>json_object_keys('{"f1":"abc","f2":{"f3":"a", "f4":"b"}}')</literal></entry>
+       <entry>
+<programlisting>
+ json_object_keys
+------------------
+ f1
+ f2
+</programlisting>
+       </entry>
+      </row>
+      <row>
+       <entry>
+         <indexterm>
+          <primary>json_populate_record</primary>
+         </indexterm>
+         <literal>json_populate_record(base anyelement, from_json json [, use_json_as_text bool=false])</literal>
+       </entry>
+       <entry>anyelement</entry>
+       <entry>
+         Expands the object in from_json to a row whose columns match
+         the record type defined by base. Conversion will be best
+         effort; columns in base with no corresponding key in from_json
+         will be left null.  A column may only be specified once.
+       </entry>
+       <entry><literal>json_populate_record(null::x, '{"a":1,"b":2}')</literal></entry>
+       <entry>
+<programlisting>
+ a | b
+---+---
+ 1 | 2
+</programlisting>
+       </entry>
+      </row>
+      <row>
+       <entry>
+         <indexterm>
+          <primary>json_populate_recordset</primary>
+         </indexterm>
+         <literal>json_populate_recordset(base anyelement, from_json json [, use_json_as_text bool=false])</literal>
+       </entry>
+       <entry>SETOF anyelement</entry>
+       <entry>
+         Expands the outermost set of objects in from_json to a set
+         whose columns match the record type defined by base.
+         Conversion will be best effort; columns in base with no
+         corresponding key in from_json will be left null.  A column
+         may only be specified once.
+       </entry>
+       <entry><literal>json_populate_recordset(null::x, '[{"a":1,"b":2},{"a":3,"b":4}]')</literal></entry>
+       <entry>
+<programlisting>
+ a | b
+---+---
+ 1 | 2
+ 3 | 4
+ </programlisting>
+       </entry>
+      </row>
+      <row>
+       <entry>
+         <indexterm>
+          <primary>json_array_elements</primary>
+         </indexterm>
+         <literal>json_array_elements(json)</literal>
+       </entry>
+       <entry>SETOF json</entry>
+       <entry>
+         Expands a json array to a set of json elements.
+       </entry>
+       <entry><literal>json_array_elements('[1,true, [2,false]]')</literal></entry>
+       <entry>
+<programlisting>
+   value
+-----------
+ 1
+ true
+ [2,false]
+</programlisting>
+       </entry>
+      </row>
      </tbody>
     </tgroup>
    </table>
@@ -9922,7 +10159,6 @@ table2-mapping
     function <function>json_agg</function> which aggregates record
     values as json efficiently.
   </para>
-
  </sect1>
 
  <sect1 id="functions-sequence">
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index f727acd..57adbf6 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -787,3 +787,11 @@ COMMENT ON FUNCTION ts_debug(text) IS
 CREATE OR REPLACE FUNCTION
   pg_start_backup(label text, fast boolean DEFAULT false)
   RETURNS text STRICT VOLATILE LANGUAGE internal AS 'pg_start_backup';
+
+CREATE OR REPLACE FUNCTION 
+  json_populate_record(base anyelement, from_json json, use_json_as_text boolean DEFAULT false)
+  RETURNS anyelement LANGUAGE internal STABLE AS 'json_populate_record';
+
+CREATE OR REPLACE FUNCTION 
+  json_populate_recordset(base anyelement, from_json json, use_json_as_text boolean DEFAULT false)
+  RETURNS SETOF anyelement LANGUAGE internal STABLE ROWS 100  AS 'json_populate_recordset';
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index a929f4a..41a8982 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -19,8 +19,8 @@ OBJS = acl.o arrayfuncs.o array_selfuncs.o array_typanalyze.o \
 	array_userfuncs.o arrayutils.o bool.o \
 	cash.o char.o date.o datetime.o datum.o domains.o \
 	enum.o float.o format_type.o \
-	geo_ops.o geo_selfuncs.o int.o int8.o json.o like.o lockfuncs.o \
-	misc.o nabstime.o name.o numeric.o numutils.o \
+	geo_ops.o geo_selfuncs.o int.o int8.o json.o jsonfuncs.o like.o \
+	lockfuncs.o misc.o nabstime.o name.o numeric.o numutils.o \
 	oid.o oracle_compat.o pseudotypes.o rangetypes.o rangetypes_gist.o \
 	rowtypes.o regexp.o regproc.o ruleutils.o selfuncs.o \
 	tid.o timestamp.o varbit.o varchar.o varlena.o version.o xid.o \
diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index 82be9a1..90404f3 100644
--- a/src/backend/utils/adt/json.c
+++ b/src/backend/utils/adt/json.c
@@ -26,59 +26,37 @@
 #include "utils/builtins.h"
 #include "utils/lsyscache.h"
 #include "utils/json.h"
-#include "utils/syscache.h"
+#include "utils/jsonapi.h"
 #include "utils/typcache.h"
+#include "utils/syscache.h"
 
-typedef enum					/* types of JSON values */
-{
-	JSON_VALUE_INVALID,			/* non-value tokens are reported as this */
-	JSON_VALUE_STRING,
-	JSON_VALUE_NUMBER,
-	JSON_VALUE_OBJECT,
-	JSON_VALUE_ARRAY,
-	JSON_VALUE_TRUE,
-	JSON_VALUE_FALSE,
-	JSON_VALUE_NULL
-} JsonValueType;
-
-typedef struct					/* state of JSON lexer */
-{
-	char	   *input;			/* whole string being parsed */
-	char	   *token_start;	/* start of current token within input */
-	char	   *token_terminator;		/* end of previous or current token */
-	JsonValueType token_type;	/* type of current token, once it's known */
-} JsonLexContext;
-
-typedef enum					/* states of JSON parser */
+/*
+ * The context of the parser is maintained by the recursive descent
+ * mechanism, but is passed explicitly to the error reporting routine
+ * for better diagnostics.
+ */
+typedef enum					/* contexts of JSON parser */
 {
 	JSON_PARSE_VALUE,			/* expecting a value */
+	JSON_PARSE_STRING,			/* expecting a string (for a field name) */
 	JSON_PARSE_ARRAY_START,		/* saw '[', expecting value or ']' */
 	JSON_PARSE_ARRAY_NEXT,		/* saw array element, expecting ',' or ']' */
 	JSON_PARSE_OBJECT_START,	/* saw '{', expecting label or '}' */
 	JSON_PARSE_OBJECT_LABEL,	/* saw object label, expecting ':' */
 	JSON_PARSE_OBJECT_NEXT,		/* saw object value, expecting ',' or '}' */
-	JSON_PARSE_OBJECT_COMMA		/* saw object ',', expecting next label */
-} JsonParseState;
-
-typedef struct JsonParseStack	/* the parser state has to be stackable */
-{
-	JsonParseState state;
-	/* currently only need the state enum, but maybe someday more stuff */
-} JsonParseStack;
-
-typedef enum					/* required operations on state stack */
-{
-	JSON_STACKOP_NONE,			/* no-op */
-	JSON_STACKOP_PUSH,			/* push new JSON_PARSE_VALUE stack item */
-	JSON_STACKOP_PUSH_WITH_PUSHBACK,	/* push, then rescan current token */
-	JSON_STACKOP_POP			/* pop, or expect end of input if no stack */
-} JsonStackOp;
-
-static void json_validate_cstring(char *input);
-static void json_lex(JsonLexContext *lex);
-static void json_lex_string(JsonLexContext *lex);
-static void json_lex_number(JsonLexContext *lex, char *s);
-static void report_parse_error(JsonParseStack *stack, JsonLexContext *lex);
+	JSON_PARSE_OBJECT_COMMA,	/* saw object ',', expecting next label */
+	JSON_PARSE_END				/* saw the end of a document, expect nothing */
+}	JsonParseContext;
+
+static inline void json_lex(JsonLexContext *lex);
+static inline void json_lex_string(JsonLexContext *lex);
+static inline void json_lex_number(JsonLexContext *lex, char *s);
+static inline void parse_scalar(JsonLexContext *lex, JsonSemAction sem);
+static void parse_object_field(JsonLexContext *lex, JsonSemAction sem);
+static void parse_object(JsonLexContext *lex, JsonSemAction sem);
+static void parse_array_element(JsonLexContext *lex, JsonSemAction sem);
+static void parse_array(JsonLexContext *lex, JsonSemAction sem);
+static void report_parse_error(JsonParseContext ctx, JsonLexContext *lex);
 static void report_invalid_token(JsonLexContext *lex);
 static int	report_json_context(JsonLexContext *lex);
 static char *extract_mb_char(char *s);
@@ -91,6 +69,77 @@ static void array_dim_to_json(StringInfo result, int dim, int ndims, int *dims,
 static void array_to_json_internal(Datum array, StringInfo result,
 					   bool use_line_feeds);
 
+/* the null action object used for pure validation */
+static jsonSemAction nullSemAction =
+{
+	NULL, NULL, NULL, NULL, NULL,
+	NULL, NULL, NULL, NULL, NULL
+};
+static JsonSemAction NullSemAction = &nullSemAction;
+
+/* Recursive Descent parser support routines */
+
+/*
+ * lex_peek
+ *
+ * what is the current look_ahead token?
+ */
+static inline JsonTokenType
+lex_peek(JsonLexContext *lex)
+{
+	return lex->token_type;
+}
+
+/*
+ * lex_accept
+ *
+ * accept the look_ahead token and move the lexer to the next token if the
+ * look_ahead token matches the token parameter. In that case, and if required,
+ * also hand back the de-escaped lexeme.
+ *
+ * returns true if the token matched, false otherwise.
+ */
+static inline bool
+lex_accept(JsonLexContext *lex, JsonTokenType token, char **lexeme)
+{
+	if (lex->token_type == token)
+	{
+		if (lexeme != NULL)
+		{
+			if (lex->token_type == JSON_TOKEN_STRING)
+			{
+				if (lex->strval != NULL)
+					*lexeme = pstrdup(lex->strval->data);
+			}
+			else
+			{
+				int			len = (lex->token_terminator - lex->token_start);
+				char	   *tokstr = palloc(len + 1);
+
+				memcpy(tokstr, lex->token_start, len);
+				tokstr[len] = '\0';
+				*lexeme = tokstr;
+			}
+		}
+		json_lex(lex);
+		return true;
+	}
+	return false;
+}
+
+/*
+ * lex_expect
+ *
+ * move the lexer to the next token if the current look_ahead token matches
+ * the parameter token. Otherwise, report an error.
+ */
+static inline void
+lex_expect(JsonParseContext ctx, JsonLexContext *lex, JsonTokenType token)
+{
+	if (!lex_accept(lex, token, NULL))
+		report_parse_error(ctx, lex);
+}
+
 /*
  * All the defined	type categories are upper case , so use lower case here
  * so we avoid any possible clash.
@@ -109,19 +158,22 @@ static void array_to_json_internal(Datum array, StringInfo result,
 	 (c) == '_' || \
 	 IS_HIGHBIT_SET(c))
 
-
 /*
  * Input.
  */
 Datum
 json_in(PG_FUNCTION_ARGS)
 {
-	char	   *text = PG_GETARG_CSTRING(0);
+	char	   *json = PG_GETARG_CSTRING(0);
+	text	   *result = cstring_to_text(json);
+	JsonLexContext *lex;
 
-	json_validate_cstring(text);
+	/* validate it */
+	lex = makeJsonLexContext(result, false);
+	pg_parse_json(lex, NullSemAction);
 
 	/* Internal representation is the same as text, for now */
-	PG_RETURN_TEXT_P(cstring_to_text(text));
+	PG_RETURN_TEXT_P(result);
 }
 
 /*
@@ -160,293 +212,457 @@ json_recv(PG_FUNCTION_ARGS)
 	text	   *result;
 	char	   *str;
 	int			nbytes;
+	JsonLexContext *lex;
 
 	str = pq_getmsgtext(buf, buf->len - buf->cursor, &nbytes);
 
-	/*
-	 * We need a null-terminated string to pass to json_validate_cstring().
-	 * Rather than make a separate copy, make the temporary result one byte
-	 * bigger than it needs to be.
-	 */
-	result = palloc(nbytes + 1 + VARHDRSZ);
+	result = palloc(nbytes + VARHDRSZ);
 	SET_VARSIZE(result, nbytes + VARHDRSZ);
 	memcpy(VARDATA(result), str, nbytes);
-	str = VARDATA(result);
-	str[nbytes] = '\0';
 
 	/* Validate it. */
-	json_validate_cstring(str);
+	lex = makeJsonLexContext(result, false);
+	pg_parse_json(lex, NullSemAction);
 
 	PG_RETURN_TEXT_P(result);
 }
 
 /*
- * Check whether supplied input is valid JSON.
+ * makeJsonLexContext
+ *
+ * lex constructor, with or without StringInfo object
+ * for de-escaped lexemes.
+ *
+ * Without is better as it makes the processing faster, so only make one
+ * if really required.
+ */
+JsonLexContext *
+makeJsonLexContext(text *json, bool need_escapes)
+{
+	JsonLexContext *lex = palloc0(sizeof(JsonLexContext));
+
+	lex->input = lex->token_terminator = lex->line_start = VARDATA(json);
+	lex->line_number = 1;
+	lex->input_length = VARSIZE(json) - VARHDRSZ;
+	if (need_escapes)
+		lex->strval = makeStringInfo();
+	return lex;
+}
+
+/*
+ * pg_parse_json
+ *
+ * Publicly visible entry point for the JSON parser.
+ *
+ * lex is a lexing context, set up for the json to be processed by calling
+ * makeJsonLexContext(). sem is a structure of function pointers to semantic
+ * action routines to be called at appropriate spots during parsing, and a
+ * pointer to a state object to be passed to those routines.
+ */
+void
+pg_parse_json(JsonLexContext *lex, JsonSemAction sem)
+{
+	JsonTokenType tok;
+
+	/* get the initial token */
+	json_lex(lex);
+
+	tok = lex_peek(lex);
+
+	/* parse by recursive descent */
+	switch (tok)
+	{
+		case JSON_TOKEN_OBJECT_START:
+			parse_object(lex, sem);
+			break;
+		case JSON_TOKEN_ARRAY_START:
+			parse_array(lex, sem);
+			break;
+		default:
+			parse_scalar(lex, sem);		/* json can be a bare scalar */
+	}
+
+	lex_expect(JSON_PARSE_END, lex, JSON_TOKEN_END);
+
+}
+
+/*
+ *	Recursive Descent parse routines. There is one for each structural
+ *	element in a json document:
+ *	  - scalar (string, number, true, false, null)
+ *	  - array  ( [ ] )
+ *	  - array element
+ *	  - object ( { } )
+ *	  - object field
  */
+static inline void
+parse_scalar(JsonLexContext *lex, JsonSemAction sem)
+{
+	char	   *val = NULL;
+	json_scalar_action sfunc = sem->scalar;
+	char	  **valaddr;
+	JsonTokenType tok = lex_peek(lex);
+
+	valaddr = sfunc == NULL ? NULL : &val;
+
+	/* a scalar must be a string, a number, true, false, or null */
+	switch (tok)
+	{
+		case JSON_TOKEN_TRUE:
+			lex_accept(lex, JSON_TOKEN_TRUE, valaddr);
+			break;
+		case JSON_TOKEN_FALSE:
+			lex_accept(lex, JSON_TOKEN_FALSE, valaddr);
+			break;
+		case JSON_TOKEN_NULL:
+			lex_accept(lex, JSON_TOKEN_NULL, valaddr);
+			break;
+		case JSON_TOKEN_NUMBER:
+			lex_accept(lex, JSON_TOKEN_NUMBER, valaddr);
+			break;
+		case JSON_TOKEN_STRING:
+			lex_accept(lex, JSON_TOKEN_STRING, valaddr);
+			break;
+		default:
+			report_parse_error(JSON_PARSE_VALUE, lex);
+	}
+
+	if (sfunc != NULL)
+		(*sfunc) (sem->semstate, val, tok);
+}
+
 static void
-json_validate_cstring(char *input)
+parse_object_field(JsonLexContext *lex, JsonSemAction sem)
 {
-	JsonLexContext lex;
-	JsonParseStack *stack,
-			   *stacktop;
-	int			stacksize;
-
-	/* Set up lexing context. */
-	lex.input = input;
-	lex.token_terminator = lex.input;
-
-	/* Set up parse stack. */
-	stacksize = 32;
-	stacktop = (JsonParseStack *) palloc(sizeof(JsonParseStack) * stacksize);
-	stack = stacktop;
-	stack->state = JSON_PARSE_VALUE;
-
-	/* Main parsing loop. */
-	for (;;)
+	/*
+	 * an object field is "fieldname" : value where value can be a scalar,
+	 * object or array
+	 */
+
+	char	   *fname = NULL;	/* keep compiler quiet */
+	json_ofield_action ostart = sem->object_field_start;
+	json_ofield_action oend = sem->object_field_end;
+	bool		isnull;
+	char	  **fnameaddr = NULL;
+	JsonTokenType tok;
+
+	if (ostart != NULL || oend != NULL)
+		fnameaddr = &fname;
+
+	if (!lex_accept(lex, JSON_TOKEN_STRING, fnameaddr))
+		report_parse_error(JSON_PARSE_STRING, lex);
+
+	lex_expect(JSON_PARSE_OBJECT_LABEL, lex, JSON_TOKEN_COLON);
+
+	tok = lex_peek(lex);
+	isnull = tok == JSON_TOKEN_NULL;
+
+	if (ostart != NULL)
+		(*ostart) (sem->semstate, fname, isnull);
+
+	switch (tok)
 	{
-		JsonStackOp op;
+		case JSON_TOKEN_OBJECT_START:
+			parse_object(lex, sem);
+			break;
+		case JSON_TOKEN_ARRAY_START:
+			parse_array(lex, sem);
+			break;
+		default:
+			parse_scalar(lex, sem);
+	}
 
-		/* Fetch next token. */
-		json_lex(&lex);
+	if (oend != NULL)
+		(*oend) (sem->semstate, fname, isnull);
 
-		/* Check for unexpected end of input. */
-		if (lex.token_start == NULL)
-			report_parse_error(stack, &lex);
+	if (fname != NULL)
+		pfree(fname);
+}
 
-redo:
-		/* Figure out what to do with this token. */
-		op = JSON_STACKOP_NONE;
-		switch (stack->state)
-		{
-			case JSON_PARSE_VALUE:
-				if (lex.token_type != JSON_VALUE_INVALID)
-					op = JSON_STACKOP_POP;
-				else if (lex.token_start[0] == '[')
-					stack->state = JSON_PARSE_ARRAY_START;
-				else if (lex.token_start[0] == '{')
-					stack->state = JSON_PARSE_OBJECT_START;
-				else
-					report_parse_error(stack, &lex);
-				break;
-			case JSON_PARSE_ARRAY_START:
-				if (lex.token_type != JSON_VALUE_INVALID)
-					stack->state = JSON_PARSE_ARRAY_NEXT;
-				else if (lex.token_start[0] == ']')
-					op = JSON_STACKOP_POP;
-				else if (lex.token_start[0] == '[' ||
-						 lex.token_start[0] == '{')
-				{
-					stack->state = JSON_PARSE_ARRAY_NEXT;
-					op = JSON_STACKOP_PUSH_WITH_PUSHBACK;
-				}
-				else
-					report_parse_error(stack, &lex);
-				break;
-			case JSON_PARSE_ARRAY_NEXT:
-				if (lex.token_type != JSON_VALUE_INVALID)
-					report_parse_error(stack, &lex);
-				else if (lex.token_start[0] == ']')
-					op = JSON_STACKOP_POP;
-				else if (lex.token_start[0] == ',')
-					op = JSON_STACKOP_PUSH;
-				else
-					report_parse_error(stack, &lex);
-				break;
-			case JSON_PARSE_OBJECT_START:
-				if (lex.token_type == JSON_VALUE_STRING)
-					stack->state = JSON_PARSE_OBJECT_LABEL;
-				else if (lex.token_type == JSON_VALUE_INVALID &&
-						 lex.token_start[0] == '}')
-					op = JSON_STACKOP_POP;
-				else
-					report_parse_error(stack, &lex);
-				break;
-			case JSON_PARSE_OBJECT_LABEL:
-				if (lex.token_type == JSON_VALUE_INVALID &&
-					lex.token_start[0] == ':')
-				{
-					stack->state = JSON_PARSE_OBJECT_NEXT;
-					op = JSON_STACKOP_PUSH;
-				}
-				else
-					report_parse_error(stack, &lex);
-				break;
-			case JSON_PARSE_OBJECT_NEXT:
-				if (lex.token_type != JSON_VALUE_INVALID)
-					report_parse_error(stack, &lex);
-				else if (lex.token_start[0] == '}')
-					op = JSON_STACKOP_POP;
-				else if (lex.token_start[0] == ',')
-					stack->state = JSON_PARSE_OBJECT_COMMA;
-				else
-					report_parse_error(stack, &lex);
-				break;
-			case JSON_PARSE_OBJECT_COMMA:
-				if (lex.token_type == JSON_VALUE_STRING)
-					stack->state = JSON_PARSE_OBJECT_LABEL;
-				else
-					report_parse_error(stack, &lex);
-				break;
-			default:
-				elog(ERROR, "unexpected json parse state: %d",
-					 (int) stack->state);
-		}
+static void
+parse_object(JsonLexContext *lex, JsonSemAction sem)
+{
+	/*
+	 * an object is a possibly empty sequence of object fields, separated by
+	 * commas and surrounded by curly braces.
+	 */
+	json_struct_action ostart = sem->object_start;
+	json_struct_action oend = sem->object_end;
+	JsonTokenType tok;
 
-		/* Push or pop the state stack, if needed. */
-		switch (op)
-		{
-			case JSON_STACKOP_PUSH:
-			case JSON_STACKOP_PUSH_WITH_PUSHBACK:
-				stack++;
-				if (stack >= &stacktop[stacksize])
-				{
-					/* Need to enlarge the stack. */
-					int			stackoffset = stack - stacktop;
-
-					stacksize += 32;
-					stacktop = (JsonParseStack *)
-						repalloc(stacktop,
-								 sizeof(JsonParseStack) * stacksize);
-					stack = stacktop + stackoffset;
-				}
-				stack->state = JSON_PARSE_VALUE;
-				if (op == JSON_STACKOP_PUSH_WITH_PUSHBACK)
-					goto redo;
-				break;
-			case JSON_STACKOP_POP:
-				if (stack == stacktop)
-				{
-					/* Expect end of input. */
-					json_lex(&lex);
-					if (lex.token_start != NULL)
-						report_parse_error(NULL, &lex);
-					return;
-				}
-				stack--;
-				break;
-			case JSON_STACKOP_NONE:
-				/* nothing to do */
-				break;
-		}
+	if (ostart != NULL)
+		(*ostart) (sem->semstate);
+
+	/*
+	 * Data inside an object is at a higher nesting level than the object
+	 * itself. Note that we increment this after we call the semantic routine
+	 * for the object start and restore it before we call the routine for the
+	 * object end.
+	 */
+	lex->lex_level++;
+
+	/* we know this will succeed, just clearing the token */
+	lex_expect(JSON_PARSE_OBJECT_START, lex, JSON_TOKEN_OBJECT_START);
+
+	tok = lex_peek(lex);
+	switch (tok)
+	{
+		case JSON_TOKEN_STRING:
+			parse_object_field(lex, sem);
+			while (lex_accept(lex, JSON_TOKEN_COMMA, NULL))
+				parse_object_field(lex, sem);
+			break;
+		case JSON_TOKEN_OBJECT_END:
+			break;
+		default:
+			/* case of an invalid initial token inside the object */
+			report_parse_error(JSON_PARSE_OBJECT_START, lex);
+	}
+
+	lex_expect(JSON_PARSE_OBJECT_NEXT, lex, JSON_TOKEN_OBJECT_END);
+
+	lex->lex_level--;
+
+	if (oend != NULL)
+		(*oend) (sem->semstate);
+}
+
+static void
+parse_array_element(JsonLexContext *lex, JsonSemAction sem)
+{
+	json_aelem_action astart = sem->array_element_start;
+	json_aelem_action aend = sem->array_element_end;
+	JsonTokenType tok = lex_peek(lex);
+
+	bool		isnull;
+
+	isnull = tok == JSON_TOKEN_NULL;
+
+	if (astart != NULL)
+		(*astart) (sem->semstate, isnull);
+
+	/* an array element is any object, array or scalar */
+	switch (tok)
+	{
+		case JSON_TOKEN_OBJECT_START:
+			parse_object(lex, sem);
+			break;
+		case JSON_TOKEN_ARRAY_START:
+			parse_array(lex, sem);
+			break;
+		default:
+			parse_scalar(lex, sem);
+	}
+
+	if (aend != NULL)
+		(*aend) (sem->semstate, isnull);
+}
+
+static void
+parse_array(JsonLexContext *lex, JsonSemAction sem)
+{
+	/*
+	 * an array is a possibly empty sequence of array elements, separated by
+	 * commas and surrounded by square brackets.
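+	 *
+	 * Informally: array := '[' [ element ( ',' element )* ] ']'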
+	 */
+	json_struct_action astart = sem->array_start;
+	json_struct_action aend = sem->array_end;
+
+	if (astart != NULL)
+		(*astart) (sem->semstate);
+
+	/*
+	 * Data inside an array is at a higher nesting level than the array
+	 * itself. Note that we increment this after we call the semantic routine
+	 * for the array start and restore it before we call the routine for the
+	 * array end.
+	 */
+	lex->lex_level++;
+
+	lex_expect(JSON_PARSE_ARRAY_START, lex, JSON_TOKEN_ARRAY_START);
+	if (lex_peek(lex) != JSON_TOKEN_ARRAY_END)
+	{
+
+		parse_array_element(lex, sem);
+
+		while (lex_accept(lex, JSON_TOKEN_COMMA, NULL))
+			parse_array_element(lex, sem);
 	}
+
+	lex_expect(JSON_PARSE_ARRAY_NEXT, lex, JSON_TOKEN_ARRAY_END);
+
+	lex->lex_level--;
+
+	if (aend != NULL)
+		(*aend) (sem->semstate);
 }
 
 /*
  * Lex one token from the input stream.
  */
-static void
+static inline void
 json_lex(JsonLexContext *lex)
 {
 	char	   *s;
+	int			len;
 
 	/* Skip leading whitespace. */
 	s = lex->token_terminator;
-	while (*s == ' ' || *s == '\t' || *s == '\n' || *s == '\r')
-		s++;
+	len = s - lex->input;
+	while (len < lex->input_length &&
+		   (*s == ' ' || *s == '\t' || *s == '\n' || *s == '\r'))
+	{
+		if (*s == '\n')
+			++lex->line_number;
+		++s;
+		++len;
+	}
 	lex->token_start = s;
 
 	/* Determine token type. */
-	if (strchr("{}[],:", s[0]) != NULL)
+	if (len >= lex->input_length)
 	{
-		/* strchr() is willing to match a zero byte, so test for that. */
-		if (s[0] == '\0')
-		{
-			/* End of string. */
-			lex->token_start = NULL;
-			lex->token_terminator = s;
-		}
-		else
-		{
-			/* Single-character token, some kind of punctuation mark. */
-			lex->token_terminator = s + 1;
-		}
-		lex->token_type = JSON_VALUE_INVALID;
-	}
-	else if (*s == '"')
-	{
-		/* String. */
-		json_lex_string(lex);
-		lex->token_type = JSON_VALUE_STRING;
-	}
-	else if (*s == '-')
-	{
-		/* Negative number. */
-		json_lex_number(lex, s + 1);
-		lex->token_type = JSON_VALUE_NUMBER;
-	}
-	else if (*s >= '0' && *s <= '9')
-	{
-		/* Positive number. */
-		json_lex_number(lex, s);
-		lex->token_type = JSON_VALUE_NUMBER;
+		lex->token_start = NULL;
+		lex->prev_token_terminator = lex->token_terminator;
+		lex->token_terminator = s;
+		lex->token_type = JSON_TOKEN_END;
 	}
 	else
-	{
-		char	   *p;
-
-		/*
-		 * We're not dealing with a string, number, legal punctuation mark, or
-		 * end of string.  The only legal tokens we might find here are true,
-		 * false, and null, but for error reporting purposes we scan until we
-		 * see a non-alphanumeric character.  That way, we can report the
-		 * whole word as an unexpected token, rather than just some
-		 * unintuitive prefix thereof.
-		 */
-		for (p = s; JSON_ALPHANUMERIC_CHAR(*p); p++)
-			 /* skip */ ;
-
-		if (p == s)
+		switch (*s)
 		{
-			/*
-			 * We got some sort of unexpected punctuation or an otherwise
-			 * unexpected character, so just complain about that one
-			 * character.  (It can't be multibyte because the above loop will
-			 * advance over any multibyte characters.)
-			 */
-			lex->token_terminator = s + 1;
-			report_invalid_token(lex);
-		}
+				/* Single-character token, some kind of punctuation mark. */
+			case '{':
+				lex->prev_token_terminator = lex->token_terminator;
+				lex->token_terminator = s + 1;
+				lex->token_type = JSON_TOKEN_OBJECT_START;
+				break;
+			case '}':
+				lex->prev_token_terminator = lex->token_terminator;
+				lex->token_terminator = s + 1;
+				lex->token_type = JSON_TOKEN_OBJECT_END;
+				break;
+			case '[':
+				lex->prev_token_terminator = lex->token_terminator;
+				lex->token_terminator = s + 1;
+				lex->token_type = JSON_TOKEN_ARRAY_START;
+				break;
+			case ']':
+				lex->prev_token_terminator = lex->token_terminator;
+				lex->token_terminator = s + 1;
+				lex->token_type = JSON_TOKEN_ARRAY_END;
+				break;
+			case ',':
+				lex->prev_token_terminator = lex->token_terminator;
+				lex->token_terminator = s + 1;
+				lex->token_type = JSON_TOKEN_COMMA;
+				break;
+			case ':':
+				lex->prev_token_terminator = lex->token_terminator;
+				lex->token_terminator = s + 1;
+				lex->token_type = JSON_TOKEN_COLON;
+				break;
+			case '"':
+				/* string */
+				json_lex_string(lex);
+				lex->token_type = JSON_TOKEN_STRING;
+				break;
+			case '-':
+				/* Negative number. */
+				json_lex_number(lex, s + 1);
+				lex->token_type = JSON_TOKEN_NUMBER;
+				break;
+			case '0':
+			case '1':
+			case '2':
+			case '3':
+			case '4':
+			case '5':
+			case '6':
+			case '7':
+			case '8':
+			case '9':
+				/* Positive number. */
+				json_lex_number(lex, s);
+				lex->token_type = JSON_TOKEN_NUMBER;
+				break;
+			default:
+				{
+					char	   *p;
+
+					/*
+					 * We're not dealing with a string, number, legal
+					 * punctuation mark, or end of string.	The only legal
+					 * tokens we might find here are true, false, and null,
+					 * but for error reporting purposes we scan until we see a
+					 * non-alphanumeric character.	That way, we can report
+					 * the whole word as an unexpected token, rather than just
+					 * some unintuitive prefix thereof.
+					 */
+					for (p = s; JSON_ALPHANUMERIC_CHAR(*p) && p - s < lex->input_length - len; p++)
+						 /* skip */ ;
+
+					/*
+					 * We got some sort of unexpected punctuation or an
+					 * otherwise unexpected character, so just complain about
+					 * that one character.
+					 */
+					if (p == s)
+					{
+						lex->prev_token_terminator = lex->token_terminator;
+						lex->token_terminator = s + 1;
+						report_invalid_token(lex);
+					}
 
-		/*
-		 * We've got a real alphanumeric token here.  If it happens to be
-		 * true, false, or null, all is well.  If not, error out.
-		 */
-		lex->token_terminator = p;
-		if (p - s == 4)
-		{
-			if (memcmp(s, "true", 4) == 0)
-				lex->token_type = JSON_VALUE_TRUE;
-			else if (memcmp(s, "null", 4) == 0)
-				lex->token_type = JSON_VALUE_NULL;
-			else
-				report_invalid_token(lex);
-		}
-		else if (p - s == 5 && memcmp(s, "false", 5) == 0)
-			lex->token_type = JSON_VALUE_FALSE;
-		else
-			report_invalid_token(lex);
-	}
+					/*
+					 * We've got a real alphanumeric token here.  If it
+					 * happens to be true, false, or null, all is well.  If
+					 * not, error out.
+					 */
+					lex->prev_token_terminator = lex->token_terminator;
+					lex->token_terminator = p;
+					if (p - s == 4)
+					{
+						if (memcmp(s, "true", 4) == 0)
+							lex->token_type = JSON_TOKEN_TRUE;
+						else if (memcmp(s, "null", 4) == 0)
+							lex->token_type = JSON_TOKEN_NULL;
+						else
+							report_invalid_token(lex);
+					}
+					else if (p - s == 5 && memcmp(s, "false", 5) == 0)
+						lex->token_type = JSON_TOKEN_FALSE;
+					else
+						report_invalid_token(lex);
+
+				}
+		}						/* end of switch */
 }
 
 /*
  * The next token in the input stream is known to be a string; lex it.
  */
-static void
+static inline void
 json_lex_string(JsonLexContext *lex)
 {
 	char	   *s;
+	int			len;
+
+	if (lex->strval != NULL)
+		resetStringInfo(lex->strval);
 
-	for (s = lex->token_start + 1; *s != '"'; s++)
+	len = lex->token_start - lex->input;
+	len++;
+	for (s = lex->token_start + 1; *s != '"'; s++, len++)
 	{
-		/* Per RFC4627, these characters MUST be escaped. */
-		if ((unsigned char) *s < 32)
+		/* Premature end of the string. */
+		if (len >= lex->input_length)
 		{
-			/* A NUL byte marks the (premature) end of the string. */
-			if (*s == '\0')
-			{
-				lex->token_terminator = s;
-				report_invalid_token(lex);
-			}
+			lex->token_terminator = s;
+			report_invalid_token(lex);
+		}
+		else if ((unsigned char) *s < 32)
+		{
+			/* Per RFC4627, these characters MUST be escaped. */
 			/* Since *s isn't printable, exclude it from the context string */
 			lex->token_terminator = s;
 			ereport(ERROR,
@@ -460,7 +676,8 @@ json_lex_string(JsonLexContext *lex)
 		{
 			/* OK, we have an escape character. */
 			s++;
-			if (*s == '\0')
+			len++;
+			if (len >= lex->input_length)
 			{
 				lex->token_terminator = s;
 				report_invalid_token(lex);
@@ -473,7 +690,8 @@ json_lex_string(JsonLexContext *lex)
 				for (i = 1; i <= 4; i++)
 				{
 					s++;
-					if (*s == '\0')
+					len++;
+					if (len >= lex->input_length)
 					{
 						lex->token_terminator = s;
 						report_invalid_token(lex);
@@ -494,10 +712,66 @@ json_lex_string(JsonLexContext *lex)
 								 report_json_context(lex)));
 					}
 				}
+				if (lex->strval != NULL)
+				{
+					char		utf8str[5];
+					int			utf8len;
+					char	   *converted;
+
+					unicode_to_utf8(ch, (unsigned char *) utf8str);
+					utf8len = pg_utf_mblen((unsigned char *) utf8str);
+					utf8str[utf8len] = '\0';
+					converted = pg_any_to_server(utf8str, 1, PG_UTF8);
+					appendStringInfoString(lex->strval, converted);
+					if (converted != utf8str)
+						pfree(converted);
+
+				}
+			}
+			else if (lex->strval != NULL)
+			{
+				switch (*s)
+				{
+					case '"':
+					case '\\':
+					case '/':
+						appendStringInfoChar(lex->strval, *s);
+						break;
+					case 'b':
+						appendStringInfoChar(lex->strval, '\b');
+						break;
+					case 'f':
+						appendStringInfoChar(lex->strval, '\f');
+						break;
+					case 'n':
+						appendStringInfoChar(lex->strval, '\n');
+						break;
+					case 'r':
+						appendStringInfoChar(lex->strval, '\r');
+						break;
+					case 't':
+						appendStringInfoChar(lex->strval, '\t');
+						break;
+					default:
+						/* Not a valid string escape, so error out. */
+						lex->token_terminator = s + pg_mblen(s);
+						ereport(ERROR,
+								(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+								 errmsg("invalid input syntax for type json"),
+							errdetail("Escape sequence \"\\%s\" is invalid.",
+									  extract_mb_char(s)),
+								 report_json_context(lex)));
+				}
 			}
 			else if (strchr("\"\\/bfnrt", *s) == NULL)
 			{
-				/* Not a valid string escape, so error out. */
+				/*
+				 * Simpler processing if we're not bothered about de-escaping
+				 *
+				 * It's very tempting to remove the strchr() call here and
+				 * replace it with a switch statement, but testing so far has
+				 * shown it's not a performance win.
+				 */
 				lex->token_terminator = s + pg_mblen(s);
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
@@ -506,10 +780,17 @@ json_lex_string(JsonLexContext *lex)
 								   extract_mb_char(s)),
 						 report_json_context(lex)));
 			}
+
+		}
+		else if (lex->strval != NULL)
+		{
+			appendStringInfoChar(lex->strval, *s);
 		}
+
 	}
 
 	/* Hooray, we found the end of the string! */
+	lex->prev_token_terminator = lex->token_terminator;
 	lex->token_terminator = s + 1;
 }
 
@@ -539,57 +820,70 @@ json_lex_string(JsonLexContext *lex)
  *
  *-------------------------------------------------------------------------
  */
-static void
+static inline void
 json_lex_number(JsonLexContext *lex, char *s)
 {
 	bool		error = false;
 	char	   *p;
+	int			len;
 
+	len = s - lex->input;
 	/* Part (1): leading sign indicator. */
 	/* Caller already did this for us; so do nothing. */
 
 	/* Part (2): parse main digit string. */
 	if (*s == '0')
+	{
 		s++;
+		len++;
+	}
 	else if (*s >= '1' && *s <= '9')
 	{
 		do
 		{
 			s++;
-		} while (*s >= '0' && *s <= '9');
+			len++;
+		} while (*s >= '0' && *s <= '9' && len < lex->input_length);
 	}
 	else
 		error = true;
 
 	/* Part (3): parse optional decimal portion. */
-	if (*s == '.')
+	if (len < lex->input_length && *s == '.')
 	{
 		s++;
-		if (*s < '0' || *s > '9')
+		len++;
+		if (len == lex->input_length || *s < '0' || *s > '9')
 			error = true;
 		else
 		{
 			do
 			{
 				s++;
-			} while (*s >= '0' && *s <= '9');
+				len++;
+			} while (*s >= '0' && *s <= '9' && len < lex->input_length);
 		}
 	}
 
 	/* Part (4): parse optional exponent. */
-	if (*s == 'e' || *s == 'E')
+	if (len < lex->input_length && (*s == 'e' || *s == 'E'))
 	{
 		s++;
-		if (*s == '+' || *s == '-')
+		len++;
+		if (len < lex->input_length && (*s == '+' || *s == '-'))
+		{
 			s++;
-		if (*s < '0' || *s > '9')
+			len++;
+		}
+		if (len == lex->input_length || *s < '0' || *s > '9')
 			error = true;
 		else
 		{
 			do
 			{
 				s++;
-			} while (*s >= '0' && *s <= '9');
+				len++;
+			} while (len < lex->input_length && *s >= '0' && *s <= '9');
 		}
 	}
 
@@ -598,8 +892,9 @@ json_lex_number(JsonLexContext *lex, char *s)
 	 * here should be considered part of the token for error-reporting
 	 * purposes.
 	 */
-	for (p = s; JSON_ALPHANUMERIC_CHAR(*p); p++)
+	for (p = s; JSON_ALPHANUMERIC_CHAR(*p) && len < lex->input_length; p++, len++)
 		error = true;
+	lex->prev_token_terminator = lex->token_terminator;
 	lex->token_terminator = p;
 	if (error)
 		report_invalid_token(lex);
@@ -611,13 +906,13 @@ json_lex_number(JsonLexContext *lex, char *s)
  * lex->token_start and lex->token_terminator must identify the current token.
  */
 static void
-report_parse_error(JsonParseStack *stack, JsonLexContext *lex)
+report_parse_error(JsonParseContext ctx, JsonLexContext *lex)
 {
 	char	   *token;
 	int			toklen;
 
 	/* Handle case where the input ended prematurely. */
-	if (lex->token_start == NULL)
+	if (lex->token_start == NULL || lex->token_type == JSON_TOKEN_END)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
 				 errmsg("invalid input syntax for type json"),
@@ -631,7 +926,7 @@ report_parse_error(JsonParseStack *stack, JsonLexContext *lex)
 	token[toklen] = '\0';
 
 	/* Complain, with the appropriate detail message. */
-	if (stack == NULL)
+	if (ctx == JSON_PARSE_END)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
 				 errmsg("invalid input syntax for type json"),
@@ -640,7 +935,7 @@ report_parse_error(JsonParseStack *stack, JsonLexContext *lex)
 				 report_json_context(lex)));
 	else
 	{
-		switch (stack->state)
+		switch (ctx)
 		{
 			case JSON_PARSE_VALUE:
 				ereport(ERROR,
@@ -650,6 +945,14 @@ report_parse_error(JsonParseStack *stack, JsonLexContext *lex)
 								   token),
 						 report_json_context(lex)));
 				break;
+			case JSON_PARSE_STRING:
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+						 errmsg("invalid input syntax for type json"),
+						 errdetail("Expected string, but found \"%s\".",
+								   token),
+						 report_json_context(lex)));
+				break;
 			case JSON_PARSE_ARRAY_START:
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
@@ -699,8 +1002,7 @@ report_parse_error(JsonParseStack *stack, JsonLexContext *lex)
 						 report_json_context(lex)));
 				break;
 			default:
-				elog(ERROR, "unexpected json parse state: %d",
-					 (int) stack->state);
+				elog(ERROR, "unexpected json parse state: %d", ctx);
 		}
 	}
 }
@@ -795,7 +1097,7 @@ report_json_context(JsonLexContext *lex)
 	 * suffixing "..." if not ending at end of line.
 	 */
 	prefix = (context_start > line_start) ? "..." : "";
-	suffix = (*context_end != '\0' && *context_end != '\n' && *context_end != '\r') ? "..." : "";
+	suffix = (lex->token_type != JSON_TOKEN_END && context_end - lex->input < lex->input_length && *context_end != '\n' && *context_end != '\r') ? "..." : "";
 
 	return errcontext("JSON data, line %d: %s%s%s",
 					  line_number, prefix, ctxt, suffix);
diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
new file mode 100644
index 0000000..63df1ac
--- /dev/null
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -0,0 +1,1919 @@
+/*-------------------------------------------------------------------------
+ *
+ * jsonfuncs.c
+ *		Functions to process JSON data type.
+ *
+ * Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *	  src/backend/utils/adt/jsonfuncs.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include <limits.h>
+
+#include "fmgr.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "access/htup_details.h"
+#include "catalog/pg_type.h"
+#include "lib/stringinfo.h"
+#include "mb/pg_wchar.h"
+#include "utils/array.h"
+#include "utils/builtins.h"
+#include "utils/hsearch.h"
+#include "utils/json.h"
+#include "utils/jsonapi.h"
+#include "utils/lsyscache.h"
+#include "utils/memutils.h"
+#include "utils/typcache.h"
+
+/* semantic action functions for json_object_keys */
+static void okeys_object_field_start(void *state, char *fname, bool isnull);
+static void okeys_array_start(void *state);
+static void okeys_scalar(void *state, char *token, JsonTokenType tokentype);
+
+/* semantic action functions for json_get* functions */
+static void get_object_start(void *state);
+static void get_object_field_start(void *state, char *fname, bool isnull);
+static void get_object_field_end(void *state, char *fname, bool isnull);
+static void get_array_start(void *state);
+static void get_array_element_start(void *state, bool isnull);
+static void get_array_element_end(void *state, bool isnull);
+static void get_scalar(void *state, char *token, JsonTokenType tokentype);
+
+/* common worker function for json getter functions */
+static inline Datum get_path_all(PG_FUNCTION_ARGS, bool as_text);
+static inline text *get_worker(text *json, char *field, int elem_index,
+		   char **tpath, int *ipath, int npath,
+		   bool normalize_results);
+
+/* semantic action functions for json_array_length */
+static void alen_object_start(void *state);
+static void alen_scalar(void *state, char *token, JsonTokenType tokentype);
+static void alen_array_element_start(void *state, bool isnull);
+
+/* common worker for json_each* functions */
+static inline Datum each_worker(PG_FUNCTION_ARGS, bool as_text);
+
+/* semantic action functions for json_each */
+static void each_object_field_start(void *state, char *fname, bool isnull);
+static void each_object_field_end(void *state, char *fname, bool isnull);
+static void each_array_start(void *state);
+static void each_scalar(void *state, char *token, JsonTokenType tokentype);
+
+/* semantic action functions for json_array_elements */
+static void elements_object_start(void *state);
+static void elements_array_element_start(void *state, bool isnull);
+static void elements_array_element_end(void *state, bool isnull);
+static void elements_scalar(void *state, char *token, JsonTokenType tokentype);
+
+/* turn a json object into a hash table */
+static HTAB *get_json_object_as_hash(text *json, char *funcname, bool use_json_as_text);
+
+/* semantic action functions for get_json_object_as_hash */
+static void hash_object_field_start(void *state, char *fname, bool isnull);
+static void hash_object_field_end(void *state, char *fname, bool isnull);
+static void hash_array_start(void *state);
+static void hash_scalar(void *state, char *token, JsonTokenType tokentype);
+
+/* semantic action functions for populate_recordset */
+static void populate_recordset_object_field_start(void *state, char *fname, bool isnull);
+static void populate_recordset_object_field_end(void *state, char *fname, bool isnull);
+static void populate_recordset_scalar(void *state, char *token, JsonTokenType tokentype);
+static void populate_recordset_object_start(void *state);
+static void populate_recordset_object_end(void *state);
+static void populate_recordset_array_start(void *state);
+static void populate_recordset_array_element_start(void *state, bool isnull);
+
+/* search type classification for json_get* functions */
+typedef enum
+{
+	JSON_SEARCH_OBJECT = 1,
+	JSON_SEARCH_ARRAY,
+	JSON_SEARCH_PATH
+}	JsonSearch;
+
+/* state for json_object_keys */
+typedef struct okeysState
+{
+	JsonLexContext *lex;
+	char	  **result;
+	int			result_size;
+	int			result_count;
+	int			sent_count;
+}	okeysState, *OkeysState;
+
+/* state for json_get* functions */
+typedef struct getState
+{
+	JsonLexContext *lex;
+	JsonSearch	search_type;
+	int			search_index;
+	int			array_index;
+	char	   *search_term;
+	char	   *result_start;
+	text	   *tresult;
+	bool		result_is_null;
+	bool		normalize_results;
+	bool		next_scalar;
+	char	  **path;
+	int			npath;
+	char	  **current_path;
+	bool	   *pathok;
+	int		   *array_level_index;
+	int		   *path_level_index;
+}	getState, *GetState;
+
+/* state for json_array_length */
+typedef struct alenState
+{
+	JsonLexContext *lex;
+	int			count;
+}	alenState, *AlenState;
+
+/* state for json_each */
+typedef struct eachState
+{
+	JsonLexContext *lex;
+	Tuplestorestate *tuple_store;
+	TupleDesc	ret_tdesc;
+	MemoryContext tmp_cxt;
+	char	   *result_start;
+	bool		normalize_results;
+	bool		next_scalar;
+	char	   *normalized_scalar;
+}	eachState, *EachState;
+
+/* state for json_array_elements */
+typedef struct elementsState
+{
+	JsonLexContext *lex;
+	Tuplestorestate *tuple_store;
+	TupleDesc	ret_tdesc;
+	MemoryContext tmp_cxt;
+	char	   *result_start;
+}	elementsState, *ElementsState;
+
+/* state for get_json_object_as_hash */
+typedef struct jhashState
+{
+	JsonLexContext *lex;
+	HTAB	   *hash;
+	char	   *saved_scalar;
+	char	   *save_json_start;
+	bool		use_json_as_text;
+	char	   *function_name;
+}	jhashState, *JHashState;
+
+/* used to build the hashtable */
+typedef struct jsonHashEntry
+{
+	char		fname[NAMEDATALEN];
+	char	   *val;
+	char	   *json;
+	bool		isnull;
+}	jsonHashEntry, *JsonHashEntry;
+
+/* these two are stolen from hstore / record_out, used in populate_record* */
+typedef struct ColumnIOData
+{
+	Oid			column_type;
+	Oid			typiofunc;
+	Oid			typioparam;
+	FmgrInfo	proc;
+} ColumnIOData;
+
+typedef struct RecordIOData
+{
+	Oid			record_type;
+	int32		record_typmod;
+	int			ncolumns;
+	ColumnIOData columns[1];	/* VARIABLE LENGTH ARRAY */
+} RecordIOData;
+
+/* state for populate_recordset */
+typedef struct populateRecordsetState
+{
+	JsonLexContext *lex;
+	HTAB	   *json_hash;
+	char	   *saved_scalar;
+	char	   *save_json_start;
+	bool		use_json_as_text;
+	Tuplestorestate *tuple_store;
+	TupleDesc	ret_tdesc;
+	HeapTupleHeader rec;
+	RecordIOData *my_extra;
+	MemoryContext fn_mcxt;		/* used to stash IO funcs */
+}	populateRecordsetState, *PopulateRecordsetState;
+
+/*
+ * SQL function json_object_keys
+ *
+ * Returns the set of keys for the object argument.
+ *
+ * This SRF operates in value-per-call mode. It processes the
+ * object during the first call, and the keys are simply stashed
+ * in an array, whose size is expanded as necessary. This is probably
+ * safe enough for a list of keys of a single object, since they are
+ * limited in size to NAMEDATALEN and the number of keys is unlikely to
+ * be so huge that it has major memory implications.
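+ *
+ * For illustration (a sketch, not a regression test):
+ *   SELECT json_object_keys('{"a": 1, "b": [2, 3]}');
+ * returns the two rows 'a' and 'b'.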
+ */
+
+
+Datum
+json_object_keys(PG_FUNCTION_ARGS)
+{
+	FuncCallContext *funcctx;
+	OkeysState	state;
+	int			i;
+
+	if (SRF_IS_FIRSTCALL())
+	{
+		text	   *json = PG_GETARG_TEXT_P(0);
+		JsonLexContext *lex = makeJsonLexContext(json, true);
+		JsonSemAction sem;
+
+		MemoryContext oldcontext;
+
+		funcctx = SRF_FIRSTCALL_INIT();
+		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+		state = palloc(sizeof(okeysState));
+		sem = palloc0(sizeof(jsonSemAction));
+
+		state->lex = lex;
+		state->result_size = 256;
+		state->result_count = 0;
+		state->sent_count = 0;
+		state->result = palloc(256 * sizeof(char *));
+
+		sem->semstate = (void *) state;
+		sem->array_start = okeys_array_start;
+		sem->scalar = okeys_scalar;
+		sem->object_field_start = okeys_object_field_start;
+		/* remainder are all NULL, courtesy of palloc0 above */
+
+		pg_parse_json(lex, sem);
+		/* keys are now in state->result */
+
+		pfree(lex->strval->data);
+		pfree(lex->strval);
+		pfree(lex);
+		pfree(sem);
+
+		MemoryContextSwitchTo(oldcontext);
+		funcctx->user_fctx = (void *) state;
+
+	}
+
+	funcctx = SRF_PERCALL_SETUP();
+	state = (OkeysState) funcctx->user_fctx;
+
+	if (state->sent_count < state->result_count)
+	{
+		char	   *nxt = state->result[state->sent_count++];
+
+		SRF_RETURN_NEXT(funcctx, CStringGetTextDatum(nxt));
+	}
+
+	/* cleanup to reduce or eliminate memory leaks */
+	for (i = 0; i < state->result_count; i++)
+		pfree(state->result[i]);
+	pfree(state->result);
+	pfree(state);
+
+	SRF_RETURN_DONE(funcctx);
+}
+
+static void
+okeys_object_field_start(void *state, char *fname, bool isnull)
+{
+	OkeysState	_state = (OkeysState) state;
+
+	/* only collecting keys for the top level object */
+	if (_state->lex->lex_level != 1)
+		return;
+
+	/* enlarge result array if necessary */
+	if (_state->result_count >= _state->result_size)
+	{
+		_state->result_size *= 2;
+		_state->result =
+			repalloc(_state->result, sizeof(char *) * _state->result_size);
+	}
+
+	/* save a copy of the field name */
+	_state->result[_state->result_count++] = pstrdup(fname);
+}
+
+static void
+okeys_array_start(void *state)
+{
+	OkeysState	_state = (OkeysState) state;
+
+	/* top level must be a json object */
+	if (_state->lex->lex_level == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("cannot call json_object_keys on an array")));
+}
+
+static void
+okeys_scalar(void *state, char *token, JsonTokenType tokentype)
+{
+	OkeysState	_state = (OkeysState) state;
+
+	/* top level must be a json object */
+	if (_state->lex->lex_level == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("cannot call json_object_keys on a scalar")));
+}
+
+/*
+ * json getter functions
+ * these implement the -> ->> #> and #>> operators
+ * and the json_extract_path*(json, text, ...) functions
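+ *
+ * For illustration (a sketch of the intended behaviour):
+ *   '{"a": {"b": "c"}}'::json -> 'a'        gives '{"b": "c"}' as json
+ *   '{"a": {"b": "c"}}'::json #>> '{a,b}'   gives 'c' as de-escaped text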
+ */
+
+
+Datum
+json_object_field(PG_FUNCTION_ARGS)
+{
+	text	   *json = PG_GETARG_TEXT_P(0);
+	text	   *result;
+	text	   *fname = PG_GETARG_TEXT_P(1);
+	char	   *fnamestr = text_to_cstring(fname);
+
+	result = get_worker(json, fnamestr, -1, NULL, NULL, -1, false);
+
+	if (result != NULL)
+		PG_RETURN_TEXT_P(result);
+	else
+		PG_RETURN_NULL();
+}
+
+Datum
+json_object_field_text(PG_FUNCTION_ARGS)
+{
+	text	   *json = PG_GETARG_TEXT_P(0);
+	text	   *result;
+	text	   *fname = PG_GETARG_TEXT_P(1);
+	char	   *fnamestr = text_to_cstring(fname);
+
+	result = get_worker(json, fnamestr, -1, NULL, NULL, -1, true);
+
+	if (result != NULL)
+		PG_RETURN_TEXT_P(result);
+	else
+		PG_RETURN_NULL();
+}
+
+Datum
+json_array_element(PG_FUNCTION_ARGS)
+{
+	text	   *json = PG_GETARG_TEXT_P(0);
+	text	   *result;
+	int			element = PG_GETARG_INT32(1);
+
+	result = get_worker(json, NULL, element, NULL, NULL, -1, false);
+
+	if (result != NULL)
+		PG_RETURN_TEXT_P(result);
+	else
+		PG_RETURN_NULL();
+}
+
+Datum
+json_array_element_text(PG_FUNCTION_ARGS)
+{
+	text	   *json = PG_GETARG_TEXT_P(0);
+	text	   *result;
+	int			element = PG_GETARG_INT32(1);
+
+	result = get_worker(json, NULL, element, NULL, NULL, -1, true);
+
+	if (result != NULL)
+		PG_RETURN_TEXT_P(result);
+	else
+		PG_RETURN_NULL();
+}
+
+Datum
+json_extract_path(PG_FUNCTION_ARGS)
+{
+	return get_path_all(fcinfo, false);
+}
+
+Datum
+json_extract_path_text(PG_FUNCTION_ARGS)
+{
+	return get_path_all(fcinfo, true);
+}
+
+/*
+ * common routine for extract_path functions
+ */
+static inline Datum
+get_path_all(PG_FUNCTION_ARGS, bool as_text)
+{
+	text	   *json = PG_GETARG_TEXT_P(0);
+	ArrayType  *path = PG_GETARG_ARRAYTYPE_P(1);
+	text	   *result;
+	Datum	   *pathtext;
+	bool	   *pathnulls;
+	int			npath;
+	char	  **tpath;
+	int		   *ipath;
+	int			i;
+	long		ind;
+	char	   *endptr;
+
+	if (array_contains_nulls(path))
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("cannot call function with null path elements")));
+
+
+	deconstruct_array(path, TEXTOID, -1, false, 'i',
+					  &pathtext, &pathnulls, &npath);
+
+	tpath = palloc(npath * sizeof(char *));
+	ipath = palloc(npath * sizeof(int));
+
+
+	for (i = 0; i < npath; i++)
+	{
+		tpath[i] = TextDatumGetCString(pathtext[i]);
+		if (*tpath[i] == '\0')
+			ereport(
+					ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				   errmsg("cannot call function with empty path elements")));
+
+		/*
+		 * we have no idea at this stage what structure the document is, so
+		 * just convert anything in the path that we can to an integer, and
+		 * set all the other entries to -1, which will never match.
+		 */
+		ind = strtol(tpath[i], &endptr, 10);
+		if (*endptr == '\0' && ind <= INT_MAX && ind >= 0)
+			ipath[i] = (int) ind;
+		else
+			ipath[i] = -1;
+	}
+
+
+	result = get_worker(json, NULL, -1, tpath, ipath, npath, as_text);
+
+	if (result != NULL)
+		PG_RETURN_TEXT_P(result);
+	else
+		PG_RETURN_NULL();
+}
+
+/*
+ * get_worker
+ *
+ * common worker for all the json getter functions
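+ *
+ * Callers supply exactly one of: a single object field (field != NULL),
+ * a path (tpath/ipath with npath entries), or an array element index
+ * (the fallback case); the Asserts below enforce this.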
+ */
+static inline text *
+get_worker(text *json,
+		   char *field,
+		   int elem_index,
+		   char **tpath,
+		   int *ipath,
+		   int npath,
+		   bool normalize_results)
+{
+	GetState	state;
+	JsonLexContext *lex = makeJsonLexContext(json, true);
+	JsonSemAction sem;
+
+	/* only allowed to use one of these */
+	Assert(elem_index < 0 || (tpath == NULL && ipath == NULL && field == NULL));
+	Assert(tpath == NULL || field == NULL);
+
+	state = palloc0(sizeof(getState));
+	sem = palloc0(sizeof(jsonSemAction));
+
+	state->lex = lex;
+	/* is it "_as_text" variant? */
+	state->normalize_results = normalize_results;
+	if (field != NULL)
+	{
+		/* single text argument */
+		state->search_type = JSON_SEARCH_OBJECT;
+		state->search_term = field;
+	}
+	else if (tpath != NULL)
+	{
+		/* path array argument */
+		state->search_type = JSON_SEARCH_PATH;
+		state->path = tpath;
+		state->npath = npath;
+		state->current_path = palloc(sizeof(char *) * npath);
+		state->pathok = palloc0(sizeof(bool) * npath);
+		state->pathok[0] = true;
+		state->array_level_index = palloc(sizeof(int) * npath);
+		state->path_level_index = ipath;
+
+	}
+	else
+	{
+		/* single integer argument */
+		state->search_type = JSON_SEARCH_ARRAY;
+		state->search_index = elem_index;
+		state->array_index = -1;
+	}
+
+	sem->semstate = (void *) state;
+
+	/*
+	 * Not all variants need all the semantic routines. Only set the ones
+	 * that are actually needed, for maximum efficiency.
+	 */
+	sem->object_start = get_object_start;
+	sem->array_start = get_array_start;
+	sem->scalar = get_scalar;
+	if (field != NULL || tpath != NULL)
+	{
+		sem->object_field_start = get_object_field_start;
+		sem->object_field_end = get_object_field_end;
+	}
+	if (field == NULL)
+	{
+		sem->array_element_start = get_array_element_start;
+		sem->array_element_end = get_array_element_end;
+	}
+
+	pg_parse_json(lex, sem);
+
+	return state->tresult;
+}
+
+static void
+get_object_start(void *state)
+{
+	GetState	_state = (GetState) state;
+
+	/* json structure check */
+	if (_state->lex->lex_level == 0 && _state->search_type == JSON_SEARCH_ARRAY)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("cannot extract array element from a non-array")));
+}
+
+static void
+get_object_field_start(void *state, char *fname, bool isnull)
+{
+	GetState	_state = (GetState) state;
+	bool		get_next = false;
+	int			lex_level = _state->lex->lex_level;
+
+	if (lex_level == 1 && _state->search_type == JSON_SEARCH_OBJECT &&
+		strcmp(fname, _state->search_term) == 0)
+	{
+
+		_state->tresult = NULL;
+		_state->result_start = NULL;
+		get_next = true;
+	}
+	else if (_state->search_type == JSON_SEARCH_PATH &&
+			 lex_level <= _state->npath &&
+			 _state->pathok[_state->lex->lex_level - 1] &&
+			 strcmp(fname, _state->path[lex_level - 1]) == 0)
+	{
+		/* path search, path so far is ok, and we have a match */
+
+		/* this object overrides any previous matching object */
+
+		_state->tresult = NULL;
+		_state->result_start = NULL;
+
+		/* if not at end of path just mark path ok */
+		if (lex_level < _state->npath)
+			_state->pathok[lex_level] = true;
+
+		/* end of path, so we want this value */
+		if (lex_level == _state->npath)
+			get_next = true;
+	}
+
+	if (get_next)
+	{
+		if (_state->normalize_results &&
+			_state->lex->token_type == JSON_TOKEN_STRING)
+		{
+			/* for as_text variants, tell get_scalar to set it for us */
+			_state->next_scalar = true;
+		}
+		else
+		{
+			/* for non-as_text variants, just note the json starting point */
+			_state->result_start = _state->lex->token_start;
+		}
+	}
+}
+
+static void
+get_object_field_end(void *state, char *fname, bool isnull)
+{
+	GetState	_state = (GetState) state;
+	bool		get_last = false;
+	int			lex_level = _state->lex->lex_level;
+
+
+	/* same tests as in get_object_field_start, mutatis mutandis */
+	if (lex_level == 1 && _state->search_type == JSON_SEARCH_OBJECT &&
+		strcmp(fname, _state->search_term) == 0)
+	{
+		get_last = true;
+	}
+	else if (_state->search_type == JSON_SEARCH_PATH &&
+			 lex_level <= _state->npath &&
+			 _state->pathok[lex_level - 1] &&
+			 strcmp(fname, _state->path[lex_level - 1]) == 0)
+	{
+		/* done with this field so reset pathok */
+		if (lex_level < _state->npath)
+			_state->pathok[lex_level] = false;
+
+		if (lex_level == _state->npath)
+			get_last = true;
+	}
+
+	/* for as_text variants our work is already done */
+	if (get_last && _state->result_start != NULL)
+	{
+		/*
+		 * make a text object from the string running from the previously
+		 * noted json start up to the end of the previous token (the lexer is
+		 * by now ahead of us on whatever came after what we're interested in).
+		 */
+		int			len = _state->lex->prev_token_terminator - _state->result_start;
+
+		if (isnull && _state->normalize_results)
+			_state->tresult = (text *) NULL;
+		else
+			_state->tresult = cstring_to_text_with_len(_state->result_start, len);
+	}
+
+	/*
+	 * don't need to reset _state->result_start because we're only returning
+	 * one datum; the conditions should not occur more than once, and this
+	 * lets us check cheaply that they don't (see get_object_field_start()).
+	 */
+}
+
+static void
+get_array_start(void *state)
+{
+	GetState	_state = (GetState) state;
+	int			lex_level = _state->lex->lex_level;
+
+	/* json structure check */
+	if (lex_level == 0 && _state->search_type == JSON_SEARCH_OBJECT)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("cannot extract field from a non-object")));
+	/* initialize array count for this nesting level */
+	if (_state->search_type == JSON_SEARCH_PATH &&
+		lex_level <= _state->npath)
+		_state->array_level_index[lex_level] = -1;
+}
+
+static void
+get_array_element_start(void *state, bool isnull)
+{
+	GetState	_state = (GetState) state;
+	bool		get_next = false;
+	int			lex_level = _state->lex->lex_level;
+
+	if (lex_level == 1 && _state->search_type == JSON_SEARCH_ARRAY)
+	{
+		/* single integer search */
+		_state->array_index++;
+		if (_state->array_index == _state->search_index)
+			get_next = true;
+	}
+	else if (_state->search_type == JSON_SEARCH_PATH &&
+			 lex_level <= _state->npath &&
+			 _state->pathok[lex_level - 1])
+	{
+		/*
+		 * path search, path so far is ok
+		 *
+		 * increment the array counter. no point doing this if we already know
+		 * the path is bad.
+		 *
+		 * then check if we have a match.
+		 */
+
+		if (++_state->array_level_index[lex_level - 1] ==
+			_state->path_level_index[lex_level - 1])
+		{
+			if (lex_level == _state->npath)
+			{
+				/* match and at end of path, so get value */
+				get_next = true;
+			}
+			else
+			{
+				/* not at end of path just mark path ok */
+				_state->pathok[lex_level] = true;
+			}
+		}
+
+	}
+
+	/* same logic as for objects */
+	if (get_next)
+	{
+		if (_state->normalize_results &&
+			_state->lex->token_type == JSON_TOKEN_STRING)
+		{
+			_state->next_scalar = true;
+		}
+		else
+		{
+			_state->result_start = _state->lex->token_start;
+		}
+	}
+}
+
+static void
+get_array_element_end(void *state, bool isnull)
+{
+	GetState	_state = (GetState) state;
+	bool		get_last = false;
+	int			lex_level = _state->lex->lex_level;
+
+	/* same logic as in get_object_field_end, modified for arrays */
+
+	if (lex_level == 1 && _state->search_type == JSON_SEARCH_ARRAY &&
+		_state->array_index == _state->search_index)
+	{
+		get_last = true;
+	}
+	else if (_state->search_type == JSON_SEARCH_PATH &&
+			 lex_level <= _state->npath &&
+			 _state->pathok[lex_level - 1] &&
+			 _state->array_level_index[lex_level - 1] ==
+			 _state->path_level_index[lex_level - 1])
+	{
+		/* done with this element so reset pathok */
+		if (lex_level < _state->npath)
+			_state->pathok[lex_level] = false;
+
+		if (lex_level == _state->npath)
+			get_last = true;
+	}
+	if (get_last && _state->result_start != NULL)
+	{
+		int			len = _state->lex->prev_token_terminator - _state->result_start;
+
+		if (isnull && _state->normalize_results)
+			_state->tresult = (text *) NULL;
+		else
+			_state->tresult = cstring_to_text_with_len(_state->result_start, len);
+	}
+}
+
+static void
+get_scalar(void *state, char *token, JsonTokenType tokentype)
+{
+	GetState	_state = (GetState) state;
+
+	if (_state->lex->lex_level == 0 && _state->search_type != JSON_SEARCH_PATH)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("cannot extract element from a scalar")));
+	if (_state->next_scalar)
+	{
+		/* a de-escaped text value is wanted, so supply it */
+		_state->tresult = cstring_to_text(token);
+		/* make sure the next call to get_scalar doesn't overwrite it */
+		_state->next_scalar = false;
+	}
+
+}
+
+/*
+ * SQL function json_array_length(json) -> int
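+ *
+ * For illustration (a sketch): json_array_length('[1, [2, 3], null]')
+ * returns 3, since only top-level elements are counted.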
+ */
+Datum
+json_array_length(PG_FUNCTION_ARGS)
+{
+	text	   *json = PG_GETARG_TEXT_P(0);
+
+	AlenState	state;
+	JsonLexContext *lex = makeJsonLexContext(json, false);
+	JsonSemAction sem;
+
+	state = palloc0(sizeof(alenState));
+	sem = palloc0(sizeof(jsonSemAction));
+
+	/* palloc0 does this for us */
+#if 0
+	state->count = 0;
+#endif
+	state->lex = lex;
+
+	sem->semstate = (void *) state;
+	sem->object_start = alen_object_start;
+	sem->scalar = alen_scalar;
+	sem->array_element_start = alen_array_element_start;
+
+	pg_parse_json(lex, sem);
+
+	PG_RETURN_INT32(state->count);
+}
+
+/*
+ * These next two checks ensure that the json is an array (by erroring
+ * out if it turns out to be a scalar or an object).
+ */
+
+static void
+alen_object_start(void *state)
+{
+	AlenState	_state = (AlenState) state;
+
+	/* json structure check */
+	if (_state->lex->lex_level == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("cannot get array length of a non-array")));
+}
+
+static void
+alen_scalar(void *state, char *token, JsonTokenType tokentype)
+{
+	AlenState	_state = (AlenState) state;
+
+	/* json structure check */
+	if (_state->lex->lex_level == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("cannot get array length of a scalar")));
+}
+
+static void
+alen_array_element_start(void *state, bool isnull)
+{
+	AlenState	_state = (AlenState) state;
+
+	/* just count up all the level 1 elements */
+	if (_state->lex->lex_level == 1)
+		_state->count++;
+}
+
+/*
+ * SQL function json_each and json_each_text
+ *
+ * decompose a json object into key value pairs.
+ *
+ * Unlike json_object_keys() these SRFs operate in materialize mode,
+ * stashing results into a Tuplestore object as they go.
+ * The construction of tuples is done using a temporary memory context
+ * that is cleared out after each tuple is built.
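+ *
+ * For illustration (a sketch): json_each('{"a": "x", "b": 1}') returns
+ * the rows ('a', '"x"') and ('b', '1'); json_each_text de-escapes the
+ * string value, giving 'x' rather than '"x"'.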
+ */
+Datum
+json_each(PG_FUNCTION_ARGS)
+{
+	return each_worker(fcinfo, false);
+}
+
+Datum
+json_each_text(PG_FUNCTION_ARGS)
+{
+	return each_worker(fcinfo, true);
+}
+
+static inline Datum
+each_worker(PG_FUNCTION_ARGS, bool as_text)
+{
+	text	   *json = PG_GETARG_TEXT_P(0);
+	JsonLexContext *lex = makeJsonLexContext(json, true);
+	JsonSemAction sem;
+	ReturnSetInfo *rsi;
+	MemoryContext old_cxt;
+	TupleDesc	tupdesc;
+	EachState	state;
+
+	state = palloc0(sizeof(eachState));
+	sem = palloc0(sizeof(jsonSemAction));
+
+	rsi = (ReturnSetInfo *) fcinfo->resultinfo;
+
+	if (!rsi || !IsA(rsi, ReturnSetInfo) ||
+		(rsi->allowedModes & SFRM_Materialize) == 0 ||
+		rsi->expectedDesc == NULL)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("set-valued function called in context that "
+						"cannot accept a set")));
+
+
+	rsi->returnMode = SFRM_Materialize;
+
+	(void) get_call_result_type(fcinfo, NULL, &tupdesc);
+
+	/* make these in a sufficiently long-lived memory context */
+	old_cxt = MemoryContextSwitchTo(rsi->econtext->ecxt_per_query_memory);
+
+	state->ret_tdesc = CreateTupleDescCopy(tupdesc);
+	BlessTupleDesc(state->ret_tdesc);
+	state->tuple_store =
+		tuplestore_begin_heap(rsi->allowedModes & SFRM_Materialize_Random,
+							  false, work_mem);
+
+	MemoryContextSwitchTo(old_cxt);
+
+	sem->semstate = (void *) state;
+	sem->array_start = each_array_start;
+	sem->scalar = each_scalar;
+	sem->object_field_start = each_object_field_start;
+	sem->object_field_end = each_object_field_end;
+
+	state->normalize_results = as_text;
+	state->next_scalar = false;
+
+	state->lex = lex;
+	state->tmp_cxt = AllocSetContextCreate(CurrentMemoryContext,
+										   "json_each temporary cxt",
+										   ALLOCSET_DEFAULT_MINSIZE,
+										   ALLOCSET_DEFAULT_INITSIZE,
+										   ALLOCSET_DEFAULT_MAXSIZE);
+
+	pg_parse_json(lex, sem);
+
+	rsi->setResult = state->tuple_store;
+	rsi->setDesc = state->ret_tdesc;
+
+	PG_RETURN_NULL();
+}
+
+
+static void
+each_object_field_start(void *state, char *fname, bool isnull)
+{
+	EachState	_state = (EachState) state;
+
+	/* save a pointer to where the value starts */
+	if (_state->lex->lex_level == 1)
+	{
+		/*
+		 * next_scalar will be reset in the object_field_end handler, and
+		 * since we know the value is a scalar there is no danger of it being
+		 * on while recursing down the tree.
+		 */
+		if (_state->normalize_results && _state->lex->token_type == JSON_TOKEN_STRING)
+			_state->next_scalar = true;
+		else
+			_state->result_start = _state->lex->token_start;
+	}
+}
+
+static void
+each_object_field_end(void *state, char *fname, bool isnull)
+{
+	EachState	_state = (EachState) state;
+	MemoryContext old_cxt;
+	int			len;
+	text	   *val;
+	HeapTuple	tuple;
+	Datum		values[2];
+	bool		nulls[2] = {false, false};
+
+	/* skip over nested objects */
+	if (_state->lex->lex_level != 1)
+		return;
+
+	/* use the tmp context so we can clean up after each tuple is done */
+	old_cxt = MemoryContextSwitchTo(_state->tmp_cxt);
+
+	values[0] = CStringGetTextDatum(fname);
+
+	if (isnull && _state->normalize_results)
+	{
+		nulls[1] = true;
+		values[1] = (Datum) NULL;
+	}
+	else if (_state->next_scalar)
+	{
+		values[1] = CStringGetTextDatum(_state->normalized_scalar);
+		_state->next_scalar = false;
+	}
+	else
+	{
+		len = _state->lex->prev_token_terminator - _state->result_start;
+		val = cstring_to_text_with_len(_state->result_start, len);
+		values[1] = PointerGetDatum(val);
+	}
+
+
+	tuple = heap_form_tuple(_state->ret_tdesc, values, nulls);
+
+	tuplestore_puttuple(_state->tuple_store, tuple);
+
+	/* clean up and switch back */
+	MemoryContextSwitchTo(old_cxt);
+	MemoryContextReset(_state->tmp_cxt);
+}
+
+static void
+each_array_start(void *state)
+{
+	EachState	_state = (EachState) state;
+
+	/* json structure check */
+	if (_state->lex->lex_level == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("cannot deconstruct an array as an object")));
+}
+
+static void
+each_scalar(void *state, char *token, JsonTokenType tokentype)
+{
+	EachState	_state = (EachState) state;
+
+	/* json structure check */
+	if (_state->lex->lex_level == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("cannot deconstruct a scalar")));
+
+	/* supply de-escaped value if required */
+	if (_state->next_scalar)
+		_state->normalized_scalar = token;
+}
+
+/*
+ * SQL function json_array_elements
+ *
+ * get the elements from a json array
+ *
+ * a lot of this processing is similar to the json_each* functions
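+ *
+ * For illustration (a sketch): json_array_elements('[1, [2, 3]]')
+ * returns the two rows '1' and '[2, 3]', each still as json text.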
+ */
+Datum
+json_array_elements(PG_FUNCTION_ARGS)
+{
+	text	   *json = PG_GETARG_TEXT_P(0);
+
+	/* elements doesn't need any de-escaped strings, so use false here */
+	JsonLexContext *lex = makeJsonLexContext(json, false);
+	JsonSemAction sem;
+	ReturnSetInfo *rsi;
+	MemoryContext old_cxt;
+	TupleDesc	tupdesc;
+	ElementsState state;
+
+	state = palloc0(sizeof(elementsState));
+	sem = palloc0(sizeof(jsonSemAction));
+
+	rsi = (ReturnSetInfo *) fcinfo->resultinfo;
+
+	if (!rsi || !IsA(rsi, ReturnSetInfo) ||
+		(rsi->allowedModes & SFRM_Materialize) == 0 ||
+		rsi->expectedDesc == NULL)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("set-valued function called in context that "
+						"cannot accept a set")));
+
+
+	rsi->returnMode = SFRM_Materialize;
+
+	/* it's a simple type, so don't use get_call_result_type() */
+	tupdesc = rsi->expectedDesc;
+
+	/* make these in a sufficiently long-lived memory context */
+	old_cxt = MemoryContextSwitchTo(rsi->econtext->ecxt_per_query_memory);
+
+	state->ret_tdesc = CreateTupleDescCopy(tupdesc);
+	BlessTupleDesc(state->ret_tdesc);
+	state->tuple_store =
+		tuplestore_begin_heap(rsi->allowedModes & SFRM_Materialize_Random,
+							  false, work_mem);
+
+	MemoryContextSwitchTo(old_cxt);
+
+	sem->semstate = (void *) state;
+	sem->object_start = elements_object_start;
+	sem->scalar = elements_scalar;
+	sem->array_element_start = elements_array_element_start;
+	sem->array_element_end = elements_array_element_end;
+
+	state->lex = lex;
+	state->tmp_cxt = AllocSetContextCreate(CurrentMemoryContext,
+										 "json_array_elements temporary cxt",
+										   ALLOCSET_DEFAULT_MINSIZE,
+										   ALLOCSET_DEFAULT_INITSIZE,
+										   ALLOCSET_DEFAULT_MAXSIZE);
+
+	pg_parse_json(lex, sem);
+
+	rsi->setResult = state->tuple_store;
+	rsi->setDesc = state->ret_tdesc;
+
+	PG_RETURN_NULL();
+}
+
+static void
+elements_array_element_start(void *state, bool isnull)
+{
+	ElementsState _state = (ElementsState) state;
+
+	/* save a pointer to where the value starts */
+	if (_state->lex->lex_level == 1)
+		_state->result_start = _state->lex->token_start;
+}
+
+static void
+elements_array_element_end(void *state, bool isnull)
+{
+	ElementsState _state = (ElementsState) state;
+	MemoryContext old_cxt;
+	int			len;
+	text	   *val;
+	HeapTuple	tuple;
+	Datum		values[1];
+	static bool nulls[1] = {false};
+
+	/* skip over nested objects */
+	if (_state->lex->lex_level != 1)
+		return;
+
+	/* use the tmp context so we can clean up after each tuple is done */
+	old_cxt = MemoryContextSwitchTo(_state->tmp_cxt);
+
+	len = _state->lex->prev_token_terminator - _state->result_start;
+	val = cstring_to_text_with_len(_state->result_start, len);
+
+	values[0] = PointerGetDatum(val);
+
+	tuple = heap_form_tuple(_state->ret_tdesc, values, nulls);
+
+	tuplestore_puttuple(_state->tuple_store, tuple);
+
+	/* clean up and switch back */
+	MemoryContextSwitchTo(old_cxt);
+	MemoryContextReset(_state->tmp_cxt);
+}
+
+static void
+elements_object_start(void *state)
+{
+	ElementsState _state = (ElementsState) state;
+
+	/* json structure check */
+	if (_state->lex->lex_level == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("cannot call json_array_elements on a non-array")));
+}
+
+static void
+elements_scalar(void *state, char *token, JsonTokenType tokentype)
+{
+	ElementsState _state = (ElementsState) state;
+
+	/* json structure check */
+	if (_state->lex->lex_level == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("cannot call json_array_elements on a scalar")));
+
+	/*
+	 * json_array_elements always returns json, so there's no need to think
+	 * about de-escaped values here.
+	 */
+}
+
+/*
+ * SQL function json_populate_record
+ *
+ * set fields in a record from the argument json
+ *
+ * Code adapted shamelessly from hstore's populate_record
+ * which is in turn partly adapted from record_out.
+ *
+ * The json is decomposed into a hash table, in which each
+ * field in the record is then looked up by name.
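+ *
+ * For illustration, with a hypothetical type created as
+ *   CREATE TYPE t AS (a int, b text);
+ * the call
+ *   SELECT * FROM json_populate_record(null::t, '{"a": 1, "b": "x"}', false);
+ * would yield the row (1, x).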
+ */
+Datum
+json_populate_record(PG_FUNCTION_ARGS)
+{
+	Oid			argtype = get_fn_expr_argtype(fcinfo->flinfo, 0);
+	text	   *json = PG_GETARG_TEXT_P(1);
+	bool		use_json_as_text = PG_GETARG_BOOL(2);
+	HTAB	   *json_hash;
+	HeapTupleHeader rec;
+	Oid			tupType;
+	int32		tupTypmod;
+	TupleDesc	tupdesc;
+	HeapTupleData tuple;
+	HeapTuple	rettuple;
+	RecordIOData *my_extra;
+	int			ncolumns;
+	int			i;
+	Datum	   *values;
+	bool	   *nulls;
+	char		fname[NAMEDATALEN];
+	JsonHashEntry hashentry;
+
+
+	if (!type_is_rowtype(argtype))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATATYPE_MISMATCH),
+				 errmsg("first argument must be a rowtype")));
+
+	if (PG_ARGISNULL(0))
+	{
+		if (PG_ARGISNULL(1))
+			PG_RETURN_NULL();
+
+		rec = NULL;
+
+		/*
+		 * have no tuple to look at, so the only source of type info is the
+		 * argtype. The lookup_rowtype_tupdesc call below will error out if we
+		 * don't have a known composite type oid here.
+		 */
+		tupType = argtype;
+		tupTypmod = -1;
+	}
+	else
+	{
+		rec = PG_GETARG_HEAPTUPLEHEADER(0);
+
+		if (PG_ARGISNULL(1))
+			PG_RETURN_POINTER(rec);
+
+		/* Extract type info from the tuple itself */
+		tupType = HeapTupleHeaderGetTypeId(rec);
+		tupTypmod = HeapTupleHeaderGetTypMod(rec);
+	}
+
+	json_hash = get_json_object_as_hash(json, "json_populate_record", use_json_as_text);
+
+	/*
+	 * if the input json is empty, we can only skip the rest if we were passed
+	 * in a non-null record, since otherwise there may be issues with domain
+	 * nulls.
+	 */
+	if (hash_get_num_entries(json_hash) == 0 && rec)
+		PG_RETURN_POINTER(rec);
+
+
+	tupdesc = lookup_rowtype_tupdesc(tupType, tupTypmod);
+	ncolumns = tupdesc->natts;
+
+	if (rec)
+	{
+		/* Build a temporary HeapTuple control structure */
+		tuple.t_len = HeapTupleHeaderGetDatumLength(rec);
+		ItemPointerSetInvalid(&(tuple.t_self));
+		tuple.t_tableOid = InvalidOid;
+		tuple.t_data = rec;
+	}
+
+	/*
+	 * We arrange to look up the needed I/O info just once per series of
+	 * calls, assuming the record type doesn't change underneath us.
+	 */
+	my_extra = (RecordIOData *) fcinfo->flinfo->fn_extra;
+	if (my_extra == NULL ||
+		my_extra->ncolumns != ncolumns)
+	{
+		fcinfo->flinfo->fn_extra =
+			MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
+							   sizeof(RecordIOData) - sizeof(ColumnIOData)
+							   + ncolumns * sizeof(ColumnIOData));
+		my_extra = (RecordIOData *) fcinfo->flinfo->fn_extra;
+		my_extra->record_type = InvalidOid;
+		my_extra->record_typmod = 0;
+	}
+
+	if (my_extra->record_type != tupType ||
+		my_extra->record_typmod != tupTypmod)
+	{
+		MemSet(my_extra, 0,
+			   sizeof(RecordIOData) - sizeof(ColumnIOData)
+			   + ncolumns * sizeof(ColumnIOData));
+		my_extra->record_type = tupType;
+		my_extra->record_typmod = tupTypmod;
+		my_extra->ncolumns = ncolumns;
+	}
+
+	values = (Datum *) palloc(ncolumns * sizeof(Datum));
+	nulls = (bool *) palloc(ncolumns * sizeof(bool));
+
+	if (rec)
+	{
+		/* Break down the tuple into fields */
+		heap_deform_tuple(&tuple, tupdesc, values, nulls);
+	}
+	else
+	{
+		for (i = 0; i < ncolumns; ++i)
+		{
+			values[i] = (Datum) 0;
+			nulls[i] = true;
+		}
+	}
+
+	for (i = 0; i < ncolumns; ++i)
+	{
+		ColumnIOData *column_info = &my_extra->columns[i];
+		Oid			column_type = tupdesc->attrs[i]->atttypid;
+		char	   *value;
+
+		/* Ignore dropped columns in datatype */
+		if (tupdesc->attrs[i]->attisdropped)
+		{
+			nulls[i] = true;
+			continue;
+		}
+
+		memset(fname, 0, NAMEDATALEN);
+		strncpy(fname, NameStr(tupdesc->attrs[i]->attname), NAMEDATALEN);
+		hashentry = hash_search(json_hash, fname, HASH_FIND, NULL);
+
+		/*
+		 * we can't just skip here if the key wasn't found since we might have
+		 * a domain to deal with. If we were passed in a non-null record
+		 * datum, we assume that the existing values are valid (if they're
+		 * not, then it's not our fault), but if we were passed in a null,
+		 * then every field which we don't populate needs to be run through
+		 * the input function just in case it's a domain type.
+		 */
+		if (hashentry == NULL && rec)
+			continue;
+
+		/*
+		 * Prepare to convert the column value from text
+		 */
+		if (column_info->column_type != column_type)
+		{
+			getTypeInputInfo(column_type,
+							 &column_info->typiofunc,
+							 &column_info->typioparam);
+			fmgr_info_cxt(column_info->typiofunc, &column_info->proc,
+						  fcinfo->flinfo->fn_mcxt);
+			column_info->column_type = column_type;
+		}
+		if (hashentry == NULL || hashentry->isnull)
+		{
+			/*
+			 * need InputFunctionCall to happen even for nulls, so that domain
+			 * checks are done
+			 */
+			values[i] = InputFunctionCall(&column_info->proc, NULL,
+										  column_info->typioparam,
+										  tupdesc->attrs[i]->atttypmod);
+			nulls[i] = true;
+		}
+		else
+		{
+			value = hashentry->val;
+
+			values[i] = InputFunctionCall(&column_info->proc, value,
+										  column_info->typioparam,
+										  tupdesc->attrs[i]->atttypmod);
+			nulls[i] = false;
+		}
+	}
+
+	rettuple = heap_form_tuple(tupdesc, values, nulls);
+
+	ReleaseTupleDesc(tupdesc);
+
+	PG_RETURN_DATUM(HeapTupleGetDatum(rettuple));
+}
+
+/*
+ * get_json_object_as_hash
+ *
+ * decompose a json object into a hash table.
+ *
+ * Currently doesn't allow anything but a flat object. Should this
+ * change?
+ *
+ * funcname argument allows caller to pass in its name for use in
+ * error messages.
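+ *
+ * Each top-level key becomes a jsonHashEntry holding the field's value
+ * as text (a raw json slice for nested values, where allowed); field
+ * names too long to match any record field are simply skipped.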
+ */
+static HTAB *
+get_json_object_as_hash(text *json, char *funcname, bool use_json_as_text)
+{
+	HASHCTL		ctl;
+	HTAB	   *tab;
+	JHashState	state;
+	JsonLexContext *lex = makeJsonLexContext(json, true);
+	JsonSemAction sem;
+
+	memset(&ctl, 0, sizeof(ctl));
+	ctl.keysize = NAMEDATALEN;
+	ctl.entrysize = sizeof(jsonHashEntry);
+	ctl.hcxt = CurrentMemoryContext;
+	tab = hash_create("json object hashtable",
+					  100,
+					  &ctl,
+					  HASH_ELEM | HASH_CONTEXT);
+
+	state = palloc0(sizeof(jhashState));
+	sem = palloc0(sizeof(jsonSemAction));
+
+	state->function_name = funcname;
+	state->hash = tab;
+	state->lex = lex;
+	state->use_json_as_text = use_json_as_text;
+
+	sem->semstate = (void *) state;
+	sem->array_start = hash_array_start;
+	sem->scalar = hash_scalar;
+	sem->object_field_start = hash_object_field_start;
+	sem->object_field_end = hash_object_field_end;
+
+	pg_parse_json(lex, sem);
+
+	return tab;
+}
+
+static void
+hash_object_field_start(void *state, char *fname, bool isnull)
+{
+	JHashState	_state = (JHashState) state;
+
+	if (_state->lex->lex_level > 1)
+		return;
+
+	if (_state->lex->token_type == JSON_TOKEN_ARRAY_START ||
+		_state->lex->token_type == JSON_TOKEN_OBJECT_START)
+	{
+		if (!_state->use_json_as_text)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("cannot call %s on a nested object",
+							_state->function_name)));
+		_state->save_json_start = _state->lex->token_start;
+	}
+	else
+	{
+		/* must be a scalar */
+		_state->save_json_start = NULL;
+	}
+}
+
+static void
+hash_object_field_end(void *state, char *fname, bool isnull)
+{
+	JHashState	_state = (JHashState) state;
+	JsonHashEntry hashentry;
+	bool		found;
+	char		name[NAMEDATALEN];
+
+	/*
+	 * ignore field names >= NAMEDATALEN - they can't match a record field.
+	 * Also ignore nested fields.
+	 */
+	if (_state->lex->lex_level > 2 || strlen(fname) >= NAMEDATALEN)
+		return;
+
+	memset(name, 0, NAMEDATALEN);
+	strncpy(name, fname, NAMEDATALEN);
+
+	hashentry = hash_search(_state->hash, name, HASH_ENTER, &found);
+
+	/*
+	 * found being true indicates a duplicate. We don't do anything about
+	 * that; a later field with the same name overrides the earlier field.
+	 */
+
+	hashentry->isnull = isnull;
+	if (_state->save_json_start != NULL)
+	{
+		int			len = _state->lex->prev_token_terminator - _state->save_json_start;
+		char	   *val = palloc((len + 1) * sizeof(char));
+
+		memcpy(val, _state->save_json_start, len);
+		val[len] = '\0';
+		hashentry->val = val;
+	}
+	else
+	{
+		/* must have had a scalar instead */
+		hashentry->val = _state->saved_scalar;
+	}
+}
+
+static void
+hash_array_start(void *state)
+{
+	JHashState	_state = (JHashState) state;
+
+	if (_state->lex->lex_level == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			   errmsg("cannot call %s on an array", _state->function_name)));
+}
+
+static void
+hash_scalar(void *state, char *token, JsonTokenType tokentype)
+{
+	JHashState	_state = (JHashState) state;
+
+	if (_state->lex->lex_level == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			   errmsg("cannot call %s on a scalar", _state->function_name)));
+
+	if (_state->lex->lex_level == 1)
+		_state->saved_scalar = token;
+}
+
+
+/*
+ * SQL function json_populate_recordset
+ *
+ * set fields in a set of records from the argument json,
+ * which must be an array of objects.
+ *
+ * similar to json_populate_record, but the tuple-building code
+ * is pushed down into the semantic action handlers so it's done
+ * per object in the array.
+ */
+Datum
+json_populate_recordset(PG_FUNCTION_ARGS)
+{
+	Oid			argtype = get_fn_expr_argtype(fcinfo->flinfo, 0);
+	text	   *json;
+	bool		use_json_as_text = PG_ARGISNULL(2) ? false : PG_GETARG_BOOL(2);
+	ReturnSetInfo *rsi;
+	MemoryContext old_cxt;
+	Oid			tupType;
+	int32		tupTypmod;
+	HeapTupleHeader rec;
+	TupleDesc	tupdesc;
+	RecordIOData *my_extra;
+	int			ncolumns;
+	JsonLexContext *lex;
+	JsonSemAction sem;
+	PopulateRecordsetState state;
+
+	if (!type_is_rowtype(argtype))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATATYPE_MISMATCH),
+				 errmsg("first argument must be a rowtype")));
+
+	rsi = (ReturnSetInfo *) fcinfo->resultinfo;
+
+	if (!rsi || !IsA(rsi, ReturnSetInfo) ||
+		(rsi->allowedModes & SFRM_Materialize) == 0 ||
+		rsi->expectedDesc == NULL)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("set-valued function called in context that "
+						"cannot accept a set")));
+
+	rsi->returnMode = SFRM_Materialize;
+
+	/*
+	 * get the tupdesc from the result set info - it must be a record type
+	 * because we already checked that arg1 is a record type.
+	 */
+	(void) get_call_result_type(fcinfo, NULL, &tupdesc);
+
+	state = palloc0(sizeof(populateRecordsetState));
+	sem = palloc0(sizeof(jsonSemAction));
+
+	/* make these in a sufficiently long-lived memory context */
+	old_cxt = MemoryContextSwitchTo(rsi->econtext->ecxt_per_query_memory);
+
+	state->ret_tdesc = CreateTupleDescCopy(tupdesc);
+	BlessTupleDesc(state->ret_tdesc);
+	state->tuple_store =
+		tuplestore_begin_heap(rsi->allowedModes & SFRM_Materialize_Random,
+							  false, work_mem);
+
+	MemoryContextSwitchTo(old_cxt);
+
+	/* if the json is null send back an empty set */
+	if (PG_ARGISNULL(1))
+		PG_RETURN_NULL();
+
+	json = PG_GETARG_TEXT_P(1);
+
+	if (PG_ARGISNULL(0))
+		rec = NULL;
+	else
+		rec = PG_GETARG_HEAPTUPLEHEADER(0);
+
+	tupType = tupdesc->tdtypeid;
+	tupTypmod = tupdesc->tdtypmod;
+	ncolumns = tupdesc->natts;
+
+	lex = makeJsonLexContext(json, true);
+
+	/*
+	 * We arrange to look up the needed I/O info just once per series of
+	 * calls, assuming the record type doesn't change underneath us.
+	 */
+	my_extra = (RecordIOData *) fcinfo->flinfo->fn_extra;
+	if (my_extra == NULL ||
+		my_extra->ncolumns != ncolumns)
+	{
+		fcinfo->flinfo->fn_extra =
+			MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
+							   sizeof(RecordIOData) - sizeof(ColumnIOData)
+							   + ncolumns * sizeof(ColumnIOData));
+		my_extra = (RecordIOData *) fcinfo->flinfo->fn_extra;
+		my_extra->record_type = InvalidOid;
+		my_extra->record_typmod = 0;
+	}
+
+	if (my_extra->record_type != tupType ||
+		my_extra->record_typmod != tupTypmod)
+	{
+		MemSet(my_extra, 0,
+			   sizeof(RecordIOData) - sizeof(ColumnIOData)
+			   + ncolumns * sizeof(ColumnIOData));
+		my_extra->record_type = tupType;
+		my_extra->record_typmod = tupTypmod;
+		my_extra->ncolumns = ncolumns;
+	}
+
+	sem->semstate = (void *) state;
+	sem->array_start = populate_recordset_array_start;
+	sem->array_element_start = populate_recordset_array_element_start;
+	sem->scalar = populate_recordset_scalar;
+	sem->object_field_start = populate_recordset_object_field_start;
+	sem->object_field_end = populate_recordset_object_field_end;
+	sem->object_start = populate_recordset_object_start;
+	sem->object_end = populate_recordset_object_end;
+
+	state->lex = lex;
+
+	state->my_extra = my_extra;
+	state->rec = rec;
+	state->use_json_as_text = use_json_as_text;
+	state->fn_mcxt = fcinfo->flinfo->fn_mcxt;
+
+	pg_parse_json(lex, sem);
+
+	rsi->setResult = state->tuple_store;
+	rsi->setDesc = state->ret_tdesc;
+
+	PG_RETURN_NULL();
+}
+
+static void
+populate_recordset_object_start(void *state)
+{
+	PopulateRecordsetState _state = (PopulateRecordsetState) state;
+	int			lex_level = _state->lex->lex_level;
+	HASHCTL		ctl;
+
+	if (lex_level == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("cannot call json_populate_recordset on an object")));
+	else if (lex_level > 1 && !_state->use_json_as_text)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+		 errmsg("cannot call json_populate_recordset with nested objects")));
+
+	/* set up a new hash for this entry */
+	memset(&ctl, 0, sizeof(ctl));
+	ctl.keysize = NAMEDATALEN;
+	ctl.entrysize = sizeof(jsonHashEntry);
+	ctl.hcxt = CurrentMemoryContext;
+	_state->json_hash = hash_create("json object hashtable",
+									100,
+									&ctl,
+									HASH_ELEM | HASH_CONTEXT);
+}
+
+static void
+populate_recordset_object_end(void *state)
+{
+	PopulateRecordsetState _state = (PopulateRecordsetState) state;
+	HTAB	   *json_hash = _state->json_hash;
+	Datum	   *values;
+	bool	   *nulls;
+	char		fname[NAMEDATALEN];
+	int			i;
+	RecordIOData *my_extra = _state->my_extra;
+	int			ncolumns = my_extra->ncolumns;
+	TupleDesc	tupdesc = _state->ret_tdesc;
+	JsonHashEntry hashentry;
+	HeapTupleHeader rec = _state->rec;
+	HeapTuple	rettuple;
+
+	if (_state->lex->lex_level > 1)
+		return;
+
+	values = (Datum *) palloc(ncolumns * sizeof(Datum));
+	nulls = (bool *) palloc(ncolumns * sizeof(bool));
+
+	if (_state->rec)
+	{
+		HeapTupleData tuple;
+
+		/* Build a temporary HeapTuple control structure */
+		tuple.t_len = HeapTupleHeaderGetDatumLength(_state->rec);
+		ItemPointerSetInvalid(&(tuple.t_self));
+		tuple.t_tableOid = InvalidOid;
+		tuple.t_data = _state->rec;
+
+		/* Break down the tuple into fields */
+		heap_deform_tuple(&tuple, tupdesc, values, nulls);
+	}
+	else
+	{
+		for (i = 0; i < ncolumns; ++i)
+		{
+			values[i] = (Datum) 0;
+			nulls[i] = true;
+		}
+	}
+
+	for (i = 0; i < ncolumns; ++i)
+	{
+		ColumnIOData *column_info = &my_extra->columns[i];
+		Oid			column_type = tupdesc->attrs[i]->atttypid;
+		char	   *value;
+
+		/* Ignore dropped columns in datatype */
+		if (tupdesc->attrs[i]->attisdropped)
+		{
+			nulls[i] = true;
+			continue;
+		}
+
+		memset(fname, 0, NAMEDATALEN);
+		strncpy(fname, NameStr(tupdesc->attrs[i]->attname), NAMEDATALEN);
+		hashentry = hash_search(json_hash, fname, HASH_FIND, NULL);
+
+		/*
+		 * we can't just skip here if the key wasn't found since we might have
+		 * a domain to deal with. If we were passed in a non-null record
+		 * datum, we assume that the existing values are valid (if they're
+		 * not, then it's not our fault), but if we were passed in a null,
+		 * then every field which we don't populate needs to be run through
+		 * the input function just in case it's a domain type.
+		 */
+		if (hashentry == NULL && rec)
+			continue;
+
+		/*
+		 * Prepare to convert the column value from text
+		 */
+		if (column_info->column_type != column_type)
+		{
+			getTypeInputInfo(column_type,
+							 &column_info->typiofunc,
+							 &column_info->typioparam);
+			fmgr_info_cxt(column_info->typiofunc, &column_info->proc,
+						  _state->fn_mcxt);
+			column_info->column_type = column_type;
+		}
+		if (hashentry == NULL || hashentry->isnull)
+		{
+			/*
+			 * need InputFunctionCall to happen even for nulls, so that domain
+			 * checks are done
+			 */
+			values[i] = InputFunctionCall(&column_info->proc, NULL,
+										  column_info->typioparam,
+										  tupdesc->attrs[i]->atttypmod);
+			nulls[i] = true;
+		}
+		else
+		{
+			value = hashentry->val;
+
+			values[i] = InputFunctionCall(&column_info->proc, value,
+										  column_info->typioparam,
+										  tupdesc->attrs[i]->atttypmod);
+			nulls[i] = false;
+		}
+	}
+
+	rettuple = heap_form_tuple(tupdesc, values, nulls);
+
+	tuplestore_puttuple(_state->tuple_store, rettuple);
+
+	hash_destroy(json_hash);
+}
+
+static void
+populate_recordset_array_element_start(void *state, bool isnull)
+{
+	PopulateRecordsetState _state = (PopulateRecordsetState) state;
+
+	if (_state->lex->lex_level == 1 &&
+		_state->lex->token_type != JSON_TOKEN_OBJECT_START)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			 errmsg("must call populate_recordset on an array of objects")));
+}
+
+static void
+populate_recordset_array_start(void *state)
+{
+	PopulateRecordsetState _state = (PopulateRecordsetState) state;
+
+	if (_state->lex->lex_level != 0 && !_state->use_json_as_text)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+		  errmsg("cannot call json_populate_recordset with nested arrays")));
+}
+
+static void
+populate_recordset_scalar(void *state, char *token, JsonTokenType tokentype)
+{
+	PopulateRecordsetState _state = (PopulateRecordsetState) state;
+
+	if (_state->lex->lex_level == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("cannot call json_populate_recordset on a scalar")));
+
+	if (_state->lex->lex_level == 2)
+		_state->saved_scalar = token;
+}
+
+static void
+populate_recordset_object_field_start(void *state, char *fname, bool isnull)
+{
+	PopulateRecordsetState _state = (PopulateRecordsetState) state;
+
+	if (_state->lex->lex_level > 2)
+		return;
+
+	if (_state->lex->token_type == JSON_TOKEN_ARRAY_START ||
+		_state->lex->token_type == JSON_TOKEN_OBJECT_START)
+	{
+		if (!_state->use_json_as_text)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("cannot call json_populate_recordset on a nested object")));
+		_state->save_json_start = _state->lex->token_start;
+	}
+	else
+	{
+		_state->save_json_start = NULL;
+	}
+}
+
+static void
+populate_recordset_object_field_end(void *state, char *fname, bool isnull)
+{
+	PopulateRecordsetState _state = (PopulateRecordsetState) state;
+	JsonHashEntry hashentry;
+	bool		found;
+	char		name[NAMEDATALEN];
+
+	/*
+	 * Ignore nested fields, and ignore field names >= NAMEDATALEN -
+	 * they can't match a record field.
+	 */
+	if (_state->lex->lex_level > 2 || strlen(fname) >= NAMEDATALEN)
+		return;
+
+	memset(name, 0, NAMEDATALEN);
+	strncpy(name, fname, NAMEDATALEN);
+
+	hashentry = hash_search(_state->json_hash, name, HASH_ENTER, &found);
+
+	/*
+	 * found being true indicates a duplicate. We don't do anything about
+	 * that; a later field with the same name overrides the earlier field.
+	 */
+
+	hashentry->isnull = isnull;
+	if (_state->save_json_start != NULL)
+	{
+		int			len = _state->lex->prev_token_terminator - _state->save_json_start;
+		char	   *val = palloc((len + 1) * sizeof(char));
+
+		memcpy(val, _state->save_json_start, len);
+		val[len] = '\0';
+		hashentry->val = val;
+	}
+	else
+	{
+		/* must have had a scalar instead */
+		hashentry->val = _state->saved_scalar;
+	}
+}
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index d79d76e..7a124e6 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -1724,6 +1724,18 @@ DATA(insert OID = 3899 (  "-"	   PGNSP PGUID b f f 3831 3831 3831 0 0 range_minu
 DESCR("range difference");
 DATA(insert OID = 3900 (  "*"	   PGNSP PGUID b f f 3831 3831 3831 3900 0 range_intersect - - ));
 DESCR("range intersection");
+DATA(insert OID = 5100 (  "->"	   PGNSP PGUID b f f 114 25 114 0 0 json_object_field - - ));
+DESCR("get json object field");
+DATA(insert OID = 5101 (  "->>"    PGNSP PGUID b f f 114 25 25 0 0 json_object_field_text - - ));
+DESCR("get json object field as text");
+DATA(insert OID = 5102 (  "->"	   PGNSP PGUID b f f 114 23 114 0 0 json_array_element - - ));
+DESCR("get json array element");
+DATA(insert OID = 5103 (  "->>"    PGNSP PGUID b f f 114 23 25 0 0 json_array_element_text - - ));
+DESCR("get json array element as text");
+DATA(insert OID = 5104 (  "#>"     PGNSP PGUID b f f 114 1009 114 0 0 json_extract_path_op - - ));
+DESCR("get value from json with path elements");
+DATA(insert OID = 5105 (  "#>>"    PGNSP PGUID b f f 114 1009 25 0 0 json_extract_path_text_op - - ));
+DESCR("get value from json as text with path elements");
 
 
 /*
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 4aee002..da60415 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -4118,6 +4118,37 @@ DESCR("aggregate input into json");
 DATA(insert OID = 3176 (  to_json	   PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 114 "2283" _null_ _null_ _null_ _null_ to_json _null_ _null_ _null_ ));
 DESCR("map input to json");
 
+DATA(insert OID = 5001 (  json_object_field         PGNSP PGUID 12 1 0 0 0 f f f f t f s 2 0 114 "114 25" _null_ _null_ "{from_json, field_name}" _null_ json_object_field _null_ _null_ _null_ ));
+DESCR("get json object field");
+DATA(insert OID = 5002 (  json_object_field_text    PGNSP PGUID 12 1 0 0 0 f f f f t f s 2 0 25  "114 25" _null_ _null_ "{from_json, field_name}" _null_ json_object_field_text _null_ _null_ _null_ ));
+DESCR("get json object field as text");
+DATA(insert OID = 5003 (  json_array_element        PGNSP PGUID 12 1 0 0 0 f f f f t f s 2 0 114 "114 23" _null_ _null_ "{from_json, element_index}" _null_ json_array_element _null_ _null_ _null_ ));
+DESCR("get json array element");
+DATA(insert OID = 5004 (  json_array_element_text   PGNSP PGUID 12 1 0 0 0 f f f f t f s 2 0 25  "114 23" _null_ _null_ "{from_json, element_index}" _null_ json_array_element_text _null_ _null_ _null_ ));
+DESCR("get json array element as text");
+DATA(insert OID = 5005 (  json_extract_path	        PGNSP PGUID 12 1 0 25 0 f f f f t f s 2 0 114 "114 1009" "{114,1009}" "{i,v}" "{from_json,path_elems}" _null_ json_extract_path _null_ _null_ _null_ ));
+DESCR("get value from json with path elements");
+DATA(insert OID = 5006 (  json_extract_path_op      PGNSP PGUID 12 1 0 0 0  f f f f t f s 2 0 114 "114 1009" _null_ _null_ "{from_json,path_elems}" _null_ json_extract_path _null_ _null_ _null_ ));
+DESCR("get value from json with path elements");
+DATA(insert OID = 5007 (  json_extract_path_text	PGNSP PGUID 12 1 0 25 0 f f f f t f s 2 0 25 "114 1009" "{114,1009}" "{i,v}" "{from_json,path_elems}" _null_ json_extract_path_text _null_ _null_ _null_ ));
+DESCR("get value from json as text with path elements");
+DATA(insert OID = 5008 (  json_extract_path_text_op PGNSP PGUID 12 1 0 0 0  f f f f t f s 2 0 25 "114 1009" _null_ _null_ "{from_json,path_elems}" _null_ json_extract_path_text _null_ _null_ _null_ ));
+DESCR("get value from json as text with path elements");
+DATA(insert OID = 5009 (  json_array_elements       PGNSP PGUID 12 1 100 0 0 f f f f t t s 1 0 114 "114" "{114,114}" "{i,o}" "{from_json,value}" _null_ json_array_elements _null_ _null_ _null_ ));
+DESCR("key value pairs of a json object");
+DATA(insert OID = 5010 (  json_array_length         PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 23 "114" _null_ _null_ _null_ _null_ json_array_length _null_ _null_ _null_ ));
+DESCR("length of json array");
+DATA(insert OID = 5011 (  json_object_keys          PGNSP PGUID 12 1 100 0 0 f f f f t t s 1 0 25 "114" _null_ _null_ _null_ _null_ json_object_keys _null_ _null_ _null_ ));
+DESCR("get json object keys");
+DATA(insert OID = 5012 (  json_each                PGNSP PGUID 12 1 100 0 0 f f f f t t s 1 0 2249 "114" "{114,25,114}" "{i,o,o}" "{from_json,key,value}" _null_ json_each _null_ _null_ _null_ ));
+DESCR("key value pairs of a json object");
+DATA(insert OID = 5013 (  json_each_text           PGNSP PGUID 12 1 100 0 0 f f f f t t s 1 0 2249 "114" "{114,25,25}" "{i,o,o}" "{from_json,key,value}" _null_ json_each_text _null_ _null_ _null_ ));
+DESCR("key value pairs of a json object");
+DATA(insert OID = 5014 (  json_populate_record     PGNSP PGUID 12 1 0 0 0 f f f f f f s 3 0 2283 "2283 114 16" _null_ _null_ _null_ _null_ json_populate_record _null_ _null_ _null_ ));
+DESCR("get record fields from a json object");
+DATA(insert OID = 5015 (  json_populate_recordset  PGNSP PGUID 12 1 100 0 0 f f f f f t s 3 0 2283 "2283 114 16" _null_ _null_ _null_ _null_ json_populate_recordset _null_ _null_ _null_ ));
+DESCR("get set of records with fields from a json array of objects");
+
 /* uuid */
 DATA(insert OID = 2952 (  uuid_in		   PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2950 "2275" _null_ _null_ _null_ _null_ uuid_in _null_ _null_ _null_ ));
 DESCR("I/O");
diff --git a/src/include/utils/json.h b/src/include/utils/json.h
index caaa769..4845f5f 100644
--- a/src/include/utils/json.h
+++ b/src/include/utils/json.h
@@ -17,6 +17,7 @@
 #include "fmgr.h"
 #include "lib/stringinfo.h"
 
+/* functions in json.c */
 extern Datum json_in(PG_FUNCTION_ARGS);
 extern Datum json_out(PG_FUNCTION_ARGS);
 extern Datum json_recv(PG_FUNCTION_ARGS);
@@ -32,4 +33,19 @@ extern Datum json_agg_finalfn(PG_FUNCTION_ARGS);
 
 extern void escape_json(StringInfo buf, const char *str);
 
+/* functions in jsonfuncs.c */
+extern Datum json_object_field(PG_FUNCTION_ARGS);
+extern Datum json_object_field_text(PG_FUNCTION_ARGS);
+extern Datum json_array_element(PG_FUNCTION_ARGS);
+extern Datum json_array_element_text(PG_FUNCTION_ARGS);
+extern Datum json_extract_path(PG_FUNCTION_ARGS);
+extern Datum json_extract_path_text(PG_FUNCTION_ARGS);
+extern Datum json_object_keys(PG_FUNCTION_ARGS);
+extern Datum json_array_length(PG_FUNCTION_ARGS);
+extern Datum json_each(PG_FUNCTION_ARGS);
+extern Datum json_each_text(PG_FUNCTION_ARGS);
+extern Datum json_array_elements(PG_FUNCTION_ARGS);
+extern Datum json_populate_record(PG_FUNCTION_ARGS);
+extern Datum json_populate_recordset(PG_FUNCTION_ARGS);
+
 #endif   /* JSON_H */
diff --git a/src/include/utils/jsonapi.h b/src/include/utils/jsonapi.h
new file mode 100644
index 0000000..f96e953
--- /dev/null
+++ b/src/include/utils/jsonapi.h
@@ -0,0 +1,110 @@
+/*-------------------------------------------------------------------------
+ *
+ * jsonapi.h
+ *	  Declarations for JSON API support.
+ *
+ * Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/jsonapi.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef JSONAPI_H
+#define JSONAPI_H
+
+#include "lib/stringinfo.h"
+
+typedef enum
+{
+	JSON_TOKEN_INVALID,
+	JSON_TOKEN_STRING,
+	JSON_TOKEN_NUMBER,
+	JSON_TOKEN_OBJECT_START,
+	JSON_TOKEN_OBJECT_END,
+	JSON_TOKEN_ARRAY_START,
+	JSON_TOKEN_ARRAY_END,
+	JSON_TOKEN_COMMA,
+	JSON_TOKEN_COLON,
+	JSON_TOKEN_TRUE,
+	JSON_TOKEN_FALSE,
+	JSON_TOKEN_NULL,
+	JSON_TOKEN_END
+}	JsonTokenType;
+
+
+/*
+ * All the fields in this structure should be treated as read-only.
+ *
+ * If strval is not null, then it should contain the de-escaped value
+ * of the lexeme if it's a string. Otherwise most of these field names
+ * should be self-explanatory.
+ *
+ * line_number and line_start are principally for use by the parser's
+ * error reporting routines.
+ * token_terminator and prev_token_terminator point to the character
+ * AFTER the end of the token, i.e. where there would be a nul byte
+ * if we were using nul-terminated strings.
+ */
+typedef struct JsonLexContext
+{
+	char	   *input;
+	int			input_length;
+	char	   *token_start;
+	char	   *token_terminator;
+	char	   *prev_token_terminator;
+	JsonTokenType token_type;
+	int			lex_level;
+	int			line_number;
+	char	   *line_start;
+	StringInfo	strval;
+} JsonLexContext;
+
+typedef void (*json_struct_action) (void *state);
+typedef void (*json_ofield_action) (void *state, char *fname, bool isnull);
+typedef void (*json_aelem_action) (void *state, bool isnull);
+typedef void (*json_scalar_action) (void *state, char *token, JsonTokenType tokentype);
+
+
+/*
+ * Semantic Action structure for use in parsing json.
+ * Any of these actions can be NULL, in which case nothing is done at that
+ * point. Likewise, semstate can be NULL. Using an all-NULL structure amounts
+ * to doing a pure parse with no side-effects, and is therefore exactly
+ * what the json input routines do.
+ */
+typedef struct jsonSemAction
+{
+	void	   *semstate;
+	json_struct_action object_start;
+	json_struct_action object_end;
+	json_struct_action array_start;
+	json_struct_action array_end;
+	json_ofield_action object_field_start;
+	json_ofield_action object_field_end;
+	json_aelem_action array_element_start;
+	json_aelem_action array_element_end;
+	json_scalar_action scalar;
+}	jsonSemAction, *JsonSemAction;
+
+/*
+ * pg_parse_json will parse the string in the lex, calling the
+ * action functions in sem at the appropriate points. It is
+ * up to them to keep what state they need in semstate. If they
+ * need access to the state of the lexer, then its pointer
+ * should be passed to them as a member of whatever semstate
+ * points to. If the action pointers are NULL the parser
+ * does nothing and just continues.
+ */
+extern void pg_parse_json(JsonLexContext *lex, JsonSemAction sem);
+
+/*
+ * constructor for JsonLexContext, with or without strval element.
+ * If supplied, the strval element will contain a de-escaped version of
+ * the lexeme. However, doing this imposes a performance penalty, so
+ * it should be avoided if the de-escaped lexeme is not required.
+ */
+extern JsonLexContext *makeJsonLexContext(text *json, bool need_escapes);
+
+#endif   /* JSONAPI_H */
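
To make the intended use of this API concrete, here is a minimal hypothetical
client, not part of the patch (count_state, count_object_field_start and
count_toplevel_keys are invented names), that counts the keys of a top-level
JSON object:

    typedef struct
    {
        JsonLexContext *lex;
        int         count;
    } count_state;

    static void
    count_object_field_start(void *state, char *fname, bool isnull)
    {
        count_state *_state = (count_state *) state;

        /* fields of the top-level object are at lex_level 1 */
        if (_state->lex->lex_level == 1)
            _state->count++;
    }

    static int
    count_toplevel_keys(text *json)
    {
        /* no de-escaped lexemes needed, so pass need_escapes = false */
        JsonLexContext *lex = makeJsonLexContext(json, false);
        count_state *state = palloc0(sizeof(count_state));
        JsonSemAction sem = palloc0(sizeof(jsonSemAction));

        /* all other action pointers stay NULL, so the parser just walks on */
        state->lex = lex;
        sem->semstate = (void *) state;
        sem->object_field_start = count_object_field_start;

        pg_parse_json(lex, sem);
        return state->count;
    }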
diff --git a/src/test/regress/expected/json.out b/src/test/regress/expected/json.out
index 7ae18f1..01f0679 100644
--- a/src/test/regress/expected/json.out
+++ b/src/test/regress/expected/json.out
@@ -457,3 +457,466 @@ FROM (SELECT '{"a":1,"b": [2,3,4,"d","e","f"],"c":{"p":1,"q":2}}'::json AS "json
  {"jsonfield":{"a":1,"b": [2,3,4,"d","e","f"],"c":{"p":1,"q":2}}}
 (1 row)
 
+-- json extraction functions
+CREATE TEMP TABLE test_json (
+       json_type text,
+       test_json json
+);
+INSERT INTO test_json VALUES
+('scalar','"a scalar"'),
+('array','["zero", "one","two",null,"four","five"]'),
+('object','{"field1":"val1","field2":"val2","field3":null}');
+SELECT test_json -> 'x' 
+FROM test_json
+WHERE json_type = 'scalar';
+ERROR:  cannot extract element from a scalar
+SELECT test_json -> 'x' 
+FROM test_json
+WHERE json_type = 'array';
+ERROR:  cannot extract field from a non-object
+SELECT test_json -> 'x' 
+FROM test_json
+WHERE json_type = 'object';
+ ?column? 
+----------
+ 
+(1 row)
+
+SELECT test_json->'field2'
+FROM test_json
+WHERE json_type = 'object';
+ ?column? 
+----------
+ "val2"
+(1 row)
+
+SELECT test_json->>'field2' 
+FROM test_json
+WHERE json_type = 'object';
+ ?column? 
+----------
+ val2
+(1 row)
+
+SELECT test_json -> 2 
+FROM test_json
+WHERE json_type = 'scalar';
+ERROR:  cannot extract element from a scalar
+SELECT test_json -> 2 
+FROM test_json
+WHERE json_type = 'array';
+ ?column? 
+----------
+ "two"
+(1 row)
+
+SELECT test_json -> 2
+FROM test_json
+WHERE json_type = 'object';
+ERROR:  cannot extract array element from a non-array
+SELECT test_json->>2
+FROM test_json
+WHERE json_type = 'array';
+ ?column? 
+----------
+ two
+(1 row)
+
+SELECT json_object_keys(test_json)
+FROM test_json
+WHERE json_type = 'scalar';
+ERROR:  cannot call json_object_keys on a scalar
+SELECT json_object_keys(test_json)
+FROM test_json
+WHERE json_type = 'array';
+ERROR:  cannot call json_object_keys on an array
+SELECT json_object_keys(test_json)
+FROM test_json
+WHERE json_type = 'object';
+ json_object_keys 
+------------------
+ field1
+ field2
+ field3
+(3 rows)
+
+-- nulls
+select (test_json->'field3') is null as expect_false
+from test_json
+where json_type = 'object';
+ expect_false 
+--------------
+ f
+(1 row)
+
+select (test_json->>'field3') is null as expect_true
+from test_json
+where json_type = 'object';
+ expect_true 
+-------------
+ t
+(1 row)
+
+select (test_json->3) is null as expect_false
+from test_json
+where json_type = 'array';
+ expect_false 
+--------------
+ f
+(1 row)
+
+select (test_json->>3) is null as expect_true
+from test_json
+where json_type = 'array';
+ expect_true 
+-------------
+ t
+(1 row)
+
+-- array length
+SELECT json_array_length('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+ json_array_length 
+-------------------
+                 5
+(1 row)
+
+SELECT json_array_length('[]');
+ json_array_length 
+-------------------
+                 0
+(1 row)
+
+SELECT json_array_length('{"f1":1,"f2":[5,6]}');
+ERROR:  cannot get array length of a non-array
+SELECT json_array_length('4');
+ERROR:  cannot get array length of a scalar
+-- each
+select json_each('{"f1":[1,2,3],"f2":{"f3":1},"f4":null}');
+     json_each     
+-------------------
+ (f1,"[1,2,3]")
+ (f2,"{""f3"":1}")
+ (f4,null)
+(3 rows)
+
+select * from json_each('{"f1":[1,2,3],"f2":{"f3":1},"f4":null,"f5":99,"f6":"stringy"}') q;
+ key |   value   
+-----+-----------
+ f1  | [1,2,3]
+ f2  | {"f3":1}
+ f4  | null
+ f5  | 99
+ f6  | "stringy"
+(5 rows)
+
+select json_each_text('{"f1":[1,2,3],"f2":{"f3":1},"f4":null,"f5":"null"}');
+  json_each_text   
+-------------------
+ (f1,"[1,2,3]")
+ (f2,"{""f3"":1}")
+ (f4,)
+ (f5,null)
+(4 rows)
+
+select * from json_each_text('{"f1":[1,2,3],"f2":{"f3":1},"f4":null,"f5":99,"f6":"stringy"}') q;
+ key |  value   
+-----+----------
+ f1  | [1,2,3]
+ f2  | {"f3":1}
+ f4  | 
+ f5  | 99
+ f6  | stringy
+(5 rows)
+
+-- extract_path, extract_path_as_text
+select json_extract_path('{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}','f4','f6');
+ json_extract_path 
+-------------------
+ "stringy"
+(1 row)
+
+select json_extract_path('{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}','f2');
+ json_extract_path 
+-------------------
+ {"f3":1}
+(1 row)
+
+select json_extract_path('{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}','f2',0::text);
+ json_extract_path 
+-------------------
+ "f3"
+(1 row)
+
+select json_extract_path('{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}','f2',1::text);
+ json_extract_path 
+-------------------
+ 1
+(1 row)
+
+select json_extract_path_text('{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}','f4','f6');
+ json_extract_path_text 
+------------------------
+ stringy
+(1 row)
+
+select json_extract_path_text('{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}','f2');
+ json_extract_path_text 
+------------------------
+ {"f3":1}
+(1 row)
+
+select json_extract_path_text('{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}','f2',0::text);
+ json_extract_path_text 
+------------------------
+ f3
+(1 row)
+
+select json_extract_path_text('{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}','f2',1::text);
+ json_extract_path_text 
+------------------------
+ 1
+(1 row)
+
+-- extract_path nulls
+select json_extract_path('{"f2":{"f3":1},"f4":{"f5":null,"f6":"stringy"}}','f4','f5') is null as expect_false;
+ expect_false 
+--------------
+ f
+(1 row)
+
+select json_extract_path_text('{"f2":{"f3":1},"f4":{"f5":null,"f6":"stringy"}}','f4','f5') is null as expect_true;
+ expect_true 
+-------------
+ t
+(1 row)
+
+select json_extract_path('{"f2":{"f3":1},"f4":[0,1,2,null]}','f4','3') is null as expect_false;
+ expect_false 
+--------------
+ f
+(1 row)
+
+select json_extract_path_text('{"f2":{"f3":1},"f4":[0,1,2,null]}','f4','3') is null as expect_true;
+ expect_true 
+-------------
+ t
+(1 row)
+
+-- extract_path operators
+select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>array['f4','f6'];
+ ?column?  
+-----------
+ "stringy"
+(1 row)
+
+select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>array['f2'];
+ ?column? 
+----------
+ {"f3":1}
+(1 row)
+
+select '{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}'::json#>array['f2','0'];
+ ?column? 
+----------
+ "f3"
+(1 row)
+
+select '{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}'::json#>array['f2','1'];
+ ?column? 
+----------
+ 1
+(1 row)
+
+select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>>array['f4','f6'];
+ ?column? 
+----------
+ stringy
+(1 row)
+
+select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>>array['f2'];
+ ?column? 
+----------
+ {"f3":1}
+(1 row)
+
+select '{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}'::json#>>array['f2','0'];
+ ?column? 
+----------
+ f3
+(1 row)
+
+select '{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}'::json#>>array['f2','1'];
+ ?column? 
+----------
+ 1
+(1 row)
+
+-- same using array literals
+select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>'{f4,f6}';
+ ?column?  
+-----------
+ "stringy"
+(1 row)
+
+select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>'{f2}';
+ ?column? 
+----------
+ {"f3":1}
+(1 row)
+
+select '{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}'::json#>'{f2,0}';
+ ?column? 
+----------
+ "f3"
+(1 row)
+
+select '{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}'::json#>'{f2,1}';
+ ?column? 
+----------
+ 1
+(1 row)
+
+select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>>'{f4,f6}';
+ ?column? 
+----------
+ stringy
+(1 row)
+
+select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>>'{f2}';
+ ?column? 
+----------
+ {"f3":1}
+(1 row)
+
+select '{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}'::json#>>'{f2,0}';
+ ?column? 
+----------
+ f3
+(1 row)
+
+select '{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}'::json#>>'{f2,1}';
+ ?column? 
+----------
+ 1
+(1 row)
+
+-- array_elements
+select json_array_elements('[1,true,[1,[2,3]],null,{"f1":1,"f2":[7,8,9]},false]');
+  json_array_elements  
+-----------------------
+ 1
+ true
+ [1,[2,3]]
+ null
+ {"f1":1,"f2":[7,8,9]}
+ false
+(6 rows)
+
+select * from json_array_elements('[1,true,[1,[2,3]],null,{"f1":1,"f2":[7,8,9]},false]') q;
+         value         
+-----------------------
+ 1
+ true
+ [1,[2,3]]
+ null
+ {"f1":1,"f2":[7,8,9]}
+ false
+(6 rows)
+
+-- populate_record
+create type jpop as (a text, b int, c timestamp);
+select * from json_populate_record(null::jpop,'{"a":"blurfl","x":43.2}') q;
+   a    | b | c 
+--------+---+---
+ blurfl |   | 
+(1 row)
+
+select * from json_populate_record(row('x',3,'2012-12-31 15:30:56')::jpop,'{"a":"blurfl","x":43.2}') q;
+   a    | b |            c             
+--------+---+--------------------------
+ blurfl | 3 | Mon Dec 31 15:30:56 2012
+(1 row)
+
+select * from json_populate_record(null::jpop,'{"a":"blurfl","x":43.2}', true) q;
+   a    | b | c 
+--------+---+---
+ blurfl |   | 
+(1 row)
+
+select * from json_populate_record(row('x',3,'2012-12-31 15:30:56')::jpop,'{"a":"blurfl","x":43.2}', true) q;
+   a    | b |            c             
+--------+---+--------------------------
+ blurfl | 3 | Mon Dec 31 15:30:56 2012
+(1 row)
+
+select * from json_populate_record(null::jpop,'{"a":[100,200,false],"x":43.2}', true) q;
+        a        | b | c 
+-----------------+---+---
+ [100,200,false] |   | 
+(1 row)
+
+select * from json_populate_record(row('x',3,'2012-12-31 15:30:56')::jpop,'{"a":[100,200,false],"x":43.2}', true) q;
+        a        | b |            c             
+-----------------+---+--------------------------
+ [100,200,false] | 3 | Mon Dec 31 15:30:56 2012
+(1 row)
+
+select * from json_populate_record(row('x',3,'2012-12-31 15:30:56')::jpop,'{"c":[100,200,false],"x":43.2}', true) q;
+ERROR:  invalid input syntax for type timestamp: "[100,200,false]"
+-- populate_recordset
+select * from json_populate_recordset(null::jpop,'[{"a":"blurfl","x":43.2},{"b":3,"c":"2012-01-20 10:42:53"}]',false) q;
+   a    | b |            c             
+--------+---+--------------------------
+ blurfl |   | 
+        | 3 | Fri Jan 20 10:42:53 2012
+(2 rows)
+
+select * from json_populate_recordset(row('def',99,null)::jpop,'[{"a":"blurfl","x":43.2},{"b":3,"c":"2012-01-20 10:42:53"}]',false) q;
+   a    | b  |            c             
+--------+----+--------------------------
+ blurfl | 99 | 
+ def    |  3 | Fri Jan 20 10:42:53 2012
+(2 rows)
+
+select * from json_populate_recordset(null::jpop,'[{"a":"blurfl","x":43.2},{"b":3,"c":"2012-01-20 10:42:53"}]',true) q;
+   a    | b |            c             
+--------+---+--------------------------
+ blurfl |   | 
+        | 3 | Fri Jan 20 10:42:53 2012
+(2 rows)
+
+select * from json_populate_recordset(row('def',99,null)::jpop,'[{"a":"blurfl","x":43.2},{"b":3,"c":"2012-01-20 10:42:53"}]',true) q;
+   a    | b  |            c             
+--------+----+--------------------------
+ blurfl | 99 | 
+ def    |  3 | Fri Jan 20 10:42:53 2012
+(2 rows)
+
+select * from json_populate_recordset(row('def',99,null)::jpop,'[{"a":[100,200,300],"x":43.2},{"a":{"z":true},"b":3,"c":"2012-01-20 10:42:53"}]',true) q;
+       a       | b  |            c             
+---------------+----+--------------------------
+ [100,200,300] | 99 | 
+ {"z":true}    |  3 | Fri Jan 20 10:42:53 2012
+(2 rows)
+
+select * from json_populate_recordset(row('def',99,null)::jpop,'[{"c":[100,200,300],"x":43.2},{"a":{"z":true},"b":3,"c":"2012-01-20 10:42:53"}]',true) q;
+ERROR:  invalid input syntax for type timestamp: "[100,200,300]"
+-- using the default use_json_as_text argument
+select * from json_populate_recordset(null::jpop,'[{"a":"blurfl","x":43.2},{"b":3,"c":"2012-01-20 10:42:53"}]') q;
+   a    | b |            c             
+--------+---+--------------------------
+ blurfl |   | 
+        | 3 | Fri Jan 20 10:42:53 2012
+(2 rows)
+
+select * from json_populate_recordset(row('def',99,null)::jpop,'[{"a":"blurfl","x":43.2},{"b":3,"c":"2012-01-20 10:42:53"}]') q;
+   a    | b  |            c             
+--------+----+--------------------------
+ blurfl | 99 | 
+ def    |  3 | Fri Jan 20 10:42:53 2012
+(2 rows)
+
+select * from json_populate_recordset(row('def',99,null)::jpop,'[{"a":[100,200,300],"x":43.2},{"a":{"z":true},"b":3,"c":"2012-01-20 10:42:53"}]') q;
+ERROR:  cannot call json_populate_recordset on a nested object
+select * from json_populate_recordset(row('def',99,null)::jpop,'[{"c":[100,200,300],"x":43.2},{"a":{"z":true},"b":3,"c":"2012-01-20 10:42:53"}]') q;
+ERROR:  cannot call json_populate_recordset on a nested object
diff --git a/src/test/regress/sql/json.sql b/src/test/regress/sql/json.sql
index 5583d65..04b22fe 100644
--- a/src/test/regress/sql/json.sql
+++ b/src/test/regress/sql/json.sql
@@ -125,3 +125,174 @@ FROM (SELECT '-Infinity'::float8 AS "float8field") q;
 -- json input
 SELECT row_to_json(q)
 FROM (SELECT '{"a":1,"b": [2,3,4,"d","e","f"],"c":{"p":1,"q":2}}'::json AS "jsonfield") q;
+
+
+-- json extraction functions
+
+CREATE TEMP TABLE test_json (
+       json_type text,
+       test_json json
+);
+
+INSERT INTO test_json VALUES
+('scalar','"a scalar"'),
+('array','["zero", "one","two",null,"four","five"]'),
+('object','{"field1":"val1","field2":"val2","field3":null}');
+
+SELECT test_json -> 'x' 
+FROM test_json
+WHERE json_type = 'scalar';
+
+SELECT test_json -> 'x' 
+FROM test_json
+WHERE json_type = 'array';
+
+SELECT test_json -> 'x' 
+FROM test_json
+WHERE json_type = 'object';
+
+SELECT test_json->'field2'
+FROM test_json
+WHERE json_type = 'object';
+
+SELECT test_json->>'field2' 
+FROM test_json
+WHERE json_type = 'object';
+
+SELECT test_json -> 2 
+FROM test_json
+WHERE json_type = 'scalar';
+
+SELECT test_json -> 2 
+FROM test_json
+WHERE json_type = 'array';
+
+SELECT test_json -> 2
+FROM test_json
+WHERE json_type = 'object';
+
+SELECT test_json->>2
+FROM test_json
+WHERE json_type = 'array';
+
+SELECT json_object_keys(test_json)
+FROM test_json
+WHERE json_type = 'scalar';
+
+SELECT json_object_keys(test_json)
+FROM test_json
+WHERE json_type = 'array';
+
+SELECT json_object_keys(test_json)
+FROM test_json
+WHERE json_type = 'object';
+
+-- nulls
+
+select (test_json->'field3') is null as expect_false
+from test_json
+where json_type = 'object';
+
+select (test_json->>'field3') is null as expect_true
+from test_json
+where json_type = 'object';
+
+select (test_json->3) is null as expect_false
+from test_json
+where json_type = 'array';
+
+select (test_json->>3) is null as expect_true
+from test_json
+where json_type = 'array';
+
+
+-- array length
+
+SELECT json_array_length('[1,2,3,{"f1":1,"f2":[5,6]},4]');
+
+SELECT json_array_length('[]');
+
+SELECT json_array_length('{"f1":1,"f2":[5,6]}');
+
+SELECT json_array_length('4');
+
+-- each
+
+select json_each('{"f1":[1,2,3],"f2":{"f3":1},"f4":null}');
+select * from json_each('{"f1":[1,2,3],"f2":{"f3":1},"f4":null,"f5":99,"f6":"stringy"}') q;
+
+select json_each_text('{"f1":[1,2,3],"f2":{"f3":1},"f4":null,"f5":"null"}');
+select * from json_each_text('{"f1":[1,2,3],"f2":{"f3":1},"f4":null,"f5":99,"f6":"stringy"}') q;
+
+-- extract_path, extract_path_as_text
+
+select json_extract_path('{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}','f4','f6');
+select json_extract_path('{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}','f2');
+select json_extract_path('{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}','f2',0::text);
+select json_extract_path('{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}','f2',1::text);
+select json_extract_path_text('{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}','f4','f6');
+select json_extract_path_text('{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}','f2');
+select json_extract_path_text('{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}','f2',0::text);
+select json_extract_path_text('{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}','f2',1::text);
+
+-- extract_path nulls
+
+select json_extract_path('{"f2":{"f3":1},"f4":{"f5":null,"f6":"stringy"}}','f4','f5') is null as expect_false;
+select json_extract_path_text('{"f2":{"f3":1},"f4":{"f5":null,"f6":"stringy"}}','f4','f5') is null as expect_true;
+select json_extract_path('{"f2":{"f3":1},"f4":[0,1,2,null]}','f4','3') is null as expect_false;
+select json_extract_path_text('{"f2":{"f3":1},"f4":[0,1,2,null]}','f4','3') is null as expect_true;
+
+-- extract_path operators
+
+select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>array['f4','f6'];
+select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>array['f2'];
+select '{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}'::json#>array['f2','0'];
+select '{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}'::json#>array['f2','1'];
+select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>>array['f4','f6'];
+select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>>array['f2'];
+select '{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}'::json#>>array['f2','0'];
+select '{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}'::json#>>array['f2','1'];
+
+-- same using array literals
+select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>'{f4,f6}';
+select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>'{f2}';
+select '{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}'::json#>'{f2,0}';
+select '{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}'::json#>'{f2,1}';
+select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>>'{f4,f6}';
+select '{"f2":{"f3":1},"f4":{"f5":99,"f6":"stringy"}}'::json#>>'{f2}';
+select '{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}'::json#>>'{f2,0}';
+select '{"f2":["f3",1],"f4":{"f5":99,"f6":"stringy"}}'::json#>>'{f2,1}';
+
+-- array_elements
+
+select json_array_elements('[1,true,[1,[2,3]],null,{"f1":1,"f2":[7,8,9]},false]');
+select * from json_array_elements('[1,true,[1,[2,3]],null,{"f1":1,"f2":[7,8,9]},false]') q;
+
+-- populate_record
+create type jpop as (a text, b int, c timestamp);
+
+select * from json_populate_record(null::jpop,'{"a":"blurfl","x":43.2}') q;
+select * from json_populate_record(row('x',3,'2012-12-31 15:30:56')::jpop,'{"a":"blurfl","x":43.2}') q;
+
+select * from json_populate_record(null::jpop,'{"a":"blurfl","x":43.2}', true) q;
+select * from json_populate_record(row('x',3,'2012-12-31 15:30:56')::jpop,'{"a":"blurfl","x":43.2}', true) q;
+
+select * from json_populate_record(null::jpop,'{"a":[100,200,false],"x":43.2}', true) q;
+select * from json_populate_record(row('x',3,'2012-12-31 15:30:56')::jpop,'{"a":[100,200,false],"x":43.2}', true) q;
+select * from json_populate_record(row('x',3,'2012-12-31 15:30:56')::jpop,'{"c":[100,200,false],"x":43.2}', true) q;
+
+-- populate_recordset
+
+select * from json_populate_recordset(null::jpop,'[{"a":"blurfl","x":43.2},{"b":3,"c":"2012-01-20 10:42:53"}]',false) q;
+select * from json_populate_recordset(row('def',99,null)::jpop,'[{"a":"blurfl","x":43.2},{"b":3,"c":"2012-01-20 10:42:53"}]',false) q;
+select * from json_populate_recordset(null::jpop,'[{"a":"blurfl","x":43.2},{"b":3,"c":"2012-01-20 10:42:53"}]',true) q;
+select * from json_populate_recordset(row('def',99,null)::jpop,'[{"a":"blurfl","x":43.2},{"b":3,"c":"2012-01-20 10:42:53"}]',true) q;
+select * from json_populate_recordset(row('def',99,null)::jpop,'[{"a":[100,200,300],"x":43.2},{"a":{"z":true},"b":3,"c":"2012-01-20 10:42:53"}]',true) q;
+select * from json_populate_recordset(row('def',99,null)::jpop,'[{"c":[100,200,300],"x":43.2},{"a":{"z":true},"b":3,"c":"2012-01-20 10:42:53"}]',true) q;
+
+-- using the default use_json_as_text argument
+
+select * from json_populate_recordset(null::jpop,'[{"a":"blurfl","x":43.2},{"b":3,"c":"2012-01-20 10:42:53"}]') q;
+select * from json_populate_recordset(row('def',99,null)::jpop,'[{"a":"blurfl","x":43.2},{"b":3,"c":"2012-01-20 10:42:53"}]') q;
+select * from json_populate_recordset(row('def',99,null)::jpop,'[{"a":[100,200,300],"x":43.2},{"a":{"z":true},"b":3,"c":"2012-01-20 10:42:53"}]') q;
+select * from json_populate_recordset(row('def',99,null)::jpop,'[{"c":[100,200,300],"x":43.2},{"a":{"z":true},"b":3,"c":"2012-01-20 10:42:53"}]') q;