Basic JSON support

Started by Itagaki Takahiro over 15 years ago · 18 messages
#1 Itagaki Takahiro
itagaki.takahiro@gmail.com
1 attachment(s)

Here is a patch for basic JSON support. It adds only these features:
* Add a "json" data type that is binary-compatible with text.
* Syntax checking on text to JSON conversion.
* json_pretty() -- print JSON tree with indentation.
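As a rough analogy for the output format json_pretty() produces (four-space indentation, one item per line, as seen in the regression test expected output), Python's standard json module behaves very similarly; this is only an illustration, not the patch's code:

```python
import json

doc = '{"booleans":[true,false],"null":null}'

# indent=4 gives four-space indentation, much like the
# patch's JSON_INDENT_STRING ("    ") in append_json().
print(json.dumps(json.loads(doc), indent=4))
```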

We have "JSON datatype (WIP) 01" item:
https://commitfest.postgresql.org/action/patch_view?id=351
but it seems too complex for me to apply all of the features
at once, especially JSONPath support. So, I'd like to submit
only basic and minimal JSON datatype support at first.

The biggest difference in my patch is that the JSON parser is
reimplemented with flex and bison to make maintenance easier.

Note that the JSON parser intentionally accepts top-level scalar
values, because they will be required when we add functions to
extract values from JSON arrays/objects. For example,
CREATE FUNCTION json_to_array(json) RETURNS json[]
might return naked scalar values in the result array.
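For what it's worth, other JSON implementations make the same choice; Python's json module, for example, also accepts top-level scalars (a minimal illustration, unrelated to the patch's flex/bison parser):

```python
import json

# Top-level scalar values are accepted, not just objects/arrays.
assert json.loads("123") == 123
assert json.loads('"text"') == "text"
assert json.loads("null") is None
assert json.loads("true") is True
```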

--
Itagaki Takahiro

Attachments:

basic_json-20100915.patch (application/octet-stream)
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 70e21b9..483abb6 100644
*** a/src/backend/utils/adt/Makefile
--- b/src/backend/utils/adt/Makefile
*************** endif
*** 18,24 ****
  OBJS = acl.o arrayfuncs.o array_userfuncs.o arrayutils.o bool.o \
  	cash.o char.o date.o datetime.o datum.o domains.o \
  	enum.o float.o format_type.o \
! 	geo_ops.o geo_selfuncs.o int.o int8.o like.o lockfuncs.o \
  	misc.o nabstime.o name.o numeric.o numutils.o \
  	oid.o oracle_compat.o pseudotypes.o rowtypes.o \
  	regexp.o regproc.o ruleutils.o selfuncs.o \
--- 18,25 ----
  OBJS = acl.o arrayfuncs.o array_userfuncs.o arrayutils.o bool.o \
  	cash.o char.o date.o datetime.o datum.o domains.o \
  	enum.o float.o format_type.o \
! 	geo_ops.o geo_selfuncs.o json.o json_parser.o json_scanner.o \
! 	int.o int8.o like.o lockfuncs.o \
  	misc.o nabstime.o name.o numeric.o numutils.o \
  	oid.o oracle_compat.o pseudotypes.o rowtypes.o \
  	regexp.o regproc.o ruleutils.o selfuncs.o \
*************** OBJS = acl.o arrayfuncs.o array_userfunc
*** 33,36 ****
--- 34,62 ----
  
  like.o: like.c like_match.c
  
+ json_parser.o: json_scanner.c
+ 
+ # Latest flex causes warnings in this file.
+ #ifeq ($(GCC),yes)
+ #json_parser.o: CFLAGS += -Wno-error
+ #endif
+ 
+ json_parser.h: json_parser.c ;
+ 
+ json_parser.c: json_parser.y
+ ifdef BISON
+ 	$(BISON) -d $(BISONFLAGS) -o $@ $<
+ else
+ 	@$(missing) bison $< $@
+ endif
+ 
+ json_scanner.c: json_scanner.l
+ ifdef FLEX
+ 	$(FLEX) $(FLEXFLAGS) -o'$@' $<
+ else
+ 	@$(missing) flex $< $@
+ endif
+ 
+ json_parser.o keywords.o parser.o: json_parser.h
+ 
  include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index ...13e5391 .
*** a/src/backend/utils/adt/json.c
--- b/src/backend/utils/adt/json.c
***************
*** 0 ****
--- 1,307 ----
+ /*-------------------------------------------------------------------------
+  *
+  * json.c
+  *	  JSON data type support.
+  *
+  * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  * $PostgreSQL$
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #include "postgres.h"
+ 
+ #include "libpq/pqformat.h"
+ #include "utils/builtins.h"
+ #include "utils/json.h"
+ 
+ #include "json_parser.h"
+ #include "json_scanner.h"
+ 
+ static Json *json_parse(const char *str, int len, bool validateOnly);
+ static void json_free(Json *node);
+ static void append_json(StringInfo buf, const Json *node, int indent);
+ static void append_indent(StringInfo buf, int indent);
+ 
+ extern int	json_yyparse(yyscan_t scanner);
+ extern void	json_yyerror(yyscan_t scanner, const char *error);
+ extern void *json_yyalloc(size_t bytes, void *yyscanner);
+ extern void *json_yyrealloc(void *ptr, size_t bytes, void *yyscanner);
+ extern void json_yyfree(void *ptr, void *yyscanner);
+ 
+ /* indent with 4 spaces */
+ #define JSON_INDENT_STRING	"    "
+ 
+ 
+ Datum
+ json_in(PG_FUNCTION_ARGS)
+ {
+ 	char	   *s = PG_GETARG_CSTRING(0);
+ 	jsontype   *json;
+ 
+ 	json = (jsontype *) cstring_to_text(s);
+ 
+ 	/*
+ 	 * Parse the data to check that it is valid JSON.  json_parse()
+ 	 * raises an ERROR if parsing fails.
+ 	 */
+ 	json_parse(VARDATA_ANY(json), VARSIZE_ANY_EXHDR(json), true);
+ 
+ 	PG_RETURN_JSON_P(json);
+ }
+ 
+ Datum
+ json_out(PG_FUNCTION_ARGS)
+ {
+ 	/* text and jsontype are binary-compatible */
+ 	return textout(fcinfo);
+ }
+ 
+ Datum
+ json_recv(PG_FUNCTION_ARGS)
+ {
+ 	StringInfo	buf = (StringInfo) PG_GETARG_POINTER(0);
+ 	jsontype   *result;
+ 	char	   *str;
+ 	int			nbytes;
+ 
+ 	str = pq_getmsgtext(buf, buf->len - buf->cursor, &nbytes);
+ 	json_parse(str, nbytes, true);
+ 
+ 	result = (jsontype *) cstring_to_text_with_len(str, nbytes);
+ 	pfree(str);
+ 	PG_RETURN_JSON_P(result);
+ }
+ 
+ Datum
+ json_send(PG_FUNCTION_ARGS)
+ {
+ 	/* text and jsontype are binary-compatible */
+ 	return textsend(fcinfo);
+ }
+ 
+ Datum
+ text_to_json(PG_FUNCTION_ARGS)
+ {
+ 	text	   *t = PG_GETARG_TEXT_PP(0);
+ 
+ 	json_parse(VARDATA_ANY(t), VARSIZE_ANY_EXHDR(t), true);
+ 
+ 	PG_RETURN_JSON_P((jsontype *) t);
+ }
+ 
+ Datum
+ json_to_text(PG_FUNCTION_ARGS)
+ {
+ 	/* text and jsontype are binary-compatible */
+ 	PG_RETURN_DATUM(PG_GETARG_DATUM(0));
+ }
+ 
+ Datum
+ json_pretty(PG_FUNCTION_ARGS)
+ {
+ 	jsontype	   *json = PG_GETARG_JSON_PP(0);
+ 	Json		   *node;
+ 	StringInfoData	buf;
+ 
+ 	node = json_parse(VARDATA_ANY(json), VARSIZE_ANY_EXHDR(json), false);
+ 	initStringInfo(&buf);
+ 	append_json(&buf, node, 0);
+ 	json_free(node);
+ 
+ 	return CStringGetTextDatum(buf.data);
+ }
+ 
+ /*
+  * json_parse -- parse and validate the JSON text.
+  *
+  * If validateOnly is true, only validates the input; otherwise it
+  * also returns the parsed JSON tree.
+  */
+ static Json *
+ json_parse(const char *str, int len, bool validateOnly)
+ {
+ 	yyscan_t		scanner;
+ 	char		   *scanbuf;
+ 	JsonParser		parser;
+ 
+ 	scanbuf = (char *) palloc(len + 2);
+ 	memcpy(scanbuf, str, len);
+ 	scanbuf[len] = scanbuf[len + 1] = '\0';
+ 
+ 	/* initialize parser and scanner */
+ 	memset(&parser, 0, sizeof(parser));
+ 	parser.validateOnly = validateOnly;
+ 	if (!validateOnly)
+ 		initStringInfo(&parser.buf);
+ 
+ 	json_yylex_init(&scanner);
+ 	json_yyset_extra(&parser, scanner);
+ 	json_yy_scan_buffer(scanbuf, len + 2, scanner);
+ 
+ 	/* parse the JSON text */
+ 	json_yyparse(scanner);
+ 
+ 	/* cleanup */
+ 	json_yylex_destroy(scanner);
+ 	pfree(scanbuf);
+ 	if (parser.buf.data)
+ 		pfree(parser.buf.data);
+ 
+ 	return parser.json;
+ }
+ 
+ /*
+  * json_free -- free JSON node tree recursively
+  */
+ static void
+ json_free(Json *node)
+ {
+ 	ListCell   *cell;
+ 
+ 	switch (node->type)
+ 	{
+ 	case JSON_NULL:
+ 	case JSON_BOOL:
+ 		break;
+ 	case JSON_NUMBER:
+ 		pfree(node->value.number);
+ 		break;
+ 	case JSON_STRING:
+ 		pfree(node->value.string);
+ 		break;
+ 	case JSON_ARRAY:
+ 		foreach (cell, node->value.array)
+ 		{
+ 			Json *item = (Json *) lfirst(cell);
+ 
+ 			json_free(item);
+ 		}
+ 		list_free(node->value.array);
+ 		break;
+ 	case JSON_OBJECT:
+ 		foreach (cell, node->value.object)
+ 		{
+ 			JsonAttr *attr = (JsonAttr *) lfirst(cell);
+ 
+ 			pfree(attr->key);
+ 			json_free(attr->value);
+ 			pfree(attr);
+ 		}
+ 		list_free(node->value.object);
+ 		break;
+ 	default:
+ 		elog(ERROR, "unexpected json type: %d", node->type);
+ 	}
+ 
+ 	pfree(node);
+ }
+ 
+ /*
+  * append_json -- append JSON node tree to string with indentation
+  */
+ static void
+ append_json(StringInfo buf, const Json *node, int indent)
+ {
+ 	ListCell   *cell;
+ 	bool		sep;
+ 
+ 	switch (node->type)
+ 	{
+ 	case JSON_NULL:
+ 		appendStringInfoString(buf, "null");
+ 		break;
+ 	case JSON_BOOL:
+ 		appendStringInfoString(buf, node->value.boolean ? "true" : "false");
+ 		break;
+ 	case JSON_NUMBER:
+ 		appendStringInfoString(buf, node->value.number);
+ 		break;
+ 	case JSON_STRING:
+ 		appendStringInfoString(buf, node->value.string);
+ 		break;
+ 	case JSON_ARRAY:
+ 		appendStringInfoString(buf, "[\n");
+ 		sep = false;
+ 		foreach (cell, node->value.array)
+ 		{
+ 			const Json *value = (const Json *) lfirst(cell);
+ 
+ 			if (sep)
+ 				appendStringInfoString(buf, ",\n");
+ 			else
+ 				sep = true;
+ 			append_indent(buf, indent + 1);
+ 			append_json(buf, value, indent + 1);
+ 		}
+ 		if (sep)
+ 			appendStringInfoChar(buf, '\n');
+ 		append_indent(buf, indent);
+ 		appendStringInfoChar(buf, ']');
+ 		break;
+ 	case JSON_OBJECT:
+ 		appendStringInfoString(buf, "{\n");
+ 		sep = false;
+ 		foreach (cell, node->value.object)
+ 		{
+ 			const JsonAttr *attr = (const JsonAttr *) lfirst(cell);
+ 			const char *key = attr->key;
+ 			const Json *value = attr->value;
+ 
+ 			if (sep)
+ 				appendStringInfoString(buf, ",\n");
+ 			else
+ 				sep = true;
+ 			append_indent(buf, indent + 1);
+ 			appendStringInfo(buf, "%s: ", key);
+ 			append_json(buf, value, indent + 1);
+ 		}
+ 		if (sep)
+ 			appendStringInfoChar(buf, '\n');
+ 		append_indent(buf, indent);
+ 		appendStringInfoChar(buf, '}');
+ 		break;
+ 	default:
+ 		elog(ERROR, "unexpected json type: %d", node->type);
+ 	}
+ }
+ 
+ static void
+ append_indent(StringInfo buf, int indent)
+ {
+ 	int		i;
+ 	for (i = 0; i < indent; i++)
+ 		appendStringInfoString(buf, JSON_INDENT_STRING);
+ }
+ 
+ void
+ json_yyerror(yyscan_t scanner, const char *error)
+ {
+ 	ereport(ERROR,
+ 			(errcode(ERRCODE_SYNTAX_ERROR),
+ 			 errmsg("syntax error in json: %s", error)));
+ }
+ 
+ void *
+ json_yyalloc(size_t bytes, void *yyscanner)
+ {
+ 	return palloc(bytes);
+ }
+ 
+ void *
+ json_yyrealloc(void *ptr, size_t bytes, void *yyscanner)
+ {
+ 	if (ptr != NULL)
+ 		return repalloc(ptr, bytes);
+ 	else
+ 		return palloc(bytes);
+ }
+ 
+ void
+ json_yyfree(void *ptr, void *yyscanner)
+ {
+ 	if (ptr != NULL)
+ 		pfree(ptr);
+ }
diff --git a/src/backend/utils/adt/json_parser.y b/src/backend/utils/adt/json_parser.y
index ...fb769e2 .
*** a/src/backend/utils/adt/json_parser.y
--- b/src/backend/utils/adt/json_parser.y
***************
*** 0 ****
--- 1,221 ----
+ %{
+ /*-------------------------------------------------------------------------
+  *
+  * json_parser.y
+  *	  Parser for JSON data types.
+  *
+  * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  * $PostgreSQL$
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #include "postgres.h"
+ #include "nodes/pg_list.h"
+ #include "utils/json.h"
+ 
+ static Json *makeJsonNode(JsonType type);
+ static Json *makeJsonNull(JsonParser *parser);
+ static Json *makeJsonBool(JsonParser *parser, bool value);
+ static Json *makeJsonNumber(JsonParser *parser, char *value);
+ static Json *makeJsonString(JsonParser *parser, char *value);
+ static Json *makeJsonArray(JsonParser *parser, List *items);
+ static Json *makeJsonObject(JsonParser *parser, List *attrs);
+ static JsonAttr *makeJsonAttr(JsonParser *parser, char *key, Json *value);
+ static List *makeList(JsonParser *parser, void *elem);
+ static List *lconsList(JsonParser *parser, void *elem, List *list);
+ 
+ #include "json_parser.h"
+ #include "json_scanner.h"
+ 
+ #define getParser()		((JsonParser *) json_yyget_extra(scanner))
+ 
+ extern void json_yyerror(yyscan_t, const char *);
+ %}
+ 
+ %defines
+ %expect 0
+ %name-prefix="json_yy"
+ %pure-parser
+ %parse-param	{yyscan_t scanner}
+ %lex-param		{yyscan_t scanner}
+ 
+ %union {
+ 	bool		boolean;
+ 	char	   *number;
+ 	char	   *string;
+ 	Json	   *node;
+ 	JsonAttr   *attr;
+ 	List	   *list;
+ }
+ 
+ %token <string>			T_JSON_STRING
+ %token <number>			T_JSON_NUMBER
+ %token <boolean>		T_JSON_TRUE T_JSON_FALSE
+ %token T_JSON_NULL
+ %token T_JSON_INVALID
+ 
+ %type <number>			number
+ %type <string>			string
+ %type <node>			json value array object 
+ %type <list>			items attrs
+ %type <attr>			attr
+ 
+ %%
+ json:
+ 	  value				{ getParser()->json = $1; }
+ ;
+ 
+ array:
+ 	  '[' ']'			{ $$ = makeJsonArray(getParser(), NIL); }
+ 	| '[' items ']'		{ $$ = makeJsonArray(getParser(), $2); }
+ ;
+ 
+ items:
+ 	  value				{ $$ = makeList(getParser(), $1); }
+ 	| value ',' items	{ $$ = lconsList(getParser(), $1, $3); }
+ ;
+ 
+ object:
+ 	  '{' '}'			{ $$ = makeJsonObject(getParser(), NIL); }
+ 	| '{' attrs '}'		{ $$ = makeJsonObject(getParser(), $2); }
+ ;
+ 
+ attrs:
+ 	  attr				{ $$ = makeList(getParser(), $1); }
+ 	| attr ',' attrs	{ $$ = lconsList(getParser(), $1, $3); }
+ ;
+ 
+ attr:
+ 	  string ':' value	{ $$ = makeJsonAttr(getParser(), $1, $3); }
+ ;
+ 
+ value:
+ 	  T_JSON_NULL		{ $$ = makeJsonNull(getParser()); }
+ 	| T_JSON_TRUE		{ $$ = makeJsonBool(getParser(), true); }
+ 	| T_JSON_FALSE		{ $$ = makeJsonBool(getParser(), false); }
+ 	| number			{ $$ = makeJsonNumber(getParser(), $1); }
+ 	| string			{ $$ = makeJsonString(getParser(), $1); }
+ 	| array				{ $$ = $1; }
+ 	| object			{ $$ = $1; }
+ ;
+ 
+ number:		T_JSON_NUMBER		{ $$ = $1; };
+ string:		T_JSON_STRING		{ $$ = $1; };
+ 
+ %%
+ static Json *
+ makeJsonNode(JsonType type)
+ {
+ 	Json *node = (Json *) palloc(sizeof(Json));
+ 	node->type = type;
+ 	return node;
+ }
+ 
+ static Json *
+ makeJsonNull(JsonParser *parser)
+ {
+ 	if (parser->validateOnly)
+ 		return NULL;
+ 	else
+ 		return makeJsonNode(JSON_NULL);
+ }
+ 
+ static Json *
+ makeJsonBool(JsonParser *parser, bool value)
+ {
+ 	if (parser->validateOnly)
+ 		return NULL;
+ 	else
+ 	{
+ 		Json *node = makeJsonNode(JSON_BOOL);
+ 		node->value.boolean = value;
+ 		return node;
+ 	}
+ }
+ 
+ static Json *
+ makeJsonNumber(JsonParser *parser, char *value)
+ {
+ 	if (parser->validateOnly)
+ 		return NULL;
+ 	else
+ 	{
+ 		Json *node = makeJsonNode(JSON_NUMBER);
+ 		node->value.number = value;
+ 		return node;
+ 	}
+ }
+ 
+ static Json *
+ makeJsonString(JsonParser *parser, char *value)
+ {
+ 	if (parser->validateOnly)
+ 		return NULL;
+ 	else
+ 	{
+ 		Json *node = makeJsonNode(JSON_STRING);
+ 		node->value.string = value;
+ 		return node;
+ 	}
+ }
+ 
+ static Json *
+ makeJsonArray(JsonParser *parser, List *items)
+ {
+ 	if (parser->validateOnly)
+ 		return NULL;
+ 	else
+ 	{
+ 		Json *node = makeJsonNode(JSON_ARRAY);
+ 		node->value.array = items;
+ 		return node;
+ 	}
+ }
+ 
+ static Json *
+ makeJsonObject(JsonParser *parser, List *attrs)
+ {
+ 	if (parser->validateOnly)
+ 		return NULL;
+ 	else
+ 	{
+ 		Json *node = makeJsonNode(JSON_OBJECT);
+ 		node->value.object = attrs;
+ 		return node;
+ 	}
+ }
+ 
+ static JsonAttr *
+ makeJsonAttr(JsonParser *parser, char *key, Json *value)
+ {
+ 	if (parser->validateOnly)
+ 		return NULL;
+ 	else
+ 	{
+ 		JsonAttr *attr = (JsonAttr *) palloc(sizeof(JsonAttr));
+ 		attr->key = key;
+ 		attr->value = value;
+ 		return attr;
+ 	}
+ }
+ 
+ static List *
+ makeList(JsonParser *parser, void *elem)
+ {
+ 	if (parser->validateOnly)
+ 		return NIL;
+ 	else
+ 		return list_make1(elem);
+ }
+ 
+ static List *
+ lconsList(JsonParser *parser, void *elem, List *list)
+ {
+ 	if (parser->validateOnly)
+ 		return NIL;
+ 	else
+ 		return lcons(elem, list);
+ }
diff --git a/src/backend/utils/adt/json_scanner.l b/src/backend/utils/adt/json_scanner.l
index ...f4ed64f .
*** a/src/backend/utils/adt/json_scanner.l
--- b/src/backend/utils/adt/json_scanner.l
***************
*** 0 ****
--- 1,88 ----
+ %{
+ /*-------------------------------------------------------------------------
+  *
+  * json_scanner.l
+  *	  Lexical scanner for JSON data types.
+  *
+  * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  * $PostgreSQL$
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #include "postgres.h"
+ #include "nodes/pg_list.h"
+ #include "utils/json.h"
+ #include "json_parser.h"
+ 
+ #define getParser()			((JsonParser *) json_yyget_extra(yyscanner))
+ %}
+ 
+ %option reentrant
+ %option bison-bridge
+ %option 8bit
+ %option never-interactive
+ %option nodefault
+ %option noinput
+ %option nounput
+ %option noyywrap
+ %option noyyalloc
+ %option noyyrealloc
+ %option noyyfree
+ %option warn
+ %option prefix="json_yy"
+ %option outfile="json_scanner.c" header-file="json_scanner.h"
+ 
+ %x str
+ 
+ %%
+ 	/* Whitespace */
+ [ \t\r\n]
+ 
+ 	/* Symbol */
+ [\[\]\{\}:,] return yytext[0];
+ 
+ 	/* Null */
+ null		{ return T_JSON_NULL; }
+ 
+ 	/* Boolean */
+ true		{ return T_JSON_TRUE; }
+ false		{ return T_JSON_FALSE; }
+ 
+ 	/* Number */
+ 0|-?([1-9][0-9]*(\.[0-9]+)?|0\.[0-9]+)([Ee][+-]?[0-9]+)?	{
+ 		JsonParser *parser = getParser();
+ 		if (!parser->validateOnly)
+ 			yylval->number = pstrdup(yytext);
+ 		return T_JSON_NUMBER;
+ 	}
+ 
+ 	/* String */
+ \"	{
+ 		JsonParser *parser = getParser();
+ 		BEGIN str;
+ 		if (!parser->validateOnly)
+ 			appendStringInfoString(&parser->buf, yytext);
+ 	}
+ <str>([^\"\\[:cntrl:]]+|\\u[0-9A-Fa-f]{4}|\\[\"\\/bfnrt]) {
+ 		JsonParser *parser = getParser();
+ 		if (!parser->validateOnly)
+ 			appendStringInfoString(&parser->buf, yytext);
+ 	}
+ <str>\" {
+ 		JsonParser *parser = getParser();
+ 		if (!parser->validateOnly)
+ 		{
+ 			appendStringInfoString(&parser->buf, yytext);
+ 			yylval->string = pstrdup(parser->buf.data);
+ 			resetStringInfo(&parser->buf);
+ 		}
+ 		BEGIN(INITIAL);
+ 		return T_JSON_STRING;
+ 	}
+ <str>.|\n	{ return T_JSON_INVALID; }	/* unterminated string */
+ 
+ 	/* Invalid */
+ . return T_JSON_INVALID;
diff --git a/src/include/catalog/pg_cast.h b/src/include/catalog/pg_cast.h
index f53a39a..569dcaa 100644
*** a/src/include/catalog/pg_cast.h
--- b/src/include/catalog/pg_cast.h
*************** DATA(insert (  869	 25  730 a f ));
*** 320,325 ****
--- 320,327 ----
  DATA(insert (	16	 25 2971 a f ));
  DATA(insert (  142	 25    0 a b ));
  DATA(insert (	25	142 2896 e f ));
+ DATA(insert (  321	 25    0 a b ));
+ DATA(insert (	25	321 3830 e f ));
  
  /*
   * Cross-category casts to and from VARCHAR
*************** DATA(insert (  869 1043  730 a f ));
*** 331,336 ****
--- 333,340 ----
  DATA(insert (	16 1043 2971 a f ));
  DATA(insert (  142 1043    0 a b ));
  DATA(insert ( 1043	142 2896 e f ));
+ DATA(insert (  321 1043    0 a b ));
+ DATA(insert ( 1043	321 3830 e f ));
  
  /*
   * Cross-category casts to and from BPCHAR
*************** DATA(insert (  869 1042  730 a f ));
*** 342,347 ****
--- 346,353 ----
  DATA(insert (	16 1042 2971 a f ));
  DATA(insert (  142 1042    0 a b ));
  DATA(insert ( 1042	142 2896 e f ));
+ DATA(insert (  321 1042    0 a b ));
+ DATA(insert ( 1042	321 3830 e f ));
  
  /*
   * Length-coercion functions
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 427f5ea..b3d3da5 100644
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DESCR("determine if a string is well for
*** 4457,4462 ****
--- 4457,4478 ----
  DATA(insert OID = 3053 (  xml_is_well_formed_content	 PGNSP PGUID 12 1 0 0 f f f t f i 1 0 16 "25" _null_ _null_ _null_ _null_ xml_is_well_formed_content _null_ _null_ _null_ ));
  DESCR("determine if a string is well formed XML content");
  
+ /* JSON support */
+ DATA(insert OID = 3826 (  json_in			PGNSP PGUID 12 1 0 0 f f f t f i 1 0 321 "2275" _null_ _null_ _null_ _null_ json_in _null_ _null_ _null_ ));
+ DESCR("I/O");
+ DATA(insert OID = 3827 (  json_out			PGNSP PGUID 12 1 0 0 f f f t f i 1 0 2275 "321" _null_ _null_ _null_ _null_ json_out _null_ _null_ _null_ ));
+ DESCR("I/O");
+ DATA(insert OID = 3828 (  json_recv			PGNSP PGUID 12 1 0 0 f f f t f i 1 0 321 "2281" _null_ _null_ _null_ _null_ json_recv _null_ _null_ _null_ ));
+ DESCR("I/O");
+ DATA(insert OID = 3829 (  json_send			PGNSP PGUID 12 1 0 0 f f f t f i 1 0 17 "321" _null_ _null_ _null_ _null_ json_send _null_ _null_ _null_ ));
+ DESCR("I/O");
+ DATA(insert OID = 3830 (  json				PGNSP PGUID 12 1 0 0 f f f t f s 1 0 321 "25" _null_ _null_ _null_ _null_ text_to_json _null_ _null_ _null_ ));
+ DESCR("perform a non-validating parse of a character string to produce a JSON value");
+ DATA(insert OID = 3831 (  text				PGNSP PGUID 12 1 0 0 f f f t f i 1 0 25 "321" _null_ _null_ _null_ _null_ json_to_text _null_ _null_ _null_ ));
+ DESCR("serialize a JSON value to a character string");
+ DATA(insert OID = 3832 (  json_pretty		PGNSP PGUID 12 1 0 0 f f f t f i 1 0 25 "321" _null_ _null_ _null_ _null_ json_pretty _null_ _null_ _null_ ));
+ DESCR("convert a JSON to a human readable text");
+ 
  /* uuid */
  DATA(insert OID = 2952 (  uuid_in		   PGNSP PGUID 12 1 0 0 f f f t f i 1 0 2950 "2275" _null_ _null_ _null_ _null_ uuid_in _null_ _null_ _null_ ));
  DESCR("I/O");
diff --git a/src/include/catalog/pg_type.h b/src/include/catalog/pg_type.h
index 05fc974..6b110bf 100644
*** a/src/include/catalog/pg_type.h
--- b/src/include/catalog/pg_type.h
*************** DESCR("storage manager");
*** 354,359 ****
--- 354,364 ----
  
  /* OIDS 300 - 399 */
  
+ DATA(insert OID = 321 ( json	   PGNSP PGUID -1 f b U f t \054 0 0 322 json_in json_out json_recv json_send - - - i x f 0 -1 0 _null_ _null_ ));
+ DESCR("JSON content");
+ #define JSONOID 321
+ DATA(insert OID = 322 ( _json	   PGNSP PGUID -1 f b A f t \054 0 321 0 array_in array_out array_recv array_send - - - i x f 0 -1 0 _null_ _null_ ));
+ 
  /* OIDS 400 - 499 */
  
  /* OIDS 500 - 599 */
diff --git a/src/include/utils/json.h b/src/include/utils/json.h
index ...952d1bc .
*** a/src/include/utils/json.h
--- b/src/include/utils/json.h
***************
*** 0 ****
--- 1,75 ----
+ /*-------------------------------------------------------------------------
+  *
+  * json.h
+  *	  Declarations for JSON data type support.
+  *
+  * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  * $PostgreSQL$
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #ifndef JSON_H
+ #define JSON_H
+ 
+ #include "fmgr.h"
+ #include "nodes/execnodes.h"
+ #include "nodes/primnodes.h"
+ 
+ typedef struct varlena jsontype;
+ 
+ #define DatumGetJsonP(X)		((jsontype *) PG_DETOAST_DATUM(X))
+ #define DatumGetJsonPP(X)		((jsontype *) PG_DETOAST_DATUM_PACKED(X))
+ #define JsonPGetDatum(X)		PointerGetDatum(X)
+ 
+ #define PG_GETARG_JSON_P(n)		DatumGetJsonP(PG_GETARG_DATUM(n))
+ #define PG_GETARG_JSON_PP(n)	DatumGetJsonPP(PG_GETARG_DATUM(n))
+ #define PG_RETURN_JSON_P(x)		PG_RETURN_POINTER(x)
+ 
+ extern Datum json_in(PG_FUNCTION_ARGS);
+ extern Datum json_out(PG_FUNCTION_ARGS);
+ extern Datum json_recv(PG_FUNCTION_ARGS);
+ extern Datum json_send(PG_FUNCTION_ARGS);
+ extern Datum text_to_json(PG_FUNCTION_ARGS);
+ extern Datum json_to_text(PG_FUNCTION_ARGS);
+ extern Datum json_pretty(PG_FUNCTION_ARGS);
+ 
+ typedef enum JsonType
+ {
+ 	JSON_NULL,
+ 	JSON_BOOL,
+ 	JSON_NUMBER,
+ 	JSON_STRING,
+ 	JSON_ARRAY,
+ 	JSON_OBJECT
+ } JsonType;
+ 
+ typedef struct Json
+ {
+ 	JsonType	type;
+ 	union
+ 	{
+ 		bool	boolean;
+ 		char   *number;
+ 		char   *string;
+ 		List   *array;		/* a list of Json */
+ 		List   *object;		/* a list of JsonAttr */
+ 	} value;
+ } Json;
+ 
+ typedef struct JsonAttr
+ {
+ 	char	   *key;
+ 	Json	   *value;
+ } JsonAttr;
+ 
+ typedef struct JsonParser
+ {
+ 	Json		   *json;
+ 	StringInfoData	buf;
+ 	bool			validateOnly;
+ } JsonParser;
+ 
+ #endif   /* JSON_H */
diff --git a/src/test/regress/expected/json.out b/src/test/regress/expected/json.out
index ...1237714 .
*** a/src/test/regress/expected/json.out
--- b/src/test/regress/expected/json.out
***************
*** 0 ****
--- 1,39 ----
+ CREATE TABLE jsontest (
+     id int,
+     data json
+ );
+ INSERT INTO jsontest VALUES (1, 'null');
+ INSERT INTO jsontest VALUES (2, '{"booleans":[true,false],"null":null}');
+ INSERT INTO jsontest VALUES (3, E'[\n123.456,\n"string",\n{"name":"object"}\n]');
+ INSERT INTO jsontest VALUES (101, 'wrong');
+ ERROR:  syntax error in json: syntax error
+ LINE 1: INSERT INTO jsontest VALUES (101, 'wrong');
+                                           ^
+ INSERT INTO jsontest VALUES (101, '[wrong');
+ ERROR:  syntax error in json: syntax error
+ LINE 1: INSERT INTO jsontest VALUES (101, '[wrong');
+                                           ^
+ INSERT INTO jsontest VALUES (101, '{"wrong":123}}');
+ ERROR:  syntax error in json: syntax error
+ LINE 1: INSERT INTO jsontest VALUES (101, '{"wrong":123}}');
+                                           ^
+ SELECT *, json_pretty(data) FROM jsontest;
+  id |                 data                  |       json_pretty        
+ ----+---------------------------------------+--------------------------
+   1 | null                                  | null
+   2 | {"booleans":[true,false],"null":null} | {                       +
+     |                                       |     "booleans": [       +
+     |                                       |         true,           +
+     |                                       |         false           +
+     |                                       |     ],                  +
+     |                                       |     "null": null        +
+     |                                       | }
+   3 | [                                    +| [                       +
+     | 123.456,                             +|     123.456,            +
+     | "string",                            +|     "string",           +
+     | {"name":"object"}                    +|     {                   +
+     | ]                                     |         "name": "object"+
+     |                                       |     }                   +
+     |                                       | ]
+ (3 rows)
+ 
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index 4703d49..ca6ff28 100644
*** a/src/test/regress/expected/opr_sanity.out
--- b/src/test/regress/expected/opr_sanity.out
*************** WHERE c.castfunc = p.oid AND
*** 402,407 ****
--- 402,409 ----
  -- texttoxml(), which does an XML syntax check.
  -- As of 9.1, this finds the cast from pg_node_tree to text, which we
  -- intentionally do not provide a reverse pathway for.
+ -- Also, this finds the casts from json to text, varchar, and bpchar,
+ -- for the same reason as xml.
  SELECT castsource::regtype, casttarget::regtype, castfunc, castcontext
  FROM pg_cast c
  WHERE c.castmethod = 'b' AND
*************** WHERE c.castmethod = 'b' AND
*** 416,424 ****
   pg_node_tree      | text              |        0 | i
   cidr              | inet              |        0 | i
   xml               | text              |        0 | a
   xml               | character varying |        0 | a
   xml               | character         |        0 | a
! (7 rows)
  
  -- **************** pg_operator ****************
  -- Look for illegal values in pg_operator fields.
--- 418,429 ----
   pg_node_tree      | text              |        0 | i
   cidr              | inet              |        0 | i
   xml               | text              |        0 | a
+  json              | text              |        0 | a
   xml               | character varying |        0 | a
+  json              | character varying |        0 | a
   xml               | character         |        0 | a
!  json              | character         |        0 | a
! (10 rows)
  
  -- **************** pg_operator ****************
  -- Look for illegal values in pg_operator fields.
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 191d1fe..e056bd2 100644
*** a/src/test/regress/parallel_schedule
--- b/src/test/regress/parallel_schedule
*************** test: select_views portals_p2 foreign_ke
*** 91,97 ****
  # NB: temp.sql does a reconnect which transiently uses 2 connections,
  # so keep this parallel group to at most 19 tests
  # ----------
! test: plancache limit plpgsql copy2 temp domain rangefuncs prepare without_oid conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
  
  # run stats by itself because its delay may be insufficient under heavy load
  test: stats
--- 91,97 ----
  # NB: temp.sql does a reconnect which transiently uses 2 connections,
  # so keep this parallel group to at most 19 tests
  # ----------
! test: plancache limit plpgsql copy2 temp domain rangefuncs prepare without_oid conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml json
  
  # run stats by itself because its delay may be insufficient under heavy load
  test: stats
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 80a9881..3965bef 100644
*** a/src/test/regress/serial_schedule
--- b/src/test/regress/serial_schedule
*************** test: returning
*** 123,126 ****
--- 123,127 ----
  test: largeobject
  test: with
  test: xml
+ test: json
  test: stats
diff --git a/src/test/regress/sql/json.sql b/src/test/regress/sql/json.sql
index ...76a1d06 .
*** a/src/test/regress/sql/json.sql
--- b/src/test/regress/sql/json.sql
***************
*** 0 ****
--- 1,13 ----
+ CREATE TABLE jsontest (
+     id int,
+     data json
+ );
+ 
+ INSERT INTO jsontest VALUES (1, 'null');
+ INSERT INTO jsontest VALUES (2, '{"booleans":[true,false],"null":null}');
+ INSERT INTO jsontest VALUES (3, E'[\n123.456,\n"string",\n{"name":"object"}\n]');
+ INSERT INTO jsontest VALUES (101, 'wrong');
+ INSERT INTO jsontest VALUES (101, '[wrong');
+ INSERT INTO jsontest VALUES (101, '{"wrong":123}}');
+ 
+ SELECT *, json_pretty(data) FROM jsontest;
diff --git a/src/test/regress/sql/opr_sanity.sql b/src/test/regress/sql/opr_sanity.sql
index 0d084a1..fb71bb2 100644
*** a/src/test/regress/sql/opr_sanity.sql
--- b/src/test/regress/sql/opr_sanity.sql
*************** WHERE c.castfunc = p.oid AND
*** 320,325 ****
--- 320,327 ----
  
  -- As of 9.1, this finds the cast from pg_node_tree to text, which we
  -- intentionally do not provide a reverse pathway for.
+ -- Also, this finds the casts from json to text, varchar, and bpchar,
+ -- for the same reason as xml.
  
  SELECT castsource::regtype, casttarget::regtype, castfunc, castcontext
  FROM pg_cast c
#2 David E. Wheeler
david@kineticode.com
In reply to: Itagaki Takahiro (#1)
Re: Basic JSON support

On Sep 14, 2010, at 7:32 PM, Itagaki Takahiro wrote:

Here is a patch for basic JSON support. It adds only those features:
* Add "json" data type, that is binary-compatible with text.
* Syntax checking on text to JSON conversion.
* json_pretty() -- print JSON tree with indentation.

We have "JSON datatype (WIP) 01" item:
https://commitfest.postgresql.org/action/patch_view?id=351
but it seems too complex for me to apply all of the feature
at once, especially JSON-Path support. So, I'd like to submit
only basic and minimal JSON datatype support at first.

So should this be added to the commitfest, or replace this one?

Best,

David

#3 Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: David E. Wheeler (#2)
Re: Basic JSON support

On Thu, Sep 16, 2010 at 1:45 AM, David E. Wheeler <david@kineticode.com> wrote:

We have "JSON datatype (WIP) 01" item:
 https://commitfest.postgresql.org/action/patch_view?id=351
but it seems too complex for me to apply all of the feature
at once, especially JSON-Path support. So, I'd like to submit
only basic and minimal JSON datatype support at first.

So should this be added to the commitfest, or replace this one?

I added my patch; the previous one will be returned with feedback
if its author doesn't post a new version adjusted to the existing comments.

--
Itagaki Takahiro

#4 Robert Haas
robertmhaas@gmail.com
In reply to: Itagaki Takahiro (#3)
Re: Basic JSON support

On Wed, Sep 15, 2010 at 7:23 PM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:

On Thu, Sep 16, 2010 at 1:45 AM, David E. Wheeler <david@kineticode.com> wrote:

We have "JSON datatype (WIP) 01" item:
 https://commitfest.postgresql.org/action/patch_view?id=351
but it seems too complex for me to apply all of the feature
at once, especially JSON-Path support. So, I'd like to submit
only basic and minimal JSON datatype support at first.

So should this be added to the commitfest, or replace this one?

I added my patch, but the previous one will be returned with feedback
if the author doesn't have a new version adjusted to existing comments.

Did you extract this from his work, or is this completely independent?
I'm a bit disinclined to say we should just toss overboard all the
work that's already been done. Why did you write a new patch?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#5Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Robert Haas (#4)
Re: Basic JSON support

On Thu, Sep 16, 2010 at 9:44 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Did you extract this from his work, or is this completely independent?
 I'm a bit disinclined to say we should just toss overboard all the
work that's already been done.  Why did you write a new patch?

I read his patch and was inspired by the work, but the code is almost
entirely rewritten. As I said in my previous comment,
http://archives.postgresql.org/pgsql-hackers/2010-08/msg01773.php
I think it's not a good idea to develop the JSON datatype based on the
complex node links. The point of my patch is to simplify it.

I'd like to encourage use of the simplified structure to implement
his other work, including JSONPath.

--
Itagaki Takahiro

#6Robert Haas
robertmhaas@gmail.com
In reply to: Itagaki Takahiro (#5)
Re: Basic JSON support

On Wed, Sep 15, 2010 at 9:53 PM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:

On Thu, Sep 16, 2010 at 9:44 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Did you extract this from his work, or is this completely independent?
 I'm a bit disinclined to say we should just toss overboard all the
work that's already been done.  Why did you write a new patch?

I read his patch and was inspired by the work, but the code is almost
entirely rewritten. As I said in my previous comment,
 http://archives.postgresql.org/pgsql-hackers/2010-08/msg01773.php
I think it's not a good idea to develop the JSON datatype based on the
complex node links. The point of my patch is to simplify it.

I'd like to encourage use of the simplified structure to implement
his other work, including JSONPath.

I think that if we make a habit of rewriting the contributions of
first-time contributors in toto, we will have fewer second-time
contributors. I think it would have been a good idea to discuss this
on the list before you went and did it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#7Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#6)
Re: Basic JSON support

I think that if we make a habit of rewriting the contributions of
first-time contributors in toto, we will have fewer second-time
contributors. I think it would have been a good idea to discuss this
on the list before you went and did it.

To be fair to Itagaki-san, he DID ask about the status of the SOC
project, and didn't get much of an answer.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#8Joseph Adams
joeyadams3.14159@gmail.com
In reply to: Itagaki Takahiro (#1)
1 attachment(s)
Re: Basic JSON support

On Tue, Sep 14, 2010 at 10:32 PM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:

Here is a patch for basic JSON support. It adds only those features:
 * Add "json" data type, that is binary-compatible with text.
 * Syntax checking on text to JSON conversion.
 * json_pretty() -- print JSON tree with indentation.

We have "JSON datatype (WIP) 01" item:
 https://commitfest.postgresql.org/action/patch_view?id=351
but it seems too complex for me to apply all of the feature
at once, especially JSON-Path support. So, I'd like to submit
only basic and minimal JSON datatype support at first.

The most different point of my patch is that JSON parser is
reimplemented with flex and bison to make maintenance easier.

Note that the JSON parser accept top-level scalar values
intensionally, because of requirement when we add functions
to extract values from JSON array/object. For example,
 CREATE FUNCTION json_to_array(json) RETURNS json[]
might return naked scalar values in the result array.

--
Itagaki Takahiro

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

I have written a patch that amends basic_json-20100915.patch. It
does the following:

* Fixes bugs in the lexer (namely, ' -0 '::JSON was not accepted, and
' """ '::JSON was accepted).
* Adds a json_validate() function.
* Adds test strings from original JSON patch along with some new ones.
* Tweaks the error message for invalid JSON input.
* Fixes the Makefile so json.c won't fail to build because
json_parser.h and json_scanner.h haven't been built yet.

In the lexer, the following regular expression for parsing a number
was incorrect:

0|-?([1-9][0-9]*(\.[0-9]+)?|0\.[0-9]+)([Ee][+-]?[0-9]+)?

This regex doesn't accept '-0', but that's a valid JSON number
according to the standard.
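As a quick cross-check of the two rules, here is a sketch using Python's `re` module as a stand-in for the flex lexer (the regexes are copied from the old and fixed scanner rules; the `full` helper is mine, not part of either patch):

```python
import re

# OLD is the original lexer rule; NEW is the corrected rule from the
# fixed-lexer patch.
OLD = re.compile(r'0|-?([1-9][0-9]*(\.[0-9]+)?|0\.[0-9]+)([Ee][+-]?[0-9]+)?')
NEW = re.compile(r'-?(0|[1-9][0-9]*)(\.[0-9]+)?([Ee][+-]?[0-9]+)?')

def full(rx, s):
    """True if the regex matches the whole token."""
    return rx.fullmatch(s) is not None

# '-0' is a valid JSON number, but the old rule rejects it.
assert not full(OLD, '-0') and full(NEW, '-0')

# Both rules agree on ordinary numbers...
for s in ('0', '1234', '-1.5', '3e+1', '0.5'):
    assert full(OLD, s) and full(NEW, s)

# ...and both still reject non-JSON number forms.
for s in ('00', '.5', '3.', '+1'):
    assert not full(OLD, s) and not full(NEW, s)
```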

Also, the exclusive state <str> didn't have an <<EOF>> rule, causing
the lexer to treat all characters from an unclosed quote to EOF as
nothing at all.
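To illustrate the expected behavior (using Python's json module here only as a reference parser, not the patched lexer), an unterminated quote anywhere in the input should invalidate the whole document rather than be silently swallowed:

```python
import json

# A quote opened but never closed should make the input invalid,
# no matter where it appears.
for doc in ('"abc', '["hi', '{"k": "v'):
    try:
        json.loads(doc)
        valid = True
    except json.JSONDecodeError:
        valid = False
    assert not valid
```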

A less important issue with the lexer is that it doesn't allow the
control character \x7F to appear in strings unescaped. The JSON RFC
doesn't mention \x7F as a control character, and many implementations
of JSON and JavaScript don't treat it as one either.
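The same point can be checked against another widely used implementation; Python's json module (again used purely as a reference, not part of the patch) accepts an unescaped U+007F while still rejecting raw C0 controls:

```python
import json

# RFC 4627 only requires escaping for U+0000..U+001F; U+007F (DEL) is
# outside that range, so the parser accepts it unescaped.
assert json.loads('"\x7f"') == '\x7f'

# A raw C0 control character (here a literal tab) is still rejected.
try:
    json.loads('"\t"')
    rejected = False
except json.JSONDecodeError:
    rejected = True
assert rejected
```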

I went ahead and added json_validate() now because it's useful for
testing (my test strings use it).

I made the error message say "ERROR: invalid input syntax for JSON"
rather than the redundant "ERROR: syntax error in json: syntax
error". However, the specific parsing error (we haven't defined any
yet) is simply ignored by json_validate and company.

As for the Makefile, I think the reason you weren't getting build
problems but I was is that I normally build with make -j2 (two jobs
in parallel), which tends to expose missing dependencies in makefile rules.

Here's one thing I'm worried about: the bison/flex code in your patch
looks rather similar to the code in
http://www.jsonlint.com/bin/jsonval.tgz , which is licensed under the
GPL. In particular, the incorrect number regex I discussed above can
also be found in jsonval verbatim. However, because there are a lot
of differences in both the bison and flex code now, I'm not sure
they're close enough to be "copied", but I am not a lawyer. It might
be a good idea to contact Ben Spencer and ask him for permission to
license our modified version of the code under PostgreSQL's more
relaxed license, just to be on the safe side.

Finally, could you post a patch with your latest work (with or without
my contribution) so we can tinker with it more? Thanks!

Joey Adams

Attachments:

json_fixed_lexer-20100919.difftext/x-patch; charset=US-ASCII; name=json_fixed_lexer-20100919.diffDownload
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 483abb6..2c035b1 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -50,6 +50,8 @@ else
 	@$(missing) bison $< $@
 endif
 
+json_scanner.h: json_scanner.c ;
+
 json_scanner.c: json_scanner.l
 ifdef FLEX
 	$(FLEX) $(FLEXFLAGS) -o'$@' $<
@@ -59,4 +61,6 @@ endif
 
 json_parser.o keywords.o parser.o: json_parser.h
 
+json.c: json_parser.h json_scanner.h
+
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index 13e5391..9a3979e 100644
--- a/src/backend/utils/adt/json.c
+++ b/src/backend/utils/adt/json.c
@@ -20,7 +20,9 @@
 #include "json_parser.h"
 #include "json_scanner.h"
 
-static Json *json_parse(const char *str, int len, bool validateOnly);
+static Json *json_parse(const char *str, int len);
+static bool json_validate(const char *str, int len);
+static Json *json_parse_or_validate(const char *str, int len, bool *validate);
 static void json_free(Json *node);
 static void append_json(StringInfo buf, const Json *node, int indent);
 static void append_indent(StringInfo buf, int indent);
@@ -43,11 +45,12 @@ json_in(PG_FUNCTION_ARGS)
 
 	json = (jsontype *) cstring_to_text(s);
 
-	/*
-	 * Parse the data to check if it is a JSON data.  Assume that
-	 * ERROR occurred if parsing failed.
-	 */
-	json_parse(VARDATA_ANY(json), VARSIZE_ANY_EXHDR(json), true);
+	if (!json_validate(VARDATA_ANY(json), VARSIZE_ANY_EXHDR(json)))
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+				 errmsg("invalid input syntax for JSON")));
+	}
 
 	PG_RETURN_JSON_P(json);
 }
@@ -68,7 +71,12 @@ json_recv(PG_FUNCTION_ARGS)
 	int			nbytes;
 
 	str = pq_getmsgtext(buf, buf->len - buf->cursor, &nbytes);
-	json_parse(str, nbytes, true);
+	if (!json_validate(str, nbytes))
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
+				 errmsg("invalid input syntax for JSON")));
+	}
 
 	result = (jsontype *) cstring_to_text_with_len(str, nbytes);
 	pfree(str);
@@ -82,12 +90,31 @@ json_send(PG_FUNCTION_ARGS)
 	return textsend(fcinfo);
 }
 
+/* json_validate(text).  Renamed to avoid clashing
+ * with the C function json_validate. */
+PG_FUNCTION_INFO_V1(json_validate_f);
+Datum		json_validate_f(PG_FUNCTION_ARGS);
+Datum
+json_validate_f(PG_FUNCTION_ARGS)
+{
+	jsontype   *t = PG_GETARG_JSON_P(0);
+	bool		ret;
+
+	ret = json_validate(VARDATA_ANY(t), VARSIZE_ANY_EXHDR(t));
+	PG_RETURN_BOOL(ret);
+}
+
 Datum
 text_to_json(PG_FUNCTION_ARGS)
 {
 	text	   *t = PG_GETARG_TEXT_PP(0);
 
-	json_parse(VARDATA_ANY(t), VARSIZE_ANY_EXHDR(t), true);
+	if (!json_validate(VARDATA_ANY(t), VARSIZE_ANY_EXHDR(t)))
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+				 errmsg("invalid input syntax for JSON")));
+	}
 
 	PG_RETURN_JSON_P((jsontype *) t);
 }
@@ -106,7 +133,7 @@ json_pretty(PG_FUNCTION_ARGS)
 	Json		   *node;
 	StringInfoData	buf;
 
-	node = json_parse(VARDATA_ANY(json), VARSIZE_ANY_EXHDR(json), false);
+	node = json_parse(VARDATA_ANY(json), VARSIZE_ANY_EXHDR(json));
 	initStringInfo(&buf);
 	append_json(&buf, node, 0);
 	json_free(node);
@@ -115,17 +142,42 @@ json_pretty(PG_FUNCTION_ARGS)
 }
 
 /*
- * json_parse -- parse and validate the JSON text.
+ * json_parse -- parse a JSON text.
+ *
+ * Returns NULL if the input is not valid JSON.
+ */
+static Json *
+json_parse(const char *str, int len)
+{
+	return json_parse_or_validate(str, len, NULL);
+}
+
+/*
+ * json_validate -- validate a JSON text.
+ */
+static bool
+json_validate(const char *str, int len)
+{
+	bool valid;
+	json_parse_or_validate(str, len, &valid);
+	return valid;
+}
+
+/*
+ * Helper for json_parse and json_validate.
  *
- * Only validates the input when validateOnly is true, or returns JSON
- * tree on false.
+ * If validate is NULL, this function returns a parse tree
+ *      (or NULL if the JSON was invalid).
+ * If validate is not NULL, this function passes a boolean through
+ *      *validate to indicate whether the JSON was valid.
  */
 static Json *
-json_parse(const char *str, int len, bool validateOnly)
+json_parse_or_validate(const char *str, int len, bool *validate)
 {
 	yyscan_t		scanner;
 	char		   *scanbuf;
 	JsonParser		parser;
+	int				rc;
 
 	scanbuf = (char *) palloc(len + 2);
 	memcpy(scanbuf, str, len);
@@ -133,8 +185,8 @@ json_parse(const char *str, int len, bool validateOnly)
 
 	/* initialize parser and scanner */
 	memset(&parser, 0, sizeof(parser));
-	parser.validateOnly = validateOnly;
-	if (!validateOnly)
+	parser.validateOnly = !!validate;
+	if (!validate)
 		initStringInfo(&parser.buf);
 
 	json_yylex_init(&scanner);
@@ -142,7 +194,7 @@ json_parse(const char *str, int len, bool validateOnly)
 	json_yy_scan_buffer(scanbuf, len + 2, scanner);
 
 	/* parse the JSON text */
-	json_yyparse(scanner);
+	rc = json_yyparse(scanner);
 
 	/* cleanup */
 	json_yylex_destroy(scanner);
@@ -150,6 +202,9 @@ json_parse(const char *str, int len, bool validateOnly)
 	if (parser.buf.data)
 		pfree(parser.buf.data);
 
+	if (validate)
+		*validate = (rc == 0);
+
 	return parser.json;
 }
 
@@ -279,9 +334,7 @@ append_indent(StringInfo buf, int indent)
 void
 json_yyerror(yyscan_t scanner, const char *error)
 {
-	ereport(ERROR,
-			(errcode(ERRCODE_SYNTAX_ERROR),
-			 errmsg("syntax error in json: %s", error)));
+	
 }
 
 void *
diff --git a/src/backend/utils/adt/json_scanner.l b/src/backend/utils/adt/json_scanner.l
index f4ed64f..acc0c6d 100644
--- a/src/backend/utils/adt/json_scanner.l
+++ b/src/backend/utils/adt/json_scanner.l
@@ -52,7 +52,7 @@ true		{ return T_JSON_TRUE; }
 false		{ return T_JSON_FALSE; }
 
 	/* Number */
-0|-?([1-9][0-9]*(\.[0-9]+)?|0\.[0-9]+)([Ee][+-]?[0-9]+)?	{
+-?(0|[1-9][0-9]*)(\.[0-9]+)?([Ee][+-]?[0-9]+)?	{
 		JsonParser *parser = getParser();
 		if (!parser->validateOnly)
 			yylval->number = pstrdup(yytext);
@@ -66,7 +66,7 @@ false		{ return T_JSON_FALSE; }
 		if (!parser->validateOnly)
 			appendStringInfoString(&parser->buf, yytext);
 	}
-<str>([^\"\\[:cntrl:]]+|\\u[0-9A-Fa-f]{4}|\\[\"\\/bfnrt]) {
+<str>([^\"\\\x00-\x1F]+|\\u[0-9A-Fa-f]{4}|\\[\"\\/bfnrt]) {
 		JsonParser *parser = getParser();
 		if (!parser->validateOnly)
 			appendStringInfoString(&parser->buf, yytext);
@@ -82,7 +82,9 @@ false		{ return T_JSON_FALSE; }
 		BEGIN(INITIAL);
 		return T_JSON_STRING;
 	}
-<str>.|\n	{ return T_JSON_INVALID; }	/* unterminated string */
+		/* unterminated string */
+<str>.|\n		{ return T_JSON_INVALID; }
+<str><<EOF>>	{ return T_JSON_INVALID; }
 
 	/* Invalid */
 . return T_JSON_INVALID;
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index b3d3da5..ae49f53 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -4472,6 +4472,7 @@ DATA(insert OID = 3831 (  text				PGNSP PGUID 12 1 0 0 f f f t f i 1 0 25 "321"
 DESCR("serialize a JSON value to a character string");
 DATA(insert OID = 3832 (  json_pretty		PGNSP PGUID 12 1 0 0 f f f t f i 1 0 25 "321" _null_ _null_ _null_ _null_ json_pretty _null_ _null_ _null_ ));
 DESCR("convert a JSON to a human readable text");
+DATA(insert OID = 3833 (  json_validate		PGNSP PGUID 12 1 0 0 f f f t f i 1 0 16	 "25"		  _null_ _null_ _null_					_null_ json_validate_f _null_ _null_ _null_ ));
 
 /* uuid */
 DATA(insert OID = 2952 (  uuid_in		   PGNSP PGUID 12 1 0 0 f f f t f i 1 0 2950 "2275" _null_ _null_ _null_ _null_ uuid_in _null_ _null_ _null_ ));
diff --git a/src/test/regress/expected/json.out b/src/test/regress/expected/json.out
index 1237714..6bf7a9f 100644
--- a/src/test/regress/expected/json.out
+++ b/src/test/regress/expected/json.out
@@ -1,3 +1,365 @@
+-- Get our json_test_strings table.
+SET client_min_messages = warning;
+\set ECHO none
+RESET client_min_messages;
+-- Trivial cast-to-JSON tests
+SELECT '[]'::JSON;
+ json 
+------
+ []
+(1 row)
+
+SELECT '['::JSON;
+ERROR:  invalid input syntax for JSON
+LINE 1: SELECT '['::JSON;
+               ^
+SELECT '[1,2,3]'::JSON;
+  json   
+---------
+ [1,2,3]
+(1 row)
+
+SELECT '[1,2,3]'::JSON::TEXT;
+  text   
+---------
+ [1,2,3]
+(1 row)
+
+SELECT '[1,2,3  ]'::JSON;
+   json    
+-----------
+ [1,2,3  ]
+(1 row)
+
+SELECT '[1,2,3  ,4]'::JSON;
+    json     
+-------------
+ [1,2,3  ,4]
+(1 row)
+
+SELECT '[1,2,3  ,4.0]'::JSON;
+     json      
+---------------
+ [1,2,3  ,4.0]
+(1 row)
+
+SELECT '[1,2,3  ,4]'::JSON;
+    json     
+-------------
+ [1,2,3  ,4]
+(1 row)
+
+SELECT 'true'::JSON;
+ json 
+------
+ true
+(1 row)
+
+SELECT 'true'::TEXT::JSON;
+ json 
+------
+ true
+(1 row)
+
+SELECT 'false'::JSON;
+ json  
+-------
+ false
+(1 row)
+
+SELECT 'null'::JSON;
+ json 
+------
+ null
+(1 row)
+
+SELECT '1.1'::JSON;
+ json 
+------
+ 1.1
+(1 row)
+
+SELECT '"string"'::JSON;
+   json   
+----------
+ "string"
+(1 row)
+
+SELECT '{"key1":"value1", "key2":"value2"}'::JSON;
+                json                
+------------------------------------
+ {"key1":"value1", "key2":"value2"}
+(1 row)
+
+SELECT '{"key1":"value1", "key2":"value2"}'::JSON;
+                json                
+------------------------------------
+ {"key1":"value1", "key2":"value2"}
+(1 row)
+
+SELECT 15::JSON;
+ERROR:  cannot cast type integer to json
+LINE 1: SELECT 15::JSON;
+                 ^
+-- Make sure all the right strings validate
+SELECT string, json_validate(string) FROM json_test_strings;
+                          string                          | json_validate 
+----------------------------------------------------------+---------------
+                                                          | f
+                                                          | f
+ "                                                        | f
+ [,]                                                      | f
+ [)                                                       | f
+ []]                                                      | f
+ [}                                                       | f
+ {,}                                                      | f
+ {]                                                       | f
+ ["1":2]                                                  | f
+ [1,2,]                                                   | f
+ [1:2}                                                    | f
+ {"1":2,}                                                 | f
+ {1:2}                                                    | f
+ {"1":2, "2.5" : [3, 4, {}, {"5": ["6"], [7 ]}]}          | f
+ {"1":2, "2.5" : [3, 4, {}, {"5": ["6"], [7]}]}           | f
+ {"1":2, "2.5" : [3, 4, {}, {"5": ["6"], "7" :[8 ]}]      | f
+ {"1":2, "2.5" : [3, 4, {}, {"5": ["6"], "7" :[8 ]}]]     | f
+ {"1":2, "3":4                                            | f
+ "1\u2"                                                   | f
+ [,2]                                                     | f
+ "3                                                       | f
+ "3" "4"                                                  | f
+ [3[4]                                                    | f
+ [3[4]]                                                   | f
+ [3, [4, [5], 6] 7, 8 9]                                  | f
+ [3, [4, [5], 6] 7, 8, 9]                                 | f
+ [3, [4, [5], 6], 7, 8 9]                                 | f
+ {"hello":true, "bye":false, null}                        | f
+ {"hello":true, "bye":false, null:null}                   | f
+ "hi                                                      | f
+ "hi"""                                                   | f
+ {"hi": "bye"]                                            | f
+ "\uD800\uD800"                                           | t
+ "\uD800\uDBFF"                                           | t
+ "\UD834\UDD1E"                                           | f
+ "\uDB00"                                                 | t
+ "\uDB00\uDBFF"                                           | t
+ "\uFFFE"                                                 | t
+ "\uFFFF"                                                 | t
+ .                                                        | f
+ ""                                                       | t
+ []                                                       | t
+ {}                                                       | t
+ +.                                                       | f
+ 0.5                                                      | t
+ 0.e1                                                     | f
+ {"1":{}}                                                 | t
+ {"1":2}                                                  | t
+ {"1":2, "2.5" : [3, 4, {}, {"5": ["6"]}]}                | t
+ {"1":2, "2.5" : [3, 4, {}, {"5": ["6"], "7" :[8 ]}]}     | t
+ 1234                                                     | t
+ -1234                                                    | t
+ {"1":2, "3":4}                                           | t
+ +1234                                                    | f
+ ++1234                                                   | f
+ 123.456e14234                                            | t
+ 123.456e-14234                                           | t
+ 123.456e+14234                                           | t
+ 123.e-14234                                              | f
+ "1\u2000"                                                | t
+ "1\u20001"                                               | t
+ 2                                                        | t
+ .246e-14234                                              | f
+ .2e-14234                                                | f
+ 3                                                        | t
+ .3                                                       | f
+ "3"                                                      | t
+ [3]                                                      | t
+ +3.                                                      | f
+ 3.2e+1                                                   | t
+ [3, [4]]                                                 | t
+ [3, [4, [5]]]                                            | t
+ [3, [4, [5], 6]]                                         | t
+ [3, [4, [5], 6], 7]                                      | t
+ [3, [4, [5], 6], 7, 8]                                   | t
+ [3, [4, [5], 6], 7, 8, 9]                                | t
+ +3.5                                                     | f
+ .3e                                                      | f
+ .3e1                                                     | f
+ .3e-1                                                    | f
+ .3e+1                                                    | f
+ 3.e1                                                     | f
+ 3.e+1                                                    | f
+ 3e+1                                                     | t
+ .5                                                       | f
+ +.5                                                      | f
+ .5e+1                                                    | f
+ [ 7]                                                     | t
+ [7 ]                                                     | t
+ [7]                                                      | t
+ .e-14234                                                 | f
+ "hello"                                                  | t
+ ["hello"]                                                | t
+ ["hello", "bye"]                                         | t
+ ["hello", "bye\n"]                                       | t
+ ["hello", "bye\n\r\t"]                                   | t
+ ["hello", "bye\n\r\t\b"]                                 | t
+ ["hello", "bye\n\r\t\b",true]                            | t
+ ["hello", "bye\n\r\t\b",true , false]                    | t
+ ["hello", "bye\n\r\t\b",true , false,    null]           | t
+ ["hello", "bye\n\r\t\v"]                                 | f
+ {"hello":true}                                           | t
+ {"hello":true, "bye":false}                              | t
+ {"hello":true, "bye":false, "foo":["one","two","three"]} | t
+ "hi"                                                     | t
+ ["hi"]                                                   | t
+ ["hi", "bye"]                                            | t
+ {"hi": "bye"}                                            | t
+ ["hi", "bye", 3]                                         | t
+ ["hi", "bye[", 3]                                        | t
+ "\u0007"                                                 | t
+ "\u0008"                                                 | t
+ "\u0009"                                                 | t
+ "\u0010"                                                 | t
+ "\u0020"                                                 | t
+ "\u10000"                                                | t
+ "\u1234"                                                 | t
+ "\u99999"                                                | t
+ "\ud800\udc00"                                           | t
+ "\uD800\uDC00"                                           | t
+ "\uD834\uDD1E"                                           | t
+ "\uDBFF\uDFFF"                                           | t
+ "\uFFFD"                                                 | t
+ "\uFFFF"                                                 | t
+ hello                                                    | f
+ [32, 1]                                                  | t
+ [32,                                                     | f
+ "\uD800\uDC00"                                           | t
+ "\n"                                                     | t
+ "hello"                                                  | t
+ "hello\u0009world"                                       | t
+ "hello"                                                  | t
+ "hello\n"                                                | t
+ "hello"                                                  | t
+ 3                                                        | t
+ 3.                                                       | f
+ .3                                                       | f
+ 0.3                                                      | t
+ 0.3e                                                     | f
+ 0.3e+                                                    | f
+ 0.3e+5                                                   | t
+ 0.3e-5                                                   | t
+ 0.3e5                                                    | t
+ "hello"                                                  | t
+ +3                                                       | f
+ -3                                                       | t
+ -3.                                                      | f
+ -3.1                                                     | t
+ -0                                                       | t
+ -0.                                                      | f
+ -0.0                                                     | t
+ .5                                                       | f
+ 5.                                                       | f
+ 5.e1                                                     | f
+ 0.5                                                      | t
+ .3e1                                                     | f
+ .3e+1                                                    | f
+ .3e-1                                                    | f
+ .3e-1 .5                                                 | f
+ .3e-1.5                                                  | f
+ .3e+1.5                                                  | f
+ .3e+.                                                    | f
+ .3e+.5                                                   | f
+ .3e+1.5                                                  | f
+ 9.3e+1.5                                                 | f
+ 9.e+1.5                                                  | f
+ 9.e+                                                     | f
+ 9.e+1                                                    | f
+ "\""                                                     | t
+ "\"3.5"                                                  | t
+ "\"."                                                    | t
+ "\".".                                                   | f
+ "\"....."                                                | t
+ "\"\"\"\"""                                              | f
+ "newline                                                +| f
+ "                                                        | 
+ ["\"\"\"\"", .5]                                         | f
+ [.5]                                                     | f
+ ["\"\"\"\"", 0.5]                                        | t
+ ["\"\"\"\"", .5]                                         | f
+ ["\"\"\"\"",.5]                                          | f
+ ["\"",.5]                                                | f
+ ["\".5",.5]                                              | f
+ ["\".5",".5\"".5]                                        | f
+ ["\".5",".5\"", .5]                                      | f
+ ["\".5",".5\"",.5]                                       | f
+ ["\".5",".5\"",0.5]                                      | t
+ {"key":/*comment*/"value"}                               | f
+ {"key":/*comment"value"}                                 | f
+ {"key":"value"}/*                                        | f
+ {"key":"value"}/**/                                      | f
+ {"key":"value"}/***/                                     | f
+ {"key":"value"}/**//                                     | f
+ {"key":"value"}/**///                                    | f
+ {"key":"value"}/**///----                                | f
+ {"key":"value"}#                                         | f
+ {"key":"value"}#{                                        | f
+ {"key":"value"}#{}                                       | f
+ {"key":"value"}#,                                        | f
+ {"key":"value"/**/, "k2":"v2"}                           | f
+ "\u0027"                                                 | t
+ "hello\'"                                                | f
+ 'hello\''                                                | f
+ 'hello'                                                  | f
+ 'hell\'o'                                                | f
+ '\'hello'                                                | f
+ '\'hello\''                                              | f
+ \'hello\'                                                | f
+ 'hello\'                                                 | f
+ ['hello\']                                               | f
+ ['hello\'']                                              | f
+ ['hello"']                                               | f
+ ['hello\"']                                              | f
+ ['hello"o']                                              | f
+ ['"']                                                    | f
+ '"'                                                      | f
+ '"hello"'                                                | f
+ '"hello'                                                 | f
+ '"hi"'                                                   | f
+ "\x01"                                                   | f
+ "\x02"                                                   | f
+ "\x03"                                                   | f
+ "\x04"                                                   | f
+ "\x05"                                                   | f
+ "\x06"                                                   | f
+ "\x07"                                                   | f
+ "\x08"                                                   | f
+ "       "                                                | f
+ "                                                       +| f
+ "                                                        | 
+ "\x0B"                                                   | f
+ "\x0C"                                                   | f
+ "\r"                                                     | f
+ "\x0E"                                                   | f
+ "\x0F"                                                   | f
+ "\x10"                                                   | f
+ "\x11"                                                   | f
+ "\x12"                                                   | f
+ "\x13"                                                   | f
+ "\x14"                                                   | f
+ "\x15"                                                   | f
+ "\x16"                                                   | f
+ "\x17"                                                   | f
+ "\x18"                                                   | f
+ "\x19"                                                   | f
+ "\x1A"                                                   | f
+ "\x1B"                                                   | f
+ "\x1C"                                                   | f
+ "\x1D"                                                   | f
+ "\x1E"                                                   | f
+ "\x1F"                                                   | f
+ "\x7F"                                                   | t
+(251 rows)
+
 CREATE TABLE jsontest (
     id int,
     data json
@@ -6,15 +368,15 @@ INSERT INTO jsontest VALUES (1, 'null');
 INSERT INTO jsontest VALUES (2, '{"booleans":[true,false],"null":null}');
 INSERT INTO jsontest VALUES (3, E'[\n123.456,\n"string",\n{"name":"object"}\n]');
 INSERT INTO jsontest VALUES (101, 'wrong');
-ERROR:  syntax error in json: syntax error
+ERROR:  invalid input syntax for JSON
 LINE 1: INSERT INTO jsontest VALUES (101, 'wrong');
                                           ^
 INSERT INTO jsontest VALUES (101, '[wrong');
-ERROR:  syntax error in json: syntax error
+ERROR:  invalid input syntax for JSON
 LINE 1: INSERT INTO jsontest VALUES (101, '[wrong');
                                           ^
 INSERT INTO jsontest VALUES (101, '{"wrong":123}}');
-ERROR:  syntax error in json: syntax error
+ERROR:  invalid input syntax for JSON
 LINE 1: INSERT INTO jsontest VALUES (101, '{"wrong":123}}');
                                           ^
 SELECT *, json_pretty(data) FROM jsontest;
diff --git a/src/test/regress/sql/json.sql b/src/test/regress/sql/json.sql
index 76a1d06..841052d 100644
--- a/src/test/regress/sql/json.sql
+++ b/src/test/regress/sql/json.sql
@@ -1,3 +1,32 @@
+-- Get our json_test_strings table.
+SET client_min_messages = warning;
+\set ECHO none
+\i sql/json_test_strings.sql
+\set ECHO all
+RESET client_min_messages;
+
+-- Trivial cast-to-JSON tests
+SELECT '[]'::JSON;
+SELECT '['::JSON;
+SELECT '[1,2,3]'::JSON;
+SELECT '[1,2,3]'::JSON::TEXT;
+SELECT '[1,2,3  ]'::JSON;
+SELECT '[1,2,3  ,4]'::JSON;
+SELECT '[1,2,3  ,4.0]'::JSON;
+SELECT '[1,2,3  ,4]'::JSON;
+SELECT 'true'::JSON;
+SELECT 'true'::TEXT::JSON;
+SELECT 'false'::JSON;
+SELECT 'null'::JSON;
+SELECT '1.1'::JSON;
+SELECT '"string"'::JSON;
+SELECT '{"key1":"value1", "key2":"value2"}'::JSON;
+SELECT '{"key1":"value1", "key2":"value2"}'::JSON;
+SELECT 15::JSON;
+
+-- Make sure all the right strings validate
+SELECT string, json_validate(string) FROM json_test_strings;
+
 CREATE TABLE jsontest (
     id int,
     data json
diff --git a/src/test/regress/sql/json_test_strings.sql b/src/test/regress/sql/json_test_strings.sql
new file mode 100644
index 0000000..909a0da
--- /dev/null
+++ b/src/test/regress/sql/json_test_strings.sql
@@ -0,0 +1,264 @@
+/*
+ * This is not a test.  It is the set of
+ * test strings used by the json.sql test.
+ */
+
+CREATE TABLE json_test_strings (string TEXT);
+INSERT INTO json_test_strings VALUES ($$$$);
+INSERT INTO json_test_strings VALUES ($$  $$);
+INSERT INTO json_test_strings VALUES ($$"$$);
+INSERT INTO json_test_strings VALUES ($$[,]$$);
+INSERT INTO json_test_strings VALUES ($$[)$$);
+INSERT INTO json_test_strings VALUES ($$[]]$$);
+INSERT INTO json_test_strings VALUES ($$[}$$);
+INSERT INTO json_test_strings VALUES ($${,}$$);
+INSERT INTO json_test_strings VALUES ($${]$$);
+INSERT INTO json_test_strings VALUES ($$["1":2]$$);
+INSERT INTO json_test_strings VALUES ($$[1,2,]$$);
+INSERT INTO json_test_strings VALUES ($$[1:2}$$);
+INSERT INTO json_test_strings VALUES ($${"1":2,}$$);
+INSERT INTO json_test_strings VALUES ($${1:2}$$);
+INSERT INTO json_test_strings VALUES ($${"1":2, "2.5" : [3, 4, {}, {"5": ["6"], [7 ]}]}$$);
+INSERT INTO json_test_strings VALUES ($${"1":2, "2.5" : [3, 4, {}, {"5": ["6"], [7]}]}$$);
+INSERT INTO json_test_strings VALUES ($${"1":2, "2.5" : [3, 4, {}, {"5": ["6"], "7" :[8 ]}]$$);
+INSERT INTO json_test_strings VALUES ($${"1":2, "2.5" : [3, 4, {}, {"5": ["6"], "7" :[8 ]}]]$$);
+INSERT INTO json_test_strings VALUES ($${"1":2, "3":4$$);
+INSERT INTO json_test_strings VALUES ($$"1\u2"$$);
+INSERT INTO json_test_strings VALUES ($$[,2]$$);
+INSERT INTO json_test_strings VALUES ($$"3$$);
+INSERT INTO json_test_strings VALUES ($$"3" "4"$$);
+INSERT INTO json_test_strings VALUES ($$[3[4]$$);
+INSERT INTO json_test_strings VALUES ($$[3[4]]$$);
+INSERT INTO json_test_strings VALUES ($$[3, [4, [5], 6] 7, 8 9]$$);
+INSERT INTO json_test_strings VALUES ($$[3, [4, [5], 6] 7, 8, 9]$$);
+INSERT INTO json_test_strings VALUES ($$[3, [4, [5], 6], 7, 8 9]$$);
+INSERT INTO json_test_strings VALUES ($${"hello":true, "bye":false, null}$$);
+INSERT INTO json_test_strings VALUES ($${"hello":true, "bye":false, null:null}$$);
+INSERT INTO json_test_strings VALUES ($$"hi$$);
+INSERT INTO json_test_strings VALUES ($$"hi"""$$);
+INSERT INTO json_test_strings VALUES ($${"hi": "bye"]$$);
+INSERT INTO json_test_strings VALUES ($$"\uD800\uD800"$$);
+INSERT INTO json_test_strings VALUES ($$"\uD800\uDBFF"$$);
+INSERT INTO json_test_strings VALUES ($$"\UD834\UDD1E"$$);
+INSERT INTO json_test_strings VALUES ($$"\uDB00"$$);
+INSERT INTO json_test_strings VALUES ($$"\uDB00\uDBFF"$$);
+INSERT INTO json_test_strings VALUES ($$"\uFFFE"$$);
+INSERT INTO json_test_strings VALUES ($$"\uFFFF"$$);
+INSERT INTO json_test_strings VALUES ($$.$$);
+INSERT INTO json_test_strings VALUES ($$""$$);
+INSERT INTO json_test_strings VALUES ($$[]$$);
+INSERT INTO json_test_strings VALUES ($${}$$);
+INSERT INTO json_test_strings VALUES ($$+.$$);
+INSERT INTO json_test_strings VALUES ($$0.5$$);
+INSERT INTO json_test_strings VALUES ($$0.e1$$);
+INSERT INTO json_test_strings VALUES ($${"1":{}}$$);
+INSERT INTO json_test_strings VALUES ($${"1":2}$$);
+INSERT INTO json_test_strings VALUES ($${"1":2, "2.5" : [3, 4, {}, {"5": ["6"]}]}$$);
+INSERT INTO json_test_strings VALUES ($${"1":2, "2.5" : [3, 4, {}, {"5": ["6"], "7" :[8 ]}]}$$);
+INSERT INTO json_test_strings VALUES ($$1234$$);
+INSERT INTO json_test_strings VALUES ($$-1234$$);
+INSERT INTO json_test_strings VALUES ($${"1":2, "3":4}$$);
+INSERT INTO json_test_strings VALUES ($$+1234$$);
+INSERT INTO json_test_strings VALUES ($$++1234$$);
+INSERT INTO json_test_strings VALUES ($$123.456e14234$$);
+INSERT INTO json_test_strings VALUES ($$123.456e-14234$$);
+INSERT INTO json_test_strings VALUES ($$123.456e+14234$$);
+INSERT INTO json_test_strings VALUES ($$123.e-14234$$);
+INSERT INTO json_test_strings VALUES ($$"1\u2000"$$);
+INSERT INTO json_test_strings VALUES ($$"1\u20001"$$);
+INSERT INTO json_test_strings VALUES ($$2$$);
+INSERT INTO json_test_strings VALUES ($$.246e-14234$$);
+INSERT INTO json_test_strings VALUES ($$.2e-14234$$);
+INSERT INTO json_test_strings VALUES ($$3$$);
+INSERT INTO json_test_strings VALUES ($$.3$$);
+INSERT INTO json_test_strings VALUES ($$"3"$$);
+INSERT INTO json_test_strings VALUES ($$[3]$$);
+INSERT INTO json_test_strings VALUES ($$+3.$$);
+INSERT INTO json_test_strings VALUES ($$3.2e+1$$);
+INSERT INTO json_test_strings VALUES ($$[3, [4]]$$);
+INSERT INTO json_test_strings VALUES ($$[3, [4, [5]]]$$);
+INSERT INTO json_test_strings VALUES ($$[3, [4, [5], 6]]$$);
+INSERT INTO json_test_strings VALUES ($$[3, [4, [5], 6], 7]$$);
+INSERT INTO json_test_strings VALUES ($$[3, [4, [5], 6], 7, 8]$$);
+INSERT INTO json_test_strings VALUES ($$[3, [4, [5], 6], 7, 8, 9]$$);
+INSERT INTO json_test_strings VALUES ($$+3.5$$);
+INSERT INTO json_test_strings VALUES ($$.3e$$);
+INSERT INTO json_test_strings VALUES ($$.3e1$$);
+INSERT INTO json_test_strings VALUES ($$.3e-1$$);
+INSERT INTO json_test_strings VALUES ($$.3e+1$$);
+INSERT INTO json_test_strings VALUES ($$3.e1$$);
+INSERT INTO json_test_strings VALUES ($$3.e+1$$);
+INSERT INTO json_test_strings VALUES ($$3e+1$$);
+INSERT INTO json_test_strings VALUES ($$.5$$);
+INSERT INTO json_test_strings VALUES ($$+.5$$);
+INSERT INTO json_test_strings VALUES ($$.5e+1$$);
+INSERT INTO json_test_strings VALUES ($$[ 7]$$);
+INSERT INTO json_test_strings VALUES ($$[7 ]$$);
+INSERT INTO json_test_strings VALUES ($$[7]$$);
+INSERT INTO json_test_strings VALUES ($$.e-14234$$);
+INSERT INTO json_test_strings VALUES ($$"hello"$$);
+INSERT INTO json_test_strings VALUES ($$["hello"]$$);
+INSERT INTO json_test_strings VALUES ($$["hello", "bye"]$$);
+INSERT INTO json_test_strings VALUES ($$["hello", "bye\n"]$$);
+INSERT INTO json_test_strings VALUES ($$["hello", "bye\n\r\t"]$$);
+INSERT INTO json_test_strings VALUES ($$["hello", "bye\n\r\t\b"]$$);
+INSERT INTO json_test_strings VALUES ($$["hello", "bye\n\r\t\b",true]$$);
+INSERT INTO json_test_strings VALUES ($$["hello", "bye\n\r\t\b",true , false]$$);
+INSERT INTO json_test_strings VALUES ($$["hello", "bye\n\r\t\b",true , false,    null]$$);
+INSERT INTO json_test_strings VALUES ($$["hello", "bye\n\r\t\v"]$$);
+INSERT INTO json_test_strings VALUES ($${"hello":true}$$);
+INSERT INTO json_test_strings VALUES ($${"hello":true, "bye":false}$$);
+INSERT INTO json_test_strings VALUES ($${"hello":true, "bye":false, "foo":["one","two","three"]}$$);
+INSERT INTO json_test_strings VALUES ($$"hi"$$);
+INSERT INTO json_test_strings VALUES ($$["hi"]$$);
+INSERT INTO json_test_strings VALUES ($$["hi", "bye"]$$);
+INSERT INTO json_test_strings VALUES ($${"hi": "bye"}$$);
+INSERT INTO json_test_strings VALUES ($$["hi", "bye", 3]$$);
+INSERT INTO json_test_strings VALUES ($$["hi", "bye[", 3]$$);
+INSERT INTO json_test_strings VALUES ($$"\u0007"$$);
+INSERT INTO json_test_strings VALUES ($$"\u0008"$$);
+INSERT INTO json_test_strings VALUES ($$"\u0009"$$);
+INSERT INTO json_test_strings VALUES ($$"\u0010"$$);
+INSERT INTO json_test_strings VALUES ($$"\u0020"$$);
+INSERT INTO json_test_strings VALUES ($$"\u10000"$$);
+INSERT INTO json_test_strings VALUES ($$"\u1234"$$);
+INSERT INTO json_test_strings VALUES ($$"\u99999"$$);
+INSERT INTO json_test_strings VALUES ($$"\ud800\udc00"$$);
+INSERT INTO json_test_strings VALUES ($$"\uD800\uDC00"$$);
+INSERT INTO json_test_strings VALUES ($$"\uD834\uDD1E"$$);
+INSERT INTO json_test_strings VALUES ($$"\uDBFF\uDFFF"$$);
+INSERT INTO json_test_strings VALUES ($$"\uFFFD"$$);
+INSERT INTO json_test_strings VALUES ($$"\uFFFF"$$);
+INSERT INTO json_test_strings VALUES ($$hello$$);
+INSERT INTO json_test_strings VALUES ($$[32, 1]$$);
+INSERT INTO json_test_strings VALUES ($$[32, $$);
+INSERT INTO json_test_strings VALUES ($$"\uD800\uDC00"$$);
+INSERT INTO json_test_strings VALUES ($$"\n"$$);
+INSERT INTO json_test_strings VALUES ($$"hello"$$);
+INSERT INTO json_test_strings VALUES ($$"hello\u0009world"$$);
+INSERT INTO json_test_strings VALUES ($$"hello"$$);
+INSERT INTO json_test_strings VALUES ($$"hello\n"$$);
+INSERT INTO json_test_strings VALUES ($$"hello"$$);
+INSERT INTO json_test_strings VALUES ($$3$$);
+INSERT INTO json_test_strings VALUES ($$3.$$);
+INSERT INTO json_test_strings VALUES ($$.3$$);
+INSERT INTO json_test_strings VALUES ($$0.3$$);
+INSERT INTO json_test_strings VALUES ($$0.3e$$);
+INSERT INTO json_test_strings VALUES ($$0.3e+$$);
+INSERT INTO json_test_strings VALUES ($$0.3e+5$$);
+INSERT INTO json_test_strings VALUES ($$0.3e-5$$);
+INSERT INTO json_test_strings VALUES ($$0.3e5$$);
+INSERT INTO json_test_strings VALUES ($$"hello"$$);
+INSERT INTO json_test_strings VALUES ($$+3$$);
+INSERT INTO json_test_strings VALUES ($$-3$$);
+INSERT INTO json_test_strings VALUES ($$-3.$$);
+INSERT INTO json_test_strings VALUES ($$-3.1$$);
+INSERT INTO json_test_strings VALUES ($$-0$$);
+INSERT INTO json_test_strings VALUES ($$-0.$$);
+INSERT INTO json_test_strings VALUES ($$-0.0$$);
+INSERT INTO json_test_strings VALUES ($$.5$$);
+INSERT INTO json_test_strings VALUES ($$5.$$);
+INSERT INTO json_test_strings VALUES ($$5.e1$$);
+INSERT INTO json_test_strings VALUES ($$0.5$$);
+INSERT INTO json_test_strings VALUES ($$.3e1$$);
+INSERT INTO json_test_strings VALUES ($$.3e+1$$);
+INSERT INTO json_test_strings VALUES ($$.3e-1$$);
+INSERT INTO json_test_strings VALUES ($$.3e-1 .5$$);
+INSERT INTO json_test_strings VALUES ($$.3e-1.5$$);
+INSERT INTO json_test_strings VALUES ($$.3e+1.5$$);
+INSERT INTO json_test_strings VALUES ($$.3e+.$$);
+INSERT INTO json_test_strings VALUES ($$.3e+.5$$);
+INSERT INTO json_test_strings VALUES ($$.3e+1.5$$);
+INSERT INTO json_test_strings VALUES ($$9.3e+1.5$$);
+INSERT INTO json_test_strings VALUES ($$9.e+1.5$$);
+INSERT INTO json_test_strings VALUES ($$9.e+$$);
+INSERT INTO json_test_strings VALUES ($$9.e+1$$);
+INSERT INTO json_test_strings VALUES ($$"\""$$);
+INSERT INTO json_test_strings VALUES ($$"\"3.5"$$);
+INSERT INTO json_test_strings VALUES ($$"\"."$$);
+INSERT INTO json_test_strings VALUES ($$"\".".$$);
+INSERT INTO json_test_strings VALUES ($$"\"....."$$);
+INSERT INTO json_test_strings VALUES ($$"\"\"\"\"""$$);
+INSERT INTO json_test_strings VALUES (E'"newline\n"');
+INSERT INTO json_test_strings VALUES ($$["\"\"\"\"", .5]$$);
+INSERT INTO json_test_strings VALUES ($$[.5]$$);
+INSERT INTO json_test_strings VALUES ($$["\"\"\"\"", 0.5]$$);
+INSERT INTO json_test_strings VALUES ($$["\"\"\"\"", .5]$$);
+INSERT INTO json_test_strings VALUES ($$["\"\"\"\"",.5]$$);
+INSERT INTO json_test_strings VALUES ($$["\"",.5]$$);
+INSERT INTO json_test_strings VALUES ($$["\".5",.5]$$);
+INSERT INTO json_test_strings VALUES ($$["\".5",".5\"".5]$$);
+INSERT INTO json_test_strings VALUES ($$["\".5",".5\"", .5]$$);
+INSERT INTO json_test_strings VALUES ($$["\".5",".5\"",.5]$$);
+INSERT INTO json_test_strings VALUES ($$["\".5",".5\"",0.5]$$);
+INSERT INTO json_test_strings VALUES ($${"key":/*comment*/"value"}$$);
+INSERT INTO json_test_strings VALUES ($${"key":/*comment"value"}$$);
+INSERT INTO json_test_strings VALUES ($${"key":"value"}/*$$);
+INSERT INTO json_test_strings VALUES ($${"key":"value"}/**/$$);
+INSERT INTO json_test_strings VALUES ($${"key":"value"}/***/$$);
+INSERT INTO json_test_strings VALUES ($${"key":"value"}/**//$$);
+INSERT INTO json_test_strings VALUES ($${"key":"value"}/**///$$);
+INSERT INTO json_test_strings VALUES ($${"key":"value"}/**///----$$);
+INSERT INTO json_test_strings VALUES ($${"key":"value"}#$$);
+INSERT INTO json_test_strings VALUES ($${"key":"value"}#{$$);
+INSERT INTO json_test_strings VALUES ($${"key":"value"}#{}$$);
+INSERT INTO json_test_strings VALUES ($${"key":"value"}#,$$);
+INSERT INTO json_test_strings VALUES ($${"key":"value"/**/, "k2":"v2"}$$);
+INSERT INTO json_test_strings VALUES ($$"\u0027"$$);
+INSERT INTO json_test_strings VALUES ($$"hello\'"$$);
+INSERT INTO json_test_strings VALUES ($$'hello\''$$);
+INSERT INTO json_test_strings VALUES ($$'hello'$$);
+INSERT INTO json_test_strings VALUES ($$'hell\'o'$$);
+INSERT INTO json_test_strings VALUES ($$'\'hello'$$);
+INSERT INTO json_test_strings VALUES ($$'\'hello\''$$);
+INSERT INTO json_test_strings VALUES ($$\'hello\'$$);
+INSERT INTO json_test_strings VALUES ($$'hello\'$$);
+INSERT INTO json_test_strings VALUES ($$['hello\']$$);
+INSERT INTO json_test_strings VALUES ($$['hello\'']$$);
+INSERT INTO json_test_strings VALUES ($$['hello"']$$);
+INSERT INTO json_test_strings VALUES ($$['hello\"']$$);
+INSERT INTO json_test_strings VALUES ($$['hello"o']$$);
+INSERT INTO json_test_strings VALUES ($$['"']$$);
+INSERT INTO json_test_strings VALUES ($$'"'$$);
+INSERT INTO json_test_strings VALUES ($$'"hello"'$$);
+INSERT INTO json_test_strings VALUES ($$'"hello'$$);
+INSERT INTO json_test_strings VALUES ($$'"hi"'$$);
+
+-- Control characters not allowed in JSON strings.
+-- Can't test \x00 because it's not even a valid TEXT.
+INSERT INTO json_test_strings VALUES (E'"\x01"');
+INSERT INTO json_test_strings VALUES (E'"\x02"');
+INSERT INTO json_test_strings VALUES (E'"\x03"');
+INSERT INTO json_test_strings VALUES (E'"\x04"');
+INSERT INTO json_test_strings VALUES (E'"\x05"');
+INSERT INTO json_test_strings VALUES (E'"\x06"');
+INSERT INTO json_test_strings VALUES (E'"\x07"');
+INSERT INTO json_test_strings VALUES (E'"\x08"');
+INSERT INTO json_test_strings VALUES (E'"\x09"');
+INSERT INTO json_test_strings VALUES (E'"\x0A"');
+INSERT INTO json_test_strings VALUES (E'"\x0B"');
+INSERT INTO json_test_strings VALUES (E'"\x0C"');
+INSERT INTO json_test_strings VALUES (E'"\x0D"');
+INSERT INTO json_test_strings VALUES (E'"\x0E"');
+INSERT INTO json_test_strings VALUES (E'"\x0F"');
+INSERT INTO json_test_strings VALUES (E'"\x10"');
+INSERT INTO json_test_strings VALUES (E'"\x11"');
+INSERT INTO json_test_strings VALUES (E'"\x12"');
+INSERT INTO json_test_strings VALUES (E'"\x13"');
+INSERT INTO json_test_strings VALUES (E'"\x14"');
+INSERT INTO json_test_strings VALUES (E'"\x15"');
+INSERT INTO json_test_strings VALUES (E'"\x16"');
+INSERT INTO json_test_strings VALUES (E'"\x17"');
+INSERT INTO json_test_strings VALUES (E'"\x18"');
+INSERT INTO json_test_strings VALUES (E'"\x19"');
+INSERT INTO json_test_strings VALUES (E'"\x1A"');
+INSERT INTO json_test_strings VALUES (E'"\x1B"');
+INSERT INTO json_test_strings VALUES (E'"\x1C"');
+INSERT INTO json_test_strings VALUES (E'"\x1D"');
+INSERT INTO json_test_strings VALUES (E'"\x1E"');
+INSERT INTO json_test_strings VALUES (E'"\x1F"');
+
+-- Although \x7F is a control character, the JSON RFC only says
+-- "..., and the control characters (U+0000 through U+001F)".
+-- However, whether to trust the RFC here or not is a tough call.
+INSERT INTO json_test_strings VALUES (E'"\x7F"');
#9Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Joseph Adams (#8)
1 attachment(s)
Re: Basic JSON support

On Mon, Sep 20, 2010 at 1:38 PM, Joseph Adams
<joeyadams3.14159@gmail.com> wrote:

I have written a patch that amends the basic_json-20100915.patch .

Thanks. I merged your patch and added json_to_array() as a demonstration
of json_stringify(). In the current code, json_stringify(json) just returns
the input text as-is, but json_stringify(json, NULL) trims all unnecessary
whitespace. We could instead do the trimming in json_in() and json_parse()
and always store values in a compressed representation. I'm leaving that
open for discussion.

I also merged json_test_strings.sql into the main test file and slimmed
down the tests -- many of the test cases looked duplicated to me.

I went ahead and added json_validate() now because it's useful for
testing (my test strings use it).

Good idea, but how about calling it json_is_well_formed()? We have
similarly named functions for the xml type. I renamed it in the patch.

Here's one thing I'm worried about: the bison/flex code in your patch
looks rather similar to the code in
http://www.jsonlint.com/bin/jsonval.tgz , which is licensed under the
GPL.  In particular, the incorrect number regex I discussed above can
also be found in jsonval verbatim.  However, because there are a lot
of differences in both the bison and flex code now, I'm not sure
they're close enough to be "copied", but I am not a lawyer.  It might
be a good idea to contact Ben Spencer and ask him for permission to
license our modified version of the code under PostgreSQL's more
relaxed license, just to be on the safe side.

Sorry for my carelessness; I did indeed read his code.
Do you know his contact address? I cannot find it...

--
Itagaki Takahiro

Attachments:

basic_json-20100921.patchapplication/octet-stream; name=basic_json-20100921.patchDownload
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index be272b5..796fcac 100644
*** a/src/backend/utils/adt/Makefile
--- b/src/backend/utils/adt/Makefile
*************** endif
*** 18,24 ****
  OBJS = acl.o arrayfuncs.o array_userfuncs.o arrayutils.o bool.o \
  	cash.o char.o date.o datetime.o datum.o domains.o \
  	enum.o float.o format_type.o \
! 	geo_ops.o geo_selfuncs.o int.o int8.o like.o lockfuncs.o \
  	misc.o nabstime.o name.o numeric.o numutils.o \
  	oid.o oracle_compat.o pseudotypes.o rowtypes.o \
  	regexp.o regproc.o ruleutils.o selfuncs.o \
--- 18,25 ----
  OBJS = acl.o arrayfuncs.o array_userfuncs.o arrayutils.o bool.o \
  	cash.o char.o date.o datetime.o datum.o domains.o \
  	enum.o float.o format_type.o \
! 	geo_ops.o geo_selfuncs.o json.o json_parser.o json_scanner.o \
! 	int.o int8.o like.o lockfuncs.o \
  	misc.o nabstime.o name.o numeric.o numutils.o \
  	oid.o oracle_compat.o pseudotypes.o rowtypes.o \
  	regexp.o regproc.o ruleutils.o selfuncs.o \
*************** OBJS = acl.o arrayfuncs.o array_userfunc
*** 33,36 ****
--- 34,66 ----
  
  like.o: like.c like_match.c
  
+ json_parser.o: json_scanner.c
+ 
+ # Latest flex causes warnings in this file.
+ #ifeq ($(GCC),yes)
+ #json_parser.o: CFLAGS += -Wno-error
+ #endif
+ 
+ json_parser.h: json_parser.c ;
+ 
+ json_parser.c: json_parser.y
+ ifdef BISON
+ 	$(BISON) -d $(BISONFLAGS) -o $@ $<
+ else
+ 	@$(missing) bison $< $@
+ endif
+ 
+ json_scanner.h: json_scanner.c ;
+ 
+ json_scanner.c: json_scanner.l
+ ifdef FLEX
+ 	$(FLEX) $(FLEXFLAGS) -o'$@' $<
+ else
+ 	@$(missing) flex $< $@
+ endif
+ 
+ json_parser.o keywords.o parser.o: json_parser.h
+ 
+ json.c: json_parser.h json_scanner.h
+ 
  include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index ...864bc87 .
*** a/src/backend/utils/adt/json.c
--- b/src/backend/utils/adt/json.c
***************
*** 0 ****
--- 1,460 ----
+ /*-------------------------------------------------------------------------
+  *
+  * json.c
+  *	  JSON data type support.
+  *
+  * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *
+  * IDENTIFICATION
+  *	  src/backend/utils/adt/json.c
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #include "postgres.h"
+ 
+ #include "catalog/pg_type.h"
+ #include "libpq/pqformat.h"
+ #include "utils/array.h"
+ #include "utils/builtins.h"
+ #include "utils/json.h"
+ 
+ #include "json_parser.h"
+ #include "json_scanner.h"
+ 
+ static Json *json_to_node(const char *str, int len);
+ static bool json_validate(const char *str, int len);
+ static Json *json_parse_internal(const char *str, int len, bool *validate);
+ static void append_json(StringInfo buf, const Json *node, const text *space,
+ 						int indent);
+ static void append_indent(StringInfo buf, const text *space, int indent);
+ #ifdef NOT_USED
+ static void json_free(Json *node);
+ #endif
+ 
+ extern int	json_yyparse(yyscan_t scanner);
+ extern void	json_yyerror(yyscan_t scanner, const char *error);
+ extern void *json_yyalloc(size_t bytes, void *yyscanner);
+ extern void *json_yyrealloc(void *ptr, size_t bytes, void *yyscanner);
+ extern void json_yyfree(void *ptr, void *yyscanner);
+ 
+ 
+ Datum
+ json_in(PG_FUNCTION_ARGS)
+ {
+ 	char	   *s = PG_GETARG_CSTRING(0);
+ 	jsontype   *json;
+ 
+ 	json = (jsontype *) cstring_to_text(s);
+ 
+ 	if (!json_validate(VARDATA_ANY(json), VARSIZE_ANY_EXHDR(json)))
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ 				 errmsg("invalid input syntax for JSON")));
+ 
+ 	PG_RETURN_JSON_P(json);
+ }
+ 
+ Datum
+ json_out(PG_FUNCTION_ARGS)
+ {
+ 	/* text and jsontype are binary-compatible */
+ 	return textout(fcinfo);
+ }
+ 
+ Datum
+ json_recv(PG_FUNCTION_ARGS)
+ {
+ 	StringInfo	buf = (StringInfo) PG_GETARG_POINTER(0);
+ 	jsontype   *result;
+ 	char	   *str;
+ 	int			nbytes;
+ 
+ 	str = pq_getmsgtext(buf, buf->len - buf->cursor, &nbytes);
+ 	if (!json_validate(str, nbytes))
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
+ 				 errmsg("invalid input syntax for JSON")));
+ 
+ 	result = (jsontype *) cstring_to_text_with_len(str, nbytes);
+ 	pfree(str);
+ 	PG_RETURN_JSON_P(result);
+ }
+ 
+ Datum
+ json_send(PG_FUNCTION_ARGS)
+ {
+ 	/* text and jsontype are binary-compatible */
+ 	return textsend(fcinfo);
+ }
+ 
+ /*
+  * json_is_well_formed(text) RETURNS bool
+  */
+ Datum
+ json_is_well_formed(PG_FUNCTION_ARGS)
+ {
+ 	text	   *t = PG_GETARG_TEXT_PP(0);
+ 	bool		ret;
+ 
+ 	ret = json_validate(VARDATA_ANY(t), VARSIZE_ANY_EXHDR(t));
+ 	PG_RETURN_BOOL(ret);
+ }
+ 
+ Datum
+ json_parse(PG_FUNCTION_ARGS)
+ {
+ 	text	   *t = PG_GETARG_TEXT_PP(0);
+ 
+ 	if (!json_validate(VARDATA_ANY(t), VARSIZE_ANY_EXHDR(t)))
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ 				 errmsg("invalid input syntax for JSON")));
+ 
+ 	PG_RETURN_JSON_P((jsontype *) t);
+ }
+ 
+ Datum
+ json_stringify(PG_FUNCTION_ARGS)
+ {
+ 	/* text and jsontype are binary-compatible */
+ 	PG_RETURN_DATUM(PG_GETARG_DATUM(0));
+ }
+ 
+ /*
+  * json_stringify(json, space) -- Format a JSON value as text with or
+  * without indentation. If 'space' is NULL, all unneeded whitespace
+  * is removed.
+  *
+  * XXX: The whitespace trimmer may be confusing because json_stringify(x)
+  * and json_stringify(x, NULL) behave differently. We could move the trimming
+  * into json_in and json_parse instead; then we could not return the input
+  * as-is, but the two calls above would return the same result.
+  */
+ Datum
+ json_stringify_space(PG_FUNCTION_ARGS)
+ {
+ 	jsontype	   *json;
+ 	text		   *space;
+ 	Json		   *node;
+ 	StringInfoData	buf;
+ 
+ 	if (PG_ARGISNULL(0))
+ 		PG_RETURN_NULL();
+ 
+ 	json = PG_GETARG_JSON_PP(0);
+ 	space = (PG_ARGISNULL(1) ? NULL : PG_GETARG_TEXT_PP(1));
+ 
+ 	/* This should succeed because the value was validated on input. */
+ 	node = json_to_node(VARDATA_ANY(json), VARSIZE_ANY_EXHDR(json));
+ 	if (node == NULL)
+ 		elog(ERROR, "unexpected json value");
+ 
+ 	initStringInfo(&buf);
+ 	append_json(&buf, node, space, 0);
+ 
+ #ifdef NOT_USED
+ 	json_free(node);
+ #endif
+ 
+ 	return CStringGetTextDatum(buf.data);
+ }
+ 
+ /*
+  * json_to_node -- parse a JSON text into a node tree.
+  *
+  * Returns NULL if the input is not valid JSON.
+  */
+ static Json *
+ json_to_node(const char *str, int len)
+ {
+ 	return json_parse_internal(str, len, NULL);
+ }
+ 
+ /*
+  * json_validate -- validate a JSON text.
+  */
+ static bool
+ json_validate(const char *str, int len)
+ {
+ 	bool valid;
+ 	(void) json_parse_internal(str, len, &valid);
+ 	return valid;
+ }
+ 
+ /*
+  * json_to_array -- split a JSON array to array of JSON values.
+  */
+ Datum
+ json_to_array(PG_FUNCTION_ARGS)
+ {
+ 	jsontype	   *json = PG_GETARG_JSON_PP(0);
+ 	Json		   *node;
+ 	ListCell	   *cell;
+ 	int				nelems;
+ 	ArrayType	   *array;
+ 
+ 	/* This should succeed because the value was validated on input. */
+ 	node = json_to_node(VARDATA_ANY(json), VARSIZE_ANY_EXHDR(json));
+ 	if (node == NULL)
+ 		elog(ERROR, "unexpected json value");
+ 
+ 	/* json null to SQL NULL */
+ 	if (node->type == JSON_NULL)
+ 		PG_RETURN_NULL();
+ 
+ 	if (node->type != JSON_ARRAY)
+ 		elog(ERROR, "not a json array");
+ 
+ 	nelems = list_length(node->value.array);
+ 	if (nelems == 0)
+ 	{
+ 		/* fast path for empty array */
+ 		array = construct_empty_array(JSONOID);
+ 	}
+ 	else
+ 	{
+ 		Datum		   *elems;
+ 		StringInfoData	buf;
+ 		int				i;
+ 
+ 		elems = (Datum *) palloc(sizeof(Datum) * nelems);
+ 		initStringInfo(&buf);
+ 
+ 		i = 0;
+ 		foreach (cell, node->value.array)
+ 		{
+ 			Json *item = (Json *) lfirst(cell);
+ 
+ 			resetStringInfo(&buf);
+ 			append_json(&buf, item, NULL, 0);
+ 			elems[i++] = PointerGetDatum(
+ 							cstring_to_text_with_len(buf.data, buf.len));
+ 		}
+ 		array = construct_array(elems, nelems, JSONOID, -1, false, 'i');
+ 	}
+ 
+ 	PG_RETURN_ARRAYTYPE_P(array);
+ }
+ 
+ /*
+  * Helper for json_to_node and json_validate.
+  *
+  * If validate is NULL, this function returns a parse tree, or NULL on a
+  * parse error. If validate is not NULL, this function sets the boolean it
+  * points to, indicating whether the JSON was valid, and always returns NULL.
+  */
+ static Json *
+ json_parse_internal(const char *str, int len, bool *validate)
+ {
+ 	yyscan_t		scanner;
+ 	char		   *scanbuf;
+ 	JsonParser		parser;
+ 	int				rc;
+ 
+ 	scanbuf = (char *) palloc(len + 2);
+ 	memcpy(scanbuf, str, len);
+ 	scanbuf[len] = scanbuf[len + 1] = '\0';
+ 
+ 	/* initialize parser and scanner */
+ 	memset(&parser, 0, sizeof(parser));
+ 	parser.validateOnly = (validate != NULL);
+ 	if (!parser.validateOnly)
+ 		initStringInfo(&parser.buf);
+ 
+ 	json_yylex_init(&scanner);
+ 	json_yyset_extra(&parser, scanner);
+ 	json_yy_scan_buffer(scanbuf, len + 2, scanner);
+ 
+ 	/* parse the JSON text */
+ 	rc = json_yyparse(scanner);
+ 
+ 	/* cleanup */
+ 	json_yylex_destroy(scanner);
+ 	pfree(scanbuf);
+ 	if (parser.buf.data)
+ 		pfree(parser.buf.data);
+ 
+ 	if (validate != NULL)
+ 		*validate = (rc == 0);
+ 
+ 	return rc == 0 ? parser.json : NULL;
+ }
+ 
+ /*
+  * append_json -- append JSON node tree to string with or without indentation
+  */
+ static void
+ append_json(StringInfo buf, const Json *node, const text *space, int indent)
+ {
+ 	ListCell   *cell;
+ 	bool		sep;
+ 
+ 	switch (node->type)
+ 	{
+ 	case JSON_NULL:
+ 		appendStringInfoString(buf, "null");
+ 		break;
+ 	case JSON_BOOL:
+ 		appendStringInfoString(buf, node->value.boolean ? "true" : "false");
+ 		break;
+ 	case JSON_NUMBER:
+ 		appendStringInfoString(buf, node->value.number);
+ 		break;
+ 	case JSON_STRING:
+ 		appendStringInfoString(buf, node->value.string);
+ 		break;
+ 	case JSON_ARRAY:
+ 		appendStringInfoChar(buf, '[');
+ 		if (space != NULL)
+ 			appendStringInfoChar(buf, '\n');
+ 		sep = false;
+ 		foreach (cell, node->value.array)
+ 		{
+ 			const Json *value = (const Json *) lfirst(cell);
+ 
+ 			if (sep)
+ 			{
+ 				appendStringInfoChar(buf, ',');
+ 				if (space != NULL)
+ 					appendStringInfoChar(buf, '\n');
+ 			}
+ 			else
+ 				sep = true;
+ 			append_indent(buf, space, indent + 1);
+ 			append_json(buf, value, space, indent + 1);
+ 		}
+ 		if (sep && space)
+ 			appendStringInfoChar(buf, '\n');
+ 		append_indent(buf, space, indent);
+ 		appendStringInfoChar(buf, ']');
+ 		break;
+ 	case JSON_OBJECT:
+ 		appendStringInfoChar(buf, '{');
+ 		if (space != NULL)
+ 			appendStringInfoChar(buf, '\n');
+ 		sep = false;
+ 		foreach (cell, node->value.object)
+ 		{
+ 			const JsonAttr *attr = (const JsonAttr *) lfirst(cell);
+ 			const char *key = attr->key;
+ 			const Json *value = attr->value;
+ 
+ 			if (sep)
+ 			{
+ 				appendStringInfoChar(buf, ',');
+ 				if (space != NULL)
+ 					appendStringInfoChar(buf, '\n');
+ 			}
+ 			else
+ 				sep = true;
+ 			append_indent(buf, space, indent + 1);
+ 			appendStringInfoString(buf, key);
+ 			appendStringInfoChar(buf, ':');
+ 			if (space != NULL)
+ 				appendStringInfoChar(buf, ' ');
+ 			append_json(buf, value, space, indent + 1);
+ 		}
+ 		if (sep && space != NULL)
+ 			appendStringInfoChar(buf, '\n');
+ 		append_indent(buf, space, indent);
+ 		appendStringInfoChar(buf, '}');
+ 		break;
+ 	default:
+ 		elog(ERROR, "unexpected json type: %d", node->type);
+ 	}
+ }
+ 
+ static void
+ append_indent(StringInfo buf, const text *space, int indent)
+ {
+ 	int		i;
+ 
+ 	if (space == NULL)
+ 		return;	/* do nothing */
+ 
+ 	for (i = 0; i < indent; i++)
+ 		appendBinaryStringInfo(buf,
+ 			VARDATA_ANY(space), VARSIZE_ANY_EXHDR(space));
+ }
+ 
+ #ifdef NOT_USED
+ /*
+  * json_free -- free JSON node tree recursively
+  */
+ static void
+ json_free(Json *node)
+ {
+ 	ListCell   *cell;
+ 
+ 	switch (node->type)
+ 	{
+ 	case JSON_NULL:
+ 	case JSON_BOOL:
+ 		break;
+ 	case JSON_NUMBER:
+ 		pfree(node->value.number);
+ 		break;
+ 	case JSON_STRING:
+ 		pfree(node->value.string);
+ 		break;
+ 	case JSON_ARRAY:
+ 		foreach (cell, node->value.array)
+ 		{
+ 			Json *item = (Json *) lfirst(cell);
+ 
+ 			json_free(item);
+ 		}
+ 		list_free(node->value.array);
+ 		break;
+ 	case JSON_OBJECT:
+ 		foreach (cell, node->value.object)
+ 		{
+ 			JsonAttr *attr = (JsonAttr *) lfirst(cell);
+ 
+ 			pfree(attr->key);
+ 			json_free(attr->value);
+ 			pfree(attr);
+ 		}
+ 		list_free(node->value.object);
+ 		break;
+ 	default:
+ 		elog(ERROR, "unexpected json type: %d", node->type);
+ 	}
+ 
+ 	pfree(node);
+ }
+ #endif
+ 
+ void
+ json_yyerror(yyscan_t scanner, const char *error)
+ {
+ 	/*
+ 	 * No error report here. Instead, the caller will receive a NULL node
+ 	 * tree on error.
+ 	 */
+ }
+ 
+ void *
+ json_yyalloc(size_t bytes, void *yyscanner)
+ {
+ 	return palloc(bytes);
+ }
+ 
+ void *
+ json_yyrealloc(void *ptr, size_t bytes, void *yyscanner)
+ {
+ 	if (ptr != NULL)
+ 		return repalloc(ptr, bytes);
+ 	else
+ 		return palloc(bytes);
+ }
+ 
+ void
+ json_yyfree(void *ptr, void *yyscanner)
+ {
+ 	if (ptr != NULL)
+ 		pfree(ptr);
+ }
diff --git a/src/backend/utils/adt/json_parser.y b/src/backend/utils/adt/json_parser.y
index ...dd3914d .
*** a/src/backend/utils/adt/json_parser.y
--- b/src/backend/utils/adt/json_parser.y
***************
*** 0 ****
--- 1,223 ----
+ %{
+ /*-------------------------------------------------------------------------
+  *
+  * json_parser.y
+  *	  Parser for JSON data types.
+  *
+  * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *
+  * IDENTIFICATION
+  *	  src/backend/utils/adt/json_parser.y
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #include "postgres.h"
+ #include "nodes/pg_list.h"
+ #include "utils/json.h"
+ 
+ static Json *makeJsonNode(JsonType type);
+ static Json *makeJsonNull(JsonParser *parser);
+ static Json *makeJsonBool(JsonParser *parser, bool value);
+ static Json *makeJsonNumber(JsonParser *parser, char *value);
+ static Json *makeJsonString(JsonParser *parser, char *value);
+ static Json *makeJsonArray(JsonParser *parser, List *items);
+ static Json *makeJsonObject(JsonParser *parser, List *attrs);
+ static JsonAttr *makeJsonAttr(JsonParser *parser, char *key, Json *value);
+ static List *makeList(JsonParser *parser, void *elem);
+ static List *lconsList(JsonParser *parser, void *elem, List *list);
+ 
+ #include "json_parser.h"
+ #include "json_scanner.h"
+ 
+ #define getParser()		((JsonParser *) json_yyget_extra(scanner))
+ 
+ extern void json_yyerror(yyscan_t, const char *);
+ %}
+ 
+ %defines
+ %expect 0
+ %name-prefix="json_yy"
+ %pure-parser
+ %parse-param	{yyscan_t scanner}
+ %lex-param		{yyscan_t scanner}
+ 
+ %union {
+ 	bool		boolean;
+ 	char	   *number;
+ 	char	   *string;
+ 	Json	   *node;
+ 	JsonAttr   *attr;
+ 	List	   *list;
+ }
+ 
+ %token <string>			T_JSON_STRING
+ %token <number>			T_JSON_NUMBER
+ %token <boolean>		T_JSON_TRUE T_JSON_FALSE
+ %token T_JSON_NULL
+ %token T_JSON_INVALID
+ 
+ %type <number>			number
+ %type <string>			string
+ %type <node>			json value array object 
+ %type <list>			items attrs
+ %type <attr>			attr
+ 
+ %%
+ json:
+ 	  value				{ getParser()->json = $1; }
+ ;
+ 
+ array:
+ 	  '[' ']'			{ $$ = makeJsonArray(getParser(), NIL); }
+ 	| '[' items ']'		{ $$ = makeJsonArray(getParser(), $2); }
+ ;
+ 
+ items:
+ 	  value				{ $$ = makeList(getParser(), $1); }
+ 	| value ',' items	{ $$ = lconsList(getParser(), $1, $3); }
+ ;
+ 
+ object:
+ 	  '{' '}'			{ $$ = makeJsonObject(getParser(), NIL); }
+ 	| '{' attrs '}'		{ $$ = makeJsonObject(getParser(), $2); }
+ ;
+ 
+ attrs:
+ 	  attr				{ $$ = makeList(getParser(), $1); }
+ 	| attr ',' attrs	{ $$ = lconsList(getParser(), $1, $3); }
+ ;
+ 
+ attr:
+ 	  string ':' value	{ $$ = makeJsonAttr(getParser(), $1, $3); }
+ ;
+ 
+ value:
+ 	  T_JSON_NULL		{ $$ = makeJsonNull(getParser()); }
+ 	| T_JSON_TRUE		{ $$ = makeJsonBool(getParser(), true); }
+ 	| T_JSON_FALSE		{ $$ = makeJsonBool(getParser(), false); }
+ 	| number			{ $$ = makeJsonNumber(getParser(), $1); }
+ 	| string			{ $$ = makeJsonString(getParser(), $1); }
+ 	| array				{ $$ = $1; }
+ 	| object			{ $$ = $1; }
+ ;
+ 
+ number:		T_JSON_NUMBER		{ $$ = $1; };
+ string:		T_JSON_STRING		{ $$ = $1; };
+ 
+ %%
+ static Json *
+ makeJsonNode(JsonType type)
+ {
+ 	Json *node = (Json *) palloc(sizeof(Json));
+ 	node->type = type;
+ 	return node;
+ }
+ 
+ static Json *
+ makeJsonNull(JsonParser *parser)
+ {
+ 	if (parser->validateOnly)
+ 		return NULL;
+ 	else
+ 		return makeJsonNode(JSON_NULL);
+ }
+ 
+ static Json *
+ makeJsonBool(JsonParser *parser, bool value)
+ {
+ 	if (parser->validateOnly)
+ 		return NULL;
+ 	else
+ 	{
+ 		Json *node = makeJsonNode(JSON_BOOL);
+ 		node->value.boolean = value;
+ 		return node;
+ 	}
+ }
+ 
+ static Json *
+ makeJsonNumber(JsonParser *parser, char *value)
+ {
+ 	if (parser->validateOnly)
+ 		return NULL;
+ 	else
+ 	{
+ 		Json *node = makeJsonNode(JSON_NUMBER);
+ 		node->value.number = value;
+ 		return node;
+ 	}
+ }
+ 
+ static Json *
+ makeJsonString(JsonParser *parser, char *value)
+ {
+ 	if (parser->validateOnly)
+ 		return NULL;
+ 	else
+ 	{
+ 		Json *node = makeJsonNode(JSON_STRING);
+ 		node->value.string = value;
+ 		return node;
+ 	}
+ }
+ 
+ static Json *
+ makeJsonArray(JsonParser *parser, List *items)
+ {
+ 	if (parser->validateOnly)
+ 		return NULL;
+ 	else
+ 	{
+ 		Json *node = makeJsonNode(JSON_ARRAY);
+ 		node->value.array = items;
+ 		return node;
+ 	}
+ }
+ 
+ static Json *
+ makeJsonObject(JsonParser *parser, List *attrs)
+ {
+ 	if (parser->validateOnly)
+ 		return NULL;
+ 	else
+ 	{
+ 		Json *node = makeJsonNode(JSON_OBJECT);
+ 		node->value.object = attrs;
+ 		return node;
+ 	}
+ }
+ 
+ static JsonAttr *
+ makeJsonAttr(JsonParser *parser, char *key, Json *value)
+ {
+ 	if (parser->validateOnly)
+ 		return NULL;
+ 	else
+ 	{
+ 		JsonAttr *attr = (JsonAttr *) palloc(sizeof(JsonAttr));
+ 		attr->key = key;
+ 		attr->value = value;
+ 		return attr;
+ 	}
+ }
+ 
+ static List *
+ makeList(JsonParser *parser, void *elem)
+ {
+ 	if (parser->validateOnly)
+ 		return NIL;
+ 	else
+ 		return list_make1(elem);
+ }
+ 
+ static List *
+ lconsList(JsonParser *parser, void *elem, List *list)
+ {
+ 	if (parser->validateOnly)
+ 		return NIL;
+ 	else
+ 		return lcons(elem, list);
+ }
diff --git a/src/backend/utils/adt/json_scanner.l b/src/backend/utils/adt/json_scanner.l
index ...cdf2a6a .
*** a/src/backend/utils/adt/json_scanner.l
--- b/src/backend/utils/adt/json_scanner.l
***************
*** 0 ****
--- 1,92 ----
+ %{
+ /*-------------------------------------------------------------------------
+  *
+  * json_scanner.l
+  *	  Lexical scanner for JSON data types.
+  *
+  * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *
+  * IDENTIFICATION
+  *	  src/backend/utils/adt/json_scanner.l
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #include "postgres.h"
+ #include "nodes/pg_list.h"
+ #include "utils/json.h"
+ #include "json_parser.h"
+ 
+ #define getParser()			((JsonParser *) json_yyget_extra(yyscanner))
+ %}
+ 
+ %option reentrant
+ %option bison-bridge
+ %option 8bit
+ %option never-interactive
+ %option nodefault
+ %option noinput
+ %option nounput
+ %option noyywrap
+ %option noyyalloc
+ %option noyyrealloc
+ %option noyyfree
+ %option warn
+ %option prefix="json_yy"
+ %option outfile="json_scanner.c" header-file="json_scanner.h"
+ 
+ %x str
+ 
+ %%
+ 	/* Whitespace */
+ [ \t\r\n]
+ 
+ 	/* Symbol */
+ [\[\]\{\}:,] return yytext[0];
+ 
+ 	/* Null */
+ null		{ return T_JSON_NULL; }
+ 
+ 	/* Boolean */
+ true		{ return T_JSON_TRUE; }
+ false		{ return T_JSON_FALSE; }
+ 
+ 	/* Number */
+ -?(0|[1-9][0-9]*)(\.[0-9]+)?([Ee][+-]?[0-9]+)?	{
+ 		JsonParser *parser = getParser();
+ 		if (!parser->validateOnly)
+ 			yylval->number = pstrdup(yytext);
+ 		return T_JSON_NUMBER;
+ 	}
+ 
+ 	/* String */
+ \"	{
+ 		JsonParser *parser = getParser();
+ 		BEGIN str;
+ 		if (!parser->validateOnly)
+ 			appendStringInfoString(&parser->buf, yytext);
+ 	}
+ <str>([^\"\\\x00-\x1F]+|\\u[0-9A-Fa-f]{4}|\\[\"\\/bfnrt]) {
+ 		JsonParser *parser = getParser();
+ 		if (!parser->validateOnly)
+ 			appendStringInfoString(&parser->buf, yytext);
+ 	}
+ <str>\" {
+ 		JsonParser *parser = getParser();
+ 		if (!parser->validateOnly)
+ 		{
+ 			appendStringInfoString(&parser->buf, yytext);
+ 			yylval->string = pstrdup(parser->buf.data);
+ 			resetStringInfo(&parser->buf);
+ 		}
+ 		BEGIN(INITIAL);
+ 		return T_JSON_STRING;
+ 	}
+ 		/* unterminated string */
+ <str>.|\n		{ return T_JSON_INVALID; }
+ <str><<EOF>>	{ return T_JSON_INVALID; }
+ 
+ 	/* Invalid */
+ . return T_JSON_INVALID;
diff --git a/src/include/catalog/pg_cast.h b/src/include/catalog/pg_cast.h
index 7e33b94..85b833f 100644
*** a/src/include/catalog/pg_cast.h
--- b/src/include/catalog/pg_cast.h
*************** DATA(insert (  869	 25  730 a f ));
*** 320,325 ****
--- 320,327 ----
  DATA(insert (	16	 25 2971 a f ));
  DATA(insert (  142	 25    0 a b ));
  DATA(insert (	25	142 2896 e f ));
+ DATA(insert (  321	 25    0 a b ));
+ DATA(insert (	25	321 3830 e f ));
  
  /*
   * Cross-category casts to and from VARCHAR
*************** DATA(insert (  869 1043  730 a f ));
*** 331,336 ****
--- 333,340 ----
  DATA(insert (	16 1043 2971 a f ));
  DATA(insert (  142 1043    0 a b ));
  DATA(insert ( 1043	142 2896 e f ));
+ DATA(insert (  321 1043    0 a b ));
+ DATA(insert ( 1043	321 3830 e f ));
  
  /*
   * Cross-category casts to and from BPCHAR
*************** DATA(insert (  869 1042  730 a f ));
*** 342,347 ****
--- 346,353 ----
  DATA(insert (	16 1042 2971 a f ));
  DATA(insert (  142 1042    0 a b ));
  DATA(insert ( 1042	142 2896 e f ));
+ DATA(insert (  321 1042    0 a b ));
+ DATA(insert ( 1042	321 3830 e f ));
  
  /*
   * Length-coercion functions
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 61c6b27..76d95b7 100644
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DESCR("determine if a string is well for
*** 4457,4462 ****
--- 4457,4482 ----
  DATA(insert OID = 3053 (  xml_is_well_formed_content	 PGNSP PGUID 12 1 0 0 f f f t f i 1 0 16 "25" _null_ _null_ _null_ _null_ xml_is_well_formed_content _null_ _null_ _null_ ));
  DESCR("determine if a string is well formed XML content");
  
+ /* JSON support */
+ DATA(insert OID = 3826 (  json_in			PGNSP PGUID 12 1 0 0 f f f t f i 1 0 321 "2275" _null_ _null_ _null_ _null_ json_in _null_ _null_ _null_ ));
+ DESCR("I/O");
+ DATA(insert OID = 3827 (  json_out			PGNSP PGUID 12 1 0 0 f f f t f i 1 0 2275 "321" _null_ _null_ _null_ _null_ json_out _null_ _null_ _null_ ));
+ DESCR("I/O");
+ DATA(insert OID = 3828 (  json_recv			PGNSP PGUID 12 1 0 0 f f f t f i 1 0 321 "2281" _null_ _null_ _null_ _null_ json_recv _null_ _null_ _null_ ));
+ DESCR("I/O");
+ DATA(insert OID = 3829 (  json_send			PGNSP PGUID 12 1 0 0 f f f t f i 1 0 17 "321" _null_ _null_ _null_ _null_ json_send _null_ _null_ _null_ ));
+ DESCR("I/O");
+ DATA(insert OID = 3830 (  json_parse		PGNSP PGUID 12 1 0 0 f f f t f s 1 0 321 "25" _null_ _null_ _null_ _null_ json_parse _null_ _null_ _null_ ));
+ DESCR("parse a character string to produce a JSON value");
+ DATA(insert OID = 3831 (  json_is_well_formed PGNSP PGUID 12 1 0 0 f f f t f i 1 0 16 "25" _null_ _null_ _null_ _null_ json_is_well_formed _null_ _null_ _null_ ));
+ DESCR("determine if a string is well formed JSON");
+ DATA(insert OID = 3832 (  json_stringify	PGNSP PGUID 12 1 0 0 f f f t f i 1 0 25 "321" _null_ _null_ _null_ _null_ json_stringify _null_ _null_ _null_ ));
+ DESCR("serialize a JSON value to a character string");
+ DATA(insert OID = 3833 (  json_stringify	PGNSP PGUID 12 1 0 0 f f f f f i 2 0 25 "321 25" _null_ _null_ _null_ _null_ json_stringify_space _null_ _null_ _null_ ));
+ DESCR("serialize a JSON value to a character string with space option");
+ DATA(insert OID = 3834 (  json_to_array		PGNSP PGUID 12 1 0 0 f f f t f i 1 0 322 "321" _null_ _null_ _null_ _null_ json_to_array _null_ _null_ _null_ ));
+ DESCR("convert a JSON array to an array of JSON values");
+ 
  /* uuid */
  DATA(insert OID = 2952 (  uuid_in		   PGNSP PGUID 12 1 0 0 f f f t f i 1 0 2950 "2275" _null_ _null_ _null_ _null_ uuid_in _null_ _null_ _null_ ));
  DESCR("I/O");
diff --git a/src/include/catalog/pg_type.h b/src/include/catalog/pg_type.h
index fc2c306..6034cc3 100644
*** a/src/include/catalog/pg_type.h
--- b/src/include/catalog/pg_type.h
*************** DESCR("storage manager");
*** 354,359 ****
--- 354,364 ----
  
  /* OIDS 300 - 399 */
  
+ DATA(insert OID = 321 ( json	   PGNSP PGUID -1 f b U f t \054 0 0 322 json_in json_out json_recv json_send - - - i x f 0 -1 0 _null_ _null_ ));
+ DESCR("JSON content");
+ #define JSONOID 321
+ DATA(insert OID = 322 ( _json	   PGNSP PGUID -1 f b A f t \054 0 321 0 array_in array_out array_recv array_send - - - i x f 0 -1 0 _null_ _null_ ));
+ 
  /* OIDS 400 - 499 */
  
  /* OIDS 500 - 599 */
diff --git a/src/include/utils/json.h b/src/include/utils/json.h
index ...55113b3 .
*** a/src/include/utils/json.h
--- b/src/include/utils/json.h
***************
*** 0 ****
--- 1,77 ----
+ /*-------------------------------------------------------------------------
+  *
+  * json.h
+  *	  Declarations for JSON data type support.
+  *
+  * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  * src/include/utils/json.h
+  *
+  *-------------------------------------------------------------------------
+  */
+ 
+ #ifndef JSON_H
+ #define JSON_H
+ 
+ #include "fmgr.h"
+ #include "nodes/execnodes.h"
+ #include "nodes/primnodes.h"
+ 
+ typedef struct varlena jsontype;
+ 
+ #define DatumGetJsonP(X)		((jsontype *) PG_DETOAST_DATUM(X))
+ #define DatumGetJsonPP(X)		((jsontype *) PG_DETOAST_DATUM_PACKED(X))
+ #define JsonPGetDatum(X)		PointerGetDatum(X)
+ 
+ #define PG_GETARG_JSON_P(n)		DatumGetJsonP(PG_GETARG_DATUM(n))
+ #define PG_GETARG_JSON_PP(n)	DatumGetJsonPP(PG_GETARG_DATUM(n))
+ #define PG_RETURN_JSON_P(x)		PG_RETURN_POINTER(x)
+ 
+ extern Datum json_in(PG_FUNCTION_ARGS);
+ extern Datum json_out(PG_FUNCTION_ARGS);
+ extern Datum json_recv(PG_FUNCTION_ARGS);
+ extern Datum json_send(PG_FUNCTION_ARGS);
+ extern Datum json_parse(PG_FUNCTION_ARGS);
+ extern Datum json_is_well_formed(PG_FUNCTION_ARGS);
+ extern Datum json_stringify(PG_FUNCTION_ARGS);
+ extern Datum json_stringify_space(PG_FUNCTION_ARGS);
+ extern Datum json_to_array(PG_FUNCTION_ARGS);
+ 
+ typedef enum JsonType
+ {
+ 	JSON_NULL,
+ 	JSON_BOOL,
+ 	JSON_NUMBER,
+ 	JSON_STRING,
+ 	JSON_ARRAY,
+ 	JSON_OBJECT
+ } JsonType;
+ 
+ typedef struct Json
+ {
+ 	JsonType	type;
+ 	union
+ 	{
+ 		bool	boolean;
+ 		char   *number;
+ 		char   *string;
+ 		List   *array;		/* a list of Json */
+ 		List   *object;		/* a list of JsonAttr */
+ 	} value;
+ } Json;
+ 
+ typedef struct JsonAttr
+ {
+ 	char	   *key;
+ 	Json	   *value;
+ } JsonAttr;
+ 
+ typedef struct JsonParser
+ {
+ 	Json		   *json;
+ 	StringInfoData	buf;
+ 	bool			validateOnly;
+ } JsonParser;
+ 
+ #endif   /* JSON_H */
diff --git a/src/test/regress/expected/json.out b/src/test/regress/expected/json.out
index ...fc87cd4 .
*** a/src/test/regress/expected/json.out
--- b/src/test/regress/expected/json.out
***************
*** 0 ****
--- 1,107 ----
+ CREATE TABLE jsontest (t text);
+ -- Control characters not allowed in JSON strings.
+ -- Can't test \x00 because it's not even a valid TEXT.
+ -- Although \x7F is a control character, the JSON RFC only says
+ -- "..., and the control characters (U+0000 through U+001F)".
+ -- However, whether to trust the RFC here or not is a tough call.
+ copy jsontest from stdin;
+ SELECT t, json_is_well_formed(t)::text FROM jsontest;
+                  t                 | json_is_well_formed 
+ -----------------------------------+---------------------
+                                    | false
+  null                              | true
+  {"booleans":[true,false]}         | true
+  0                                 | true
+  +0                                | false
+  -0                                | true
+  3                                 | true
+  -3                                | true
+  3.                                | false
+  3.1                               | true
+  .3                                | false
+  0.3                               | true
+  123.456e14234                     | true
+  123.456e-14234                    | true
+  123.456e+14234                    | true
+  0.3e                              | false
+  0.3e+                             | false
+  0.3e+5                            | true
+  0.3e-5                            | true
+  3e+1                              | true
+  5e5                               | true
+  5.e5                              | false
+  3.2e+1                            | true
+  3.2e+1.5                          | false
+  ""                                | true
+  "hello"                           | true
+  'hello'                           | false
+  hello                             | false
+  "hello                            | false
+  "hello"""                         | false
+  "\"hello\""                       | true
+  "newline                         +| false
+  "                                 | 
+  "\u00000"                         | true
+  "\u99999"                         | true
+  "\uDB00"                          | true
+  "\uDB00\uDBFF"                    | true
+  "\u1234\uD800"                    | true
+  "\ud800\udc00"                    | true
+  "\uFFFE\uFFFF"                    | true
+  "\UD834\UDD1E"                    | false
+  "\n\r\t\b"                        | true
+  "\n\r\t\v"                        | false
+  "\x01"                            | false
+  "\x1F"                            | false
+  "\x7F"                            | false
+  []                                | true
+  [,]                               | false
+  [,2]                              | false
+  [)                                | false
+  []]                               | false
+  [}                                | false
+  [1]                               | true
+  ["1":2]                           | false
+  [1,2,]                            | false
+  [1:2}                             | false
+  [3, [4, [5], 6]]                  | true
+  ["hello"]                         | true
+  ["hello", "bye", 3]               | true
+  {}                                | true
+  {]                                | false
+  {,}                               | false
+  {"1":{}}                          | true
+  {"1":2}                           | true
+  {"1":2,}                          | false
+  {1:2}                             | false
+  {"1":2, "3":4}                    | true
+  {"1":2, "3":4                     | false
+  {"1":2, "3" : [4, {}, ["5"]}      | false
+  {"no value"}                      | false
+  {"non-string key":null, 123:null} | false
+ (70 rows)
+ 
+ SELECT json_stringify('{"null":null, "boolean":true, "array":[1,2,3], "empty array":[], "empty object":{}}', '  ');
+    json_stringify    
+ ---------------------
+  {                  +
+    "null": null,    +
+    "boolean": true, +
+    "array": [       +
+      1,             +
+      2,             +
+      3              +
+    ],               +
+    "empty array": [ +
+    ],               +
+    "empty object": {+
+    }                +
+  }
+ (1 row)
+ 
+ SELECT json_to_array('[1, 2, 3, "string", [ "array" ], { "object":null }]');
+                      json_to_array                      
+ --------------------------------------------------------
+  {1,2,3,"\"string\"","[\"array\"]","{\"object\":null}"}
+ (1 row)
+ 
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index 4703d49..ca6ff28 100644
*** a/src/test/regress/expected/opr_sanity.out
--- b/src/test/regress/expected/opr_sanity.out
*************** WHERE c.castfunc = p.oid AND
*** 402,407 ****
--- 402,409 ----
  -- texttoxml(), which does an XML syntax check.
  -- As of 9.1, this finds the cast from pg_node_tree to text, which we
  -- intentionally do not provide a reverse pathway for.
+ -- Also, this finds the casts from json to text, varchar, and bpchar,
+ -- because of the same reason as xml.
  SELECT castsource::regtype, casttarget::regtype, castfunc, castcontext
  FROM pg_cast c
  WHERE c.castmethod = 'b' AND
*************** WHERE c.castmethod = 'b' AND
*** 416,424 ****
   pg_node_tree      | text              |        0 | i
   cidr              | inet              |        0 | i
   xml               | text              |        0 | a
   xml               | character varying |        0 | a
   xml               | character         |        0 | a
! (7 rows)
  
  -- **************** pg_operator ****************
  -- Look for illegal values in pg_operator fields.
--- 418,429 ----
   pg_node_tree      | text              |        0 | i
   cidr              | inet              |        0 | i
   xml               | text              |        0 | a
+  json              | text              |        0 | a
   xml               | character varying |        0 | a
+  json              | character varying |        0 | a
   xml               | character         |        0 | a
!  json              | character         |        0 | a
! (10 rows)
  
  -- **************** pg_operator ****************
  -- Look for illegal values in pg_operator fields.
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index a05bfeb..286fd4b 100644
*** a/src/test/regress/parallel_schedule
--- b/src/test/regress/parallel_schedule
*************** test: select_views portals_p2 foreign_ke
*** 91,97 ****
  # NB: temp.sql does a reconnect which transiently uses 2 connections,
  # so keep this parallel group to at most 19 tests
  # ----------
! test: plancache limit plpgsql copy2 temp domain rangefuncs prepare without_oid conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
  
  # run stats by itself because its delay may be insufficient under heavy load
  test: stats
--- 91,97 ----
  # NB: temp.sql does a reconnect which transiently uses 2 connections,
  # so keep this parallel group to at most 19 tests
  # ----------
! test: plancache limit plpgsql copy2 temp domain rangefuncs prepare without_oid conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml json
  
  # run stats by itself because its delay may be insufficient under heavy load
  test: stats
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index e2f8351..d7d6e45 100644
*** a/src/test/regress/serial_schedule
--- b/src/test/regress/serial_schedule
*************** test: returning
*** 123,126 ****
--- 123,127 ----
  test: largeobject
  test: with
  test: xml
+ test: json
  test: stats
diff --git a/src/test/regress/sql/json.sql b/src/test/regress/sql/json.sql
index ...a7e283d .
*** a/src/test/regress/sql/json.sql
--- b/src/test/regress/sql/json.sql
***************
*** 0 ****
--- 1,85 ----
+ CREATE TABLE jsontest (t text);
+ 
+ -- Control characters not allowed in JSON strings.
+ -- Can't test \x00 because it's not even a valid TEXT.
+ -- Although \x7F is a control character, the JSON RFC only says
+ -- "..., and the control characters (U+0000 through U+001F)".
+ -- However, whether to trust the RFC here or not is a tough call.
+ 
+ copy jsontest from stdin;
+ 
+ null
+ {"booleans":[true,false]}
+ 0
+ +0
+ -0
+ 3
+ -3
+ 3.
+ 3.1
+ .3
+ 0.3
+ 123.456e14234
+ 123.456e-14234
+ 123.456e+14234
+ 0.3e
+ 0.3e+
+ 0.3e+5
+ 0.3e-5
+ 3e+1
+ 5e5
+ 5.e5
+ 3.2e+1
+ 3.2e+1.5
+ ""
+ "hello"
+ 'hello'
+ hello
+ "hello
+ "hello"""
+ "\\"hello\\""
+ "newline\n"
+ "\\u00000"
+ "\\u99999"
+ "\\uDB00"
+ "\\uDB00\\uDBFF"
+ "\\u1234\\uD800"
+ "\\ud800\\udc00"
+ "\\uFFFE\\uFFFF"
+ "\\UD834\\UDD1E"
+ "\\n\\r\\t\\b"
+ "\\n\\r\\t\\v"
+ "\\x01"
+ "\\x1F"
+ "\\x7F"
+ []
+ [,]
+ [,2]
+ [)
+ []]
+ [}
+ [1]
+ ["1":2]
+ [1,2,]
+ [1:2}
+ [3, [4, [5], 6]]
+ ["hello"]
+ ["hello", "bye", 3]
+ {}
+ {]
+ {,}
+ {"1":{}}
+ {"1":2}
+ {"1":2,}
+ {1:2}
+ {"1":2, "3":4}
+ {"1":2, "3":4
+ {"1":2, "3" : [4, {}, ["5"]}
+ {"no value"}
+ {"non-string key":null, 123:null}
+ \.
+ 
+ SELECT t, json_is_well_formed(t)::text FROM jsontest;
+ 
+ SELECT json_stringify('{"null":null, "boolean":true, "array":[1,2,3], "empty array":[], "empty object":{}}', '  ');
+ SELECT json_to_array('[1, 2, 3, "string", [ "array" ], { "object":null }]');
diff --git a/src/test/regress/sql/opr_sanity.sql b/src/test/regress/sql/opr_sanity.sql
index 0d084a1..fb71bb2 100644
*** a/src/test/regress/sql/opr_sanity.sql
--- b/src/test/regress/sql/opr_sanity.sql
*************** WHERE c.castfunc = p.oid AND
*** 320,325 ****
--- 320,327 ----
  
  -- As of 9.1, this finds the cast from pg_node_tree to text, which we
  -- intentionally do not provide a reverse pathway for.
+ -- Also, this finds the casts from json to text, varchar, and bpchar,
+ -- because of the same reason as xml.
  
  SELECT castsource::regtype, casttarget::regtype, castfunc, castcontext
  FROM pg_cast c
#10Robert Haas
robertmhaas@gmail.com
In reply to: Itagaki Takahiro (#9)
Re: Basic JSON support

On Tue, Sep 21, 2010 at 8:38 AM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:

Sorry for my careless manner. Of course I read his code.
Do you know his contact address? I cannot find it...

It alarms me quite a bit that someone who is a committer on this
project would accidentally copy code from another project with a
different license into PostgreSQL. How does that happen? And how
much got copied, besides the regular expression? I would be inclined
to flush this patch altogether rather than take ANY risk of GPL
contamination.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#11Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Robert Haas (#10)
Re: Basic JSON support

On Tue, Sep 21, 2010 at 9:54 PM, Robert Haas <robertmhaas@gmail.com> wrote:

It alarms me quite a bit that someone who is a committer on this
project would accidentally copy code from another project with a
different license into PostgreSQL.  How does that happen?  And how
much got copied, besides the regular expression?  I would be inclined
to flush this patch altogether rather than take ANY risk of GPL
contamination.

Only the regular expressions in the scanner. So I thought it was OK,
but I should have been more careful. Sorry.

--
Itagaki Takahiro

#12Joseph Adams
joeyadams3.14159@gmail.com
In reply to: Joseph Adams (#8)
Re: Basic JSON support

On Mon, Sep 20, 2010 at 12:38 AM, Joseph Adams
<joeyadams3.14159@gmail.com> wrote:

Here's one thing I'm worried about: the bison/flex code in your patch
looks rather similar to the code in
http://www.jsonlint.com/bin/jsonval.tgz , which is licensed under the
GPL.  In particular, the incorrect number regex I discussed above can
also be found in jsonval verbatim.  However, because there are a lot
of differences in both the bison and flex code now,  I'm not sure
they're close enough to be "copied", but I am not a lawyer.  It might
be a good idea to contact Ben Spencer and ask him for permission to
license our modified version of the code under PostgreSQL's more
relaxed license, just to be on the safe side.

With the help and motivation of David Fetter, I sent an e-mail to Ben
Spencer (google "jsonval", check out the git repo, and look in the git
log to find his e-mail address) a few minutes ago, asking if he would
license jsonval under a license compatible with PostgreSQL, and am
currently awaiting a response.

If he doesn't respond, or outright refuses (which I, for one, doubt
will happen), my fallback plan is to rewrite the JSON validation code
by drawing from my original code (meaning it won't be in bison/flex)
and post a patch for it. Unfortunately, it seems to me that there
aren't very many ways of expressing a JSON parser in bison/flex, and
thus the idea of JSON parsing with bison/flex is effectively locked
down by the GPL unless we can get a more permissive license for
jsonval. But, I am not a lawyer.

#13Robert Haas
robertmhaas@gmail.com
In reply to: Joseph Adams (#12)
Re: Basic JSON support

On Mon, Oct 4, 2010 at 2:50 PM, Joseph Adams <joeyadams3.14159@gmail.com> wrote:

On Mon, Sep 20, 2010 at 12:38 AM, Joseph Adams
<joeyadams3.14159@gmail.com> wrote:

Here's one thing I'm worried about: the bison/flex code in your patch
looks rather similar to the code in
http://www.jsonlint.com/bin/jsonval.tgz , which is licensed under the
GPL.  In particular, the incorrect number regex I discussed above can
also be found in jsonval verbatim.  However, because there are a lot
of differences in both the bison and flex code now,  I'm not sure
they're close enough to be "copied", but I am not a lawyer.  It might
be a good idea to contact Ben Spencer and ask him for permission to
license our modified version of the code under PostgreSQL's more
relaxed license, just to be on the safe side.

With the help and motivation of David Fetter, I sent an e-mail to Ben
Spencer (google "jsonval", check out the git repo, and look in the git
log to find his e-mail address) a few minutes ago, asking if he would
license jsonval under a license compatible with PostgreSQL, and am
currently awaiting a response.

If he doesn't respond, or outright refuses (which I, for one, doubt
will happen), my fallback plan is to rewrite the JSON validation code
by drawing from my original code (meaning it won't be in bison/flex)
and post a patch for it.  Unfortunately, it seems to me that there
aren't very many ways of expressing a JSON parser in bison/flex, and
thus the idea of JSON parsing with bison/flex is effectively locked
down by the GPL unless we can get a more permissive license for
jsonval.  But, I am not a lawyer.

If someone who hasn't looked at the GPL code sits down and codes
something up based on the json.org home page, it's hard to imagine how
anyone could be grumpy about that. But we do need to be at some
pains to make sure that anything we do is truly clean-room, because
pains to make sure that anything we do is truly clean-room, because

Thanks for continuing to pursue this!

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#13)
Re: Basic JSON support

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Oct 4, 2010 at 2:50 PM, Joseph Adams <joeyadams3.14159@gmail.com> wrote:

If he doesn't respond, or outright refuses (which I, for one, doubt
will happen), my fallback plan is to rewrite the JSON validation code
by drawing from my original code (meaning it won't be in bison/flex)
and post a patch for it.  Unfortunately, it seems to me that there
aren't very many ways of expressing a JSON parser in bison/flex, and
thus the idea of JSON parsing with bison/flex is effectively locked
down by the GPL unless we can get a more permissive license for
jsonval.  But, I am not a lawyer.

If someone who hasn't looked at the GPL code sits down and codes
something up based on the json.org home page, it's hard to imagine how
anyone could be grumpy about that.

Yeah. Joseph seems to be confusing copyrights with patents. The idea
of "parse JSON with bison/flex" is not patentable by any stretch of the
imagination.

But having said that, I wonder whether bison/flex are really the best
tool for the job in the first place. From what I understand of JSON
(which admittedly ain't much) a bison parser seems like overkill:
it'd probably be both bloated and slow compared to a simple handwritten
recursive-descent parser.

regards, tom lane
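[Editorial note: the hand-written recursive-descent approach Tom describes is easy to picture. The sketch below is purely illustrative — it is not the code that eventually went into PostgreSQL, and the function names (`json_valid`, `parse_value`, etc.) are invented for this example. It validates roughly the same grammar as the flex/bison parser in the attached patch, including accepting top-level scalar values:]

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

static const char *p;			/* current scan position */

static bool parse_value(void);

static void skip_ws(void)
{
	while (*p == ' ' || *p == '\t' || *p == '\r' || *p == '\n')
		p++;
}

/* string: '"' chars '"', with backslash escapes and no control chars */
static bool parse_string(void)
{
	if (*p++ != '"')
		return false;
	while (*p && *p != '"')
	{
		if ((unsigned char) *p < 0x20)
			return false;		/* U+0000..U+001F forbidden */
		if (*p == '\\')
		{
			p++;
			if (*p == 'u')		/* \uXXXX: exactly four hex digits */
			{
				int		i;

				for (p++, i = 0; i < 4; i++, p++)
					if (!isxdigit((unsigned char) *p))
						return false;
			}
			else if (*p && strchr("\"\\/bfnrt", *p))
				p++;
			else
				return false;
		}
		else
			p++;
	}
	return *p++ == '"';
}

/* number: -?(0|[1-9][0-9]*)(\.[0-9]+)?([Ee][+-]?[0-9]+)? */
static bool parse_number(void)
{
	if (*p == '-') p++;
	if (*p == '0') p++;
	else if (isdigit((unsigned char) *p))
		while (isdigit((unsigned char) *p)) p++;
	else return false;
	if (*p == '.')
	{
		p++;
		if (!isdigit((unsigned char) *p)) return false;
		while (isdigit((unsigned char) *p)) p++;
	}
	if (*p == 'e' || *p == 'E')
	{
		p++;
		if (*p == '+' || *p == '-') p++;
		if (!isdigit((unsigned char) *p)) return false;
		while (isdigit((unsigned char) *p)) p++;
	}
	return true;
}

static bool parse_value(void)
{
	skip_ws();
	switch (*p)
	{
	case '{':					/* object: '{' [string ':' value , ...] '}' */
		p++; skip_ws();
		if (*p == '}') { p++; return true; }
		for (;;)
		{
			skip_ws();
			if (!parse_string()) return false;
			skip_ws();
			if (*p++ != ':') return false;
			if (!parse_value()) return false;
			skip_ws();
			if (*p == ',') { p++; continue; }
			return *p++ == '}';
		}
	case '[':					/* array: '[' [value , ...] ']' */
		p++; skip_ws();
		if (*p == ']') { p++; return true; }
		for (;;)
		{
			if (!parse_value()) return false;
			skip_ws();
			if (*p == ',') { p++; continue; }
			return *p++ == ']';
		}
	case '"':
		return parse_string();
	default:					/* literals and numbers; scalars allowed */
		if (strncmp(p, "null", 4) == 0)  { p += 4; return true; }
		if (strncmp(p, "true", 4) == 0)  { p += 4; return true; }
		if (strncmp(p, "false", 5) == 0) { p += 5; return true; }
		return parse_number();
	}
}

/* whole document: one value followed only by trailing whitespace */
static bool json_valid(const char *s)
{
	p = s;
	if (!parse_value()) return false;
	skip_ws();
	return *p == '\0';
}
```

Each grammar production becomes one function or switch arm, error positions fall out of `p` for free, and there is no generated code and hence no licensing question at all.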

#15Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#14)
Re: Basic JSON support

All,

But having said that, I wonder whether bison/flex are really the best
tool for the job in the first place. From what I understand of JSON
(which admittedly ain't much) a bison parser seems like overkill:
it'd probably be both bloated and slow compared to a simple handwritten
recursive-descent parser.

This appears not to be necessary. The author of JSONval has indicated
that, should we choose to include it in PostgreSQL 9.1, he is open to
re-licensing.

So on a completely *technical* basis, do we want to use JSONval?

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#16Andrew Dunstan
andrew@dunslane.net
In reply to: Josh Berkus (#15)
Re: Basic JSON support

On 10/04/2010 08:00 PM, Josh Berkus wrote:

All,

But having said that, I wonder whether bison/flex are really the best
tool for the job in the first place. From what I understand of JSON
(which admittedly ain't much) a bison parser seems like overkill:
it'd probably be both bloated and slow compared to a simple handwritten
recursive-descent parser.

This appears not to be necessary. The author of JSONval has indicated
that, should we choose to include it in PostgreSQL 9.1, he is open to
re-licensing.

So on a completely *technical* basis, do we want to use JSONval?

I agree with Tom that a hand-cut RD parser is much more likely to be the
way to go. We should not use bison/flex for parsing data values.

cheers

andrew

#17Joseph Adams
joeyadams3.14159@gmail.com
In reply to: Tom Lane (#14)
Re: Basic JSON support

On Mon, Oct 4, 2010 at 7:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah.  Joseph seems to be confusing copyrights with patents.  The idea
of "parse JSON with bison/flex" is not patentable by any stretch of the
imagination.

What I meant is, anyone who sets out to write a JSON parser with
bison/flex is probably going to end up with something very similar to
jsonval even without looking at it.

But having said that, I wonder whether bison/flex are really the best
tool for the job in the first place.  From what I understand of JSON
(which admittedly ain't much) a bison parser seems like overkill:
it'd probably be both bloated and slow compared to a simple handwritten
recursive-descent parser.

I agree, and also because JSON isn't a moving target. However, in my
opinion, the differences in quality between a bison parser and a
handwritten parser are too small to be a big deal right now.

A bison parser is probably slower and bigger than a simple
hand-written one, but I haven't benchmarked it, and it's probably not
worth benchmarking at this point.

In correctness, I suspect a bison parser and a hand-written one to be
about the same. Both pass the same army of testcases I built up while
writing my original JSON patch. Granted, there could be weird
semantics of bison/flex that make the parser wrong in subtle ways (for
instance, the . regex char doesn't match \n and EOF), but there can
also be weird errors in hand-written code. Pick your poison.

In safety, a handwritten recursive descent parser could stack overflow
and crash the server unless the proper guards are in place (e.g. if
someone tries to encode '[[[[[[...' as JSON). bison guards against
this (chopping the maximum depth of JSON trees to 9997 levels or so),
but I don't know if the real stack is smaller than bison thinks it is,
or if bison uses its own buffer. A handwritten parser could also
accept trees of any depth using a manually managed stack, but it
wouldn't do much good because code that works with JSON trees (e.g.
json_stringify) is currently recursive and has the same problem. In
any case, I think limiting the depth of JSON, to something generous
like 1000, is the right way to go. The RFC allows us to do that.

In my opinion, the path of least resistance is the best path for now.
If we use the bison parser now, then we can replace it with a
hand-written one that's functionally equivalent later.

Joey Adams

#18Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#14)
Re: Basic JSON support

2010/10/5 Tom Lane <tgl@sss.pgh.pa.us>:

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Oct 4, 2010 at 2:50 PM, Joseph Adams <joeyadams3.14159@gmail.com> wrote:

If he doesn't respond, or outright refuses (which I, for one, doubt
will happen), my fallback plan is to rewrite the JSON validation code
by drawing from my original code (meaning it won't be in bison/flex)
and post a patch for it.  Unfortunately, it seems to me that there
aren't very many ways of expressing a JSON parser in bison/flex, and
thus the idea of JSON parsing with bison/flex is effectively locked
down by the GPL unless we can get a more permissive license for
jsonval.  But, I am not a lawyer.

If someone who hasn't looked at the GPL code sits down and codes
something up based on the json.org home page, it's hard to imagine how
anyone could be grumpy about that.

Yeah.  Joseph seems to be confusing copyrights with patents.  The idea
of "parse JSON with bison/flex" is not patentable by any stretch of the
imagination.

But having said that, I wonder whether bison/flex are really the best
tool for the job in the first place.  From what I understand of JSON
(which admittedly ain't much) a bison parser seems like overkill:
it'd probably be both bloated and slow compared to a simple handwritten
recursive-descent parser.

PostgreSQL uses bison for similarly simple things - cube, PostGIS,
bootstrap - so I don't see any special overhead. I think the lexer
could be shared with the core parser. A bison parser for JSON means
fewer lines to maintain than repeating the same basic pattern by hand
in the pg source code.

Regards

Pavel Stehule

./contrib/seg/segparse.y
./contrib/cube/cubeparse.y
./src/backend/bootstrap/bootparse.y
./src/backend/parser/gram.y
./src/pl/plpgsql/src/gram.y
./src/interfaces/ecpg/preproc/preproc.y


                       regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers