timestamptz parsing bug?

Started by Andrew Dunstan over 14 years ago, 9 messages
#1 Andrew Dunstan
andrew@dunslane.net

Why do we parse this as a correct timestamptz literal:

2011-08-29T09:11:14.123 CDT

but not this:

2011-08-29T09:11:14.123 America/Chicago

Replace the ISO-8601 style T between the date and time parts of the
latter with a space and the parser is happy again.
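
As an illustration of that workaround, here is a minimal C sketch that rewrites the separator before the literal is handed to the server. The helper name replace_iso_t_separator is hypothetical and not part of PostgreSQL, and the sketch assumes the date part is exactly YYYY-MM-DD.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper (not part of PostgreSQL): if the input looks like
 * "YYYY-MM-DDThh...", replace the 'T' separator with a space, in place.
 * Returns 1 if a replacement was made, 0 otherwise.
 */
static int
replace_iso_t_separator(char *s)
{
    size_t len = strlen(s);

    /* Need at least "YYYY-MM-DD", the separator, and one digit. */
    if (len < 12)
        return 0;
    /* Assumption: the date part uses '-' delimiters at fixed positions. */
    if (s[4] != '-' || s[7] != '-')
        return 0;
    /* Only touch a 'T' (or 't') that is directly followed by a digit. */
    if ((s[10] != 'T' && s[10] != 't') || !isdigit((unsigned char) s[11]))
        return 0;

    s[10] = ' ';
    return 1;
}
```

Called on a writable copy of the second literal above, it leaves the time zone name untouched and changes only the character at offset 10.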

cheers

andrew

#2 Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Andrew Dunstan (#1)
1 attachment(s)
Re: timestamptz parsing bug?

On 29 August 2011 15:40, Andrew Dunstan <andrew@dunslane.net> wrote:

> Why do we parse this as a correct timestamptz literal:
>
>    2011-08-29T09:11:14.123 CDT
>
> but not this:
>
>    2011-08-29T09:11:14.123 America/Chicago
>
> Replace the ISO-8601 style T between the date and time parts of the latter
> with a space and the parser is happy again.

Funny, I've just recently been looking at this code.

I think that the issue is in the DTK_TIME handling code in DecodeDateTime().

For this input string the "T" is recognised as the start of an ISO
time, and the ptype variable is set to DTK_TIME. The next field is a
DTK_TIME, however, when it is handled it doesn't reset the ptype
variable.

When it gets to the timezone "America/Chicago" at the end, this is
handled in the DTK_DATE case, because of the "/". But because ptype is
still set, it is expecting this to be an ISO time, so it errors out.

The attached patch seems to fix it. Could probably use a new
regression test though.
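
The state handling described above can be sketched with a toy model in C. This is not the real DecodeDateTime(): the field kinds, return codes, and the function model_decode are all hypothetical, and the model assumes that plain string fields such as "CDT" never consult the prefix flag. It only tracks whether the flag set by "T" is still pending when the time zone field arrives.

```c
#include <stddef.h>

#define MODEL_OK          0
#define MODEL_BAD_FORMAT (-1)

/* Hypothetical field kinds for this toy model; not the real DTK_* codes. */
typedef enum
{
    FLD_DATE,       /* e.g. "2011-08-29" */
    FLD_ISO_T,      /* the "T" separator */
    FLD_TIME,       /* e.g. "09:11:14.123" */
    FLD_TZ_ABBREV,  /* e.g. "CDT", a plain string field */
    FLD_TZ_NAME     /* e.g. "America/Chicago", treated like a date field */
} ModelField;

/*
 * Walk the fields in order. "patched" selects whether the time field
 * clears the pending-prefix flag (1) or leaves it set (0).
 */
static int
model_decode(const ModelField *fields, size_t nf, int patched)
{
    int     expect_iso_time = 0;    /* plays the role of ptype */
    size_t  i;

    for (i = 0; i < nf; i++)
    {
        switch (fields[i])
        {
            case FLD_DATE:
                break;
            case FLD_ISO_T:
                /* "T" announces that an ISO time should follow. */
                expect_iso_time = 1;
                break;
            case FLD_TIME:
                /* The fix: the time field consumes the pending prefix. */
                if (patched)
                    expect_iso_time = 0;
                break;
            case FLD_TZ_ABBREV:
                /* Assumption: string fields ignore the prefix flag. */
                break;
            case FLD_TZ_NAME:
                /* A stale flag makes this field look like an ISO time. */
                if (expect_iso_time)
                    return MODEL_BAD_FORMAT;
                break;
        }
    }
    return MODEL_OK;
}
```

In this model the sequence date, "T", time, abbreviation is accepted either way, while date, "T", time, zone name is rejected unless the time field clears the flag, which mirrors the two literals from the original report.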

Regards,
Dean

Attachments:

datetime.patch (application/octet-stream)
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
new file mode 100644
index 3d320cc..a935d98
*** a/src/backend/utils/adt/datetime.c
--- b/src/backend/utils/adt/datetime.c
*************** DecodeDateTime(char **field, int *ftype,
*** 942,947 ****
--- 942,957 ----
  				break;
  
  			case DTK_TIME:
+ 				/*
+ 				 * This might be an ISO time following a "t" field.
+ 				 */
+ 				if (ptype != 0)
+ 				{
+ 					/* Sanity check; should not fail this test */
+ 					if (ptype != DTK_TIME)
+ 						return DTERR_BAD_FORMAT;
+ 					ptype = 0;
+ 				}
  				dterr = DecodeTime(field[i], fmask, INTERVAL_FULL_RANGE,
  								   &tmask, tm, fsec);
  				if (dterr)

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dean Rasheed (#2)
Re: timestamptz parsing bug?

Dean Rasheed <dean.a.rasheed@gmail.com> writes:

> On 29 August 2011 15:40, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>> Why do we parse this as a correct timestamptz literal:
>> 2011-08-29T09:11:14.123 CDT
>> but not this:
>> 2011-08-29T09:11:14.123 America/Chicago
>
> For this input string the "T" is recognised as the start of an ISO
> time, and the ptype variable is set to DTK_TIME. The next field is a
> DTK_TIME, however, when it is handled it doesn't reset the ptype
> variable.
>
> When it gets to the timezone "America/Chicago" at the end, this is
> handled in the DTK_DATE case, because of the "/". But because ptype is
> still set, it is expecting this to be an ISO time, so it errors out.

Do we actually *want* to support this? The "T" is supposed to mean that
the string is strictly ISO-conformant, no?

regards, tom lane

#4 David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#3)
Re: timestamptz parsing bug?

On Aug 29, 2011, at 12:30 PM, Tom Lane wrote:

>> When it gets to the timezone "America/Chicago" at the end, this is
>> handled in the DTK_DATE case, because of the "/". But because ptype is
>> still set, it is expecting this to be an ISO time, so it errors out.
>
> Do we actually *want* to support this? The "T" is supposed to mean that
> the string is strictly ISO-conformant, no?

I didn't realize that appending a time zone was not conformant, but apparently it's not.

http://en.wikipedia.org/wiki/ISO_8601#Time_zone_designators

Only appending a "Z" or an offset seems to be legal. Interesting.
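
To make that rule concrete, here is a minimal C sketch of a shape check for ISO 8601 time zone designators. The function name is_iso8601_tz_designator is hypothetical, and the accepted forms ("Z", and offsets written as hh, hhmm, or hh:mm after a sign) follow the summary on the page linked above; the numeric range of the offset is not validated.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if s is "Z" or a signed UTC offset of the
 * form +hh, +hhmm or +hh:mm (or the same with '-'), and 0 otherwise.
 * Only the shape is checked, not whether the offset is a sensible value.
 */
static int
is_iso8601_tz_designator(const char *s)
{
    size_t len = strlen(s);

    if (strcmp(s, "Z") == 0)
        return 1;
    if (s[0] != '+' && s[0] != '-')
        return 0;
    /* Every offset form starts with a two-digit hour. */
    if (len < 3 ||
        !isdigit((unsigned char) s[1]) ||
        !isdigit((unsigned char) s[2]))
        return 0;
    if (len == 3)                   /* +hh */
        return 1;
    if (len == 5)                   /* +hhmm */
        return isdigit((unsigned char) s[3]) &&
               isdigit((unsigned char) s[4]);
    if (len == 6)                   /* +hh:mm */
        return s[3] == ':' &&
               isdigit((unsigned char) s[4]) &&
               isdigit((unsigned char) s[5]);
    return 0;
}
```

Under this check neither "CDT" nor "America/Chicago" qualifies, which is the point being made here: both are extensions beyond what the standard allows after the time.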

David

#5 Andrew Dunstan
andrew@dunslane.net
In reply to: David E. Wheeler (#4)
Re: timestamptz parsing bug?

On 08/29/2011 03:35 PM, David E. Wheeler wrote:

> On Aug 29, 2011, at 12:30 PM, Tom Lane wrote:
>
>>> When it gets to the timezone "America/Chicago" at the end, this is
>>> handled in the DTK_DATE case, because of the "/". But because ptype is
>>> still set, it is expecting this to be an ISO time, so it errors out.
>>
>> Do we actually *want* to support this? The "T" is supposed to mean that
>> the string is strictly ISO-conformant, no?
>
> I didn't realize that appending a time zone was not conformant, but apparently it's not.
>
> http://en.wikipedia.org/wiki/ISO_8601#Time_zone_designators
>
> Only appending a "Z" or an offset seems to be legal. Interesting.

In that case we shouldn't be accepting an abbreviation either.

cheers

andrew

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#5)
Re: timestamptz parsing bug?

Andrew Dunstan <andrew@dunslane.net> writes:

> On 08/29/2011 03:35 PM, David E. Wheeler wrote:
>
>> On Aug 29, 2011, at 12:30 PM, Tom Lane wrote:
>>
>>> Do we actually *want* to support this? The "T" is supposed to mean that
>>> the string is strictly ISO-conformant, no?
>
> In that case we shouldn't be accepting an abbreviation either.

Yeah, that would be the logical conclusion. OTOH you could argue that
we don't want to remove the abbreviation case for backward-compatibility
reasons, in which case allowing full names as well is a reasonable
thing. I don't know the answer, I'm just asking the question.

regards, tom lane

#7 Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Andrew Dunstan (#5)
Re: timestamptz parsing bug?

On 29 August 2011 20:43, Andrew Dunstan <andrew@dunslane.net> wrote:

> On 08/29/2011 03:35 PM, David E. Wheeler wrote:
>
>> On Aug 29, 2011, at 12:30 PM, Tom Lane wrote:
>>
>>>> When it gets to the timezone "America/Chicago" at the end, this is
>>>> handled in the DTK_DATE case, because of the "/". But because ptype is
>>>> still set, it is expecting this to be an ISO time, so it errors out.
>>>
>>> Do we actually *want* to support this?  The "T" is supposed to mean that
>>> the string is strictly ISO-conformant, no?
>>
>> I didn't realize that appending a time zone was not conformant, but
>> apparently it's not.
>>
>>   http://en.wikipedia.org/wiki/ISO_8601#Time_zone_designators
>>
>> Only appending a "Z" or an offset seems to be legal. Interesting.
>
> In that case we shouldn't be accepting an abbreviation either.

The remit of the function is to support inputs in "almost any
reasonable format", not just ISO format. I'd say that supporting both
extensions of the ISO format is reasonable. Supporting one and not the
other is inconsistent, and removing support for one will likely break
someone's code.

Regards,
Dean

#8 Bruce Momjian
bruce@momjian.us
In reply to: Dean Rasheed (#2)
Re: timestamptz parsing bug?

I assume we want to apply this patch based on discussion that we should
allow a wider range of date/time formats.

---------------------------------------------------------------------------

On Mon, Aug 29, 2011 at 06:40:07PM +0100, Dean Rasheed wrote:

> The attached patch seems to fix it. Could probably use a new
> regression test though.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#9 Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#8)
Re: timestamptz parsing bug?

On Wed, Aug 15, 2012 at 05:29:26PM -0400, Bruce Momjian wrote:

> I assume we want to apply this patch based on discussion that we should
> allow a wider range of date/time formats.

Applied, thanks.
