to_date_valid()

Started by Andreas 'ads' Scherbaumalmost 10 years ago37 messageshackers
Jump to latest
#1Andreas 'ads' Scherbaum
adsmail@wars-nicht.de

Hello,

we have customers complaining that to_date() accepts invalid dates, and
returns a different date instead. This is a known issue:

http://sql-info.de/postgresql/notes/to_date-to_timestamp-gotchas.html

On the other hand this leads to wrong dates when loading dates into the
database, because the database happily accepts invalid dates and ends up
writing something completely different into the table.

The attached patch adds a new function "to_date_valid()" which will
validate the date and return an error if the input and output date do
not match. Tests included, documentation update as well.

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project

Attachments:

to_date_valid.patch.gzapplication/gzip; name=to_date_valid.patch.gzDownload
In reply to: Andreas 'ads' Scherbaum (#1)
Re: to_date_valid()

On 02-07-2016 22:04, Andreas 'ads' Scherbaum wrote:

The attached patch adds a new function "to_date_valid()" which will
validate the date and return an error if the input and output date do
not match. Tests included, documentation update as well.

Why don't you add a third parameter (say, validate = true | false)
instead of creating another function? The new parameter could default to
false to not break compatibility.

--
Euler Taveira Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Euler Taveira de Oliveira (#2)
Re: to_date_valid()

El 2/7/2016 20:33, "Euler Taveira" <euler@timbira.com.br> escribió:

On 02-07-2016 22:04, Andreas 'ads' Scherbaum wrote:

The attached patch adds a new function "to_date_valid()" which will
validate the date and return an error if the input and output date do
not match. Tests included, documentation update as well.

Why don't you add a third parameter (say, validate = true | false)
instead of creating another function? The new parameter could default to
false to not break compatibility.

Shouldn't we fix this instead? Sounds like a bug to me. We don't usually
want to be bug compatible so it doesn't matter if we break something.

--
Jaime Casanova http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#4Andreas 'ads' Scherbaum
adsmail@wars-nicht.de
In reply to: Jaime Casanova (#3)
Re: to_date_valid()

On 03.07.2016 07:05, Jaime Casanova wrote:

El 2/7/2016 20:33, "Euler Taveira" <euler@timbira.com.br
<mailto:euler@timbira.com.br>> escribió:

On 02-07-2016 22:04, Andreas 'ads' Scherbaum wrote:

The attached patch adds a new function "to_date_valid()" which will
validate the date and return an error if the input and output date do
not match. Tests included, documentation update as well.

Why don't you add a third parameter (say, validate = true | false)
instead of creating another function? The new parameter could default to
false to not break compatibility.

Shouldn't we fix this instead? Sounds like a bug to me. We don't usually
want to be bug compatible so it doesn't matter if we break something.

There are previous discussions about such a change, and this was rejected:

/messages/by-id/lbjf1v$a2v$1@ger.gmane.org
/messages/by-id/A737B7A37273E048B164557ADEF4A58B17C9140E@ntex2010i.host.magwien.gv.at

Hence the new function, which does not collide with the existing
implementation.

Regards,

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Craig Ringer
craig@2ndquadrant.com
In reply to: Euler Taveira de Oliveira (#2)
Re: to_date_valid()

On 3 July 2016 at 09:32, Euler Taveira <euler@timbira.com.br> wrote:

On 02-07-2016 22:04, Andreas 'ads' Scherbaum wrote:

The attached patch adds a new function "to_date_valid()" which will
validate the date and return an error if the input and output date do
not match. Tests included, documentation update as well.

Why don't you add a third parameter (say, validate = true | false)
instead of creating another function? The new parameter could default to
false to not break compatibility.

because

SELECT to_date('blah', 'pattern', true)

is less clear to read than

SELECT to_date_valid('blah', 'pattern')

and offers no advantage. It's likely faster to use a separate function too.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#6Pavel Stehule
pavel.stehule@gmail.com
In reply to: Craig Ringer (#5)
Re: to_date_valid()

2016-07-04 4:25 GMT+02:00 Craig Ringer <craig@2ndquadrant.com>:

On 3 July 2016 at 09:32, Euler Taveira <euler@timbira.com.br> wrote:

On 02-07-2016 22:04, Andreas 'ads' Scherbaum wrote:

The attached patch adds a new function "to_date_valid()" which will
validate the date and return an error if the input and output date do
not match. Tests included, documentation update as well.

Why don't you add a third parameter (say, validate = true | false)
instead of creating another function? The new parameter could default to
false to not break compatibility.

because

SELECT to_date('blah', 'pattern', true)

is less clear to read than

SELECT to_date_valid('blah', 'pattern')

and offers no advantage. It's likely faster to use a separate function too.

personally I prefer first variant - this is same function with stronger
check.

The name to_date_valid sounds little bit strange - maybe to_date_strict
should be better.

Regards

Pavel

Show quoted text

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#7Gavin Flower
GavinFlower@archidevsys.co.nz
In reply to: Pavel Stehule (#6)
Re: to_date_valid()

On 04/07/16 15:19, Pavel Stehule wrote:

2016-07-04 4:25 GMT+02:00 Craig Ringer <craig@2ndquadrant.com
<mailto:craig@2ndquadrant.com>>:

On 3 July 2016 at 09:32, Euler Taveira <euler@timbira.com.br
<mailto:euler@timbira.com.br>> wrote:

On 02-07-2016 22 <tel:02-07-2016%2022>:04, Andreas 'ads'
Scherbaum wrote:

The attached patch adds a new function "to_date_valid()"

which will

validate the date and return an error if the input and

output date do

not match. Tests included, documentation update as well.

Why don't you add a third parameter (say, validate = true | false)
instead of creating another function? The new parameter could
default to
false to not break compatibility.

because

SELECT to_date('blah', 'pattern', true)

is less clear to read than

SELECT to_date_valid('blah', 'pattern')

and offers no advantage. It's likely faster to use a separate
function too.

personally I prefer first variant - this is same function with
stronger check.

The name to_date_valid sounds little bit strange - maybe
to_date_strict should be better.

Regards

Pavel

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Yeah, my feeling too, is that 'to_date_strict' would be better!

Cheers,
Gavin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#6)
Re: to_date_valid()

2016-07-04 5:19 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:

2016-07-04 4:25 GMT+02:00 Craig Ringer <craig@2ndquadrant.com>:

On 3 July 2016 at 09:32, Euler Taveira <euler@timbira.com.br> wrote:

On 02-07-2016 22:04, Andreas 'ads' Scherbaum wrote:

The attached patch adds a new function "to_date_valid()" which will
validate the date and return an error if the input and output date do
not match. Tests included, documentation update as well.

Why don't you add a third parameter (say, validate = true | false)
instead of creating another function? The new parameter could default to
false to not break compatibility.

because

SELECT to_date('blah', 'pattern', true)

is less clear to read than

SELECT to_date_valid('blah', 'pattern')

and offers no advantage. It's likely faster to use a separate function
too.

personally I prefer first variant - this is same function with stronger
check.

Currently probably we have not two similar function - one fault tolerant
and second stricter. There is only one example of similar behave -
parse_ident with "strict" option.

The three parameters are ok still - so I don't see a reason why we have to
implement new function. If you need to emphasize the fact so behave should
be strict, you can use named parameters

select to_date('blah', 'patter', strict => true)

Regards

Pavel

Show quoted text

The name to_date_valid sounds little bit strange - maybe to_date_strict
should be better.

Regards

Pavel

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#9Amit Kapila
amit.kapila16@gmail.com
In reply to: Andreas 'ads' Scherbaum (#1)
Re: to_date_valid()

On Sun, Jul 3, 2016 at 6:34 AM, Andreas 'ads' Scherbaum
<adsmail@wars-nicht.de> wrote:

Hello,

we have customers complaining that to_date() accepts invalid dates, and
returns a different date instead. This is a known issue:

http://sql-info.de/postgresql/notes/to_date-to_timestamp-gotchas.html

On the other hand this leads to wrong dates when loading dates into the
database, because the database happily accepts invalid dates and ends up
writing something completely different into the table.

The attached patch adds a new function "to_date_valid()" which will validate
the date and return an error if the input and output date do not match.
Tests included, documentation update as well.

It seems that you are calling many additional function calls
(date_out, timestamp_in, etc.) to validate the date. Won't the
additional function calls make to_date much costlier than its current
implementation? I don't know if there is a better way, but I think it
is worth to consider, if we can find a cheaper way to detect validity
of date.

Note - Your patch is small (~13KB) enough that it doesn't need to zipped.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Andreas 'ads' Scherbaum
adsmail@wars-nicht.de
In reply to: Amit Kapila (#9)
Re: to_date_valid()

On 04.07.2016 16:33, Amit Kapila wrote:

On Sun, Jul 3, 2016 at 6:34 AM, Andreas 'ads' Scherbaum
<adsmail@wars-nicht.de> wrote:

Hello,

we have customers complaining that to_date() accepts invalid dates, and
returns a different date instead. This is a known issue:

http://sql-info.de/postgresql/notes/to_date-to_timestamp-gotchas.html

On the other hand this leads to wrong dates when loading dates into the
database, because the database happily accepts invalid dates and ends up
writing something completely different into the table.

The attached patch adds a new function "to_date_valid()" which will validate
the date and return an error if the input and output date do not match.
Tests included, documentation update as well.

It seems that you are calling many additional function calls
(date_out, timestamp_in, etc.) to validate the date. Won't the
additional function calls make to_date much costlier than its current
implementation? I don't know if there is a better way, but I think it
is worth to consider, if we can find a cheaper way to detect validity
of date.

It certainly is costlier, and I'm open to suggestions how to improve the
performance. That is one of the reasons why I considered a separate
function, instead of adding this to to_date() and add even more overhead
there.

Note - Your patch is small (~13KB) enough that it doesn't need to zipped.

It's not about the small size, it's about websites like Gmail which
mingle up the linebreaks and then git fails. Ran into this problem on
the pgAdminIII list a while ago, ever since I just zip it and avoid the
trouble.

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Andreas 'ads' Scherbaum
adsmail@wars-nicht.de
In reply to: Pavel Stehule (#8)
Re: to_date_valid()

On 04.07.2016 05:51, Pavel Stehule wrote:

2016-07-04 5:19 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com
<mailto:pavel.stehule@gmail.com>>:

2016-07-04 4:25 GMT+02:00 Craig Ringer <craig@2ndquadrant.com
<mailto:craig@2ndquadrant.com>>:

On 3 July 2016 at 09:32, Euler Taveira <euler@timbira.com.br
<mailto:euler@timbira.com.br>> wrote:

On 02-07-2016 22 <tel:02-07-2016%2022>:04, Andreas 'ads'
Scherbaum wrote:

The attached patch adds a new function "to_date_valid()" which will
validate the date and return an error if the input and output date do
not match. Tests included, documentation update as well.

Why don't you add a third parameter (say, validate = true |
false)
instead of creating another function? The new parameter
could default to
false to not break compatibility.

because

SELECT to_date('blah', 'pattern', true)

is less clear to read than

SELECT to_date_valid('blah', 'pattern')

and offers no advantage. It's likely faster to use a separate
function too.

personally I prefer first variant - this is same function with
stronger check.

Currently probably we have not two similar function - one fault
tolerant and second stricter. There is only one example of similar
behave - parse_ident with "strict" option.

The three parameters are ok still - so I don't see a reason why we have
to implement new function. If you need to emphasize the fact so behave
should be strict, you can use named parameters

select to_date('blah', 'patter', strict => true)

The new function is not "strict", it just adds a validation step:

postgres=# select to_date_valid(NULL, NULL);
to_date_valid
---------------

(1 row)

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Pavel Stehule
pavel.stehule@gmail.com
In reply to: Andreas 'ads' Scherbaum (#11)
Re: to_date_valid()

2016-07-04 18:24 GMT+02:00 Andreas 'ads' Scherbaum <adsmail@wars-nicht.de>:

On 04.07.2016 05:51, Pavel Stehule wrote:

2016-07-04 5:19 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com
<mailto:pavel.stehule@gmail.com>>:

2016-07-04 4:25 GMT+02:00 Craig Ringer <craig@2ndquadrant.com
<mailto:craig@2ndquadrant.com>>:

On 3 July 2016 at 09:32, Euler Taveira <euler@timbira.com.br
<mailto:euler@timbira.com.br>> wrote:

On 02-07-2016 22 <tel:02-07-2016%2022>:04, Andreas 'ads'
Scherbaum wrote:

The attached patch adds a new function "to_date_valid()"

which will

validate the date and return an error if the input and

output date do

not match. Tests included, documentation update as well.

Why don't you add a third parameter (say, validate = true |
false)
instead of creating another function? The new parameter
could default to
false to not break compatibility.

because

SELECT to_date('blah', 'pattern', true)

is less clear to read than

SELECT to_date_valid('blah', 'pattern')

and offers no advantage. It's likely faster to use a separate
function too.

personally I prefer first variant - this is same function with
stronger check.

Currently probably we have not two similar function - one fault
tolerant and second stricter. There is only one example of similar
behave - parse_ident with "strict" option.

The three parameters are ok still - so I don't see a reason why we have
to implement new function. If you need to emphasize the fact so behave
should be strict, you can use named parameters

select to_date('blah', 'patter', strict => true)

The new function is not "strict", it just adds a validation step:

I understand - I know, so this has zero relation to function flag STRICT

I don't know if the name "strict" is best, but the name "validate" is not
good too. Current to_date does some validations too.

Regards

Pavel

Show quoted text

postgres=# select to_date_valid(NULL, NULL);
to_date_valid
---------------

(1 row)

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project

#13Andreas Karlsson
andreas.karlsson@percona.com
In reply to: Andreas 'ads' Scherbaum (#4)
Re: to_date_valid()

On 07/03/2016 12:36 PM, Andreas 'ads' Scherbaum wrote:

On 03.07.2016 07:05, Jaime Casanova wrote:

Shouldn't we fix this instead? Sounds like a bug to me. We don't usually
want to be bug compatible so it doesn't matter if we break something.

There are previous discussions about such a change, and this was rejected:

/messages/by-id/lbjf1v$a2v$1@ger.gmane.org
/messages/by-id/A737B7A37273E048B164557ADEF4A58B17C9140E@ntex2010i.host.magwien.gv.at

Hence the new function, which does not collide with the existing
implementation.

I do not see a clear conclusion in the linked threads. For example Bruce
calls it a bug in one of the emails
(/messages/by-id/201107200103.p6K13ix10517@momjian.us).

I think we should fix to_date() to throw an error. Personally I would be
happy if my code broke due to this kind of change since the exception
would reveal an old bug which has been there a long time eating my data.
I cannot see a case where I would have wanted the current behavior.

If there is any legitimate use for the current behavior then we can add
it back as another function.

Andreas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Pavel Stehule
pavel.stehule@gmail.com
In reply to: Andreas Karlsson (#13)
Re: to_date_valid()

2016-07-04 22:15 GMT+02:00 Andreas Karlsson <andreas@proxel.se>:

On 07/03/2016 12:36 PM, Andreas 'ads' Scherbaum wrote:

On 03.07.2016 07:05, Jaime Casanova wrote:

Shouldn't we fix this instead? Sounds like a bug to me. We don't usually
want to be bug compatible so it doesn't matter if we break something.

There are previous discussions about such a change, and this was rejected:

/messages/by-id/lbjf1v$a2v$1@ger.gmane.org

/messages/by-id/A737B7A37273E048B164557ADEF4A58B17C9140E@ntex2010i.host.magwien.gv.at

Hence the new function, which does not collide with the existing
implementation.

I do not see a clear conclusion in the linked threads. For example Bruce
calls it a bug in one of the emails (
/messages/by-id/201107200103.p6K13ix10517@momjian.us
).

I think we should fix to_date() to throw an error. Personally I would be
happy if my code broke due to this kind of change since the exception would
reveal an old bug which has been there a long time eating my data. I cannot
see a case where I would have wanted the current behavior.

If I remember, this implementation is based on Oracle's behave. It is
pretty old and documented - so it is hard to speak about it like the bug. I
understand, so the behave is strange, but it was designed in different
time. You can enter not 100% valid string, but you get correct date

postgres=# select to_date('2016-12-40','YYYY-MM-DD');

┌────────────┐
│ to_date │
╞════════════╡
│ 2017-01-09 │
└────────────┘
(1 row)

It can be used for some date calculations. In old age the applications was
designed in this style. I am against to change of default behave. We can
introduce new new different function with different name with better
designed format and rules, or we can add new options to this function, or
we can live with current state.

Now, to_date function should not be used - functions make_date,
make_timestamp are faster and safe.

postgres=# select make_date(2017,01,40);
ERROR: date field value out of range: 2017-01-40

Regards

Pavel

Show quoted text

If there is any legitimate use for the current behavior then we can add it
back as another function.

Andreas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Andreas Karlsson
andreas.karlsson@percona.com
In reply to: Pavel Stehule (#14)
Re: to_date_valid()

On 07/04/2016 10:55 PM, Pavel Stehule wrote:

2016-07-04 22:15 GMT+02:00 Andreas Karlsson <andreas@proxel.se
<mailto:andreas@proxel.se>>:
I do not see a clear conclusion in the linked threads. For example
Bruce calls it a bug in one of the emails
(/messages/by-id/201107200103.p6K13ix10517@momjian.us).

I think we should fix to_date() to throw an error. Personally I
would be happy if my code broke due to this kind of change since the
exception would reveal an old bug which has been there a long time
eating my data. I cannot see a case where I would have wanted the
current behavior.

If I remember, this implementation is based on Oracle's behave.

In the thread I linked above they claim that Oracle (at least 10g) does
not work like this.

Thomas Kellerer wrote:

Oracle throws an error for the above example:

SQL> select to_date('20110231', 'YYYYMMDD') from dual;
select to_date('20110231', 'YYYYMMDD') from dual
*
ERROR at line 1:
ORA-01839: date not valid for month specified

I do not have access to an Oracle installation so I cannot confirm this
myself.

It is
pretty old and documented - so it is hard to speak about it like the
bug. I understand, so the behave is strange, but it was designed in
different time. You can enter not 100% valid string, but you get correct
date

postgres=# select to_date('2016-12-40','YYYY-MM-DD');

┌────────────┐
│ to_date │
╞════════════╡
│ 2017-01-09 │
└────────────┘
(1 row)

It can be used for some date calculations. In old age the applications
was designed in this style. I am against to change of default behave. We
can introduce new new different function with different name with better
designed format and rules, or we can add new options to this function,
or we can live with current state.

While clever, I think this behavior is a violation of the principle of
least surprise. It is not obvious to someone reading a query that
to_date() would behave like this. Especially when Oracle's to_date()
works differently.

Now, to_date function should not be used - functions make_date,
make_timestamp are faster and safe.

Yeah, I personally know of this behavior and therefore would never use
to_date(), but I am far from the average PostgreSQL user.

Andreas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Andreas 'ads' Scherbaum
adsmail@wars-nicht.de
In reply to: Pavel Stehule (#12)
Re: to_date_valid()

On 04.07.2016 18:37, Pavel Stehule wrote:

I don't know if the name "strict" is best, but the name "validate" is
not good too. Current to_date does some validations too.

Obviously not enough, because it allows invalid dates. I'd say that the
current to_date() merely validates the input format for string parsing,
and that the date is in range. But there is not much validation on the
date itself.

So the name can't be "strict" because of the conflict with "NULL"
handling, and you don't like "valid" - what other options do you offer?

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17David G. Johnston
david.g.johnston@gmail.com
In reply to: Andreas 'ads' Scherbaum (#16)
Re: to_date_valid()

On Mon, Jul 4, 2016 at 8:39 PM, Andreas 'ads' Scherbaum <
adsmail@wars-nicht.de> wrote:

On 04.07.2016 18:37, Pavel Stehule wrote:

I don't know if the name "strict" is best, but the name "validate" is
not good too. Current to_date does some validations too.

Obviously not enough, because it allows invalid dates. I'd say that the
current to_date() merely validates the input format for string parsing, and
that the date is in range. But there is not much validation on the date
itself.

So the name can't be "strict" because of the conflict with "NULL"
handling, and you don't like "valid" - what other options do you offer?

​We don't have to change the name...we could do something like how
RegularExpressions work - like (?i) - and just add a new modifier ​code.

​'~YYYY-MI-DD' --that's a leading tilde, could be anything - even something
like "HMYYYY-MI-DD" for "historical mode"

​Also, see this thread of a few weeks ago for related material:

/messages/by-id/1873520224.1784572.1465833145330.JavaMail.yahoo@mail.yahoo.com

It seems that fixing it is back on the table, possibly even for 9.6 since
this is such a hideous bug - one that closely resembles a cockroach ;)

WRT to the 2014 "reject out-of-range dates" thread, I'm kinda surprised
that we didn't just set the date to be the minimum or maximum allowable
date back in 9.2 instead of rejecting it...

I'd be more inclined to add another argument if it was basically an enum.

to_date(text, format, format_style)

This at least opens the door for future enhancements that are associated
with named styles. I could imagine an "exact" format that also allows for
something like regexp character classes so that one can write:
"YYYY[:SEP:]MM[:SEP:]DD" to keep exact matches but accommodate variations
on what people type as a separator e.g. [.-/]

format_style=>historical would invoke today's behaviors

David J.

#18Pavel Stehule
pavel.stehule@gmail.com
In reply to: David G. Johnston (#17)
Re: to_date_valid()

2016-07-05 4:33 GMT+02:00 David G. Johnston <david.g.johnston@gmail.com>:

On Mon, Jul 4, 2016 at 8:39 PM, Andreas 'ads' Scherbaum <
adsmail@wars-nicht.de> wrote:

On 04.07.2016 18:37, Pavel Stehule wrote:

I don't know if the name "strict" is best, but the name "validate" is
not good too. Current to_date does some validations too.

Obviously not enough, because it allows invalid dates. I'd say that the
current to_date() merely validates the input format for string parsing, and
that the date is in range. But there is not much validation on the date
itself.

So the name can't be "strict" because of the conflict with "NULL"
handling, and you don't like "valid" - what other options do you offer?

​We don't have to change the name...we could do something like how
RegularExpressions work - like (?i) - and just add a new modifier ​code.

​'~YYYY-MI-DD' --that's a leading tilde, could be anything - even
something like "HMYYYY-MI-DD" for "historical mode"

​Also, see this thread of a few weeks ago for related material:

/messages/by-id/1873520224.1784572.1465833145330.JavaMail.yahoo@mail.yahoo.com

It seems that fixing it is back on the table, possibly even for 9.6 since
this is such a hideous bug - one that closely resembles a cockroach ;)

WRT to the 2014 "reject out-of-range dates" thread, I'm kinda surprised
that we didn't just set the date to be the minimum or maximum allowable
date back in 9.2 instead of rejecting it...

I'd be more inclined to add another argument if it was basically an enum.

to_date(text, format, format_style)

This at least opens the door for future enhancements that are associated
with named styles. I could imagine an "exact" format that also allows for
something like regexp character classes so that one can write:
"YYYY[:SEP:]MM[:SEP:]DD" to keep exact matches but accommodate variations
on what people type as a separator e.g. [.-/]

format_style=>historical would invoke today's behaviors

this is compatibility break - the question is "can we break it?"

Pavel

Show quoted text

David J.

#19Pavel Stehule
pavel.stehule@gmail.com
In reply to: Andreas 'ads' Scherbaum (#16)
Re: to_date_valid()

2016-07-05 2:39 GMT+02:00 Andreas 'ads' Scherbaum <adsmail@wars-nicht.de>:

On 04.07.2016 18:37, Pavel Stehule wrote:

I don't know if the name "strict" is best, but the name "validate" is
not good too. Current to_date does some validations too.

Obviously not enough, because it allows invalid dates. I'd say that the
current to_date() merely validates the input format for string parsing, and
that the date is in range. But there is not much validation on the date
itself.

So the name can't be "strict" because of the conflict with "NULL"
handling, and you don't like "valid" - what other options do you offer?

I have not - so third option looks best for me - it can be long name
"only_correct_date", "only_valid_date", "only_valid_date_on_input" ...

Pavel

Show quoted text

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project

#20Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Andreas Karlsson (#15)
Re: to_date_valid()

Andreas Karlsson wrote:

On 07/04/2016 10:55 PM, Pavel Stehule wrote:

2016-07-04 22:15 GMT+02:00 Andreas Karlsson wrote:

I do not see a clear conclusion in the linked threads. For example
Bruce calls it a bug in one of the emails
(/messages/by-id/201107200103.p6K13ix10517@momjian.us).

I think we should fix to_date() to throw an error. Personally I
would be happy if my code broke due to this kind of change since the
exception would reveal an old bug which has been there a long time
eating my data. I cannot see a case where I would have wanted the
current behavior.

If I remember, this implementation is based on Oracle's behave.

In the thread I linked above they claim that Oracle (at least 10g) does
not work like this.

[...]

I do not have access to an Oracle installation so I cannot confirm this
myself.

Oracle 12.1:

SQL> SELECT to_date('2016-12-40','YYYY-MM-DD') FROM dual;
SELECT to_date('2016-12-40','YYYY-MM-DD') FROM dual
*
ERROR at line 1:
ORA-01847: day of month must be between 1 and last day of month

SQL> SELECT to_date('2017-02-29','YYYY-MM-DD') FROM dual;
SELECT to_date('2017-02-29','YYYY-MM-DD') FROM dual
*
ERROR at line 1:
ORA-01839: date not valid for month specified

So no, compatibility with Oracle is certainly not the reason
to leave it as it is.

But notwithstanding your feeling that you would like your application
to break if it makes use of this behaviour, it is a change that might
make some people pretty unhappy - nobody can tell how many.

Yours,
Laurenz Albe

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Andreas 'ads' Scherbaum
adsmail@wars-nicht.de
In reply to: Pavel Stehule (#19)
#22Andreas 'ads' Scherbaum
adsmail@wars-nicht.de
In reply to: David G. Johnston (#17)
#23David G. Johnston
david.g.johnston@gmail.com
In reply to: Andreas 'ads' Scherbaum (#22)
#24Bruce Momjian
bruce@momjian.us
In reply to: David G. Johnston (#23)
#25Peter Eisentraut
peter_e@gmx.net
In reply to: Laurenz Albe (#20)
#26Joshua D. Drake
jd@commandprompt.com
In reply to: Peter Eisentraut (#25)
#27Andreas 'ads' Scherbaum
adsmail@wars-nicht.de
In reply to: Joshua D. Drake (#26)
#28Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Andreas 'ads' Scherbaum (#27)
#29Andreas 'ads' Scherbaum
adsmail@wars-nicht.de
In reply to: Joshua D. Drake (#26)
#30Arthur Zakirov
a.zakirov@postgrespro.ru
In reply to: Andreas 'ads' Scherbaum (#29)
#31Andreas 'ads' Scherbaum
adsmail@wars-nicht.de
In reply to: Arthur Zakirov (#30)
#32Arthur Zakirov
a.zakirov@postgrespro.ru
In reply to: Andreas 'ads' Scherbaum (#31)
#33Andreas 'ads' Scherbaum
adsmail@wars-nicht.de
In reply to: Arthur Zakirov (#32)
#34Peter Eisentraut
peter_e@gmx.net
In reply to: Andreas 'ads' Scherbaum (#31)
#35Andreas 'ads' Scherbaum
adsmail@wars-nicht.de
In reply to: Peter Eisentraut (#34)
#36Peter Eisentraut
peter_e@gmx.net
In reply to: Andreas 'ads' Scherbaum (#35)
#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#34)