How to handle bogus nulls from ActiveRecord
It is required for application data verification filters that
default values for table columns are known to ActiveRecord when
creating a new row. So ActiveRecord obtains the default values from
the tables dynamically and assigns them to their appropriate column
attributes. The problem we encounter arises because ActiveRecord
then uses those column assignments when inserting a row even if the
column is not otherwise referenced.
I am developing a web application using the Ruby on Rails framework
with PostgreSQL as the back-end store. In one of our tables we have
a column called expected_by which is a time-stamp. It is set to NOT
NULL DEFAULT 'INFINITY'. However, Ruby has no concept of infinity
and whatever the PostgreSQL adapter is returning for it ActiveRecord
receives as nil which is converted to NULL.
So, the real fix to this is to alter the persistence class so that
columns with default values are not explicitly set to those values
on insert. This is unlikely to happen in the short term and will
take some time to be integrated into the framework even when it is
completed, if ever,
So solve this for the moment what I think I require is a trigger on
expected_at which tests for NULL on insert and converts it to
infinity. The other alternative is to simply set the default to
some valid, but unreachable, date like 9999-12-31.
I would like other opinions about how to best handle this situation
and observations on what other significant concerns I may not be
aware of but should provide for.
--
*** E-Mail is NOT a SECURE channel ***
James B. Byrne mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited http://www.harte-lyne.ca
9 Brockley Drive vox: +1 905 561 1241
Hamilton, Ontario fax: +1 905 561 0757
Canada L8E 3C3
-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
owner@postgresql.org] On Behalf Of James B. Byrne
Sent: Thursday, May 12, 2011 9:12 AM
To: pgsql-general@postgresql.org
Subject: [GENERAL] How to handle bogus nulls from ActiveRecordSo solve this for the moment what I think I require is a trigger on
expected_at which tests for NULL on insert and converts it to infinity.
The
other alternative is to simply set the default to some valid, but
unreachable,
date like 9999-12-31.
Not a huge fan of Infinity as a value...but that just may be lack of
experience.
I'd probably remove the NOT NULL constraint on expected_at and deal with
tri-value logic; or also include a boolean (is_expected) and form queries
like
NOT is_expected OR (is_expected AND expected_at <op> timestamp)
"is_expected" could be a calculated value in a view to make things somewhat
easier; otherwise you'd need a table constraint to ensure non-null expected
has a true is_expected.
Without more info as to how you use "expected_at" other advice is difficult
but can you user a meaningful value (say now()+'30 days'::interval) for the
default?
David J.
On Thu, May 12, 2011 12:40, David Johnston wrote:
Not a huge fan of Infinity as a value...but that just may be lack of
experience.I'd probably remove the NOT NULL constraint on expected_at and deal
with tri-value logic; or also include a boolean (is_expected) and
form queries like
Well, actually, the reason for the NOT NULL constraint is to catch
application errors exactly like this one. Removing it is not
contemplated. I had no idea that AR actually 'copied' and used
default values on columns that were not referenced in the
application code until I encountered this. And had it gone
undetected this would have been a major problem later on. As it
was, our tests brought it to our attention quite early which is why
we can contemplate several solutions.
Without more info as to how you use "expected_at" other
advice is difficult but can you user a meaningful value
(say now()+'30 days'::interval) for the default?
The column expected_by contains an estimated time of arrival for a
particular conveyance. When a row is initialized this value is
unknown some of the time. The expected_by value is reset to the
arrived_at value on UPDATE if and only if expected_by is greater
than arrived_at.
Conveyances that have +infinite expected_by time-stamps are
considered pending. At some point conveyance rows that are never
going to arrive are otherwise flagged. On the other hand, rows with
overdue expected_by values are given somewhat more attention, to put
it mildly. So, we either fix the problem with AR, possibly by
moving to Sequel ORM for this case, although I have not yet received
an answer as to whether it does any better; Or we trap and override
NULL values with infinity in a trigger; Or we choose for the default
value a fixed date far, far into the future.
+Infinity was chosen as a default to avoid the complexities of
dealing with NULL logic in SELECTS. I suppose that the simplest
solution is to go with a date of 9999-12-31 and treat that value
like infinity.
--
*** E-Mail is NOT a SECURE channel ***
James B. Byrne mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited http://www.harte-lyne.ca
9 Brockley Drive vox: +1 905 561 1241
Hamilton, Ontario fax: +1 905 561 0757
Canada L8E 3C3
The column expected_by contains an estimated time of arrival for a
particular
conveyance. When a row is initialized this value is unknown some of the
time. The expected_by value is reset to the arrived_at value on UPDATE if
and only if expected_by is greater than arrived_at.Conveyances that have +infinite expected_by time-stamps are considered
pending. At some point conveyance rows that are never going to arrive are
otherwise flagged. On the other hand, rows with overdue expected_by
values are given somewhat more attention, to put it mildly. So, we either
fix
the problem with AR, possibly by moving to Sequel ORM for this case,
although I have not yet received an answer as to whether it does any
better;
Or we trap and override NULL values with infinity in a trigger; Or we
choose
for the default value a fixed date far, far into the future.
+Infinity was chosen as a default to avoid the complexities of
dealing with NULL logic in SELECTS. I suppose that the simplest solution
is to
go with a date of 9999-12-31 and treat that value like infinity.
The "just make it work" solution has many merits - I would also probably
just use 9999-12-31 as a close approximation for +infinity; which itself is
just there because you are avoiding "estimate is unknown".
Why bother updating the "expected_by" value once the conveyance is no longer
pending? Do you not really care if something arrived early? Even if you do
not currently it seems a waste to throw out the data when you can readily
get the same result as-needed (CASE WHEN expected_by <= arrived_at THEN
arrived_at ELSE expected_by END) without giving up the ability to calculate
early-ness. It would make more sense to set expected = arrived if and only
if expected = 'Infinity'. Still, it would at least seem reasonable to guess
a reasonable expected date if one is not otherwise provided - possibly with
a flag indicating that it is a true guestimate instead of a estimate.
David J.
On Thu, May 12, 2011 15:51, David Johnston wrote:
+Infinity was chosen as a default to avoid the complexities of
dealing with NULL logic in SELECTS. I suppose that the simplest
solution is to go with a date of 9999-12-31 and treat that value
like infinity.The "just make it work" solution has many merits - I would
also probably just use 9999-12-31 as a close approximation
for +infinity; which itself is just there because you are
avoiding "estimate is unknown".Why bother updating the "expected_by" value once the conveyance
is no longer pending? Do you not really care if something
arrived early? Even if you do not currently it seems a waste
to throw out the data when you can readily get the same result
as-needed (CASE WHEN expected_by <= arrived_at THEN arrived_at
ELSE expected_by END) without giving up the ability to calculate
The main reason to update expected_by is that sometimes the
conveyance arrives without the expected_by ever being set. Leaving
the expected_by value at infinity, or 99991231, or NULL, complicates
other parts of the system. However, leaving untouched expected_by
values that are less than the infinite value is doable and is a
better approach.
--
*** E-Mail is NOT a SECURE channel ***
James B. Byrne mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited http://www.harte-lyne.ca
9 Brockley Drive vox: +1 905 561 1241
Hamilton, Ontario fax: +1 905 561 0757
Canada L8E 3C3
David suggested using a guesstimate default date along with a boolean to
indicate when you're using guesstimates. I think this is a solid approach,
but if the default expected_by idea doesn't work for you, a boolean would
still make this a lot easier on the Rails side.
It sounds like you're using a setup for Heroku, so I checked the postgreSQL
8.3 manual--a boolean is 1
byte<http://www.postgresql.org/docs/8.3/interactive/datatype-boolean.html>.
If this isn't for Heroku, other postgreSQL version probably implement
booleans the same way. Your database size should go up by # records * 1
byte + indexing overhead.
Though I don't know how many records you're working with, this seems
relatively cheap given that it will make your code more readable ("if
expected_date_estimated?"). It should also simplify any remaining code you
have to write, as you won't have to think about writing elaborate "if" or
"case" statements to determine if expected_by was explicitly set.
On Thu, May 12, 2011 at 1:06 PM, James B. Byrne <byrnejb@harte-lyne.ca>wrote:
Show quoted text
On Thu, May 12, 2011 15:51, David Johnston wrote:
+Infinity was chosen as a default to avoid the complexities of
dealing with NULL logic in SELECTS. I suppose that the simplest
solution is to go with a date of 9999-12-31 and treat that value
like infinity.The "just make it work" solution has many merits - I would
also probably just use 9999-12-31 as a close approximation
for +infinity; which itself is just there because you are
avoiding "estimate is unknown".Why bother updating the "expected_by" value once the conveyance
is no longer pending? Do you not really care if something
arrived early? Even if you do not currently it seems a waste
to throw out the data when you can readily get the same result
as-needed (CASE WHEN expected_by <= arrived_at THEN arrived_at
ELSE expected_by END) without giving up the ability to calculateThe main reason to update expected_by is that sometimes the
conveyance arrives without the expected_by ever being set. Leaving
the expected_by value at infinity, or 99991231, or NULL, complicates
other parts of the system. However, leaving untouched expected_by
values that are less than the infinite value is doable and is a
better approach.--
*** E-Mail is NOT a SECURE channel ***
James B. Byrne mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited http://www.harte-lyne.ca
9 Brockley Drive vox: +1 905 561 1241
Hamilton, Ontario fax: +1 905 561 0757
Canada L8E 3C3--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Thu, May 12, 2011 20:30, Eric Hu wrote:
David suggested using a guesstimate default date along with
a boolean to indicate when you're using guesstimates.
I think this is a solid approach, but if the default
expected_by idea doesn't work for you, a boolean
would still make this a lot easier on the Rails side.
Since this part of the project is still in development, albeit
partially in use for some purposes, all those approaches are under
cosnideration considered. In fact some things are implemented in
that fashion for other parts of the system, particularly with
respect to current status.
However, the main problem to be solved is this issue over the
default values for time stamp columns, all of which must have a NOT
NULL constraint since we simply cannot allow an errant application
overwrite valid data, as would have happened in the case under
consideration.
I am coming to the conclusion that an arbitrary value of 99991231 is
a better approach than using PGs built-in idea of 'infinity' since
I was told that concept is not supported in Ruby.
Actually, it turn out that 'infinity' is supported in Ruby.
Apparently infinity can be represented by assigning the value
obtained by dividing a float by zero.
$ irb
ruby-1.8.7-p334 :001 > infinity = 1.0/0
=> Infinity
ruby-1.8.7-p334 :002 > ninfinity = -1.0/0
=> -Infinity
ruby-1.8.7-p334 :003 >
So, I guess this now qualifies as a bug in the Ruby pg adapter gem.
--
*** E-Mail is NOT a SECURE channel ***
James B. Byrne mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited http://www.harte-lyne.ca
9 Brockley Drive vox: +1 905 561 1241
Hamilton, Ontario fax: +1 905 561 0757
Canada L8E 3C3
2011/5/13 James B. Byrne <byrnejb@harte-lyne.ca>:
Actually, it turn out that 'infinity' is supported in Ruby.
Apparently infinity can be represented by assigning the value
obtained by dividing a float by zero.$ irb
ruby-1.8.7-p334 :001 > infinity = 1.0/0
=> Infinity
ruby-1.8.7-p334 :002 > ninfinity = -1.0/0
=> -Infinity
ruby-1.8.7-p334 :003 >So, I guess this now qualifies as a bug in the Ruby pg adapter gem.
humm.. interesting... i did some tests here...
[testdb]
SELECT * from infinity_date_test;
id | created_at
----+------------
1 | infinity
2 | infinity
3 | infinity
4 | infinity
5 | 2011-05-13
--- test.rb ---
require 'active_record'
ActiveRecord::Base.establish_connection({
:adapter => 'postgresql',
:database => 'testdb',
:port => 5434,
:host => 'localhost',
:username => 'guedes',
:password => 'guedes'
})
class InfinityDateTest < ActiveRecord::Base
set_table_name 'infinity_date_test'
end
InfinityDateTest.all.each do |row|
puts "#{row.id} | #{row.created_at} | #{row.created_at.class}"
end
i = InfinityDateTest.new
i.created_at = Date::Infinity.new
i.save
----
and the output
1 | | NilClass
2 | | NilClass
3 | | NilClass
4 | | NilClass
5 | 2011-05-13 | Date
/home/dba/.rvm/gems/ruby-1.9.2-p0@rails3/gems/activerecord-3.0.5/lib/active_record/connection_adapters/abstract_adapter.rb:207:in
`rescue in log': PGError: ERRO: sintaxe de entrada é inválida para
tipo date: "--- !ruby/object:Date::Infinity
(ActiveRecord::StatementInvalid)
d: 1
"
LINE 1: ... INTO "infinity_date_test" ("created_at") VALUES ('--- !ruby...
^
: INSERT INTO "infinity_date_test" ("created_at") VALUES ('---
!ruby/object:Date::Infinity
d: 1
') RETURNING "id"
...
Well, fetching from database it came nil and when saved into, it was
trying to save a serialized object. From postgresql_adapter.rb [1]https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb#L85-86 you
can see that it returns the correct internal type based when field
type is datetime, but i can't see the same thing for 'date', i suppose
that it going [2]https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb#L117-118 to 'super' [3]https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/column.rb#L253-254, so I suppose that this method [4]https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/column.rb#L197-219
should be override in 'postgresql_adapter.rb'.
[1]: https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb#L85-86
[2]: https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb#L117-118
[3]: https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/column.rb#L253-254
[4]: https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/column.rb#L197-219
Best regards,
--
Dickson S. Guedes
mail/xmpp: guedes@guedesoft.net - skype: guediz
http://guedesoft.net - http://www.postgresql.org.br
On Fri, May 13, 2011 11:50, Dickson S. Guedes wrote:
Well, fetching from database it came nil and when saved into, it was
trying to save a serialized object. From postgresql_adapter.rb [1]
you
can see that it returns the correct internal type based when field
type is datetime, but i can't see the same thing for 'date', i
suppose
that it going [2] to 'super' [3], so I suppose that this method [4]
should be override in 'postgresql_adapter.rb'.
I have opened an issue for this with the ActiveRecord folks.
https://github.com/rails/rails/issues/544
--
*** E-Mail is NOT a SECURE channel ***
James B. Byrne mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited http://www.harte-lyne.ca
9 Brockley Drive vox: +1 905 561 1241
Hamilton, Ontario fax: +1 905 561 0757
Canada L8E 3C3
On Fri, May 13, 2011 13:04, James B. Byrne wrote:
I have opened an issue for this with the ActiveRecord folks.
This has been addressed by the AR team and is committed to master.
+-Infinity support for dates is slated for general release with
RoR-3.0.8.
--
*** E-Mail is NOT a SECURE channel ***
James B. Byrne mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited http://www.harte-lyne.ca
9 Brockley Drive vox: +1 905 561 1241
Hamilton, Ontario fax: +1 905 561 0757
Canada L8E 3C3
On 18/05/2011 4:02 AM, James B. Byrne wrote:
On Fri, May 13, 2011 13:04, James B. Byrne wrote:
I have opened an issue for this with the ActiveRecord folks.
This has been addressed by the AR team and is committed to master.
+-Infinity support for dates is slated for general release with
RoR-3.0.8.
... now if only Java would support infinite dates, too. Alas, unlike
RoR, I doubt there'll be a quick "we've fixed this, grab the next point
release" post for Java.
Not even the 3rd party JodaTime date/time library supports infinite
intervals, the +infinity date, etc, so I have to represent unbounded
intervals with some distant future date, or use null. Neither option is
very palatable.
--
Craig Ringer
Tech-related writing at http://soapyfrogs.blogspot.com/