AS OF queries
I wonder if Postgres community is interested in supporting time travel
queries in PostgreSQL (something like AS OF queries in Oracle:
https://docs.oracle.com/cd/B14117_01/appdev.101/b10795/adfns_fl.htm).
As far as I know something similar is now developed for MariaDB.
It seems to me that it will be not so difficult to implement them in
Postgres - we already have versions of tuples.
Looks like we only need to do three things:
1. Disable autovacuum (autovacuum = off)
2. Enable commit timestamp (track_commit_timestamp = on)
3. Add asofTimestamp to snapshot and patch XidInMVCCSnapshot to compare
commit timestamps when it is specified in snapshot.
Attached please find my prototype implementation of it.
Most of the efforts are needed to support asof timestamp in grammar and
add it to query plan.
I failed to support AS OF clause (as in Oracle) because of shift-reduce
conflicts with aliases,
so I have to introduce new ASOF keyword. May be yacc experts can propose
how to solve this conflict without introducing new keyword...
Please notice that now ASOF timestamp is used only for data snapshot,
not for catalog snapshot.
I am not sure that it is possible (and useful) to travel through
database schema history...
Below is an example of how it works:
postgres=# create table foo(pk serial primary key, ts timestamp default
now(), val text);
CREATE TABLE
postgres=# insert into foo (val) values ('insert');
INSERT 0 1
postgres=# insert into foo (val) values ('insert');
INSERT 0 1
postgres=# insert into foo (val) values ('insert');
INSERT 0 1
postgres=# select * from foo;
pk | ts | val
----+----------------------------+--------
1 | 2017-12-20 14:59:17.715453 | insert
2 | 2017-12-20 14:59:22.933753 | insert
3 | 2017-12-20 14:59:27.87712 | insert
(3 rows)
postgres=# select * from foo asof timestamp '2017-12-20 14:59:25';
pk | ts | val
----+----------------------------+--------
1 | 2017-12-20 14:59:17.715453 | insert
2 | 2017-12-20 14:59:22.933753 | insert
(2 rows)
postgres=# select * from foo asof timestamp '2017-12-20 14:59:20';
pk | ts | val
----+----------------------------+--------
1 | 2017-12-20 14:59:17.715453 | insert
(1 row)
postgres=# update foo set val='upd',ts=now() where pk=1;
UPDATE 1
postgres=# select * from foo asof timestamp '2017-12-20 14:59:20';
pk | ts | val
----+----------------------------+--------
1 | 2017-12-20 14:59:17.715453 | insert
(1 row)
postgres=# select * from foo;
pk | ts | val
----+----------------------------+--------
2 | 2017-12-20 14:59:22.933753 | insert
3 | 2017-12-20 14:59:27.87712 | insert
1 | 2017-12-20 15:09:17.046047 | upd
(3 rows)
postgres=# update foo set val='upd2',ts=now() where pk=1;
UPDATE 1
postgres=# select * from foo asof timestamp '2017-12-20 15:10';
pk | ts | val
----+----------------------------+--------
2 | 2017-12-20 14:59:22.933753 | insert
3 | 2017-12-20 14:59:27.87712 | insert
1 | 2017-12-20 15:09:17.046047 | upd
(3 rows)
Comments and feedback are welcome:)
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
asof.patchtext/x-patch; name=asof.patchDownload+525-15
Konstantin Knizhnik wrote:
I wonder if Postgres community is interested in supporting time travel
queries in PostgreSQL (something like AS OF queries in Oracle:
https://docs.oracle.com/cd/B14117_01/appdev.101/b10795/adfns_fl.htm).
As far as I know something similar is now developed for MariaDB.
I think that would be a good thing to have that could make
the DBA's work easier - all the requests to restore a table
to the state from an hour ago.
I failed to support AS OF clause (as in Oracle) because of shift-reduce
conflicts with aliases,
so I have to introduce new ASOF keyword. May be yacc experts can propose
how to solve this conflict without introducing new keyword...
I think it would be highly desirable to have AS OF, because that's
the way the SQL standard has it.
Yours,
Laurenz Albe
On 20.12.2017 16:12, Laurenz Albe wrote:
Konstantin Knizhnik wrote:
I wonder if Postgres community is interested in supporting time travel
queries in PostgreSQL (something like AS OF queries in Oracle:
https://docs.oracle.com/cd/B14117_01/appdev.101/b10795/adfns_fl.htm).
As far as I know something similar is now developed for MariaDB.I think that would be a good thing to have that could make
the DBA's work easier - all the requests to restore a table
to the state from an hour ago.
Please notice that it is necessary to configure postgres in proper way
in order to be able to perform time travels.
If you do not disable autovacuum, then old versions will be just cleaned-up.
If transaction commit timestamps are not tracked, then it is not
possible to locate required timeline.
So DBA should make a decision in advance whether this feature is needed
or not.
It is not a proper instrument for restoring/auditing existed database
which was not configured to keep all versions.
May be it is better to add special configuration parameter for this
feature which should implicitly toggle
autovacuumand track_commit_timestamp parameters).
The obvious drawbacks of keeping all versions are
1. Increased size of database.
2. Decreased query execution speed because them need to traverse a lot
of not visible versions.
So may be in practice it will be useful to limit lifetime of versions.
I failed to support AS OF clause (as in Oracle) because of shift-reduce
conflicts with aliases,
so I have to introduce new ASOF keyword. May be yacc experts can propose
how to solve this conflict without introducing new keyword...I think it would be highly desirable to have AS OF, because that's
the way the SQL standard has it.
Completely agree with you: I just give up after few hours of attempts
to make bison to resolve this conflicts.
Yours,
Laurenz Albe
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 20 Dec 2017, at 13:48, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
On 20.12.2017 16:12, Laurenz Albe wrote:
Konstantin Knizhnik wrote:
I wonder if Postgres community is interested in supporting time travel
queries in PostgreSQL (something like AS OF queries in Oracle:
https://docs.oracle.com/cd/B14117_01/appdev.101/b10795/adfns_fl.htm <https://docs.oracle.com/cd/B14117_01/appdev.101/b10795/adfns_fl.htm>).
As far as I know something similar is now developed for MariaDB.I think that would be a good thing to have that could make
the DBA's work easier - all the requests to restore a table
to the state from an hour ago.Please notice that it is necessary to configure postgres in proper way in order to be able to perform time travels.
If you do not disable autovacuum, then old versions will be just cleaned-up.
If transaction commit timestamps are not tracked, then it is not possible to locate required timeline.So DBA should make a decision in advance whether this feature is needed or not.
It is not a proper instrument for restoring/auditing existed database which was not configured to keep all versions.May be it is better to add special configuration parameter for this feature which should implicitly toggle
autovacuum and track_commit_timestamp parameters).
I seem to recall that Oracle handles this by requiring tables that want the capability to live within a tablespace that supports flashback. That tablespace is obviously configured to retain redo/undo logs. It would be nice if the vacuuming process could be configured in a similar manner. I have no idea if it would make sense on a tablespace basis or not, though — I’m not entirely sure how analogous they are between Postgres & Oracle as I’ve never used tablespaces in Postgres.
-Joe
Konstantin Knizhnik wrote:
Please notice that it is necessary to configure postgres in proper way in order to be able to perform time travels.
If you do not disable autovacuum, then old versions will be just cleaned-up.
If transaction commit timestamps are not tracked, then it is not possible to locate required timeline.So DBA should make a decision in advance whether this feature is needed or not.
It is not a proper instrument for restoring/auditing existed database which was not configured to keep all versions.
Of course; you'd have to anticipate the need to travel in time,
and you have to pay the price for it.
Anybody who has read science fiction stories know that time travel
does not come free.
May be it is better to add special configuration parameter for this feature which should implicitly toggle
autovacuum and track_commit_timestamp parameters).
The feature would be most useful with some kind of "moving xid
horizon" that guarantees that only dead tuples whose xmax lies
more than a certain time interval in the past can be vacuumed.
Yours,
Laurenz Albe
Laurenz Albe <laurenz.albe@cybertec.at> writes:
Konstantin Knizhnik wrote:
I failed to support AS OF clause (as in Oracle) because of shift-reduce
conflicts with aliases,
so I have to introduce new ASOF keyword. May be yacc experts can propose
how to solve this conflict without introducing new keyword...
I think it would be highly desirable to have AS OF, because that's
the way the SQL standard has it.
Please say that's just an Oracle-ism and not SQL standard, because it's
formally ambiguous. This is required to work by spec:
regression=# select x as of from (values(1)) t(x);
of
----
1
(1 row)
so it's not possible for us ever to support an expression that includes
top-level "AS OF" (or, pretty much, "AS anything") without some rather
enormous pushups.
If we absolutely had to do it, the path to a solution would involve some
lexer-level lookahead, cf base_yylex() --- but that's messy and tends to
introduce its own set of corner case misbehaviors. I'd much rather use a
syntax that wasn't chosen with blind disregard for SQL's existing
syntactic constraints.
regards, tom lane
On Wed, Dec 20, 2017 at 03:03:50PM +0100, Laurenz Albe wrote:
Konstantin Knizhnik wrote:
Please notice that it is necessary to configure postgres in proper
way in order to be able to perform time travels. If you do not
disable autovacuum, then old versions will be just cleaned-up. If
transaction commit timestamps are not tracked, then it is not
possible to locate required timeline.So DBA should make a decision in advance whether this feature is
needed or not. It is not a proper instrument for
restoring/auditing existed database which was not configured to
keep all versions.Of course; you'd have to anticipate the need to travel in time, and
you have to pay the price for it. Anybody who has read science
fiction stories know that time travel does not come free.
A few extra terabytes' worth of storage space is a pretty small price
to pay, at least on the scale of time travel penalties.
May be it is better to add special configuration parameter for
this feature which should implicitly toggle autovacuum and
track_commit_timestamp parameters).The feature would be most useful with some kind of "moving xid
horizon" that guarantees that only dead tuples whose xmax lies more
than a certain time interval in the past can be vacuumed.
+1 for this horizon. It would be very nice, but maybe not strictly
necessary, for this to be adjustable downward without a restart.
It's not clear that adjusting it upward should work at all, but if it
did, the state of dead tuples would have to be known, and they'd have
to be vacuumed a way that was able to establish a guarantee of
gaplessness at least back to the new horizon. Maybe there could be
some kind of "high water mark" for it. Would that impose overhead or
design constraints on vacuum that we don't want?
Also nice but not strictly necessary, making it tunable per relation,
or at least per table. I'm up in the air as to whether queries with
an AS OF older than the horizon[1]If we allow setting this at granularities coarser than DB instance, this means going as far back as the relationship with the newest "last" tuple among the relations involved in the query. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 should error out or merely throw
warnings.
Best,
David.
[1]: If we allow setting this at granularities coarser than DB instance, this means going as far back as the relationship with the newest "last" tuple among the relations involved in the query. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778
instance, this means going as far back as the relationship with the
newest "last" tuple among the relations involved in the query.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On 12/20/17 10:29, Tom Lane wrote:
Please say that's just an Oracle-ism and not SQL standard, because it's
formally ambiguous. This is required to work by spec:regression=# select x as of from (values(1)) t(x);
of
----
1
(1 row)so it's not possible for us ever to support an expression that includes
top-level "AS OF" (or, pretty much, "AS anything") without some rather
enormous pushups.
The SQL standard syntax appears to be something like
"tablename" [ AS OF SYSTEM TIME 'something' ] [ [ AS ] "alias" ]
That's not going to be fun to parse.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
On 12/20/17 10:29, Tom Lane wrote:
Please say that's just an Oracle-ism and not SQL standard, because it's
formally ambiguous.
The SQL standard syntax appears to be something like
"tablename" [ AS OF SYSTEM TIME 'something' ] [ [ AS ] "alias" ]
That's not going to be fun to parse.
Bleah. In principle we could look two tokens ahead so as to recognize
"AS OF SYSTEM", but base_yylex is already a horrid mess with one-token
lookahead; I don't much want to try to extend it to that.
Possibly the most workable compromise is to use lookahead to convert
"AS OF" to "AS_LA OF", and then we could either just break using OF
as an alias, or add an extra production that allows "AS_LA OF" to
be treated as "AS alias" if it's not followed by the appropriate
stuff.
It's a shame that the SQL committee appears to be so ignorant of
standard parsing technology.
regards, tom lane
On Wed, Dec 20, 2017 at 5:17 PM, Peter Eisentraut <
peter.eisentraut@2ndquadrant.com> wrote:
On 12/20/17 10:29, Tom Lane wrote:
Please say that's just an Oracle-ism and not SQL standard, because it's
formally ambiguous. This is required to work by spec:regression=# select x as of from (values(1)) t(x);
of
----
1
(1 row)so it's not possible for us ever to support an expression that includes
top-level "AS OF" (or, pretty much, "AS anything") without some rather
enormous pushups.The SQL standard syntax appears to be something like
"tablename" [ AS OF SYSTEM TIME 'something' ] [ [ AS ] "alias" ]
That's not going to be fun to parse.
There was a presentation about this given at FOSDEM PGDay a couple of years
back. Slides at
https://wiki.postgresql.org/images/6/64/Fosdem20150130PostgresqlTemporal.pdf
.
--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/>
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>
On Wed, Dec 20, 2017 at 4:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
On 12/20/17 10:29, Tom Lane wrote:
Please say that's just an Oracle-ism and not SQL standard, because it's
formally ambiguous.The SQL standard syntax appears to be something like
"tablename" [ AS OF SYSTEM TIME 'something' ] [ [ AS ] "alias" ]
That's not going to be fun to parse.
Examples from DB2 documentation (which may be closer to the standard):
SELECT coverage_amt
FROM policy FOR SYSTEM_TIME AS OF '2010-12-01'
WHERE id = 1111;
SELECT count(*)
FROM policy FOR SYSTEM_TIME FROM '2011-11-30'
TO '9999-12-30'
WHERE vin = 'A1111';
So besides AS .. AS , it could also be FROM .. FROM
Show quoted text
Bleah. In principle we could look two tokens ahead so as to recognize
"AS OF SYSTEM", but base_yylex is already a horrid mess with one-token
lookahead; I don't much want to try to extend it to that.Possibly the most workable compromise is to use lookahead to convert
"AS OF" to "AS_LA OF", and then we could either just break using OF
as an alias, or add an extra production that allows "AS_LA OF" to
be treated as "AS alias" if it's not followed by the appropriate
stuff.It's a shame that the SQL committee appears to be so ignorant of
standard parsing technology.regards, tom lane
On 20/12/17 14:48, Konstantin Knizhnik wrote:
On 20.12.2017 16:12, Laurenz Albe wrote:
Konstantin Knizhnik wrote:
I wonder if Postgres community is interested in supporting time travel
queries in PostgreSQL (something like AS OF queries in Oracle:
https://docs.oracle.com/cd/B14117_01/appdev.101/b10795/adfns_fl.htm).
As far as I know something similar is now developed for MariaDB.I think that would be a good thing to have that could make
the DBA's work easier - all the requests to restore a table
to the state from an hour ago.Please notice that it is necessary to configure postgres in proper way
in order to be able to perform time travels.
This makes sense. BTW, I believe this feature would be an amazing
addition to PostgreSQL.
If you do not disable autovacuum, then old versions will be just
cleaned-up.
If transaction commit timestamps are not tracked, then it is not
possible to locate required timeline.So DBA should make a decision in advance whether this feature is
needed or not.
It is not a proper instrument for restoring/auditing existed database
which was not configured to keep all versions.May be it is better to add special configuration parameter for this
feature which should implicitly toggle
autovacuumand track_commit_timestamp parameters).
Downthread a "moving xid horizon" is proposed. I believe this is
not too user friendly. I'd rather use a timestamp horizon (e.g. "up to 2
days ago"). Given that the commit timestamp is tracked, I don't think
this is an issue. This is the same as the undo_retention in Oracle,
which is expressed in seconds.
The obvious drawbacks of keeping all versions are
1. Increased size of database.
2. Decreased query execution speed because them need to traverse a lot
of not visible versions.
In other words, what is nowadays called "bloat". I have seen in the
field a lot of it. Not everybody tunes vacuum to keep up to date. So I
don't expect this feature to be too expensive for many. While at the
same time an awesome addition, not to fire a new separate server and
exercise PITR, and then find the ways to move the old data around.
Regards,
Álvaro
--
Alvaro Hernandez
-----------
OnGres
On 21 December 2017 at 00:17, Peter Eisentraut <
peter.eisentraut@2ndquadrant.com> wrote:
On 12/20/17 10:29, Tom Lane wrote:
Please say that's just an Oracle-ism and not SQL standard, because it's
formally ambiguous. This is required to work by spec:regression=# select x as of from (values(1)) t(x);
of
----
1
(1 row)so it's not possible for us ever to support an expression that includes
top-level "AS OF" (or, pretty much, "AS anything") without some rather
enormous pushups.The SQL standard syntax appears to be something like
"tablename" [ AS OF SYSTEM TIME 'something' ] [ [ AS ] "alias" ]
That's not going to be fun to parse.
Well, the SQL committe seem to specialise in parser torture.
Window functions, anybody?
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 20.12.2017 19:26, Tom Lane wrote:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
On 12/20/17 10:29, Tom Lane wrote:
Please say that's just an Oracle-ism and not SQL standard, because it's
formally ambiguous.The SQL standard syntax appears to be something like
"tablename" [ AS OF SYSTEM TIME 'something' ] [ [ AS ] "alias" ]
That's not going to be fun to parse.Bleah. In principle we could look two tokens ahead so as to recognize
"AS OF SYSTEM", but base_yylex is already a horrid mess with one-token
lookahead; I don't much want to try to extend it to that.Possibly the most workable compromise is to use lookahead to convert
"AS OF" to "AS_LA OF", and then we could either just break using OF
as an alias, or add an extra production that allows "AS_LA OF" to
be treated as "AS alias" if it's not followed by the appropriate
stuff.It's a shame that the SQL committee appears to be so ignorant of
standard parsing technology.regards, tom lane
Thank you for suggestion with AS_LA: it really works.
Actually instead of AS_LA I just return ASOF token if next token after
AS is OF.
So now it is possible to write query in this way:
select * from foo as of timestamp '2017-12-21 14:12:15.1867';
There is still one significant difference of my prototype implementation
with SQL standard: it associates timestamp with select statement, not
with particular table.
It seems to be more difficult to support and I am not sure that joining
tables from different timelines has much sense.
But certainly it also can be fixed.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
asof-2.patchtext/x-patch; name=asof-2.patchDownload+534-15
On Thu, Dec 21, 2017 at 05:00:35PM +0300, Konstantin Knizhnik wrote:
On 20.12.2017 19:26, Tom Lane wrote:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
On 12/20/17 10:29, Tom Lane wrote:
Please say that's just an Oracle-ism and not SQL standard, because it's
formally ambiguous.The SQL standard syntax appears to be something like
"tablename" [ AS OF SYSTEM TIME 'something' ] [ [ AS ] "alias" ]
That's not going to be fun to parse.Bleah. In principle we could look two tokens ahead so as to recognize
"AS OF SYSTEM", but base_yylex is already a horrid mess with one-token
lookahead; I don't much want to try to extend it to that.Possibly the most workable compromise is to use lookahead to convert
"AS OF" to "AS_LA OF", and then we could either just break using OF
as an alias, or add an extra production that allows "AS_LA OF" to
be treated as "AS alias" if it's not followed by the appropriate
stuff.It's a shame that the SQL committee appears to be so ignorant of
standard parsing technology.regards, tom lane
Thank you for suggestion with AS_LA: it really works.
Actually instead of AS_LA I just return ASOF token if next token after AS is
OF.
So now it is possible to write query in this way:��� select * from foo as of timestamp '2017-12-21 14:12:15.1867';
Thanks for your hard work so far on this! It looks really exciting.
There is still one significant difference of my prototype implementation
with SQL standard: it associates timestamp with select statement, not with
particular table.
It seems to be more difficult to support and I am not sure that joining
tables from different timelines has much sense.
I can think of a use case right offhand that I suspect would be very
common: comparing the state of a table at multiple times.
But certainly it also can be fixed.
That would be really fantastic.
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On 20 December 2017 at 12:45, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
It seems to me that it will be not so difficult to implement them in
Postgres - we already have versions of tuples.
Looks like we only need to do three things:
1. Disable autovacuum (autovacuum = off)
"The Wheel of Time turns, and Ages come and pass, leaving memories
that become legend. Legend fades to myth, and even myth is long
forgotten when the Age that gave it birth comes again"
I think you'll find it a lot harder to get this to work than just
disabling autovacuum. Notably HOT updates can get cleaned up (and even
non-HOT updates can now leave tombstone dead line pointers iirc) even
if vacuum hasn't run.
We do have the infrastructure to deal with that. c.f.
vacuum_defer_cleanup_age. So in _theory_ you could create a snapshot
with xmin older than recent_global_xmin as long as it's not more than
vacuum_defer_cleanup_age older. But the devil will be in the details.
It does mean that you'll be making recent_global_xmin move backwards
which it has always been promised to *not* do
Then there's another issue that logical replication has had to deal
with -- catalog changes. You can't start looking at tuples that have a
different structure than the current catalog unless you can figure out
how to use the logical replication infrastructure to use the old
catalogs. That's a huge problem to bite off and probably can just be
left for another day if you can find a way to reliably detect the
problem and raise an error if the schema is inconsistent.
Postgres used to have time travel. I think it's come up more than once
in the pasts as something that can probably never come back due to
other decisions made. If more decisions have made it possible again
that will be fascinating.
--
greg
On Fri, Dec 22, 2017 at 11:08:02PM +0000, Greg Stark wrote:
On 20 December 2017 at 12:45, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:It seems to me that it will be not so difficult to implement them in
Postgres - we already have versions of tuples.
Looks like we only need to do three things:
1. Disable autovacuum (autovacuum = off)"The Wheel of Time turns, and Ages come and pass, leaving memories
that become legend. Legend fades to myth, and even myth is long
forgotten when the Age that gave it birth comes again"
I would be amazed if you have been able to finish the 14 volumes of the
series. There is a lot of content to take.
Postgres used to have time travel. I think it's come up more than once
in the pasts as something that can probably never come back due to
other decisions made. If more decisions have made it possible again
that will be fascinating.
This subject is showing up a couple of times lately, things would
be interested to see. What I am sure about is that people are not
willing to emulate that with triggers and two extra columns per table.
--
Michael
On Dec 23, 2017, at 2:08 AM, Greg Stark wrote:
On 20 December 2017 at 12:45, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:It seems to me that it will be not so difficult to implement them in
Postgres - we already have versions of tuples.
Looks like we only need to do three things:
1. Disable autovacuum (autovacuum = off)"The Wheel of Time turns, and Ages come and pass, leaving memories
that become legend. Legend fades to myth, and even myth is long
forgotten when the Age that gave it birth comes again"I think you'll find it a lot harder to get this to work than just
disabling autovacuum. Notably HOT updates can get cleaned up (and even
non-HOT updates can now leave tombstone dead line pointers iirc) even
if vacuum hasn't run.
Yeh, I suspected that just disabling autovacuum was not enough.
I heard (but do no know too much) about microvacuum and hot updates.
This is why I was a little bit surprised when me test didn't show lost of updated versions.
May be it is because of vacuum_defer_cleanup_age.
We do have the infrastructure to deal with that. c.f.
vacuum_defer_cleanup_age. So in _theory_ you could create a snapshot
with xmin older than recent_global_xmin as long as it's not more than
vacuum_defer_cleanup_age older. But the devil will be in the details.
It does mean that you'll be making recent_global_xmin move backwards
which it has always been promised to *not* do
But what if I just forbid to change recent_global_xmin?
If it is stalled at FirstNormalTransactionId and never changed?
Will it protect all versions from been deleted?
Then there's another issue that logical replication has had to deal
with -- catalog changes. You can't start looking at tuples that have a
different structure than the current catalog unless you can figure out
how to use the logical replication infrastructure to use the old
catalogs. That's a huge problem to bite off and probably can just be
left for another day if you can find a way to reliably detect the
problem and raise an error if the schema is inconsistent.
Yes, catalog changes this is another problem of time travel.
I do not know any suitable way to handle several different catalog snapshots in one query.
But I think that there are a lot of cases where time travels without possibility of database schema change still will be useful.
The question is how we should handle such catalog changes if them are happen. Ideally we should not allow to move back beyond this point.
Unfortunately it is not so easy to implement.
Show quoted text
Postgres used to have time travel. I think it's come up more than once
in the pasts as something that can probably never come back due to
other decisions made. If more decisions have made it possible again
that will be fascinating.--
greg
On 21/12/17 15:00, Konstantin Knizhnik wrote:
On 20.12.2017 19:26, Tom Lane wrote:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
On 12/20/17 10:29, Tom Lane wrote:
Please say that's just an Oracle-ism and not SQL standard, because
it's
formally ambiguous.The SQL standard syntax appears to be something like
"tablename" [ AS OF SYSTEM TIME 'something' ] [ [ AS ] "alias" ]
That's not going to be fun to parse.Bleah. In principle we could look two tokens ahead so as to recognize
"AS OF SYSTEM", but base_yylex is already a horrid mess with one-token
lookahead; I don't much want to try to extend it to that.Possibly the most workable compromise is to use lookahead to convert
"AS OF" to "AS_LA OF", and then we could either just break using OF
as an alias, or add an extra production that allows "AS_LA OF" to
be treated as "AS alias" if it's not followed by the appropriate
stuff.It's a shame that the SQL committee appears to be so ignorant of
standard parsing technology.regards, tom lane
Thank you for suggestion with AS_LA: it really works.
Actually instead of AS_LA I just return ASOF token if next token after
AS is OF.
So now it is possible to write query in this way:select * from foo as of timestamp '2017-12-21 14:12:15.1867';
There is still one significant difference of my prototype
implementation with SQL standard: it associates timestamp with select
statement, not with particular table.
It seems to be more difficult to support and I am not sure that
joining tables from different timelines has much sense.
But certainly it also can be fixed.
If the standard is "AS OF SYSTEM TIME" and we're going to deviate
and go for "AS OF TIMESTAMP", I'd recommend then, if possible, to:
- Make "TIMESTAMP" optional, i.e., "AS OF [TIMESTAMP] <timestamp>"
- Augment the syntax to support also a transaction id, similar to
Oracle's "AS OF SCN <scn>": "AS OF TRANSACTION <txid>".
Merry Christmas,
Álvaro
--
Alvaro Hernandez
-----------
OnGres
On 24 December 2017 at 04:53, konstantin knizhnik <k.knizhnik@postgrespro.ru
wrote:
But what if I just forbid to change recent_global_xmin?
If it is stalled at FirstNormalTransactionId and never changed?
Will it protect all versions from been deleted?
That's totally impractical, you'd have unbounded bloat and a nonfunctional
system in no time.
You'd need a mechanism - akin to what we have with replication slots - to
set a threshold for age.
Then there's another issue that logical replication has had to deal
with -- catalog changes. You can't start looking at tuples that have a
different structure than the current catalog unless you can figure out
how to use the logical replication infrastructure to use the old
catalogs. That's a huge problem to bite off and probably can just be
left for another day if you can find a way to reliably detect the
problem and raise an error if the schema is inconsistent.Yes, catalog changes this is another problem of time travel.
I do not know any suitable way to handle several different catalog
snapshots in one query.
I doubt it's practical unless you can extract it to subplans that can be
materialized separately. Even then, UDTs, rowtype results, etc...
The question is how we should handle such catalog changes if them are
happen. Ideally we should not allow to move back beyond this point.
Unfortunately it is not so easy to implement.
I think you can learn a lot from studying logical decoding here.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services