CREATE TABLE, load and freezing

Started by Pavan Deolaseealmost 18 years ago11 messages
#1Pavan Deolasee
pavan.deolasee@gmail.com

I had this idea sometime back. Not sure if this has been discussed before

In a typical scenario, user might create a table and load data in the table as
part of a single transaction (e.g pg_restore). In this case, it would help if we
create the tuples in the *frozen* state to avoid any wrap-around related issues
with the table. Without this, very large read-only tables would
require one round of
complete freezing if there are lot of transactional activities in the
other parts
of the database. And when that happens, it would generate lots of unnecessary
IOs on these large tables.

I don't know if this is a real problem for anybody, but I could think
of its use case, at least in theory.

Is it worth doing ?

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

#2Florian G. Pflug
fgp@phlo.org
In reply to: Pavan Deolasee (#1)
Re: CREATE TABLE, load and freezing

Pavan Deolasee wrote:

In a typical scenario, user might create a table and load data in the
table as part of a single transaction (e.g pg_restore). In this case,
it would help if we create the tuples in the *frozen* state to avoid
any wrap-around related issues with the table. Without this, very
large read-only tables would require one round of complete freezing
if there are lot of transactional activities in the other parts of
the database. And when that happens, it would generate lots of
unnecessary IOs on these large tables.

If that works, then we might also want to set the visibility hint bits.
Not because lookup of that information is expensive - the tuples all
came from the same transaction, virtually guaranteeing that the relevent
pg_clog page stays in memory after the first few pages.
But by setting them immediatly we'd save some IO, since we won't dirty
all pages during the first scan.

I don't know if this is a real problem for anybody, but I could think
of its use case, at least in theory.

A cannot speak for freeze-on-restore, but in a project I'm currently
working on, the IO caused (I guess) by hint-bit updates during the
first scan of the table is at least noticeably...

regards, Florian Pflug

#3ITAGAKI Takahiro
itagaki.takahiro@oss.ntt.co.jp
In reply to: Pavan Deolasee (#1)
Re: CREATE TABLE, load and freezing

"Pavan Deolasee" <pavan.deolasee@gmail.com> wrote:

In a typical scenario, user might create a table and load data in the table as
part of a single transaction (e.g pg_restore). In this case, it would help if we
create the tuples in the *frozen* state to avoid any wrap-around related issues
with the table.

Sounds cool. I recommended users to do VACUUM FREEZE just after initial
loading, but we can avoid it with your method.

Without this, very large read-only tables would require one round of
complete freezing if there are lot of transactional activities in the other parts
of the database. And when that happens, it would generate lots of unnecessary
IOs on these large tables.

To make things worse, the freezing day comes at once because the first restore
is done in a single or near transactions; The wraparound timings of many
tables are aligned at the same time. Freezing copy will be the solution.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

#4Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Florian G. Pflug (#2)
Re: CREATE TABLE, load and freezing

On Thu, Feb 28, 2008 at 3:05 PM, Florian G. Pflug <fgp@phlo.org> wrote:

If that works, then we might also want to set the visibility hint bits.

Oh yes. Especially because random time-scattered index scans on
the table can actually generate multiple writes of a page of a
read-only table.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

#5Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: ITAGAKI Takahiro (#3)
Re: CREATE TABLE, load and freezing

On Thu, Feb 28, 2008 at 3:25 PM, ITAGAKI Takahiro
<itagaki.takahiro@oss.ntt.co.jp> wrote:

Sounds cool. I recommended users to do VACUUM FREEZE just after initial
loading, but we can avoid it with your method.

Yeah, and the additional step of VACUUM FREEZE adds up to the restore
time.

To make things worse, the freezing day comes at once because the first restore
is done in a single or near transactions; The wraparound timings of many
tables are aligned at the same time. Freezing copy will be the solution.

If we can start with a freezed table and even if the table is
subsequently updated,
hopefully DSM (or something of that sort) will help us reduce the vacuum freeze
time whenever its required.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

#6Heikki Linnakangas
heikki@enterprisedb.com
In reply to: Pavan Deolasee (#4)
Re: CREATE TABLE, load and freezing

Pavan Deolasee wrote:

On Thu, Feb 28, 2008 at 3:05 PM, Florian G. Pflug <fgp@phlo.org> wrote:

If that works, then we might also want to set the visibility hint bits.

Oh yes. Especially because random time-scattered index scans on
the table can actually generate multiple writes of a page of a
read-only table.

I remember that Simon tried to set hint bits as well when he wrote the
"skip WAL on new table" optimization, but there was some issues with it.
I can't remember the details, but I think it was related to commands in
the same transaction seeing the tuples too early. Like triggers, or
portals opened before the COPY.

Hint bits is the critical part of the issue. If you can set the hint
bits, then you can freeze as well, but freezing without setting hint
bits doesn't buy you much.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#7Florian Pflug
fgp.phlo.org@gmail.com
In reply to: ITAGAKI Takahiro (#3)
Re: CREATE TABLE, load and freezing

ITAGAKI Takahiro wrote:

Without this, very large read-only tables would require one round of
complete freezing if there are lot of transactional activities in the other parts
of the database. And when that happens, it would generate lots of unnecessary
IOs on these large tables.

To make things worse, the freezing day comes at once because the first restore
is done in a single or near transactions; The wraparound timings of many
tables are aligned at the same time. Freezing copy will be the solution.

Hm.. Couldn't we eliminate that particular concern easily by adding some
randomness to the freeze_age?

regards, Florian Pflug

#8Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Pavan Deolasee (#1)
Re: CREATE TABLE, load and freezing

On Thu, Feb 28, 2008 at 3:08 AM, in message

<2e78013d0802280108s59e52e14hae032e1e67409d6c@mail.gmail.com>, "Pavan Deolasee"
<pavan.deolasee@gmail.com> wrote:

I had this idea sometime back. Not sure if this has been discussed before

There was a thread discussing the problems you're looking to address:

http://archives.postgresql.org/pgsql-performance/2007-12/msg00230.php

I don't know if this is a real problem for anybody, but I could think
of its use case, at least in theory.

Yeah, it's real. We are now doing a VACUUM FREEZE of a table or
database which has been freshly loaded. If you can load them
frozen and/or with hint bits, that would reduce the time to bring
a database online. It would be much appreciated here.

-Kevin

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavan Deolasee (#1)
Re: CREATE TABLE, load and freezing

"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:

In a typical scenario, user might create a table and load data in the
table as part of a single transaction (e.g pg_restore). In this case,
it would help if we create the tuples in the *frozen* state to avoid
any wrap-around related issues with the table.

We've heard that idea before, and it's just as bad as it was when
proposed before. "Pre-frozen" tuples eliminate any possibility of
tracking when a tuple was inserted; which is extremely important to know
when you are trying to do forensic analysis of a broken table. The
point of the current design is to not throw away information about tuple
insertion time until the tuple is old enough that the info is (probably)
not interesting anymore.

regards, tom lane

#10Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Tom Lane (#9)
Re: CREATE TABLE, load and freezing

On Fri, Feb 29, 2008 at 9:15 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

We've heard that idea before, and it's just as bad as it was when
proposed before. "Pre-frozen" tuples eliminate any possibility of
tracking when a tuple was inserted; which is extremely important to know
when you are trying to do forensic analysis of a broken table. The
point of the current design is to not throw away information about tuple
insertion time until the tuple is old enough that the info is (probably)
not interesting anymore.

Understood. But if we consider a special case of creation and loading
of a table in a single transaction, we can possibly save the information
that the table was loaded with pre-frozen tuples with xmin equals to the
transaction creating the table.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

#11Simon Riggs
simon@2ndquadrant.com
In reply to: Pavan Deolasee (#1)
Re: CREATE TABLE, load and freezing

On Thu, 2008-02-28 at 14:38 +0530, Pavan Deolasee wrote:

I had this idea sometime back. Not sure if this has been discussed before

Check the archives for my post to hackers in Jan 2007 and subsequent
discussion. It's possible, just a little fiddly.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com

PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk