GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

Started by Fabrízio de Royes Melloalmost 11 years ago15 messages

fabriziomello@gmail.com

almost 11 years ago

Hi all,

Below I written my proposal idea to this GSoC.

*** Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED”
statement ***

Last year during the GSoC2014 I implemented the feature to allow an
unlogged table to be changed to logged [1]https://wiki.postgresql.org/wiki/Allow_an_unlogged_table_to_be_changed_to_logged_GSoC_2014, but the desing chosen was to
rewrite the entire heap in a new relfilenode with a new relpersistence
because some troubles pointed here [2]/messages/by-id/CA+Tgmob44LNwwU73N1aJsGQyzQ61SdhKJRC_89wCm0+aLg=x2Q@mail.gmail.com.

The project was successfully finished and got committed [3]http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=f41872d0c1239d36ab03393c39ec0b70e9ee2a3c into PostgreSQL
to be released this year in the 9.5 version.

However this design lead us to performance problems with large relations
because we need to rewrite the entire content of the relation twice, one
into a new heap and other into the WAL, so this project will change the
current desing of the mechanism of change an unlogged table to logged
without the need to rewrite the entire heap, but just by removing the init
forks and if the wal_level != minimal we’ll write the contents to the WAL
too.

** Benefits to the PostgreSQL Community **

The “unlogged” tables feature was introduced by 9.1 version, and provide
better write performance than regular tables (logged), but are not
crash-safe. Their contents are automatically discarded (cleared) if the
server crashes. Also, their contents do not propagate to standby servers.

We already have a way to change an unlogged table to logged using “ALTER
TABLE name SET LOGGED” developed last year during the GSoC2014, now during
the GSoC2015 we’ll redesing the internals to improve the I/O performance of
this feature removing the need of rewrite the entire heap into a new
relfilenode.

The are a good idea about the desing here [4]/messages/by-id/CA+TgmoZM+-0R7h0eDPzZjbokVVQ+gAVKChmno4fypVEccW-EqA@mail.gmail.com, but I’ll discuss the design
with my mentor to implement this improvement.

** Additional Goals **

The main goal of this project is improve the performance of the ALTER TABLE
name SET {LOGGED|UNLOGGED}, but we can expand this propostal to more
related goals.

* Allow unlogged materialized views
. ALTER MATERIALIZED VIEW name SET { UNLOGGED | LOGGED }
* Allow unlogged indexes on logged tables
. ALTER INDEX name SET { UNLOGGED | LOGGED }

** Deliverables **

This project has just one deliverable at the end. The deliverable will be
the improvement of the routines that transform an “unlogged” table to
“logged” and “logged” to “unlogged”, without the need to create a new
“relfilenode” with a different “relpersistence”.

** Project Schedule **

until May 25:
* create a website to the project (wiki.postgresql.org)
* create a public repository to the project (github.com/fabriziomello)
* read what has already been discussed by the community about the project
[4]: /messages/by-id/CA+TgmoZM+-0R7h0eDPzZjbokVVQ+gAVKChmno4fypVEccW-EqA@mail.gmail.com
* learn about some PostgreSQL internals:
. control data (src/include/catalog/pg_control.h)
. storage (src/backend/storage/*)
* discuss the additional goals with community

May 26 - June 21
* implementation of the first prototype:
. implement the change of unlogged table to logged without rewrite the
entire heap when “wal_level = minimal”
. at this point when “wal_level != minimal” we use the current
implementation
* write documentation and the test cases
* submit this first prototype to the commitfest 2015/06 (
https://commitfest.postgresql.org/5/)

June 22 - June 26
* mentor review the work in progress

June 27 - August 17
* do the adjustments based on the community feedback during the commitfest
2015/06
* implementation of the second prototype:
. when “wal_level != minimal” we’ll remove the init fork (first
prototype) and write relation pages to the WAL.
. implement “ALTER MATERIALIZED VIEW .. SET LOGGED / UNLOGGED”
* submit to the commitfest 2015/09 for final evaluation and maybe will be
committed to 9.6 version (webpage don’t created yet)

August 18 - August 21
* do the adjustments based on the community feedback during the commitfest
2015/09
* final mentor review

** About the proponent **

Fabrízio de Royes Mello

e-mail: fabriziomello@gmail.com
twitter: @fabriziomello
github: http://github.com/fabriziomello
linkedin: http://linkedin.com/in/fabriziomello

Currently I help people and teams to take the full potential of relational
databases, especially PostgreSQL, helping teams to design the structure of
the database (modeling), build physical architecture (database schema),
programming (procedural languages), SQL (usage, tuning, best practices),
optimization and orchestration of instances in production too. I perform a
volunteer work for Brazilian Community of PostgreSQL (www.postgresql.org.br),
supporting mailing lists, organizing events (pgbr.postgresql.org.br) and
some admin tasks. And also I help a little the PostgreSQL Global
Development Group (PGDG) in the implementation of some features and review
of patches (git.postgresql.org).

Links

[1]: https://wiki.postgresql.org/wiki/Allow_an_unlogged_table_to_be_changed_to_logged_GSoC_2014
https://wiki.postgresql.org/wiki/Allow_an_unlogged_table_to_be_changed_to_logged_GSoC_2014
[2]: /messages/by-id/CA+Tgmob44LNwwU73N1aJsGQyzQ61SdhKJRC_89wCm0+aLg=x2Q@mail.gmail.com
/messages/by-id/CA+Tgmob44LNwwU73N1aJsGQyzQ61SdhKJRC_89wCm0+aLg=x2Q@mail.gmail.com
[3]: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=f41872d0c1239d36ab03393c39ec0b70e9ee2a3c
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=f41872d0c1239d36ab03393c39ec0b70e9ee2a3c
[4]: /messages/by-id/CA+TgmoZM+-0R7h0eDPzZjbokVVQ+gAVKChmno4fypVEccW-EqA@mail.gmail.com
/messages/by-id/CA+TgmoZM+-0R7h0eDPzZjbokVVQ+gAVKChmno4fypVEccW-EqA@mail.gmail.com

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello

Robert Haas

robertmhaas@gmail.com

almost 11 years ago

In reply to: Fabrízio de Royes Mello (#1)

Re: [HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

On Wed, Mar 25, 2015 at 9:46 PM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

/messages/by-id/CA+TgmoZM+-0R7h0eDPzZjbokVVQ+gAVKChmno4fypVEccW-EqA@mail.gmail.com

I like the idea of the feature a lot, but the proposal to which you
refer here mentions some problems, which I'm curious how you think you
might solve. (I don't have any good ideas myself, beyond what I
mentioned there.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Fabrízio de Royes Mello

fabriziomello@gmail.com

almost 11 years ago

In reply to: Robert Haas (#2)

Re: [HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

On Thu, Apr 2, 2015 at 3:24 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Mar 25, 2015 at 9:46 PM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

/messages/by-id/CA+TgmoZM+-0R7h0eDPzZjbokVVQ+gAVKChmno4fypVEccW-EqA@mail.gmail.com

I like the idea of the feature a lot, but the proposal to which you
refer here mentions some problems, which I'm curious how you think you
might solve. (I don't have any good ideas myself, beyond what I
mentioned there.)

How the cleanup will happens?

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello

Fabrízio de Royes Mello

fabriziomello@gmail.com

over 10 years ago

In reply to: Robert Haas (#2)

Re: [HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

On Thu, Apr 2, 2015 at 3:24 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Mar 25, 2015 at 9:46 PM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

/messages/by-id/CA+TgmoZM+-0R7h0eDPzZjbokVVQ+gAVKChmno4fypVEccW-EqA@mail.gmail.com

I like the idea of the feature a lot, but the proposal to which you
refer here mentions some problems, which I'm curious how you think you
might solve. (I don't have any good ideas myself, beyond what I
mentioned there.)

You're right, and we have another design (steps 1 and 2 was implemented
last year):

*** ALTER TABLE changes

1) ATController
1.1) Acquire an AccessExclusiveLock (src/backend/commands/tablecmds.c -
AlterTableGetLockLevel:3023)

2) Prepare to change relpersistence (src/backend/commands/tablecmds.c -
ATPrepCmd:3249-3270)
• check temp table (src/backend/commands/tablecmds.c -
ATPrepChangePersistence:11074)
• check foreign key constraints (src/backend/commands/tablecmds.c -
ATPrepChangePersistence:11102)

3) FlushRelationBuffers, DropRelFileNodeBuffers and smgrimmedsync
(MAIN_FORKNUM, FSM_FORKNUM, VISIBILITYMAP_FORKNUM and INIT_FORKNUM if
exists)

4) Create a new fork called "TRANSIENT INIT FORK":

• from Unlogged to Logged (create _initl fork) (INIT_TO_LOGGED_FORKNUM)
∘ new forkName (src/common/relpath.c) called "_initl"
∘ insert XLog record to drop it if transaction abort

• from Logged to Unlogged (create _initu fork) (INIT_TO_UNLOGGED_FORKUM)
∘ new forkName (src/common/relpath.c) called "_initu"
∘ insert XLog record to drop it if transaction abort

5) Change the relpersistence in catalog (pg_class->relpersistence) (heap,
toast, indexes)

6) Remove/Create INIT_FORK

• from Unlogged to Logged
∘ remove the INIT_FORK and INIT_TO_LOGGED_FORK adding to the
pendingDeletes queue
• from Logged to Unlogged
∘ remove the INIT_TO_UNLOGGED_FORK adding to the pendingDeletes queue
∘ create the INIT_FORK using "heap_create_init_fork"
∘ insert XLog record to drop init fork if the transaction abort

*** CRASH RECOVERY changes

1) During crash recovery
(src/backend/access/transam/xlog.c:6507:ResetUnloggedRelations)

• if the transient fork "_initl" exists then
∘ drop the transient fork "_initl"
∘ if the init fork doesn't exist then create it
∘ reset relation
• if the transient fork "_initu" exists then
∘ drop the transient fork "_initl"
∘ if the init fork exists then drop it
∘ don't reset the relation

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello

Ashutosh Bapat

ashutosh.bapat@enterprisedb.com

over 10 years ago

In reply to: Fabrízio de Royes Mello (#4)

Re: [HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

On Fri, Jul 3, 2015 at 4:48 PM, Fabrízio de Royes Mello <
fabriziomello@gmail.com> wrote:

On Thu, Apr 2, 2015 at 3:24 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Mar 25, 2015 at 9:46 PM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

/messages/by-id/CA+TgmoZM+-0R7h0eDPzZjbokVVQ+gAVKChmno4fypVEccW-EqA@mail.gmail.com

I like the idea of the feature a lot, but the proposal to which you
refer here mentions some problems, which I'm curious how you think you
might solve. (I don't have any good ideas myself, beyond what I
mentioned there.)

You're right, and we have another design (steps 1 and 2 was implemented
last year):

*** ALTER TABLE changes

1) ATController
1.1) Acquire an AccessExclusiveLock (src/backend/commands/tablecmds.c
- AlterTableGetLockLevel:3023)

2) Prepare to change relpersistence (src/backend/commands/tablecmds.c -
ATPrepCmd:3249-3270)
• check temp table (src/backend/commands/tablecmds.c -
ATPrepChangePersistence:11074)
• check foreign key constraints (src/backend/commands/tablecmds.c -
ATPrepChangePersistence:11102)

3) FlushRelationBuffers, DropRelFileNodeBuffers and smgrimmedsync
(MAIN_FORKNUM, FSM_FORKNUM, VISIBILITYMAP_FORKNUM and INIT_FORKNUM if
exists)

4) Create a new fork called "TRANSIENT INIT FORK":

• from Unlogged to Logged (create _initl fork) (INIT_TO_LOGGED_FORKNUM)
∘ new forkName (src/common/relpath.c) called "_initl"
∘ insert XLog record to drop it if transaction abort

• from Logged to Unlogged (create _initu fork) (INIT_TO_UNLOGGED_FORKUM)
∘ new forkName (src/common/relpath.c) called "_initu"
∘ insert XLog record to drop it if transaction abort

AFAIU, while reading WAL, the server doesn't know whether the transaction
that produced that WAL record aborted or committed. It's only when it sees
a COMMIT/ABORT record down the line, it can confirm whether the transaction
committed or aborted. So, one can only "redo" the things that WAL tells
have been done. We can not "undo" things based on the WAL. We might record
this fact "somewhere" while reading the WAL and act on it once we know the
status of the transaction, but I do not see that as part of this idea. This
comment applies to all the steps inserting WALs for undoing things.

5) Change the relpersistence in catalog (pg_class->relpersistence) (heap,
toast, indexes)

6) Remove/Create INIT_FORK

• from Unlogged to Logged
∘ remove the INIT_FORK and INIT_TO_LOGGED_FORK adding to the
pendingDeletes queue
• from Logged to Unlogged
∘ remove the INIT_TO_UNLOGGED_FORK adding to the pendingDeletes queue
∘ create the INIT_FORK using "heap_create_init_fork"
∘ insert XLog record to drop init fork if the transaction abort

*** CRASH RECOVERY changes

1) During crash recovery
(src/backend/access/transam/xlog.c:6507:ResetUnloggedRelations)

This operation is carried out in two phases: one before replaying WAL
records and second after that. Are you sure that the WALs generated for the
unlogged or logged forks, as described above, have been taken care of?

• if the transient fork "_initl" exists then
∘ drop the transient fork "_initl"
∘ if the init fork doesn't exist then create it
∘ reset relation
• if the transient fork "_initu" exists then
∘ drop the transient fork "_initl"
∘ if the init fork exists then drop it
∘ don't reset the relation

Consider case of converting unlogged to logged. The server crashes after
6th step when both the forks are removed. During crash recovery, it will
not see any of the fork and won't reset the unlogged table. Remember the
table is still unlogged since the transaction was aborted due to the crash.
So, it has to have an init fork to be reset on crash recovery.

Similarly while converting from logged to unlogged. The server crashes in
the 6th step after creating the INIT_FORK, during crash recovery the init
fork will be seen and a supposedly logged table will be trashed.

The ideas in 1 and 2 might be better than having yet another init fork.

1. /messages/by-id/533D457A.4030007@vmware.com

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Fabrízio de Royes Mello

fabriziomello@gmail.com

over 10 years ago

In reply to: Ashutosh Bapat (#5)

Re: [HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

On Fri, Jul 3, 2015 at 9:29 AM, Ashutosh Bapat <
ashutosh.bapat@enterprisedb.com> wrote:

On Fri, Jul 3, 2015 at 4:48 PM, Fabrízio de Royes Mello <

fabriziomello@gmail.com> wrote:

On Thu, Apr 2, 2015 at 3:24 PM, Robert Haas <robertmhaas@gmail.com>

wrote:

On Wed, Mar 25, 2015 at 9:46 PM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

/messages/by-id/CA+TgmoZM+-0R7h0eDPzZjbokVVQ+gAVKChmno4fypVEccW-EqA@mail.gmail.com

I like the idea of the feature a lot, but the proposal to which you
refer here mentions some problems, which I'm curious how you think you
might solve. (I don't have any good ideas myself, beyond what I
mentioned there.)

You're right, and we have another design (steps 1 and 2 was implemented

last year):

*** ALTER TABLE changes

1) ATController
1.1) Acquire an AccessExclusiveLock

(src/backend/commands/tablecmds.c - AlterTableGetLockLevel:3023)

2) Prepare to change relpersistence (src/backend/commands/tablecmds.c -

ATPrepCmd:3249-3270)

• check temp table (src/backend/commands/tablecmds.c -

ATPrepChangePersistence:11074)

• check foreign key constraints (src/backend/commands/tablecmds.c -

ATPrepChangePersistence:11102)

3) FlushRelationBuffers, DropRelFileNodeBuffers and smgrimmedsync

(MAIN_FORKNUM, FSM_FORKNUM, VISIBILITYMAP_FORKNUM and INIT_FORKNUM if
exists)

4) Create a new fork called "TRANSIENT INIT FORK":

• from Unlogged to Logged (create _initl fork) (INIT_TO_LOGGED_FORKNUM)
∘ new forkName (src/common/relpath.c) called "_initl"
∘ insert XLog record to drop it if transaction abort

• from Logged to Unlogged (create _initu fork) (INIT_TO_UNLOGGED_FORKUM)
∘ new forkName (src/common/relpath.c) called "_initu"
∘ insert XLog record to drop it if transaction abort

AFAIU, while reading WAL, the server doesn't know whether the transaction

that produced that WAL record aborted or committed. It's only when it sees
a COMMIT/ABORT record down the line, it can confirm whether the transaction
committed or aborted. So, one can only "redo" the things that WAL tells
have been done. We can not "undo" things based on the WAL. We might record
this fact "somewhere" while reading the WAL and act on it once we know the
status of the transaction, but I do not see that as part of this idea. This
comment applies to all the steps inserting WALs for undoing things.

Even if I "xlog" the create/drop of the transient fork?

5) Change the relpersistence in catalog (pg_class->relpersistence)

(heap, toast, indexes)

6) Remove/Create INIT_FORK

• from Unlogged to Logged
∘ remove the INIT_FORK and INIT_TO_LOGGED_FORK adding to the

pendingDeletes queue

• from Logged to Unlogged
∘ remove the INIT_TO_UNLOGGED_FORK adding to the pendingDeletes queue
∘ create the INIT_FORK using "heap_create_init_fork"
∘ insert XLog record to drop init fork if the transaction abort

*** CRASH RECOVERY changes

1) During crash recovery

(src/backend/access/transam/xlog.c:6507:ResetUnloggedRelations)

This operation is carried out in two phases: one before replaying WAL

records and second after that. Are you sure that the WALs generated for the
unlogged or logged forks, as described above, have been taken care of?

Hummm... actually no...

• if the transient fork "_initl" exists then
∘ drop the transient fork "_initl"
∘ if the init fork doesn't exist then create it
∘ reset relation
• if the transient fork "_initu" exists then
∘ drop the transient fork "_initl"
∘ if the init fork exists then drop it
∘ don't reset the relation

Consider case of converting unlogged to logged. The server crashes after

6th step when both the forks are removed. During crash recovery, it will
not see any of the fork and won't reset the unlogged table. Remember the
table is still unlogged since the transaction was aborted due to the crash.
So, it has to have an init fork to be reset on crash recovery.

Similarly while converting from logged to unlogged. The server crashes in

the 6th step after creating the INIT_FORK, during crash recovery the init
fork will be seen and a supposedly logged table will be trashed.

My intention for the 6th step is all recorded to wal, so if a crash occurs
the recovery process clean the mess.

The ideas in 1 and 2 might be better than having yet another init fork.

The link for idea 2 is missing...

1. /messages/by-id/533D457A.4030007@vmware.com

Unfortunately I completely missed this idea, but is also interesting. But
I'll need to "stamp" in both ways (from unlogged to logged and vice-versa)

During the recovery to check the status of a transaction can I use
src/backend/access/transam/clog.c:TransactionIdGetStatus ? I'm asking
because I don't know this part of code to much.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello

Ashutosh Bapat

ashutosh.bapat@enterprisedb.com

over 10 years ago

In reply to: Fabrízio de Royes Mello (#6)

Re: [HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

On Wed, Jul 8, 2015 at 1:27 AM, Fabrízio de Royes Mello <
fabriziomello@gmail.com> wrote:

On Fri, Jul 3, 2015 at 9:29 AM, Ashutosh Bapat <
ashutosh.bapat@enterprisedb.com> wrote:

On Fri, Jul 3, 2015 at 4:48 PM, Fabrízio de Royes Mello <

fabriziomello@gmail.com> wrote:

On Thu, Apr 2, 2015 at 3:24 PM, Robert Haas <robertmhaas@gmail.com>

wrote:

On Wed, Mar 25, 2015 at 9:46 PM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

/messages/by-id/CA+TgmoZM+-0R7h0eDPzZjbokVVQ+gAVKChmno4fypVEccW-EqA@mail.gmail.com

I like the idea of the feature a lot, but the proposal to which you
refer here mentions some problems, which I'm curious how you think you
might solve. (I don't have any good ideas myself, beyond what I
mentioned there.)

You're right, and we have another design (steps 1 and 2 was implemented

last year):

*** ALTER TABLE changes

1) ATController
1.1) Acquire an AccessExclusiveLock

(src/backend/commands/tablecmds.c - AlterTableGetLockLevel:3023)

2) Prepare to change relpersistence (src/backend/commands/tablecmds.c -

ATPrepCmd:3249-3270)

• check temp table (src/backend/commands/tablecmds.c -

ATPrepChangePersistence:11074)

• check foreign key constraints (src/backend/commands/tablecmds.c -

ATPrepChangePersistence:11102)

3) FlushRelationBuffers, DropRelFileNodeBuffers and smgrimmedsync

(MAIN_FORKNUM, FSM_FORKNUM, VISIBILITYMAP_FORKNUM and INIT_FORKNUM if
exists)

4) Create a new fork called "TRANSIENT INIT FORK":

• from Unlogged to Logged (create _initl fork) (INIT_TO_LOGGED_FORKNUM)
∘ new forkName (src/common/relpath.c) called "_initl"
∘ insert XLog record to drop it if transaction abort

• from Logged to Unlogged (create _initu fork) (INIT_TO_UNLOGGED_FORKUM)
∘ new forkName (src/common/relpath.c) called "_initu"
∘ insert XLog record to drop it if transaction abort

AFAIU, while reading WAL, the server doesn't know whether the

transaction that produced that WAL record aborted or committed. It's only
when it sees a COMMIT/ABORT record down the line, it can confirm whether
the transaction committed or aborted. So, one can only "redo" the things
that WAL tells have been done. We can not "undo" things based on the WAL.
We might record this fact "somewhere" while reading the WAL and act on it
once we know the status of the transaction, but I do not see that as part
of this idea. This comment applies to all the steps inserting WALs for
undoing things.

Even if I "xlog" the create/drop of the transient fork?

Yes.

5) Change the relpersistence in catalog (pg_class->relpersistence)

(heap, toast, indexes)

6) Remove/Create INIT_FORK

• from Unlogged to Logged
∘ remove the INIT_FORK and INIT_TO_LOGGED_FORK adding to the

pendingDeletes queue

• from Logged to Unlogged
∘ remove the INIT_TO_UNLOGGED_FORK adding to the pendingDeletes queue
∘ create the INIT_FORK using "heap_create_init_fork"
∘ insert XLog record to drop init fork if the transaction abort

*** CRASH RECOVERY changes

1) During crash recovery

(src/backend/access/transam/xlog.c:6507:ResetUnloggedRelations)

This operation is carried out in two phases: one before replaying WAL

records and second after that. Are you sure that the WALs generated for the
unlogged or logged forks, as described above, have been taken care of?

Hummm... actually no...

• if the transient fork "_initl" exists then
∘ drop the transient fork "_initl"
∘ if the init fork doesn't exist then create it
∘ reset relation
• if the transient fork "_initu" exists then
∘ drop the transient fork "_initl"
∘ if the init fork exists then drop it
∘ don't reset the relation

Consider case of converting unlogged to logged. The server crashes after

6th step when both the forks are removed. During crash recovery, it will
not see any of the fork and won't reset the unlogged table. Remember the
table is still unlogged since the transaction was aborted due to the crash.
So, it has to have an init fork to be reset on crash recovery.

Similarly while converting from logged to unlogged. The server crashes

in the 6th step after creating the INIT_FORK, during crash recovery the
init fork will be seen and a supposedly logged table will be trashed.

My intention for the 6th step is all recorded to wal, so if a crash occurs
the recovery process clean the mess.

AFAIU, PostgreSQL recovery is based on "redo"ing WAL. What you described
earlier, "undo"ing based on the WAL does not fit in the current framework.

The ideas in 1 and 2 might be better than having yet another init fork.

The link for idea 2 is missing...

Sorry for that. 2nd idea is what Robert described using control file.

1. /messages/by-id/533D457A.4030007@vmware.com

Unfortunately I completely missed this idea, but is also interesting. But
I'll need to "stamp" in both ways (from unlogged to logged and vice-versa)

During the recovery to check the status of a transaction can I use
src/backend/access/transam/clog.c:TransactionIdGetStatus ? I'm asking
because I don't know this part of code to much.

AFAIU, not unless all WALs (or atleast WALs till the commit/abort record of
the transaction in question) are replayed.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Fabrízio de Royes Mello

fabriziomello@gmail.com

over 10 years ago

In reply to: Ashutosh Bapat (#7)

Re: [HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

On Wed, Jul 8, 2015 at 3:20 AM, Ashutosh Bapat <
ashutosh.bapat@enterprisedb.com> wrote:

My intention for the 6th step is all recorded to wal, so if a crash

occurs the recovery process clean the mess.

AFAIU, PostgreSQL recovery is based on "redo"ing WAL. What you described

earlier, "undo"ing based on the WAL does not fit in the current framework.

Understood!

During the recovery to check the status of a transaction can I use

src/backend/access/transam/clog.c:TransactionIdGetStatus ? I'm asking
because I don't know this part of code to much.

AFAIU, not unless all WALs (or atleast WALs till the commit/abort record

of the transaction in question) are replayed.

So we'll need to execute this procedure after replay, then the
"ResetUnloggedTables" should be executed in two phases:

1) Before the replay checking just the datafiles with the init fork (_init)

2) After the replay checking the datafiles with the stamped init fork (
_init_XID, or something else)

Is this reasonable?

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello

Andres Freund

andres@anarazel.de

over 10 years ago

In reply to: Fabrízio de Royes Mello (#4)

Re: Re: [HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

On 2015-07-03 08:18:09 -0300, Fabrï¿½zio de Royes Mello wrote:

*** ALTER TABLE changes
...

Without going into the specifics of this change: Are we actually sure
this feature warrants the complexity it'll introduce? I'm really rather
doubtful.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

Fabrízio de Royes Mello

fabriziomello@gmail.com

over 10 years ago

In reply to: Andres Freund (#9)

Re: [HACKERS] Re: [HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

On Wed, Jul 8, 2015 at 10:53 AM, Andres Freund <andres@anarazel.de> wrote:

On 2015-07-03 08:18:09 -0300, Fabrízio de Royes Mello wrote:

*** ALTER TABLE changes
...

Without going into the specifics of this change: Are we actually sure
this feature warrants the complexity it'll introduce? I'm really rather
doubtful.

Think in an ETL job that can be use an unlogged table to improve the load
performance, but this job create a "large table" and to guarantee the data
consistency you need to transform it into a regular table, and with the
current implementation rewrite the entire heap, toast and indexes.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello

#11

Andres Freund

andres@anarazel.de

over 10 years ago

In reply to: Fabrízio de Royes Mello (#10)

Re: Re: [HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

On 2015-07-08 10:58:51 -0300, Fabrï¿½zio de Royes Mello wrote:

Think in an ETL job that can be use an unlogged table to improve the load
performance, but this job create a "large table" and to guarantee the data
consistency you need to transform it into a regular table, and with the
current implementation rewrite the entire heap, toast and indexes.

Don't buy that. The final target relation will usually already have
content. Also everything but wal_level=minimal will force you to WAL log
the contents anyway.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

Fabrízio de Royes Mello

fabriziomello@gmail.com

over 10 years ago

In reply to: Andres Freund (#11)

Re: [HACKERS] Re: [HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

On Wed, Jul 8, 2015 at 11:37 AM, Andres Freund <andres@anarazel.de> wrote:

On 2015-07-08 10:58:51 -0300, Fabrízio de Royes Mello wrote:

Think in an ETL job that can be use an unlogged table to improve the

load

performance, but this job create a "large table" and to guarantee the

data

consistency you need to transform it into a regular table, and with the
current implementation rewrite the entire heap, toast and indexes.

Don't buy that. The final target relation will usually already have
content. Also everything but wal_level=minimal will force you to WAL log
the contents anyway.

If the "wal_level=minimal" we don't need to force the wal log of the
contents. If the "wal_level != minimal" we need just to xlog all the pages,
but in both cases we don't need the extra job to create a new datafiles and
copy the contents between them. So we'll improve performance, or am I
missing something?

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello

#13

Andres Freund

andres@anarazel.de

over 10 years ago

In reply to: Fabrízio de Royes Mello (#12)

Re: Re: [HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

On 2015-07-09 10:39:35 -0300, Fabrï¿½zio de Royes Mello wrote:

If the "wal_level=minimal" we don't need to force the wal log of the
contents. If the "wal_level != minimal" we need just to xlog all the pages,
but in both cases we don't need the extra job to create a new datafiles and
copy the contents between them. So we'll improve performance, or am I
missing something?

Sure. It'll be a bit faster. I just don't see the peformance increase in
not that common situations being worth the price we'll pay in
development, code review, debugging and then maintaining some nontrivial
code. If this were likely to be a 15 line patch, I'd think differently.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14

Fabrízio de Royes Mello

fabriziomello@gmail.com

over 10 years ago

In reply to: Andres Freund (#13)

Re: [HACKERS] Re: [HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

On Thu, Jul 9, 2015 at 10:56 AM, Andres Freund <andres@anarazel.de> wrote:

On 2015-07-09 10:39:35 -0300, Fabrízio de Royes Mello wrote:

If the "wal_level=minimal" we don't need to force the wal log of the
contents. If the "wal_level != minimal" we need just to xlog all the

pages,

but in both cases we don't need the extra job to create a new datafiles

and

copy the contents between them. So we'll improve performance, or am I
missing something?

Sure. It'll be a bit faster. I just don't see the peformance increase in
not that common situations being worth the price we'll pay in
development, code review, debugging and then maintaining some nontrivial
code. If this were likely to be a 15 line patch, I'd think differently.

Well, this will not be a 15 line patch, but I'll try to simplify as much as
possible.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello

#15

Tom Lane

tgl@sss.pgh.pa.us

over 10 years ago

In reply to: Andres Freund (#13)

Re: Re: [HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

Andres Freund <andres@anarazel.de> writes:

On 2015-07-09 10:39:35 -0300, Fabr�zio de Royes Mello wrote:

If the "wal_level=minimal" we don't need to force the wal log of the
contents. If the "wal_level != minimal" we need just to xlog all the pages,
but in both cases we don't need the extra job to create a new datafiles and
copy the contents between them. So we'll improve performance, or am I
missing something?

Sure. It'll be a bit faster. I just don't see the peformance increase in
not that common situations being worth the price we'll pay in
development, code review, debugging and then maintaining some nontrivial
code. If this were likely to be a 15 line patch, I'd think differently.

I'm even more worried about the possible reliability problems (ie
introduction of bugs, which are likely to be of the data-eating variety)
that such changes are likely to incur. So I tend to agree with Andres
that this is probably not a good idea.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers