GSoC proposal - "make an unlogged table logged"

Started by Fabrízio de Royes Melloabout 12 years ago47 messageshackers

fabriziomello@gmail.com

about 12 years ago

Hi all,

Is the TODO item "make an unlogged table logged" [1]/messages/by-id/AANLkTinENZbRxdCwOHkqBba2BHUbfY8_C5JwRXLodxcX@mail.gmail.com a good GSoC project?

Regards,

[1]: /messages/by-id/AANLkTinENZbRxdCwOHkqBba2BHUbfY8_C5JwRXLodxcX@mail.gmail.com
/messages/by-id/AANLkTinENZbRxdCwOHkqBba2BHUbfY8_C5JwRXLodxcX@mail.gmail.com

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello

Robert Haas

robertmhaas@gmail.com

about 12 years ago

In reply to: Fabrízio de Royes Mello (#1)

Re: GSoC proposal - "make an unlogged table logged"

On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

Is the TODO item "make an unlogged table logged" [1] a good GSoC project?

I'm pretty sure we found some problems in that design that we couldn't
figure out how to solve. I don't have a pointer to the relevant
-hackers discussion off-hand, but I think there was one.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Stephen Frost

sfrost@snowman.net

about 12 years ago

In reply to: Robert Haas (#2)

Re: GSoC proposal - "make an unlogged table logged"

* Robert Haas (robertmhaas@gmail.com) wrote:

On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

Is the TODO item "make an unlogged table logged" [1] a good GSoC project?

I'm pretty sure we found some problems in that design that we couldn't
figure out how to solve. I don't have a pointer to the relevant
-hackers discussion off-hand, but I think there was one.

ISTR the discussion going something along the lines of "we'd have to WAL
log the entire table to do that, and if we have to do that, what's the
point?".

Thanks,

Stephen

Tom Lane

tgl@sss.pgh.pa.us

about 12 years ago

In reply to: Stephen Frost (#3)

Re: GSoC proposal - "make an unlogged table logged"

Stephen Frost <sfrost@snowman.net> writes:

* Robert Haas (robertmhaas@gmail.com) wrote:

On Mon, Mar 3, 2014 at 11:28 AM, Fabr�zio de Royes Mello
<fabriziomello@gmail.com> wrote:

Is the TODO item "make an unlogged table logged" [1] a good GSoC project?

I'm pretty sure we found some problems in that design that we couldn't
figure out how to solve. I don't have a pointer to the relevant
-hackers discussion off-hand, but I think there was one.

ISTR the discussion going something along the lines of "we'd have to WAL
log the entire table to do that, and if we have to do that, what's the
point?".

IIRC, the reason you'd have to do that is to make the table contents
appear on slave servers. If you don't consider replication then it might
seem easier.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Hannu Krosing

hannu@tm.ee

about 12 years ago

In reply to: Tom Lane (#4)

Re: GSoC proposal - "make an unlogged table logged"

On 03/03/2014 05:22 PM, Tom Lane wrote:

Stephen Frost <sfrost@snowman.net> writes:

...

ISTR the discussion going something along the lines of "we'd have to WAL
log the entire table to do that, and if we have to do that, what's the
point?".

IIRC, the reason you'd have to do that is to make the table contents
appear on slave servers. If you don't consider replication then it might
seem easier.

So switch on logging and then perform CLUSTER/VACUUM FULL ?

Should this work, or is something extra needed ?

Cheers

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic Oï¿½

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Stephen Frost (#3)

Re: GSoC proposal - "make an unlogged table logged"

On 2014-03-03 12:08:26 -0500, Stephen Frost wrote:

* Robert Haas (robertmhaas@gmail.com) wrote:

On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

Is the TODO item "make an unlogged table logged" [1] a good GSoC project?

I'm pretty sure we found some problems in that design that we couldn't
figure out how to solve. I don't have a pointer to the relevant
-hackers discussion off-hand, but I think there was one.

ISTR the discussion going something along the lines of "we'd have to WAL
log the entire table to do that, and if we have to do that, what's the
point?".

I don't see that as a particularly problematic problem. The primary
reason to want to convert a unlogged to a logged table probably is that
it's easier to do so than to recreate the table + dependencies. Also the
overhead of logging full pages will be noticeably smaller than the
overhead of adding all rows individually, even if using
heap_multi_insert().

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Peter Geoghegan

pg@bowt.ie

about 12 years ago

In reply to: Fabrízio de Royes Mello (#1)

Re: GSoC proposal - "make an unlogged table logged"

On Mon, Mar 3, 2014 at 8:28 AM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

Is the TODO item "make an unlogged table logged" [1] a good GSoC project?

Another interesting project around unlogged tables would be to make it
possible to have unlogged indexes on fully-logged tables. That is
something that there was some discussion of before, that might be
easier.

FWIW, I don't think that TODO page is a very good resource for finding
a starter project. Picking a good project is a skill in and of itself.
A lot of that stuff is aspirational, either because it's difficult,
or, more often, because it's difficult relative to the rewards, which
can be quite low. To be honest, if I have what I imagine to be a great
idea for a project, I don't put it on that page. Maybe I should, but I
don't, and I don't think that is uncommon. This is not because I'm
particularly guarded about sharing the information.

Why do you think that hash indexes still aren't WAL-logged after all
these years (a project that someone made noise about recently in
relation to GSoC), even though that's generally considered to be a
SMOP?

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Peter Geoghegan (#7)

Re: GSoC proposal - "make an unlogged table logged"

On 2014-03-03 12:44:26 -0800, Peter Geoghegan wrote:

On Mon, Mar 3, 2014 at 8:28 AM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

Is the TODO item "make an unlogged table logged" [1] a good GSoC project?

Another interesting project around unlogged tables would be to make it
possible to have unlogged indexes on fully-logged tables. That is
something that there was some discussion of before, that might be
easier.

I'd actually say it's harder because it requires modifying the catalog
or transparently introducing hacks similar to what unlogged matviews are
doing, to make sure the index is marked invalid after a crash restart.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Fabrízio de Royes Mello

fabriziomello@gmail.com

about 12 years ago

In reply to: Hannu Krosing (#5)

Re: GSoC proposal - "make an unlogged table logged"

On Mon, Mar 3, 2014 at 2:40 PM, Hannu Krosing <hannu@2ndquadrant.com> wrote:

On 03/03/2014 05:22 PM, Tom Lane wrote:

Stephen Frost <sfrost@snowman.net> writes:

...

ISTR the discussion going something along the lines of "we'd have to

WAL

log the entire table to do that, and if we have to do that, what's the
point?".

IIRC, the reason you'd have to do that is to make the table contents
appear on slave servers. If you don't consider replication then it

might

seem easier.

So switch on logging and then perform CLUSTER/VACUUM FULL ?

Should this work, or is something extra needed ?

Today I do something like that:

1) create unlogged table tmp_foo ...
2) populate 'tmp_foo' table (ETL scripts or whatever)
3) start transaction
4) lock table tmp_foo in access exclusive mode
5) update pg_class set relpersistence = 'p' where oid = 'tmp_foo':regclass
6) drop table foo; -- the old foo table
7) alter table tmp_foo rename to foo;
8) end transaction
9) run pg_repack in table 'foo'

I know it's very ugly, but works... and works for standbys too... :-)

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello

#10

Fabrízio de Royes Mello

fabriziomello@gmail.com

about 12 years ago

In reply to: Peter Geoghegan (#7)

Re: GSoC proposal - "make an unlogged table logged"

On Mon, Mar 3, 2014 at 5:44 PM, Peter Geoghegan <pg@heroku.com> wrote:

On Mon, Mar 3, 2014 at 8:28 AM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

Is the TODO item "make an unlogged table logged" [1] a good GSoC

project?

Another interesting project around unlogged tables would be to make it
possible to have unlogged indexes on fully-logged tables. That is
something that there was some discussion of before, that might be
easier.

More easy than "make an unlogged table logged" ?

FWIW, I don't think that TODO page is a very good resource for finding
a starter project. Picking a good project is a skill in and of itself.
A lot of that stuff is aspirational, either because it's difficult,
or, more often, because it's difficult relative to the rewards, which
can be quite low. To be honest, if I have what I imagine to be a great
idea for a project, I don't put it on that page. Maybe I should, but I
don't, and I don't think that is uncommon. This is not because I'm
particularly guarded about sharing the information.

Share your ideas, please!

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello

#11

Fabrízio de Royes Mello

fabriziomello@gmail.com

about 12 years ago

In reply to: Andres Freund (#8)

Re: GSoC proposal - "make an unlogged table logged"

On Mon, Mar 3, 2014 at 5:47 PM, Andres Freund <andres@2ndquadrant.com>
wrote:

On 2014-03-03 12:44:26 -0800, Peter Geoghegan wrote:

On Mon, Mar 3, 2014 at 8:28 AM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

Is the TODO item "make an unlogged table logged" [1] a good GSoC

project?

Another interesting project around unlogged tables would be to make it
possible to have unlogged indexes on fully-logged tables. That is
something that there was some discussion of before, that might be
easier.

I'd actually say it's harder because it requires modifying the catalog
or transparently introducing hacks similar to what unlogged matviews are
doing, to make sure the index is marked invalid after a crash restart.

Unlogged matviews was disallowed [1]/messages/by-id/E1UZNrD-0008Eh-83@gemulon.postgresql.org.

[1]: /messages/by-id/E1UZNrD-0008Eh-83@gemulon.postgresql.org
/messages/by-id/E1UZNrD-0008Eh-83@gemulon.postgresql.org

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello

#12

Fabrízio de Royes Mello

fabriziomello@gmail.com

about 12 years ago

In reply to: Andres Freund (#6)

Re: GSoC proposal - "make an unlogged table logged"

On Mon, Mar 3, 2014 at 2:42 PM, Andres Freund <andres@2ndquadrant.com>
wrote:

On 2014-03-03 12:08:26 -0500, Stephen Frost wrote:

* Robert Haas (robertmhaas@gmail.com) wrote:

On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

Is the TODO item "make an unlogged table logged" [1] a good GSoC

project?

I'm pretty sure we found some problems in that design that we couldn't
figure out how to solve. I don't have a pointer to the relevant
-hackers discussion off-hand, but I think there was one.

ISTR the discussion going something along the lines of "we'd have to WAL
log the entire table to do that, and if we have to do that, what's the
point?".

I don't see that as a particularly problematic problem. The primary
reason to want to convert a unlogged to a logged table probably is that
it's easier to do so than to recreate the table + dependencies. Also the
overhead of logging full pages will be noticeably smaller than the
overhead of adding all rows individually, even if using
heap_multi_insert().

Do you know some similar in the source code?

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello

#13

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Fabrízio de Royes Mello (#9)

Re: GSoC proposal - "make an unlogged table logged"

On 2014-03-04 01:10:50 -0300, Fabrï¿½zio de Royes Mello wrote:

Today I do something like that:

1) create unlogged table tmp_foo ...
2) populate 'tmp_foo' table (ETL scripts or whatever)
3) start transaction
4) lock table tmp_foo in access exclusive mode
5) update pg_class set relpersistence = 'p' where oid = 'tmp_foo':regclass
6) drop table foo; -- the old foo table
7) alter table tmp_foo rename to foo;
8) end transaction
9) run pg_repack in table 'foo'

I know it's very ugly, but works... and works for standbys too... :-)

No, it doesn't work. It just may happen to not fail loudly/visibly in
some cases. You're absolutely risking corruption of this *and* other
relations when doing so.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14

Robert Haas

robertmhaas@gmail.com

about 12 years ago

In reply to: Stephen Frost (#3)

Re: GSoC proposal - "make an unlogged table logged"

On Mon, Mar 3, 2014 at 12:08 PM, Stephen Frost <sfrost@snowman.net> wrote:

* Robert Haas (robertmhaas@gmail.com) wrote:

On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

Is the TODO item "make an unlogged table logged" [1] a good GSoC project?

I'm pretty sure we found some problems in that design that we couldn't
figure out how to solve. I don't have a pointer to the relevant
-hackers discussion off-hand, but I think there was one.

ISTR the discussion going something along the lines of "we'd have to WAL
log the entire table to do that, and if we have to do that, what's the
point?".

No, not really. The issue is more around what happens if we crash
part way through. At crash recovery time, the system catalogs are not
available, because the database isn't consistent yet and, anyway, the
startup process can't be bound to a database, let alone every database
that might contain unlogged tables. So the sentinel that's used to
decide whether to flush the contents of a table or index is the
presence or absence of an _init fork, which the startup process
obviously can see just fine. The _init fork also tells us what to
stick in the relation when we reset it; for a table, we can just reset
to an empty file, but that's not legal for indexes, so the _init fork
contains a pre-initialized empty index that we can just copy over.

Now, to make an unlogged table logged, you've got to at some stage
remove those _init forks. But this is not a transactional operation.
If you remove the _init forks and then the transaction rolls back,
you've left the system an inconsistent state. If you postpone the
removal until commit time, then you have a problem if it fails,
particularly if it works for the first file but fails for the second.
And if you crash at any point before you've fsync'd the containing
directory, you have no idea which files will still be on disk after a
hard reboot.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Robert Haas (#14)

Re: GSoC proposal - "make an unlogged table logged"

On 2014-03-04 09:47:08 -0500, Robert Haas wrote:

On Mon, Mar 3, 2014 at 12:08 PM, Stephen Frost <sfrost@snowman.net> wrote:

* Robert Haas (robertmhaas@gmail.com) wrote:

On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

Is the TODO item "make an unlogged table logged" [1] a good GSoC project?

I'm pretty sure we found some problems in that design that we couldn't
figure out how to solve. I don't have a pointer to the relevant
-hackers discussion off-hand, but I think there was one.

ISTR the discussion going something along the lines of "we'd have to WAL
log the entire table to do that, and if we have to do that, what's the
point?".

No, not really. The issue is more around what happens if we crash
part way through. At crash recovery time, the system catalogs are not
available, because the database isn't consistent yet and, anyway, the
startup process can't be bound to a database, let alone every database
that might contain unlogged tables. So the sentinel that's used to
decide whether to flush the contents of a table or index is the
presence or absence of an _init fork, which the startup process
obviously can see just fine. The _init fork also tells us what to
stick in the relation when we reset it; for a table, we can just reset
to an empty file, but that's not legal for indexes, so the _init fork
contains a pre-initialized empty index that we can just copy over.

Now, to make an unlogged table logged, you've got to at some stage
remove those _init forks. But this is not a transactional operation.
If you remove the _init forks and then the transaction rolls back,
you've left the system an inconsistent state. If you postpone the
removal until commit time, then you have a problem if it fails,
particularly if it works for the first file but fails for the second.
And if you crash at any point before you've fsync'd the containing
directory, you have no idea which files will still be on disk after a
hard reboot.

Can't that be solved by just creating the permanent relation in a new
relfilenode? That's equivalent to a rewrite, yes, but we need to do that
for anything but wal_level=minimal anyway.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16

Fabrízio de Royes Mello

fabriziomello@gmail.com

about 12 years ago

In reply to: Andres Freund (#13)

Re: GSoC proposal - "make an unlogged table logged"

On Tue, Mar 4, 2014 at 3:31 AM, Andres Freund <andres@2ndquadrant.com>
wrote:

On 2014-03-04 01:10:50 -0300, Fabrízio de Royes Mello wrote:

Today I do something like that:

1) create unlogged table tmp_foo ...
2) populate 'tmp_foo' table (ETL scripts or whatever)
3) start transaction
4) lock table tmp_foo in access exclusive mode
5) update pg_class set relpersistence = 'p' where oid =

'tmp_foo':regclass

6) drop table foo; -- the old foo table
7) alter table tmp_foo rename to foo;
8) end transaction
9) run pg_repack in table 'foo'

I know it's very ugly, but works... and works for standbys too... :-)

No, it doesn't work. It just may happen to not fail loudly/visibly in
some cases. You're absolutely risking corruption of this *and* other
relations when doing so.

Well this already works for some time, but you are correct, exists the risk
of corruption!

But in my case if all run without any interrupt the relation is switched to
logged. I do some checks before and after, and if something happens with
this process we cleanup everything and start from the beginning.

Maybe I must run CLUSTER inside the transaction block after update pg_class
and execute DROP and RENAME after, in a second phase. Maybe this way is
more secure. Is it?

If some crash occurs and PostgreSQL restart I check if the unlogged table
'tmp_foo' exists and then I drop it.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello

#17

Fabrízio de Royes Mello

fabriziomello@gmail.com

about 12 years ago

In reply to: Andres Freund (#15)

Re: GSoC proposal - "make an unlogged table logged"

On Tue, Mar 4, 2014 at 11:50 AM, Andres Freund <andres@2ndquadrant.com>
wrote:

On 2014-03-04 09:47:08 -0500, Robert Haas wrote:

On Mon, Mar 3, 2014 at 12:08 PM, Stephen Frost <sfrost@snowman.net>

wrote:

* Robert Haas (robertmhaas@gmail.com) wrote:

On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

Is the TODO item "make an unlogged table logged" [1] a good GSoC

project?

I'm pretty sure we found some problems in that design that we

couldn't

figure out how to solve. I don't have a pointer to the relevant
-hackers discussion off-hand, but I think there was one.

ISTR the discussion going something along the lines of "we'd have to

WAL

log the entire table to do that, and if we have to do that, what's the
point?".

No, not really. The issue is more around what happens if we crash
part way through. At crash recovery time, the system catalogs are not
available, because the database isn't consistent yet and, anyway, the
startup process can't be bound to a database, let alone every database
that might contain unlogged tables. So the sentinel that's used to
decide whether to flush the contents of a table or index is the
presence or absence of an _init fork, which the startup process
obviously can see just fine. The _init fork also tells us what to
stick in the relation when we reset it; for a table, we can just reset
to an empty file, but that's not legal for indexes, so the _init fork
contains a pre-initialized empty index that we can just copy over.

Now, to make an unlogged table logged, you've got to at some stage
remove those _init forks. But this is not a transactional operation.
If you remove the _init forks and then the transaction rolls back,
you've left the system an inconsistent state. If you postpone the
removal until commit time, then you have a problem if it fails,
particularly if it works for the first file but fails for the second.
And if you crash at any point before you've fsync'd the containing
directory, you have no idea which files will still be on disk after a
hard reboot.

Can't that be solved by just creating the permanent relation in a new
relfilenode? That's equivalent to a rewrite, yes, but we need to do that
for anything but wal_level=minimal anyway.

Did you see this initial patch [1]/messages/by-id/263033.9223.qm@web29013.mail.ird.yahoo.com from Leonardo Francalanci ?

Regards,

[1]: /messages/by-id/263033.9223.qm@web29013.mail.ird.yahoo.com
/messages/by-id/263033.9223.qm@web29013.mail.ird.yahoo.com

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello

#18

Robert Haas

robertmhaas@gmail.com

about 12 years ago

In reply to: Andres Freund (#15)

Re: GSoC proposal - "make an unlogged table logged"

On Tue, Mar 4, 2014 at 9:50 AM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2014-03-04 09:47:08 -0500, Robert Haas wrote:

On Mon, Mar 3, 2014 at 12:08 PM, Stephen Frost <sfrost@snowman.net> wrote:

* Robert Haas (robertmhaas@gmail.com) wrote:

On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:

Is the TODO item "make an unlogged table logged" [1] a good GSoC project?

I'm pretty sure we found some problems in that design that we couldn't
figure out how to solve. I don't have a pointer to the relevant
-hackers discussion off-hand, but I think there was one.

ISTR the discussion going something along the lines of "we'd have to WAL
log the entire table to do that, and if we have to do that, what's the
point?".

No, not really. The issue is more around what happens if we crash
part way through. At crash recovery time, the system catalogs are not
available, because the database isn't consistent yet and, anyway, the
startup process can't be bound to a database, let alone every database
that might contain unlogged tables. So the sentinel that's used to
decide whether to flush the contents of a table or index is the
presence or absence of an _init fork, which the startup process
obviously can see just fine. The _init fork also tells us what to
stick in the relation when we reset it; for a table, we can just reset
to an empty file, but that's not legal for indexes, so the _init fork
contains a pre-initialized empty index that we can just copy over.

Now, to make an unlogged table logged, you've got to at some stage
remove those _init forks. But this is not a transactional operation.
If you remove the _init forks and then the transaction rolls back,
you've left the system an inconsistent state. If you postpone the
removal until commit time, then you have a problem if it fails,
particularly if it works for the first file but fails for the second.
And if you crash at any point before you've fsync'd the containing
directory, you have no idea which files will still be on disk after a
hard reboot.

Can't that be solved by just creating the permanent relation in a new
relfilenode? That's equivalent to a rewrite, yes, but we need to do that
for anything but wal_level=minimal anyway.

Yes, that would work. I've tended to view optimizing away the
relfilenode copy as an indispensable part of this work, but that might
be wrongheaded. It would certainly be a lot easier to make this
happen if we didn't insist on that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Robert Haas (#18)

Re: GSoC proposal - "make an unlogged table logged"

On 2014-03-04 12:54:02 -0500, Robert Haas wrote:

On Tue, Mar 4, 2014 at 9:50 AM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2014-03-04 09:47:08 -0500, Robert Haas wrote:
Can't that be solved by just creating the permanent relation in a new
relfilenode? That's equivalent to a rewrite, yes, but we need to do that
for anything but wal_level=minimal anyway.

Yes, that would work. I've tended to view optimizing away the
relfilenode copy as an indispensable part of this work, but that might
be wrongheaded. It would certainly be a lot easier to make this
happen if we didn't insist on that.

I think it'd already much better than today's situation, and it's a
required codepath for wal_level > logical anyway. So even if somebody
wants to make this work without the full copy for minimal, it'd still be
a required codepath. So I am perfectly ok with a patch just adding that.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20

Fabrízio de Royes Mello

fabriziomello@gmail.com

about 12 years ago

In reply to: Andres Freund (#19)

Re: GSoC proposal - "make an unlogged table logged"

On Tue, Mar 4, 2014 at 3:29 PM, Andres Freund <andres@2ndquadrant.com>
wrote:

On 2014-03-04 12:54:02 -0500, Robert Haas wrote:

On Tue, Mar 4, 2014 at 9:50 AM, Andres Freund <andres@2ndquadrant.com>

wrote:

On 2014-03-04 09:47:08 -0500, Robert Haas wrote:
Can't that be solved by just creating the permanent relation in a new
relfilenode? That's equivalent to a rewrite, yes, but we need to do

that

for anything but wal_level=minimal anyway.

Yes, that would work. I've tended to view optimizing away the
relfilenode copy as an indispensable part of this work, but that might
be wrongheaded. It would certainly be a lot easier to make this
happen if we didn't insist on that.

I think it'd already much better than today's situation, and it's a
required codepath for wal_level > logical anyway. So even if somebody
wants to make this work without the full copy for minimal, it'd still be
a required codepath. So I am perfectly ok with a patch just adding that.

Then is this a good idea for a GSoC project ?

I don't know very well this internals, but I am willing to learn and I
think the GSoC is a good opportunity.

Any of you are willing to mentoring this project?

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Show quoted text

Timbira: http://www.timbira.com.br
Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello

#21

Fabrízio de Royes Mello

fabriziomello@gmail.com

about 12 years ago

In reply to: Fabrízio de Royes Mello (#20)

#22

Robert Haas

robertmhaas@gmail.com

about 12 years ago

In reply to: Fabrízio de Royes Mello (#21)

#23

Thom Brown

thom@linux.com

about 12 years ago

In reply to: Robert Haas (#22)

#24

Fabrízio de Royes Mello

fabriziomello@gmail.com

about 12 years ago

In reply to: Robert Haas (#22)

#25

Robert Haas

robertmhaas@gmail.com

about 12 years ago

In reply to: Fabrízio de Royes Mello (#24)

#26

Fabrízio de Royes Mello

fabriziomello@gmail.com

about 12 years ago

In reply to: Robert Haas (#25)

#27

Fabrízio de Royes Mello

fabriziomello@gmail.com

about 12 years ago

In reply to: Fabrízio de Royes Mello (#26)

#28

Fabrízio de Royes Mello

fabriziomello@gmail.com

about 12 years ago

In reply to: Robert Haas (#25)

#29

Tom Lane

tgl@sss.pgh.pa.us

about 12 years ago

In reply to: Fabrízio de Royes Mello (#27)

#30

Fabrízio de Royes Mello

fabriziomello@gmail.com

about 12 years ago

In reply to: Tom Lane (#29)

#31

Fabrízio de Royes Mello

fabriziomello@gmail.com

about 12 years ago

In reply to: Tom Lane (#29)

#32

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Fabrízio de Royes Mello (#31)

#33

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

about 12 years ago

In reply to: Tom Lane (#29)

#34

Jim Nasby

Jim.Nasby@BlueTreble.com

about 12 years ago

In reply to: Andres Freund (#15)

#35

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Jim Nasby (#34)

#36

Fabrízio de Royes Mello

fabriziomello@gmail.com

about 12 years ago

In reply to: Andres Freund (#32)

#37

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

about 12 years ago

In reply to: Andres Freund (#35)

#38

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Heikki Linnakangas (#37)

#39

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

about 12 years ago

In reply to: Andres Freund (#38)

#40

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Heikki Linnakangas (#33)

#41

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

about 12 years ago

In reply to: Heikki Linnakangas (#33)

#42

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Heikki Linnakangas (#41)

#43

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Heikki Linnakangas (#37)

#44

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

about 12 years ago

In reply to: Andres Freund (#43)

#45

Andres Freund

andres@anarazel.de

about 12 years ago

In reply to: Heikki Linnakangas (#44)

#46

Tom Lane

tgl@sss.pgh.pa.us

about 12 years ago

In reply to: Heikki Linnakangas (#41)

#47

Robert Haas

robertmhaas@gmail.com

about 12 years ago

In reply to: Heikki Linnakangas (#41)