Replication slots and footguns

Started by Josh Berkus almost 12 years ago, 22 messages
#1 Josh Berkus
josh@agliodbs.com

All:

I was just reading Michael's explanation of replication slots
(http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
and realized there was something which had completely escaped me in the
pre-commit discussion:

select pg_drop_replication_slot('slot_1');
ERROR: 55006: replication slot "slot_1" is already active
LOCATION: ReplicationSlotAcquire, slot.c:339

What defines an "active" slot?

It seems like there's no way for a DBA to drop slots from the master if
it's rapidly running out of disk WAL space without doing a restart, and
there's no way to drop the slot for a replica which the DBA knows is
permanently offline but was connected earlier. Am I missing something?

If I'm not, that seems like something to fix before 9.4 release.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2 Andres Freund
andres@2ndquadrant.com
In reply to: Josh Berkus (#1)
Re: Replication slots and footguns

Hi,

On 2014-03-12 12:00:25 -0700, Josh Berkus wrote:

I was just reading Michael's explanation of replication slots
(http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
and realized there was something which had completely escaped me in the
pre-commit discussion:

select pg_drop_replication_slot('slot_1');
ERROR: 55006: replication slot "slot_1" is already active
LOCATION: ReplicationSlotAcquire, slot.c:339

What defines an "active" slot?

One with a connected walsender.

It seems like there's no way for a DBA to drop slots from the master if
it's rapidly running out of disk WAL space without doing a restart, and
there's no way to drop the slot for a replica which the DBA knows is
permanently offline but was connected earlier. Am I missing something?

It's sufficient to terminate the walsender and then drop the slot. That
seems ok for now?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#3 Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#2)
Re: Replication slots and footguns

On Wed, Mar 12, 2014 at 3:03 PM, Andres Freund <andres@2ndquadrant.com> wrote:

Hi,

On 2014-03-12 12:00:25 -0700, Josh Berkus wrote:

I was just reading Michael's explanation of replication slots
(http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
and realized there was something which had completely escaped me in the
pre-commit discussion:

select pg_drop_replication_slot('slot_1');
ERROR: 55006: replication slot "slot_1" is already active
LOCATION: ReplicationSlotAcquire, slot.c:339

What defines an "active" slot?

One with a connected walsender.

It seems like there's no way for a DBA to drop slots from the master if
it's rapidly running out of disk WAL space without doing a restart, and
there's no way to drop the slot for a replica which the DBA knows is
permanently offline but was connected earlier. Am I missing something?

It's sufficient to terminate the walsender and then drop the slot. That
seems ok for now?

Urgh. That error message looks susceptible to improvement. How about:

replication slot "%s" cannot be dropped because it is currently in use

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#4 Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: Replication slots and footguns

On 03/12/2014 12:03 PM, Andres Freund wrote:

Hi,

On 2014-03-12 12:00:25 -0700, Josh Berkus wrote:

I was just reading Michael's explanation of replication slots
(http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
and realized there was something which had completely escaped me in the
pre-commit discussion:

select pg_drop_replication_slot('slot_1');
ERROR: 55006: replication slot "slot_1" is already active
LOCATION: ReplicationSlotAcquire, slot.c:339

What defines an "active" slot?

One with a connected walsender.

In a world of network proxies, a walsender could be "connected" for
hours after the replica has ceased to exist. Fortunately,
wal_sender_timeout is changeable on a reload. We check for actual
standby feedback for the timeout, yes?

It seems like there's no way for a DBA to drop slots from the master if
it's rapidly running out of disk WAL space without doing a restart, and
there's no way to drop the slot for a replica which the DBA knows is
permanently offline but was connected earlier. Am I missing something?

It's sufficient to terminate the walsender and then drop the slot. That
seems ok for now?

We have no safe way to terminate the walsender that I know of;
pg_terminate_backend() doesn't include walsenders last I checked.

So the procedure for this would be:

1) set wal_sender_timeout to some low value (1);
2) reload
3) call pg_drop_replication_slot('slotname')

Clumsy, but it will do for a first pass; we can make it better (for
example, by adding a "force" boolean to pg_drop_replication_slot) in 9.5.
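
For readers following along, the three steps above might look like the sketch below. It assumes a superuser session on the master, a server where ALTER SYSTEM is available, and a timeout of one second (the message does not give a unit, so '1s' is an assumption); 'slot_1' is the slot name from the first message.

```sql
-- 1) Lower the walsender timeout (the value '1s' is an assumption).
ALTER SYSTEM SET wal_sender_timeout = '1s';

-- 2) Reload the configuration; wal_sender_timeout does not need a restart.
SELECT pg_reload_conf();

-- 3) Once the stale walsender has timed out, the slot is no longer
--    active and can be dropped.
SELECT pg_drop_replication_slot('slot_1');

-- Afterwards, put the timeout back to its previous setting.
ALTER SYSTEM RESET wal_sender_timeout;
SELECT pg_reload_conf();
```

The setting applies to every walsender, so a very low value may also disconnect healthy but slow standbys while it is in effect, and the drop in step 3 will keep failing until the stale walsender has actually exited.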

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#5 Andres Freund
andres@2ndquadrant.com
In reply to: Robert Haas (#3)
Re: Replication slots and footguns

On 2014-03-12 15:18:04 -0400, Robert Haas wrote:

On Wed, Mar 12, 2014 at 3:03 PM, Andres Freund <andres@2ndquadrant.com> wrote:

Hi,

On 2014-03-12 12:00:25 -0700, Josh Berkus wrote:

I was just reading Michael's explanation of replication slots
(http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
and realized there was something which had completely escaped me in the
pre-commit discussion:

select pg_drop_replication_slot('slot_1');
ERROR: 55006: replication slot "slot_1" is already active
LOCATION: ReplicationSlotAcquire, slot.c:339

What defines an "active" slot?

One with a connected walsender.

It seems like there's no way for a DBA to drop slots from the master if
it's rapidly running out of disk WAL space without doing a restart, and
there's no way to drop the slot for a replica which the DBA knows is
permanently offline but was connected earlier. Am I missing something?

It's sufficient to terminate the walsender and then drop the slot. That
seems ok for now?

Urgh. That error message looks susceptible to improvement. How about:

replication slot "%s" cannot be dropped because it is currently in use

I think that'd require duplicating some code between acquire and drop,
but how about "replication slot "%s" is in use by another backend"?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#6 Andres Freund
andres@2ndquadrant.com
In reply to: Josh Berkus (#4)
Re: Replication slots and footguns

On 2014-03-12 12:23:01 -0700, Josh Berkus wrote:

On 03/12/2014 12:03 PM, Andres Freund wrote:

Hi,

On 2014-03-12 12:00:25 -0700, Josh Berkus wrote:

I was just reading Michael's explanation of replication slots
(http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
and realized there was something which had completely escaped me in the
pre-commit discussion:

select pg_drop_replication_slot('slot_1');
ERROR: 55006: replication slot "slot_1" is already active
LOCATION: ReplicationSlotAcquire, slot.c:339

What defines an "active" slot?

One with a connected walsender.

In a world of network proxies, a walsender could be "connected" for
hours after the replica has ceased to exist. Fortunately,
wal_sender_timeout is changeable on a reload. We check for actual
standby feedback for the timeout, yes?

Yep.

It seems like there's no way for a DBA to drop slots from the master if
it's rapidly running out of disk WAL space without doing a restart, and
there's no way to drop the slot for a replica which the DBA knows is
permanently offline but was connected earlier. Am I missing something?

It's sufficient to terminate the walsender and then drop the slot. That
seems ok for now?

We have no safe way to terminate the walsender that I know of;
pg_terminate_backend() doesn't include walsenders last I checked.

SELECT pg_terminate_backend(pid) FROM pg_stat_replication;

works.
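
Put together with the earlier advice to terminate the walsender and then drop the slot, the sequence might look like this sketch. The client address '192.0.2.10' is a hypothetical stand-in for the lost replica, and 'slot_1' is the slot name from the first message; the statement above terminates every walsender, so the filter here is only an illustration of narrowing it to one.

```sql
-- Each row in pg_stat_replication is one connected walsender.
SELECT pid, application_name, client_addr, state
FROM pg_stat_replication;

-- Terminate only the walsender that served the dead replica
-- ('192.0.2.10' is a hypothetical address).
SELECT pg_terminate_backend(pid)
FROM pg_stat_replication
WHERE client_addr = '192.0.2.10';

-- With no walsender attached, the slot is no longer active.
SELECT pg_drop_replication_slot('slot_1');
```

If the replica is in fact still alive it will likely reconnect and reacquire the slot, so the drop should follow the termination promptly.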

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#7 Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#5)
Re: Replication slots and footguns

On Wed, Mar 12, 2014 at 3:25 PM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2014-03-12 15:18:04 -0400, Robert Haas wrote:

On Wed, Mar 12, 2014 at 3:03 PM, Andres Freund <andres@2ndquadrant.com> wrote:

Hi,

On 2014-03-12 12:00:25 -0700, Josh Berkus wrote:

I was just reading Michael's explanation of replication slots
(http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
and realized there was something which had completely escaped me in the
pre-commit discussion:

select pg_drop_replication_slot('slot_1');
ERROR: 55006: replication slot "slot_1" is already active
LOCATION: ReplicationSlotAcquire, slot.c:339

What defines an "active" slot?

One with a connected walsender.

It seems like there's no way for a DBA to drop slots from the master if
it's rapidly running out of disk WAL space without doing a restart, and
there's no way to drop the slot for a replica which the DBA knows is
permanently offline but was connected earlier. Am I missing something?

It's sufficient to terminate the walsender and then drop the slot. That
seems ok for now?

Urgh. That error message looks susceptible to improvement. How about:

replication slot "%s" cannot be dropped because it is currently in use

I think that'd require duplicating some code between acquire and drop,
but how about "replication slot "%s" is in use by another backend"?

Sold.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8 Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: Replication slots and footguns

On 03/12/2014 12:26 PM, Andres Freund wrote:

On 2014-03-12 12:23:01 -0700, Josh Berkus wrote:

On 03/12/2014 12:03 PM, Andres Freund wrote:

It seems like there's no way for a DBA to drop slots from the master if
it's rapidly running out of disk WAL space without doing a restart, and
there's no way to drop the slot for a replica which the DBA knows is
permanently offline but was connected earlier. Am I missing something?

It's sufficient to terminate the walsender and then drop the slot. That
seems ok for now?

We have no safe way to terminate the walsender that I know of;
pg_terminate_backend() doesn't include walsenders last I checked.

SELECT pg_terminate_backend(pid) FROM pg_stat_replication;

Aha! Ok, I'll work on some documentation.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#9 Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: Replication slots and footguns

On 03/12/2014 12:34 PM, Robert Haas wrote:

Urgh. That error message looks susceptible to improvement. How about:

replication slot "%s" cannot be dropped because it is currently in use

I think that'd require duplicating some code between acquire and drop,
but how about "replication slot "%s" is in use by another backend"?

Sold.

Wait ... before you go further ... I object to dropping the word
"active" from the error message. The column is called "active", and
that's where a DBA should look; that word needs to stay in the error
message.
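
For reference, a sketch of the lookup being described. It assumes the view as it appears in released versions, where it is named pg_replication_slots and has a boolean "active" column; 'slot_1' is the slot name from the first message.

```sql
-- "active" is true while some backend (typically a walsender)
-- currently holds the slot.
SELECT slot_name, slot_type, active, restart_lsn
FROM pg_replication_slots
WHERE slot_name = 'slot_1';
```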

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#10 Thom Brown
thom@linux.com
In reply to: Josh Berkus (#1)
Re: Replication slots and footguns

On 12 March 2014 19:00, Josh Berkus <josh@agliodbs.com> wrote:

All:

I was just reading Michael's explanation of replication slots
(http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
and realized there was something which had completely escaped me in the
pre-commit discussion:

select pg_drop_replication_slot('slot_1');
ERROR: 55006: replication slot "slot_1" is already active
LOCATION: ReplicationSlotAcquire, slot.c:339

What defines an "active" slot?

It seems like there's no way for a DBA to drop slots from the master if
it's rapidly running out of disk WAL space without doing a restart, and
there's no way to drop the slot for a replica which the DBA knows is
permanently offline but was connected earlier. Am I missing something?

I'm not clear on why dropping an active replication slot would
solve disk space problems related to WAL. I thought it was inactive
slots that were the problem in this regard?

--
Thom

#11 Michael Paquier
michael.paquier@gmail.com
In reply to: Thom Brown (#10)
Re: Replication slots and footguns

On Thu, Mar 13, 2014 at 5:45 AM, Thom Brown <thom@linux.com> wrote:

On 12 March 2014 19:00, Josh Berkus <josh@agliodbs.com> wrote:

All:

I was just reading Michael's explanation of replication slots
(http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
and realized there was something which had completely escaped me in the
pre-commit discussion:

select pg_drop_replication_slot('slot_1');
ERROR: 55006: replication slot "slot_1" is already active
LOCATION: ReplicationSlotAcquire, slot.c:339

What defines an "active" slot?

It seems like there's no way for a DBA to drop slots from the master if
it's rapidly running out of disk WAL space without doing a restart, and
there's no way to drop the slot for a replica which the DBA knows is
permanently offline but was connected earlier. Am I missing something?

I'm not clear on why dropping an active replication slot would
solve disk space problems related to WAL. I thought it was inactive
slots that were the problem in this regard?

You could still have an active slot with a standby that is not able to
catch up AFAIK.
--
Michael

#12 Thom Brown
thom@linux.com
In reply to: Michael Paquier (#11)
Re: Replication slots and footguns

On 12 March 2014 23:17, Michael Paquier <michael.paquier@gmail.com> wrote:

On Thu, Mar 13, 2014 at 5:45 AM, Thom Brown <thom@linux.com> wrote:

On 12 March 2014 19:00, Josh Berkus <josh@agliodbs.com> wrote:

All:

I was just reading Michael's explanation of replication slots
(http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
and realized there was something which had completely escaped me in the
pre-commit discussion:

select pg_drop_replication_slot('slot_1');
ERROR: 55006: replication slot "slot_1" is already active
LOCATION: ReplicationSlotAcquire, slot.c:339

What defines an "active" slot?

It seems like there's no way for a DBA to drop slots from the master if
it's rapidly running out of disk WAL space without doing a restart, and
there's no way to drop the slot for a replica which the DBA knows is
permanently offline but was connected earlier. Am I missing something?

I'm not clear on why dropping an active replication slot would
solve disk space problems related to WAL. I thought it was inactive
slots that were the problem in this regard?

You could still have an active slot with a standby that is not able to
catch up AFAIK.

In that scenario, why would one wish to drop the replication slot? If
it can't keep up, dropping the replication slot would likely mean
you'd orphan the standby due to the primary no longer holding on to
the necessary WAL, and the standby is then useless. In which case, if
the standby is causing such problems, why not shut down that standby,
thereby effectively decommissioning it, then delete the slot?

--
Thom

#13 Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: Replication slots and footguns

On 03/12/2014 04:52 PM, Thom Brown wrote:

On 12 March 2014 23:17, Michael Paquier <michael.paquier@gmail.com> wrote:

On Thu, Mar 13, 2014 at 5:45 AM, Thom Brown <thom@linux.com> wrote:

I'm not clear on why dropping an active replication slot would
solve disk space problems related to WAL. I thought it was inactive
slots that were the problem in this regard?

You could still have an active slot with a standby that is not able to
catch up AFAIK.

In that scenario, why would one wish to drop the replication slot? If
it can't keep up, dropping the replication slot would likely mean
you'd orphan the standby due to the primary no longer holding on to
the necessary WAL, and the standby is then useless. In which case, if
the standby is causing such problems, why not shut down that standby,
thereby effectively decommissioning it, then delete the slot?

The problem I'm anticipating is that the replica server is actually
offline, but the master doesn't know it yet. So here's the situ:

1. replica with a slot dies
2. wal logs start piling up and master is running low on disk space
3. replica is still marked "active" because we're waiting for default
tcp timeout (3+ hours) or for the proxy to kill the connection (forever).

But as Andres has shown, there are two ways to fix the above. So we're
in good shape.
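
To spot the situation in steps 1 to 3 before the disk fills, a DBA could check how much WAL each slot is holding back. This is a sketch that assumes PostgreSQL 10 or later function names; on 9.4 through 9.6 the equivalents are pg_xlog_location_diff() and pg_current_xlog_location().

```sql
-- WAL retained on behalf of each slot, largest first.
-- A slot that is still marked active but whose retained WAL keeps
-- growing may belong to a replica that is gone or cannot catch up.
SELECT slot_name,
       active,
       pg_size_pretty(
           pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
       ) AS retained_wal
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC NULLS LAST;
```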

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#14 Andres Freund
andres@2ndquadrant.com
In reply to: Josh Berkus (#9)
Re: Replication slots and footguns

On 2014-03-12 13:34:47 -0700, Josh Berkus wrote:

On 03/12/2014 12:34 PM, Robert Haas wrote:

Urgh. That error message looks susceptible to improvement. How about:

replication slot "%s" cannot be dropped because it is currently in use

I think that'd require duplicating some code between acquire and drop,
but how about "replication slot "%s" is in use by another backend"?

Sold.

Wait ... before you go further ... I object to dropping the word
"active" from the error message. The column is called "active", and
that's where a DBA should look; that word needs to stay in the error
message.

"replication slot "%s" is in active in another backend"?

Alternatively we could replace the boolean active by the owner's pid,
but that's a not entirely trivial change...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#15 Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: Replication slots and footguns

On 03/13/2014 04:07 AM, Andres Freund wrote:

On 2014-03-12 13:34:47 -0700, Josh Berkus wrote:

On 03/12/2014 12:34 PM, Robert Haas wrote:

Urgh. That error message looks susceptible to improvement. How about:

replication slot "%s" cannot be dropped because it is currently in use

I think that'd require duplicating some code between acquire and drop,
but how about "replication slot "%s" is in use by another backend"?

Sold.

Wait ... before you go further ... I object to dropping the word
"active" from the error message. The column is called "active", and
that's where a DBA should look; that word needs to stay in the error
message.

"replication slot "%s" is in active in another backend"?

"*for* another backend", but that works for me. I just want to keep the
word "active", because when I encountered that error in testing I knew
*immediately* where to look because of the word.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#16 Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#15)
Re: Replication slots and footguns

On Thu, Mar 13, 2014 at 1:03 PM, Josh Berkus <josh@agliodbs.com> wrote:

On 03/13/2014 04:07 AM, Andres Freund wrote:

On 2014-03-12 13:34:47 -0700, Josh Berkus wrote:

On 03/12/2014 12:34 PM, Robert Haas wrote:

Urgh. That error message looks susceptible to improvement. How about:

replication slot "%s" cannot be dropped because it is currently in use

I think that'd require duplicating some code between acquire and drop,
but how about "replication slot "%s" is in use by another backend"?

Sold.

Wait ... before you go further ... I object to dropping the word
"active" from the error message. The column is called "active", and
that's where a DBA should look; that word needs to stay in the error
message.

"replication slot "%s" is in active in another backend"?

"*for* another backend", but that works for me. I just want to keep the
word "active", because when I encountered that error in testing I knew
*immediately* where to look because of the word.

I think "in use" is just as clear as active, and I think the text
Andres proposed previously reads a whole lot more nicely than this:

replication slot "%s" is in use by another backend

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#17 Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: Replication slots and footguns

On 03/13/2014 01:17 PM, Robert Haas wrote:

I think "in use" is just as clear as active, and I think the text
Andres proposed previously reads a whole lot more nicely than this:

replication slot "%s" is in use by another backend

Then we should change the column name in the pg_stat_replication_slots
view to "in_use". My point is that the error message and the diagnostic
view should use the same word, or we're needlessly confusing our users.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#18 Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#17)
Re: Replication slots and footguns

On Thu, Mar 13, 2014 at 6:45 PM, Josh Berkus <josh@agliodbs.com> wrote:

On 03/13/2014 01:17 PM, Robert Haas wrote:

I think "in use" is just as clear as active, and I think the text
Andres proposed previously reads a whole lot more nicely than this:

replication slot "%s" is in use by another backend

Then we should change the column name in the pg_stat_replication_slots
view to "in_use". My point is that the error message and the diagnostic
view should use the same word, or we're needlessly confusing our users.

I see. That's an interesting point....

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#19 Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: Replication slots and footguns

On 03/13/2014 05:01 PM, Robert Haas wrote:

On Thu, Mar 13, 2014 at 6:45 PM, Josh Berkus <josh@agliodbs.com> wrote:

On 03/13/2014 01:17 PM, Robert Haas wrote:

I think "in use" is just as clear as active, and I think the text
Andres proposed previously reads a whole lot more nicely than this:

replication slot "%s" is in use by another backend

Then we should change the column name in the pg_stat_replication_slots
view to "in_use". My point is that the error message and the diagnostic
view should use the same word, or we're needlessly confusing our users.

I see. That's an interesting point....

As I said earlier, the fact that the current error message says "active"
and the column in pg_stat_replication_slots is called "active" meant I
knew *immediately* where to look. So I'm speaking from personal experience.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#20 Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#19)
Re: Replication slots and footguns

On Thu, Mar 13, 2014 at 8:09 PM, Josh Berkus <josh@agliodbs.com> wrote:

On 03/13/2014 05:01 PM, Robert Haas wrote:

On Thu, Mar 13, 2014 at 6:45 PM, Josh Berkus <josh@agliodbs.com> wrote:

On 03/13/2014 01:17 PM, Robert Haas wrote:

I think "in use" is just as clear as active, and I think the text
Andres proposed previously reads a whole lot more nicely than this:

replication slot "%s" is in use by another backend

Then we should change the column name in the pg_stat_replication_slots
view to "in_use". My point is that the error message and the diagnostic
view should use the same word, or we're needlessly confusing our users.

I see. That's an interesting point....

As I said earlier, the fact that the current error message says "active"
and the column in pg_stat_replication_slots is called "active" meant I
knew *immediately* where to look. So I'm speaking from personal experience.

Well we may have kind of hosed ourselves, because the in-memory data
structures that represent a slot have an in_use flag that
indicates whether the structure is allocated at all, and then an
active flag that indicates whether some backend is using it. I never
liked that naming much. Maybe we should go through and let in_use ->
allocated and active -> in_use.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#21 Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: Replication slots and footguns

On 03/13/2014 05:28 PM, Robert Haas wrote:

Well we may have kind of hosed ourselves, because the in-memory data
structures that represent a slot have an in_use flag that
indicates whether the structure is allocated at all, and then an
active flag that indicates whether some backend is using it. I never
liked that naming much. Maybe we should go through and let in_use ->
allocated and active -> in_use.

Wait, which one of those does pg_drop_replication_slot() care about?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#22 Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#21)
Re: Replication slots and footguns

On Thu, Mar 13, 2014 at 9:07 PM, Josh Berkus <josh@agliodbs.com> wrote:

On 03/13/2014 05:28 PM, Robert Haas wrote:

Well we may have kind of hosed ourselves, because the in-memory data
structures that represent a slot have an in_use flag that
indicates whether the structure is allocated at all, and then an
active flag that indicates whether some backend is using it. I never
liked that naming much. Maybe we should go through and let in_use ->
allocated and active -> in_use.

Wait, which one of those does pg_drop_replication_slot() care about?

Well... the slots that aren't in_use can't be dropped because they
don't exist in the first place. The ones that are active can't be
dropped because somebody else is using them. So both, sorta, I guess?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
