Re: [HACKERS] Early locking option to parallel backup
Lucas, Robert, all,
* Robert Haas (robertmhaas@gmail.com) wrote:
On Mon, Nov 6, 2017 at 4:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wonder if we couldn't somehow repurpose the work that was done for
parallel workers' locks. Lots of security-type issues to be handled
if we're to open that up to clients, but maybe it's solvable. For
instance, maybe only allowing it to clients sharing the same snapshot
would help.

Interesting idea. There's a bunch of holes that would need to be
patched there; for instance, you can't have one session running DDL
while somebody else has AccessShareLock. Parallel query relies on the
parallel-mode restrictions to prevent that kind of thing from
happening, but it would be strange (and likely somewhat broken) to try
to enforce those here. It would be strange and probably bad if LOCK
TABLE a; LOCK TABLE b in one session and LOCK TABLE b; LOCK TABLE a in
another session failed to deadlock. In short, there's a big
difference between a single session using multiple processes and
multiple closely coordinated sessions.
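For reference, the textbook version of that ordering deadlock looks
like this (a minimal sketch; bare LOCK TABLE takes ACCESS EXCLUSIVE
MODE by default):

    -- Session 1:
    BEGIN;
    LOCK TABLE a;

    -- Session 2:
    BEGIN;
    LOCK TABLE b;

    -- Session 1 (blocks, waiting for session 2's lock on b):
    LOCK TABLE b;

    -- Session 2: after deadlock_timeout the deadlock detector fires
    -- and one of the two sessions is rolled back with SQLSTATE 40P01:
    LOCK TABLE a;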
Parallel pg_dump is based on synchronized transactions though, and we
have a bunch of checks in ImportSnapshot() because a pg_dump parallel
worker also can't really be quite the same as a normal backend. Perhaps
we could add more restrictions in ImportSnapshot() to match the
restrictions on parallel-mode workers? If we think there are other
users of SET TRANSACTION SNAPSHOT then we might need to extend that
command for this case, but that seems relatively straightforward. I
don't know how reasonable it is, though, to take a normally-started
backend and, when a SET TRANSACTION SNAPSHOT PARALLEL (or whatever)
happens, make it close enough to a parallel worker to let it skip the
lock fairness.
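For context, the synchronized-snapshot handshake that parallel pg_dump
already relies on (and which ImportSnapshot() implements on the server
side) looks like this; a minimal sketch, with an illustrative snapshot
identifier, and the PARALLEL keyword above being purely hypothetical:

    -- Leader connection: export the snapshot the workers will share.
    BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    SELECT pg_export_snapshot();  -- returns e.g. '00000003-0000001B-1'

    -- Each worker connection: the import must be the first thing the
    -- transaction does, and requires REPEATABLE READ or stricter.
    BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    SET TRANSACTION SNAPSHOT '00000003-0000001B-1';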
Also, even if you did it, you still need a lot of PROCLOCKs. Workers
don't need to take all locks up front because they can be assured of
getting them later, but they've still got to lock the objects they
actually want to access. Group locking aims to prevent deadlocks
between cooperating processes; it is not a license to skip locking
altogether.
That wouldn't be any different from what's happening today in pg_dump,
though, so I'm not sure why it would be an issue. The proposed patch
locks everything in every worker, which is quite different from the
main process locking everything and the individual workers then locking
only the objects they're working on.
* Lucas B (lucas75@gmail.com) wrote:
On 05/11/2017 21:09, Andres Freund wrote:
Well, the current approach afaics requires #relations * 2 locks, whereas
acquiring them in every worker would scale that with the number of
workers.

Yes, that is why I proposed it as an option. As an option, it will not
affect anyone who does not want to use it.
That's not actually correct: additional options add maintenance
overhead for hackers and for users to deal with, and there's at least a
clear idea of how we could make this "just work" without adding an
additional option. The cost being opted into here is also real: with,
say, 100,000 relations and 8 workers, Andres's numbers imply on the
order of a million lock table entries instead of 200,000. Based on that
and the discussion thus far, it seems that the next step is to work
through a way to change things to allow workers to skip the lock
fairness, and that the current patch isn't going to be accepted.
It seems natural to think of several connections sharing a synchronized
snapshot as the same connection. Then it may be reasonable to grant a
shared lock out of turn if any connection using the same shared
snapshot already has a granted lock on the same relation. Last year Tom
mentioned that there is already queue-jumping logic of that sort in the
lock manager for other purposes. Although it seems conceptually simple,
I suspect the implementation is not.
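Concretely, the case such a queue-jump would address looks something
like this (a sketch; the table and DDL are illustrative):

    -- pg_dump leader: holds AccessShareLock on t for the whole dump.
    BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    LOCK TABLE t IN ACCESS SHARE MODE;

    -- Unrelated session: needs ACCESS EXCLUSIVE, so it queues behind
    -- the leader's lock and waits.
    ALTER TABLE t ADD COLUMN c integer;

    -- pg_dump worker (same synchronized snapshot): today fairness
    -- makes this queue behind the waiting ALTER, even though the
    -- requested mode is compatible with the lock the leader already
    -- holds; granting it out of turn is the proposed queue-jump.
    LOCK TABLE t IN ACCESS SHARE MODE;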
The implementation is almost certainly not simple, but that doesn't
mean we should go in another direction and require a great many more
locks, or add an option to do so.
I'm going to move this patch into 'Waiting for Author' since it's
clearly gotten a good bit of review and discussion already. If there's
no interest in further exploring the approach of changing the locking
logic, then we should probably mark the patch as Returned with Feedback
(RWF).
If you aren't going to pursue it, though, this seems like a good
project for the TODO list: further explore how to let parallel pg_dump
processes jump the lock queue, with a reference back to this thread.
Thanks!
Stephen
On Thu, Jan 11, 2018 at 9:02 AM, Stephen Frost <sfrost@snowman.net> wrote:
Parallel pg_dump is based on synchronized transactions though, and we
have a bunch of checks in ImportSnapshot() because a pg_dump parallel
worker also can't really be quite the same as a normal backend. Perhaps
we could add more restrictions in ImportSnapshot() to match the
restrictions on parallel-mode workers? If we think there are other
users of SET TRANSACTION SNAPSHOT then we might need to extend that
command for this case, but that seems relatively straightforward. I
don't know how reasonable it is, though, to take a normally-started
backend and, when a SET TRANSACTION SNAPSHOT PARALLEL (or whatever)
happens, make it close enough to a parallel worker to let it skip the
lock fairness.
It seems pretty tricky to me. I actually don't think this use case
has all that much in common with the parallel query case. Both can be
addressed by tweaking the lock manager, but I think this needs a
different set of tweaks than that did.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2018-01-15 12:12:26 -0500, Robert Haas wrote:
On Thu, Jan 11, 2018 at 9:02 AM, Stephen Frost <sfrost@snowman.net> wrote:
Parallel pg_dump is based on synchronized transactions though, and we
have a bunch of checks in ImportSnapshot() because a pg_dump parallel
worker also can't really be quite the same as a normal backend. Perhaps
we could add more restrictions in ImportSnapshot() to match the
restrictions on parallel-mode workers? If we think there are other
users of SET TRANSACTION SNAPSHOT then we might need to extend that
command for this case, but that seems relatively straightforward. I
don't know how reasonable it is, though, to take a normally-started
backend and, when a SET TRANSACTION SNAPSHOT PARALLEL (or whatever)
happens, make it close enough to a parallel worker to let it skip the
lock fairness.

It seems pretty tricky to me. I actually don't think this use case
has all that much in common with the parallel query case. Both can be
addressed by tweaking the lock manager, but I think this needs a
different set of tweaks than that did.
There seems to be consensus in this thread that the approach Lucas
proposed isn't what we want, and that instead some shared-lock-based
approach is desirable. As that has been the case for ~1.5 months, I
propose we mark this as Returned with Feedback?
Greetings,
Andres Freund
On Fri, Mar 2, 2018 at 2:29 AM, Andres Freund <andres@anarazel.de> wrote:
There seems to be consensus in this thread that the approach Lucas
proposed isn't what we want, and that instead some shared-lock-based
approach is desirable. As that has been the case for ~1.5 months, I
propose we mark this as Returned with Feedback?
Yes, that seems pretty clear-cut to me. It would be totally unfair if
the author of a patch that hasn't been updated since November were
allowed to submit a new version after the start of the final
CommitFest. We shouldn't be working on anything now that hasn't been
under active development recently; we have enough things (and then
some) that have.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2018-03-02 11:42:06 -0500, Robert Haas wrote:
On Fri, Mar 2, 2018 at 2:29 AM, Andres Freund <andres@anarazel.de> wrote:
There seems to be consensus in this thread that the approach Lucas
proposed isn't what we want, and that instead some shared-lock-based
approach is desirable. As that has been the case for ~1.5 months, I
propose we mark this as Returned with Feedback?

Yes, that seems pretty clear-cut to me. It would be totally unfair if
the author of a patch that hasn't been updated since November were
allowed to submit a new version after the start of the final
CommitFest. We shouldn't be working on anything now that hasn't been
under active development recently; we have enough things (and then
some) that have.
Done.