order of operations for pg_restore

Started by Andrew Hammond about 14 years ago, 2 messages
#1 Andrew Hammond
andrew.george.hammond@gmail.com

I'm working on a tool that runs pg_restore with -j 4. I notice that
after COPYing in the data, pg_restore does two indexes and a cluster
command in parallel. The first CREATE INDEX is running, the CLUSTER
command is waiting on it and the second CREATE INDEX is waiting on the
CLUSTER. This seems sub-optimal. Would it make sense to run the
CLUSTER command first? I'm pretty sure I can replicate the behavior if
necessary. Running 9.1.2.

Andrew

#2 Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Hammond (#1)
Re: order of operations for pg_restore

On 01/11/2012 07:57 PM, Andrew Hammond wrote:

> I'm working on a tool that runs pg_restore with -j 4. I notice that
> after COPYing in the data, pg_restore does two indexes and a cluster
> command in parallel. The first CREATE INDEX is running, the CLUSTER
> command is waiting on it and the second CREATE INDEX is waiting on the
> CLUSTER. This seems sub-optimal. Would it make sense to run the
> CLUSTER command first? I'm pretty sure I can replicate the behavior if
> necessary. Running 9.1.2.

Well, we don't actually run CLUSTER. We run a command to mark a table as
clustered on the index. The nasty part is that it's not a separate TOC
entry; it's in the same TOC entry as the index creation. But ALTER TABLE
has different locking requirements from CREATE INDEX. If the clustered
index is not one created from a constraint, we could have the
dependencies wrong. It looks like this is something we all missed when
parallel restore was implemented. I think we might need to split the
ALTER TABLE ... CLUSTER from its parent statement.
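
To illustrate why the locking difference produces the stall described above, here is a rough Python sketch of a single table's lock queue. It is a toy model, not pg_restore or backend code. It assumes that a plain CREATE INDEX takes a SHARE lock (which is compatible with other SHARE locks) and that the ALTER TABLE ... CLUSTER ON step takes ACCESS EXCLUSIVE; the exact mode may differ by version, but any mode that conflicts with SHARE gives the same picture. The job names idx_a, idx_b, and cluster_on are hypothetical.

```python
from collections import deque

# Assumed subset of the table-level lock conflict rules:
# SHARE does not conflict with SHARE; ACCESS EXCLUSIVE conflicts with everything.
CONFLICTS = {
    "SHARE": {"ACCESS EXCLUSIVE"},
    "ACCESS EXCLUSIVE": {"SHARE", "ACCESS EXCLUSIVE"},
}

# Hypothetical restore jobs on one table and the lock mode each is assumed to need.
JOBS = {
    "idx_a": "SHARE",                 # CREATE INDEX
    "idx_b": "SHARE",                 # CREATE INDEX
    "cluster_on": "ACCESS EXCLUSIVE", # ALTER TABLE ... CLUSTER ON
}


class TableLock:
    """Simplified FIFO lock queue for a single table."""

    def __init__(self):
        self.holders = []       # (job, mode) pairs currently granted
        self.waiters = deque()  # (job, mode) pairs queued in arrival order

    def request(self, job):
        mode = JOBS[job]
        # A request waits if it conflicts with a holder OR with an earlier waiter.
        blocked = any(m in CONFLICTS[mode] for _, m in self.holders) or any(
            m in CONFLICTS[mode] for _, m in self.waiters
        )
        (self.waiters if blocked else self.holders).append((job, mode))

    def release(self, job):
        self.holders = [h for h in self.holders if h[0] != job]
        # Grant queued requests in order until one conflicts with a holder.
        while self.waiters:
            _, mode = self.waiters[0]
            if any(m in CONFLICTS[mode] for _, m in self.holders):
                break
            self.holders.append(self.waiters.popleft())


def simulate(order):
    """Issue the requests in the given order; return (running, waiting) job names."""
    lock = TableLock()
    for job in order:
        lock.request(job)
    return [j for j, _ in lock.holders], [j for j, _ in lock.waiters]


if __name__ == "__main__":
    # Order reported in this thread: index, then the ALTER TABLE, then the other index.
    print(simulate(["idx_a", "cluster_on", "idx_b"]))
    # Both index builds requested first: they share the lock and run together.
    print(simulate(["idx_a", "idx_b", "cluster_on"]))
```

In this model the reported order leaves only idx_a running, with cluster_on and idx_b queued behind it, while issuing both index builds first lets them run concurrently and leaves only the quick ALTER TABLE waiting. That is the kind of ordering a separate TOC entry with its own dependencies would make possible.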

cheers

andrew