BUG #10823: Better REINDEX syntax.

Started by Daniel Migowski almost 12 years ago · 21 messages · hackers, bugs
#1 Daniel Migowski
dmigowski@ikoffice.de

The following bug has been logged on the website:

Bug reference: 10823
Logged by: Daniel Migowski
Email address: dmigowski@ikoffice.de
PostgreSQL version: 9.1.13
Operating system: n/a
Description:

Hello.

Compared to CLUSTER and VACUUM FULL we need to specify a database to the
REINDEX command. Why? It would be logical to reindex the current database,
exactly like CLUSTER does. So why isn't the DATABASE parameter optional?

PS: Thanks for all your work on this great database!

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#2 Bruce Momjian
bruce@momjian.us
In reply to: Daniel Migowski (#1)
Re: BUG #10823: Better REINDEX syntax.

On Tue, Jul 1, 2014 at 10:33:07AM +0000, dmigowski@ikoffice.de wrote:

The following bug has been logged on the website:

Bug reference: 10823
Logged by: Daniel Migowski
Email address: dmigowski@ikoffice.de
PostgreSQL version: 9.1.13
Operating system: n/a
Description:

Hello.

Compared to CLUSTER and VACUUM FULL we need to specify a database to the
REINDEX command. Why? It would be logical to reindex the current database,
exactly like CLUSTER does. So why isn't the DATABASE parameter optional?

Wow, yeah, that is kind of odd, e.g.

REINDEX { INDEX | TABLE | DATABASE | SYSTEM } name [ FORCE ]
...
name
The name of the specific index, table, or database
to be reindexed. Index and table names can be
schema-qualified. Presently, REINDEX DATABASE and REINDEX SYSTEM
can only reindex the current database, so their parameter must
match the current database's name.
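
To make the restriction in that documentation excerpt concrete, here is a minimal Python sketch of the rule as described there. It is only an illustration, not PostgreSQL source code; the function name `check_reindex_target` and the error text are hypothetical.

```python
# Sketch of the rule quoted above: for REINDEX DATABASE and REINDEX SYSTEM
# the name must equal the current database. Names here are illustrative only.
DATABASE_WIDE = {"DATABASE", "SYSTEM"}
ALL_KINDS = {"INDEX", "TABLE"} | DATABASE_WIDE


def check_reindex_target(kind: str, name: str, current_db: str) -> None:
    """Raise ValueError if a database-wide REINDEX names another database."""
    kind = kind.upper()
    if kind not in ALL_KINDS:
        raise ValueError(f"unknown REINDEX kind: {kind}")
    if kind in DATABASE_WIDE and name != current_db:
        raise ValueError(
            f"can only reindex the current database ({current_db!r})"
        )


# Accepted: the name matches the database we are connected to.
check_reindex_target("DATABASE", "appdb", current_db="appdb")

# Rejected: the name is mandatory, yet only one value is ever valid.
try:
    check_reindex_target("DATABASE", "otherdb", current_db="appdb")
except ValueError as exc:
    print(exc)
```

The sketch shows why the parameter reads as a dummy argument: it is required, but it carries no information the server does not already have.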

Let me look at improving that for 9.5.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#2)
Re: BUG #10823: Better REINDEX syntax.

Bruce Momjian <bruce@momjian.us> writes:

On Tue, Jul 1, 2014 at 10:33:07AM +0000, dmigowski@ikoffice.de wrote:

Compared to CLUSTER and VACUUM FULL we need to specify a database to the
REINDEX command. Why? It would be logical to reindex the current database,
exactly like CLUSTER does. So why isn't the DATABASE parameter optional?

Wow, yeah, that is kind of odd, e.g.

I don't find it all that odd. We should not be encouraging routine
database-wide reindexes.

regards, tom lane

#4 Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#3)
Re: BUG #10823: Better REINDEX syntax.

On Wed, Jul 30, 2014 at 01:29:31PM -0400, Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

On Tue, Jul 1, 2014 at 10:33:07AM +0000, dmigowski@ikoffice.de wrote:

Compared to CLUSTER and VACUUM FULL we need to specify a database to the
REINDEX command. Why? It would be logical to reindex the current database,
exactly like CLUSTER does. So why isn't the DATABASE parameter optional?

Wow, yeah, that is kind of odd, e.g.

I don't find it all that odd. We should not be encouraging routine
database-wide reindexes.

Uh, do we encourage database-wide VACUUM FULL or CLUSTER, as we use them
there with no parameter. Is there a reason REINDEX should be harder,
and require a dummy argument to run?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#4)
Re: BUG #10823: Better REINDEX syntax.

Bruce Momjian <bruce@momjian.us> writes:

On Wed, Jul 30, 2014 at 01:29:31PM -0400, Tom Lane wrote:

I don't find it all that odd. We should not be encouraging routine
database-wide reindexes.

Uh, do we encourage database-wide VACUUM FULL or CLUSTER, as we use them
there with no parameter. Is there a reason REINDEX should be harder,
and require a dummy argument to run?

I believe that REINDEX on system catalogs carries a risk of deadlock
failures against other processes --- there was a recent example of that
in the mailing lists. VACUUM FULL has such risks too, but that's been
pretty well deprecated for many years. (I think CLUSTER is probably
relatively safe on this score because it's not going to think any system
catalogs are clustered.)

If there were a variant of REINDEX that only hit user tables, I'd be fine
with making that easy to invoke.
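
As a rough illustration of what a "user tables only" variant would have to decide, here is a small Python sketch that splits relations into a system group and a user group by schema. The schema list is an assumption made for this example (only `pg_catalog` and `pg_toast` are treated as system), and `split_relations` is a hypothetical helper, not anything from the PostgreSQL source.

```python
# Assumption for this sketch: relations in these schemas count as "system";
# every other schema, including information_schema, counts as "user".
SYSTEM_SCHEMAS = {"pg_catalog", "pg_toast"}


def split_relations(relations):
    """Split (schema, table) pairs into (system, user) lists, keeping order."""
    system, user = [], []
    for schema, table in relations:
        bucket = system if schema in SYSTEM_SCHEMAS else user
        bucket.append((schema, table))
    return system, user


relations = [
    ("pg_catalog", "pg_class"),
    ("public", "orders"),
    ("information_schema", "sql_features"),
]
system, user = split_relations(relations)
print("skipped by a user-only reindex:", system)
print("touched by a user-only reindex:", user)
```

Under this assumption the catalogs that carry the deadlock risk are never touched, which is the property asked for above.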

regards, tom lane

#6 Vik Fearing
vik@postgresfriends.org
In reply to: Bruce Momjian (#4)
Re: BUG #10823: Better REINDEX syntax.

On 07/30/2014 07:35 PM, Bruce Momjian wrote:

On Wed, Jul 30, 2014 at 01:29:31PM -0400, Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

On Tue, Jul 1, 2014 at 10:33:07AM +0000, dmigowski@ikoffice.de wrote:

Compared to CLUSTER and VACUUM FULL we need to specify a database to the
REINDEX command. Why? It would be logical to reindex the current database,
exactly like CLUSTER does. So why isn't the DATABASE parameter optional?

Wow, yeah, that is kind of odd, e.g.

I don't find it all that odd. We should not be encouraging routine
database-wide reindexes.

Uh, do we encourage database-wide VACUUM FULL or CLUSTER, as we use them
there with no parameter. Is there a reason REINDEX should be harder,
and require a dummy argument to run?

I agree. The request isn't for a naked REINDEX command, it's for a
naked REINDEX DATABASE command.
--
Vik

#7 Bruce Momjian
bruce@momjian.us
In reply to: Vik Fearing (#6)
Re: BUG #10823: Better REINDEX syntax.

On Wed, Jul 30, 2014 at 07:48:39PM +0200, Vik Fearing wrote:

On 07/30/2014 07:35 PM, Bruce Momjian wrote:

On Wed, Jul 30, 2014 at 01:29:31PM -0400, Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

On Tue, Jul 1, 2014 at 10:33:07AM +0000, dmigowski@ikoffice.de wrote:

Compared to CLUSTER and VACUUM FULL we need to specify a database to the
REINDEX command. Why? It would be logical to reindex the current database,
exactly like CLUSTER does. So why isn't the DATABASE parameter optional?

Wow, yeah, that is kind of odd, e.g.

I don't find it all that odd. We should not be encouraging routine
database-wide reindexes.

Uh, do we encourage database-wide VACUUM FULL or CLUSTER, as we use them
there with no parameter. Is there a reason REINDEX should be harder,
and require a dummy argument to run?

I agree. The request isn't for a naked REINDEX command, it's for a
naked REINDEX DATABASE command.

Yes, the question is should we support REINDEX DATABASE without a
database name that matches the current database. REINDEX alone might be
too risky.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

#8 Vik Fearing
vik@postgresfriends.org
In reply to: Tom Lane (#5)
Re: [BUGS] BUG #10823: Better REINDEX syntax.

On 07/30/2014 07:46 PM, Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

On Wed, Jul 30, 2014 at 01:29:31PM -0400, Tom Lane wrote:

I don't find it all that odd. We should not be encouraging routine
database-wide reindexes.

Uh, do we encourage database-wide VACUUM FULL or CLUSTER, as we use them
there with no parameter. Is there a reason REINDEX should be harder,
and require a dummy argument to run?

I believe that REINDEX on system catalogs carries a risk of deadlock
failures against other processes --- there was a recent example of that
in the mailing lists. VACUUM FULL has such risks too, but that's been
pretty well deprecated for many years. (I think CLUSTER is probably
relatively safe on this score because it's not going to think any system
catalogs are clustered.)

If there were a variant of REINDEX that only hit user tables, I'd be fine
with making that easy to invoke.

Here are two patches for this.

The first one, reindex_user_tables.v1.patch, implements the variant that
only hits user tables, as suggested by you.

The second one, reindex_no_dbname.v1.patch, allows the three
database-wide variants to omit the database name (voted for by Daniel
Migowski, Bruce, and myself; voted against by you). This patch is to be
applied on top of the first one.
--
Vik

Attachments:

reindex_user_tables.v1.patch (text/x-diff, +135 -12)
reindex_no_dbname.v1.patch (text/x-diff, +40 -9)

#9 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Vik Fearing (#8)
Re: [BUGS] BUG #10823: Better REINDEX syntax.

Vik Fearing wrote:

Here are two patches for this.

The first one, reindex_user_tables.v1.patch, implements the variant that
only hits user tables, as suggested by you.

The second one, reindex_no_dbname.v1.patch, allows the three
database-wide variants to omit the database name (voted for by Daniel
Migowski, Bruce, and myself; voted against by you). This patch is to be
applied on top of the first one.

Not a fan. Here's a revised version that provides REINDEX USER TABLES,
which can only be used without a database name; other modes are not
affected i.e. they continue to require a database name. I also renamed
your proposed reindexdb's --usertables to --user-tables.

Oh, I just noticed that if you say reindexdb --all --user-tables, the
latter is not honored. Must fix before commit.
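
A minimal Python sketch of the intended behaviour, assuming a reindexdb-like client that issues one statement per target database. The real reindexdb is written in C; `plan_reindex` and its parameters are hypothetical, and `REINDEX USER TABLES` is the syntax proposed in this thread, not a released command.

```python
def plan_reindex(databases, all_dbs=False, user_tables=False, dbname=None):
    """Return (database, SQL) pairs a reindexdb-like client would run."""
    if all_dbs:
        targets = list(databases)
    elif dbname is not None:
        targets = [dbname]
    else:
        raise ValueError("need either all_dbs=True or a dbname")

    plan = []
    for db in targets:
        if user_tables:
            # The point of the fix: this branch must also apply in --all mode.
            sql = "REINDEX USER TABLES;"
        else:
            # Simplistic identifier quoting, good enough for a sketch.
            sql = f'REINDEX DATABASE "{db}";'
        plan.append((db, sql))
    return plan


for db, sql in plan_reindex(["app", "reports"], all_dbs=True, user_tables=True):
    print(db, "->", sql)
```

The flag is evaluated inside the per-database loop, so combining it with the all-databases mode cannot silently fall back to a full `REINDEX DATABASE`.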

Makes sense?

Note: I don't like the reindexdb UI; if you just run "reindexdb -d
foobar" it will reindex everything, including system catalogs. I think
USER TABLES should be the default operation mode for reindex. If you
want plain old "REINDEX DATABASE foobar" which also hits the catalogs,
you should request that separately (how?). This patch doesn't change
this.

Also note: if you say "user tables", information_schema is reindexed too,
which kinda sucks.

Further note: this command is probably pointless in the majority of
cases. Somebody should spend some serious time with REINDEX
CONCURRENTLY ..

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

reindex_user_tables.v2.patch (text/x-diff; charset=us-ascii, +132 -10)

#10 Marko Tiikkaja
marko@joh.to
In reply to: Alvaro Herrera (#9)
Re: [BUGS] BUG #10823: Better REINDEX syntax.

On 2014-08-29 01:00, Alvaro Herrera wrote:

Vik Fearing wrote:

Here are two patches for this.

The first one, reindex_user_tables.v1.patch, implements the variant that
only hits user tables, as suggested by you.

The second one, reindex_no_dbname.v1.patch, allows the three
database-wide variants to omit the database name (voted for by Daniel
Migowski, Bruce, and myself; voted against by you). This patch is to be
applied on top of the first one.

Not a fan. Here's a revised version that provides REINDEX USER TABLES,
which can only be used without a database name; other modes are not
affected i.e. they continue to require a database name.

Yeah, I think I like this better than allowing all of them without the
database name.

I also renamed
your proposed reindexdb's --usertables to --user-tables.

I agree with this change.

Oh, I just noticed that if you say reindexdb --all --user-tables, the
latter is not honored. Must fix before commit.

Definitely.

Note: I don't like the reindexdb UI; if you just run "reindexdb -d
foobar" it will reindex everything, including system catalogs. I think
USER TABLES should be the default operation mode for reindex. If you
want plain old "REINDEX DATABASE foobar" which also hits the catalogs,
you should request that separately (how?). This patch doesn't change
this.

This should probably be a separate patch if it's going to happen. But
the idea seems reasonable.

Also note: if you say "user tables", information_schema is reindexed too,
which kinda sucks.

*shrug* It sort of makes sense if you think of this as the opposite of
REINDEX SYSTEM. I'm not at all sure whether including or excluding it
would be the better choice here.

Do we have some kind of an agreement on what this patch should look
like? Is someone going to prepare an updated patch? Vik?

.marko

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Marko Tiikkaja (#10)
Re: [BUGS] BUG #10823: Better REINDEX syntax.

Marko Tiikkaja wrote:

On 2014-08-29 01:00, Alvaro Herrera wrote:

Note: I don't like the reindexdb UI; if you just run "reindexdb -d
foobar" it will reindex everything, including system catalogs. I think
USER TABLES should be the default operation mode for reindex. If you
want plain old "REINDEX DATABASE foobar" which also hits the catalogs,
you should request that separately (how?). This patch doesn't change
this.

This should probably be a separate patch if it's going to happen.

Yeh, no argument there.

Also note: if you say "user tables", information_schema is reindexed too,
which kinda sucks.

*shrug* It sort of makes sense if you think of this as the opposite
of REINDEX SYSTEM. I'm not at all sure whether including or
excluding it would be the better choice here.

Yeah, probably not worth bothering.

Do we have some kind of an agreement on what this patch should look
like? Is someone going to prepare an updated patch? Vik?

I think the only issue left for this to be committable is reindexdb
--all previously mentioned.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#12 Marko Tiikkaja
marko@joh.to
In reply to: Alvaro Herrera (#11)
Re: [BUGS] BUG #10823: Better REINDEX syntax.

On 2014-09-02 22:24, Alvaro Herrera wrote:

Marko Tiikkaja wrote:

Do we have some kind of an agreement on what this patch should look
like? Is someone going to prepare an updated patch? Vik?

I think the only issue left for this to be committable is reindexdb
--all previously mentioned.

I scanned through the patch and found the exit_nicely() business a bit
weird, so that might be another thing worth looking at.

.marko

#13 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Marko Tiikkaja (#12)
Re: [BUGS] BUG #10823: Better REINDEX syntax.

Marko Tiikkaja wrote:

On 2014-09-02 22:24, Alvaro Herrera wrote:

Marko Tiikkaja wrote:

Do we have some kind of an agreement on what this patch should look
like? Is someone going to prepare an updated patch? Vik?

I think the only issue left for this to be committable is reindexdb
--all previously mentioned.

I scanned through the patch and found the exit_nicely() business a
bit weird, so that might be another thing worth looking at.

Yeah, just rip that out and do PQfinish(conn); exit(1); as other exit
paths do, I'd think.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#14 Vik Fearing
vik@postgresfriends.org
In reply to: Marko Tiikkaja (#10)
Re: [BUGS] BUG #10823: Better REINDEX syntax.

On 09/02/2014 10:17 PM, Marko Tiikkaja wrote:

On 2014-08-29 01:00, Alvaro Herrera wrote:

Vik Fearing wrote:

Here are two patches for this.

The first one, reindex_user_tables.v1.patch, implements the variant that
only hits user tables, as suggested by you.

The second one, reindex_no_dbname.v1.patch, allows the three
database-wide variants to omit the database name (voted for by Daniel
Migowski, Bruce, and myself; voted against by you). This patch is to be
applied on top of the first one.

Not a fan. Here's a revised version that provides REINDEX USER TABLES,
which can only be used without a database name; other modes are not
affected i.e. they continue to require a database name.

Yeah, I think I like this better than allowing all of them without the
database name.

Why? It's just a noise word!

I also renamed
your proposed reindexdb's --usertables to --user-tables.

I agree with this change.

Me, too.

Oh, I just noticed that if you say reindexdb --all --user-tables, the
latter is not honored. Must fix before commit.

Definitely.

Okay, I'll look at that.

Is someone going to prepare an updated patch? Vik?

Yes, I will update the patch.
--
Vik

#15 Stephen Frost
sfrost@snowman.net
In reply to: Vik Fearing (#14)
Re: [BUGS] BUG #10823: Better REINDEX syntax.

* Vik Fearing (vik.fearing@dalibo.com) wrote:

On 09/02/2014 10:17 PM, Marko Tiikkaja wrote:

Yeah, I think I like this better than allowing all of them without the
database name.

Why? It's just a noise word!

Eh, because it ends up reindexing system tables too, which is probably
not what new folks are expecting. Also, it's not required when you say
'user tables', so it's similar to your user_tables v1 patch in that
regard.

Yes, I will update the patch.

Still planning to do this..?

Marking this back to waiting-for-author.

Thanks!

Stephen

#16 Vik Fearing
vik@postgresfriends.org
In reply to: Stephen Frost (#15)
Re: [BUGS] BUG #10823: Better REINDEX syntax.

On 09/08/2014 06:17 AM, Stephen Frost wrote:

* Vik Fearing (vik.fearing@dalibo.com) wrote:

On 09/02/2014 10:17 PM, Marko Tiikkaja wrote:

Yeah, I think I like this better than allowing all of them without the
database name.

Why? It's just a noise word!

Eh, because it ends up reindexing system tables too, which is probably
not what new folks are expecting.

No behavior is changed at all. REINDEX DATABASE dbname; has always hit
the system tables. Since dbname can *only* be the current database,
there's no logic nor benefit in requiring it to be specified.
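
The argument can be sketched in a few lines of Python: if an omitted name simply defaults to the current database, every statement that is accepted resolves to the same target as before. This is only an illustration under that assumption; `resolve_database_name` is a hypothetical helper, not code from either patch.

```python
from typing import Optional


def resolve_database_name(name: Optional[str], current_db: str) -> str:
    """Return the database a REINDEX DATABASE statement would act on."""
    if name is None:
        # Proposed relaxation: an omitted name means the current database.
        return current_db
    if name != current_db:
        # Existing restriction: any other name is an error.
        raise ValueError("can only reindex the current database")
    return name


current = "appdb"
with_name = resolve_database_name("appdb", current)
without_name = resolve_database_name(None, current)
print(with_name == without_name)
```

Both spellings resolve to the current database, so making the name optional changes what the user has to type and nothing about what gets reindexed.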

Also, it's not required when you say
'user tables', so it's similar to your user_tables v1 patch in that
regard.

The fact that REINDEX USER TABLES; is the only one that doesn't require
the dbname seems very inconsistent and confusing.

Yes, I will update the patch.

Still planning to do this..?

Marking this back to waiting-for-author.

Yes, but probably not for this commitfest unfortunately.
--
Vik

#17 Stephen Frost
sfrost@snowman.net
In reply to: Vik Fearing (#16)
Re: [BUGS] BUG #10823: Better REINDEX syntax.

* Vik Fearing (vik.fearing@dalibo.com) wrote:

On 09/08/2014 06:17 AM, Stephen Frost wrote:

* Vik Fearing (vik.fearing@dalibo.com) wrote:

On 09/02/2014 10:17 PM, Marko Tiikkaja wrote:

Yeah, I think I like this better than allowing all of them without the
database name.

Why? It's just a noise word!

Eh, because it ends up reindexing system tables too, which is probably
not what new folks are expecting.

No behavior is changed at all. REINDEX DATABASE dbname; has always hit
the system tables. Since dbname can *only* be the current database,
there's no logic nor benefit in requiring it to be specified.

Sure, but I think the point is that reindexing the system tables as part
of a database-wide reindex is a *bad* thing which we shouldn't be
encouraging by making it easier.

I realize you're a bit 'stuck' here because we don't like the current
behavior, but we don't want to change it either.

Also, it's not required when you say
'user tables', so it's similar to your user_tables v1 patch in that
regard.

The fact that REINDEX USER TABLES; is the only one that doesn't require
the dbname seems very inconsistent and confusing.

I understand, but the alternative would be a 'reindex;' which *doesn't*
reindex the system tables- would that be less confusing? Or getting rid
of the current 'reindex database' which also reindexes system tables...

Yes, I will update the patch.

Still planning to do this..?

Marking this back to waiting-for-author.

Yes, but probably not for this commitfest unfortunately.

Fair enough, I'll mark it 'returned with feedback'.

Thanks!

Stephen

#18 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Stephen Frost (#17)
Re: [BUGS] BUG #10823: Better REINDEX syntax.

Stephen Frost wrote:

Yes, I will update the patch.

Still planning to do this..?

Marking this back to waiting-for-author.

Yes, but probably not for this commitfest unfortunately.

Fair enough, I'll mark it 'returned with feedback'.

We lost this patch for the October commitfest, didn't we?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#19 Stephen Frost
sfrost@snowman.net
In reply to: Alvaro Herrera (#18)
Re: [BUGS] BUG #10823: Better REINDEX syntax.

* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:

We lost this patch for the October commitfest, didn't we?

I'm guessing you missed that a new version just got submitted..?

I'd be fine with today's being added to the October commitfest..

Of course, there's a whole independent discussion to be had about how
there wasn't any break between last commitfest and this one, but that
probably deserves its own thread.

Thanks,

Stephen

#20 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Stephen Frost (#19)
Re: [BUGS] BUG #10823: Better REINDEX syntax.

Stephen Frost wrote:

* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:

We lost this patch for the October commitfest, didn't we?

I'm guessing you missed that a new version just got submitted..?

Which one, reindex schema? Isn't that a completely different patch?

I'd be fine with today's being added to the october commitfest..

Of course, there's a whole independent discussion to be had about how
there wasn't any break between last commitfest and this one, but that
probably deserves its own thread.

It's not the first time that happens, and honestly I don't see all that
much cause for concern. Heikki did move pending patches to the current
one, and closed a lot of inactive ones as 'returned with feedback'.
Attentive patch authors should have submitted new versions ... if they
don't, then someone else with an interest in the patch should do so.
If no one updates the patches, what do we want them for?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#21 Stephen Frost
sfrost@snowman.net
In reply to: Alvaro Herrera (#20)