ANALYZE ONLY

Started by Michael Harrisover 1 year ago44 messages
Jump to latest
#1Michael Harris
harmic@gmail.com

Hello Postgres Hackers

An application that I am developing uses Postgresql, and includes a fairly large
number of partitioned tables which are used to store time series data.

The tables are partitioned by time, and typically there is only one partition
at a time - the current partition - that is actually being updated.
Older partitions
are available for query and eventually dropped.

As per the documentation, partitioned tables are not analyzed by the autovacuum
workers, although their partitions are. Statistics are needed on the partitioned
table level for at least some query planning activities.

The problem is that giving an ANALYZE command targeting a partitioned table
causes it to update statistics for the partitioned table AND all the individual
partitions. There is currently no option to prevent it from including the
partitions.

This is wasteful for our application: for one thing the autovacuum
has already analyzed the individual partitions; for another most of
the partitions
will have had no changes, so they don't need to be analyzed repeatedly.

I took some measurements when running ANALYZE on one of our tables. It
took approx
4 minutes to analyze the partitioned table, then 29 minutes to analyze the
partitions. We have hundreds of these tables, so the cost is very significant.

For my use case at least it would be fantastic if we could add an ONLY option
to ANALYZE, which would cause it to analyze the named table only and not
descend into the partitions.

I took a look at the source and it looks doable, but before launching into it
I thought I would ask a few questions here.

1. Would such a feature be welcomed? Are there any traps I might not
have thought of?

2. The existing ANALYZE command has the following structure:

ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]

It would be easiest to add ONLY as another option, but that
doesn't look quite
right to me - surely the ONLY should be attached to the table name?
An alternative would be:

ANALYZE [ ( option [, ...] ) ] [ONLY] [ table_and_columns [, ...] ]

Any feedback or advice would be great.

Regards
Mike.

#2Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Michael Harris (#1)
Re: ANALYZE ONLY

On Tue, 20 Aug 2024 at 07:52, Michael Harris <harmic@gmail.com> wrote:

1. Would such a feature be welcomed? Are there any traps I might not
have thought of?

I think this sounds like a reasonable feature.

2. The existing ANALYZE command has the following structure:

ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]

It would be easiest to add ONLY as another option, but that
doesn't look quite
right to me - surely the ONLY should be attached to the table name?
An alternative would be:

ANALYZE [ ( option [, ...] ) ] [ONLY] [ table_and_columns [, ...] ]

Any feedback or advice would be great.

Personally I'd prefer a new option to be added. But I agree ONLY isn't
a good name then. Maybe something like SKIP_PARTITIONS.

#3Ilia Evdokimov
ilya.evdokimov@tantorlabs.com
In reply to: Jelte Fennema-Nio (#2)
Re: ANALYZE ONLY

On 20.8.24 10:42, Jelte Fennema-Nio wrote:

On Tue, 20 Aug 2024 at 07:52, Michael Harris<harmic@gmail.com> wrote:

1. Would such a feature be welcomed? Are there any traps I might not
have thought of?

I think this sounds like a reasonable feature.

2. The existing ANALYZE command has the following structure:

ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]

It would be easiest to add ONLY as another option, but that
doesn't look quite
right to me - surely the ONLY should be attached to the table name?
An alternative would be:

ANALYZE [ ( option [, ...] ) ] [ONLY] [ table_and_columns [, ...] ]

Any feedback or advice would be great.

Personally I'd prefer a new option to be added. But I agree ONLY isn't
a good name then. Maybe something like SKIP_PARTITIONS.

Hi everyone,

Your proposal is indeed interesting, but I have a question: can't your
issue be resolved by properly configuring |autovacuum| instead of
developing a new feature for |ANALYZE|?

From my perspective, |ANALYZE| is intended to forcibly gather
statistics from all partitioned tables. If the goal is to ensure that
statistics are updated at the right moment, adjusting the
|autovacuum_analyze_threshold| and |autovacuum_analyze_scale_factor|
parameters might be the solution.

--
Regards,
Ilia Evdokimov,
Tantor Labs LCC.

#4David Rowley
dgrowleyml@gmail.com
In reply to: Ilia Evdokimov (#3)
Re: ANALYZE ONLY

On Tue, 20 Aug 2024 at 23:25, Ilia Evdokimov
<ilya.evdokimov@tantorlabs.com> wrote:

Your proposal is indeed interesting, but I have a question: can't your issue be resolved by properly configuring autovacuum instead of developing a new feature for ANALYZE?

Basically, no. There's a "tip" in [1]https://www.postgresql.org/docs/devel/routine-vacuuming.html#VACUUM-FOR-STATISTICS which provides information on
the limitation, namely:

"The autovacuum daemon does not issue ANALYZE commands for partitioned
tables. Inheritance parents will only be analyzed if the parent itself
is changed - changes to child tables do not trigger autoanalyze on the
parent table. If your queries require statistics on parent tables for
proper planning, it is necessary to periodically run a manual ANALYZE
on those tables to keep the statistics up to date."

There is also some discussion about removing the limitation in [2]/messages/by-id/CAKkQ508_PwVgwJyBY=0Lmkz90j8CmWNPUxgHvCUwGhMrouz6UA@mail.gmail.com.
While I agree that it would be nice to have autovacuum handle this,
it's not clear how exactly it would work. Additionally, if we had
that, it would still be useful if the ANALYZE command could be
instructed to just gather statistics for the partitioned table only.

David

[1]: https://www.postgresql.org/docs/devel/routine-vacuuming.html#VACUUM-FOR-STATISTICS
[2]: /messages/by-id/CAKkQ508_PwVgwJyBY=0Lmkz90j8CmWNPUxgHvCUwGhMrouz6UA@mail.gmail.com

#5Melih Mutlu
m.melihmutlu@gmail.com
In reply to: Michael Harris (#1)
Re: ANALYZE ONLY

Hi Michael,

Thanks for starting this thread. I've also spent a bit time on this after
reading your first thread on this issue [1]

Michael Harris <harmic@gmail.com>, 20 Ağu 2024 Sal, 08:52 tarihinde şunu
yazdı:

The problem is that giving an ANALYZE command targeting a partitioned table
causes it to update statistics for the partitioned table AND all the
individual
partitions. There is currently no option to prevent it from including the
partitions.

This is wasteful for our application: for one thing the autovacuum
has already analyzed the individual partitions; for another most of
the partitions
will have had no changes, so they don't need to be analyzed repeatedly.

I agree that it's a waste to analyze partitions when they're already
analyzed by autovacuum. It would be nice to have a way to run analyze only
on a partitioned table without its partitions.

I took some measurements when running ANALYZE on one of our tables. It
took approx
4 minutes to analyze the partitioned table, then 29 minutes to analyze the
partitions. We have hundreds of these tables, so the cost is very
significant.

I quickly tweaked the code a bit to exclude partitions when a partitioned
table is being analyzed. I can confirm that there is a significant gain
even on a simple case like a partitioned table with 10 partitions and 1M
rows in each partition.

1. Would such a feature be welcomed? Are there any traps I might not

have thought of?

2. The existing ANALYZE command has the following structure:

ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]

It would be easiest to add ONLY as another option, but that
doesn't look quite
right to me - surely the ONLY should be attached to the table name
An alternative would be:

ANALYZE [ ( option [, ...] ) ] [ONLY] [ table_and_columns [, ...] ]

I feel closer to adding this as an option instead of a new keyword in
ANALYZE grammar. To me, it would be easier to have this option and then
give the names of partitioned tables as opposed to typing ONLY before each
partition table.
But we should think of another name as ONLY is used differently (attached
to the table name as you mentioned) in other contexts.

I've been also thinking about how this new option should affect inheritance
tables. Should it have just no impact on them or only analyze the parent
table without taking child tables into account? There are two records for
an inheritance parent table in pg_statistic, one row for only the parent
table and a second row including children. We might only analyze the parent
table if this new "ONLY" option is specified. I'm not sure if that would be
something users would need or not, but I think this option should behave
similarly for both partitioned tables and inheritance tables.

If we decide to go with only partition tables and not care about
inheritance, then naming this option to SKIP_PARTITIONS as Jelte suggested
sounds fine. But that name wouldn't work if this option will affect
inheritance tables.

Thanks,
--
Melih Mutlu
Microsoft

#6Melih Mutlu
m.melihmutlu@gmail.com
In reply to: Melih Mutlu (#5)
Re: ANALYZE ONLY

Melih Mutlu <m.melihmutlu@gmail.com>, 20 Ağu 2024 Sal, 19:26 tarihinde şunu
yazdı:

Hi Michael,

Thanks for starting this thread. I've also spent a bit time on this after
reading your first thread on this issue [1]

Forgot to add the reference [1]/messages/by-id/CADofcAXVbD0yGp_EaC9chmzsOoSai3jcfBCnyva3j0RRdRvMVA@mail.gmail.com

[1]: /messages/by-id/CADofcAXVbD0yGp_EaC9chmzsOoSai3jcfBCnyva3j0RRdRvMVA@mail.gmail.com
/messages/by-id/CADofcAXVbD0yGp_EaC9chmzsOoSai3jcfBCnyva3j0RRdRvMVA@mail.gmail.com

#7Robert Haas
robertmhaas@gmail.com
In reply to: Michael Harris (#1)
Re: ANALYZE ONLY

On Tue, Aug 20, 2024 at 1:52 AM Michael Harris <harmic@gmail.com> wrote:

2. The existing ANALYZE command has the following structure:

ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]

It would be easiest to add ONLY as another option, but that
doesn't look quite
right to me - surely the ONLY should be attached to the table name?
An alternative would be:

ANALYZE [ ( option [, ...] ) ] [ONLY] [ table_and_columns [, ...] ]

Any feedback or advice would be great.

I like trying to use ONLY somehow.

--
Robert Haas
EDB: http://www.enterprisedb.com

#8David Rowley
dgrowleyml@gmail.com
In reply to: Robert Haas (#7)
Re: ANALYZE ONLY

On Wed, 21 Aug 2024 at 06:41, Robert Haas <robertmhaas@gmail.com> wrote:

I like trying to use ONLY somehow.

Do you mean as an ANALYZE command option, i.e. ANALYZE (only) table;
or as a table modifier like gram.y's extended_relation_expr?

Making it a command option means that the option would apply to all
tables listed, whereas if it was more like an extended_relation_expr,
the option would be applied per table listed in the command.

1. ANALYZE ONLY ptab, ptab2; -- gather stats on ptab but not on its
partitions but get stats on ptab2 and stats on its partitions too.
2. ANALYZE ONLY ptab, ONLY ptab2; -- gather stats on ptab and ptab2
without doing that on any of their partitions.

Whereas: "ANALYZE (ONLY) ptab, ptab2;" would always give you the
behaviour of #2.

If we did it as a per-table option, then we'd need to consider what
should happen if someone did: "VACUUM ONLY parttab;". Probably
silently doing nothing wouldn't be good. Maybe a warning, akin to
what's done in:

postgres=# analyze (skip_locked) a;
WARNING: skipping analyze of "a" --- lock not available

David

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#8)
Re: ANALYZE ONLY

David Rowley <dgrowleyml@gmail.com> writes:

On Wed, 21 Aug 2024 at 06:41, Robert Haas <robertmhaas@gmail.com> wrote:

I like trying to use ONLY somehow.

Do you mean as an ANALYZE command option, i.e. ANALYZE (only) table;
or as a table modifier like gram.y's extended_relation_expr?

Making it a command option means that the option would apply to all
tables listed, whereas if it was more like an extended_relation_expr,
the option would be applied per table listed in the command.

1. ANALYZE ONLY ptab, ptab2; -- gather stats on ptab but not on its
partitions but get stats on ptab2 and stats on its partitions too.
2. ANALYZE ONLY ptab, ONLY ptab2; -- gather stats on ptab and ptab2
without doing that on any of their partitions.

FWIW, I think that's the right approach, for consistency with the
way that ONLY works in DML.

regards, tom lane

#10Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#8)
Re: ANALYZE ONLY

On Tue, Aug 20, 2024 at 6:53 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Wed, 21 Aug 2024 at 06:41, Robert Haas <robertmhaas@gmail.com> wrote:

I like trying to use ONLY somehow.

Do you mean as an ANALYZE command option, i.e. ANALYZE (only) table;
or as a table modifier like gram.y's extended_relation_expr?

The table modifier idea seems nice to me.

If we did it as a per-table option, then we'd need to consider what
should happen if someone did: "VACUUM ONLY parttab;". Probably
silently doing nothing wouldn't be good. Maybe a warning, akin to
what's done in:

postgres=# analyze (skip_locked) a;
WARNING: skipping analyze of "a" --- lock not available

Perhaps. I'm not sure about this part.

--
Robert Haas
EDB: http://www.enterprisedb.com

#11Melih Mutlu
m.melihmutlu@gmail.com
In reply to: David Rowley (#8)
Re: ANALYZE ONLY

David Rowley <dgrowleyml@gmail.com>, 21 Ağu 2024 Çar, 01:53 tarihinde şunu
yazdı:

On Wed, 21 Aug 2024 at 06:41, Robert Haas <robertmhaas@gmail.com> wrote:

I like trying to use ONLY somehow.

Do you mean as an ANALYZE command option, i.e. ANALYZE (only) table;
or as a table modifier like gram.y's extended_relation_expr?

Making it a command option means that the option would apply to all
tables listed, whereas if it was more like an extended_relation_expr,
the option would be applied per table listed in the command.

1. ANALYZE ONLY ptab, ptab2; -- gather stats on ptab but not on its
partitions but get stats on ptab2 and stats on its partitions too.
2. ANALYZE ONLY ptab, ONLY ptab2; -- gather stats on ptab and ptab2
without doing that on any of their partitions.

I believe we should go this route if we want this to be called "ONLY" so
that it would be consistent with other places too.

Whereas: "ANALYZE (ONLY) ptab, ptab2;" would always give you the

behaviour of #2.

Havin it as an option would be easier to use when we have several partition
tables. But I agree that if we call it "ONLY ", it may be confusing and the
other approach would be appropriate.

If we did it as a per-table option, then we'd need to consider what
should happen if someone did: "VACUUM ONLY parttab;". Probably
silently doing nothing wouldn't be good. Maybe a warning, akin to
what's done in:

postgres=# analyze (skip_locked) a;
WARNING: skipping analyze of "a" --- lock not available

+1 to raising a warning message in that case instead of staying silent. We
might also not allow ONLY if ANALYZE is not present in VACUUM query and
raise an error. But that would require changes in grams.y and could
complicate things. So it may not be necessary and we may be fine with just
a warning.

Regards,
--
Melih Mutlu
Microsoft

#12Michael Harris
harmic@gmail.com
In reply to: Melih Mutlu (#11)
Re: ANALYZE ONLY

Thank you all for the replies & discussion.

It sounds like more are in favour of using the ONLY syntax attached to
the tables is the best way to go, with the main advantages being:
- consistency with other commands
- flexibility in allowing to specify whether to include partitions
for individual tables when supplying a list of tables

I will start working on an implementation along those lines. It looks
like we can simply replace qualified_name with relation_expr in the
production for vacuum_relation within gram.y.

One other thing I noticed when reading the code. The function
expand_vacuum_rel in vacuum.c seems to be responsible for adding the
partitions. If I am reading it correctly, it only adds child tables in
the case of a partitioned table, not in the case of an inheritance
parent:

include_parts = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
..
if (include_parts)
{
.. add partitions ..

This is a little different to some other contexts where the ONLY
keyword is used, in that ONLY would be the default and only available
mode of operation for an inheritance parent.

Regards,
Mike

Show quoted text

On Wed, 21 Aug 2024 at 20:04, Melih Mutlu <m.melihmutlu@gmail.com> wrote:

David Rowley <dgrowleyml@gmail.com>, 21 Ağu 2024 Çar, 01:53 tarihinde şunu yazdı:

On Wed, 21 Aug 2024 at 06:41, Robert Haas <robertmhaas@gmail.com> wrote:

I like trying to use ONLY somehow.

Do you mean as an ANALYZE command option, i.e. ANALYZE (only) table;
or as a table modifier like gram.y's extended_relation_expr?

Making it a command option means that the option would apply to all
tables listed, whereas if it was more like an extended_relation_expr,
the option would be applied per table listed in the command.

1. ANALYZE ONLY ptab, ptab2; -- gather stats on ptab but not on its
partitions but get stats on ptab2 and stats on its partitions too.
2. ANALYZE ONLY ptab, ONLY ptab2; -- gather stats on ptab and ptab2
without doing that on any of their partitions.

I believe we should go this route if we want this to be called "ONLY" so that it would be consistent with other places too.

Whereas: "ANALYZE (ONLY) ptab, ptab2;" would always give you the
behaviour of #2.

Havin it as an option would be easier to use when we have several partition tables. But I agree that if we call it "ONLY ", it may be confusing and the other approach would be appropriate.

If we did it as a per-table option, then we'd need to consider what
should happen if someone did: "VACUUM ONLY parttab;". Probably
silently doing nothing wouldn't be good. Maybe a warning, akin to
what's done in:

postgres=# analyze (skip_locked) a;
WARNING: skipping analyze of "a" --- lock not available

+1 to raising a warning message in that case instead of staying silent. We might also not allow ONLY if ANALYZE is not present in VACUUM query and raise an error. But that would require changes in grams.y and could complicate things. So it may not be necessary and we may be fine with just a warning.

Regards,
--
Melih Mutlu
Microsoft

#13David Rowley
dgrowleyml@gmail.com
In reply to: Michael Harris (#12)
Re: ANALYZE ONLY

On Thu, 22 Aug 2024 at 11:32, Michael Harris <harmic@gmail.com> wrote:

One other thing I noticed when reading the code. The function
expand_vacuum_rel in vacuum.c seems to be responsible for adding the
partitions. If I am reading it correctly, it only adds child tables in
the case of a partitioned table, not in the case of an inheritance
parent:

include_parts = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
..
if (include_parts)
{
.. add partitions ..

This is a little different to some other contexts where the ONLY
keyword is used, in that ONLY would be the default and only available
mode of operation for an inheritance parent.

That's inconvenient and quite long-established behaviour. I had a look
as far back as 9.2 and we only analyze parents there too. I'm keen on
the ONLY syntax, but it would be strange if ONLY did the same thing as
not using ONLY for inheritance parents.

I feel like we might need to either bite the bullet and make ONLY work
consistently with both, or think of another way to have ANALYZE not
recursively gather stats for each partition on partitioned tables.
Could we possibly get away with changing inheritance parent behaviour?

David

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#13)
Re: ANALYZE ONLY

David Rowley <dgrowleyml@gmail.com> writes:

I feel like we might need to either bite the bullet and make ONLY work
consistently with both, or think of another way to have ANALYZE not
recursively gather stats for each partition on partitioned tables.
Could we possibly get away with changing inheritance parent behaviour?

Yeah, my thought was to change the behavior so it's consistent in
that case too. This doesn't seem to me like a change that'd
really cause anybody serious problems: ANALYZE without ONLY would
do more than before, but it's work you probably want done anyway.

regards, tom lane

#15Michael Harris
harmic@gmail.com
In reply to: Tom Lane (#14)
Re: ANALYZE ONLY

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah, my thought was to change the behavior so it's consistent in
that case too. This doesn't seem to me like a change that'd
really cause anybody serious problems: ANALYZE without ONLY would
do more than before, but it's work you probably want done anyway.

Would we want to apply that change to VACUUM too? That might be a
bit drastic, especially if you had a multi-level inheritance structure featuring
large tables.

It feels a bit like VACUUM and ANALYZE have opposite natural defaults here.
For VACUUM it does not make much sense to vacuum only at the partitioned
table level and not include the partitions, since it would do nothing
- that might
be why the existing code always adds the partitions.

#16David Rowley
dgrowleyml@gmail.com
In reply to: Michael Harris (#15)
Re: ANALYZE ONLY

On Thu, 22 Aug 2024 at 13:38, Michael Harris <harmic@gmail.com> wrote:

Would we want to apply that change to VACUUM too? That might be a
bit drastic, especially if you had a multi-level inheritance structure featuring
large tables.

I think they'd need to work the same way as for "VACUUM (ANALYZE)", it
would be strange to analyze some tables that you didn't vacuum. It's
just a much bigger pill to swallow in terms of the additional effort.

It feels a bit like VACUUM and ANALYZE have opposite natural defaults here.
For VACUUM it does not make much sense to vacuum only at the partitioned
table level and not include the partitions, since it would do nothing
- that might
be why the existing code always adds the partitions.

Yeah, I suspect that's exactly why it was coded that way.

David

#17Michael Harris
harmic@gmail.com
In reply to: David Rowley (#16)
Re: ANALYZE ONLY

Hi All,

Here is a first draft of a patch to implement the ONLY option for
VACUUM and ANALYZE.

I'm a little nervous about the implications of changing the behaviour of VACUUM
for inheritance structures; I can imagine people having regularly
executed scripts
that currently vacuum all the tables in their inheritance structure;
after this change
they might get more vacuuming than they bargained for.

It's my first attempt to submit a patch here so apologies if I've
missed any part
of the process.

Cheers
Mike

Show quoted text

On Thu, 22 Aug 2024 at 12:09, David Rowley <dgrowleyml@gmail.com> wrote:

On Thu, 22 Aug 2024 at 13:38, Michael Harris <harmic@gmail.com> wrote:

Would we want to apply that change to VACUUM too? That might be a
bit drastic, especially if you had a multi-level inheritance structure featuring
large tables.

I think they'd need to work the same way as for "VACUUM (ANALYZE)", it
would be strange to analyze some tables that you didn't vacuum. It's
just a much bigger pill to swallow in terms of the additional effort.

It feels a bit like VACUUM and ANALYZE have opposite natural defaults here.
For VACUUM it does not make much sense to vacuum only at the partitioned
table level and not include the partitions, since it would do nothing
- that might
be why the existing code always adds the partitions.

Yeah, I suspect that's exactly why it was coded that way.

David

Attachments:

0001-Initial-implementation-of-the-ONLY-feature.patchapplication/octet-stream; name=0001-Initial-implementation-of-the-ONLY-feature.patchDownload+140-17
#18Melih Mutlu
m.melihmutlu@gmail.com
In reply to: Michael Harris (#17)
Re: ANALYZE ONLY

Hi Michael,

Thanks for the patch.
I quickly tried running some ANALYZE ONLY queries, it seems like it works
fine.

-ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] )

] [ <replaceable class="parameter">table_and_columns</replaceable> [, ...] ]
+ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] )
] [ [ ONLY ] <replaceable class="parameter">table_and_columns</replaceable>
[, ...] ]

It seems like extended_relation_expr allows "tablename *" syntax too. That
should be added in docs as well. (Same for VACUUM doc)

<para>

For partitioned tables, <command>ANALYZE</command> gathers statistics
by
sampling rows from all partitions; in addition, it will recurse into
each
partition and update its statistics. Each leaf partition is analyzed
only
once, even with multi-level partitioning. No statistics are collected
for
only the parent table (without data from its partitions), because with
partitioning it's guaranteed to be empty.
</para>

We may also want to update the above note in ANALYZE doc.

+-- ANALYZE ONLY / VACUUM ONLY on partitioned table

+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF vacparted FOR VALUES IN (1);

+INSERT INTO only_parted1 VALUES (1, 'a');

Tests don't seem right to me.
I believe the above should be " PARTITION OF only_parted " instead of
vacparted.
It may better to insert into partitioned table (only_parted) instead of the
partition (only_parted1);

Also it may be a good idea to test VACUUM ONLY for inheritance tables the
same way you test ANALYZE ONLY.

Lastly, the patch includes an unrelated file (compile_flags.txt) and has
whitespace errors when I apply it.

Regards,
--
Melih Mutlu
Microsoft

#19Michael Harris
harmic@gmail.com
In reply to: Melih Mutlu (#18)
Re: ANALYZE ONLY

Thanks for the feedback Melih,

On Thu, 22 Aug 2024 at 21:26, Melih Mutlu <m.melihmutlu@gmail.com> wrote:

It seems like extended_relation_expr allows "tablename *" syntax too. That should be added in docs as well. (Same for VACUUM doc)

I had included it in the parameter description but had missed it from
the synopsis. I've fixed that now.

We may also want to update the above note in ANALYZE doc.

Well spotted. I have updated that now.

Tests don't seem right to me.
I believe the above should be " PARTITION OF only_parted " instead of vacparted.
It may better to insert into partitioned table (only_parted) instead of the partition (only_parted1);

Also it may be a good idea to test VACUUM ONLY for inheritance tables the same way you test ANALYZE ONLY.

Well spotted again. I have fixed the incorrect table names and added
tests as suggested.

Lastly, the patch includes an unrelated file (compile_flags.txt) and has whitespace errors when I apply it.

Oops! Fixed,

V2 of the patch is attached.

Cheers
Mike

Attachments:

v2-0001-Initial-implementation-of-the-ONLY-feature.patchapplication/octet-stream; name=v2-0001-Initial-implementation-of-the-ONLY-feature.patchDownload+176-24
#20Melih Mutlu
m.melihmutlu@gmail.com
In reply to: Michael Harris (#19)
Re: ANALYZE ONLY

Hi Michael,

Michael Harris <harmic@gmail.com>, 23 Ağu 2024 Cum, 13:01 tarihinde şunu
yazdı:

V2 of the patch is attached.

Thanks for updating the patch. I have a few more minor feedbacks.

-ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] )

] [ <replaceable class="parameter">table_and_columns</replaceable> [, ...] ]
+ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] )
] [ [ ONLY ] <replaceable class="parameter">table_and_columns</replaceable>
[, ...] ]

I believe moving "[ ONLY ]" to the definitions of table_and_columns below,
as you did with "[ * ]", might be better to be consistent with other places
(see [1]https://www.postgresql.org/docs/16/sql-createpublication.html)

+ if ((options & VACOPT_VACUUM) && is_partitioned_table && ! i

nclude_children)

There are also some issues with coding conventions in some places (e.g. the
space between "!" and "include_children" abode). I think running pgindent
would resolve such issue in most places.

[1]: https://www.postgresql.org/docs/16/sql-createpublication.html

Regards,
--
Melih Mutlu
Microsoft

#21torikoshia
torikoshia@oss.nttdata.com
In reply to: Michael Harris (#19)
#22jian he
jian.universality@gmail.com
In reply to: Melih Mutlu (#20)
#23David Rowley
dgrowleyml@gmail.com
In reply to: Michael Harris (#19)
#24Michael Harris
harmic@gmail.com
In reply to: torikoshia (#21)
#25Michael Harris
harmic@gmail.com
In reply to: David Rowley (#23)
#26David Rowley
dgrowleyml@gmail.com
In reply to: Michael Harris (#25)
#27torikoshia
torikoshia@oss.nttdata.com
In reply to: Michael Harris (#24)
#28David Rowley
dgrowleyml@gmail.com
In reply to: Michael Harris (#24)
#29Michael Harris
harmic@gmail.com
In reply to: David Rowley (#28)
#30torikoshia
torikoshia@oss.nttdata.com
In reply to: Michael Harris (#29)
#31Michael Harris
harmic@gmail.com
In reply to: torikoshia (#30)
#32torikoshia
torikoshia@oss.nttdata.com
In reply to: Michael Harris (#31)
#33jian he
jian.universality@gmail.com
In reply to: Michael Harris (#31)
#34Michael Harris
harmic@gmail.com
In reply to: jian he (#33)
#35David Rowley
dgrowleyml@gmail.com
In reply to: Michael Harris (#34)
#36Michael Harris
harmic@gmail.com
In reply to: David Rowley (#35)
#37jian he
jian.universality@gmail.com
In reply to: David Rowley (#35)
#38David Rowley
dgrowleyml@gmail.com
In reply to: jian he (#37)
#39jian he
jian.universality@gmail.com
In reply to: David Rowley (#38)
#40David Rowley
dgrowleyml@gmail.com
In reply to: jian he (#39)
#41jian he
jian.universality@gmail.com
In reply to: David Rowley (#40)
#42David Rowley
dgrowleyml@gmail.com
In reply to: jian he (#41)
#43jian he
jian.universality@gmail.com
In reply to: David Rowley (#42)
#44David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#40)