alter table xxx alter column yyy set (n_distinct=nnnn );

Started by Tefft, Michael Jabout 2 years ago6 messagesgeneral
Jump to latest
#1Tefft, Michael J
Michael.J.Tefft@snapon.com

I have been struggling to set n_distinct on a few table columns, and confirm that my changes have actually been accepted.

I have a 400-million row table with 81 partitions. PostgreSQL version is 14.11.
Column p_id has 13 million distinct values but pg_stats says n_distinct is only 82k.
Column pi_id has 18 million distinct values but pg_stats says n_distinct is only 29k.
Column i_id has 70k distinct values but pg_stats says n_distinct is only 1800.

I tried:
alter table xxx alter column p_id set (n_distinct=-0.033 );
alter table xxx alter column pi_id set (n_distinct=-0.045 );
alter table xxx alter column i_id set (n_distinct=-0.0002 );

All ran without error, but pg_stats shows no changes. I tried:

alter table xxx alter column p_id set (n_distinct=13118955 );
alter table xxx alter column pi_id set (n_distinct=18059179 );
alter table xxx alter column i_id set (n_distinct=69911 );

Again, all ran without error, but pg_stats shows no changes.

I saw somewhere (but can't locate the reference today) that setting n_distinct takes effect after the next ANALYZE. I tried ANALYZE xxx and VACUUM ANALYZE xxx but the settings never appeared to take effect.

I tried increasing "statistics" on the columns to 1000 and running ANALYZE; this produced somewhat-higher n_distinct values but still far from accurate and my manually-set values still did not appear.

How can I get these values to take effect?

Thanks,
Mike Tefft

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Tefft, Michael J (#1)
Re: alter table xxx alter column yyy set (n_distinct=nnnn );

On Mon, 2024-03-11 at 12:36 +0000, Tefft, Michael J wrote:

I have been struggling to set n_distinct on a few table columns, and confirm that my changes have actually been accepted.
 
I have a 400-million row table with 81 partitions. PostgreSQL version is 14.11.
Column p_id has 13 million distinct values but pg_stats says n_distinct is only 82k.
Column pi_id has 18 million distinct values but pg_stats says n_distinct is only 29k.
Column i_id has 70k distinct values but pg_stats says n_distinct is only 1800.
 
I tried:
alter table xxx alter column p_id set (n_distinct=-0.033 );
alter table xxx alter column pi_id set (n_distinct=-0.045 );
alter table xxx alter column i_id set (n_distinct=-0.0002 );
 
All ran without error, but pg_stats shows no changes. I tried:
 
alter table xxx alter column p_id set (n_distinct=13118955 );
alter table xxx alter column pi_id set (n_distinct=18059179 );
alter table xxx alter column i_id set (n_distinct=69911 );
 
Again, all ran without error, but pg_stats shows no changes.
 
I saw somewhere (but can’t locate the reference today) that setting n_distinct takes effect after the next ANALYZE. I tried ANALYZE xxx and VACUUM ANALYZE xxx but the settings never appeared to take effect.
 
I tried increasing “statistics” on the columns to 1000 and running ANALYZE; this produced somewhat-higher n_distinct values but still far from accurate and my manually-set values still did not appear.
 
How can I get these values to take effect?

If it is a partitioned table, set "n_distinct_inherited" on the column,
then run ANALYZE.

Yours,
Laurenz Albe

#3Tefft, Michael J
Michael.J.Tefft@snapon.com
In reply to: Laurenz Albe (#2)
RE: alter table xxx alter column yyy set (n_distinct=nnnn );

Thanks very much for the reply.
I tried setting

alter table ctrg.xxx alter column p_id set (n_distinct_inherited=-0.033 );
alter table ctrg.xxx alter column pi_id set (n_distinct_inherited=-0.045 );
alter table ctrg.xxx alter column i_id set (n_distinct_inherited=-0.0002 );

I then ran
analyze xxx;
but saw no change in pg_stats.n_distinct for xxx.
Reviewing the values for pg_stats.n_distinct in the partitions of xxx,
- values for i_id are all positive
- values for p_id are a mix - mostly positive, a few negative, but none carry the value I attempted to set
- values for pi_id are all negative, but none carry the value I attempted to set

Thanks,
Mike Tefft

-----Original Message-----
From: Laurenz Albe <laurenz.albe@cybertec.at>
Sent: Monday, March 11, 2024 9:21 AM
To: Tefft, Michael J <Michael.J.Tefft@snapon.com>; pgsql-general@lists.postgresql.org
Subject: Re: alter table xxx alter column yyy set (n_distinct=nnnn );

On Mon, 2024-03-11 at 12:36 +0000, Tefft, Michael J wrote:

I have been struggling to set n_distinct on a few table columns, and confirm that my changes have actually been accepted.
 
I have a 400-million row table with 81 partitions. PostgreSQL version is 14.11.
Column p_id has 13 million distinct values but pg_stats says n_distinct is only 82k.
Column pi_id has 18 million distinct values but pg_stats says n_distinct is only 29k.
Column i_id has 70k distinct values but pg_stats says n_distinct is only 1800.
 
I tried:
alter table xxx alter column p_id set (n_distinct=-0.033 ); alter
table xxx alter column pi_id set (n_distinct=-0.045 ); alter table xxx
alter column i_id set (n_distinct=-0.0002 );
 
All ran without error, but pg_stats shows no changes. I tried:
 
alter table xxx alter column p_id set (n_distinct=13118955 ); alter
table xxx alter column pi_id set (n_distinct=18059179 ); alter table
xxx alter column i_id set (n_distinct=69911 );
 
Again, all ran without error, but pg_stats shows no changes.
 
I saw somewhere (but can’t locate the reference today) that setting n_distinct takes effect after the next ANALYZE. I tried ANALYZE xxx and VACUUM ANALYZE xxx but the settings never appeared to take effect.
 
I tried increasing “statistics” on the columns to 1000 and running ANALYZE; this produced somewhat-higher n_distinct values but still far from accurate and my manually-set values still did not appear.
 
How can I get these values to take effect?

If it is a partitioned table, set "n_distinct_inherited" on the column, then run ANALYZE.

Yours,
Laurenz Albe

#4Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Tefft, Michael J (#3)
Re: alter table xxx alter column yyy set (n_distinct=nnnn );

On Mon, 2024-03-11 at 14:26 +0000, Tefft, Michael J wrote:

I tried setting

alter table ctrg.xxx alter column p_id set (n_distinct_inherited=-0.033 );
alter table ctrg.xxx alter column pi_id set (n_distinct_inherited=-0.045 );
alter table ctrg.xxx alter column i_id set (n_distinct_inherited=-0.0002 );

I then ran
analyze xxx;
but saw no change in pg_stats.n_distinct for xxx.

Well, it works on my PostgreSQL v16.

Yours,
Laurenz Albe

#5Greg Sabino Mullane
greg@turnstep.com
In reply to: Laurenz Albe (#4)
Re: alter table xxx alter column yyy set (n_distinct=nnnn );

Works for me on Postgres 14 as well. Maybe you are looking at the wrong
table? Might help to show the exact steps you typed in.

#6Tefft, Michael J
Michael.J.Tefft@snapon.com
In reply to: Greg Sabino Mullane (#5)
RE: alter table xxx alter column yyy set (n_distinct=nnnn );

OK, thanks to both of you, it is now working for me. Many thanks.

This is what I determined was the missing ingredient from the secret sauce…

One of the things I had tried (before posting here) was
Alter table xxx alter column p_id set statistics [varying values];

Including zero.

When I first tried the suggestion of also setting n_distinct_inherited, these columns were still set to “statistics 0”.

When I altered them back to the default value i.e.
Alter table xxx alter column p_id set statistics -1;
(or, in fact, any nonzero value), the desired n_distinct values appeared after ANALYZE.

So, to set n_distinct for column col1 in a partitioned table sss.xxx:

1. Make sure column col1 has “statistics” != 0
2. alter table sss.xxx alter column p_id set (n_distinct_inherited=[desired_value] );
3. analyze sss.xxx;
4. verify with: select attname, n_distinct from pg_stats where schemaname='sss' and tablename='xxx' and attname = ‘col1’;

So setting n_distinct on the column at the parent/partitioned-table level is not relevant for this.

Thanks again.

Mike Tefft

From: Greg Sabino Mullane <htamfids@gmail.com>
Sent: Monday, March 11, 2024 12:23 PM
To: Laurenz Albe <laurenz.albe@cybertec.at>
Cc: Tefft, Michael J <Michael.J.Tefft@snapon.com>; pgsql-general@lists.postgresql.org
Subject: Re: alter table xxx alter column yyy set (n_distinct=nnnn );

Works for me on Postgres 14 as well. Maybe you are looking at the wrong table? Might help to show the exact steps you typed in. ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍ ‍

Works for me on Postgres 14 as well. Maybe you are looking at the wrong table? Might help to show the exact steps you typed in.