Statistics on a table

Started by Maksim Likharevalmost 23 years ago4 messagesgeneral
Jump to latest
#1Maksim Likharev
mlikharev@aurigin.com

Hi,
I just found very interesting situation,
statistic ( n_distinct in particular) records for one of my columns,
greatly under calculated it saying:

49726, but in reality 33409816.

So planer never choose index but rather using table scan, and query
never returns,
is it any way how I can improve that?

I can turn seqscan off but is it safe?

#2Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Maksim Likharev (#1)
Re: Statistics on a table

On Tue, 8 Jul 2003, Maksim Likharev wrote:

Hi,
I just found very interesting situation,
statistic ( n_distinct in particular) records for one of my columns,
greatly under calculated it saying:

49726, but in reality 33409816.

So planer never choose index but rather using table scan, and query
never returns,
is it any way how I can improve that?

You might see if raising the statistics target with ALTER TABLE ALTER
COLUMN followed by an analyze helps getting reasonable values any. Maybe
try a few thousand?

I can turn seqscan off but is it safe?

It's somewhat of a large hammer, especially if you're doing joins with
this table or something.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephan Szabo (#2)
Re: Statistics on a table

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

statistic ( n_distinct in particular) records for one of my columns,
greatly under calculated it saying:
49726, but in reality 33409816.
is it any way how I can improve that?

You might see if raising the statistics target with ALTER TABLE ALTER
COLUMN followed by an analyze helps getting reasonable values any. Maybe
try a few thousand?

That seems like a big jump, considering the default is only 10. Try
setting it to 100 (then re-ANALYZE, then check the plan). If that
doesn't work, try more. I'd be interested to see what it takes to
get the estimate closer to reality.

regards, tom lane

#4Maksim Likharev
mlikharev@aurigin.com
In reply to: Tom Lane (#3)
Re: Statistics on a table

Tried 100 did not help, it seems like n_distinct grows 10000 per 10
statistic,
in my case to be close to reality I have to put 1000.
Is there any complications of that, I mean growing statistic depth other
then
disk space?

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, July 08, 2003 11:33 PM
To: Stephan Szabo
Cc: Maksim Likharev; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Statistics on a table

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

statistic ( n_distinct in particular) records for one of my columns,
greatly under calculated it saying:
49726, but in reality 33409816.
is it any way how I can improve that?

You might see if raising the statistics target with ALTER TABLE ALTER
COLUMN followed by an analyze helps getting reasonable values any.

Maybe

try a few thousand?

That seems like a big jump, considering the default is only 10. Try
setting it to 100 (then re-ANALYZE, then check the plan). If that
doesn't work, try more. I'd be interested to see what it takes to
get the estimate closer to reality.

regards, tom lane