query optimizer questions

Started by Robert Bergeralmost 25 years ago8 messagesgeneral

rwb@vtiscan.com

almost 25 years ago

Why do the following three queries use different scan types under 7.1.2?
Under
6.5.3 they all used index (and my system ran MUCH faster, since I often
use the trailing %)

I've done a vacuum analyze.
I installed 7.1.2 on RedHat 6.2 using the RPMs.

anderson=# explain select value from s_n_grantor where value='foobar';
NOTICE: QUERY PLAN:

Index Scan using s_n_grantor_key on s_n_grantor (cost=0.00..823.58
rows=247 width=12)

EXPLAIN
anderson=# explain select value from s_n_grantor where value like
'foobar';
NOTICE: QUERY PLAN:

Seq Scan on s_n_grantor (cost=0.00..8178.46 rows=247 width=12)

anderson=# explain select value from s_n_grantor where value like
'foobar%';
NOTICE: QUERY PLAN:

Seq Scan on s_n_grantor (cost=0.00..8178.46 rows=1 width=12)

EXPLAIN
anderson=# \d s_n_grantor_key
Index "s_n_grantor_key"
Attribute | Type
-----------+-----------------------
value | character varying(80)
btree

Peter Eisentraut

peter_e@gmx.net

almost 25 years ago

In reply to: Robert Berger (#1)

Re: query optimizer questions

Robert Berger writes:

Why do the following three queries use different scan types under 7.1.2?
Under
6.5.3 they all used index (and my system ran MUCH faster, since I often
use the trailing %)

In 7.1, LIKE optimization (i.e., possibly using an index) has been
disabled unless you are in the C locale because it does not work
correctly. You could set your locale to C and re-initdb to get it to work
again.

--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter

Robert Berger

rwb@vtiscan.com

almost 25 years ago

In reply to: Peter Eisentraut (#2)

Re: query optimizer questions

Thanks. How do I set the locale?

Show quoted text

In 7.1, LIKE optimization (i.e., possibly using an index) has been
disabled unless you are in the C locale because it does not work
correctly. You could set your locale to C and re-initdb to get it to work
again.

--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Tom Lane

tgl@sss.pgh.pa.us

almost 25 years ago

In reply to: Robert Berger (#1)

Re: query optimizer questions

Robert Berger <rwb@vtiscan.com> writes:

Why do the following three queries use different scan types under 7.1.2?

As Peter points out, index optimization of LIKE queries is disabled
unless your database is in C locale (if you're not sure,
contrib/pg_controldata can tell you what locale your database is using).

However, the estimated row counts seem peculiar to me in any case:

anderson=# explain select value from s_n_grantor where value='foobar';
NOTICE: QUERY PLAN:

Index Scan using s_n_grantor_key on s_n_grantor (cost=0.00..823.58
rows=247 width=12)

EXPLAIN
anderson=# explain select value from s_n_grantor where value like
'foobar';
NOTICE: QUERY PLAN:

Seq Scan on s_n_grantor (cost=0.00..8178.46 rows=247 width=12)

anderson=# explain select value from s_n_grantor where value like
'foobar%';
NOTICE: QUERY PLAN:

Seq Scan on s_n_grantor (cost=0.00..8178.46 rows=1 width=12)

I'd expect the LIKE 'foobar' to produce the same estimate as = 'foobar',
because the LIKE estimator recognizes this as a fixed pattern. But why
is that considered *less* selective than LIKE 'foobar%'? Something
fishy about that. Could we see the output of

select attname,attdispersion,s.*
from pg_statistic s, pg_attribute a, pg_class c
where starelid = c.oid and attrelid = c.oid and staattnum = attnum
and relname = 's_n_grantor';

(If you get no output, it means you didn't VACUUM ANALYZE ...)

regards, tom lane

Allan Engelhardt

allane@cybaea.com

almost 25 years ago

In reply to: Tom Lane (#4)

Re: query optimizer questions

Thanks. How do I set the locale?

Short answer:

bash$ LANG=C initdb

Longer answer:

I find locale support in PostgreSQL confusing. The locale of the current user at the time of initialising the database determines the collating order (e.g. is 'a' < 'A'). This in turns affect things like indexing (as described below). The locale of the account that is running the current postmaster defines format conversions (e.g. the currency symbol in to_char(1,'L99')). The locale of the current process requesting a query defines how the user expects to see the result, and is (of course) ignored by PostgreSQL.

To make matters worse: On Red Hat 7.1 (US boxed set) the default locale is set in /etc/sysconf/i18n to en_US, i.e. American English. Fine, except some Americans may be surprised to learn that 'A' > 'a'. When you start the postgresql service without doing an initdb first, the script will helpfully copy i18n to ~postgres/initdb.i18n so you have a record of the locale (and can re-set it). Except it never uses this file. In fact, the postmaster service runs in the default locale *for root*. A previous version of RH did go through the trouble of sourcing initdb.i18n in .bash_profile, but it does nothing for postmaster which is started as su -l postgres -s /bin/sh ...

In order to start postmaster on RH7.1 with a locale that is different from the default (actually: different from root) you should create ~postgres/.profile with the content

PGLIB=/usr/lib/pgsql
PGDATA=/var/lib/pgsql/data
export PGLIB PGDATA
[ -f $PGDATA/../initdb.i18n ] && source $PGDATA/../initdb.i18n

You may, of course, have to change the path. You may want to add that last line to your ~postgres/.bash_profile as well, in order to avoid future confusion....

If you dont have initdb.i18n, just delete the content of the data directory and start the postgres service. The result should look something like

LANG="en_US"
export LANG LC_ALL LC_CTYPE LC_COLLATE LC_NUMERIC LC_CTYPE LC_TIME

for a default installation, but of course you want LANG="C"

Hope this makes sense to somebody....e-mail if I've managed to confuse anybody, including myself :-)

Allan.

Import Notes

Reference msg id not found: 200107032210.f63MAwa07548@postgresql.org | Resolved by subject fallback

Peter Eisentraut

peter_e@gmx.net

almost 25 years ago

In reply to: Allan Engelhardt (#5)

Re: Re: query optimizer questions

Allan Engelhardt writes:

I find locale support in PostgreSQL confusing. The locale of the
current user at the time of initialising the database determines the
collating order (e.g. is 'a' < 'A'). This in turns affect things like
indexing (as described below).

This is because if you were allowed to change this setting after the
database initialization, your indexes would get corrupted. (Recall that
(some) indexes are sorted, and the index scanning code relies on indexes
being sorted a certain way.)

The locale of the account that is running the current postmaster
defines format conversions (e.g. the currency symbol in
to_char(1,'L99')).

This is a courtesy because the above contraints do not apply to this
locale category. I agree it's confusing, but I'm not sure what can be
done about it.

The locale of the current process requesting a query defines how the
user expects to see the result, and is (of course) ignored by
PostgreSQL.

I guess it would make some sense if the client would pass along its locale
settings (in the allowable categories) to the server.

To make matters worse: On Red Hat 7.1 (US boxed set) the default
locale is set in /etc/sysconf/i18n to en_US, i.e. American English.
Fine, except some Americans may be surprised to learn that 'A' > 'a'.

What's so surprising about that? Yeah, ASCII order, blah. Remember that
computers are for humans, not for computers.

Btw., actually the rules are more complicated: ab < Ab < ac

--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter

Allan Engelhardt

allane@cybaea.com

almost 25 years ago

In reply to: Peter Eisentraut (#6)

Re: query optimizer questions

Peter Eisentraut wrote:

Allan Engelhardt writes:

I find locale support in PostgreSQL confusing. The locale of the
current user at the time of initialising the database determines the
collating order (e.g. is 'a' < 'A'). This in turns affect things like
indexing (as described below).

This is because if you were allowed to change this setting after the
database initialization, your indexes would get corrupted. (Recall that
(some) indexes are sorted, and the index scanning code relies on indexes
being sorted a certain way.)

Yes, I understand WHY it is, I just wish it was documented somewhere obvious. But nevermind - that what these mailing lists are for!

Thanks, all.

Allan.

Robert Berger

bob@bberger.net

almost 25 years ago

In reply to: Peter Eisentraut (#6)

Re: Re: query optimizer questions

This is because if you were allowed to change this setting after the
database initialization

Why can't it be set at database creation time, so different databases
can have different locales? Why is it a global per-server setting?