additional message to the bug #7499

Started by Denis Kolesnikover 13 years ago4 messagesbugs
Jump to latest
#1Denis Kolesnik
lirex.software@gmail.com

I have now VERY strong argument to consider it is as a bug:

if there a understandable for SQL language sequence which sorts
in other fashion when adding "LIMIT".

I did try the same with a last name starting with "G" (there also more
than one entry with identical surnames) and it worked ok(the results
were represented as I waited).

this one last example brings me to conseder it is as a bug.

id | str_last_name
-----+----------------------
83 | GXXXXXXXXX
175 | GXXXXXXXXX

and

id | str_last_name
----+----------------------
83 | GXXXXXXXXX
(1 строка)

select id, str_last_name from tbl_owners_individual order by
str_last_name offset 26;

and

select id, str_last_name from tbl_owners_individual order by
str_last_name offset 26 limit 1;

corresponding...

and even sorting by id:
select id, str_last_name from tbl_owners_individual where id in
(83,175,111,1) order by str_last_name;

id | str_last_name
-----+----------------------
83 | GXXXXXXXXX
175 | GXXXXXXXXX
1 | Kolesnik
111 | Kolesnik
(4 строки)

select id, str_last_name from tbl_owners_individual where id in
(83,175,111,1) order by id;

id | str_last_name
-----+----------------------
1 | Kolesnik
83 | GXXXXXXXXX
111 | Kolesnik
175 | GXXXXXXXXX
(4 строки)

anyway sorted by id results the record with the "1" id appear before
the record with the id "111".

#2Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Denis Kolesnik (#1)
Re: additional message to the bug #7499

Denis Kolesnik <lirex.software@gmail.com> wrote:

I have now VERY strong argument to consider it is as a bug:

No, you appear to have very strong feelings about it, but you are
not making an argument that holds water.

if there a understandable for SQL language sequence which sorts
in other fashion when adding "LIMIT".

Each query is evaluated in terms of satisfying what is requested in
that query, and the fastest plan which returns those results is
chosen. If you want results to be generated in a specific order, it
is incumbent on you to specify that in the query -- there is no
"natural order" to rows which is used as a tie-breaker. There are
even optimizations to have one query which is going to scan a table
start at the point that another table scan, already in progress is
at, to prevent extra reads -- so exactly the same query run at about
the same time, with no concurrent database changes can easily return
rows in different orders. That's a feature, not a bug. If you want
them in a particular order, say so, and appropriate index usage or
sorts will be added to the query execution to provide them the way
you ask, even though that is slower than it would be if you didn't
care about the order.

and even sorting by id:
select id, str_last_name from tbl_owners_individual where id in
(83,175,111,1) order by str_last_name;

id | str_last_name
-----+----------------------
83 | GXXXXXXXXX
175 | GXXXXXXXXX
1 | Kolesnik
111 | Kolesnik
(4 ******)

No, it didn't go out of its way to sort that way, it just happened
to fall out that way that time; don't count on it always being that
way, even if it happens many times in a row.

test=# create table tbl_owners_individual
test-# (id int not null primary key, str_last_name text not null);
CREATE TABLE
test=# insert into tbl_owners_individual values
test-# (1, 'Kolesnik'),
test-# (83, 'GXXXXXXXXX'),
test-# (111, 'Kolesnik'),
test-# (175, 'GXXXXXXXXX');
INSERT 0 4
test=# select id, str_last_name from tbl_owners_individual where id
in
test-# (83,175,111,1) order by str_last_name;
id | str_last_name
-----+---------------
83 | GXXXXXXXXX
175 | GXXXXXXXXX
1 | Kolesnik
111 | Kolesnik
(4 rows)

-Kevin

#3Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Kevin Grittner (#2)
Re: additional message to the bug #7499

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:

Denis Kolesnik <lirex.software@gmail.com> wrote:

and even sorting by id:
select id, str_last_name from tbl_owners_individual where id in
(83,175,111,1) order by str_last_name;

id | str_last_name
-----+----------------------
83 | GXXXXXXXXX
175 | GXXXXXXXXX
1 | Kolesnik
111 | Kolesnik
(4 ******)

No, it didn't go out of its way to sort that way, it just happened
to fall out that way that time; don't count on it always being
that way, even if it happens many times in a row.

test=# create table tbl_owners_individual
test-# (id int not null primary key, str_last_name text not
null);
CREATE TABLE
test=# insert into tbl_owners_individual values
test-# (1, 'Kolesnik'),
test-# (83, 'GXXXXXXXXX'),
test-# (111, 'Kolesnik'),
test-# (175, 'GXXXXXXXXX');
INSERT 0 4
test=# select id, str_last_name from tbl_owners_individual
test-# where id in (83,175,111,1) order by str_last_name;
id | str_last_name
-----+---------------
83 | GXXXXXXXXX
175 | GXXXXXXXXX
1 | Kolesnik
111 | Kolesnik
(4 rows)

Dang! I missed the tail of that session, which was the part that
made my point. If you follow the above with a VACUUM ANALYZE and
then run the same query again, you get a different order:

test=# vacuum analyze tbl_owners_individual;
VACUUM
test=# select id, str_last_name from tbl_owners_individual
test-# where id in (83,175,111,1) order by str_last_name;
id | str_last_name
-----+---------------
175 | GXXXXXXXXX
83 | GXXXXXXXXX
111 | Kolesnik
1 | Kolesnik
(4 rows)

With better statistics from the VACUUM ANALYZE it realized that the
index usage was pointless and slower, so it just used a table scan.

-Kevin

#4Craig Ringer
craig@2ndquadrant.com
In reply to: Denis Kolesnik (#1)
Re: additional message to the bug #7499

On 08/22/2012 08:36 AM, Denis Kolesnik wrote:

I have now VERY strong argument to consider it is as a bug:

if there a understandable for SQL language sequence which sorts
in other fashion when adding "LIMIT".

Underspecified sorts can have unstable results, that's allowed by the
spec and is a reasonable performance optimisation. If you want stable
sorts you have to provide a fully defined ordering.

To say this is a bug is a bit like saying that undefined behaviour in C
is a bug. It isn't, the language and compiler are giving you what
they're supposed to, just not what you want.

http://en.wikipedia.org/wiki/Undefined_behavior

Unstable sorts are a common optimisation

http://en.wikipedia.org/wiki/Sorting_algorithm#Stability

and a very useful one. Not allowing Pg to use unstable sorts, or use
different sorts for different query plans, would slow correct queries
down to make an incorrect query like yours run how you expect.

--
Craig Ringer