BUG #13790: last row of limit/offset result produces duplicates
The following bug has been logged on the website:
Bug reference: 13790
Logged by: Peter Belbin
Email address: pbelbin@gmail.com
PostgreSQL version: 9.4.5
Operating system: Mac
Description:
the following table:
CREATE TABLE demo1
(
id serial NOT NULL,
code character varying(20),
match character varying(20),
CONSTRAINT demo1_pk PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
populated with rows where many of the 'code' rows contain the same value,
and then queried with something like:
select * from proxy_homing order by code limit 10 offset 40
is returning the exact same result in the last row, regardless of the
offset.
here is some sample data:
insert into demo1 (code, match) values ('AAAA', '203' );
insert into demo1 (code, match) values ('BBBB', '204' );
insert into demo1 (code, match) values ('CCCC', '206' );
insert into demo1 (code, match) values ('DDDD', '207' );
insert into demo1 (code, match) values ('EEEE', '208' );
insert into demo1 (code, match) values ('FFFF', '212' );
insert into demo1 (code, match) values ('CHCG', '215' );
insert into demo1 (code, match) values ('CHCG', '216' );
insert into demo1 (code, match) values ('CHCG', '217' );
insert into demo1 (code, match) values ('CHCG', '218' );
insert into demo1 (code, match) values ('CHCG', '219' );
insert into demo1 (code, match) values ('CHCG', '224' );
insert into demo1 (code, match) values ('CHCG', '226' );
insert into demo1 (code, match) values ('CHCG', '231' );
insert into demo1 (code, match) values ('CHCG', '234' );
insert into demo1 (code, match) values ('CHCG', '240' );
insert into demo1 (code, match) values ('CHCG', '248' );
insert into demo1 (code, match) values ('CHCG', '250' );
insert into demo1 (code, match) values ('CHCG', '253' );
insert into demo1 (code, match) values ('CHCG', '260' );
insert into demo1 (code, match) values ('CHCG', '262' );
insert into demo1 (code, match) values ('CHCG', '267' );
insert into demo1 (code, match) values ('CHCG', '269' );
insert into demo1 (code, match) values ('CHCG', '270' );
insert into demo1 (code, match) values ('CHCG', '276' );
insert into demo1 (code, match) values ('CHCG', '289' );
insert into demo1 (code, match) values ('CHCG', '301' );
insert into demo1 (code, match) values ('CHCG', '302' );
insert into demo1 (code, match) values ('CHCG', '304' );
insert into demo1 (code, match) values ('CHCG', '306' );
insert into demo1 (code, match) values ('CHCG', '307' );
insert into demo1 (code, match) values ('CHCG', '308' );
insert into demo1 (code, match) values ('CHCG', '309' );
insert into demo1 (code, match) values ('CHCG', '312' );
insert into demo1 (code, match) values ('CHCG', '313' );
insert into demo1 (code, match) values ('CHCG', '314' );
insert into demo1 (code, match) values ('CHCG', '315' );
insert into demo1 (code, match) values ('CHCG', '317' );
insert into demo1 (code, match) values ('CHCG', '319' );
insert into demo1 (code, match) values ('CHCG', '320' );
insert into demo1 (code, match) values ('CHCG', '330' );
insert into demo1 (code, match) values ('CHCG', '331' );
insert into demo1 (code, match) values ('CHCG', '339' );
insert into demo1 (code, match) values ('CHCG', '347' );
insert into demo1 (code, match) values ('CHCG', '351' );
insert into demo1 (code, match) values ('CHCG', '360' );
insert into demo1 (code, match) values ('CHCG', '401' );
the last result row should not appear more than once in the output! but it
does!
eg:
select * from demo1 order by code limit 5 offset 10
18;CHCG;240
19;CHCG;248
20;CHCG;250
14;CHCG;224
10;CHCG;216
and then:
select * from demo1 order by code limit 5 offset 20
28;CHCG;289
29;CHCG;301
30;CHCG;302
14;CHCG;224
10;CHCG;216
the last row of the two result sets above is the same row!
changing the sorting so that it includes the match column appears to avoid
the issue, but, this is a bug. each row should only appear once if the
limit/offset values are looking at different portions of the result that
would be produced without the limit/offset options.
regards,
peter
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Wed, Dec 2, 2015 at 2:02 PM, <pbelbin@gmail.com> wrote:
select * from demo1 order by code limit 5 offset 10
18;CHCG;240
19;CHCG;248
20;CHCG;250
14;CHCG;224
10;CHCG;216and then:
select * from demo1 order by code limit 5 offset 20
28;CHCG;289
29;CHCG;301
30;CHCG;302
14;CHCG;224
10;CHCG;216the last row of the two result sets above is the same row!
changing the sorting so that it includes the match column appears to avoid
the issue, but, this is a bug. each row should only appear once if the
limit/offset values are looking at different portions of the result that
would be produced without the limit/offset options
.
I'm sorry but no, that is not how this works. The two queries are
independent and if you fail to adequately specify the ORDER BY the fault is
yours, not the system's.
David J.
pbelbin@gmail.com writes:
... populated with rows where many of the 'code' rows contain the same value,
and then queried with something like:
select * from proxy_homing order by code limit 10 offset 40
is returning the exact same result in the last row, regardless of the
offset.
Sorry, this is not a bug. If you have an underspecified ORDER BY
ordering, the sorting code is entitled to return equal-keyed rows
in any order whatsoever, and there is no guarantee that changing
the limit/offset parameters won't affect that.
changing the sorting so that it includes the match column appears to avoid
the issue, but, this is a bug. each row should only appear once if the
limit/offset values are looking at different portions of the result that
would be produced without the limit/offset options.
We do not make any guarantee that that is how limit/offset works. You
will only get consistent results across queries if the underlying result
ordering is guaranteed consistent across queries, but that's not the case
in this example. Postgres would be within its rights to return different
rowsets on different executions even without any change in limit/offset,
and indeed it can do so in some cases.
(The technical reason why this happens in this particular case is that
when using a bounded heap to do a top-N sort, which of the equal-keyed
tuples survive to the end of the sort depends heavily on the exact heap
size, ie the LIMIT+OFFSET sum; and the order in which the survivors end
up getting output is also dependent on the exact heap size. But there
are other mechanisms that could cause the results to be unstable.)
regards, tom lane
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs