Why is this SELECT evaluated?

Started by Miles Elamabout 3 years ago3 messagesgeneral

miles.elam@productops.com

about 3 years ago

Postgres v15

Given this example of a conversion from a byte array to an int8
masquerading as an "unsigned" int4

SELECT (get_byte(bytes, byte_offset)::int8 << 24)
| (get_byte(bytes, byte_offset + 1) << 16)
| (get_byte(bytes, byte_offset + 2) << 8)
| (get_byte(bytes, byte_offset + 3))
FROM ( VALUES ('\x010000'::bytea, 0) ) b(bytes, byte_offset)
WHERE length(bytes) >= (4 + byte_offset)
;

Why does this error result?

ERROR: index 3 out of valid range, 0..2
SQL state: 2202E

I was under the impression that if the WHERE clause evaluated to
false, the SELECT clause would not be evaluated. Why is get_byte(...)
ever run in the first place even though length(bytes) is 3?

- Miles Elam

ewie@ewie.name

about 3 years ago

In reply to: Miles Elam (#1)

Re: Why is this SELECT evaluated?

On 28/01/2023 20:29 CET Miles Elam <miles.elam@productops.com> wrote:

Postgres v15

Given this example of a conversion from a byte array to an int8
masquerading as an "unsigned" int4

SELECT (get_byte(bytes, byte_offset)::int8 << 24)
| (get_byte(bytes, byte_offset + 1) << 16)
| (get_byte(bytes, byte_offset + 2) << 8)
| (get_byte(bytes, byte_offset + 3))
FROM ( VALUES ('\x010000'::bytea, 0) ) b(bytes, byte_offset)
WHERE length(bytes) >= (4 + byte_offset)
;

Why does this error result?

ERROR: index 3 out of valid range, 0..2
SQL state: 2202E

I was under the impression that if the WHERE clause evaluated to
false, the SELECT clause would not be evaluated.

Yes, according to https://www.postgresql.org/docs/15/sql-select.html#id-1.9.3.172.7
the WHERE clause is evaluated before the SELECT list.

Why is get_byte(...) ever run in the first place even though length(bytes)
is 3?

Postgres also applies constant folding which can be observed in the execution
plan (I removed the 4th get_byte call). The WHERE clause is always false:

QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=8)
One-Time Filter: false
(2 rows)

And with those constants the SELECT list is evaluated before the statement is
processed in the documented order.

Does the SQL standard say anything about constant folding and when or if it can
be applied? I assume it's just an implementation detail of Postgres. Without
the constant folding I would also expect that query to just return the empty set.

get_byte checks the index at runtime. Adding a fourth byte (index is 0-based)
works as expected with index 3:

test=# select get_byte('\x010203', 3);
ERROR: index 3 out of valid range, 0..2

test=# select get_byte('\x01020304', 3);
get_byte
----------
4
(1 row)

test=# select get_byte('\x01020304', 4);
ERROR: index 4 out of valid range, 0..3

With a random bytea length the query is processed in the expected order and
returns one or zero rows but never raises an error:

SELECT
b,
(get_byte(bytes, byte_offset)::int8 << 24)
| (get_byte(bytes, byte_offset + 1) << 16)
| (get_byte(bytes, byte_offset + 2) << 8)
| (get_byte(bytes, byte_offset + 3))
FROM (
VALUES (substring('\x01020304'::bytea from 1 for (random() * 4)::int), 0)
) b(bytes, byte_offset)
WHERE
length(bytes) >= (4 + byte_offset);

--
Erik

Christophe Pettus

xof@thebuild.com

about 3 years ago

In reply to: Miles Elam (#1)

Re: Why is this SELECT evaluated?

On Jan 28, 2023, at 11:29, Miles Elam <miles.elam@productops.com> wrote:
Why does this error result?

While the standard order of operations is to evaluate the WHERE before the SELECT list, it's not guaranteed to result in short-cut execution. In particular, constant folding happens very early in the processing of a query, well before the WHERE clause is evaluated:

xof=# SELECT 1/0 WHERE FALSE;
ERROR: division by zero