TRIM_ARRAY
The SQL standard defines a function called TRIM_ARRAY that surprisingly
has syntax that looks like a function! So I implemented it using a thin
wrapper around our array slice syntax. It is literally just ($1)[1:$2].
An interesting case that I decided to handle by explaining it in the
docs is that this won't give you the first n elements if your lower
bound is not 1. My justification for this is 1) non-standard lower
bounds are so rare in the wild that 2) people using them can just not
use this function. The alternative is to go through the unnest dance
(or write it in C) which defeats inlining.
Patch attached.
--
Vik Fearing
Attachments:
0001-implement-trim_array.patchtext/x-patch; charset=UTF-8; name=0001-implement-trim_array.patchDownload
From 6429316ab6060a16889b7c188ca577e17a5c7e4c Mon Sep 17 00:00:00 2001
From: Vik Fearing <vik@postgresfriends.org>
Date: Tue, 16 Feb 2021 18:38:24 +0100
Subject: [PATCH] implement trim_array
---
doc/src/sgml/func.sgml | 19 +++++++++++++++++++
src/include/catalog/pg_proc.dat | 4 ++++
src/test/regress/expected/arrays.out | 19 +++++++++++++++++++
src/test/regress/sql/arrays.sql | 16 ++++++++++++++++
4 files changed, 58 insertions(+)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1ab31a9056..c3e157622f 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17916,6 +17916,25 @@ SELECT NULLIF(value, '(none)') ...
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>trim_array</primary>
+ </indexterm>
+ <function>trim_array</function> ( <parameter>array</parameter> <type>anyarray</type>, <parameter>n</parameter> <type>integer</type> )
+ <returnvalue>anyarray</returnvalue>
+ </para>
+ <para>
+ Trims an array to the elements 1 through <parameter>n</parameter>. Usually these are
+ the first <parameter>n</parameter> elements of the array, but may not be if the lower
+ bound is different from 1.
+ </para>
+ <para>
+ <literal>select trim_array(ARRAY[1,2,3,4,5,6], 3)</literal>
+ <returnvalue>{1,2,3}</returnvalue>
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 1487710d59..0aae4daf3b 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1674,6 +1674,10 @@
proname => 'arraycontjoinsel', provolatile => 's', prorettype => 'float8',
proargtypes => 'internal oid internal int2 internal',
prosrc => 'arraycontjoinsel' },
+{ oid => '8819', descr => 'trim an array down to n elements',
+ proname => 'trim_array', prolang => 'sql', provolatile => 'i',
+ prorettype => 'anyarray', proargtypes => 'anyarray int4',
+ prosrc => 'select ($1)[1:$2]' },
{ oid => '764', descr => 'large object import',
proname => 'lo_import', provolatile => 'v', proparallel => 'u',
diff --git a/src/test/regress/expected/arrays.out b/src/test/regress/expected/arrays.out
index 8bc7721e7d..a79ad36cb0 100644
--- a/src/test/regress/expected/arrays.out
+++ b/src/test/regress/expected/arrays.out
@@ -2399,3 +2399,22 @@ SELECT width_bucket(5, ARRAY[3, 4, NULL]);
ERROR: thresholds array must not contain NULLs
SELECT width_bucket(5, ARRAY[ARRAY[1, 2], ARRAY[3, 4]]);
ERROR: thresholds must be one-dimensional array
+-- trim_array
+CREATE TABLE trim_array_test (arr integer[]);
+INSERT INTO trim_array_test
+VALUES ('{1,2,3,4,5,6}'),
+ ('[-15:-10]={1,2,3,4,5,6}'),
+ ('[10:15]={1,2,3,4,5,6}'),
+ ('{{1,10},{2,20},{3,30},{4,40},{5,50},{6,60}}');
+SELECT arr, trim_array(arr, 3)
+FROM trim_array_test
+ORDER BY arr;
+ arr | trim_array
+---------------------------------------------+------------------------
+ [-15:-10]={1,2,3,4,5,6} | {}
+ {1,2,3,4,5,6} | {1,2,3}
+ [10:15]={1,2,3,4,5,6} | {}
+ {{1,10},{2,20},{3,30},{4,40},{5,50},{6,60}} | {{1,10},{2,20},{3,30}}
+(4 rows)
+
+DROP TABLE trim_array_test;
diff --git a/src/test/regress/sql/arrays.sql b/src/test/regress/sql/arrays.sql
index c40619a8d5..6d34cc468e 100644
--- a/src/test/regress/sql/arrays.sql
+++ b/src/test/regress/sql/arrays.sql
@@ -722,3 +722,19 @@ SELECT width_bucket(5, '{}');
SELECT width_bucket('5'::text, ARRAY[3, 4]::integer[]);
SELECT width_bucket(5, ARRAY[3, 4, NULL]);
SELECT width_bucket(5, ARRAY[ARRAY[1, 2], ARRAY[3, 4]]);
+
+
+-- trim_array
+
+CREATE TABLE trim_array_test (arr integer[]);
+INSERT INTO trim_array_test
+VALUES ('{1,2,3,4,5,6}'),
+ ('[-15:-10]={1,2,3,4,5,6}'),
+ ('[10:15]={1,2,3,4,5,6}'),
+ ('{{1,10},{2,20},{3,30},{4,40},{5,50},{6,60}}');
+
+SELECT arr, trim_array(arr, 3)
+FROM trim_array_test
+ORDER BY arr;
+
+DROP TABLE trim_array_test;
--
2.25.1
On Tue, 16 Feb 2021 at 12:54, Vik Fearing <vik@postgresfriends.org> wrote:
The SQL standard defines a function called TRIM_ARRAY that surprisingly
has syntax that looks like a function! So I implemented it using a thin
wrapper around our array slice syntax. It is literally just ($1)[1:$2].An interesting case that I decided to handle by explaining it in the
docs is that this won't give you the first n elements if your lower
bound is not 1. My justification for this is 1) non-standard lower
bounds are so rare in the wild that 2) people using them can just not
use this function. The alternative is to go through the unnest dance
(or write it in C) which defeats inlining.
I don't recall ever seeing non-default lower bounds, so I actually think
it's OK to just rule out that scenario, but why not something like this:
($1)[:array_lower ($1, 1) + $2 - 1]
Note that I've used the 9.6 feature that allows omitting the lower bound.
On 2/16/21 7:32 PM, Isaac Morland wrote:
On Tue, 16 Feb 2021 at 12:54, Vik Fearing <vik@postgresfriends.org> wrote:
The SQL standard defines a function called TRIM_ARRAY that surprisingly
has syntax that looks like a function! So I implemented it using a thin
wrapper around our array slice syntax. It is literally just ($1)[1:$2].An interesting case that I decided to handle by explaining it in the
docs is that this won't give you the first n elements if your lower
bound is not 1. My justification for this is 1) non-standard lower
bounds are so rare in the wild that 2) people using them can just not
use this function. The alternative is to go through the unnest dance
(or write it in C) which defeats inlining.I don't recall ever seeing non-default lower bounds, so I actually think
it's OK to just rule out that scenario, but why not something like this:($1)[:array_lower ($1, 1) + $2 - 1]
I'm kind of embarrassed that I didn't think about doing that; it is a
much better solution. You lose the non-standard bounds but I don't
think there is any way besides C to keep the lower bound regardless of
how you trim it.
V2 attached.
--
Vik Fearing
Attachments:
0001-implement-trim_array.v2.patchtext/x-patch; charset=UTF-8; name=0001-implement-trim_array.v2.patchDownload
From 21fb2060516a42b8b3edd1c9df3773ec8920d62b Mon Sep 17 00:00:00 2001
From: Vik Fearing <vik@postgresfriends.org>
Date: Tue, 16 Feb 2021 18:38:24 +0100
Subject: [PATCH] implement trim_array
---
doc/src/sgml/func.sgml | 17 +++++++++++++++++
src/include/catalog/pg_proc.dat | 4 ++++
src/test/regress/expected/arrays.out | 19 +++++++++++++++++++
src/test/regress/sql/arrays.sql | 16 ++++++++++++++++
4 files changed, 56 insertions(+)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1ab31a9056..58ab7860f4 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17916,6 +17916,23 @@ SELECT NULLIF(value, '(none)') ...
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>trim_array</primary>
+ </indexterm>
+ <function>trim_array</function> ( <parameter>array</parameter> <type>anyarray</type>, <parameter>n</parameter> <type>integer</type> )
+ <returnvalue>anyarray</returnvalue>
+ </para>
+ <para>
+ Trims an array to the first <parameter>n</parameter> elements.
+ </para>
+ <para>
+ <literal>trim_array(ARRAY[1,2,3,4,5,6], 3)</literal>
+ <returnvalue>{1,2,3}</returnvalue>
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 1487710d59..76e2030865 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1674,6 +1674,10 @@
proname => 'arraycontjoinsel', provolatile => 's', prorettype => 'float8',
proargtypes => 'internal oid internal int2 internal',
prosrc => 'arraycontjoinsel' },
+{ oid => '8819', descr => 'trim an array down to n elements',
+ proname => 'trim_array', prolang => 'sql', provolatile => 'i',
+ prorettype => 'anyarray', proargtypes => 'anyarray int4',
+ prosrc => 'select ($1)[:array_lower($1, 1) + $2 - 1]' },
{ oid => '764', descr => 'large object import',
proname => 'lo_import', provolatile => 'v', proparallel => 'u',
diff --git a/src/test/regress/expected/arrays.out b/src/test/regress/expected/arrays.out
index 8bc7721e7d..3fb1b8dcef 100644
--- a/src/test/regress/expected/arrays.out
+++ b/src/test/regress/expected/arrays.out
@@ -2399,3 +2399,22 @@ SELECT width_bucket(5, ARRAY[3, 4, NULL]);
ERROR: thresholds array must not contain NULLs
SELECT width_bucket(5, ARRAY[ARRAY[1, 2], ARRAY[3, 4]]);
ERROR: thresholds must be one-dimensional array
+-- trim_array
+CREATE TABLE trim_array_test (arr integer[]);
+INSERT INTO trim_array_test
+VALUES ('{1,2,3,4,5,6}'),
+ ('[-15:-10]={1,2,3,4,5,6}'),
+ ('[10:15]={1,2,3,4,5,6}'),
+ ('{{1,10},{2,20},{3,30},{4,40},{5,50},{6,60}}');
+SELECT arr, trim_array(arr, 3)
+FROM trim_array_test
+ORDER BY arr;
+ arr | trim_array
+---------------------------------------------+------------------------
+ [-15:-10]={1,2,3,4,5,6} | {1,2,3}
+ {1,2,3,4,5,6} | {1,2,3}
+ [10:15]={1,2,3,4,5,6} | {1,2,3}
+ {{1,10},{2,20},{3,30},{4,40},{5,50},{6,60}} | {{1,10},{2,20},{3,30}}
+(4 rows)
+
+DROP TABLE trim_array_test;
diff --git a/src/test/regress/sql/arrays.sql b/src/test/regress/sql/arrays.sql
index c40619a8d5..6d34cc468e 100644
--- a/src/test/regress/sql/arrays.sql
+++ b/src/test/regress/sql/arrays.sql
@@ -722,3 +722,19 @@ SELECT width_bucket(5, '{}');
SELECT width_bucket('5'::text, ARRAY[3, 4]::integer[]);
SELECT width_bucket(5, ARRAY[3, 4, NULL]);
SELECT width_bucket(5, ARRAY[ARRAY[1, 2], ARRAY[3, 4]]);
+
+
+-- trim_array
+
+CREATE TABLE trim_array_test (arr integer[]);
+INSERT INTO trim_array_test
+VALUES ('{1,2,3,4,5,6}'),
+ ('[-15:-10]={1,2,3,4,5,6}'),
+ ('[10:15]={1,2,3,4,5,6}'),
+ ('{{1,10},{2,20},{3,30},{4,40},{5,50},{6,60}}');
+
+SELECT arr, trim_array(arr, 3)
+FROM trim_array_test
+ORDER BY arr;
+
+DROP TABLE trim_array_test;
--
2.25.1
On 2/16/21 11:38 PM, Vik Fearing wrote:
On 2/16/21 7:32 PM, Isaac Morland wrote:
On Tue, 16 Feb 2021 at 12:54, Vik Fearing <vik@postgresfriends.org> wrote:
The SQL standard defines a function called TRIM_ARRAY that surprisingly
has syntax that looks like a function! So I implemented it using a thin
wrapper around our array slice syntax. It is literally just ($1)[1:$2].An interesting case that I decided to handle by explaining it in the
docs is that this won't give you the first n elements if your lower
bound is not 1. My justification for this is 1) non-standard lower
bounds are so rare in the wild that 2) people using them can just not
use this function. The alternative is to go through the unnest dance
(or write it in C) which defeats inlining.I don't recall ever seeing non-default lower bounds, so I actually think
it's OK to just rule out that scenario, but why not something like this:($1)[:array_lower ($1, 1) + $2 - 1]
I'm kind of embarrassed that I didn't think about doing that; it is a
much better solution. You lose the non-standard bounds but I don't
think there is any way besides C to keep the lower bound regardless of
how you trim it.
I've made a bit of a mess out of this, but I partly blame the standard
which is very unclear. It actually describes trimming the right n
elements instead of the left n like I've done here. I'll be back later
with a better patch that does what it's actually supposed to.
--
Vik Fearing
On 2/17/21 1:25 AM, Vik Fearing wrote:
I've made a bit of a mess out of this, but I partly blame the standard
which is very unclear. It actually describes trimming the right n
elements instead of the left n like I've done here. I'll be back later
with a better patch that does what it's actually supposed to.
And here is that patch.
Since the only justification I have for such a silly function is that
it's part of the standard, I decided to also issue the errors that the
standard describes which means the new function is now in C.
--
Vik Fearing
Attachments:
0001-implement-trim_array.v3.patchtext/x-patch; charset=UTF-8; name=0001-implement-trim_array.v3.patchDownload
From 1fb4e07651053ab20b2f853e5ec753aad5002782 Mon Sep 17 00:00:00 2001
From: Vik Fearing <vik@postgresfriends.org>
Date: Tue, 16 Feb 2021 18:38:24 +0100
Subject: [PATCH] implement trim_array
---
doc/src/sgml/func.sgml | 18 ++++++++++++
src/backend/catalog/sql_features.txt | 2 +-
src/backend/utils/adt/arrayfuncs.c | 42 ++++++++++++++++++++++++++++
src/include/catalog/pg_proc.dat | 5 ++++
src/test/regress/expected/arrays.out | 23 +++++++++++++++
src/test/regress/sql/arrays.sql | 19 +++++++++++++
6 files changed, 108 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1ab31a9056..aa5026af07 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17916,6 +17916,24 @@ SELECT NULLIF(value, '(none)') ...
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>trim_array</primary>
+ </indexterm>
+ <function>trim_array</function> ( <parameter>array</parameter> <type>anyarray</type>, <parameter>n</parameter> <type>integer</type> )
+ <returnvalue>anyarray</returnvalue>
+ </para>
+ <para>
+ Trims an array by removing the last <parameter>n</parameter> elements.
+ If the array is multidimensional, only the first dimension is trimmed.
+ </para>
+ <para>
+ <literal>trim_array(ARRAY[1,2,3,4,5,6], 2)</literal>
+ <returnvalue>{1,2,3,4}</returnvalue>
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
diff --git a/src/backend/catalog/sql_features.txt b/src/backend/catalog/sql_features.txt
index a24387c1e7..b9863f04e9 100644
--- a/src/backend/catalog/sql_features.txt
+++ b/src/backend/catalog/sql_features.txt
@@ -398,7 +398,7 @@ S301 Enhanced UNNEST YES
S401 Distinct types based on array types NO
S402 Distinct types based on distinct types NO
S403 ARRAY_MAX_CARDINALITY NO
-S404 TRIM_ARRAY NO
+S404 TRIM_ARRAY YES
T011 Timestamp in Information Schema NO
T021 BINARY and VARBINARY data types NO
T022 Advanced support for BINARY and VARBINARY data types NO
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index f7012cc5d9..d38a99f0b0 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -6631,3 +6631,45 @@ width_bucket_array_variable(Datum operand,
return left;
}
+
+/*
+ * Trim the right N elements from an array by calculating an appropriate slice.
+ * Only the first dimension is trimmed.
+ */
+Datum
+trim_array(PG_FUNCTION_ARGS)
+{
+ ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
+ int n = PG_GETARG_INT32(1);
+ int array_length = ARR_DIMS(v)[0];
+ int16 elmlen;
+ bool elmbyval;
+ char elmalign;
+ int lower[MAXDIM];
+ int upper[MAXDIM];
+ bool lowerProvided[MAXDIM];
+ bool upperProvided[MAXDIM];
+ Datum result;
+
+ /* Throw an error if out of bounds */
+ if (n < 0 || n > array_length)
+ ereport(ERROR,
+ (errcode(ERRCODE_ARRAY_ELEMENT_ERROR),
+ errmsg("number of elements to trim must be between 0 and %d", array_length)));
+
+ /* Set all the bounds as unprovided except the first upper bound */
+ memset(lowerProvided, 0, sizeof(lowerProvided));
+ memset(upperProvided, 0, sizeof(upperProvided));
+ upper[0] = ARR_LBOUND(v)[0] + array_length - n - 1;
+ upperProvided[0] = true;
+
+ /* Fetch the needed information about the element type */
+ get_typlenbyvalalign(ARR_ELEMTYPE(v), &elmlen, &elmbyval, &elmalign);
+
+ /* Get the slice */
+ result = array_get_slice(PointerGetDatum(v), 1,
+ upper, lower, upperProvided, lowerProvided,
+ -1, elmlen, elmbyval, elmalign);
+
+ PG_RETURN_DATUM(result);
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 1487710d59..8ab911238d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1674,6 +1674,11 @@
proname => 'arraycontjoinsel', provolatile => 's', prorettype => 'float8',
proargtypes => 'internal oid internal int2 internal',
prosrc => 'arraycontjoinsel' },
+{ oid => '8819',
+ descr => 'trim an array down to n elements',
+ proname => 'trim_array', proisstrict => 't', provolatile => 'i',
+ prorettype => 'anyarray', proargtypes => 'anyarray int4',
+ prosrc => 'trim_array' },
{ oid => '764', descr => 'large object import',
proname => 'lo_import', provolatile => 'v', proparallel => 'u',
diff --git a/src/test/regress/expected/arrays.out b/src/test/regress/expected/arrays.out
index 8bc7721e7d..fd3e4bfc49 100644
--- a/src/test/regress/expected/arrays.out
+++ b/src/test/regress/expected/arrays.out
@@ -2399,3 +2399,26 @@ SELECT width_bucket(5, ARRAY[3, 4, NULL]);
ERROR: thresholds array must not contain NULLs
SELECT width_bucket(5, ARRAY[ARRAY[1, 2], ARRAY[3, 4]]);
ERROR: thresholds must be one-dimensional array
+-- trim_array
+CREATE TABLE trim_array_test (arr integer[]);
+INSERT INTO trim_array_test
+VALUES ('[-15:-10]={1,2,3,4,5,6}'),
+ ('{1,2,3,4,5,6}'),
+ ('[10:15]={1,2,3,4,5,6}'),
+ ('{{1,10},{2,20},{3,30},{4,40},{5,50},{6,60}}');
+SELECT arr, trim_array(arr, 2)
+FROM trim_array_test
+ORDER BY arr;
+ arr | trim_array
+---------------------------------------------+-------------------------------
+ [-15:-10]={1,2,3,4,5,6} | {1,2,3,4}
+ {1,2,3,4,5,6} | {1,2,3,4}
+ [10:15]={1,2,3,4,5,6} | {1,2,3,4}
+ {{1,10},{2,20},{3,30},{4,40},{5,50},{6,60}} | {{1,10},{2,20},{3,30},{4,40}}
+(4 rows)
+
+DROP TABLE trim_array_test;
+VALUES (trim_array(ARRAY[1, 2, 3], -1)); -- fail
+ERROR: number of elements to trim must be between 0 and 3
+VALUES (trim_array(ARRAY[1, 2, 3], 10)); -- fail
+ERROR: number of elements to trim must be between 0 and 3
diff --git a/src/test/regress/sql/arrays.sql b/src/test/regress/sql/arrays.sql
index c40619a8d5..551cf5c5c9 100644
--- a/src/test/regress/sql/arrays.sql
+++ b/src/test/regress/sql/arrays.sql
@@ -722,3 +722,22 @@ SELECT width_bucket(5, '{}');
SELECT width_bucket('5'::text, ARRAY[3, 4]::integer[]);
SELECT width_bucket(5, ARRAY[3, 4, NULL]);
SELECT width_bucket(5, ARRAY[ARRAY[1, 2], ARRAY[3, 4]]);
+
+
+-- trim_array
+
+CREATE TABLE trim_array_test (arr integer[]);
+INSERT INTO trim_array_test
+VALUES ('[-15:-10]={1,2,3,4,5,6}'),
+ ('{1,2,3,4,5,6}'),
+ ('[10:15]={1,2,3,4,5,6}'),
+ ('{{1,10},{2,20},{3,30},{4,40},{5,50},{6,60}}');
+
+SELECT arr, trim_array(arr, 2)
+FROM trim_array_test
+ORDER BY arr;
+
+DROP TABLE trim_array_test;
+
+VALUES (trim_array(ARRAY[1, 2, 3], -1)); -- fail
+VALUES (trim_array(ARRAY[1, 2, 3], 10)); -- fail
--
2.25.1
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, failed
Documentation: tested, failed
This basically does what it says, and the code looks good. The
documentation is out of alphabetical order (trim_array should appear
under cardinality rather than over)) but good otherwise. I was able to
"break" the function with an untyped null in psql:
select trim_array(null, 2);
ERROR: could not determine polymorphic type because input has type unknown
I don't know whether there are any circumstances other than manual entry
in psql where this could happen, since column values and variables will
always be typed. I don't have access to the standard, but DB2's docs[1]https://www.ibm.com/support/knowledgecenter/en/SSEPEK_12.0.0/sqlref/src/tpc/db2z_bif_trimarray.html
note "if any argument is null, the result is the null value", so an
up-front null check might be preferable to a slightly arcane user-facing
error, even if it's a silly invocation of a silly function :)
[1]: https://www.ibm.com/support/knowledgecenter/en/SSEPEK_12.0.0/sqlref/src/tpc/db2z_bif_trimarray.html
The new status of this patch is: Waiting on Author
On 3/2/21 12:14 AM, Dian Fay wrote:
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, failed
Documentation: tested, failed
Thank you for looking at my patch!
This basically does what it says, and the code looks good. The
documentation is out of alphabetical order (trim_array should appear
under cardinality rather than over)) but good otherwise.
Hmm. It appears between cardinality and unnest in the source code and
also my compiled html. Can you say more about where you're seeing the
wrong order?
I was able to
"break" the function with an untyped null in psql:select trim_array(null, 2);
ERROR: could not determine polymorphic type because input has type unknownI don't know whether there are any circumstances other than manual entry
in psql where this could happen, since column values and variables will
always be typed. I don't have access to the standard, but DB2's docs[1]
note "if any argument is null, the result is the null value", so an
up-front null check might be preferable to a slightly arcane user-facing
error, even if it's a silly invocation of a silly function :)[1] https://www.ibm.com/support/knowledgecenter/en/SSEPEK_12.0.0/sqlref/src/tpc/db2z_bif_trimarray.html
The standard also says that if either argument is null, the result is
null. The problem here is that postgres needs to know what the return
type is and it can only determine that from the input.
If you give the function a typed null, it returns null as expected.
The new status of this patch is: Waiting on Author
I put it back to Needs Review without a new patch because I don't know
what I would change.
--
Vik Fearing
Dian Fay <dian.m.fay@gmail.com> writes:
This basically does what it says, and the code looks good. The
documentation is out of alphabetical order (trim_array should appear
under cardinality rather than over)) but good otherwise. I was able to
"break" the function with an untyped null in psql:
select trim_array(null, 2);
ERROR: could not determine polymorphic type because input has type unknown
That's a generic parser behavior for polymorphic functions, not something
this particular function could or should dodge.
regards, tom lane
On Mon Mar 1, 2021 at 6:53 PM EST, Vik Fearing wrote:
This basically does what it says, and the code looks good. The
documentation is out of alphabetical order (trim_array should appear
under cardinality rather than over)) but good otherwise.Hmm. It appears between cardinality and unnest in the source code and
also my compiled html. Can you say more about where you're seeing the
wrong order?
I applied the patch to the latest commit, ffd3944ab9. Table 9.52 is
ordered:
array_to_string
array_upper
trim_array
cardinality
unnest
The problem here is that postgres needs to know what the return
type is and it can only determine that from the input.If you give the function a typed null, it returns null as expected.
The new status of this patch is: Waiting on Author
I put it back to Needs Review without a new patch because I don't know
what I would change.
I'd thought that checking v and returning null instead of raising the
error would be more friendly, should it be possible to pass an untyped
null accidentally instead of on purpose, and I couldn't rule that out.
I've got no objections other than the docs having been displaced.
On 3/2/21 1:02 AM, Dian M Fay wrote:
On Mon Mar 1, 2021 at 6:53 PM EST, Vik Fearing wrote:
This basically does what it says, and the code looks good. The
documentation is out of alphabetical order (trim_array should appear
under cardinality rather than over)) but good otherwise.Hmm. It appears between cardinality and unnest in the source code and
also my compiled html. Can you say more about where you're seeing the
wrong order?I applied the patch to the latest commit, ffd3944ab9. Table 9.52 is
ordered:array_to_string
array_upper
trim_array
cardinality
unnest
So it turns out I must have fixed it locally after I posted the patch
and then forgot I did that. Attached is a new patch with the order
correct. Thanks for spotting it!
The problem here is that postgres needs to know what the return
type is and it can only determine that from the input.If you give the function a typed null, it returns null as expected.
The new status of this patch is: Waiting on Author
I put it back to Needs Review without a new patch because I don't know
what I would change.I'd thought that checking v and returning null instead of raising the
error would be more friendly, should it be possible to pass an untyped
null accidentally instead of on purpose, and I couldn't rule that out.
As Tom said, that is something that does not belong in this patch.
--
Vik Fearing
Attachments:
0001-implement-trim_array.v4.patchtext/x-patch; charset=UTF-8; name=0001-implement-trim_array.v4.patchDownload
From 351bf14400d8ea2948788a96b47ce6c20847de97 Mon Sep 17 00:00:00 2001
From: Vik Fearing <vik@postgresfriends.org>
Date: Tue, 16 Feb 2021 18:38:24 +0100
Subject: [PATCH] implement trim_array
---
doc/src/sgml/func.sgml | 18 ++++++++++++
src/backend/catalog/sql_features.txt | 2 +-
src/backend/utils/adt/arrayfuncs.c | 42 ++++++++++++++++++++++++++++
src/include/catalog/pg_proc.dat | 5 ++++
src/test/regress/expected/arrays.out | 23 +++++++++++++++
src/test/regress/sql/arrays.sql | 19 +++++++++++++
6 files changed, 108 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 08f08322ca..2aac87cf8d 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17909,6 +17909,24 @@ SELECT NULLIF(value, '(none)') ...
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>trim_array</primary>
+ </indexterm>
+ <function>trim_array</function> ( <parameter>array</parameter> <type>anyarray</type>, <parameter>n</parameter> <type>integer</type> )
+ <returnvalue>anyarray</returnvalue>
+ </para>
+ <para>
+ Trims an array by removing the last <parameter>n</parameter> elements.
+ If the array is multidimensional, only the first dimension is trimmed.
+ </para>
+ <para>
+ <literal>trim_array(ARRAY[1,2,3,4,5,6], 2)</literal>
+ <returnvalue>{1,2,3,4}</returnvalue>
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
diff --git a/src/backend/catalog/sql_features.txt b/src/backend/catalog/sql_features.txt
index ab0895ce3c..32eed988ab 100644
--- a/src/backend/catalog/sql_features.txt
+++ b/src/backend/catalog/sql_features.txt
@@ -398,7 +398,7 @@ S301 Enhanced UNNEST YES
S401 Distinct types based on array types NO
S402 Distinct types based on distinct types NO
S403 ARRAY_MAX_CARDINALITY NO
-S404 TRIM_ARRAY NO
+S404 TRIM_ARRAY YES
T011 Timestamp in Information Schema NO
T021 BINARY and VARBINARY data types NO
T022 Advanced support for BINARY and VARBINARY data types NO
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index f7012cc5d9..d38a99f0b0 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -6631,3 +6631,45 @@ width_bucket_array_variable(Datum operand,
return left;
}
+
+/*
+ * Trim the right N elements from an array by calculating an appropriate slice.
+ * Only the first dimension is trimmed.
+ */
+Datum
+trim_array(PG_FUNCTION_ARGS)
+{
+ ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
+ int n = PG_GETARG_INT32(1);
+ int array_length = ARR_DIMS(v)[0];
+ int16 elmlen;
+ bool elmbyval;
+ char elmalign;
+ int lower[MAXDIM];
+ int upper[MAXDIM];
+ bool lowerProvided[MAXDIM];
+ bool upperProvided[MAXDIM];
+ Datum result;
+
+ /* Throw an error if out of bounds */
+ if (n < 0 || n > array_length)
+ ereport(ERROR,
+ (errcode(ERRCODE_ARRAY_ELEMENT_ERROR),
+ errmsg("number of elements to trim must be between 0 and %d", array_length)));
+
+ /* Set all the bounds as unprovided except the first upper bound */
+ memset(lowerProvided, 0, sizeof(lowerProvided));
+ memset(upperProvided, 0, sizeof(upperProvided));
+ upper[0] = ARR_LBOUND(v)[0] + array_length - n - 1;
+ upperProvided[0] = true;
+
+ /* Fetch the needed information about the element type */
+ get_typlenbyvalalign(ARR_ELEMTYPE(v), &elmlen, &elmbyval, &elmalign);
+
+ /* Get the slice */
+ result = array_get_slice(PointerGetDatum(v), 1,
+ upper, lower, upperProvided, lowerProvided,
+ -1, elmlen, elmbyval, elmalign);
+
+ PG_RETURN_DATUM(result);
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 1487710d59..8ab911238d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1674,6 +1674,11 @@
proname => 'arraycontjoinsel', provolatile => 's', prorettype => 'float8',
proargtypes => 'internal oid internal int2 internal',
prosrc => 'arraycontjoinsel' },
+{ oid => '8819',
+ descr => 'trim an array down to n elements',
+ proname => 'trim_array', proisstrict => 't', provolatile => 'i',
+ prorettype => 'anyarray', proargtypes => 'anyarray int4',
+ prosrc => 'trim_array' },
{ oid => '764', descr => 'large object import',
proname => 'lo_import', provolatile => 'v', proparallel => 'u',
diff --git a/src/test/regress/expected/arrays.out b/src/test/regress/expected/arrays.out
index 8bc7721e7d..fd3e4bfc49 100644
--- a/src/test/regress/expected/arrays.out
+++ b/src/test/regress/expected/arrays.out
@@ -2399,3 +2399,26 @@ SELECT width_bucket(5, ARRAY[3, 4, NULL]);
ERROR: thresholds array must not contain NULLs
SELECT width_bucket(5, ARRAY[ARRAY[1, 2], ARRAY[3, 4]]);
ERROR: thresholds must be one-dimensional array
+-- trim_array
+CREATE TABLE trim_array_test (arr integer[]);
+INSERT INTO trim_array_test
+VALUES ('[-15:-10]={1,2,3,4,5,6}'),
+ ('{1,2,3,4,5,6}'),
+ ('[10:15]={1,2,3,4,5,6}'),
+ ('{{1,10},{2,20},{3,30},{4,40},{5,50},{6,60}}');
+SELECT arr, trim_array(arr, 2)
+FROM trim_array_test
+ORDER BY arr;
+ arr | trim_array
+---------------------------------------------+-------------------------------
+ [-15:-10]={1,2,3,4,5,6} | {1,2,3,4}
+ {1,2,3,4,5,6} | {1,2,3,4}
+ [10:15]={1,2,3,4,5,6} | {1,2,3,4}
+ {{1,10},{2,20},{3,30},{4,40},{5,50},{6,60}} | {{1,10},{2,20},{3,30},{4,40}}
+(4 rows)
+
+DROP TABLE trim_array_test;
+VALUES (trim_array(ARRAY[1, 2, 3], -1)); -- fail
+ERROR: number of elements to trim must be between 0 and 3
+VALUES (trim_array(ARRAY[1, 2, 3], 10)); -- fail
+ERROR: number of elements to trim must be between 0 and 3
diff --git a/src/test/regress/sql/arrays.sql b/src/test/regress/sql/arrays.sql
index c40619a8d5..551cf5c5c9 100644
--- a/src/test/regress/sql/arrays.sql
+++ b/src/test/regress/sql/arrays.sql
@@ -722,3 +722,22 @@ SELECT width_bucket(5, '{}');
SELECT width_bucket('5'::text, ARRAY[3, 4]::integer[]);
SELECT width_bucket(5, ARRAY[3, 4, NULL]);
SELECT width_bucket(5, ARRAY[ARRAY[1, 2], ARRAY[3, 4]]);
+
+
+-- trim_array
+
+CREATE TABLE trim_array_test (arr integer[]);
+INSERT INTO trim_array_test
+VALUES ('[-15:-10]={1,2,3,4,5,6}'),
+ ('{1,2,3,4,5,6}'),
+ ('[10:15]={1,2,3,4,5,6}'),
+ ('{{1,10},{2,20},{3,30},{4,40},{5,50},{6,60}}');
+
+SELECT arr, trim_array(arr, 2)
+FROM trim_array_test
+ORDER BY arr;
+
+DROP TABLE trim_array_test;
+
+VALUES (trim_array(ARRAY[1, 2, 3], -1)); -- fail
+VALUES (trim_array(ARRAY[1, 2, 3], 10)); -- fail
--
2.25.1
Vik Fearing <vik@postgresfriends.org> writes:
On 3/2/21 1:02 AM, Dian M Fay wrote:
I'd thought that checking v and returning null instead of raising the
error would be more friendly, should it be possible to pass an untyped
null accidentally instead of on purpose, and I couldn't rule that out.
As Tom said, that is something that does not belong in this patch.
Yeah, the individual function really doesn't have any way to affect
this, since the error happens on the way to identifying which function
we should call in the first place.
I had the same problem as Dian of the func.sgml hunk winding up in
the wrong place. I think this is practically inevitable unless the
submitter uses more than 3 lines of context for the diff, because
otherwise the context is just boilerplate that looks the same
everywhere in the function tables. Unless the diff is 100% up to date
so that the line numbers are exactly right, patch is likely to guess
wrong about where to insert the new hunk. We'll just have to be
vigilant about that.
I fooled with your test case a bit ... I didn't think it was really
necessary to create and drop a table, when we could just use a VALUES
clause as source of test data. Also you'd forgotten to update the
"descr" description of the function to match the final understanding
of the semantics.
Looks good otherwise, so pushed.
regards, tom lane
On 3/3/21 10:47 PM, Tom Lane wrote:
I had the same problem as Dian of the func.sgml hunk winding up in
the wrong place. I think this is practically inevitable unless the
submitter uses more than 3 lines of context for the diff, because
otherwise the context is just boilerplate that looks the same
everywhere in the function tables. Unless the diff is 100% up to date
so that the line numbers are exactly right, patch is likely to guess
wrong about where to insert the new hunk. We'll just have to be
vigilant about that.
Noted.
I fooled with your test case a bit ... I didn't think it was really
necessary to create and drop a table, when we could just use a VALUES
clause as source of test data. Also you'd forgotten to update the
"descr" description of the function to match the final understanding
of the semantics.
Thank you.
Looks good otherwise, so pushed.
Thanks!
--
Vik Fearing