Why does this array query fail?

Started by Ken Tanzerover 12 years ago17 messagesgeneral
Jump to latest
#1Ken Tanzer
ken.tanzer@gmail.com

Hi. Can someone explain to me why the last query below is failing, or what
exactly the error message means? I'm sure there's a simple reason, but I'm
totally not seeing it. I boiled this down from a more complicated example,
but I think the problem is the same. Thanks in advance.

Ken

ets_reach=> SELECT ARRAY['test','pass','fail'];
array
------------------
{test,pass,fail}
(1 row)

ets_reach=> SELECT 'found' WHERE 'test' = ANY( ARRAY['test','pass','fail']
);
?column?
----------
found
(1 row)

ets_reach=> SELECT 'found' WHERE ARRAY['test','pass','fail'] = (SELECT
ARRAY['test','pass','fail']);
?column?
----------
found
(1 row)

ets_reach=> SELECT 'found' WHERE 'test' = ANY( (SELECT
ARRAY['test','pass','fail']) );

ERROR: array value must start with "{" or dimension information
LINE 1: SELECT 'found' WHERE 'test' = ANY( (SELECT ARRAY['test','pas...
^

--
AGENCY Software
A data system that puts you in control
100% Free Software
*http://agency-software.org/*
ken.tanzer@agency-software.org
(253) 245-3801

Subscribe to the mailing
list<agency-general-request@lists.sourceforge.net?body=subscribe>
to
learn more about AGENCY or
follow the discussion.

#2bricklen
bricklen@gmail.com
In reply to: Ken Tanzer (#1)
Re: Why does this array query fail?

On Mon, Sep 16, 2013 at 5:32 PM, Ken Tanzer <ken.tanzer@gmail.com> wrote:

SELECT 'found' WHERE 'test' = ANY( (SELECT ARRAY['test','pass','fail']) );

It works if you drop the inner SELECT.

SELECT 'found' WHERE 'test' = ANY( ARRAY['test','pass','fail'] );

#3David G. Johnston
david.g.johnston@gmail.com
In reply to: Ken Tanzer (#1)
Re: Why does this array query fail?

Ken Tanzer wrote

ets_reach=> SELECT 'found' WHERE 'test' = ANY( (SELECT
ARRAY['test','pass','fail']) );

ERROR: array value must start with "{" or dimension information
LINE 1: SELECT 'found' WHERE 'test' = ANY( (SELECT ARRAY['test','pas...
^

Per documentation of "ANY" it accepts either an array or a subquery. This
is the subquery form. PostgreSQL is trying to convert 'test' into an array
in order to match the array column returned by the subquery.

Remove the "SELECT" to make it work the way you expect - i.e., the
ANY(array) form of the expression.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771170.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Ken Tanzer
ken.tanzer@gmail.com
In reply to: David G. Johnston (#3)
Re: Why does this array query fail?

Thanks for the explanation. I think I at least understand what it's doing
now. I'm either surprised or confused though, as I was under the
impression that you could substitute a subquery for a value pretty much
anywhere, but I guess that's not the case?

Cheers,
Ken

On Mon, Sep 16, 2013 at 6:16 PM, David Johnston <polobo@yahoo.com> wrote:

Ken Tanzer wrote

ets_reach=> SELECT 'found' WHERE 'test' = ANY( (SELECT
ARRAY['test','pass','fail']) );

ERROR: array value must start with "{" or dimension information
LINE 1: SELECT 'found' WHERE 'test' = ANY( (SELECT ARRAY['test','pas...
^

Per documentation of "ANY" it accepts either an array or a subquery. This
is the subquery form. PostgreSQL is trying to convert 'test' into an array
in order to match the array column returned by the subquery.

Remove the "SELECT" to make it work the way you expect - i.e., the
ANY(array) form of the expression.

David J.

--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771170.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
AGENCY Software
A data system that puts you in control
100% Free Software
*http://agency-software.org/*
ken.tanzer@agency-software.org
(253) 245-3801

Subscribe to the mailing
list<agency-general-request@lists.sourceforge.net?body=subscribe>
to
learn more about AGENCY or
follow the discussion.

#5bricklen
bricklen@gmail.com
In reply to: Ken Tanzer (#4)
Re: Why does this array query fail?

On Mon, Sep 16, 2013 at 6:36 PM, Ken Tanzer <ken.tanzer@gmail.com> wrote:

Thanks for the explanation. I think I at least understand what it's doing
now. I'm either surprised or confused though, as I was under the
impression that you could substitute a subquery for a value pretty much
anywhere, but I guess that's not the case?

Cheers,
Ken

Your subquery can also be explicitly casted to make it work. Note the
"::TEXT[]"

SELECT 'found' WHERE 'test' = ANY( (SELECT
ARRAY['test','pass','fail'])::TEXT[] );

#6Ken Tanzer
ken.tanzer@gmail.com
In reply to: bricklen (#5)
Re: Why does this array query fail?

OK I tried that and see it works with the cast. But now I'm confused about
both what exactly is failing without the cast, and about the resulting
error message.

Is the query failing because PG doesn't understand the subquery is yielding
an array? Seems unlikely. But if the problem is a type mismatch between
'test' (on the left) and my subquery, I'd expect the same error message as
if I try to compare an int to a text array:

SELECT 'found' WHERE 1 = ANY( (SELECT ARRAY['test','pass','fail']));
ERROR: operator does not exist: integer = text[]
LINE 1: SELECT 'found' WHERE 1 = ANY( (SELECT ARRAY['test','pass','f...
^
HINT: No operator matches the given name and argument type(s). You might
need to add explicit type casts.

Instead of the error message I actually got:

ERROR: array value must start with "{" or dimension information

Thanks.

Ken

On Mon, Sep 16, 2013 at 6:42 PM, bricklen <bricklen@gmail.com> wrote:

On Mon, Sep 16, 2013 at 6:36 PM, Ken Tanzer <ken.tanzer@gmail.com> wrote:

Thanks for the explanation. I think I at least understand what it's
doing now. I'm either surprised or confused though, as I was under the
impression that you could substitute a subquery for a value pretty much
anywhere, but I guess that's not the case?

Cheers,
Ken

Your subquery can also be explicitly casted to make it work. Note the
"::TEXT[]"

SELECT 'found' WHERE 'test' = ANY( (SELECT
ARRAY['test','pass','fail'])::TEXT[] );

--
AGENCY Software
A data system that puts you in control
100% Free Software
*http://agency-software.org/*
ken.tanzer@agency-software.org
(253) 245-3801

Subscribe to the mailing
list<agency-general-request@lists.sourceforge.net?body=subscribe>
to
learn more about AGENCY or
follow the discussion.

#7Ken Tanzer
ken.tanzer@gmail.com
In reply to: Ken Tanzer (#6)
Re: Why does this array query fail?

Well I partially take back my last question. In the error message, I
missed the non-array / array part of "integer = text[]"

But I'm still confused. My subselect returns an array. If I cast it to a
text array, ANY is happy. But if I don't do so, what exactly does Postgres
think my subquery has yielded? And the error message still doesn't seem to
make sense...

On Mon, Sep 16, 2013 at 6:59 PM, Ken Tanzer <ken.tanzer@gmail.com> wrote:

OK I tried that and see it works with the cast. But now I'm confused
about both what exactly is failing without the cast, and about the
resulting error message.

Is the query failing because PG doesn't understand the subquery is
yielding an array? Seems unlikely. But if the problem is a type mismatch
between 'test' (on the left) and my subquery, I'd expect the same error
message as if I try to compare an int to a text array:

SELECT 'found' WHERE 1 = ANY( (SELECT ARRAY['test','pass','fail']));
ERROR: operator does not exist: integer = text[]
LINE 1: SELECT 'found' WHERE 1 = ANY( (SELECT ARRAY['test','pass','f...
^
HINT: No operator matches the given name and argument type(s). You might
need to add explicit type casts.

Instead of the error message I actually got:

ERROR: array value must start with "{" or dimension information

Thanks.

Ken

On Mon, Sep 16, 2013 at 6:42 PM, bricklen <bricklen@gmail.com> wrote:

On Mon, Sep 16, 2013 at 6:36 PM, Ken Tanzer <ken.tanzer@gmail.com> wrote:

Thanks for the explanation. I think I at least understand what it's
doing now. I'm either surprised or confused though, as I was under the
impression that you could substitute a subquery for a value pretty much
anywhere, but I guess that's not the case?

Cheers,
Ken

Your subquery can also be explicitly casted to make it work. Note the
"::TEXT[]"

SELECT 'found' WHERE 'test' = ANY( (SELECT
ARRAY['test','pass','fail'])::TEXT[] );

--
AGENCY Software
A data system that puts you in control
100% Free Software
*http://agency-software.org/*
ken.tanzer@agency-software.org
(253) 245-3801

Subscribe to the mailing list<agency-general-request@lists.sourceforge.net?body=subscribe>
to
learn more about AGENCY or
follow the discussion.

--
AGENCY Software
A data system that puts you in control
100% Free Software
*http://agency-software.org/*
ken.tanzer@agency-software.org
(253) 245-3801

Subscribe to the mailing
list<agency-general-request@lists.sourceforge.net?body=subscribe>
to
learn more about AGENCY or
follow the discussion.

#8David G. Johnston
david.g.johnston@gmail.com
In reply to: Ken Tanzer (#7)
Re: Why does this array query fail?

Ken Tanzer wrote

Well I partially take back my last question. In the error message, I
missed the non-array / array part of "integer = text[]"

But I'm still confused. My subselect returns an array. If I cast it to a
text array, ANY is happy. But if I don't do so, what exactly does
Postgres
think my subquery has yielded? And the error message still doesn't seem
to
make sense...

On Mon, Sep 16, 2013 at 6:59 PM, Ken Tanzer &lt;

ken.tanzer@

&gt; wrote:

OK I tried that and see it works with the cast. But now I'm confused
about both what exactly is failing without the cast, and about the
resulting error message.

Is the query failing because PG doesn't understand the subquery is
yielding an array? Seems unlikely. But if the problem is a type
mismatch
between 'test' (on the left) and my subquery, I'd expect the same error
message as if I try to compare an int to a text array:

SELECT 'found' WHERE 1 = ANY( (SELECT ARRAY['test','pass','fail']));
ERROR: operator does not exist: integer = text[]
LINE 1: SELECT 'found' WHERE 1 = ANY( (SELECT ARRAY['test','pass','f...
^
HINT: No operator matches the given name and argument type(s). You might
need to add explicit type casts.

Instead of the error message I actually got:

ERROR: array value must start with "{" or dimension information

Thanks.

Ken

Your subquery can also be explicitly casted to make it work. Note the
"::TEXT[]"

SELECT 'found' WHERE 'test' = ANY( (SELECT
ARRAY['test','pass','fail'])::TEXT[] );

Can we please follow list norms (or at least my example since I was the
first to respond) and bottom-post.

Sub-queries come in a few different flavors:

This:

ANY( (SELECT ARRAY[])::text[] )

works for the same reason this:

SELECT (SELECT col1 FROM (VALUES (ARRAY[4,5,6])) tbl (col1))::integer[]

works;

but this:

SELECT (SELECT col1 FROM (VALUES (ARRAY[1,2,3]),(ARRAY[4,5,6])) tbl
(col1))::integer[]

fails.

For this explanation I will simplify and state that there are two kinds of
sub-queries:

Scalar
Table-like (i.e., non-scalar)

A scalar sub-query must return, at most, one row and only a single column.
A scalar sub-query can be used wherever a literal value is needed.

A table-like sub-query can return as many rows as desired and either one or
an unlimited number of columns - context depending. In the case
"ANY(sub-query)" context it may only return a single column (but still
multiple rows). In this context PostgreSQL goes looking for an operator -
e.g., equals(=) - with the right-side argument being of the type of the
sub-query column. Since equals needs to have a matching type on the
left-side PostgreSQL presumes that whatever is on the left side must be of
the same type. In this example you supplied an unadorned literal ('test')
which has no type information. Thus PostgreSQL attempts to cast the unknown
literal to the type it requires (text[]) and fails since an array literal
must begin with '{'. The attempt to cast 'test' failed.

WHERE 1 = ANY(SELECT ARRAY[]:text[])

This gives a different error because PostgreSQL knows that the number "1" is
an integer and thus has a known type - no cast is necessary. However, since
the type "integer" does not match the needed type "text[]" a type mis-match
error is thrown or in this case no operator equals(integer, text[]) was
located.

It is the context of the use of the sub-query; not the form of the query
itself, that determines whether a particular sub-query will be treated as
scalar or table-like. It is because "ANY(...)" can accept either a literal
or a table-like sub-query that this ambiguity arises.

The example query just happens to only have a single row but nothing is
explicitly stopping it from generating more. The work-around of casting the
sub-query to "text[]" works because the system knows that it must either get
a scalar result or the sub-query will throw an exception (as in my failing
example above). The system cannot use the actual number of rows returned to
make the decision and so if you know that only one row (at most) can be
returned and you want a scalar interpretation you have to explicitly
indicate that in the query.

There is quite a bit more to this that can be gleaned by reading the
documentation for sub-queries.

Note that "(SELECT ARRAY[...])::text[]" is only a proper solution if you
cannot directly invoke the ARRAY[] syntax. Where it comes in handy would be
something like:

WITH array_to_check (atc) AS ( VALUES (ARRAY[1,2,3]::integer[]) )
SELECT ...
FROM ...
WHERE 2 = ANY((SELECT atc FROM array_to_check)::integer[])

For the most part, however, forcing a non-scalar sub-query to become a
scalar sub-query is an indication that you are doing something wrong.

It may help to think of:

ANY(subquery) as meaning:

ANY(setof "whatever column type the subquery returns as its only column")

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771183.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#9Ken Tanzer
ken.tanzer@gmail.com
In reply to: David G. Johnston (#8)
Re: Why does this array query fail?

Can we please follow list norms (or at least my example since I was the
first to respond) and bottom-post.

Absolutely. Gmail did it without my realizing, but my bad and I'm all for
following the list conventions.

Thanks for taking the time to explain this stuff, which I appreciate.
Mostly it makes sense, but a couple of things remain puzzling to me.

1) On what exactly does PG base its decision to interpret the ANY as
scalar or not? Or are you saying a sub-query will always be treated as
non-scalar, unless it is explicitly cast to an array?

2) Regarding:

In this context PostgreSQL goes looking for an operator -
e.g., equals(=) - with the right-side argument being of the type of the
sub-query column.

Why? In this case you have ANY (varchar[]), which as I understand it "="
needs to compare with a varchar. So why is it looking for an array? If it
was just varchar = varchar[], I'd get that, but in this case it's
ANY(varchar[]), so does PG extract the type of the argument to ANY without
paying attention to the ANY itself?

There is quite a bit more to this that can be gleaned by reading the

documentation for sub-queries.

I'm not sure if there's something specific you're referring to. I had
looked at the page on subquery expressions (
http://www.postgresql.org/docs/9.0/static/functions-subquery.html), as well
as the following page on "row and array comparisons" to see the two forms
of ANY, but don't see anything that covers these nuances in greater depth.
Is there another page I should be looking at?

Note that "(SELECT ARRAY[...])::text[]" is only a proper solution if...

Yeah, I tried to boil down my example, but this is closer to what I was
really trying to do:

CREATE TEMP TABLE mytable ( codes varchar[] );
INSERT INTO mytable VALUES ( array[ 'pass','fail'] );
INSERT INTO mytable VALUES ( array[ 'found'] );
SELECT 'found' WHERE 'found' =ANY(
(SELECT array_agg(code) FROM (SELECT unnest(codes) AS code FROM
mytable) foo
)
);

And for immediate purposes, found this worked just as well (as a non-scalar
subquery, I guess):

SELECT 'found' WHERE 'found' =ANY(
(SELECT unnest(codes) AS code FROM mytable)
);

Thanks again for your help and explanations!

Ken

#10David G. Johnston
david.g.johnston@gmail.com
In reply to: Ken Tanzer (#9)
Re: Why does this array query fail?

Ken Tanzer wrote

1) On what exactly does PG base its decision to interpret the ANY as
scalar or not? Or are you saying a sub-query will always be treated as
non-scalar, unless it is explicitly cast to an array?

Correct. With respect to a sub-query inside ANY(...) it will be treated as
non-scalar. You can explicitly make it scalar by casting it to an array -
understanding that the query will fail if the sub-query does not actually
conform.

2) Regarding:

In this context PostgreSQL goes looking for an operator -
e.g., equals(=) - with the right-side argument being of the type of the
sub-query column.

Why? In this case you have ANY (varchar[]), which as I understand it "="
needs to compare with a varchar. So why is it looking for an array? If
it
was just varchar = varchar[], I'd get that, but in this case it's
ANY(varchar[]), so does PG extract the type of the argument to ANY without
paying attention to the ANY itself?

No. The sub-query version is basically:

varchar = ANY (setof varchar[]) ... which is wrong

varchar[] = ANY (setof varchar[]) ... is correct

The "setof" is the difference; its not trying to look inside the array but
rather looking for an entire array that matches one of the arrays the
sub-query generates.

ARRAY[1,2,3] = ANY ( SELECT col1 FROM (VALUES (ARRAY[2,3,4]::int[]),
(ARRAY[1,2,3])) src (col1) )

In the above ANY has to decide whether {2,3,4} or {1,2,3} is equal to the
input; which must be an array. It does not mean "does the number 1 exist in
any of the supplied arrays". Again, it becomes more clear if you understand
ANY(subquery) can return more than one row.

Is there another page I should be looking at?

Not that I can think of offhand.

Note that "(SELECT ARRAY[...])::text[]" is only a proper solution if...

Yeah, I tried to boil down my example, but this is closer to what I was
really trying to do:

CREATE TEMP TABLE mytable ( codes varchar[] );
INSERT INTO mytable VALUES ( array[ 'pass','fail'] );
INSERT INTO mytable VALUES ( array[ 'found'] );
SELECT 'found' WHERE 'found' =ANY(
(SELECT array_agg(code) FROM (SELECT unnest(codes) AS code FROM
mytable) foo
)
);

And for immediate purposes, found this worked just as well (as a
non-scalar
subquery, I guess):

SELECT 'found' WHERE 'found' =ANY(
(SELECT unnest(codes) AS code FROM mytable)
);

Yes, un-nesting can make the problem go away though it too is unusual. For
the most part either use relations/sets or use arrays (for a specific
component of the schema). Your example mixes the two which makes using that
part of the schema difficult.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771343.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#11Ken Tanzer
ken.tanzer@gmail.com
In reply to: David G. Johnston (#10)
Re: Why does this array query fail?

Thanks again David. I think that's all making sense to me now, except I
want to follow up on your last point:

Yes, un-nesting can make the problem go away though it too is unusual. For the

most part either use relations/sets or use arrays (for a specific
component of the schema). Your example mixes the two which makes using
that part of the schema difficult.

I'm not sure exactly what you're saying here, but it's important to me
because I've recently had to do a lot with arrays, and continue to have to
do so. What I'm working with is similar to the example I gave you, but let
me be more concrete.

I'm working with a social service agency. Every time they see a client
they fill out an "encounter" record. Part of what is tracked is what kind
of services were provided, which is stored as an array of "service codes"
within the encounter. The encounter also has a date.

So I frequently have to provide information like "what were all the types
of services this client received during the last quarter?" or "show me all
the clients who received service X last year." I've learned enough to use
ANY, array_agg and unnest to get through these queries, but if I'm going
about this wrong or there's a better way to do it I'd love to know about it!

Ken

#12David G. Johnston
david.g.johnston@gmail.com
In reply to: Ken Tanzer (#11)
Re: Why does this array query fail?

Ken Tanzer wrote

So I frequently have to provide information like "what were all the types
of services this client received during the last quarter?" or "show me all
the clients who received service X last year." I've learned enough to use
ANY, array_agg and unnest to get through these queries, but if I'm going
about this wrong or there's a better way to do it I'd love to know about
it!

Your example query does not ask those questions.

SELECT DISTINCT service_code
FROM (SELECT unnest(services_rendered_array) AS service_code FROM
services_tables WHERE ...) svcs;

SELECT DISTINCT client_id FROM (
SELECT * FROM services_table WHERE 'X' = ANY(services_rendered_array)
) svcs;

In neither case do you need to use a sub-query answer the question. Namely,
what you describe makes use of arrays only, and not relations (though the
allowed array item values could be defined on a table somewhere).

Option A:
A. T1: session_id, client_id, service_codes[], date

Note that A is the basic structured assumed for the two example queries
above.

Option B:
B. T1: session_id, session_date, client_id
B. T2: session_id (FK-many), service_code

B. T2 would have a single record for each service performed within a given
session while A. T1 models the multiple service aspect of a session by using
an array.

Incorrect Option C:
C. T1: session_id, session_date, client_id
C. T2: session_id, service_codes[]

This makes use of a one-to-many relationship but also embeds yet another
"many" aspect within C. T2 This is generally going to be a bad idea as you
are now mixing the models together. And note that I do qualify this as
generally since you may very well decide that C is an elegant and/or the
most correct way to model your domain.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771359.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#13Ken Tanzer
ken.tanzer@gmail.com
In reply to: David G. Johnston (#12)
Re: Why does this array query fail?

Based on what you described, I think I've generally gone with option A.
Conceptually I like B better, but it's generally more complicated and
seems like overkill for simple checkbox-type options. (But as an aside, I
am looking forward to the time when ELEMENT FKs overcome their performance
issues and become part of Postgres!)

The trouble seems to be that even with Option A (services_codes in an array
within an encounter record), you still kind of end up with option C on a
client level:

client {service_codes}
client {service_codes}

There may be no way around it, but it seems like you end up needing to
write rather cumbersome queries to get at your data. OTOH there's always
room for improvement; since I'm relatively new to working extensively with
arrays, I'm hoping they become more intuitive and less painful as one gets
used to them. :)

SELECT client_id,
COALESCE(
(SELECT array_agg(code) FROM (
SELECT distinct
client_id,unnest(accessed_health_care_non_urgent_codes) AS code
FROM service_reach
WHERE client_id=client.client_id
AND service_date BETWEEN '2013-08-01' AND '2013-08-31'
) foo
),array['(none)'])
AS accessed_health_care_non_urgent_codes
FROM client;

It's probably way more detail than you want, but I've attached the table
structure and pasted in a quarterly report that the query above was taken
from in case you have any pointers or are simply curious.

Thanks again!

Ken

/*
CREATE OR REPLACE VIEW hch_quarterly AS
*/

SELECT *
FROM (

SELECT

export_id,
UPPER(SUBSTRING(name_last,1,2) || SUBSTRING(name_first,1,2) ||
COALESCE(to_char(dob,'MMDDYY'),'')) AS hch_id,
name_last,
name_first,
dob,
'2013-01-01' AS quarter_start_date,
'2013-03-31' AS quarter_end_date,
referral_source_code || COALESCE(' (' || referral_source_other || ')','')
AS referral_source,
facility_code AS living_situation_end,

/*
unnest(COALESCE((SELECT array_agg(code) FROM
(SELECT distinct client_id,unnest(other_living_situation_codes) AS code

FROM service_reach
WHERE client_id=client.client_id
AND service_date BETWEEN '2013-01-01' AND '2013-03-31') foo
),array['(none)'])) AS other_sleeping_codes,
*/

unnest(COALESCE((SELECT array_agg(code) FROM
(SELECT distinct client_id,unnest(other_living_situation_codes) AS code
FROM service_reach
WHERE client_id=client.client_id
AND service_date BETWEEN '2013-01-01' AND '2013-03-31'

UNION SELECT distinct client_id,moved_from_code AS code
FROM residence_other
WHERE client_id=client.client_id
AND residence_date BETWEEN '2013-01-01' AND '2013-03-31' ) foo
),array['(none)'])) AS other_sleeping_codes,

unnest(COALESCE((SELECT array_agg(code) FROM
(SELECT distinct client_id,unnest(accessed_health_care_non_urgent_codes) AS
code
FROM service_reach
WHERE client_id=client.client_id
AND service_date BETWEEN '2013-01-01' AND '2013-03-31') foo
),array['(none)'])) AS accessed_health_care_non_urgent_codes,

unnest(COALESCE((SELECT array_agg(code) FROM
(SELECT distinct client_id,unnest(accessed_services_cd_codes) AS code

FROM service_reach
WHERE client_id=client.client_id
AND service_date BETWEEN '2013-01-01' AND '2013-03-31') foo
),array['(none)'])) AS accessed_services_cd_codes,

unnest(COALESCE((SELECT array_agg(code) FROM
(SELECT distinct client_id,unnest(completed_services_cd_codes) AS code

FROM service_reach
WHERE client_id=client.client_id
AND service_date BETWEEN '2013-01-01' AND '2013-03-31') foo
),array['(none)'])) AS completed_services_cd_codes,
unnest(COALESCE((SELECT array_agg(code) FROM
(SELECT distinct client_id,unnest(accessed_services_mh_codes) AS code

FROM service_reach
WHERE client_id=client.client_id
AND service_date BETWEEN '2013-01-01' AND '2013-03-31') foo
),array['(none)'])) AS accessed_services_mh_codes,
CASE WHEN client_id IN (SELECT DISTINCT ON (client_id,staff_assign_date)
client_id
FROM staff_assign
WHERE staff_assign_type_code='PAYEE' AND staff_assign_date<='2013-03-31'
AND COALESCE(staff_assign_date_end,'2013-03-31') >= '2013-01-01'
ORDER BY client_id,staff_assign_date DESC) THEN 'Yes' ELSE 'No' END AS
has_payee,
CASE WHEN client_id IN (SELECT client_id
FROM staff_assign
WHERE staff_assign_type_code='OUTREACH' AND staff_assign_date<='2013-03-31'
AND COALESCE(staff_assign_date_end,'2013-03-31') >= '2013-01-01'
ORDER BY client_id,staff_assign_date DESC) THEN 'Yes' ELSE 'No' END AS
outreach_client,
CASE WHEN client_id IN (SELECT client_id
FROM staff_assign
WHERE staff_assign_type_code='CM' AND staff_assign_date<='2013-03-31' AND
COALESCE(staff_assign_date_end,'2013-03-31') >= '2013-01-01'
ORDER BY client_id,staff_assign_date DESC) THEN 'Yes' ELSE 'No' END AS
cm_client,

service_plan_status_code

FROM client
LEFT JOIN (SELECT DISTINCT ON (client_id) *
FROM intake_reach
WHERE intake_reach_date <= '2013-03-31'
ORDER BY client_id,intake_reach_date DESC ) AS ir USING (client_id)
LEFT JOIN (SELECT DISTINCT ON (client_id) *
FROM residence_other
WHERE residence_date <= '2013-03-31' AND
COALESCE(residence_date_end,'2013-03-31')>='2013-01-01'
ORDER BY client_id,residence_date DESC) AS ro USING (client_id)
LEFT JOIN (SELECT DISTINCT ON (client_id) *
FROM service_reach
WHERE service_date BETWEEN '2013-01-01' AND '2013-03-31'
ORDER BY client_id,service_date DESC) AS sr USING (client_id)
LEFT JOIN (SELECT client_id,export_id
FROM client_export_id
WHERE export_organization_code='HCH') exp USING (client_id)
WHERE client_id IN (SELECT client_id
FROM staff_assign
WHERE staff_assign_date <= '2013-03-31' AND staff_assign_type_code IN
('CM','OUTREACH') AND COALESCE(staff_assign_date_end,'2013-03-31') >=
'2013-01-01' AND staff_project(staff_id) IN ('OUTREACH','REACH'))

) AS whole_shebang
--ORDER BY client_name(client_id)

On Tue, Sep 17, 2013 at 5:02 PM, David Johnston <polobo@yahoo.com> wrote:

Ken Tanzer wrote

So I frequently have to provide information like "what were all the types
of services this client received during the last quarter?" or "show me

all

the clients who received service X last year." I've learned enough to

use

ANY, array_agg and unnest to get through these queries, but if I'm going
about this wrong or there's a better way to do it I'd love to know about
it!

Your example query does not ask those questions.

SELECT DISTINCT service_code
FROM (SELECT unnest(services_rendered_array) AS service_code FROM
services_tables WHERE ...) svcs;

SELECT DISTINCT client_id FROM (
SELECT * FROM services_table WHERE 'X' = ANY(services_rendered_array)
) svcs;

In neither case do you need to use a sub-query answer the question.
Namely,
what you describe makes use of arrays only, and not relations (though the
allowed array item values could be defined on a table somewhere).

Option A:
A. T1: session_id, client_id, service_codes[], date

Note that A is the basic structured assumed for the two example queries
above.

Option B:
B. T1: session_id, session_date, client_id
B. T2: session_id (FK-many), service_code

B. T2 would have a single record for each service performed within a given
session while A. T1 models the multiple service aspect of a session by
using
an array.

Incorrect Option C:
C. T1: session_id, session_date, client_id
C. T2: session_id, service_codes[]

This makes use of a one-to-many relationship but also embeds yet another
"many" aspect within C. T2 This is generally going to be a bad idea as you
are now mixing the models together. And note that I do qualify this as
generally since you may very well decide that C is an elegant and/or the
most correct way to model your domain.

David J.

--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771359.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
AGENCY Software
A data system that puts you in control
100% Free Software
*http://agency-software.org/*
ken.tanzer@agency-software.org
(253) 245-3801

Subscribe to the mailing
list<agency-general-request@lists.sourceforge.net?body=subscribe>
to
learn more about AGENCY or
follow the discussion.

Attachments:

create.tbl_service.sqlapplication/octet-stream; name=create.tbl_service.sqlDownload
#14David G. Johnston
david.g.johnston@gmail.com
In reply to: Ken Tanzer (#13)
Re: Why does this array query fail?

Ken Tanzer wrote

SELECT client_id,
COALESCE(
(SELECT array_agg(code) FROM (
SELECT distinct
client_id,unnest(accessed_health_care_non_urgent_codes) AS code
FROM service_reach
WHERE client_id=client.client_id
AND service_date BETWEEN '2013-08-01' AND '2013-08-31'
) foo
),array['(none)'])
AS accessed_health_care_non_urgent_codes
FROM client;

Equivalent semantics:

WITH clients_with_codes AS (
SELECT client_id, array_agg(code) AS client_codes FROM (SELECT client_id,
unnest(accessed...) AS code FROM service_reach) foo GROUP BY client_id
)
SELECT client_id, COALESCE(client_codes, ARRAY['(none)']) AS client_codes
FROM client LEFT JOIN client_with_codes USING (client_id)

Should (recommend testing) perform better due to the simple fact that you
avoid the correlated sub-query (i.e., a sub-query that references the outer
query to obtain some parameter - in this case the client_id of the current
row). The goal is to create an uncorrelated sub-query/relation that
contains all the data you require then JOIN it with the original outer
relation using the same equality you were using in the correlated version.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771366.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#15David G. Johnston
david.g.johnston@gmail.com
In reply to: Ken Tanzer (#13)
Re: Why does this array query fail?

Ken Tanzer wrote

It's probably way more detail than you want, but I've attached the table
structure and pasted in a quarterly report that the query above was taken
from in case you have any pointers or are simply curious.

Is this a quarterly report because that is how long it takes to run?

On a smaller scale I've written queries like this. I enrolled in a
university database design course shortly thereafter...

I would suggest considering how to use functions to encapsulate some of the
"medical code collecting" logic. And consider WITH/CTE constructs as well,
like I used in my last message, to effectively create temporary named tables
for different parts of the query.

Might want to move the whole thing into function and pass in the various
parameters - namely the date range - instead of hard-coding the values into
the view.

Those thoughts aside I've done my own share of "write-once, read never"
queries and if the performance meets your needs and the maintenance burden
is acceptable then sometimes you just leave this in place until someone more
knowledgeable (like a future you probably) comes back and decides to toss
and rewrite it.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771367.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#16Ken Tanzer
ken.tanzer@gmail.com
In reply to: David G. Johnston (#15)
Re: Why does this array query fail?

Is this a quarterly report because that is how long it takes to run?

It takes about 7 seconds to run. I suppose if I optimized it I could save
a minute every couple of years.

I usually get concerned about performance issues when they're actually
causing problems. I'm generally more concerned about how long it takes to
write queries, and how cumbersome the SQL involved is. And since arrays
are relatively new to me, I've been trying to understand generally the best
ways to query information out of them, or when their behavior just doesn't
make sense to me. I'll say I answer 99.99% of my own questions before they
ever make it to the list, and by the time they do I invariably have read
the documentation as best as I can. This has been my favorite list ever to
read, as people are invariably helpful, patient and polite to each other.

I would suggest considering how to use functions to encapsulate some
of the "medical

code collecting" logic. And consider WITH/CTE constructs as well, like I
used in my last message, to effectively create temporary named tables for
different parts of the query.

Might want to move the whole thing into function and pass in the various
parameters - namely the date range - instead of hard-coding the values
into the view.

Thanks for these constructive suggestions. I see benefits both ways. And
the dates are actually parsed in by an app at run-time. (I stripped that
part out to avoid confusion--I find it hard to know when submitting a list
item how much to just dump a full real example, and how much to simplify
down to a test case that illustrates the specific issue.)

On a smaller scale I've written queries like this. I enrolled in a

university database design course shortly thereafter...

until someone more knowledgeable (like a future you probably) comes back and

I'm not sure what the point of either of these comments were, but perhaps
they made you feel better. Either way, thanks for taking the time to look
my stuff over and for the other comments and explanations you made.

Ken

#17David G. Johnston
david.g.johnston@gmail.com
In reply to: Ken Tanzer (#16)
Re: Why does this array query fail?

Ken Tanzer wrote

Is this a quarterly report because that is how long it takes to run?

It takes about 7 seconds to run. I suppose if I optimized it I could save
a minute every couple of years.

Was meant to be tongue-in-cheek...though I was curious on the real
answer...agree on optimize when necessary.

On a smaller scale I've written queries like this. I enrolled in a

university database design course shortly thereafter...

until someone more knowledgeable (like a future you probably) comes back
and

I'm not sure what the point of either of these comments were, but perhaps
they made you feel better. Either way, thanks for taking the time to look
my stuff over and for the other comments and explanations you made.

Ken

Mostly that SQL, and regular expressions - which I use a lot too, tend to
result in stuff that makes perfect sense as you build it up from scratch but
if you have to come back and modify it later its very difficult to makes
changes as opposed to building up a new query with the new needs and
experience.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Why-does-this-array-query-fail-tp5771165p5771691.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general