Refining query statement
Working with my sales/client management system using psql I have a select
statement to identify contacts to be made. This statement works:
select (C.contact_id, C.lname, C.fname, C.direct_phone, O.org_name,
A.next_contact)
from Contacts as C, Organizations as O, Activities as A
where C.org_id = O.org_id and C.contact_id = A.contact_id and
A.next_contact <= 'today' and A.next_contact > '2018-12-31' and
A.next_contact is not null;
but would benefit from tweaking. When I have had multiple contacts with
someone I want only the most recent one displayed, not all, and they should
be no more ancient than a defined period (e.g., a month).
I want to learn how to make this query cleaner and more flexible. When I
write the UI for this I want to be able to specify a data range in addition
to a fixed 'today'. Pointers on what to read will be very helpful.
TIA,
Rich
Rich Shepard schrieb am 15.01.2019 um 16:39:
Working with my sales/client management system using psql I have a select
statement to identify contacts to be made. This statement works:select (C.contact_id, C.lname, C.fname, C.direct_phone, O.org_name, A.next_contact)
from Contacts as C, Organizations as O, Activities as A
where C.org_id = O.org_id and C.contact_id = A.contact_id and
A.next_contact <= 'today' and A.next_contact > '2018-12-31' and
A.next_contact is not null;but would benefit from tweaking. When I have had multiple contacts with
someone I want only the most recent one displayed, not all, and they should
be no more ancient than a defined period (e.g., a month).I want to learn how to make this query cleaner and more flexible. When I
write the UI for this I want to be able to specify a data range in addition
to a fixed 'today'. Pointers on what to read will be very helpful.
With regards to "cleaner": the first thing to do is to remove the parentheses around the column list.
In Postgres "(a,b,c)" creates a single column with an anonymous record type (that contains three fields), rather than selecting three columns.
In other DBMS those parentheses are simply useless.
"cleaner" is always subjective, but I find explicit JOIN operators a lot cleaner than the old implicit joins.
The condition "A.next_contact is not null" is actually no necessary because you already have a condition on that column, so NULL values won't be returned anyway.
To get the "most recent one" in Postgres, DISTINCT ON () is usually the best way to do it:
So we end up with something like this:
select distinct on (C.contact_id) C.contact_id, C.lname, C.fname, C.direct_phone, O.org_name, A.next_contact
from Contacts as C
join Organizations as O on C.org_id = O.org_id
join Activities as A on C.contact_id = A.contact_id
where A.next_contact <= 'today'
and A.next_contact > '2018-12-31'
order by c.contact_id, a.next_contact DESC;
On 1/15/19 7:39 AM, Rich Shepard wrote:
Working with my sales/client management system using psql I have a
select
statement to identify contacts to be made. This statement works:select (C.contact_id, C.lname, C.fname, C.direct_phone, O.org_name,
A.next_contact)
from Contacts as C, Organizations as O, Activities as A
where C.org_id = O.org_id and C.contact_id = A.contact_id and
A.next_contact <= 'today' and A.next_contact > '2018-12-31' and
A.next_contact is not null;but would benefit from tweaking. When I have had multiple contacts with
someone I want only the most recent one displayed, not all, and they should
be no more ancient than a defined period (e.g., a month).I want to learn how to make this query cleaner and more flexible. When I
write the UI for this I want to be able to specify a data range in addition
to a fixed 'today'. Pointers on what to read will be very helpful.
For the above I could see using a datepicker widget that allows for
multidate select. The specifics would depend on the software you are
using to write the UI.
TIA,
Rich
--
Adrian Klaver
adrian.klaver@aklaver.com
On 1/15/19 9:47 AM, Thomas Kellerer wrote:
Rich Shepard schrieb am 15.01.2019 um 16:39:
Working with my sales/client management system using psql I have a select
statement to identify contacts to be made. This statement works:select (C.contact_id, C.lname, C.fname, C.direct_phone, O.org_name, A.next_contact)
from Contacts as C, Organizations as O, Activities as A
where C.org_id = O.org_id and C.contact_id = A.contact_id and
A.next_contact <= 'today' and A.next_contact > '2018-12-31' and
A.next_contact is not null;but would benefit from tweaking. When I have had multiple contacts with
someone I want only the most recent one displayed, not all, and they should
be no more ancient than a defined period (e.g., a month).I want to learn how to make this query cleaner and more flexible. When I
write the UI for this I want to be able to specify a data range in addition
to a fixed 'today'. Pointers on what to read will be very helpful.With regards to "cleaner": the first thing to do is to remove the parentheses around the column list.
In Postgres "(a,b,c)" creates a single column with an anonymous record type (that contains three fields), rather than selecting three columns.
In other DBMS those parentheses are simply useless."cleaner" is always subjective, but I find explicit JOIN operators a lot cleaner than the old implicit joins.
The condition "A.next_contact is not null" is actually no necessary because you already have a condition on that column, so NULL values won't be returned anyway.
To get the "most recent one" in Postgres, DISTINCT ON () is usually the best way to do it:
So we end up with something like this:
select distinct on (C.contact_id) C.contact_id, C.lname, C.fname, C.direct_phone, O.org_name, A.next_contact
from Contacts as C
join Organizations as O on C.org_id = O.org_id
join Activities as A on C.contact_id = A.contact_id
where A.next_contact <= 'today'
and A.next_contact > '2018-12-31'
order by c.contact_id, a.next_contact DESC;
And I've never liked this method (though I'm old and crotchety)....
--
Angular momentum makes the world go 'round.
On Tue, 15 Jan 2019, Adrian Klaver wrote:
For the above I could see using a datepicker widget that allows for
multidate select. The specifics would depend on the software you are using
to write the UI.
Adrian,
I'm using wxPython4, and I will use a calendar with that. But, is there a
way to enhance the psql version, too?
Thanks,
Rich
On Tue, 15 Jan 2019, Thomas Kellerer wrote:
With regards to "cleaner": the first thing to do is to remove the
parentheses around the column list. In Postgres "(a,b,c)" creates a single
column with an anonymous record type (that contains three fields), rather
than selecting three columns. In other DBMS those parentheses are simply
useless.
Thomas,
I should have seen that myself. Thanks for pointing out the obvious to me.
"cleaner" is always subjective, but I find explicit JOIN operators a lot
cleaner than the old implicit joins.
This does make sense; I need to refresh my knowledge of JOIN operators and
will do so.
The condition "A.next_contact is not null" is actually no necessary
because you already have a condition on that column, so NULL values won't
be returned anyway.
True that.
To get the "most recent one" in Postgres, DISTINCT ON () is usually the
best way to do it: So we end up with something like this:select distinct on (C.contact_id) C.contact_id, C.lname, C.fname, C.direct_phone, O.org_name, A.next_contact
from Contacts as C
join Organizations as O on C.org_id = O.org_id
join Activities as A on C.contact_id = A.contact_id
where A.next_contact <= 'today'
and A.next_contact > '2018-12-31'
order by c.contact_id, a.next_contact DESC;
This insight really helps. Now I know how to approach other queries.
Many thanks,
Rich
On Tue, 15 Jan 2019, Ron wrote:
And I've never liked this method (though I'm old and crotchety)....
Ron,
I'm older but not crotchety (most days), and I'm taking my SQL knowledge
beyone what I've used in the past. I would appreciate you're explaining why
you don't like the explicit JOINs presented by Thomas. The more I learn from
you more experienced SQL programmers the better I am.
Regards,
Rich
On 1/15/19 8:26 AM, Rich Shepard wrote:
On Tue, 15 Jan 2019, Adrian Klaver wrote:
For the above I could see using a datepicker widget that allows for
multidate select. The specifics would depend on the software you are
using
to write the UI.Adrian,
I'm using wxPython4, and I will use a calendar with that. But, is
there a
way to enhance the psql version, too?
Use BETWEEN?:
https://www.postgresql.org/docs/10/functions-comparison.html
a BETWEEN x AND y between
So:
next_contact BETWEEN '01/01/2019'::date AND '01/15/2019'::date
Or a range function:
https://www.postgresql.org/docs/10/functions-range.html
<@ element is contained by
Thanks,
Rich
--
Adrian Klaver
adrian.klaver@aklaver.com
On 1/15/19 8:02 AM, Ron wrote:
the best way to do it:
So we end up with something like this:
select distinct on (C.contact_id) C.contact_id, C.lname, C.fname,
C.direct_phone, O.org_name, A.next_contact
from Contacts as C
join Organizations as O on C.org_id = O.org_id
join Activities as A on C.contact_id = A.contact_id
where A.next_contact <= 'today'
and A.next_contact > '2018-12-31'
order by c.contact_id, a.next_contact DESC;And I've never liked this method (though I'm old and crotchety)....
What is the specific objection?
To me:
1) Plus side. It saves an explicit sub-query
2) Minus side. It is not portable. But then complete portability is a
unicorn in my opinion.
--
Adrian Klaver
adrian.klaver@aklaver.com
On Tue, 15 Jan 2019, Adrian Klaver wrote:
Use BETWEEN?:
https://www.postgresql.org/docs/10/functions-comparison.html
a BETWEEN x AND y betweenSo:
next_contact BETWEEN '01/01/2019'::date AND '01/15/2019'::date
Or a range function:
Adrian,
Ah, yes! I do need to refresh my SQL knowledge and am working on that.
Thanks very much again,
Rich
Adrian Klaver schrieb am 15.01.2019 um 17:44:
So we end up with something like this:
select distinct on (C.contact_id) C.contact_id, C.lname, C.fname, C.direct_phone, O.org_name, A.next_contact
from Contacts as C
join Organizations as O on C.org_id = O.org_id
join Activities as A on C.contact_id = A.contact_id
where A.next_contact <= 'today'
and A.next_contact > '2018-12-31'
order by c.contact_id, a.next_contact DESC;And I've never liked this method (though I'm old and crotchety)....
What is the specific objection?
To me:
1) Plus side. It saves an explicit sub-query
2) Minus side. It is not portable. But then complete portability is a unicorn in my opinion.
I think Ron was referring to the JOIN operator, rather than the DISTINCT ON
On 1/15/19 9:02 AM, Ron wrote: select distinct on (C.contact_id)
C.contact_id, C.lname, C.fname, C.direct_phone, O.org_name, A.next_contact
from Contacts as C
join Organizations as O on C.org_id = O.org_id
join Activities as A on C.contact_id = A.contact_id
where A.next_contact <= 'today'
and A.next_contact > '2018-12-31'
order by c.contact_id, a.next_contact DESC;And I've never liked this method (though I'm old and crotchety)....
I suspect I can match your old and crotchety, but jump in, the water is
fine (once you get used to it)
On Tue, 15 Jan 2019, Thomas Kellerer wrote:
select distinct on (C.contact_id) C.contact_id, C.lname, C.fname, C.direct_phone, O.org_name, A.next_contact
from Contacts as C
join Organizations as O on C.org_id = O.org_id
join Activities as A on C.contact_id = A.contact_id
where A.next_contact <= 'today'
and A.next_contact > '2018-12-31'
order by c.contact_id, a.next_contact DESC;
Combining this with Adrian's advice to use BETWEEN I have this statement
that almost works:
SELECT DISTINCT ON (c.contact_id) c.contact_id, c.lname, c.fname, c.direct_phone, o.org_name, a.next_contact
FROM Contacts AS c
JOIN Organizations AS o ON c.org_id = o.org_id
JOIN Activities AS a ON c.contact_id = a.contact_id
WHERE next_contact BETWEEN '01/01/2019'::date AND 'today'::date
ORDER BY c.contact_id, a.next_contact DESC;
It fails when the most recent next_contact column in Activities is NULL and
an earier row has a non-NULL date in the specified range.
I tried specifying max(a.next_contact) and added GROUP BY, but the result
set all returned o.org_name columns to the same one.
The WHERE clause needs to exclude a contact_id where the most current row in
Activities has NULL for the next_contact column. I've tried a few ideas but
none work so I need to learn the proper syntax, and I don't find that in
Rick van der Lans' or Joe Celko's books I have.
Looking forward to learning,
Rich
On 1/17/19 8:14 AM, Rich Shepard wrote:
On Tue, 15 Jan 2019, Thomas Kellerer wrote:
select distinct on (C.contact_id) C.contact_id, C.lname, C.fname,
C.direct_phone, O.org_name, A.next_contact
from Contacts as C
join Organizations as O on C.org_id = O.org_id
join Activities as A on C.contact_id = A.contact_id
where A.next_contact <= 'today'
and A.next_contact > '2018-12-31'
order by c.contact_id, a.next_contact DESC;Combining this with Adrian's advice to use BETWEEN I have this statement
that almost works:SELECT DISTINCT ON (c.contact_id) c.contact_id, c.lname, c.fname,
c.direct_phone, o.org_name, a.next_contact
FROM Contacts AS c
JOIN Organizations AS o ON c.org_id = o.org_id
JOIN Activities AS a ON c.contact_id = a.contact_id WHERE
next_contact BETWEEN '01/01/2019'::date AND 'today'::date ORDER BY
c.contact_id, a.next_contact DESC;It fails when the most recent next_contact column in Activities is NULL and
an earier row has a non-NULL date in the specified range.I tried specifying max(a.next_contact) and added GROUP BY, but the result
set all returned o.org_name columns to the same one.The WHERE clause needs to exclude a contact_id where the most current
row in
Activities has NULL for the next_contact column. I've tried a few ideas but
none work so I need to learn the proper syntax, and I don't find that in
Rick van der Lans' or Joe Celko's books I have.
?
...
WHERE
next_contact
BETWEEN
'01/01/2019'::date AND 'today'::date
AND
a.next_contact IS NOT NULL
ORDER BY
c.contact_id, a.next_contact DESC;
Looking forward to learning,
Rich
--
Adrian Klaver
adrian.klaver@aklaver.com
On Thu, Jan 17, 2019 at 9:14 AM Rich Shepard <rshepard@appl-ecosys.com> wrote:
ORDER BY c.contact_id, a.next_contact DESC;
The WHERE clause needs to exclude a contact_id where the most current row in
Activities has NULL for the next_contact column.
Why is next_contact allowed to be null?
Your concept of "most current row" is strictly based upon next_contact
so if next_contact is null it is impossible to know whether it is the
most current row (consider that two activities could be missing the
next_contact date - which of those is the most current?)
David J.
On 1/17/19 8:14 AM, Rich Shepard wrote:
On Tue, 15 Jan 2019, Thomas Kellerer wrote:
select distinct on (C.contact_id) C.contact_id, C.lname, C.fname,
C.direct_phone, O.org_name, A.next_contact
from Contacts as C
join Organizations as O on C.org_id = O.org_id
join Activities as A on C.contact_id = A.contact_id
where A.next_contact <= 'today'
and A.next_contact > '2018-12-31'
order by c.contact_id, a.next_contact DESC;Combining this with Adrian's advice to use BETWEEN I have this statement
that almost works:SELECT DISTINCT ON (c.contact_id) c.contact_id, c.lname, c.fname,
c.direct_phone, o.org_name, a.next_contact
FROM Contacts AS c
JOIN Organizations AS o ON c.org_id = o.org_id
JOIN Activities AS a ON c.contact_id = a.contact_id WHERE
next_contact BETWEEN '01/01/2019'::date AND 'today'::date ORDER BY
c.contact_id, a.next_contact DESC;It fails when the most recent next_contact column in Activities is NULL and
an earier row has a non-NULL date in the specified range.I tried specifying max(a.next_contact) and added GROUP BY, but the result
set all returned o.org_name columns to the same one.The WHERE clause needs to exclude a contact_id where the most current
row in
Activities has NULL for the next_contact column. I've tried a few ideas but
none work so I need to learn the proper syntax, and I don't find that in
Rick van der Lans' or Joe Celko's books I have.
Got to thinking more and realized the answer depends on what you want
the query to produce. Can you let us know what is you are trying to pull
out with the query?
Looking forward to learning,
Rich
--
Adrian Klaver
adrian.klaver@aklaver.com
On Thu, 17 Jan 2019, David G. Johnston wrote:
Why is next_contact allowed to be null?
David,
There are a number of reasons. The prospect might have retired, told me to
pound sand, or has put off a decision.
Your concept of "most current row" is strictly based upon next_contact so
if next_contact is null it is impossible to know whether it is the most
current row (consider that two activities could be missing the
next_contact date - which of those is the most current?)
That's quite true. I realize I've been asking the question incorrectly.
What I want is a list of contacts to make today. This includes ones that
should have been made earlier but weren't and excludes earlier contacts that
have no scheduled next contact (therefore, the nulls.).
Does this clarify what I'm asking of the data?
Thanks,
Rich
On Thu, 17 Jan 2019, Adrian Klaver wrote:
Got to thinking more and realized the answer depends on what you want the
query to produce. Can you let us know what is you are trying to pull out
with the query?
Adrian,
Certainly. Over breakfast I realized the same thing: the existing SELECT
query is not asking for the correct data.
As I replied in response to David's message, what I want is a list of
contacts to make today. This includes ones that should have been made
earlier but weren't and excludes earlier contacts that have no scheduled
next contact (therefore, the nulls.).
Thanks,
Rich
On Thu, Jan 17, 2019 at 9:47 AM Rich Shepard <rshepard@appl-ecosys.com> wrote:
What I want is a list of contacts to make today. This includes ones that
should have been made earlier but weren't and excludes earlier contacts that
have no scheduled next contact (therefore, the nulls.).Does this clarify what I'm asking of the data?
Yes...though now it just sounds like a flawed data model. How stuck
are you in that regard? Those "future" contacts should have their own
records and not be derived via an optional field on an existing
record.
Put differently, how do you know which activities are completed and
which are not?
Dave
On Thu, 17 Jan 2019, David G. Johnston wrote:
Yes...though now it just sounds like a flawed data model.
David,
This is what I thought.
How stuck are you in that regard? Those "future" contacts should have their
own records and not be derived via an optional field on an existing
record.
My goal is to make a functioning business tracking application for my
consulting services. Almost all my prior postgres databases hold
environmental data for statistical and spatio-temporal analyses so writing a
business application is a new experience for me and I want to get it
correct.
Put differently, how do you know which activities are completed and
which are not?
The direct answer is that a completed activity has a row with either a
future next-activity date or a null (which is the case when the status of
that organization or contact is 'no further contact'.)
I should rename the Contacts table as People and the Activities table as
Contacts. The original names came from a sales management system I used as a
design guide, but they're probably confusing to others as well as to me. :-)
I can provide my current schema (eight tables) to the list (perhaps as an
attachment), an individual, or put in on a cloud site and pass the URL.
Thanks,
Rich