dumb question

Started by Steve Clarkalmost 10 years ago10 messagesgeneral
Jump to latest
#1Steve Clark
steve.clark@netwolves.com

Hi List,

I am a noob trying to do something that seems like it should be easy but I can't figure it out.

I have a table like so:

id | ref_id | sts
------------------
1 | | 0
2 | 1 | 1
3 | | 0
4 | | 0
5 | 4 | 1
6 | | 0
7 | 6 | 1

I want to find the max(id) whose sts is 0 but whose id is not referenced by ref_id.

so the answer would be id=3.

Thanks for any pointers,
Steve

--

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Steve Clark (#1)
Re: dumb question

On Thursday, June 2, 2016, Steve Clark <steve.clark@netwolves.com> wrote:

Hi List,

I am a noob trying to do something that seems like it should be easy but I
can't figure it out.

I have a table like so:

id | ref_id | sts
------------------
1 | | 0
2 | 1 | 1
3 | | 0
4 | | 0
5 | 4 | 1
6 | | 0
7 | 6 | 1

I want to find the max(id) whose sts is 0 but whose id is not referenced
by ref_id.

so the answer would be id=3.

Thanks for any pointers,
Steve

So, of all the rows whose sts is 0 and the id is not in (or not exists)
ref_id....subquery for selection
Give me the maximum id...parent query with group by.

David J.

#3Steve Crawford
scrawford@pinpointresearch.com
In reply to: Steve Clark (#1)
Re: dumb question

Something like:

select max(id) from yourtable where sts=0 and ref_id is null;

That assumes that ref_id is null. It would help to see your table structure
and the query you tried that doesn't work. If ref_id is actually a
character string then you might need ref_id='' or coalesce(ref_id,'')='' if
it can be null or empty string.

Cheers,
Steve

On Thu, Jun 2, 2016 at 10:16 AM, Steve Clark <steve.clark@netwolves.com>
wrote:

Show quoted text

Hi List,

I am a noob trying to do something that seems like it should be easy but I
can't figure it out.

I have a table like so:

id | ref_id | sts
------------------
1 | | 0
2 | 1 | 1
3 | | 0
4 | | 0
5 | 4 | 1
6 | | 0
7 | 6 | 1

I want to find the max(id) whose sts is 0 but whose id is not referenced
by ref_id.

so the answer would be id=3.

Thanks for any pointers,
Steve

--

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Felipe Santos
felipepts@gmail.com
In reply to: Steve Crawford (#3)
Re: dumb question

2016-06-02 14:23 GMT-03:00 Steve Crawford <scrawford@pinpointresearch.com>:

Something like:

select max(id) from yourtable where sts=0 and ref_id is null;

That assumes that ref_id is null. It would help to see your table
structure and the query you tried that doesn't work. If ref_id is actually
a character string then you might need ref_id='' or coalesce(ref_id,'')=''
if it can be null or empty string.

Cheers,
Steve

On Thu, Jun 2, 2016 at 10:16 AM, Steve Clark <steve.clark@netwolves.com>
wrote:

Hi List,

I am a noob trying to do something that seems like it should be easy but
I can't figure it out.

I have a table like so:

id | ref_id | sts
------------------
1 | | 0
2 | 1 | 1
3 | | 0
4 | | 0
5 | 4 | 1
6 | | 0
7 | 6 | 1

I want to find the max(id) whose sts is 0 but whose id is not referenced
by ref_id.

so the answer would be id=3.

Thanks for any pointers,
Steve

--

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

I think sts=0 means ref_id is null

So, what I think he wants to achieve is:

select max(id) from yourtable where sts=0 and id not in (select ref_id from
yourtable);

Isn't it?

#5Steve Crawford
scrawford@pinpointresearch.com
In reply to: Felipe Santos (#4)
Re: dumb question

On Thu, Jun 2, 2016 at 10:40 AM, Felipe Santos <felipepts@gmail.com> wrote:

2016-06-02 14:23 GMT-03:00 Steve Crawford <scrawford@pinpointresearch.com>
:

Something like:

select max(id) from yourtable where sts=0 and ref_id is null;

That assumes that ref_id is null. It would help to see your table
structure and the query you tried that doesn't work. If ref_id is actually
a character string then you might need ref_id='' or coalesce(ref_id,'')=''
if it can be null or empty string.

Cheers,
Steve

On Thu, Jun 2, 2016 at 10:16 AM, Steve Clark <steve.clark@netwolves.com>
wrote:

Hi List,

I am a noob trying to do something that seems like it should be easy but
I can't figure it out.

I have a table like so:

id | ref_id | sts
------------------
1 | | 0
2 | 1 | 1
3 | | 0
4 | | 0
5 | 4 | 1
6 | | 0
7 | 6 | 1

I want to find the max(id) whose sts is 0 but whose id is not referenced
by ref_id.

so the answer would be id=3.

Thanks for any pointers,
Steve

--

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

I think sts=0 means ref_id is null

So, what I think he wants to achieve is:

select max(id) from yourtable where sts=0 and id not in (select ref_id
from yourtable);

Isn't it?

The OP will need to explain further as we are all guessing. As I mentioned

in my earlier (accidental top - curses GMail) post, table structures and
the query or queries that don't work would be useful. So would a
description of the problem that is being solved since there could be better
approaches.

Cheers,
Steve

#6David G. Johnston
david.g.johnston@gmail.com
In reply to: Steve Crawford (#5)
Re: dumb question

On Thu, Jun 2, 2016 at 1:48 PM, Steve Crawford <
scrawford@pinpointresearch.com> wrote:

On Thu, Jun 2, 2016 at 10:40 AM, Felipe Santos <felipepts@gmail.com>
wrote:

I think sts=0 means ref_id is null

So, what I think he wants to achieve is:

select max(id) from yourtable where sts=0 and id not in (select ref_id
from yourtable);

Isn't it?

The OP will need to explain further as we are all guessing. As I

mentioned in my earlier (accidental top - curses GMail) post, table
structures and the query or queries that don't work would be useful. So
would a description of the problem that is being solved since there could
be better approaches.

​Maybe we mis-interpreted but as written this is the solution. My out-loud
thinking was a more verbose version of this.​

​David J.​

#7John R Pierce
pierce@hogranch.com
In reply to: David G. Johnston (#6)
Re: dumb question

On 6/2/2016 11:10 AM, Steve Clark wrote:

Thanks all the below seem to do the trick.

On 06/02/2016 01:58 PM, David G. Johnston wrote:

select max(id) from yourtable where sts=0 and id not in (select
ref_id from yourtable);

select max(id) from yourtable where sts=0 and id not in (select ref_id
from yourtable);

do note, this is whats known as an 'anti-join', and these can be pretty
expensive on large tables.

--
john r pierce, recycling bits in santa cruz

#8David G. Johnston
david.g.johnston@gmail.com
In reply to: John R Pierce (#7)
Re: dumb question

On Thu, Jun 2, 2016 at 4:11 PM, John R Pierce <pierce@hogranch.com> wrote:

On 6/2/2016 11:10 AM, Steve Clark wrote:

Thanks all the below seem to do the trick.

On 06/02/2016 01:58 PM, David G. Johnston wrote:

select max(id) from yourtable where sts=0 and id not in (select ref_id
from yourtable);

select max(id) from yourtable where sts=0 and id not in (select ref_id
from yourtable);

do note, this is whats known as an 'anti-join', and these can be pretty
expensive on large tables.

​+1

​Though I suspect that with a partial index on (id, sts=0) and (ref_id,
ref_id IS NOT NULL), though highly sensitive to density, that even for
large​ total row counts it would perform pretty well; but I'm not
knowledgeable in how smart we are here. Selecting, in descending order,
(id where sts = 0), from the index and then poking into index(ref_id)
should, particularly if the cross-set is sparse, pretty quickly find a
non-match.

David J.

#9Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: David G. Johnston (#8)
Re: dumb question

On Thu, Jun 2, 2016 at 3:23 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:

On Thu, Jun 2, 2016 at 4:11 PM, John R Pierce <pierce@hogranch.com> wrote:

Thanks all the below seem to do the trick.

I doubt it -- using NOT IN requires (per the SQL specification)
handling NULLs in a way that probably does not give you the answer
you want. (NOT IN also is often much slower than the NOT EXISTS
test which will actually give you the answer you want.)

test=# create table t (id int not null primary key, ref_id int, sts
int not null default 0);
CREATE TABLE
test=# insert into t values
(1,null,0),(2,1,1),(3,null,0),(4,null,0),(5,4,1),(6,null,0),(7,6,1);
INSERT 0 7
test=# select max(id) from t where sts=0 and id not in (select ref_id from t);
max
-----

(1 row)

test=# select max(id) from t t1 where sts = 0 and not exists (select *
from t t2 where t2.ref_id = t1.id);
max
-----
3
(1 row)

Note that providing minimal setup (like the above) helps in getting
good answers quickly.

do note, this is whats known as an 'anti-join', and these can be pretty
expensive on large tables.

+1

*Can* be. Proper indexing can make them very reasonable.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#10David G. Johnston
david.g.johnston@gmail.com
In reply to: Kevin Grittner (#9)
Re: dumb question

On Thu, Jun 2, 2016 at 5:44 PM, Kevin Grittner <kgrittn@gmail.com> wrote:

On Thu, Jun 2, 2016 at 3:23 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:

On Thu, Jun 2, 2016 at 4:11 PM, John R Pierce <pierce@hogranch.com>

wrote:

Thanks all the below seem to do the trick.

I doubt it -- using NOT IN requires (per the SQL specification)
handling NULLs in a way that probably does not give you the answer
you want. (NOT IN also is often much slower than the NOT EXISTS
test which will actually give you the answer you want.)

test=# create table t (id int not null primary key, ref_id int, sts
int not null default 0);
CREATE TABLE
test=# insert into t values
(1,null,0),(2,1,1),(3,null,0),(4,null,0),(5,4,1),(6,null,0),(7,6,1);
INSERT 0 7
test=# select max(id) from t where sts=0 and id not in (select ref_id from
t);
max
-----

(1 row)

test=# select max(id) from t t1 where sts = 0 and not exists (select *
from t t2 where t2.ref_id = t1.id);
max
-----
3
(1 row)

Note that providing minimal setup (like the above) helps in getting
good answers quickly.

do note, this is whats known as an 'anti-join', and these can be pretty
expensive on large tables.

+1

*Can* be. Proper indexing can make them very reasonable.

​Doh (me)...

Indeed, NOT IN (...) doesn't qualify as an anti-join since (for one) it
cannot (I don't think) be optimized in this way as the entire contents of
the IN() need to be determined. IOW, its not really a join but just
another predicate condition whose one side is a subquery.

That said, writing out a full anti-join NOT EXISTS (or, similarly, a
semi-join EXISTS) clause can be a bit tedious for ad-hoc stuff while the
IN() variation is a bit more succinct and, I'd venture to say,
unfortunately familiar. In can be made to work in this situation by
writing the expression as IN (SELECT ref_id FROM t WHERE ref_id IS NOT
NULL).

David J.