(un)grouping question

Started by Rhys A.D. Stewartabout 18 years ago12 messagesgeneral
Jump to latest
#1Rhys A.D. Stewart
rhys.stewart@gmail.com

Hi list,

have the following table

uid|somevalue
--------------------
1|11
2|44
3|31
4|44
5|71
6|33
7|33
8|44
9|14

would like to remove the duplicate values in the column somevalue. doing
this by just adding a random number is perfectly fine, however i want to
retain at least one of the original values of somevalue. Any ideas how to do
this in in a query?

Rhys,

Peace & Love|Live Long & Prosper

#2Tonkuma
tonkuma@fiberbit.net
In reply to: Rhys A.D. Stewart (#1)
Re: (un)grouping question

Don't this satisfy your requirement?
(This isn't tested. There may be some syntax error.)
DELETE FROM your_table T
WHERE uid >
(SELECT MIN(uid)
FROM your_table M
WHERE M.somevalue = T.somevalue
)
;

The result I expected is:
SELECT * FROM your_table;

uid|somevalue
--------------------
1|11
3|31
6|33
2|44
5|71

#3Reece Hart
reece@harts.net
In reply to: Rhys A.D. Stewart (#1)
Re: (un)grouping question

On Mon, 2008-01-21 at 12:36 -0500, Rhys Stewart wrote:

uid|somevalue
--------------------
1|11
2|44
3|31
4|44
5|71
6|33
7|33
8|44
9|14

would like to remove the duplicate values in the column somevalue.
doing this by just adding a random number is perfectly fine, however
i want to retain at least one of the original values of somevalue. Any
ideas how to do this in in a query?

There's certainly no need for a random number hack. Instead, use a query
like 'select distinct on (somevalue) * from mytable;' .
Although DISTINCT is standard, the DISTINCT ON (cols) variant is a
PostgreSQL-ism.

See
http://www.postgresql.org/docs/8.2/static/queries-select-lists.html#QUERIES-DISTINCT .

-Reece, your heterographic brother

--
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0

#4Jeff Davis
pgsql@j-davis.com
In reply to: Rhys A.D. Stewart (#1)
Re: (un)grouping question

On Mon, 2008-01-21 at 12:36 -0500, Rhys Stewart wrote:

Hi list,

have the following table

uid|somevalue
--------------------
1|11
2|44
3|31
4|44
5|71
6|33
7|33
8|44
9|14

would like to remove the duplicate values in the column somevalue.
doing this by just adding a random number is perfectly fine, however
i want to retain at least one of the original values of somevalue. Any
ideas how to do this in in a query?

Would something like this help?

SELECT MIN(uid), somevalue FROM mytable GROUP BY somevalue;

Also consider just doing:

SELECT DISTINCT somevalue FROM mytable;

...if you don't need uid in the result set.

Regards,
Jeff Davis

#5Rhys A.D. Stewart
rhys.stewart@gmail.com
In reply to: Jeff Davis (#4)
Re: (un)grouping question

ok, let me clarify, dont want to remove them just want them changed but need
to keep the uid. However, I would like just one somevalue to remain the
same. so for example, uids, 2,4 and 8 have somevalue 44, after i would like
2 to remain 44 but uids 4 and 8 would be changed.
2008/1/21, Jeff Davis <pgsql@j-davis.com>:

Show quoted text

On Mon, 2008-01-21 at 12:36 -0500, Rhys Stewart wrote:

Hi list,

have the following table

uid|somevalue
--------------------
1|11
2|44
3|31
4|44
5|71
6|33
7|33
8|44
9|14

would like to remove the duplicate values in the column somevalue.
doing this by just adding a random number is perfectly fine, however
i want to retain at least one of the original values of somevalue. Any
ideas how to do this in in a query?

Would something like this help?

SELECT MIN(uid), somevalue FROM mytable GROUP BY somevalue;

Also consider just doing:

SELECT DISTINCT somevalue FROM mytable;

...if you don't need uid in the result set.

Regards,
Jeff Davis

#6Scott Marlowe
scott.marlowe@gmail.com
In reply to: Rhys A.D. Stewart (#1)
Re: (un)grouping question

On Jan 21, 2008 11:36 AM, Rhys Stewart <rhys.stewart@gmail.com> wrote:

Hi list,

have the following table

uid|somevalue
--------------------
1|11
2|44
3|31
4|44
5|71
6|33
7|33
8|44
9|14

would like to remove the duplicate values in the column somevalue. doing
this by just adding a random number is perfectly fine, however i want to
retain at least one of the original values of somevalue. Any ideas how to do
this in in a query?

I can get you halfway there. You want a query something like this to
identify all but one of the values:

select a.uid from sometable a left join sometable b on
(a.somevale=b.somevalue and a.uid > b.uid)

From there, you'll have to figure out the update part of problem. :)

#7Scott Marlowe
scott.marlowe@gmail.com
In reply to: Scott Marlowe (#6)
Re: (un)grouping question

On Jan 21, 2008 1:45 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:

On Jan 21, 2008 11:36 AM, Rhys Stewart <rhys.stewart@gmail.com> wrote:

Hi list,

have the following table

uid|somevalue
--------------------
1|11
2|44
3|31
4|44
5|71
6|33
7|33
8|44
9|14

would like to remove the duplicate values in the column somevalue. doing
this by just adding a random number is perfectly fine, however i want to
retain at least one of the original values of somevalue. Any ideas how to do
this in in a query?

I can get you halfway there. You want a query something like this to
identify all but one of the values:

select a.uid from sometable a left join sometable b on
(a.somevale=b.somevalue and a.uid > b.uid)

From there, you'll have to figure out the update part of problem. :)

Oh yea, you might need a distinct in that select query.

#8Andrei Kovalevski
andyk@commandprompt.com
In reply to: Rhys A.D. Stewart (#5)
Re: (un)grouping question

May be this is what you need:

select
test.uid, coalesce(t.somevalue + a.max + t.uid, test.somevalue)
from
test
left outer join
(select
*
from
test
where
(uid, somevalue) not in
(select min(uid), somevalue from test group by somevalue)
) t on (test.uid = t.uid),
(select max(somevalue) from test) a

Rhys Stewart wrote:

ok, let me clarify, dont want to remove them just want them changed
but need to keep the uid. However, I would like just one somevalue to
remain the same. so for example, uids, 2,4 and 8 have somevalue 44,
after i would like 2 to remain 44 but uids 4 and 8 would be changed.
2008/1/21, Jeff Davis <pgsql@j-davis.com <mailto:pgsql@j-davis.com>>:

On Mon, 2008-01-21 at 12:36 -0500, Rhys Stewart wrote:

Hi list,

have the following table

uid|somevalue
--------------------
1|11
2|44
3|31
4|44
5|71
6|33
7|33
8|44
9|14

would like to remove the duplicate values in the column somevalue.
doing this by just adding a random number is perfectly fine,

however

i want to retain at least one of the original values of

somevalue. Any

ideas how to do this in in a query?

Would something like this help?

SELECT MIN(uid), somevalue FROM mytable GROUP BY somevalue;

Also consider just doing:

SELECT DISTINCT somevalue FROM mytable;

...if you don't need uid in the result set.

Regards,
Jeff Davis

--
Andrei Kovalevski
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/

#9Jeff Davis
pgsql@j-davis.com
In reply to: Rhys A.D. Stewart (#5)
Re: (un)grouping question

On Mon, 2008-01-21 at 14:25 -0500, Rhys Stewart wrote:

ok, let me clarify, dont want to remove them just want them changed
but need to keep the uid. However, I would like just one somevalue to
remain the same. so for example, uids, 2,4 and 8 have somevalue 44,
after i would like 2 to remain 44 but uids 4 and 8 would be changed.

Can you explain why you're trying to do this? It's a very unusual
requirement.

That being said, the query would look something like this:

(SELECT MIN(uid) AS uid, somevalue FROM mytable GROUP BY somevalue)
UNION
(SELECT uid, somevalue + random() AS somevalue
FROM mytable WHERE uid NOT IN
(SELECT MIN(uid)
FROM mytable GROUP by somevalue)

Disclaimer: I haven't actually tested this query, but it looks about
right.

Regards,
Jeff Davis

#10Andrei Kovalevski
andyk@commandprompt.com
In reply to: Jeff Davis (#9)
Re: (un)grouping question

Jeff Davis wrote:

On Mon, 2008-01-21 at 14:25 -0500, Rhys Stewart wrote:

ok, let me clarify, dont want to remove them just want them changed
but need to keep the uid. However, I would like just one somevalue to
remain the same. so for example, uids, 2,4 and 8 have somevalue 44,
after i would like 2 to remain 44 but uids 4 and 8 would be changed.

Can you explain why you're trying to do this? It's a very unusual
requirement.

That being said, the query would look something like this:

(SELECT MIN(uid) AS uid, somevalue FROM mytable GROUP BY somevalue)
UNION
(SELECT uid, somevalue + random() AS somevalue
FROM mytable WHERE uid NOT IN
(SELECT MIN(uid)
FROM mytable GROUP by somevalue)

Disclaimer: I haven't actually tested this query, but it looks about
right.

How can you garantee that "somevalue + random()" won't duplicate other
unique values in this column? ;)

Regards,
Jeff Davis

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

--
Andrei Kovalevski
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/

#11Reece Hart
reece@harts.net
In reply to: Andrei Kovalevski (#10)
Re: (un)grouping question

On Mon, 2008-01-21 at 22:05 +0200, Andrei Kovalevski wrote:

How can you garantee that "somevalue + random()" won't duplicate other
unique values in this column? ;)

Like the inane memo from the boss says, "expect the unexpected" and
"tell me all the unpredictable issues that will happen".

If uid is unique then something like
=> select uid,somevalue,somevalue||'-'||uid from mytable
is much better.

-Reece

--
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0

#12Rhys A.D. Stewart
rhys.stewart@gmail.com
In reply to: Andrei Kovalevski (#8)
Re: (un)grouping question

Great, this does the trick thanks!!

um... somevalue+random() is a simplified version of what I really wanted to
do, i just wante the general idea of what the query would look like.

2008/1/21, Andrei Kovalevski <andyk@commandprompt.com>:

Show quoted text

May be this is what you need:

select
test.uid, coalesce(t.somevalue + a.max + t.uid, test.somevalue)
from
test
left outer join
(select
*
from
test
where
(uid, somevalue) not in
(select min(uid), somevalue from test group by somevalue)
) t on (test.uid = t.uid),
(select max(somevalue) from test) a

Rhys Stewart wrote:

ok, let me clarify, dont want to remove them just want them changed
but need to keep the uid. However, I would like just one somevalue to
remain the same. so for example, uids, 2,4 and 8 have somevalue 44,
after i would like 2 to remain 44 but uids 4 and 8 would be changed.
2008/1/21, Jeff Davis <pgsql@j-davis.com <mailto:pgsql@j-davis.com>>:

On Mon, 2008-01-21 at 12:36 -0500, Rhys Stewart wrote:

Hi list,

have the following table

uid|somevalue
--------------------
1|11
2|44
3|31
4|44
5|71
6|33
7|33
8|44
9|14

would like to remove the duplicate values in the column somevalue.
doing this by just adding a random number is perfectly fine,

however

i want to retain at least one of the original values of

somevalue. Any

ideas how to do this in in a query?

Would something like this help?

SELECT MIN(uid), somevalue FROM mytable GROUP BY somevalue;

Also consider just doing:

SELECT DISTINCT somevalue FROM mytable;

...if you don't need uid in the result set.

Regards,
Jeff Davis

--
Andrei Kovalevski
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/