Identifying Reason for Column Name Returned by SELECT

Started by Rich Shepardover 14 years ago10 messagesgeneral
Jump to latest
#1Rich Shepard
rshepard@appl-ecosys.com

I run this SELECT statement on a table:

select distinct(site_id) from chemistry order by site_id;

and in the returned set I see:

GW-21
GW-22
GW-22 +

GW-24

I want to find that row returning 'GW-22 +' because I believe it
should be 'GW-23'. However, my attempts to retrieve that row keep failing.
I've tried these statements:

select count(*) from chemistry where site_id = 'GW-22 +';
count
-------
0
(1 row)

yet,

select count(*) from chemistry where site_id = 'GW-22';
count
-------
803
(1 row)

Looking for the blank row also fails when I try to specify site_id as is
null, = ' ', or =''.

Please point me to the proper way of finding this rogue row so I can
correct the value in the site_id column.

TIA,

Rich

#2Steve Crawford
scrawford@pinpointresearch.com
In reply to: Rich Shepard (#1)
Re: Identifying Reason for Column Name Returned by SELECT

I suspect you have a multi-line entry and the '+' is just indicating
that the field continues.

Try ...where site_id ~ 'GW-22'... (this may take a while if the table is
very large).

Cheers,
Steve

Show quoted text

On 09/14/2011 09:35 AM, Rich Shepard wrote:

I run this SELECT statement on a table:

select distinct(site_id) from chemistry order by site_id;

and in the returned set I see:

GW-21
GW-22
GW-22 +

GW-24

I want to find that row returning 'GW-22 +' because I believe it
should be 'GW-23'. However, my attempts to retrieve that row keep
failing.
I've tried these statements:

select count(*) from chemistry where site_id = 'GW-22 +';
count -------
0
(1 row)

yet,

select count(*) from chemistry where site_id = 'GW-22';
count -------
803
(1 row)

Looking for the blank row also fails when I try to specify site_id
as is
null, = ' ', or =''.

Please point me to the proper way of finding this rogue row so I can
correct the value in the site_id column.

TIA,

Rich

#3Darren Duncan
darren@darrenduncan.net
In reply to: Rich Shepard (#1)
Re: Identifying Reason for Column Name Returned by SELECT

Your example suggests that the "GW-22" is a substring of the field followed by
trailing spaces so you'll want something that searches substrings, whereas "="
will always just test on matching the entire field.

Try "like" by default, such as "where site_id like 'GW-22 %'". I added the
space between the 22 and the wildcard % so that the field containing just
'GW-22' isn't also matched.

If you need something more specific than simple substring match, you can use a
regular expression, which I think is spelled like "where site_id ~ '...'" but
check the manual to be sure about regexp syntax. But "like" will probably do
you here.

-- Darren Duncan

Rich Shepard wrote:

Show quoted text

I run this SELECT statement on a table:

select distinct(site_id) from chemistry order by site_id;

and in the returned set I see:

GW-21
GW-22
GW-22 +

GW-24

I want to find that row returning 'GW-22 +' because I believe it
should be 'GW-23'. However, my attempts to retrieve that row keep failing.
I've tried these statements:

select count(*) from chemistry where site_id = 'GW-22 +';
count -------
0
(1 row)

yet,

select count(*) from chemistry where site_id = 'GW-22';
count -------
803
(1 row)

Looking for the blank row also fails when I try to specify site_id as is
null, = ' ', or =''.

Please point me to the proper way of finding this rogue row so I can
correct the value in the site_id column.

TIA,

Rich

#4Richard Broersma
richard.broersma@gmail.com
In reply to: Steve Crawford (#2)
Re: Identifying Reason for Column Name Returned by SELECT

On Wed, Sep 14, 2011 at 9:42 AM, Steve Crawford
<scrawford@pinpointresearch.com> wrote:

I suspect you have a multi-line entry and the '+' is just indicating that
the field continues.

Try ...where site_id ~ 'GW-22'... (this may take a while if the table is
very large).

You might be able to get an index scan if you include a few more criteria:

AND site_id BETWEEN 'GW-22' AND 'GW-23'

Also WHERE site_id '^GW-22' may use and index scan also.

--
Regards,
Richard Broersma Jr.

#5Darren Duncan
darren@darrenduncan.net
In reply to: Darren Duncan (#3)
Re: Identifying Reason for Column Name Returned by SELECT

Darren Duncan wrote:

Try "like" by default, such as "where site_id like 'GW-22 %'". I added
the space between the 22 and the wildcard % so that the field containing
just 'GW-22' isn't also matched.

Sorry, I should have said "where site_id like 'GW-22%' and site_id != 'GW-22'"
(no explicit space) as a better way to exclude 'GW-22' from the results. --
Darren Duncan

#6Rich Shepard
rshepard@appl-ecosys.com
In reply to: Steve Crawford (#2)
Re: Identifying Reason for Column Name Returned by SELECT

On Wed, 14 Sep 2011, Steve Crawford wrote:

I suspect you have a multi-line entry and the '+' is just indicating that the
field continues.

Steve, et al.:

It's not multi-line, but malformed.

Try ...where site_id ~ 'GW-22'... (this may take a while if the table is very
large).

This found the appropriate rows. Now, my question is DDL-related:

What is the appropriate syntax to change 'GW-22 +' to GW-22? Can I use
'like' or '~' in an ALTER TABLE RENAME <column> ... statement?

Thanks guys,

Rich

#7Richard Broersma
richard.broersma@gmail.com
In reply to: Rich Shepard (#6)
Re: Identifying Reason for Column Name Returned by SELECT

On Wed, Sep 14, 2011 at 10:04 AM, Rich Shepard <rshepard@appl-ecosys.com> wrote:

 This found the appropriate rows. Now, my question is DDL-related:

 What is the appropriate syntax to change 'GW-22  +' to GW-22? Can I use
'like' or '~' in an ALTER TABLE RENAME <column> ... statement?

I'm confused. Do you want to UPDATE the affected records to GW-22. Or
do you want to ALTER the table to add a column constraint to prevent
malformed site_id's in the future?

--
Regards,
Richard Broersma Jr.

#8Rich Shepard
rshepard@appl-ecosys.com
In reply to: Richard Broersma (#7)
Re: Identifying Reason for Column Name Returned by SELECT

On Wed, 14 Sep 2011, Richard Broersma wrote:

I'm confused.

Richard,

Apparently, I am also confused. Doing too many things simultaneoulsy.

Do you want to UPDATE the affected records to GW-22. Or do you want to
ALTER the table to add a column constraint to prevent malformed site_id's
in the future?

Update. That makes my question moot.

Thanks,

Rich

#9Merlin Moncure
mmoncure@gmail.com
In reply to: Rich Shepard (#6)
Re: Identifying Reason for Column Name Returned by SELECT

On Wed, Sep 14, 2011 at 12:04 PM, Rich Shepard <rshepard@appl-ecosys.com> wrote:

On Wed, 14 Sep 2011, Steve Crawford wrote:

I suspect you have a multi-line entry and the '+' is just indicating that
the field continues.

Steve, et al.:

 It's not multi-line, but malformed.

It *is* mult-line. psql uses a '+ to show line breaks:

postgres=# select E'a\nb';
?column?
----------
a +
b
(1 row)

To fix your data, i'd consider using the replace() function to knock
out newlines:
postgres=# select replace(E'a\nb', E'\n', '');
replace
---------
ab
(1 row)

Also consider adjusting the app and/or the database to block them in the future.

merlin

#10Rich Shepard
rshepard@appl-ecosys.com
In reply to: Merlin Moncure (#9)
Re: Identifying Reason for Column Name Returned by SELECT

On Wed, 14 Sep 2011, Merlin Moncure wrote:

It *is* mult-line. psql uses a '+ to show line breaks:

Merlin,

Yep. I discovered this when I dumped the table as an ASCII text file and
saw the '\n' after the site_id string on some rows. I've no idea how it got
there.

Thanks,

Rich