Duplicate records returned

Started by Nonamealmost 19 years ago3 messagesgeneral
Jump to latest
#1Noname
daniel65456@gmail.com

I'm getting duplicate rows returned. I don't know know and can't find
out how to construct the SQL to return what I want. I have an old
version of postgres which I cannot upgrade. I'm not even sure how to
get the version. Does this help?

$ grep -i version ..../pi/bin/setup
wtversion='2.0'
export wtversion

My SQL knowledge is quite limited so I've searched all over the net
and read doco but I can't figure this out.

I want to find the total number of hours worked on all projects with
projects.parent="Projects", projects.pct<100, restrictions.hidden=5
(not hidden)

I keep getting duplicate records and I think it's it's becuase I'm
joining a table on 2 other tables. I've tried INTERSECT but I keep
getting parse errors and can't work out why that is. I've tried
creating a temp table but still have to do the twin joins!! I've also
tried creating a view but without success

Here's an example of correct data, a single user and a single project

select username, sum(hours)
from timerecs
where project like 'Testing'
and username = 'long'
group by username
;

username |sum
---------+---
long |127

but there will be many projects to process and to select the
applicable projects requires test on 2 other tables, projects and
restrictions

$the_sql = " SELECT projectname, username, sum(hours)";
$the_sql .= " FROM timerecs";
$the_sql .= " WHERE projectname = projects.projectname ";
$the_sql .= " AND projectname = restrictions.projectname";
$the_sql .= " AND projects.parent = 'Projects'";
$the_sql .= " AND projects.pct < 100";
$the_sql .= " AND restrictions.hidden = 5";
$the_sql .= " AND projectname = 'Testing'"; # just for tsting
$the_sql .= " AND username = 'long'"; # just for testing
$the_sql .= " AND projectname = projects.projectname ";
$the_sql .= " GROUP BY projectname, username";
$the_sql .= " ORDER BY projectname, username";
$the_sql .= " ;";

produces

1 Testing|long|254

How do I get the right list of projectname from timerecs by joining
with the projects and restrictions tables?
I've tried SELECT DISTINCT projectname but make no difference. If I
take the 'restrictions' join out it's fine. I've also tried prefacing
all column names with table names without any change

Thanks

#2Richard Huxton
dev@archonet.com
In reply to: Noname (#1)
Re: Duplicate records returned

daniel65456@gmail.com wrote:

I'm getting duplicate rows returned. I don't know know and can't find
out how to construct the SQL to return what I want. I have an old
version of postgres which I cannot upgrade. I'm not even sure how to
get the version. Does this help?

$ grep -i version ..../pi/bin/setup
wtversion='2.0'
export wtversion

Nope - nothing to do with PostgreSQL I'm afraid.

Try issuing "SELECT version()" as an SQL statement.

My SQL knowledge is quite limited so I've searched all over the net
and read doco but I can't figure this out.

I want to find the total number of hours worked on all projects with
projects.parent="Projects", projects.pct<100, restrictions.hidden=5
(not hidden)

I keep getting duplicate records and I think it's it's becuase I'm
joining a table on 2 other tables. I've tried INTERSECT but I keep
getting parse errors and can't work out why that is. I've tried
creating a temp table but still have to do the twin joins!! I've also
tried creating a view but without success

but there will be many projects to process and to select the
applicable projects requires test on 2 other tables, projects and
restrictions

$the_sql = " SELECT projectname, username, sum(hours)";
$the_sql .= " FROM timerecs";
$the_sql .= " WHERE projectname = projects.projectname ";
$the_sql .= " AND projectname = restrictions.projectname";
$the_sql .= " AND projects.parent = 'Projects'";
$the_sql .= " AND projects.pct < 100";
$the_sql .= " AND restrictions.hidden = 5";
$the_sql .= " AND projectname = 'Testing'"; # just for tsting
$the_sql .= " AND username = 'long'"; # just for testing
$the_sql .= " AND projectname = projects.projectname ";
$the_sql .= " GROUP BY projectname, username";
$the_sql .= " ORDER BY projectname, username";
$the_sql .= " ;";

You might want to read up on "HERE documents" for multi-line blocks of text.

1. You've also not put all your tables into the FROM clause:
FROM timerecs, projects, restrictions
This *should* be generating a warning of some kind
2. You're not qualifying which column comes from which table, which
makes it harder to see what's happening. Try:
FROM timerecs t, projects p, restrictions r
WHERE t.projectname = p.projectname
AND ...
That's called table aliasing, where you give a short name to tables.
3. This query *can't* give duplicates for (projectname,username) pairs
unless you're activating a bug. The GROUP BY eliminates duplicates.

produces

1 Testing|long|254

How do I get the right list of projectname from timerecs by joining
with the projects and restrictions tables?

You've not said what "right" means to you.

I've tried SELECT DISTINCT projectname but make no difference. If I
take the 'restrictions' join out it's fine. I've also tried prefacing
all column names with table names without any change

Try the table aliasing, then post the query again (oh, trim the perl/php
if you could) along with some sample data, the results and what the
results should be. Otherwise there's not much anyone can say.

--
Richard Huxton
Archonet Ltd

#3Harald Fuchs
hf0217x@protecting.net
In reply to: Noname (#1)
Re: Duplicate records returned

In article <4680D3E9.7020706@archonet.com>,
Richard Huxton <dev@archonet.com> writes:

$the_sql = " SELECT projectname, username, sum(hours)";
$the_sql .= " FROM timerecs";
$the_sql .= " WHERE projectname = projects.projectname ";
$the_sql .= " AND projectname = restrictions.projectname";
$the_sql .= " AND projects.parent = 'Projects'";
$the_sql .= " AND projects.pct < 100";
$the_sql .= " AND restrictions.hidden = 5";
$the_sql .= " AND projectname = 'Testing'"; # just for tsting
$the_sql .= " AND username = 'long'"; # just for testing
$the_sql .= " AND projectname = projects.projectname ";
$the_sql .= " GROUP BY projectname, username";
$the_sql .= " ORDER BY projectname, username";
$the_sql .= " ;";

You might want to read up on "HERE documents" for multi-line blocks of text.

In case the above code is Perl, I think

my $sql = q{
SELECT ...
FROM ...
WHERE ...
GROUP ...
};

looks nicer than a here-document.

1. You've also not put all your tables into the FROM clause:
FROM timerecs, projects, restrictions
This *should* be generating a warning of some kind
2. You're not qualifying which column comes from which table, which
makes it harder to see what's happening. Try:
FROM timerecs t, projects p, restrictions r
WHERE t.projectname = p.projectname
AND ...
That's called table aliasing, where you give a short name to tables.

This still mixes JOIN conditions with other result restrictions.

SELECT ...
FROM timerecs t
JOIN projects p ON p.projectname = t.projectname
...

makes it more explicit.