origins/destinations
Hi list,
I have (what I thought was) a relatively simple problem, but my
knowledge of sql is just not good enough to get this done:
I have a table which is basically a number of individuals with both
their origin and destination as columns (see Table 1). In this case,
origins and destinations are the census area in which they and work.
What I would like to do is generate an nxn matrix (preferably output to
csv but I'll take what I can get), where origins are on the y axis, and
destinations on the x axis (see Table 3).
I can already group by both origins and destinations to produce Table 2,
but I don't know what steps are needed to get to Table 3. Any help or
suggestions are greatly appreciated!
Table 1
id | origin | destination
1 area1 area5
2 area1 area5
3 area1 area5
4 area2 area4
5 area4 area2
6 area5 area5
7 area2 area4
8 area2 area4
9 area4 area3
10 area3 area5
...
Table 2
id | origin | destination | count
1 area1 area5 3
4 area2 area4 3
5 area4 area2 1
6 area5 area5 1
9 area4 area3 1
10 area3 area5 1
...
Table 3
origins | area1 | area2 | area3 | area4 | area5 | ...
area1 0 0 0 0 3
area2 0 0 0 3 0
area3 0 0 0 0 1
area4 0 1 1 0 0
area5 0 0 0 0 1
...
Regards,
Carson
--
Carson J. Q. Farmer
ISSP Doctoral Fellow
National Centre for Geocomputation (NCG),
Email: Carson.Farmer@gmail.com
Web: http://www.carsonfarmer.com/
http://www.ftools.ca/
Carson Farmer wrote:
Hi list,
I have (what I thought was) a relatively simple problem, but my
knowledge of sql is just not good enough to get this done:I have a table which is basically a number of individuals with both
their origin and destination as columns (see Table 1). In this case,
origins and destinations are the census area in which they and work.
What I would like to do is generate an nxn matrix (preferably output to
csv but I'll take what I can get), where origins are on the y axis, and
destinations on the x axis (see Table 3).
Google a little for crosstab queries with the tablefunc add-ons in the
contrib/ directory.
--
Richard Huxton
Archonet Ltd
Carson Farmer wrote:
Hi list,
I have (what I thought was) a relatively simple problem, but my
knowledge of sql is just not good enough to get this done:I have a table which is basically a number of individuals with both
their origin and destination as columns (see Table 1). In this case,
origins and destinations are the census area in which they and work.
What I would like to do is generate an nxn matrix (preferably output to
csv but I'll take what I can get), where origins are on the y axis, and
destinations on the x axis (see Table 3).
<snip>
Would it have to be sql only? I think this would be pretty easy in perl.
-Andy
Andy Colson wrote:
Carson Farmer wrote:
Hi list,
I have (what I thought was) a relatively simple problem, but my
knowledge of sql is just not good enough to get this done:I have a table which is basically a number of individuals with both
their origin and destination as columns (see Table 1). In this case,
origins and destinations are the census area in which they and work.
What I would like to do is generate an nxn matrix (preferably output
to csv but I'll take what I can get), where origins are on the y axis,
and destinations on the x axis (see Table 3).<snip>
Would it have to be sql only? I think this would be pretty easy in perl.
-Andy
I took the liberty of assuming the origins and destinations could have
different values
Something like this:
#!/usr/bin/perl -w
use strict;
use DBI;
my $sql = 'select origin, dest, count(*) from tmp group by origin, dest';
my $db = DBI->connect('dbi:Pg:dbname=andy', 'andy', '') or die;
my $orlist = $db->selectcol_arrayref('select distinct origin from tmp
order by origin');
my $dstlist = $db->selectcol_arrayref('select distinct dest from tmp
order by dest');
my %table;
my $q = $db->prepare($sql);
$q->execute();
while (my($origin, $dest, $cc) = $q->fetchrow_array)
{
$table{$origin}->{$dest} += $cc;
}
print "origins\t";
foreach my $dst (@$dstlist)
{
print "$dst\t";
}
print "\n";
foreach my $ori (@$orlist)
{
print "$ori\t";
foreach my $dst (@$dstlist)
{
my $v = $table{$ori}->{$dst};
if (! $v) {
$v = '0';
}
print "$v\t";
}
print "\n";
}
Andy Colson wrote:
Carson Farmer wrote:
Hi list,
I have (what I thought was) a relatively simple problem, but my
knowledge of sql is just not good enough to get this done:I have a table which is basically a number of individuals with both
their origin and destination as columns (see Table 1). In this case,
origins and destinations are the census area in which they and work.
What I would like to do is generate an nxn matrix (preferably output
to csv but I'll take what I can get), where origins are on the y
axis, and destinations on the x axis (see Table 3).<snip>
Would it have to be sql only? I think this would be pretty easy in perl.
indeed, this would better be done outside the database. you're
generating a sparse table of N x N dimensions and likely only relatively
few elements populated, unless your population count greatly exceeds the
number of locations. I think I'd do a SQL query for
distinct(source,dest),count(population) and then use this to fill your
matrix on the client side.
I m new to PostgreSQL so please tell me the drawbacks of is this solution
Your Table 3 should not be a table it should be a array.
CREATE TABLE table3
(
ori_des int[][]
);
for origin area_n and destination area_m if count is k, then ori_des[n][m] = k.
--- On Tue, 19/5/09, Carson Farmer <carson.farmer@gmail.com> wrote:
From: Carson Farmer <carson.farmer@gmail.com>
Subject: [GENERAL] origins/destinations
To: pgsql-general@postgresql.org
Cc: "Carson Farmer" <Carson.Farmer@nuim.ie>
Date: Tuesday, 19 May, 2009, 10:27 PM
Hi list,
I have (what I thought was) a relatively simple problem, but my knowledge of sql is just not good enough to get this done:
I have a table which is basically a number of individuals with both their origin and destination as columns (see Table 1). In this case, origins and destinations are the census area in which they and work. What I would like to do is generate an nxn matrix (preferably output to csv but I'll take what I can get), where origins are on the y axis, and destinations on the x axis (see Table 3).
I can already group by both origins and destinations to produce Table 2, but I don't know what steps are needed to get to Table 3. Any help or suggestions are greatly appreciated!
Table 1
id | origin | destination
1 area1 area5
2 area1 area5
3 area1 area5
4 area2 area4
5 area4 area2
6 area5 area5
7 area2 area4
8 area2 area4
9 area4 area3
10 area3 area5
....
Table 2
id | origin | destination | count
1 area1 area5 3
4 area2 area4 3
5 area4 area2 1
6 area5 area5 1
9 area4 area3 1
10 area3 area5 1
....
Table 3
origins | area1 | area2 | area3 | area4 | area5 | ...
area1 0 0 0 0 3
area2 0 0 0 3 0
area3 0 0 0 0 1
area4 0 1 1 0 0
area5 0 0 0 0 1
....
Regards,
Carson
-- Carson J. Q. Farmer
ISSP Doctoral Fellow
National Centre for Geocomputation (NCG),
Email: Carson.Farmer@gmail.com
Web: http://www.carsonfarmer.com/
http://www.ftools.ca/
-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Get an email ID as yourname@ymail.com or yourname@rocketmail..com. Click here http://in.promos.yahoo.com/address
Import Notes
Resolved by subject fallback