Efficiently storing a directed graph
I have a directed graph (nodes and edges) that I want to store
"efficiently": given two nodes, I want to quickly find the shortest
path between them. The graph is NOT acyclic (it's not a tree), is
fairly "sparse" (about 10000 edges for 2500 nodes), and changes
occasionally.
I know PostgreSQL/MySQL can store graphs (as one table of nodes and
one table of edges that reference the nodes), but I think finding the
shortest path between two nodes is quite inefficient that way.
I know PostgreSQL/MySQL have special "plugins" (like PostGIS for
PostgreSQL) for specific problems. Is there a directed graph plugin?
I'm not married to using SQL: are there other efficient solutions to
store directed graphs? Could I hack something up in Perl or Ruby and
then serialize my in-memory graph to a file (for efficient
saving/reloading)?
As a minor note, the nodes/edges will have (non-unique) names and
descriptions, and I want the ability to do fulltext searching on these
names/descriptions. However, this is less important than quickly
finding the shortest path between two nodes.
--
We're just a Bunch Of Regular Guys, a collective group that's trying
to understand and assimilate technology. We feel that resistance to
new ideas and technology is unwise and ultimately futile.
Kelly,
I'm not married to using SQL: are there other efficient solutions to
store directed graphs? Could I hack something up in Perl or Ruby and
then serialize my in-memory graph to a file (for efficient
saving/reloading)?
Did you look at Dijkstra's algorithm?
-----
Kelly Jones wrote:
Show quoted text
I have a directed graph (nodes and edges) that I want to store
"efficiently": given two nodes, I want to quickly find the shortest
path between them. The graph is NOT acyclic (it's not a tree), is
fairly "sparse" (about 10000 edges for 2500 nodes), and changes
occasionally.I know PostgreSQL/MySQL can store graphs (as one table of nodes and
one table of edges that reference the nodes), but I think finding the
shortest path between two nodes is quite inefficient that way.I know PostgreSQL/MySQL have special "plugins" (like PostGIS for
PostgreSQL) for specific problems. Is there a directed graph plugin?I'm not married to using SQL: are there other efficient solutions to
store directed graphs? Could I hack something up in Perl or Ruby and
then serialize my in-memory graph to a file (for efficient
saving/reloading)?As a minor note, the nodes/edges will have (non-unique) names and
descriptions, and I want the ability to do fulltext searching on these
names/descriptions. However, this is less important than quickly
finding the shortest path between two nodes.--
We're just a Bunch Of Regular Guys, a collective group that's trying
to understand and assimilate technology. We feel that resistance to
new ideas and technology is unwise and ultimately futile.
I'm not married to using SQL: are there other efficient solutions to
store directed graphs? Could I hack something up in Perl or Ruby and
then serialize my in-memory graph to a file (for efficient
saving/reloading)?
As far as a perl solution, I would suggest posting your problem on
perlmonks.org
Kelly Jones wrote:
I have a directed graph (nodes and edges) that I want to store
"efficiently": given two nodes, I want to quickly find the shortest
path between them. The graph is NOT acyclic (it's not a tree), is
fairly "sparse" (about 10000 edges for 2500 nodes), and changes
occasionally.I know PostgreSQL/MySQL can store graphs (as one table of nodes and
one table of edges that reference the nodes), but I think finding the
shortest path between two nodes is quite inefficient that way.I know PostgreSQL/MySQL have special "plugins" (like PostGIS for
PostgreSQL) for specific problems. Is there a directed graph plugin?
Perhaps use PL/R plus one of the R packages available here:
http://cran.stat.ucla.edu/web/packages/
(e.g. http://cran.stat.ucla.edu/web/packages/igraph/index.html)
to enable data storage in Postgres and data processing in R.
PL/R:
http://www.joeconway.com/plr/
HTH,
Joe