Skip to content
This repository has been archived by the owner on Oct 21, 2021. It is now read-only.

Discussion: why should we allow edges / vertices to be anything other than Ints? #143

Closed
sbromberger opened this issue Dec 31, 2014 · 3 comments

Comments

@sbromberger
Copy link
Contributor

At least for now.

It seems to me we'd have a much simpler task if we created a basic G={V,E} design with V and E as integers (perhaps Int128 for REALLY large graphs) and pairs, respectively. Any additional metadata / attributes (weights, names, etc) could be derived by lookup and, with proper indexing, could maintain O(1) time. Mapping could also be used to generate deterministic attributes.

This would simplify a whole slew of issues relating to type conversion, return values, etc, (#140) and would allow us (me, primarily, since I'm less steady than others) to focus on adding functionality to a known type of graph as opposed to worrying about how method X performs with graphs that don't support O(1) indexing, or don't have attribute Y, etc. It also makes consistency of return values straightforward (see, e.g., #142).

We're practically doing this already for all functions that return an array of vertex indices anyway, and with functions (I'm looking at you, dijkstra_shortest_paths()) that require a separate vector of data (edge_dists, or VectorEdgePropertyInspector(edge_dists)) in order to work in some respect.

(Yes, I know I was campaigning for something other than simple_graph earlier; I'm throwing this out for discussion now that I'm grokking the code a bit more.)

It also allows us to separate functions that act ON a graph from functions that act on ATTRIBUTES of a graph's components, and would likely simplify persistent storage of graph data.

Anyway - adoption of this proposal would break things, perhaps horrifically. Are there other arguments against?

@mlubin
Copy link
Contributor

mlubin commented Dec 31, 2014

Are you proposing to throw away the current abstractions and just have a single concrete graph type?

@sbromberger
Copy link
Contributor Author

It sounds so extreme when you say it like that, but sort of.

I think there's room for multiple concrete graph types, but for different reasons. Instead of defining types based on the attributes associated with a graph (that is, custom vertices / edges), we would define types based on the functionality provided to the graph operations (dynamic graphs, where nodes and edges can be removed, etc. I'd also like to see "directedness" of graphs be something that is reflected in a concrete type as opposed to in an attribute, though that's simple enough to create sugar for.)

We do this, in effect, with simple_graph() now - we restrict the functionality of this type of graph for very good reasons. #131, #140, and the discussion on #87 are beginning to convince me that simplification of the data types we handle in graph manipulation is going to be the best way to guarantee performance and consistent behavior.

There's another somewhat practical benefit of doing this: changing metadata/attributes of graph vertices and edges should have no bearing on the way we process graphs. That is, if a graph represents a network of people, say, with a set of properties, it should behave / transform identically to a network of smart meters with a set of different properties. This would allow us to create model graphs of the type that networkx provides (see, e.g., http://networkx.github.io/documentation/networkx-1.9.1/reference/generators.html) without worrying about what metadata the user wants associated with each vertex/edge.

@sbromberger
Copy link
Contributor Author

Following up: LightGraphs.jl is an implementation of this concept, and seems to be more efficient across a spectrum of functions/methods.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants