Connected components in Apache Spark GraphX

How can I use the subgraph function to get a graph that includes only the vertices and edges of a specific connected component? Let's say I know the identifier of the connected component; the ultimate goal is to create a new graph based on that component. I would like to keep the vertex attributes from the original graph.

+7
apache-spark spark-graphx
2 answers

You need to join the component identifiers onto the original graph, filter (take a subgraph) by component identifier, and then discard the component identifiers.

    import scala.reflect.ClassTag
    import org.apache.spark.graphx._
    import org.apache.spark.graphx.lib.ConnectedComponents

    def getComponent[VD: ClassTag, ED: ClassTag](
        g: Graph[VD, ED], component: VertexId): Graph[VD, ED] = {
      val cc: Graph[VertexId, ED] = ConnectedComponents.run(g)
      // Join component ID to the original graph.
      val joined = g.outerJoinVertices(cc.vertices) {
        (vid, vd, ccId) => (vd, ccId)
      }
      // Filter by component ID.
      val filtered = joined.subgraph(vpred = {
        (vid, vdcc) => vdcc._2 == Some(component)
      })
      // Discard component IDs.
      filtered.mapVertices {
        (vid, vdcc) => vdcc._1
      }
    }
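If you only know a vertex that belongs to the component, rather than the component ID itself, the same join/filter/discard steps can be driven from that vertex. The sketch below is one possible way to do it, not part of the answer above: the names `ComponentLookup`, `componentContaining`, and `member` are made up for illustration, and the demo assumes a local Spark master. It relies on GraphX labeling each connected component with the lowest vertex ID in that component.

```scala
import scala.reflect.ClassTag
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx._

// Hypothetical helper object; the names here are illustrative only.
object ComponentLookup {

  // Returns the connected component containing `member`,
  // or None if `member` is not a vertex of `g`.
  def componentContaining[VD: ClassTag, ED: ClassTag](
      g: Graph[VD, ED], member: VertexId): Option[Graph[VD, ED]] = {
    // Each vertex is labeled with the lowest vertex ID in its component.
    val labels = g.connectedComponents().vertices.cache()
    labels.lookup(member).headOption.map { componentId =>
      g.outerJoinVertices(labels) { (_, attr, label) => (attr, label) } // join labels
        .subgraph(vpred = (_, pair) => pair._2.contains(componentId))   // filter by label
        .mapVertices((_, pair) => pair._1)                              // drop labels
    }
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master is good enough for a quick check.
    val conf = new SparkConf().setAppName("component-demo").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val vertices = sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c"), (4L, "d"), (5L, "e")))
      val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(4L, 5L, 1)))
      val g = Graph(vertices, edges)
      // Vertex 5 lives in the component {4, 5}, whose label is 4.
      componentContaining(g, 5L).foreach(_.vertices.collect().foreach(println))
    } finally {
      sc.stop()
    }
  }
}
```

Note that this runs the connected-components computation on every call; if you need several components from the same graph, it is probably better to compute the labels once and reuse them.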
+6

I understand your question as: given a VertexId in the original graph, create a new graph containing the nodes and edges associated with that VertexId in the original graph.

Given that, here is what I would do:

    val targetVertexId = ...
    val graph = Graph(..., ...)
    val newGraph = Graph(
      graph.vertices.filter { case (vid, attr) => vid == targetVertexId } ++
        graph.collectNeighbors(EdgeDirection.Either)
          .filter { case (vid, arr) => vid == targetVertexId }
          .flatMap { case (vid, arr) => arr },
      graph.edges
    ).subgraph(vpred = { case (vid, attr) => attr != null })

A few notes:

You can change EdgeDirection.Either to EdgeDirection.In or EdgeDirection.Out as needed.

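The final .subgraph call removes every vertex whose attribute is null. This is needed because Graph(...) creates a vertex for every edge endpoint, and endpoints that are not in the supplied vertex set get the default attribute, which is null. If the original graph has vertices whose attributes are legitimately null, this will not work. Otherwise, it works without having to know the vertex attribute type in advance.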
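If null attributes are a concern, one alternative is to filter by vertex ID instead of by attribute. This is only a sketch under stated assumptions, not the method used above: `Neighborhood` and `neighborhood` are hypothetical names, and it assumes the neighbor IDs of the target vertex are few enough to collect on the driver.

```scala
import scala.reflect.ClassTag
import org.apache.spark.graphx._

// Hypothetical helper object; the names here are illustrative only.
object Neighborhood {

  // Keeps `target`, its direct neighbors, and the edges among them.
  // Vertex attributes are preserved, including null ones.
  def neighborhood[VD: ClassTag, ED: ClassTag](
      graph: Graph[VD, ED],
      target: VertexId,
      direction: EdgeDirection = EdgeDirection.Either): Graph[VD, ED] = {
    // Assumption: the neighbor ID set fits in driver memory.
    val neighborIds: Set[VertexId] = graph.collectNeighborIds(direction)
      .filter { case (vid, _) => vid == target }
      .flatMap { case (_, ids) => ids }
      .collect()
      .toSet
    val keep = neighborIds + target
    // subgraph keeps an edge only if both of its endpoints pass vpred.
    graph.subgraph(vpred = (vid, _) => keep.contains(vid))
  }
}
```

As with the original snippet, passing EdgeDirection.In or EdgeDirection.Out restricts which neighbors are collected.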

+2
