Networking

My sample CSV data is as follows.

An undirected graph has 90 nodes represented by the numbers {10,11,12 .... 99} whose edges with weights are defined as follows.

[data examples]

node1  node2  weight
23     89     34.9    (i.e. there is an edge between node 23 and node 89 with weight 34.9)
75     14     28.5
and so on...

I would like to present this as a network. What is an effective way to draw it (e.g. Gephi, networkx, etc.)? The thickness of each edge should represent the weight of that edge.

+8
graph social-networking
4 answers

If you are on Linux, and assuming your csv file looks like this (for example):

    23;89;3.49
    23;14;1.29
    75;14;2.85
    14;75;2.9
    75;23;0.9
    23;27;4.9

You can use this program:

    import os

    def build_G(csv_file):
        # init graph dict
        g = {}
        # open the csv file and read its content
        with open(csv_file, 'r') as f:
            cont = f.read()
        # split each non-empty line into its fields
        for line in cont.split('\n'):
            if line != '':
                fields = line.split(';')
                # build origin node
                if fields[0] not in g:
                    g[fields[0]] = {}
                # build destination node
                if fields[1] not in g:
                    g[fields[1]] = {}
                # build edge origin > destination
                if fields[1] not in g[fields[0]]:
                    g[fields[0]][fields[1]] = float(fields[2])
        return g

    def main():
        # filename
        csv_file = "mynode.csv"
        # build graph
        G = build_G(csv_file)
        # G is now a python dict
        # G={'27': {}, '75': {'14': 2.85, '23': 0.9}, '89': {}, '14': {'75': 2.9}, '23': {'27': 4.9, '89': 3.49, '14': 1.29}}
        # write the graph description to a text file
        with open('dotgraph.txt', 'w') as f:
            f.write('digraph G {\nnode [width=.3,height=.3,shape=octagon,style=filled,color=skyblue];\noverlap="false";\nrankdir="LR";\n')
            for i in G:
                for j in G[i]:
                    # get weight
                    weight = G[i][j]
                    s = ' ' + i + ' -> ' + j
                    s += ' [dir=none,label="' + str(weight) + '",penwidth=' + str(weight) + ',color=black];\n'
                    f.write(s)
            f.write('}')
        # generate graph image from graph text file (requires Graphviz's dot)
        os.system("dot -Tjpg -omyImage.jpg dotgraph.txt")

    main()

I had been looking for an effective way to build complex graphs, and this is the simplest method I found (it does not depend on any particular Python module).

Here is the resulting image for an undirected graph (using dir=none):

[image: graph rendered by dot]
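As a variation on the program above, DOT also has a native undirected form: a `graph` block with `--` edges, which makes `dir=none` unnecessary. The sketch below only builds the DOT text. The function name `edges_to_dot` and the choice to drop a reversed duplicate such as `14;75` after `75;14` are my assumptions, not part of the original program.

```python
def edges_to_dot(edges):
    """Return DOT source for an undirected weighted graph.

    `edges` is an iterable of (node1, node2, weight) tuples.
    """
    lines = ["graph G {", "  node [shape=octagon,style=filled,color=skyblue];"]
    seen = set()
    for a, b, weight in edges:
        key = frozenset((a, b))  # a--b and b--a count as the same edge
        if key in seen:
            continue  # assumption: keep only the first weight seen for a pair
        seen.add(key)
        lines.append(f'  {a} -- {b} [label="{weight}", penwidth={weight}];')
    lines.append("}")
    return "\n".join(lines)


sample = [("23", "89", 3.49), ("75", "14", 2.85), ("14", "75", 2.9)]
print(edges_to_dot(sample))
```

The printed text can be saved to a file and passed to `dot` exactly as in the program above.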

+4

Using networkx, you can add edges with attributes such as a weight:

    import networkx as nx

    G = nx.Graph()
    G.add_edge(23, 89, weight=34.9)
    G.add_edge(75, 14, weight=28.5)
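To go from the file to such a graph without typing each edge, something like the following might work. It is a sketch under assumptions: the helper names `read_weighted_edges` and `build_graph` are made up for illustration, and the parser assumes three fields per row, with any header row recognisable by a non-numeric first field.

```python
def read_weighted_edges(lines, delimiter=None):
    """Parse rows of 'node1 node2 weight' into (int, int, float) tuples.

    delimiter=None splits on any whitespace; pass "," or ";" for other files.
    """
    edges = []
    for line in lines:
        fields = line.strip().split(delimiter)
        if len(fields) != 3 or not fields[0].strip().isdigit():
            continue  # skip blank lines and the header row
        edges.append((int(fields[0]), int(fields[1]), float(fields[2])))
    return edges


def build_graph(edges):
    """Build an undirected networkx graph with a 'weight' attribute per edge."""
    import networkx as nx  # imported here so the parser is usable on its own

    G = nx.Graph()
    G.add_weighted_edges_from(edges)
    return G


sample = ["node1 node2 weight", "23 89 34.9", "75 14 28.5", ""]
edges = read_weighted_edges(sample)
print(edges)
```

Passing an open file object as `lines` works as well, and `build_graph(edges)` then returns a graph equivalent to the one built by hand above.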
+6

If you have a large CSV, I would recommend using pandas for the I/O part of your task. networkx has a useful method for interacting with pandas called from_pandas_dataframe (in networkx 2.x and later it is named from_pandas_edgelist). Assuming your data is in a csv file laid out as described above, this command should work for you:

    # read_csv takes names=, not columns=; add header=0 if the file already has a header row
    df = pd.read_csv('path/to/file.csv', names=['node1', 'node2', 'weight'])

For demonstration, though, I will use 10 random edges that match your description (you will not need to import numpy; I only use it to generate random numbers):

    import matplotlib.pyplot as plt
    import networkx as nx
    import pandas as pd

    # Generate random edges and weights
    import numpy as np
    np.random.seed(0)  # for reproducibility

    w = np.random.rand(10)  # weights 0-1
    node1 = np.random.randint(10, 19, 10)  # nodes 10-18 for the demo (upper bound is exclusive)
    node2 = np.random.randint(10, 19, 10)
    df = pd.DataFrame({'node1': node1, 'node2': node2, 'weight': w}, index=range(10))

Everything in the previous block should produce a DataFrame with the same layout as the one your pd.read_csv command returns. The resulting DataFrame, df:

       node1  node2    weight
    0     16     13  0.548814
    1     17     15  0.715189
    2     17     10  0.602763
    3     18     12  0.544883
    4     11     13  0.423655
    5     15     18  0.645894
    6     18     11  0.437587
    7     14     13  0.891773
    8     13     13  0.963663
    9     10     13  0.383442

Use from_pandas_dataframe to initialize a MultiGraph. This assumes that you may have multiple edges between the same pair of nodes (the OP does not say). To use this method, you will need to make a small change to the networkx source file in which it is implemented (it was a simple bug).

 MG = nx.from_pandas_dataframe(df, 'node1', 'node2', edge_attr='weight', create_using=nx.MultiGraph() ) 

This generates a MultiGraph, which you can visualize using draw:

    positions = nx.spring_layout(MG)  # saves the positions of the nodes on the visualization
    # pass positions and set hold=True
    # (the hold argument was removed in matplotlib 3.0; drop it on newer installs)
    nx.draw(MG, pos=positions, hold=True, with_labels=True, node_size=1000, font_size=16)

In more detail: positions is a dictionary in which each node is a key and the value is that node's position on the plot. I explain below why we store positions. A plain draw call renders your MultiGraph instance MG with the nodes at the specified positions. However, as you can see, all the edges have the same width:

[image: Unweighted]

But you have everything you need to add the weights. First, collect the weights in a list called weights. Iterating over each edge returned by edges (with a list comprehension), we can extract its weight. I chose to multiply by 5 because it looked the cleanest:

 weights = [w[2]['weight']*5 for w in MG.edges(data=True)] 

Finally, we will use draw_networkx_edges, which draws only the edges of the graph (no nodes). Since we have the node positions and we set hold=True, we can draw the weighted edges directly on top of our previous visualization.

 nx.draw_networkx_edges(MG, pos=positions, width=weights) #width can be array of floats 

[image: Weighted]

You can see that the edge (14, 13) has the thickest line and the largest weight in the DataFrame df (apart from the self-loop (13, 13)).
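The fixed factor of 5 suits weights between 0 and 1. For weights like those in the question (34.9, 28.5), a linear rescale onto a chosen width range may be easier to tune. This is a sketch under my own assumptions: the function name `scale_widths` and the default range are not part of networkx.

```python
def scale_widths(weights, min_width=0.5, max_width=6.0):
    """Linearly map weights onto the range [min_width, max_width]."""
    weights = list(weights)
    if not weights:
        return []
    lo, hi = min(weights), max(weights)
    if hi == lo:
        # All weights equal: use the midpoint width for every edge.
        return [(min_width + max_width) / 2] * len(weights)
    span = max_width - min_width
    return [min_width + (w - lo) / (hi - lo) * span for w in weights]


print(scale_widths([34.9, 28.5, 10.0]))
```

The returned list can be passed as the width argument of draw_networkx_edges in place of the weights list above, provided it is built from the edges in the same order.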

+5

You should change the header line at the beginning of the csv file as follows:

source target type weight
23 89 undirected 34.9 (i.e. there is an edge between node 23 and 89 with a weight of 34.9)
75 14 undirected 28.5
and so on...

After that, you can import the csv file into Gephi to get a graph in which the edge thickness reflects the weight, for example:

[image: graph rendered in Gephi]
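If the file is too large to edit by hand, the conversion could be scripted. This is only a sketch: the function name `gephi_edge_table` is hypothetical, and it assumes Gephi's spreadsheet import picks up the Source, Target, Type and Weight column names from a comma-separated file.

```python
import csv
import io


def gephi_edge_table(edges):
    """Return CSV text with Source/Target/Type/Weight columns for Gephi."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type", "Weight"])
    for source, target, weight in edges:
        # every edge is marked Undirected, as in the question
        writer.writerow([source, target, "Undirected", weight])
    return buffer.getvalue()


print(gephi_edge_table([(23, 89, 34.9), (75, 14, 28.5)]))
```

Writing the returned text to a .csv file gives the table to import.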

0
