Scientific distributed graph computing made easy

This section gives comparative performance evaluation results for the Giraph, GraphX and BigGrph platforms. The benchmarks consist of measuring the load time, computation time and memory requirements of the BFS algorithm applied to large datasets.
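As an illustration of how a load phase and a computation phase can be timed separately, here is a minimal sketch in plain Java. It is not the harness used to produce the figures on this page: the class `PhaseTimer`, its `measure` method and the `Measurement` holder are hypothetical names, and the heap figure only covers the local JVM, not a whole cluster.

```java
import java.util.function.Supplier;

// Illustrative timing helper; not the harness used for the benchmarks on this page.
final class PhaseTimer {

    // Result of one measured phase: its return value, the elapsed wall-clock
    // time in seconds, and the heap in use in this JVM right after the phase.
    static final class Measurement<T> {
        final T value;
        final double seconds;
        final long usedHeapBytes;

        Measurement(T value, double seconds, long usedHeapBytes) {
            this.value = value;
            this.seconds = seconds;
            this.usedHeapBytes = usedHeapBytes;
        }
    }

    // Runs the given phase once and records elapsed time and heap usage.
    static <T> Measurement<T> measure(Supplier<T> phase) {
        long start = System.nanoTime();
        T value = phase.get();
        double seconds = (System.nanoTime() - start) / 1e9;
        Runtime rt = Runtime.getRuntime();
        long usedHeap = rt.totalMemory() - rt.freeMemory();
        return new Measurement<>(value, seconds, usedHeap);
    }

    public static void main(String[] args) {
        // Stand-in for "load": allocate an array. Stand-in for "compute": sum it.
        Measurement<long[]> load = measure(() -> new long[1_000_000]);
        Measurement<Long> compute = measure(() -> {
            long sum = 0;
            for (long x : load.value) {
                sum += x;
            }
            return sum;
        });
        System.out.printf("load: %.3f s, compute: %.3f s, heap: %d MB%n",
                load.seconds, compute.seconds, compute.usedHeapBytes / (1024 * 1024));
    }
}
```

Measuring the two phases separately mirrors the Load (s) and BFS (s) columns reported in the tables.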

The hardware testbed is the dellc6220 rack of the NEF cluster at Inria Sophia Antipolis. The rack is composed of 16 Dell C6220 nodes, each with two Xeon E5-2680 v2 processors @ 2.80GHz (20 cores) and 192 GB of RAM. The nodes are connected to each other through a Gigabit Ethernet network and to a shared NFS storage through an InfiniBand network.

Once the dataset is loaded, we run a BFS (Breadth-First Search) implemented in the BSP (Bulk Synchronous Parallel) model. Giraph is version 1.0, with Hadoop 1.2. The BFS source code for BigGrph is:

    @Override
    protected void computeLocalElement(BigAdjacencyTable adj, long ss, long v, MessageList inbox) {
        // v has no distance yet: store ss as its distance and message its neighbours.
        if (!getDistancesMap().getLocalMap().containsKey(v)) {
            getDistancesMap().getLocalData().put(v, ss);
            post(EmptyMessage.instance, adj.get(v));
        }
    }

    @Override
    public void combine(MessageList msgs, EmptyMessage newMessage) {
        // Keep at most one message per inbox: the messages are empty, so duplicates carry no information.
        if (msgs.isEmpty()) {
            msgs.add(newMessage);
        }
    }
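To make the BSP formulation of BFS concrete, here is a minimal single-process sketch in plain Java. It does not use the BigGrph API: the class `BspBfsSketch`, the method `bfs` and the adjacency-map representation are assumptions made for illustration only. In each superstep, every vertex that received a message and has no distance yet takes the superstep number as its distance and messages its neighbours. Using a set as the inbox plays the role of the combiner, since one empty message per vertex is enough.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative single-process BSP-style BFS; not the BigGrph implementation.
final class BspBfsSketch {

    // Returns the BFS distance of every vertex reachable from the source.
    static Map<Long, Long> bfs(Map<Long, List<Long>> adj, long source) {
        Map<Long, Long> distances = new HashMap<>();
        // Vertices with a pending message. A set suffices because messages
        // carry no payload, which mimics combining them into one.
        Set<Long> inbox = new HashSet<>();
        inbox.add(source);

        for (long superstep = 0; !inbox.isEmpty(); superstep++) {
            Set<Long> nextInbox = new HashSet<>();
            for (long v : inbox) {
                if (!distances.containsKey(v)) {
                    // First visit: the superstep number is the distance.
                    distances.put(v, superstep);
                    for (long neighbour : adj.getOrDefault(v, List.of())) {
                        nextInbox.add(neighbour);
                    }
                }
            }
            // Barrier: messages posted now are delivered in the next superstep.
            inbox = nextInbox;
        }
        return distances;
    }

    public static void main(String[] args) {
        Map<Long, List<Long>> adj = new HashMap<>();
        adj.put(0L, List.of(1L, 2L));
        adj.put(1L, List.of(3L));
        adj.put(2L, List.of(3L));
        adj.put(3L, List.of());
        System.out.println(bfs(adj, 0L));
    }
}
```

A vertex posts messages only on its first visit, so the inbox eventually becomes empty and the loop terminates once every reachable vertex has a distance.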

Dataset: Twitter, 3.8GB on disk
3.1M vertices, 200M edges
avg degree: 64
max degree: 211255

Platform | #nodes | Load (s) | BFS (s) | RAM (GB)
GraphX82118we gave 256

GraphX was set up to use 8 partitions per node.
GraphX takes 24s when assigning predecessors.
BigGrph took 5s instead of 19s when benefiting from the NFS cache.

Dataset: com-friendster, 31GB on disk
65.6M vertices, 1.8G edges
avg degree: 27
max degree: 3615

Platform | #nodes | Load (s) | BFS (s) | RAM (GB)
Giraph/SDP4 (16 workers)320250100
Giraph/SDP8 (24 workers)526191282 (undirected graph)
GraphX12300370-780we gave 1100
BigGrph8 1511052

GraphX was set up to use 48 partitions.
BigGrph takes 10 min to convert the dataset to its native format.
We do not know GraphX's multi-threading policy (i.e. the number of cores it uses).

Dataset: Twitter, 240GB on disk
398M vertices, 23G edges
avg degree: 58
max degree: 24635412
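The average degrees quoted for the datasets appear consistent with the number of edges divided by the number of vertices. The following sketch recomputes that ratio from the rounded counts given on this page. The class `AvgDegreeCheck` and the method `averageDegree` are illustrative names, and since the inputs are rounded, the results can only be expected to agree with the quoted values to within about one unit.

```java
// Illustrative consistency check on the dataset descriptions.
final class AvgDegreeCheck {

    // Average degree taken here as edges divided by vertices (an assumption
    // about how the quoted figures were obtained).
    static double averageDegree(double edges, double vertices) {
        return edges / vertices;
    }

    public static void main(String[] args) {
        // Rounded counts copied from the dataset descriptions.
        System.out.printf("Twitter (3.8GB): %.1f%n", averageDegree(200e6, 3.1e6));
        System.out.printf("com-friendster:  %.1f%n", averageDegree(1.8e9, 65.6e6));
        System.out.printf("Twitter (240GB): %.1f%n", averageDegree(23e9, 398e6));
    }
}
```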

Platform | #nodes | Load (s) | BFS (s) | RAM (GB)

BigGrph takes 4h to convert the dataset to its native format.
Giraph took 1h36min to upload the dataset to HDFS.

BigGrph CC, 8 nodes: 630s
GraphX CC, 8 nodes: 240s