Distributed Graph Processing: LFGraph

LFGraph is a recent study by UIUC which seems solid and promises the best results in distributed graph analytics. It compares itself with Powergraph, one of the most popular and best-performing alternatives. Both are implemented in C++ and does not rely on Hadoop or any other framework.

The basic properties of LFGraph are:

It advocates the use of hash partitioning since it is very fast and provides good balance among workers. It says that clever graph partitioning takes up to 90% of total runtime in most applications and input datasets. Therefore it must be avoided. In addition, as opposed to the claim of PowerGraph, LFGraph authors claim that while synthetic power-law graphs causes high worker imbalance, the real-world graphs have thicker tails which significantly reduces this imbalance.
It uses synchronous supersteps like Pregel. But each superstep consists of 2 sub-steps: computation and communication. In computation step, multiple computation threads go over separately-assigned vertex lists and by using the already received neighbor values, compute the new vertex value. In communication step, this value is sent to other workers in batches while setting up updated flags and invalidating old values.
Instead of a vertex sending its value to every remote neighbor separately (Pregel does that), it uses publish lists at each worker which says which vertices will be published to which workers. The published values are kept in a worker-level list in the receiving worker. As a result communication cost is reduced significantly.
It needs no locking as in Pregel since it uses two values for each vertex: real value and shadow value. Despite this means extra memory and extra work to shift values between lists at each superstep, it is still highly memory efficient (A lot less than Powergraph).
The most basic property: It has single pass over neighbors. It only reads in-edges while others need to go on both incoming edge list and outgoing edge list. This brings significant improvement in computation times compared to Pregel and Graphlab (both need 2 passes).
Since workers take care of sending vertex values to nonlocal neighbors, vertices need only incoming edge list. This decreases memory cost significantly.

Concerns:

Since it uses single-pass only on incoming edges, no message is sent to out-neighbors. They have to be active all the time. So, algorithms that tolerate sleeping of vertices may not run well here. For example in PageRank, Giraphx can sleep most vertices in most supersteps. But to achieve same convergence, LFGraph needs to keep them awake all the time.
It compares PowerGraph and LFGraph with constant number of supersteps in Pregel. However, async systems known to converge faster. Therefore to achieve the same convergence PowerGraph gives in 10 supersteps, LFGraph might need 20 supersteps resulting in similar or worse completion times.
They perform experiments on Emulab cluster (same rack) which has almost zero network delays compared to an EC2 cluster (not necessarily same rack). When the network is slow, decreasing the amount of cross-worker neighbors might make a significant decrease.
No fault tolerance. Values of vertices with remote neighbor can be recovered but local vertices have no way. In fact fault tolerance might be achieved by just backing up local vertices which should be very small in a large cluster.
Despite the given properties, still I do not get how it achieves 10x less memory than PowerGraph. Using smem for LFGraph and heap space for PowerGraph might not be fair.

Distributed Graph Processing

Monday, November 11, 2013

LFGraph

No comments:

Post a Comment