Thursday, November 7, 2013

Distributed GraphLab

Distributed GraphLab's main focus is solving graph algorithms efficiently in a distributed environment: it extends the shared-memory GraphLab abstraction to the distributed setting. It has 4 main properties:

- It is designed for graph data. But unlike Pregel, the user focuses on sequential computation over the graph rather than on the parallel movement of data.
- It is asynchronous, unlike Pregel. Each update therefore uses the most recent data, and execution is not held back by the slow-worker synchronization problem or by highly variable running times on a skewed network.
- For example, in PageRank each vertex converges at a different rate. Unlike Pregel, GraphLab supports prioritizing the computation of vertices and adaptively pulling information from neighbors. This pulling idea resembles our contribution in Giraph-e.
- Provides different levels of consistency to enforce serializability.
- The data input format is similar to Pregel's: there are V, E, and D, where D holds the edge and vertex data.
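The pull idea above can be sketched as a PageRank update function that reads ("pulls") its in-neighbors' current values directly instead of receiving messages as in Pregel. This is an illustration with made-up names, not the actual GraphLab API:

```python
def pagerank_update(vertex, graph, d=0.85):
    """Recompute one vertex's rank from its in-neighbors' current data."""
    total = sum(graph.rank[u] / graph.out_degree[u]
                for u in graph.in_neighbors[vertex])
    graph.rank[vertex] = (1 - d) + d * total


class Graph:
    def __init__(self, edges):
        vertices = {v for e in edges for v in e}
        self.rank = {v: 1.0 for v in vertices}
        self.in_neighbors = {v: [] for v in vertices}
        self.out_degree = {v: 0 for v in vertices}
        for u, v in edges:
            self.in_neighbors[v].append(u)
            self.out_degree[u] += 1


g = Graph([(1, 2), (2, 3), (3, 1), (1, 3)])
for _ in range(20):
    for v in sorted(g.rank):
        pagerank_update(v, g)  # uses the most recent neighbor values (async style)
```

Because each update reads whatever the neighbors hold right now, there is no message buffer and no superstep barrier between updates.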

- An important difference is that the graph structure is not modifiable, unlike in Pregel.
- Another difference: an update function at a vertex can also modify the data of neighboring vertices. But I don't agree with the argument in Section 3.2; Pregel can also do selective sending in PageRank. In addition, Alg. 1 looks like push, not pull.

- Execution model: similar to Pregel, vertices are executed one by one, but the order may be determined by the user, unlike in Pregel. There is no information about the distribution of vertices, though. Do all workers go over the same set of vertices?
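The user-determined ordering can be pictured as a priority scheduler: the engine repeatedly pops the highest-priority vertex, runs its update, and lets the update reschedule neighbors. A minimal sketch with assumed names, not GraphLab's actual scheduler API:

```python
import heapq

def run_prioritized(updates_needed, update_fn):
    """updates_needed: dict vertex -> priority (higher = more urgent).
    update_fn(v) returns a list of (neighbor, priority) pairs to reschedule."""
    heap = [(-p, v) for v, p in updates_needed.items()]
    heapq.heapify(heap)
    order = []
    while heap:
        _, v = heapq.heappop(heap)
        order.append(v)
        for nbr, prio in update_fn(v):   # an update may reschedule neighbors
            heapq.heappush(heap, (-prio, nbr))
    return order

# Toy run: no rescheduling, so vertices are simply processed by priority.
order = run_prioritized({'a': 1.0, 'b': 3.0, 'c': 2.0}, lambda v: [])
```

With PageRank, the priority could be each vertex's last residual, so nearly converged vertices sink to the bottom of the queue.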

- Consistency is maintained at 3 levels, and the level of consistency is inversely proportional to the level of concurrency: full consistency > edge consistency > vertex consistency. Example: PageRank needs write access on the current vertex and read access on its edges and neighbor vertices, so edge consistency suffices.
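One way to picture the three scopes is as nested lock sets around the updated vertex. This is a simplified sketch (it models each scope purely as a set of locked items, ignoring the read/write distinction), not the paper's formal definition:

```python
def scope_locks(v, adj, level):
    """Return the set of data items an update on v must lock, per level."""
    vertex_only = {("v", v)}
    edges = {("e", tuple(sorted((v, u)))) for u in adj[v]}
    neighbors = {("v", u) for u in adj[v]}
    if level == "vertex":
        return vertex_only                      # just the vertex's own data
    if level == "edge":
        return vertex_only | edges              # plus all adjacent edge data
    if level == "full":
        return vertex_only | edges | neighbors  # plus the neighbors' data
    raise ValueError(level)

adj = {1: [2, 3], 2: [1], 3: [1]}
# The scopes nest, so less consistency means fewer locked items and
# therefore more updates that can run concurrently.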

- Instead of Pregel's aggregators, GraphLab has global values that can only be accessed through sync operations. But how are they implemented in a distributed setting? As master-held variables organized via access tokens that workers request with messaging?

Structure and Design:
- Partitioning: uses parallel METIS and two-phase partitioning. Border vertices are called ghosts and handled specially: ghosts have cached copies at neighboring partitions. But how is the consistency of the ghosts maintained?
The order in which vertices are selected for processing is determined in one of two ways:
- Chromatic Engine: partially asynchronous; uses graph coloring (no information about how the coloring is computed). Vertices of the same color operate at the same time, in a color-step instead of a superstep. It still needs a barrier sync at the end of each color-step, but requires less work at the end of a color-step than at the end of a superstep.
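The chromatic idea can be sketched as follows: color the vertices so no two neighbors share a color, then each color class is a "color-step" whose vertices are safe to update concurrently. The greedy coloring below is just one simple heuristic; the paper does not say which one it uses:

```python
def greedy_color(adj):
    """Assign each vertex the smallest color unused by its neighbors."""
    color = {}
    for v in sorted(adj):
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

def chromatic_steps(adj):
    """Yield each color-step: a set of vertices safe to update concurrently."""
    color = greedy_color(adj)
    for c in sorted(set(color.values())):
        yield {v for v in color if color[v] == c}

adj = {1: [2], 2: [1, 3], 3: [2]}  # a path: 1 - 2 - 3
steps = list(chromatic_steps(adj))
```

Within a color-step no two scheduled vertices are adjacent, so their edge-consistency scopes cannot conflict; the barrier between color-steps replaces Pregel's superstep barrier.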
- Distributed Locking Engine: uses a reader-writer lock at each vertex. Deadlocks are avoided by acquiring locks in a canonical order. Workers are only allowed to update local vertices. Lock requests are pipelined to hide remote lock-acquisition latency.
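The canonical-order trick is simple to sketch: every update acquires the locks in its scope in a globally fixed order (e.g. sorted vertex id), so two overlapping scopes can never wait on each other in a cycle. A minimal local illustration with plain mutexes rather than the paper's distributed reader-writer locks:

```python
import threading

locks = {v: threading.Lock() for v in range(5)}

def with_scope(vertices, fn):
    """Acquire scope locks in canonical (sorted) order, run fn, then release."""
    order = sorted(set(vertices))
    for v in order:
        locks[v].acquire()   # always ascending id: no cyclic waiting possible
    try:
        return fn()
    finally:
        for v in reversed(order):
            locks[v].release()

result = with_scope([3, 1, 2], lambda: "updated")
```

Two updates whose scopes overlap will contend on their smallest shared id first, so one of them always proceeds.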

- Fault tolerance: achieved via an asynchronous variant of the Chandy-Lamport snapshot algorithm, run at fixed intervals.
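A very small sketch of the marker idea behind Chandy-Lamport: when a vertex first sees the snapshot marker it saves its own data exactly once, then forwards the marker to its neighbors. This is only an illustration of the marker propagation, not the paper's actual snapshot update function (which must also capture in-flight channel state):

```python
from collections import deque

def snapshot(adj, data, start):
    """Flood a marker from `start`; save each vertex's state on first receipt."""
    saved, queue = {}, deque([start])
    while queue:
        v = queue.popleft()
        if v in saved:          # marker already seen; ignore duplicates
            continue
        saved[v] = data[v]      # record local state exactly once
        queue.extend(adj[v])    # forward the marker along every edge
    return saved

adj = {1: [2, 3], 2: [3], 3: []}
data = {1: "a", 2: "b", 3: "c"}
snap = snapshot(adj, data, start=1)
```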
