Distributed GraphLab's main focus is on running graph algorithms efficiently in a distributed environment, extending the original shared-memory GraphLab abstraction. It has four main properties:
- It is designed for graph data. Unlike Pregel, the user focuses on sequential computation rather than on the parallel movement of data.
- It is asynchronous, unlike Pregel. It therefore always works on the most recent data, and it does not suffer from the slow-worker synchronization problem or from highly variable running times in a skewed network.
- For example, in PageRank each vertex converges at a different rate. Unlike Pregel, GraphLab supports prioritizing the computation of vertices and adaptively pulling information from neighbors. This pulling idea resembles our contribution in Giraph-e.
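To make the prioritization idea concrete, here is a minimal sketch (my own, not GraphLab's actual API) of dynamic PageRank: vertices sit in a priority queue keyed by their pending change, the vertex with the largest pending change is updated first by pulling fresh values from its in-neighbors, and converged vertices are simply skipped.

```python
import heapq

def dynamic_pagerank(neighbors_in, neighbors_out, d=0.85, tol=1e-6):
    """Priority-ordered PageRank sketch: vertices with the largest
    pending change are updated first; converged vertices are skipped."""
    n = len(neighbors_out)
    rank = {v: 1.0 / n for v in neighbors_out}
    pending = {v: 1.0 for v in neighbors_out}   # accumulated residual
    heap = [(-1.0, v) for v in neighbors_out]   # max-heap via negation
    heapq.heapify(heap)
    while heap:
        _, v = heapq.heappop(heap)
        if pending.get(v, 0.0) <= tol:
            continue                            # stale or converged entry
        pending[v] = 0.0
        # "pull" the freshest values from in-neighbors
        new = (1 - d) / n + d * sum(
            rank[u] / len(neighbors_out[u]) for u in neighbors_in[v])
        delta = abs(new - rank[v])
        rank[v] = new
        if delta > tol:
            # reschedule out-neighbors with their updated priority
            for w in neighbors_out[v]:
                pending[w] = pending.get(w, 0.0) + delta
                heapq.heappush(heap, (-pending[w], w))
    return rank
```

With a bulk-synchronous model every vertex would be recomputed every superstep; here the already-converged ones cost nothing.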
- It provides different levels of consistency to enforce serializability.
- The data input format is similar to Pregel's: there are V, E, and D, where D contains the edge and vertex data.
- An important difference is that the graph structure is not modifiable, unlike in Pregel.
- Another difference is that an update function at a vertex can also modify the data of neighboring vertices. But I don't agree with the argument in Section 3.2: Pregel can also send selectively in PageRank. In addition, Alg 1 looks like push, not pull.
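The push/pull distinction is easy to state in code. Below is my own illustration (not the paper's), with PageRank written both ways: in pull style the vertex only rewrites its own value after reading its in-neighbors, while in push style it writes into accumulators owned by its out-neighbors — which is exactly the "update function modifies neighbor data" ability noted above.

```python
def pagerank_pull(v, rank, neighbors_in, out_degree, d=0.85):
    """Pull style: v reads its in-neighbors' ranks and rewrites
    only its own value (one write at v, reads on the scope)."""
    n = len(rank)
    rank[v] = (1 - d) / n + d * sum(
        rank[u] / out_degree[u] for u in neighbors_in[v])

def pagerank_push(v, rank, accum, neighbors_out, d=0.85):
    """Push style: v writes its contribution into accumulators
    owned by its out-neighbors (writes on neighbor-owned data)."""
    share = d * rank[v] / len(neighbors_out[v])
    for w in neighbors_out[v]:
        accum[w] += share
```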
- Execution model: similar to Pregel, vertices are executed one by one, but the order may be determined by the user, unlike in Pregel. There is no information about the distribution of vertices, though. Do all workers go over the same set of vertices?
- Consistency is maintained at three levels, and the level of consistency is inversely proportional to the level of concurrency: full consistency > edge consistency > vertex consistency. For example, PageRank needs write access on the current vertex and read access on its edges and neighboring vertices, so it needs edge consistency.
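The three scopes can be summarized as lock sets. This is my own simplification (edge data is folded into the neighbor endpoints, and lock granularity is per vertex), but it captures why stronger consistency means less concurrency: the more write locks a scope grabs, the fewer vertices can run at once.

```python
def scope_locks(v, neighbors, model):
    """Which locks an update at v must hold under each consistency
    model. Returns {vertex: 'r' | 'w'}; adjacent-edge data is folded
    into the neighbor endpoints for simplicity."""
    if model == "vertex":
        # write only the center vertex: most concurrency
        return {v: "w"}
    if model == "edge":
        # write the vertex and its edges; read adjacent vertices
        return {v: "w", **{u: "r" for u in neighbors[v]}}
    if model == "full":
        # write the entire scope: serializability, least concurrency
        return {v: "w", **{u: "w" for u in neighbors[v]}}
    raise ValueError(f"unknown model: {model}")
```

Under this view, the PageRank example above is exactly the `"edge"` case: one write lock at the vertex, read locks on the neighbors.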
- Instead of Pregel's aggregators, GraphLab has global values that can only be updated through a synchronous sync operation. But how are they implemented in a distributed setting? As master-held variables organized via access tokens that workers request with messaging?
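One plausible answer, sketched below: the fold/merge/apply names follow GraphLab's sync mechanism, but the distribution detail — each worker folding over its local partition and a master merging the partials — is my assumption, not something the paper spells out here.

```python
from functools import reduce

def sync_global(partitions, fold, merge, apply_fn, zero):
    """Sketch of a sync'd global value: each worker folds over its
    local vertex data, the partial results are merged (e.g. at the
    master), and apply_fn produces the new global value."""
    # one fold per worker; these would run in parallel in a real system
    partials = [reduce(fold, part, zero) for part in partitions]
    return apply_fn(reduce(merge, partials))
```

For example, a global sum of vertex values folds with `+` on each worker and merges the per-worker sums, so only one small message per worker crosses the network.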
Structure and Design:
- Partitioning: uses parallel METIS and two-phase partitioning. Border vertices are called ghosts and are handled specially: ghosts have cached copies at neighboring partitions. But how is the consistency of the ghosts maintained?
The order in which vertices are selected for processing is determined in one of two ways:
- Chromatic Engine: partially asynchronous; uses graph coloring (there is no information about how the coloring is done). Vertices of the same color operate at the same time, in a color-step instead of a superstep. It also needs a barrier synchronization at the end of each color-step, but requires less work at the end of a color-step than at the end of a superstep.
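Since the paper doesn't say how the coloring is done, here is one simple possibility (a greedy heuristic, my assumption, not necessarily what GraphLab uses): because no two vertices of the same color share an edge, each color class can be updated fully in parallel under edge consistency, with a barrier only between color-steps.

```python
def greedy_coloring(neighbors):
    """Greedy graph coloring: each vertex takes the smallest color
    not used by an already-colored neighbor."""
    color = {}
    for v in sorted(neighbors):
        used = {color[u] for u in neighbors[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

def chromatic_steps(neighbors):
    """Group vertices into color-steps: vertices within one step are
    mutually non-adjacent, so they can run concurrently."""
    color = greedy_coloring(neighbors)
    steps = {}
    for v, c in color.items():
        steps.setdefault(c, []).append(v)
    return [steps[c] for c in sorted(steps)]
```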
- Distributed Locking Engine: uses a reader-writer lock at each vertex. Deadlocks are avoided by acquiring locks in a canonical order. Workers are only allowed to update local vertices. Lock requests are pipelined to hide remote lock-acquisition latency.
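The canonical-order trick is worth a small sketch (mine, with plain mutexes standing in for the paper's reader-writer locks): if every worker acquires the locks of a vertex's scope in the same global order, such as sorted vertex id, no cyclic wait can form, so deadlock is impossible.

```python
import threading

class ScopeLocker:
    """Deadlock-free scope locking sketch: all workers acquire the
    locks for a vertex and its neighbors in one canonical
    (sorted-id) order, ruling out cyclic waits."""
    def __init__(self):
        self.locks = {}

    def lock_for(self, u):
        return self.locks.setdefault(u, threading.Lock())

    def acquire_scope(self, v, neighbors):
        scope = sorted({v, *neighbors[v]})   # canonical order
        for u in scope:
            self.lock_for(u).acquire()
        return scope

    def release_scope(self, scope):
        for u in reversed(scope):
            self.lock_for(u).release()
```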
- Fault tolerance: achieved via an asynchronous version of the Chandy-Lamport snapshot algorithm, run at fixed intervals.
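For reference, here is the marker logic of the classic Chandy-Lamport algorithm at a single process (the paper uses a modified, asynchronous variant, so this sketch only shows the baseline idea): on the first marker a process records its own state, then it logs each in-flight message per channel until a marker has arrived on every incoming channel.

```python
class SnapshotProcess:
    """One process's role in a classic Chandy-Lamport snapshot
    (sketch). Assumes reliable FIFO channels keyed by sender id."""
    def __init__(self, pid, in_channels):
        self.pid = pid
        self.in_channels = set(in_channels)
        self.recorded_state = None
        self.channel_log = {}       # channel -> messages in flight
        self.open_channels = set()  # channels still being recorded

    def start_snapshot(self, local_state):
        # record own state; markers would now be sent on all
        # outgoing channels (not modeled here)
        self.recorded_state = local_state
        self.open_channels = set(self.in_channels)

    def on_message(self, channel, msg):
        if channel in self.open_channels:
            # message was in flight when the snapshot started
            self.channel_log.setdefault(channel, []).append(msg)

    def on_marker(self, channel, local_state):
        if self.recorded_state is None:
            # first marker seen: record state; this channel is empty
            self.start_snapshot(local_state)
        self.open_channels.discard(channel)

    def done(self):
        return self.recorded_state is not None and not self.open_channels
```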