Signal/Collect is a framework for synchronous and asynchronous
parallel graph processing. It differs from Pregel in two main ways:
(1) edges can carry computation too, in the signal function, so you
essentially write one compute() method for the vertices (collect) and
another one for the edges (signal); (2) the synchronization barrier
constraints are relaxed, so asynchronous algorithms can be
implemented as well.
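To make the vertex/edge split concrete, here is a minimal sketch of the synchronous mode for single-source shortest paths. It is plain Python, not the Signal/Collect API: the toy graph and the names signal, collect and run_synchronous are assumptions chosen for illustration.

```python
import math

# Hypothetical toy graph: edges[source] = list of (target, weight).
edges = {
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("c", 2.0)],
    "c": [],
}

def signal(source_state, weight):
    """Edge computation: the value the source offers the target over this edge."""
    return source_state + weight

def collect(old_state, signals):
    """Vertex computation: fold the incoming signals into the new state."""
    return min([old_state] + signals)

def run_synchronous(edges, source):
    state = {v: math.inf for v in edges}
    state[source] = 0.0
    while True:
        # Signal phase: every edge computes a signal from its source's state.
        inbox = {v: [] for v in edges}
        for src, out in edges.items():
            for tgt, weight in out:
                inbox[tgt].append(signal(state[src], weight))
        # Barrier, then collect phase: every vertex updates from its inbox.
        new_state = {v: collect(state[v], inbox[v]) for v in edges}
        if new_state == state:  # no vertex changed, so we have converged
            return state
        state = new_state

distances = run_synchronous(edges, "a")
```

Keeping signal on the edge means the edge weight never has to be known by the vertex logic, which is the main practical difference from a single Pregel-style compute().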
Asynchronous computation is achieved by randomizing the set of
vertices on which the signal and collect operations are run.
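One way to picture this, as a sketch only: drop the global barrier, visit vertices in a shuffled order, and let every update become visible immediately. Interpreting "randomization" as a random visiting order is an assumption, and run_asynchronous, the toy graph and the seed parameter are hypothetical names, not part of the framework.

```python
import math
import random

# Hypothetical toy graph: edges[source] = list of (target, weight).
edges = {
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("c", 2.0)],
    "c": [],
}

def run_asynchronous(edges, source, seed=0):
    rng = random.Random(seed)  # seeded so runs are reproducible
    state = {v: math.inf for v in edges}
    state[source] = 0.0
    changed = True
    while changed:
        changed = False
        order = list(edges)
        rng.shuffle(order)  # randomized vertex order, no global barrier
        for src in order:
            for tgt, weight in edges[src]:
                candidate = state[src] + weight  # signal along the edge
                if candidate < state[tgt]:       # collect at the target
                    state[tgt] = candidate       # visible to later vertices at once
                    changed = True
    return state

result = run_asynchronous(edges, "a", seed=42)
```

For shortest paths the visiting order only affects how many passes are needed, not the final distances, which is why this kind of algorithm tolerates a relaxed barrier.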
The main problem is that, at the time of writing, the core system
does not scale to multiple machines. It uses a shared-memory model on
a single powerful machine with a large amount of memory. It might be
very inefficient or hard to use in a distributed setting.
Each vertex keeps a list containing the latest known values of its
neighbors. In addition, it has an incoming message queue for messages
that have not been read yet. This neighbor list is useful in Giraphx,
but it might be memory-inefficient.
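The two structures described above can be sketched as follows. This is an illustrative Python class, not the framework's implementation; CachingVertex, last_known, inbox and the averaging collect step are all assumptions.

```python
from collections import deque

class CachingVertex:
    """Hypothetical vertex with a neighbor-value cache and an unread queue."""

    def __init__(self, vertex_id, state):
        self.id = vertex_id
        self.state = state
        self.last_known = {}   # neighbor id -> most recent value received
        self.inbox = deque()   # unread (sender id, value) messages

    def deliver(self, sender_id, value):
        self.inbox.append((sender_id, value))

    def collect(self):
        # Drain unread messages into the cache; newer values overwrite older ones.
        while self.inbox:
            sender_id, value = self.inbox.popleft()
            self.last_known[sender_id] = value
        # Recompute from all cached values, including neighbors that
        # sent nothing since the previous collect.
        if self.last_known:
            self.state = sum(self.last_known.values()) / len(self.last_known)
        return self.state

v = CachingVertex("v", 0.0)
v.deliver("a", 2.0)
v.deliver("b", 4.0)
first = v.collect()
v.deliver("a", 6.0)   # only "a" sends again; "b" is served from the cache
second = v.collect()
```

The cache holds one entry per neighbor, so its size grows with the vertex degree. That is the memory cost mentioned above.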
It supports prioritization of vertices or operations. The authors
introduce a threshold score that is used to decide whether a node
should collect its signals or send signals. With this score,
algorithms can be accelerated because, in every superstep, signal
and collect operations are performed only if a certain threshold is
reached.
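A rough sketch of threshold-gated scheduling, under stated assumptions: the scoring rules (number of pending signals for collect, state change since the last signal for signal), the class ScoredVertex and the function step are hypothetical and only illustrate the idea.

```python
import math

class ScoredVertex:
    """Hypothetical vertex that scores how urgent its operations are."""

    def __init__(self, state):
        self.state = state
        self.last_signaled_state = None  # state when we last signaled
        self.pending = []                # uncollected incoming signals

    def score_collect(self):
        # Assumed rule: more pending signals means more urgent.
        return len(self.pending)

    def score_signal(self):
        # Assumed rule: urgency is how far the state moved since the last signal.
        if self.last_signaled_state is None:
            return math.inf  # never signaled, so always worth signaling once
        return abs(self.state - self.last_signaled_state)

    def collect(self):
        self.state += sum(self.pending)
        self.pending.clear()

    def mark_signaled(self):
        self.last_signaled_state = self.state

def step(vertex, signal_threshold, collect_threshold):
    """Run only the operations whose score exceeds its threshold."""
    performed = []
    if vertex.score_collect() > collect_threshold:
        vertex.collect()
        performed.append("collect")
    if vertex.score_signal() > signal_threshold:
        vertex.mark_signaled()  # a real system would send signals here
        performed.append("signal")
    return performed

v = ScoredVertex(1.0)
log = [step(v, 0.01, 0)]   # first step: nothing pending, never signaled
log.append(step(v, 0.01, 0))
v.pending.append(0.005)    # tiny change: collected, but not worth signaling
log.append(step(v, 0.01, 0))
v.pending.append(0.5)      # large change: collected and signaled
log.append(step(v, 0.01, 0))
```

Skipping low-score operations saves work on vertices whose state has effectively converged.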
It also provides many small extensions that might give hints for the
Maestro implementation. It does not support adding or removing edges
or vertices.