Rewrite approximate_bfs with Rayon and no unsafe
This is a much simpler implementation of the same algorithm.
Unsafe code is eliminated by letting Rayon distribute the root nodes and by giving up the idea of an almost-synchronization-free 'visited' set, which is now an AtomicBitVec instead of a vector of u8.
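A minimal sketch of the idea, not the code in this commit: worker threads claim root nodes one at a time and share a 'visited' set made of atomic words, so no unsafe is needed. To stay dependency-free it uses std::thread::scope in place of Rayon, and the names AtomicBits and parallel_reach are hypothetical stand-ins (AtomicBits plays the role of AtomicBitVec). It only counts the nodes reached; it does not reproduce what approximate_bfs computes.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};
use std::thread;

/// Hypothetical stand-in for AtomicBitVec: one bit per node, packed in AtomicU64 words.
pub struct AtomicBits {
    words: Vec<AtomicU64>,
}

impl AtomicBits {
    pub fn new(len: usize) -> Self {
        let words = (0..(len + 63) / 64).map(|_| AtomicU64::new(0)).collect();
        Self { words }
    }

    /// Atomically sets bit `i` and returns true if it was already set.
    pub fn test_and_set(&self, i: usize) -> bool {
        let mask = 1u64 << (i % 64);
        self.words[i / 64].fetch_or(mask, Ordering::Relaxed) & mask != 0
    }
}

/// Runs a BFS from each root, with roots handed out to `threads` workers
/// (std threads here, standing in for Rayon). All workers share one visited
/// set. Returns the number of distinct nodes reached.
pub fn parallel_reach(adj: &[Vec<usize>], roots: &[usize], threads: usize) -> usize {
    let visited = AtomicBits::new(adj.len());
    let next_root = AtomicUsize::new(0);
    let reached = AtomicUsize::new(0);

    thread::scope(|s| {
        for _ in 0..threads.max(1) {
            s.spawn(|| {
                let mut queue = VecDeque::new();
                loop {
                    // Claim the next unprocessed root, or stop when none are left.
                    let k = next_root.fetch_add(1, Ordering::Relaxed);
                    let Some(&root) = roots.get(k) else { break };
                    if visited.test_and_set(root) {
                        continue; // already reached by another visit
                    }
                    reached.fetch_add(1, Ordering::Relaxed);
                    queue.push_back(root);
                    while let Some(node) = queue.pop_front() {
                        for &succ in &adj[node] {
                            // Exactly one thread wins the right to enqueue `succ`.
                            if !visited.test_and_set(succ) {
                                reached.fetch_add(1, Ordering::Relaxed);
                                queue.push_back(succ);
                            }
                        }
                    }
                }
            });
        }
    });

    reached.load(Ordering::Relaxed)
}
```

The fetch_or in test_and_set is the only synchronization on the visited set: each node is enqueued by exactly one thread, at the cost of one atomic read-modify-write per edge visited.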
Judging by the ETA reported after a few minutes of running, this seems to be about five times faster.