This year I had my senior project from my school, and its topic was Distributed Implementation of Normalized Cuts Image Segmentation. Normalized Cuts are used in image segmentation and the algorithms based on this idea are generally considered as succesful algorithms. I had this project with a friend of mine from school, M. Salih Yıldırım, and in this project we aimed reducing the computing time spent in various steps of the algorithm by taking the chance of distributing the computation among many nodes over a cluster.
Normalized Cuts and Image Segmentation, an article by Jianbo Shi and Jitendra Malik explains the idea of the algorithm in a clear way already so I will not be explaining the algorithm here, the reader is referred to the article itself for a clean and detailed explanation of the algorithm.
One who may read the article would easily see that, like many of the image segmentation algorithms, this one also relies on many matrix operations. Obviously parallel operations would speed the algorithm up. So as to speed up the algorithm we have implemented it using C++ and Octave (we did 2 distinct sequential implementations) and we have observed that
- Creation of Weight Matrix
- Solving the Generalized (or depending on the implementation Standard) Eigen Problem
take most of the time. So we decided to distribute those steps.
The creation of weight matrix was planned as follows: Somehow the image is sent to each node in the system, (either by broadcasting from the master node or opening the image file on each node in a cluster with shared filesystem) each node is then assigned a part of the matrix and each node calculates its own part. This divides the computing time by the number of processing nodes since each node calculates its own part at the same time and thus only a product of sequential calculation time and (1/node count) time is spent in this step.
About solving the eigen problem, we have realized that distributed solving of eigen problems itself is far beyond the scope of the project and perhaps itself can be a project, we decided to use Trilinos, a linear algebra software provided by Sandia National Laboratories. Trilinos solves the eigenproblem distributedly. The operators of the problem (the matrices we have created in the previos step) are distributed data structures, and their distribution map (which node has which part information) are passed to the eigensolver and eigensolver solves the eigenproblem distributedly.
Then the resulting eigenvector is sent to master node and master node does the actual segmentation operation, then for each segment, initiates the same process again.
Timing results are not present yet (experiments are going on) however as far as I can see, the implementation really speeds up the algorithm.
I will present the thesis text of the project and some graphics illustrating the amount of speed up when they are available, and I am presenting the segmented images below:
Thank you all for your time reading this post!
Edit: click the link below to see the project on my universities .edu web site:
http://www.ce.yildiz.edu.tr/project.php?d=819



