DFS in Graphs

Tags: graphs data-structure algorithms technical

DFS (Depth First Search) helps in finding the connected nodes in a graph. By connected, we mean two nodes which can be reached by following a set of edges. This makes sense in both directed and undirected graph – in directed graph, there’s only one way of moving along the edge, but in the undirected graph we can move both forward and backwards along the edge.

In simpler terms, we would want to start at a given node, follow along a path until we reach a deadend, and then move backtrack until we reach a new unvisited node, choose it and start moving along its path. Since we are backtracking, we would want the computer to take care of the bookkeeping, and therefore we use recursion (alternate is to manually handle that using a stack).

Path along one node gives us a depth first tree starting at that node; adding up all the paths for all the nodes in the graph, we get a depth first forest.

A very simple DFS snippet works like (namespace and other boilerplate is already defined here):

public class DFS {
    public Graph DFSGraph { get; set; }
    public DFS(Graph g) {
        DFSGraph = g;
    }
    public void Start() {
       var vertices = DFSGraph.GetVertices();
       foreach (var vertex in vertices) {
          if (!vertex.Visited) {
              Explore(vertex);
              vertex.Visited = true;
          }
       }
    }
    public void Explore(Vertex v) {
        v.Visited = true;
        foreach (var neighbour in v.GetNeighbours()) {
           if (!neighbour.Visited) {
               Explore(neighbour);
               neighbour.Visited = true;
           }
        }
    }
}

In the previous code, GetVertices gets all the vertices of a graph. In contrast, GetNeighbours gets all the nodes which can be reached from the given node. This will work correctly for both directed and undirected graphs.

The Start method starts a DFS tree for each node, and the Explore method takes the job of going down the node path. Since the method is recursive, once Explore reaches a point where there are no unvisited neigbours of current node, it backtracks until it finds an ancestor which has unvisited neighbours, picks one and again follows the path. Each visited node is marked as such so that it is not picked again by any of the methods.

Bookeeping

DFS also allows to keep bookeeping information regarding the structure of the tree and how the nodes are placed relative to one another. For example, we can have start and finish times associated with every node for the time when it was first “visited” and then revisited during backtracking respectively. And it is quite clear that the relative structure of nodes depends on these timestamps. Parent node (or a node found first) will start an Explore subroutine on its neighbours (which are found later), and therefore every neighbour will have a greater start timestamp. During backtracking, the neighbours will have lesser value of the finish timestamp than the node which started the Explore method on the node.

The following code assumes that CurrentTime method is responsible for updating the timestamp. Implementation is available here), where I am using delegates to add pre/post node visit hooks.

public class DFS {
    public Graph DFSGraph { get; set; }
    public DFS(Graph g) {
        DFSGraph = g;
    }
    public void Start() {
       var vertices = DFSGraph.GetVertices();
       foreach (var vertex in vertices) {
          if (!vertex.Visited) {
              vertex.StartTime = CurrentTime();
              Explore(vertex);
              vertex.StopTime = CurrentTime();
              vertex.Visited = true;
          }
       }
    }
    public void Explore(Vertex v) {
        v.Visited = true;
        foreach (var neighbour in v.GetNeighbours()) {
           if (!neighbour.Visited) {
               neighbour.StartTime = CurrentTime();
               Explore(neighbour);
               neighbour.StopTime = CurrentTime();
               neighbour.Visited = true;
           }
        }
    }
}

DFS has also some interesting uses like arranging the nodes topologically via Topological Sort and also to find the connected nodes (Connected Components) in the graph. I will discuss the same in another article.