Performance recursion on graphs – networkx VS adjacency lists with native lists and sets (Python)

I hope someone can help me with checking this code to make it faster.

My goal is to recursively create a graph:

  1. Query of a node and its neighbors from a database (not available here)
  2. Add it and its neighbors to a chart
  3. until you find a target, recur (1, 2) for each neighbor

I also set a control (maxnodes_in_graph) if the target is not found.

I've tried an optimization: I do not want to reprocess a node if I've already met someone else's neighbor and already added its child nodes.

So I use node_processed and node_to_process,

The central part is this:

// version using networkx

// keep track of process
nodes_processed = ()
nodes_to_process = ()

def construct_simple_sub_graph(startNode, g):

    // get my list of edges like ( (startNode, n1) ... )

    g.add_edges_from( edge_list)

    // I just proccessed
    nodes_processed.append( startNode )

    for node in list(g.nodes):
      // here I am doing something like this:
      // I check if already I processed this node
      // if not done, I will add it to a TODO list nodes_to_process
      // if done, I will remove it from the TODO list

      if node not in nodes_processed:
        if node not in nodes_to_process:
          nodes_to_process.append( node )
      else:
        if node in nodes_to_process:
              nodes_to_process.remove( node )  
    return g

and then I put this feature in a while:

 def subgraph(fromNode, g):

    while not ( g.has_node( fromNode) and g.has_node( toNode) ):


        for node in nodes_to_process:

          g = construct_simple_sub_graph( node, g)

          if (g.number_of_nodes() > maxNodes) :
            return g

        return g

 // first line will populate initial nodes in graph, and put
 // populate nodes_to_process with the first neighbors
 g = construct_simple_sub_graph( fromNode, g) 
 g = subgraph(fromNode, g)

Well, this code does what I want, but it takes a lot of time.
This is probably the reason for how networkx is done.

I've tried to revise the code with simple lists and sets, use adjoin lists instead of graphs, and only build them when I'm done.

I have revised the code as:

  // version usings lists and sets()
  // adj_list is like (parent, neighbor)
  adj_list = ()
  // store the nodes in my graph / adj 
  adj_nodes = ()

  // same as above
  nodes_processed = set()
  nodes_to_process = ()


  def construct_simple_adj_list(startNode, nodes_to_process): 

    // get edge_list
    // isFound is a bool that tell me if target node is met in someone's neighbors

    adj_list.extend( edge_list )

    # nodes in graph
    adj_nodes.append(startNode)
    adj_nodes.extend((e(1) for e in edge_list))

    # to process
    nodes_processed.add( startNode)

    // I make a copy of the nodes to process: I need to update the TODO list 
    nodes_to_process = nodes_to_process.copy()
    for node in nodes_processed:
      if node in nodes_to_process:
        nodes_to_process.remove(node)

    return is_found, nodes_to_process

and back to while:

// first iteration
is_found, nodes_to_process = construct_simple_adj_list( fromNode, adj_nodes)

    while not (is_found):

        for node in nodes_to_process:

          is_found, nodes_to_process = construct_simple_adj_list( node, nodes_to_process)

          if (len(adj_nodes) > maxNodes) :
            return adj_list, nodes_to_process

        return adj_list, nodes_to_proces

This second code is 2 orders of magnitude faster (YAI)
but I noticed that the final graphic is very different
and I get lower quality results.

Could you please help us to recognize the differences between the two approaches?

Can you help provide a recursive version that uses only lists and set manipulations to recursively create an Adj matrix?