I recently went through a interview session for a SWE/CS role at a well known company.
It wasn’t specifically a “coding-round” but was titled a “domain interview” session, so I assumed the interviewer was interested in an discussion/high-level-solution related to the problem.
You’re given a computation graph with nodes representing computations, edges representing dependence, and the values on edges representing the data-size in MB (input or output) at that edge.
You want to schedule this graph on a single-processor with limited memory
To execute a node/computation, both inputs and outputs data should fit within X memory of the processor.
It’s given for for each node sum of its inputs/outputs data <= X.
If at any time the data the total data that needs to be stored execeeds X, then there is extra cost per MB of fetching data from the disk. (it can happen with node n1 has produced an output that needs to be used by n2 but not all predecessors of n2 have run yet).
How would you schedule this graph?
My approach/What I discussed:
- I answered by saying that “in general” this problem is intractable(NP-complete) for multi-processor systems, and likely too for single-processor system due to the memory constraint.
- Then I suggested a greedy approach based on keeping track of all ‘ready’ nodes at a given time that can be executed as their predecessors have been executed.
- Which specific node to schedule next from several available/ready
ones can be done based on trying to minimize the cost associated with
memory constraint (i.e. seeking to schedule the one with the min cost
associated with data transfer).
- If there are several nodes with same minimum cost associated with
data transfer, the pick the node among them with the biggest
input+output mem requirement.
During the session, the interviewer didn’t suggest many things or gave any tips or direction on which directions he may have wanted me to lead to, so it turned out to be very open-ended.
He was almost neutral or very slight camera/tight-lipped smile taking notes most of the time.
I couldn’t gauge exactly what interviewer may have been looking for due to the intractableness of the problem — it’s hard to come up with a general optimal algo for such at the top of your head.
Later I came to know I received a ‘no’ decision for this session.
This has me puzzled. I’ve been racking my brain on if I didn’t approach it correctly, or if the interviewer was expecting one to talk about something else that the things I mentioned.
Is there something optimal or direct I missed?
Could there be some specific algo that interviewer may have been looking for that generally one is expected to know?
During the session, I thought I was providing a reasonable solution given the constraints of the problem, and could have deep-dived into any direction that interviewer may have asked me to.
But didn’t get any push-back or suggestions from the interviewer during the session, i’m interested to know what i missed or how I should have approached this problem.