Before you read the code below, note the following explanation:
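I have three classes: `Itinerary`, `Vehicle`, and `Driver`.
They have attributes: `destinations` and `startTime` on `Itinerary`, `pRange` and `pBattery` on `Vehicle`, and `behavior` on `Driver`.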
findCand takes all these class attributes as input and returns as an output a dataframe with three columns.
So here is the code that I use to run my simulation:
    import pandas as pd

    n = 1000  # number of agents per iteration
    m = 1000  # number of iterations

    agentA = []
    iterationA = []
    itineraryA = [None]*n
    behaviorA = [None]*n
    rangeA = [None]*n
    batteryA = [None]*n
    startTimeA = [None]*n

    df = pd.DataFrame(columns=['Loc', 'Amount', 'Time'])

    for j in range(m):
        for i in range(n):
            # Fresh random itinerary, vehicle and driver for agent i
            x = Itinerary()
            y = Vehicle()
            z = Driver()
            itineraryA[i] = x.destinations
            behaviorA[i] = z.behavior
            rangeA[i] = y.pRange
            batteryA[i] = y.pBattery
            startTimeA[i] = x.startTime
            dfi = findCand(itineraryA[i], behaviorA[i], rangeA[i], batteryA[i], startTimeA[i])
            # Append this agent's result and record its agent/iteration labels
            df = df.append(dfi)
            agentA = agentA + [i]*len(dfi)
            iterationA = iterationA + [j]*len(dfi)

    df['Ag'] = agentA
    df['It'] = iterationA
I run it for n agents in each of m iterations. Each dataframe I get from
`findCand` is appended to the dataframe `df`, and the agent and iteration labels are added as two extra columns at the end.
The code works, but it’s taking a crazy amount of time to run.
With m=10, it takes about a second.
With m=100, it takes around 97 seconds.
With m=1000, it ran for more than 3 hours before I stopped it.
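For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of a timing harness. It does not use my real classes: `simulate` is a hypothetical stand-in that only grows a list the way `agentA` is grown in the loop, and `time_run` is a helper name made up for this example.

```python
import time

def time_run(run, sizes):
    """Call run(size) once per size and return (size, seconds) pairs."""
    results = []
    for size in sizes:
        start = time.perf_counter()
        run(size)
        results.append((size, time.perf_counter() - start))
    return results

def simulate(m, n=20):
    # Hypothetical stand-in for the real double loop (assumption, not my actual code):
    # it rebuilds the list on every step, like agentA = agentA + [i]*len(dfi).
    agentA = []
    for j in range(m):
        for i in range(n):
            agentA = agentA + [i] * 3  # pretend every result has 3 rows
    return agentA

timings = time_run(simulate, [10, 100])
for m, seconds in timings:
    print(f"m={m}: {seconds:.4f} s")
```

Swapping `simulate` for the real loop would give timings like the ones listed above.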
I need to do this for much higher values of both n and m. I realize that appending to a dataframe this often takes a lot of time, but I’ve tried a few other methods:
- I used a dictionary and appended larger dataframes fewer times.
- I used lists instead of dataframes and then made one large dataframe in the end.
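To make the second variant concrete, here is a minimal sketch of what a list-based version can look like. It is simplified and makes assumptions: `find_cand_stub` is a hypothetical stand-in for `findCand` that returns a tiny frame with the same three columns, and n and m are kept small just for illustration.

```python
import pandas as pd

def find_cand_stub(agent):
    # Hypothetical stand-in for findCand: two rows with the same three columns.
    return pd.DataFrame({'Loc': [agent, agent + 1],
                         'Amount': [1.0, 2.0],
                         'Time': [0, 5]})

n, m = 3, 2
frames = []  # collect the per-agent frames here instead of appending to df

for j in range(m):
    for i in range(n):
        dfi = find_cand_stub(i)
        dfi['Ag'] = i  # agent label for every row of this frame
        dfi['It'] = j  # iteration label for every row of this frame
        frames.append(dfi)

# Build the large dataframe once, after both loops have finished.
df = pd.concat(frames, ignore_index=True)
print(df.shape)
```

The idea is that `pd.concat` copies the data once at the end, rather than copying the whole accumulated dataframe on every step.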
But these methods took just as long or even longer than the one above. So my question is, can anyone suggest areas that I can improve that might shorten the runtime?