python 3.x – How can I shorten the runtime of my simulation?


Before you read the code below, note the following explanation:
I have three classes: Driver, Vehicle, and Itinerary.

They have attributes Driver.behavior, Vehicle.pRange, Vehicle.pBattery, Itinerary.destinations, and Itinerary.startTime.

The function findCand takes all of these attributes as input and returns a dataframe with three columns (Loc, Amount, and Time).

So here is the code that I use to run my simulation:

import pandas as pd

n = 1000  # agents per iteration
m = 1000  # iterations

agentA = []
iterationA = []
itineraryA = [None]*n
behaviorA = [None]*n
rangeA = [None]*n
batteryA = [None]*n
startTimeA = [None]*n
df = pd.DataFrame(columns = ['Loc', 'Amount', 'Time'])

for j in range(m):
    for i in range(n):
        x = Itinerary()
        y = Vehicle()
        z = Driver()
        itineraryA[i] = x.destinations
        behaviorA[i] = z.behavior
        rangeA[i] = y.pRange
        batteryA[i] = y.pBattery
        startTimeA[i] = x.startTime
        dfi = findCand(itineraryA[i], behaviorA[i], rangeA[i], batteryA[i], startTimeA[i])
        # DataFrame.append needs pandas < 2.0; pd.concat([df, dfi]) is the equivalent
        df = df.append(dfi)
        # one agent label and one iteration label per row of dfi
        agentA = agentA + [i]*len(dfi)
        iterationA = iterationA + [j]*len(dfi)

df['Ag'] = agentA
df['It'] = iterationA

The inner loop runs n agents for each of m iterations, so findCand is called n*m times. Each dataframe I get from findCand is appended to the dataframe df.

The code works, but it’s taking a crazy amount of time to run.

If n=10 and m=10, it takes about a second.

If n=100 and m=100, it takes around 97 seconds.

I put n=1000 and m=1000 and it was running for more than 3 hours before I stopped it.
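
As a back-of-the-envelope check on these numbers, the sketch below divides each reported runtime by the number of findCand calls (n*m). The helper names are made up for illustration, and "about a second" is only a rough figure. Both runs come out near 10 ms per call, which would suggest that in this range the runtime grows roughly in proportion to n*m and is dominated by the per-call work rather than by the repeated appends, although two rough measurements are not much to go on.

```python
def per_call_ms(n, m, seconds):
    """Average wall-clock time per inner-loop pass, in milliseconds."""
    return 1000.0 * seconds / (n * m)


def projected_hours(n, m, ms_per_call):
    """Runtime in hours if every pass costs ms_per_call milliseconds."""
    return n * m * ms_per_call / 1000.0 / 3600.0


# Timings reported above: (n, m, seconds).
for n, m, seconds in [(10, 10, 1.0), (100, 100, 97.0)]:
    print(f"n={n}, m={m}: {per_call_ms(n, m, seconds):.1f} ms per call")

# Extrapolate the n=100, m=100 rate to n=1000, m=1000.
rate = per_call_ms(100, 100, 97.0)
print(f"projected: {projected_hours(1000, 1000, rate):.1f} hours")
```

At that rate the n=1000, m=1000 run would need a bit under three hours, which is in the same ballpark as the run that was stopped after more than three.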

I need to run this for much larger values of both n and m. I realize that appending to a dataframe this often takes a lot of time, but I’ve tried a few other methods:

  • I used a dictionary and appended larger dataframes fewer times.
  • I used lists instead of dataframes and then made one large dataframe in the end.
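
The second variant can be sketched as follows. This is a minimal illustration, not the exact code that was timed: find_cand_stub is a hypothetical stand-in for findCand that returns a tiny three-column frame, and the Driver, Vehicle, and Itinerary objects are left out. The per-agent frames are collected in a plain list and concatenated once at the end.

```python
import pandas as pd


def find_cand_stub(agent, iteration):
    """Hypothetical stand-in for findCand: returns a small 3-column frame."""
    return pd.DataFrame({
        "Loc": [agent, agent + 1],
        "Amount": [1.0, 2.0],
        "Time": [iteration, iteration],
    })


def run(n, m):
    frames = []
    for j in range(m):
        for i in range(n):
            dfi = find_cand_stub(i, j)
            # Label the rows here instead of growing parallel lists.
            dfi["Ag"] = i
            dfi["It"] = j
            frames.append(dfi)
    # One concatenation at the end instead of one append per agent.
    return pd.concat(frames, ignore_index=True)


result = run(n=3, m=2)
print(result.shape)
```

With this layout the only work left inside the loop is the call that produces dfi, so any remaining runtime has to come from that call or from building the objects it needs.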

But these methods took just as long or even longer than the one above. So my question is, can anyone suggest areas that I can improve that might shorten the runtime?