python – How to optimize this majority vote method

I have the following code to do a majority vote for data in a dataframe:

import pandas as pd

def vote(df, systems):
    test = df.drop_duplicates(subset=['begin', 'end', 'case', 'system'])
    n = int(len(systems)/2)
    data = []

    for row in test.itertuples():
        # get all rows sharing this row's (begin, end, case) key
        fx = test.loc[(test.begin == row.begin) & (test.end == row.end) & (test.case == row.case)]
        fx = fx.loc[fx.system.isin(systems)]

        # keep if in a majority of systems
        if len(set(fx.system.tolist())) > n:
            data.append(fx)

    out = pd.concat(data, axis=0, ignore_index=True)
    out = out.drop_duplicates(subset=['begin', 'end', 'case'])

    return out[['begin', 'end', 'case']]

The data look like:

systems = ('A', 'B', 'C', 'D', 'E')

df = begin,end,system,case
0,9,A,0365
10,14,A,0365
10,14,B,0365
10,14,C,0365
28,37,A,0366
38,42,A,0366
38,42,B,0366
53,69,C,0366
56,60,B,0366
56,60,C,0366
56,69,D,0366
64,69,E,0366
83,86,B,0367
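
For anyone who wants to reproduce this, the sample above can be loaded roughly as follows. This is only a sketch: the name `SAMPLE` is made up for illustration, and reading `case` as a string is an assumption made so that the leading zero in values like 0365 survives.

```python
import io

import pandas as pd

# Hypothetical name: the sample rows from the question as CSV text
SAMPLE = """begin,end,system,case
0,9,A,0365
10,14,A,0365
10,14,B,0365
10,14,C,0365
28,37,A,0366
38,42,A,0366
38,42,B,0366
53,69,C,0366
56,60,B,0366
56,60,C,0366
56,69,D,0366
64,69,E,0366
83,86,B,0367
"""

# Assumption: "case" is an identifier, so read it as str to keep the leading zero
df = pd.read_csv(io.StringIO(SAMPLE), dtype={"case": str})
systems = ["A", "B", "C", "D", "E"]
```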

The expected output should be:

out = begin,end,case
    10,14,0365
    56,69,0366

In other words, if a given combination of begin, end, and case appears in a majority of the systems, it is kept, and all such combinations are returned as a dataframe.

The algorithm works fine, but since the dataframe has hundreds of thousands of rows, it takes quite a while to process.

One optimization I can think of, but am unsure how to implement, is in the itertuples iteration. If, for the first instance of a filter set begin, end, case, there are matches in

fx = test.loc[(test.begin == row.begin) & (test.end == row.end) & (test.case == row.case) & (test.system.isin(systems))]

then it would be beneficial not to iterate over the other rows in the itertuples iterable that match this filter. For example, after the first instance of 10,14,A,0365 there is no need to check the next two rows, since they have already been evaluated. However, since the iterable is already fixed, I can't think of a way to skip them.
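
One way the "evaluate each begin, end, case only once" idea might be expressed is to group on those three columns instead of iterating over rows, so every key is visited exactly once. The following is a minimal sketch, not a tested replacement: `vote_grouped` is a hypothetical name, it assumes exact matches on begin, end, and case just like the loop above (no overlap-based matching), and it returns an empty frame rather than raising when nothing clears the threshold.

```python
import pandas as pd


def vote_grouped(df, systems):
    """Hypothetical groupby-based variant of vote(); exact key matches only."""
    keys = ["begin", "end", "case"]
    # Keep only rows that come from the systems of interest
    subset = df[df["system"].isin(systems)]
    # Count distinct systems per (begin, end, case); each key is evaluated once
    counts = subset.groupby(keys, sort=False)["system"].nunique()
    # Same threshold as int(len(systems)/2) in the original loop
    threshold = len(systems) // 2
    majority = counts[counts > threshold]
    # Turn the surviving keys back into ordinary columns
    return majority.index.to_frame(index=False)


# Small illustrative frame (the first rows of each case from the sample)
demo = pd.DataFrame({
    "begin": [0, 10, 10, 10, 38, 38],
    "end": [9, 14, 14, 14, 42, 42],
    "system": ["A", "A", "B", "C", "A", "B"],
    "case": ["0365", "0365", "0365", "0365", "0366", "0366"],
})
result = vote_grouped(demo, ["A", "B", "C", "D", "E"])
print(result)
```

On these six rows only 10,14,0365 appears in more than two of the five systems, so it is the only key kept. Because `nunique` counts distinct systems, the initial `drop_duplicates` on begin, end, case, system should not be needed in this variant.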