python – Finding pair of numbers that minimize a function

This is a problem of a past bioinformatic contest (requires an account). My solution works but is too slow for some test cases.

Input format

The first line of the input contains one integer $T$, $(1 leq T leq 3)$ the number of test cases. Each test case is specified by four lines.

The first line of each test case contains three integer numbers $M$, $K$, $N$.

The second line contains $M$ numbers $m_i$ − masses of metabolites $(0 < m_ile 1000)$.

The third line contains $K$ numbers $a_i$ − masses of adducts $(-1000 le a_i le 1000)$.

The fourth line contains $N$ numbers $s_i$ − masses of signals $(0 < s_ile 1000)$.

All the masses are indicated with exactly six decimal places.

Output format

For each signal $s_i$ of each test case, print numbers $j$ and $k$ such that $s_i = m_j+a_k+Delta$, $m_j+a_k > 0$ and the absolute value of $Delta$ is smallest possible. If there are multiple numbers $j$ and $k$ with same absolute value of $Delta$ for some signal, you can print any of them.

Sample input

3
2 2 5
1.000002 0.000002
0.500000 -0.500000
0.500001 0.500002 0.500003 1.000000 0.000001
2 2 5
1.000002 0.000001
0.500000 -0.500000
0.500001 0.500002 0.500003 1.000000 0.000001
5 4 7
0.000001 0.000002 0.000003 0.000004 0.000005
0.000002 0.000010 0.000001 -0.000001
0.000001 0.000002 0.000100 0.000005 0.000020 0.000010 0.000003

Sample output

1 2
1 2
1 2
1 2
1 2
2 1
1 2
1 2
1 2
2 1
2 4
1 3
5 2
3 1
5 2
1 2
1 1

Test cases

1.txt: $M,K,N≤10$
3.zip: $M,K≤1000;N≤10^5$
4.zip: $M≤10^6;K,N≤1000$
5.zip: $M,K,N≤10^4$
answers.zip: to test the solution

Code

from bisect import bisect_left
from time import perf_counter as pc

# Find in arr the closest number to n
def take_closest(arr, n):
    pos = bisect_left(arr, n)
    if pos == 0:
        return arr(0)
    if pos == len(arr):
        return arr(-1)
    before = arr(pos - 1)
    after = arr(pos)
    if after - n < n - before:
        return after
    else:
        return before


def solve(masses, adducts, signals):
    totals = {}
    for i, m in enumerate(masses):
        for j, a in enumerate(adducts):
            ma = m + a
            if ma > 0:
                totals(ma) = (i + 1, j + 1)
    skeys = sorted(totals.keys())
    for s in signals:
        closest = take_closest(skeys, s)
        yield totals(closest)


if __name__ == "__main__":
    test_num = 3
    of = open(f"out{test_num}.txt", "w")
    with open(f"{test_num}.txt", "r") as f:
        t0 = pc()
        t = int(f.readline())
        for _ in range(t):
            M, K, N = map(int, f.readline().strip().split())
            masses = list(map(float, f.readline().strip().split()))
            adducts = list(map(float, f.readline().strip().split()))
            signals = list(map(float, f.readline().strip().split()))

            for j, k in solve(masses, adducts, signals):
                of.write(f'{j} {k}n')
        t1 = pc()
        print(f"Runtime: {round(t1-t0,3)} s")
    of.close()

Algorithm:

  1. Store all sums $m_i + a_j$ in a dictionary with indices $i,j$ as values.
  2. Sort signals
  3. For each signal, find the closest number among the sorted keys of the dictionary using binary search.

Issues:

The solution works but is too slow for test case 4, while test case 5 takes around 10 minutes on my machine.

Any feedback is appreciated.