algorithms – Approach / data structure to wait for N tasks to complete in arbitrary order

My system needs to wait for N tasks to complete and terminate once all of them have finished. The work items are passed in as an immutable array, and their execution is handled externally to my system. Checking for completion is inexpensive, and the items can complete in any order. The lists are relatively small (hundreds or thousands of entries), so memory is not an issue. Because the task array is immutable, any bookkeeping I add has to live in a parallel structure or in a copy. In case it’s relevant, I’m using C++17.

A simple (naive?) approach would be to create a boolean array of completion flags (initialized to false) and set the flag at the corresponding index to true once a work item has completed. The code would then iterate through the task list again and again until all items are completed.

Of course, with this approach, towards the end of the run most of each pass is spent stepping over flags that are already set, with little actual work to do (averaged over the run, I would expect about half of the entries visited in a pass to be ones that are already done).
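For concreteness, here is a minimal sketch of the flag-array approach described above. The `Task` struct, the `is_complete` callback, and the `PollStats` counters are hypothetical stand-ins (the real completion check is whatever the external system provides), and real code would probably sleep or yield between passes.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for an externally executed work item.
struct Task { int id; };

// Counters used only to make the "wasted checking" visible.
struct PollStats {
    std::size_t checks = 0;  // calls to the completion check
    std::size_t skips = 0;   // visits to slots already marked done
};

// Repeatedly sweeps the whole task array until every flag is set.
// `is_complete` is an assumed callback, not part of any real API.
PollStats wait_all_flags(const std::vector<Task>& tasks,
                         const std::function<bool(const Task&)>& is_complete) {
    std::vector<char> done(tasks.size(), 0);  // parallel flags; char avoids vector<bool> proxies
    std::size_t remaining = tasks.size();
    PollStats stats;
    while (remaining > 0) {
        for (std::size_t i = 0; i < tasks.size(); ++i) {
            if (done[i]) {        // already finished: the slot is still visited
                ++stats.skips;
                continue;
            }
            ++stats.checks;
            if (is_complete(tasks[i])) {
                done[i] = 1;
                --remaining;
            }
        }
        // A real implementation would likely sleep or yield here between passes.
    }
    return stats;
}
```

The `skips` counter grows with every pass once tasks start finishing, which is the overhead this approach pays near the end of a run.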

Another approach would be to create a singly-linked list of array indices (initialized to 0 through N − 1) and remove elements as work items are completed. That avoids the “wasted checking” problem of the parallel completion array mentioned above; each pass would visit only the indices in the task list that are not yet known to be completed.
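The linked-list idea could be sketched roughly as follows, using `std::forward_list` from the standard library. As before, `Task` and `is_complete` are hypothetical placeholders for the real work items and the real completion check, and the returned count is only there to show how many checks were made.

```cpp
#include <cstddef>
#include <forward_list>
#include <functional>
#include <vector>

// Hypothetical stand-in for an externally executed work item.
struct Task { int id; };

// Keeps a singly-linked list of pending indices and unlinks each one
// as soon as its task reports completion. Returns the number of checks.
// `is_complete` is an assumed callback, not part of any real API.
std::size_t wait_all_list(const std::vector<Task>& tasks,
                          const std::function<bool(const Task&)>& is_complete) {
    std::forward_list<std::size_t> pending;
    for (std::size_t i = tasks.size(); i-- > 0;) {
        pending.push_front(i);  // pushing in reverse leaves 0..N-1 in order
    }
    std::size_t checks = 0;
    while (!pending.empty()) {
        auto prev = pending.before_begin();
        for (auto it = pending.begin(); it != pending.end();) {
            ++checks;
            if (is_complete(tasks[*it])) {
                it = pending.erase_after(prev);  // unlink; prev stays where it is
            } else {
                prev = it;
                ++it;
            }
        }
        // A real implementation would likely sleep or yield here between passes.
    }
    return checks;
}
```

Each pass touches only indices that are still pending, so the per-pass cost shrinks as tasks finish, at the price of one list node per task.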

It seems that there may be a faster or more elegant approach to this, though, or a data structure that naturally handles it.