I am creating a vectorized reinforcement learning environment. It runs multiple instances of a board game in lockstep. Up until now, the state of the environment was one specific representation of the board. However, I now need to run tests with different board representations, and later the state might expand to include data besides the board representation (although that is very unlikely).
Structure-wise I am using Data-Oriented Design rather than Object-Oriented Programming. As a result, I have a contiguous vector for each environment attribute (board, turns, score) and not a vector of environments. This way of processing resulted in a 5-6x speed increase with otherwise identical functions.
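To make the layout concrete, here is a minimal sketch of the one-vector-per-attribute structure described above. The names (VecEnv, kCells, board, advance_turns) and the element types are assumptions for illustration only; just the attribute list (board, turns, score) comes from my actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed board size for illustration (e.g. a 3x3 board).
constexpr std::size_t kCells = 9;

// One contiguous vector per attribute instead of a vector of environment objects.
struct VecEnv {
    std::vector<std::int8_t> boards;   // n_envs * kCells cells, environment after environment
    std::vector<std::int32_t> turns;   // one entry per environment
    std::vector<float> scores;         // one entry per environment

    explicit VecEnv(std::size_t n_envs)
        : boards(n_envs * kCells, 0), turns(n_envs, 0), scores(n_envs, 0.0f) {}

    std::size_t size() const { return turns.size(); }

    // Pointer to the first cell of the board belonging to environment `env`.
    std::int8_t* board(std::size_t env) {
        assert(env < size());
        return boards.data() + env * kCells;
    }
};

// A batch operation only touches the attribute it needs.
inline void advance_turns(VecEnv& env) {
    for (std::int32_t& t : env.turns) ++t;
}
```

A loop such as advance_turns walks a single dense array, which is presumably where most of the speed-up comes from.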
What options do I have in supporting multiple state types, especially if performance is a key aspect? (There is no need to support multiple types in the same instance of a vectorized environment.)
In my initial approach (after using a single std::vector with “fancy” indexing) I used a std::vector of std::arrays. This was sufficient for handling one board representation, but not for multiple. Therefore, in my latest approach I switched to a custom templated class with a std::array as its member. The size of the array was set by an enum template argument. (I wanted my data to be in contiguous memory, so I kept the vector-of-arrays design and used templating to set the size needed to hold all the features of the representation.) This method had advantages such as type safety and handling different scenarios through simple overloads. The disadvantages include the requirement of an extra template argument (who would have thought) and, most importantly, I am not convinced that it could later be expanded to support a state that consists of something other than the board.
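A simplified sketch of this templated approach, not my real code: the enum values (Compact, OneHot), the feature counts, and the names State, VecEnv, feature_count and reset are all made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical representations; the real ones differ.
enum class Repr { Compact, OneHot };

// Assumed feature counts: 9 cells, or 9 cells x 3 one-hot channels.
constexpr std::size_t feature_count(Repr r) {
    return r == Repr::Compact ? 9 : 27;
}

// The enum template argument fixes the array size at compile time.
template <Repr R>
struct State {
    std::array<std::int8_t, feature_count(R)> features{};
};

// Vector-of-arrays storage: the State objects sit back to back in memory.
template <Repr R>
struct VecEnv {
    std::vector<State<R>> states;
    std::vector<std::int32_t> turns;

    explicit VecEnv(std::size_t n_envs) : states(n_envs), turns(n_envs, 0) {}
};

// Representation-specific behaviour is picked by plain overloads.
inline void reset(State<Repr::Compact>& s) { s.features.fill(0); }

inline void reset(State<Repr::OneHot>& s) {
    s.features.fill(0);
    // Assumed encoding: channel 0 of each cell means "empty".
    for (std::size_t cell = 0; cell < 9; ++cell) s.features[cell * 3] = 1;
}
```

Every function that touches the state has to carry the template argument, which is the extra burden mentioned above.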
(To clarify the last part: if the state only requires the boards, then handling multiple representations is no issue. However, if a certain representation needs an extra argument, then the overloading only works if I add that extra argument as a dummy argument to all the other functions.)
My ideas for the next approach include switching the type enum from a template argument to a member variable. The states could be handled with a single std::vector and fancy indexing, or each representation could have its own storage allocated with only the selected one used. The proper function calls would be selected by branching (so multi-argument state creation is no issue). This approach seems rather flexible, but it may cost performance.
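Roughly, the single-vector variant of this idea could look like the sketch below. It is untested for performance, and all names (Repr, VecEnv, feature_count, reset_all) and feature counts are hypothetical. The sketch branches once per batch operation rather than once per environment, which I assume would keep the cost of the runtime switch small.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical representations and feature counts.
enum class Repr { Compact, OneHot };

inline std::size_t feature_count(Repr r) {
    switch (r) {
        case Repr::Compact: return 9;
        case Repr::OneHot:  return 27;
    }
    return 0;  // not reached for valid enum values
}

class VecEnv {
public:
    // The representation is a runtime member, not a template argument.
    VecEnv(Repr repr, std::size_t n_envs)
        : repr_(repr),
          stride_(feature_count(repr)),
          n_envs_(n_envs),
          features_(n_envs * feature_count(repr), 0) {}

    std::size_t stride() const { return stride_; }

    // "Fancy" indexing: environment i starts at offset i * stride.
    std::int8_t* state(std::size_t env) {
        assert(env < n_envs_);
        return features_.data() + env * stride_;
    }

    // One branch per batch call; the inner loops stay branch-free.
    void reset_all() {
        switch (repr_) {
            case Repr::Compact: reset_compact(); break;
            case Repr::OneHot:  reset_one_hot(); break;
        }
    }

private:
    void reset_compact() {
        std::fill(features_.begin(), features_.end(), std::int8_t{0});
    }

    void reset_one_hot() {
        std::fill(features_.begin(), features_.end(), std::int8_t{0});
        // Assumed encoding: channel 0 of each cell means "empty".
        for (std::size_t i = 0; i < features_.size(); i += 3) features_[i] = 1;
    }

    Repr repr_;
    std::size_t stride_;
    std::size_t n_envs_;
    std::vector<std::int8_t> features_;
};
```

Since each representation gets its own private function here, a representation that needs extra data could take extra arguments without forcing dummy arguments on the others.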
What did I miss? How could my approaches be improved? What other solutions do you see?