algorithms – Is parallel permutation generation possible?

I have a character set of 20 characters, for example [0-9a-j]. Is there an algorithm for generating permutations that scales well with an arbitrary number of cores or threads? I'd like to use my GPU's CUDA cores for this, but I'm having trouble with these kinds of algorithms and their complexity. If it can be done in parallel, how would I calculate the time needed to finish the generation as a function of the hardware, the number of characters in the set, and the maximum length of a permutation?
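
To make the question concrete, here is a minimal sketch of one possible framing, not a definitive answer. It assumes "permutation" means an arrangement with no repeated characters, and that each worker can be handed an integer index and decode it into its own permutation without talking to the other workers. The function names and the `rate_per_second` parameter (a throughput you would have to measure on your own hardware) are hypothetical.

```python
from math import perm


def kth_permutation(index, alphabet, k):
    """Decode `index` into the index-th length-k arrangement (no repeated
    characters) of `alphabet`, in lexicographic order of the alphabet."""
    pool = list(alphabet)
    n = len(pool)
    if not 0 <= index < perm(n, k):
        raise ValueError("index out of range")
    out = []
    for pos in range(k):
        # Number of ways to fill the positions after this one
        # from the characters that remain once this one is chosen.
        block = perm(n - pos - 1, k - pos - 1)
        digit, index = divmod(index, block)
        out.append(pool.pop(digit))
    return "".join(out)


def total_count(n, max_len):
    """Number of arrangements of length 1..max_len drawn from n characters."""
    return sum(perm(n, k) for k in range(1, max_len + 1))


def estimated_seconds(n, max_len, rate_per_second):
    """Rough time estimate: total work divided by a measured throughput
    (permutations per second across all workers). Assumption: the work
    splits evenly and throughput stays constant."""
    return total_count(n, max_len) / rate_per_second
```

Under these assumptions, worker `w` out of `W` could handle indices `w, w + W, w + 2W, ...` for each length, so the work splits without coordination, and the time estimate reduces to total count divided by measured throughput. Whether this maps well onto CUDA threads is exactly the part I'm unsure about.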