Workflow for designing a parallel architecture


I've recently become interested in parallel computing and wanted to ask whether there is a standard workflow for designing a parallel architecture.

In particular I am interested in the answer to the following question:
If you need to write code that you know will eventually be parallelized, is it best to first write the serial version, make sure it is implemented correctly, and then parallelize it, or should you design for parallelism right from the beginning? I suspect the answer is application specific: some codes written serially may need to be changed significantly before they can be parallelized.
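
To make the question concrete, here is a minimal sketch of the "serial first" workflow I have in mind. All names (`work`, `run_serial`, `run_parallel`) are hypothetical placeholders, and the sketch assumes the work items are independent of each other, which is exactly the assumption that may not hold for a real application. The serial version acts as the trusted reference, and the parallel version is checked against it.

```python
from concurrent.futures import ThreadPoolExecutor


def work(x):
    # Hypothetical per-item computation; stands in for the real kernel.
    return x * x + 1


def run_serial(items):
    # Step 1: plain serial implementation, used as the reference result.
    return [work(x) for x in items]


def run_parallel(items, workers=4):
    # Step 2: same computation distributed over a pool of workers.
    # Executor.map returns results in input order, so the output
    # can be compared directly with the serial one.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(work, items))


if __name__ == "__main__":
    data = list(range(1000))
    # Step 3: verify the parallel version against the serial reference.
    assert run_parallel(data) == run_serial(data)
```

This only illustrates the structure of the workflow. A thread pool in CPython will generally not speed up CPU-bound work, and a code with dependencies between items would need a different decomposition, which is the case I am unsure about.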

I'm just looking for a general workflow for arriving at the right architecture for parallelized software.