First off – I’m not a Computer Scientist, I’m a Software Developer – so when it comes to presenting an idea formally to a Computer Scientist, I have no idea how to do so. As such, I’m wondering if someone would be good enough to show me how to write the algorithm/idea I’ve outlined below in some form of formal algorithmic notation, please?
Say I have a list of ‘words’ made up from characters of the English alphabet. Essentially, I want to split this list of ‘words’ up into twenty-six sub-lists, where each sub-list is associated with one letter of the alphabet – a, b, c, etc. Each ‘word’ should be moved to the sub-list associated with the character that the ‘word’ begins with – so ‘apple’ would go in the ‘a’ sub-list, ‘banana’ would go in the ‘b’ sub-list, etc. BUT, I only want to divide my original list up into sub-lists provided that there are at least X ‘words’ in the list that begin with each letter of the alphabet (so if X was 2, there would need to be at least two ‘words’ beginning with ‘a’, at least two ‘words’ beginning with ‘b’, …, and at least two ‘words’ beginning with ‘z’). In essence, it’s either one list with all the ‘words’ in it or 26 sub-lists with at least X ‘words’ in each.
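To make this first step concrete, here is a minimal Python sketch of it. The function name `split_by_first_letter`, the use of the empty string as the key for the undivided list, and the assumption that every ‘word’ is non-empty and lowercase are all just illustrative choices, not part of the problem itself.

```python
from string import ascii_lowercase


def split_by_first_letter(words, x):
    """Split words into 26 sub-lists by first letter, but only if every
    letter ends up with at least x words; otherwise keep one list.

    Assumes each word is non-empty and uses only lowercase a-z.
    """
    # Start with an empty sub-list for every letter, so that letters
    # with no words at all are still counted (and fail the check).
    buckets = {letter: [] for letter in ascii_lowercase}
    for word in words:
        buckets[word[0]].append(word)

    # All-or-nothing: either every sub-list is big enough, or no split.
    if all(len(bucket) >= x for bucket in buckets.values()):
        return buckets
    return {"": list(words)}
```

The all-or-nothing check is the important part: a single letter with fewer than X ‘words’ means the original list is returned untouched.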
Assuming I was able to split the list of ‘words’ up into sub-lists as described in the first step above, I then want to further divide each sub-list based on the value of the second character in each ‘word’. So there would be an ‘aa’ sub-list, an ‘ab’ sub-list, …, and a ‘zz’ sub-list. Again, I only want to do any further division of the sub-lists provided there are at least X ‘words’ that begin with every possible two-character combination of English alphabet letters – so, if X was 2, at least two ‘words’ beginning with ‘aa’, two ‘words’ beginning with ‘ab’, …, and two ‘words’ beginning with ‘zz’. In essence, it’s either 26 sub-lists or 676 sub-lists.
I want this dividing process to continue (character three, character four, etc.) until it is no longer possible to satisfy the criterion that every sub-list contains at least X ‘words’, with one sub-list for every possible combination of leading characters at the length currently under consideration.
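One way the process described so far might be stated in mathematical notation is sketched below. The symbols are my own assumptions, not established notation for this problem: $\Sigma$ for the alphabet, $n$ for the common ‘word’ length, $W_p$ for the sub-list of ‘words’ beginning with prefix $p$, and $k^{*}$ for the depth at which dividing stops.

```latex
Let $\Sigma = \{a, b, \dots, z\}$, let $W$ be a finite list of words in
$\Sigma^{n}$, and let $X \ge 1$ be an integer. For $0 \le k \le n$ and a
prefix $p \in \Sigma^{k}$, define
\[
  W_p = [\, w \in W : w_1 w_2 \cdots w_k = p \,].
\]
The stopping depth is
\[
  k^{*} = \max \bigl\{\, k \in \{0, \dots, n\} :
          k = 0 \ \text{or}\ \forall p \in \Sigma^{k},\ |W_p| \ge X \,\bigr\},
\]
and the result is the family of sub-lists
\[
  \{\, W_p : p \in \Sigma^{k^{*}} \,\}.
\]
```

Taking the maximum matches the "keep dividing until it fails" wording because the condition at depth $k$ implies the condition at depth $k-1$: each sub-list at depth $k-1$ is the union of 26 sub-lists at depth $k$, so it holds at least $26X \ge X$ ‘words’.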
For argument’s sake at this point, it can be assumed that all ‘words’ are the same length.
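Putting the steps together, the whole process could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the name `split_by_prefix` is made up, all ‘words’ are taken to be lowercase a-z and of equal length, X is taken to be at least 1, and the undivided list is stored under the empty-string key.

```python
from itertools import product
from string import ascii_lowercase


def split_by_prefix(words, x):
    """Repeatedly divide words by ever-longer prefixes, stopping at the
    last depth where every possible prefix has at least x words.

    Assumes all words are lowercase a-z, of equal length, and x >= 1.
    """
    groups = {"": list(words)}  # depth 0: one list with everything
    length = len(words[0]) if words else 0

    for depth in range(1, length + 1):
        # There are 26**depth prefixes, each needing x words, so the
        # split cannot succeed if there are too few words overall.
        if x * 26 ** depth > len(words):
            break

        # One (initially empty) sub-list per possible prefix.
        candidate = {"".join(p): []
                     for p in product(ascii_lowercase, repeat=depth)}
        for word in words:
            candidate[word[:depth]].append(word)

        # All-or-nothing at this depth: any short sub-list ends the process.
        if any(len(group) < x for group in candidate.values()):
            break
        groups = candidate

    return groups
```

The returned dictionary has 1, 26, 676, … entries depending on how deep the division got. The early size check is only there to avoid building a huge table of prefixes for a depth that cannot possibly succeed.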
Any help is very much appreciated.