java – Giffler Thompson Algorithm for decoding individuals of a Genetic Algorithm

First of all: Hi, my name is Henrik and I am a mechanical engineer, who loves to code in his free time.
To get better in programming I wrote a genetic algorithm compined with a list planning algorithm (giffler thompsoon algorithm) to solve the fjssp (i have encountered this problem in my master thesis 2 years ago).

Although i would like to have help / reviews about the whole program, i think the biggest potential for improvement is in the class “Individuum”. Especially the methods “calculateStartingTimeMatrix” and “decodierung” (decoding) seem to me somehow cumbersome, inefficient and confusing (the latter is certainly also due to the length of the methods).

To the programm:
After reading a production process out of an excel file and creating a population with random individuals i create permissible productionsplans for each individual (“Individuum”). To be able to do that i use the giffler thompson algorithm to decode the individuals. This happens in the “decodierung” method. For the giffler thompson algorithm i need to calculate the starting times at the beginning and every time i add a operation to the production schedule. This happens in the calculateStartingTimeMatrix methode.

package planningalgorithm;

import java.util.Random;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;


public class Individuum {
    int number;
    int birthGeneration;
    int nOp;
    int nMa;

    int() indiAllocation;                   // Allocation of this Individual
    int() indiSequence;                     // Sequenz of this Individual
    int()() predecessorWorkingTimes;        // Do i need this?
    int()() startingTimeMatrix;             // Do i need this?
    int() timesProduction;
    float timeFitness;
    int susRank;
    int tournamentWins;

    List<Machine> indiMachines;                 // Machines of this Individual
    List<Operationen> indiProcess;              // Process of this Individual

    int() arrayA;
    int() arrayB;
    int operationDash;
    int operationDashDash;
    int operationStar;


    //Konstruktor
    Individuum(int num, int startgen, int nOp, int nMa){
        number=num;
        this.nOp = nOp;
        this.nMa = nMa;
        birthGeneration=startgen;

        indiAllocation = new int(nOp);
        indiSequence = new int(nOp);
        predecessorWorkingTimes = new int(nOp)(nOp);
        startingTimeMatrix = new int(nOp)(nOp+1);
        timesProduction = new int(nOp);

        // Liste der Maschinen erstellen
        indiMachines = new ArrayList<>(nMa);
        for (int i=0;i<nMa;i++) {
            Machine newMachine = new Machine(i,0);
            indiMachines .add(newMachine);
        }

        // Liste der Prozesse erstellen
        indiProcess = new ArrayList<>(nOp);
        for (int i=0;i<nOp;i++) {
            Operationen newOperation = new Operationen(nOp,nMa);
            indiProcess.add(newOperation);
        }

        arrayA = new int(nOp);
        arrayB = new int(nOp);
    }


    // Zufallszahl 
    private double randomNumber(){
        double ranNum;
        Random zufallszahl = new Random();
        ranNum = zufallszahl.nextDouble();
        return ranNum;
    }

    private double round(double value, int decimalPoints) {
        double d = Math.pow(10, decimalPoints);
        return Math.round(value * d) / d;
     }

    private int max(int() fooArr) {
        int maximum = -100;
        for (int i=0;i<fooArr.length;i++) {
            if (fooArr(i) > maximum) {
                maximum = fooArr(i);
            }
        }
        return maximum;
    }

    public void copyArr (int() arr, int() dest){
        for (int i=0;i<arr.length;i++){
            dest(i) = arr(i);
        }
    }

    public int() copyArraySection (int() originArr, int startPointerArr, int() destArr, int startPointerDest, int lenghtCopy){
        for (int i=0;i<lenghtCopy;i++){
            destArr(startPointerDest+i) = originArr(startPointerArr+i);
        }
        return destArr;
    }    

    public int() addOneToArray (int() arr, int what){
        // Case 1: Array is empty
        if (arr == null){
            int() newArr = new int(1);
            newArr(0) = what;
            return newArr;
        }
        // Case 2: Array isnt empty
        else {
            int() newArr = new int(arr.length + 1);
            for (int i=0;i<arr.length;i++){
                newArr(i) = arr(i);
            }
            newArr(arr.length) = what;
            return newArr;
        }
    }

    public int() oneValue(int() barArr, int value){
        int() oneValueArr = new int(barArr.length);
        for (int i=0;i<barArr.length;i++){
            oneValueArr(i) = value;
        }
        return oneValueArr;
    }
 
    public int() changeValue (int() arrArr, int oldValue, int newValue){
        int() newArrArr = new int(arrArr.length);
        for (int i=0;i<arrArr.length;i++){
            if (arrArr(i) == oldValue){
                arrArr(i) = newValue;
            }
        }
        return newArrArr;
    }

    public int count (int() countArr, int value){
        int countDoku = 0;
        for (int i=0;i<countArr.length;i++){
            if (countArr(i)==value){
                countDoku++;
            }
        }
        return countDoku;
    }

    public int()() copyMatrix (int()() arr) {
        int()() copy = new int(arr.length)(arr(0).length);
        for (int i = 0; i < arr.length; i++) {
            for (int j = 0; j < arr(0).length; j++) {
                copy(i)(j) = arr(i)(j);
            }
        }
        return copy;
    }


    void calculateStartingTimeMatrix(){
         //Reset startingTimeMatrix except the last column nOp which includes the occupation times
         for (int r=0;r<nOp;r++){
            for (int c=0;c<nOp;c++){
                startingTimeMatrix(r)(c) = 0;
            }
        }

        // Reset the status in every operation
        for (int i=0;i<nOp;i++){
            indiProcess.get(i).operationDone = false;
            indiProcess.get(i).operationNotReady = true;
            indiProcess.get(i).operationReadyToStart = false;
        }

        // Reset the remainingPredessors
        for (int i=0;i<nOp;i++){
             copyArr(indiProcess.get(i).predecessor,indiProcess.get(i).remainingPredecessors);
        }

        // Update StartingTimeMatrix with Operation, that are already planned
        for (int i=0;i<nOp;i++){
            if (arrayA(i) == -1){
                indiProcess.get(i).operationDone = true;
                indiProcess.get(i).operationNotReady = false;
                indiProcess.get(i).operationReadyToStart = false;
                for (int z2=0;z2<nOp;z2++){
                    if (indiProcess.get(z2).predecessor(i) == 1){
                        startingTimeMatrix(z2)(i) = indiProcess.get(i).timeEnd;
                        indiProcess.get(z2).remainingPredecessors(i) = 0;
                    }
                }
            }
        }


        for (int i=0;i<nOp;i++){
            if (arrayA(i) == 1){
                indiProcess.get(i).operationDone = false;
                indiProcess.get(i).operationNotReady = false;
                indiProcess.get(i).operationReadyToStart = true;
            }
        }

    
        int operationsDone = 0;
        while (operationsDone < nOp){

            
            for (int z=0;z<nOp;z++){
                int foundOp = 0;

                // Search for starting Operation
                if (indiProcess.get(z).operationReadyToStart == true){
                    foundOp = z;
                    for (int z2=0;z2<nOp;z2++){
                        if (indiProcess.get(z2).predecessor(foundOp) == 1){
                            startingTimeMatrix(z2)(foundOp) = max(startingTimeMatrix(foundOp)) + predecessorWorkingTimes(z2)(foundOp); //t_Op = t_PredecessorStart(from startingTimeMatrix) + t_PredecessorProcess(from predecessorTimes)
                        }
                    }

                // Mark the found operation as done
                for (int z3=0;z3<nOp;z3++){
                    if (indiProcess.get(z3).remainingPredecessors(foundOp) == 1){
                        indiProcess.get(z3).remainingPredecessors(foundOp) = 0; // FoundOp is done and is now no longer a predecessor on which other operations have to wait for
                    }
                }

                indiProcess.get(foundOp).operationDone = true;
                }
            }


            //Calculate new  Starting Operations
            for (int z=0;z<nOp;z++){

                // Operation has no remaining Predecessor and wasnt done yet: Ready to Start
                if (max(indiProcess.get(z).remainingPredecessors)==0 && indiProcess.get(z).operationDone == false){
                    indiProcess.get(z).operationReadyToStart = true;
                    indiProcess.get(z).operationNotReady = false;
                }

                // Operation has no remaining Predecessor, but is already done
                if (max(indiProcess.get(z).remainingPredecessors)==0 && indiProcess.get(z).operationDone == true){
                    indiProcess.get(z).operationReadyToStart = false;
                    indiProcess.get(z).operationNotReady = false;
                }

                // Operation still has remaining Predecessors
                if (max(indiProcess.get(z).remainingPredecessors)!=0){
                    indiProcess.get(z).operationNotReady = true;
                    indiProcess.get(z).operationReadyToStart = false;
                    indiProcess.get(z).operationDone = false;
                }
            }


            //Abbruchbedingungen ermitteln
            operationsDone = 0;
            for (int z=0;z<nOp;z++){
                if(indiProcess.get(z).operationDone == true){
                    operationsDone++;
                }
            }
        }


        //Fertigungszeiten berechnen
        for (int z=0;z<nOp;z++){
            timesProduction(z) = max(startingTimeMatrix(z)) + indiProcess.get(z).timeWorking;
        }
    }

    void correctingAllocation (List<Operationen> operationsList){
        // Liste der Maschinen erstellen
        for (int i=0;i<nOp;i++){
            indiProcess.get(i).availableMachines = new int(nMa);
            copyArr(operationsList.get(i).availableMachines,indiProcess.get(i).availableMachines);
        }


        for (int i=0;i<nOp;i++){
            int requestedMachine = indiAllocation(i);
            if (indiProcess.get(i).availableMachines(requestedMachine) == 0){
                // Requested Machine cant execute the operation
                // Find available Machines
                List<Integer> numbersAvailableMachines = new ArrayList<>(); 
                for (int j=0;j<nMa;j++){
                    if (indiProcess.get(i).availableMachines(j) == 1){
                        numbersAvailableMachines.add(j);
                    }
                }
                // Pick random Machine of the available machines
                double randomMachine =  round((numbersAvailableMachines.size()-1) * randomNumber(),0);
                int newMachine = (int) randomMachine;

                //Put new Machine in Allocation
                indiAllocation(i) = numbersAvailableMachines.get(newMachine);
            }
        }
    }



    // Decodierung
    void decodierung(int()() Vorrangmatrix, int()() Maschinenzeiten){

        // VORBEREITUNG
        //Prozesszeitenmatrix bestimmen aus Zuordnung und Maschinenzeiten und Maschinenzeit in Matrix der Vorgänger-Prozess-Zeiten eintragen
        for (int z=0;z<nOp;z++){
            indiProcess.get(z).workingMachine = indiAllocation(z); 
            indiProcess.get(z).timeWorking = Maschinenzeiten(z)(indiProcess.get(z).workingMachine);
        }
    

        // z=zeile/row
        // s=spalte/column
        for (int z=0;z<nOp;z++){
            for (int s=0;s<nOp;s++){
                if  (Vorrangmatrix(z)(s)==1){
                    predecessorWorkingTimes(z)(s) = indiProcess.get(s).timeWorking;
                }
                else{
                    predecessorWorkingTimes(z)(s) = 0;
                }
            }
        }

        
        // Get the predecessors of every Operation and save them under Process.get(x).Predecessors
        for (int i=0;i<nOp;i++){
            copyArr(Vorrangmatrix(i),indiProcess.get(i).predecessor);
        }
    

        int()() arrayV = copyMatrix(Vorrangmatrix);
        for (int z=0;z<nOp;z++){
            if (max(Vorrangmatrix(z)) < 0.1)
                arrayA(z) = 1;
            else{
                arrayA(z) = 0; 
            }
        }

        calculateStartingTimeMatrix();

        // Beginn Giffler-Thompson Algorithmus
        int gtaBreakingCondition = 0;
        while (gtaBreakingCondition < nOp){
            operationDash = 0;
            operationDashDash = 0; 
            operationStar = 0;

            // GT - Step 2.H1: Get O' (Operation with earliest Productiontime, finsihed first)
            int earliestProductiontime = 1000;
            for (int z=0;z<nOp;z++){
                if (arrayA(z) == 1 && timesProduction(z)< earliestProductiontime){
                     operationDash = z;
                     earliestProductiontime = timesProduction(z);
                }
            }

            // GT - Step 2.H2: Get B, all Operations of A on the same Machine as O'
            int currentMachine = indiAllocation(operationDash);

            for (int z=0;z<nOp;z++){
                if (arrayA(z)==1 && indiAllocation(z)==currentMachine){
                    arrayB(z)=1;
                }
                else{
                    arrayB(z) = 0;
                }
            }
            
            // GT - Step 2.H3: Get O'' (Operation of B with ealiest starting Time)
            int earliestStartingtime = 1000;
            for (int z=0;z<nOp;z++){
                if (arrayB(z)==1 && max(startingTimeMatrix(z))<earliestStartingtime){
                    operationDashDash = z;
                    earliestStartingtime = max(startingTimeMatrix(z));
                }
            }


            // GT - Step 2.H4: Delete operations outside of the sigma window
            
            //Check if earliest starttime is smaller then occupation
            int starttime;
            if (max(startingTimeMatrix(operationDashDash)) < indiMachines.get(currentMachine).timeOccupation){
                starttime = indiMachines.get(currentMachine).timeOccupation;
            }
            else{
                starttime = max(startingTimeMatrix(operationDashDash));
            } 
            
            double sigma = 0.5;

            // Delete Operations outside
            for (int z=0;z<nOp;z++){
                if (arrayB(z)==1 && max(startingTimeMatrix(z)) > (starttime + sigma * (timesProduction(operationDash) - starttime))){
                    arrayB(z) = 0;
                }
            }


            //GT - Step 2.H5: Get O*
            int permutation = 1000;
            for (int z=0;z<nOp;z++){
                if (arrayB(z)==1 && indiSequence(z)<permutation){
                    permutation = indiSequence(z);
                    operationStar = z;
                }
            }


            // Mark OperationStern as done
            arrayA(operationStar) = -1; // OperationStar is done, mark with -1 in A
            for (int z=0;z<nOp;z++){
                if (arrayA(z) != -1){
                    arrayV(z)(operationStar) = 0; // OperationStar is no longer predecessor of other operations
                }
            }
            arrayV(operationStar) = oneValue(arrayV(operationStar), -1); // OperationStar is done, mark with -1 in V

            // Set the starting time of operationStar
            if (max(startingTimeMatrix(operationStar)) < indiMachines.get(currentMachine).timeOccupation){
                indiProcess.get(operationStar).timeStart = indiMachines.get(currentMachine).timeOccupation;
            }
            if (max(startingTimeMatrix(operationStar)) >= indiMachines.get(currentMachine).timeOccupation){
                indiProcess.get(operationStar).timeStart = max(startingTimeMatrix(operationStar));
            }

            // Set the ending time of operationStar
            indiProcess.get(operationStar).timeEnd = indiProcess.get(operationStar).timeStart + indiProcess.get(operationStar).timeWorking;

            // Set new occupation time of the current machine
            indiMachines.get(currentMachine).timeOccupation = indiProcess.get(operationStar).timeEnd;


            // GT - Schritt 4: Add successors of O* to A
            for (int z=0;z<nOp;z++){

                if (max(arrayV(z)) == 0){
                    arrayA(z) = 1;
                }
                if (max(arrayV(z)) == -1){
                    arrayA(z) = -1;
                }
                if (max(arrayV(z)) == 1){
                    arrayA(z) = 0;
                }
            }

            //GT - Step 5: Update occupation time of the current machine to every operation which is also done on that machine
            for (int z=0;z<nOp;z++){
                if (arrayA(z) != -1 && indiAllocation(z) == currentMachine){
                    startingTimeMatrix(z)(nOp) = indiProcess.get(operationStar).timeEnd;
                }
            }


            //Update startingTimeMatrix, because the occupation time in startingTimeMatrix(z)(nOp) was updated
            calculateStartingTimeMatrix();


            //Give the working Machine some Information about the Operation: Needed for the BarChart
            indiMachines.get(currentMachine).plannedOperations = addOneToArray(indiMachines.get(currentMachine).plannedOperations, operationStar);
            indiMachines.get(currentMachine).startingTimesOps = addOneToArray(indiMachines.get(currentMachine).startingTimesOps, indiProcess.get(operationStar).timeStart);
            indiMachines.get(currentMachine).endingTimesOps = addOneToArray(indiMachines.get(currentMachine).endingTimesOps, indiProcess.get(operationStar).timeEnd);

            // Calculate Breaking Condition
            gtaBreakingCondition = count(arrayA, -1);
        }
    }


    // 1-Bit-Mutation
    void einbitmutation(int nOp, double mutProbability){
        
        for (int i = 0; i<nOp; i++) {
            double random = randomNumber();
            if (random<mutProbability){
                if (indiAllocation(i)==1){
                    indiAllocation(i)=0;
                } 
                else {
                    indiAllocation(i)=1;
                }
            }
        }
    }

    //Mixed Mutation
    void mixedmutation(int nOp, float mixMutProbability, int typeCoding){

        double random = randomNumber();
        if(random > mixMutProbability){
            int sectionStart = (int) round(randomNumber()*nOp,0);
            int sectionEnd = (int) round(randomNumber()*nOp,0);
            while (sectionEnd == sectionStart){
                sectionEnd = (int) round(randomNumber()*nOp,0);
            }
            if (sectionEnd < sectionStart){
                int temp = sectionStart;
                sectionStart = sectionEnd;
                sectionEnd = temp;
            }

            int() tempArr = new int(sectionEnd-sectionStart);
            if (typeCoding == 0){
                tempArr = copyArraySection(indiAllocation, sectionStart, tempArr, 0, sectionEnd-sectionStart);
                List<Integer> tempList = new ArrayList<>();
                for (int k=0;k<tempArr.length;k++){
                    tempList.add(tempArr(k));
                }
                Collections.shuffle(tempList);
                int() mixedArr = tempList.stream().mapToInt(i->i).toArray();
                indiAllocation = copyArraySection(mixedArr, 0, indiAllocation, sectionStart, sectionEnd-sectionStart);
            }
            else if (typeCoding == 1){
                tempArr = copyArraySection(indiSequence, sectionStart, tempArr, 0, sectionEnd-sectionStart);
                List<Integer> tempList = new ArrayList<>();
                for (int k=0;k<tempArr.length;k++){
                    tempList.add(tempArr(k));
                }
                Collections.shuffle(tempList);
                int() mixedArr = tempList.stream().mapToInt(i->i).toArray();
                indiSequence = copyArraySection(mixedArr, 0, indiSequence, sectionStart, sectionEnd-sectionStart);
            }
            else{
                System.out.println("Wrong Input for Coding Type of Mixed Mutation");
            }
        }
    }

    // Swap-Mutation
    void swapmutation(int nOp, float mutProbability){

        for (int i = 0; i<nOp; i++) {
            double random = randomNumber();
            if (random<mutProbability) {
                double random2 = round(randomNumber()*(nOp-1),0);
                int randomposition = (int)random2;
                int saveNumber = indiSequence(randomposition);
                indiSequence(randomposition) = indiSequence(i);
                indiSequence(i) = saveNumber;
            }
        }
    }

}



I would like to describe the programm even a bit more, but i already wrote a lot. You can fine the whole program with more explanations, pictures and examples here:
https://github.com/enruk/Productionplaner

I know its not allowed to link to third party website but i really hesitate to copy every class in here. I don’t think this helps clarity and understanding.

Why I post it here?
My problem is that in self-taught programming, I only work up to the point where the program works. I miss a kind of mentor who tells me: Yes, this works, but it can be done better (which is i guess the point of this website).

As already mentioned, I am also happy to receive feedback on the remaining classes of the program or the entire program (but i know that this is not the purpose of this website). I know the program is quiet big to give a detailed feedback. What I look for are more general tips / advices. I appreciate the smallest advices.
What can i do better / on which topics should I work / research more?

  • Structure of the program
  • Basic algorithms
  • Structure and use of methods
  • Use of data structures

Thank you very much already!
I know there are some german comments / varaibles names. Apologies. I will fix that asap.

Also I am happy about good recommendations for online courses.
I am currently doing this one:
https://www.udemy.com/course/java-training-ullenboom/

For classifier systems, how is the genetic algorithm run on delayed class?

I am reading Signals and Boundaries and came across Classifier Systems. In the book, John Holland seems to imply that the signals in the input, as they come from a live environment, don’t have an immediate output known to the system (there is delay between the classification and the action). In every explanation I have seen online (e.g., https://en.wikipedia.org/wiki/Learning_classifier_system#Rule/classifier/population), it seems that it is expected that the input data comes in tag/class pairs, which makes it not true online learning. My question is: given that we do not know the class of a sample data tag in runtime, how is the fitness computed for the genetic algorithm step of the CS during online learning?

javascript – Genetic Algorithm Typescript

javascript – Genetic Algorithm Typescript – Code Review Stack Exchange

python – Genetic algorithm to guess coefficient of a polynomial

I have tried to code a genetic algorithm to guess the coefficients of a degree 4 polynomial. The information initially provided is values of y = f(x) for different x using the original polynomial. I then generate 100 polynomials with randomly selected coefficients. These polynomials are then ranked based on the least square difference (LSD) and sorted such that lower LSD has higher rank. These ranks are then converted into nonlinear ranking and mapped to a line of unit length. Individuals are then selected and real valued crossover is used to generate progeny, who are also sorted according to LSD. The worst individuals in the starting pool are then replaced by progeny.
Please suggest improvements where possible. I am intermediate in Python.

"""
Genetic Algorithm implementation of finding coefficients of a polynomial
3.0 - 4.3 * x + 5.9 * x ** 2 - 5.2 * x ** 3 + 1.0 * x ** 4 = 0
"""

#%% import modules
import numpy as np
import matplotlib.pyplot as plt
from copy import deepcopy

# np.set_printoptions(linewidth=np.inf)


#%% define a function to return the polynomial
def poly(x):
    return 3.0 - 4.3 * x + 5.9 * x ** 2 - 5.2 * x ** 3 + 1.0 * x ** 4


def gen_poly(array, x):
    return (
        array(0)
        + array(1) * x
        + array(2) * x ** 2
        + array(3) * x ** 3
        + array(4) * x ** 4
    )


#%% define function to measure fitness
### takes an array as input and fills the last column with fitness values
### take care to always keep one column for fitness values in starting array
### output is an array sorted by fitness
def fitness(array):
    # set initial values
    initial = np.array(
        (
            (-5, 1447.0),
            (-4, 703.4),
            (-3, 290.4),
            (-2, 92.8),
            (-1, 19.4),
            (0, 3),
            (1, 0.4),
            (2, -7.6),
            (3, -16.2),
            (4, 3.4),
            (5, 104),
        )
    )
    # score is the total square deviation from initial. lower is better
    for ind in array:
        fitness = 0
        for x, y in zip(initial(:, 0), initial(:, 1)):
            fitness += (
                (
                    ind(0)
                    + ind(1) * x
                    + ind(2) * x ** 2
                    + ind(3) * x ** 3
                    + ind(4) * x ** 4
                )
                - y
            ) ** 2
        ind(5) = fitness
    return array(array(:, 5).argsort())


#%% function to give non linear rank to sorted pool
def nl_rank(array):
    copy = deepcopy(array)
    # normalization factor for selection pressure 3 and individuals 50
    norm = 290.3359045831912
    # setting ranks based on position in the pool
    for i, j in enumerate(copy):
        j(-1) = 50 * 1.06 ** (49 - i) / norm
    return copy


#%% define function to carry out stochastic universal sampling
### takes non linear ranked array as input
### output is list of selected individuals
def sel_ind(nl_array):
    copy = deepcopy(nl_array)
    # normalize ranks
    norm = sum(copy(:, -1))
    copy(:, -1) = copy(:, -1) / norm

    # map intervals on range (0, 1)
    prob_list = list(copy(:, -1))
    intervals = ()
    start = 0
    for prob in prob_list:
        end = start + prob
        intervals.append((start, end))
        start = end

    # selecting 6 individuals from the intervals
    rng = np.random.default_rng()
    points = (rng.uniform(0, 1 / 5))
    for i in range(4):
        points.append(points(-1) + 1 / 5)
    index, i = (), 0
    for point in points:
        for j in range(i, len(intervals)):
            if intervals(j)(0) < point < intervals(j)(1):
                index.append(j)
                i = j
                break
    return index


#%% define function to carry out mating. only unique pairings are considered
### each mating gives 2 children
def crossover(array, individuals):
    rng = np.random.default_rng()
    progeny = np.empty((0, 6))
    for i in range(len(individuals) - 1):
        for j in range(i + 1, len(individuals)):
            mate1 = rng.uniform(-0.25, 1.25, (1, 5)).squeeze()
            mate2 = rng.uniform(-0.25, 1.25, (1, 5)).squeeze()
            mutation1 = rng.uniform(-0.025, 0.025, (1, 5)).squeeze()
            mutation2 = rng.uniform(-0.025, 0.025, (1, 5)).squeeze()
            baby1 = (
                array(i, :5) * mate1 + array(j, :5) * (1 - mate1) + mutation1 * mate1
            )
            baby1 = np.append(baby1, 0)
            progeny = np.append(progeny, (baby1), axis=0)
            baby2 = (
                array(j, :5) * mate2 + array(i, :5) * (1 - mate2) + mutation2 * mate2
            )
            baby2 = np.append(baby2, 0)
            progeny = np.append(progeny, (baby2), axis=0)
    return fitness(progeny)


#%% helper function to print arrays to log
def arr_print(arr, count):
    print(f"#loop_count = {count}")
    print(arr)


#%% main
if __name__ == "__main__":
    # create rng instance
    rng = np.random.default_rng()

    # create parent pool
    # parent pool has 5 columns for coefficients and one for fitness
    pool = rng.uniform(-50, 50, (50, 6))

    # measure fitness of each parent and sort in decreasing order of fitness
    pool = fitness(pool)
    starting_fitness = pool(0, 5)

    # plotting the original curve
    plt.ion()
    fig, ax = plt.subplots()
    x = np.linspace(-5, 5, 200)
    ax.plot(x, poly(x), "r", lw=1, label="Original")
    ax.set_xlabel("X axis")
    ax.set_ylabel("Y axis")
    ax.set_ylim(-1000, 1000)
    ax.set_title("Comparision of curves")
    ax.legend()

    loop_count = 1

    while True:
        # plotting
        if loop_count == 1:
            ax.plot(x, gen_poly(pool(0, :), x), lw=1, label=f"Iteration = {loop_count}")
            ax.legend()
            fig.canvas.draw()
            plt.pause(0.0001)
            # arr_print(pool, loop_count)
        elif loop_count % 100 == 0:
            ax.plot(x, gen_poly(pool(0, :), x), lw=0.25, ls="solid")
            fig.canvas.draw()
            plt.pause(0.0000001)
            # arr_print(pool, loop_count)
        # add termination condition
        elif pool(0, 5) < 0.005:
            ax.plot(
                x,
                gen_poly(pool(0, :), x),
                "k",
                lw=1.5,
                label=f"Iteration = {loop_count}",
            )
            ax.set_xlim(-2, 5)
            ax.set_ylim(-40, 75)
            ax.legend()
            fig.canvas.draw()
            plt.pause(0.001)
            # arr_print(pool, loop_count)
            break
        # rank parents based on non linear ranking
        ranked = nl_rank(pool)

        # select individuals
        individuals = sel_ind(ranked)

        # create progeny
        progeny = crossover(ranked, individuals)

        # remove 20 worst individuals from pool
        pool = np.delete(pool, np.s_(-20:), axis=0)

        # add progeny to the new pool
        pool = np.vstack((pool, progeny(:20, :)))

        # sort pool according to fitness
        pool = pool(pool(:, 5).argsort())

        loop_count += 1

    print(starting_fitness, pool(0, 5))

```

genetic algorithms – Travelling Salesman Problem: Distance between solutions

I’m designing a genetic algorithm to solve the travelling salesman problem. So far, I’ve gotten fairly good results. I’m now trying to improve on them by implementing some sort of diversification scheme (like fitness sharing and crowding), although I’m struggling with the conceptualisation of the inter-solution distance a bit.

Solutions represent a path that goes through all cities, i.e. a permutation of the order in which they are visited. This is represented in my code by np.arrays. If I want to know how similar two solutions are, I basically want to find the distance between two permutations of n_cities elements. I currently have two ideas for the distance function.

  1. Levenshtein distance, which is simply ‘how many atomic edits are two sequences removed from each other.
  2. Hamming distance, which denotes the number of positions that are the same.

Note that, for each solution, I make sure to cycle it so it starts in the same position (city). Otherwise these metrics won’t make sense.

Which of them is more appropriate? Is there a better solution? I’ve browsed a number of articles, but haven’t really found an answer yet.

optimization – Dynamic Chromosome Length in Genetic Algorithms

My inquiry concerns the length of chromosomes employed in genetic algorithms (GA), and more broadly in other classes of evolutionary algorithms.

The chromosome length is fixed throughout a GA’s run. However, some work has explored the benefits of altering a related GA parameter: the size of the GA population, which is allowed to vary in size (increase or decrease) from one generation to the next in order to better explore and exploit the search space.

My question is now: given that the population size can change, is there any benefit to altering chromosome size? I haven’t really come across any work on this.

I imagine this would have applications in things like cruise stopovers and finding alternate routes to a given destination.

Adding another city for instance in a run of the Travelling Salesman Problem (TSP) using GA would lead to an increase in the search space that must be explored. Take for instance a tour with 10 cities: the size of the search space is 10! = 3628800. Adding an addition city would mean that there are now an additional 36288000 (11! – 10!) tours to consider.

python – Genetic sequence analyzer that reads FASTA and GenBank file formats and outputs all possible gene products

import random
import re
from collections import defaultdict

nucleotides = ("A", "C", "G", "T")
rnanucleotides = ("A", "C", "G", "U")
rev_compliment = {'A': 'T', 'T': 'A', 'G': 'C', 'C': 'G'}
RNA_codon_table = {"UUU": "F", "CUU": "L", "AUU": "I", "GUU": "V",
    "UUC": "F", "CUC": "L", "AUC": "I", "GUC": "V",
    "UUA": "L", "CUA": "L", "AUA": "I", "GUA": "V",
    "UUG": "L", "CUG": "L", "AUG": "M", "UCU": "S",
    "GUG": "V", "CCU": "P", "ACU": "T", "GCU": "A",
    "UCC": "S", "CCC": "P", "ACC": "T", "GCC": "A",
    "UCA": "S", "CCA": "P", "ACA": "T", "GCA": "A",
    "UCG": "S", "CCG": "P", "ACG": "T", "GCG": "A",
    "UAU": "Y", "CAU": "H", "AAU": "N", "GAU": "D",
    "UAC": "Y", "CAC": "H", "AAC": "N", "GAC": "D",
    "UAA": "_", "CAA": "Q", "AAA": "K", "GAA": "E",
    "UAG": "_", "CAG": "Q", "AAG": "K", "GAG": "E",
    "UGU": "C", "CGU": "R", "AGU": "S", "GGU": "G",
    "UGC": "C", "CGC": "R", "AGC": "S", "GGC": "G",
    "UGA": "_", "CGA": "R", "AGA": "R", "GGA": "G",
    "UGG": "W", "CGG": "R", "AGG": "R", "GGG": "G"
    }
amino_acid_weights = {
"A": 71.03711,
"C": 103.00919,
"D": 115.02694,
"E": 129.04259,
"F": 147.06841,
"G": 57.02146,
"H": 137.05891,
"I": 113.08406,
"K": 128.09496,
"L": 113.08406,
"M": 131.04049,
"N": 114.04293,
"P": 97.05276,
"Q": 128.05858,
"R": 156.10111,
"S": 87.03203,
"T": 101.04768,
"V": 99.06841,
"W": 186.07931,
"Y": 163.06333,
"_": 0,
}

def title_screen():
    print("""-. .-.   .-. .-.   .-. .-.   .  
||||| /|||||| /|||||| /|
|/ ||||||/ ||||||/ |||||
~   `-~ `-`   `-~ `-`   `-~ `-""")
    print("DNA SEQUENCE ANALYZER by Ethan Hetrickn")

def user_selection():
    response = input("What would you like to do?n"
                     "1. Input your own DNA sequence.n"
                     "2. Input your own RNA sequence.n"
                     "3. Generate random sequence data.n"
                     "4. Import a file in FASTA format.n"
                     "5. Import a file in GenBank format.n")
    if response == '1':
        seq = input("Input your DNA sequence here: n").upper().strip()
        seq = validate_seq(seq)
        tag = "Your DNA sequence"
        return seq, tag
    elif response == '3':
        try:
            x = int(input("How many nucleotides long do you want your sequence? Enter an integer.n"))
            seq = ''.join((random.choice(nucleotides) for nuc in range(x)))
            tag = "Random DNA sequence"
            return seq, tag
        except ValueError:
            print("Invalid response.n")
            run('seq', 'tag')
    elif response == '2':
        rna = input("Input your RNA sequence here: n").upper().strip()
        rna = validate_seq(rna)
        seq = rev_transcription(rna)
        tag = "Your RNA sequence"
        return seq, tag
    elif response == '4':
        try:
            for i in fasta():
                tag = i(0)
                seq = i(1)
                if "U" in seq:
                    seq = rev_transcription(seq)
                return seq, tag
        except FileNotFoundError:
            print("nFile not found. Please input a valid file path.n")
    elif response == '5':
        try:
            seq, tag = genbank_to_fasta()
            return seq, tag
        except FileNotFoundError:
            print("nFile not found. Please input a valid file path.n")
    else:
        print("Invalid response.n")

def fasta():
    sequences = defaultdict(str)
    file = input(r'Input the path to your file: ')
    with open(f'{file}') as f:
        lines = f.readlines()
    current_tag = None
    list = ()
    for line in lines:
        m = re.match('^>(.+)', line)
        if m:
            current_tag = m.group(1)
        else:
            sequences(current_tag) += line.strip()
    for tag, seq in sequences.items():
        list.append((tag, seq))
    return list

def genbank_to_fasta():
    file = input(r'Input the path to your file: ')
    with open(f'{file}') as f:
        gb = f.readlines()
        locus = re.search('NC_d+.d+', gb(3)).group()
        region = re.search('(d+)?.+(d+)', gb(2))
        definition = re.search('w.+', gb(1)(10:)).group()
        definition = definition.replace(definition(-1), "")
        tag = locus + ":" + region.group(1) + "-" + region.group(2) + " " + definition
        sequence = ""
        for line in (gb):
            pattern = re.compile('(atgc){10}')
            matches = pattern.finditer(line)
            for match in matches:
                sequence += match.group().upper()
        end_pattern = re.search('(atgc){1,9}', gb(-3))
        sequence += end_pattern.group().upper()
        return sequence, tag

def validate_seq(dnaseq):
    if len(dnaseq) == 0:
        print("Empty sequence.n")
        user_selection()
    else:
        for nuc in dnaseq:
            if nuc not in nucleotides or rnanucleotides:
                print("Invalid sequence.n")
                user_selection()
            else:
                return dnaseq

def nuc_count(dnaseq, tag):
    print("n================================================")
    print(f'THE ANALYSIS: >{tag}n')
    if len(dnaseq) >= 1000:
        print(f'Total: {len(dnaseq)/1000} kbpn')
    else:
        print(f'Total: {len(dnaseq)} bpn')
    print("Nucleotide frequency:")
    for letter in nucleotides:
        letter_total = dnaseq.count(letter)
        letter_per = (letter_total / len(dnaseq))
        print(f'{letter}: {letter_total} : {letter_per:%}')
    GC = ((dnaseq.count("G") + dnaseq.count("C"))/(len(dnaseq)))
    print(f'GC content: {GC:%}')

def rev_transcription(rnaseq):
    dnaseq = rnaseq.replace("U", "T")
    return dnaseq

def DNA_to_cDNA(dnaseq):
    cdna = "".join((rev_compliment(nuc) for nuc in dnaseq))
    return cdna

def transcription(dnaseq):
    rna = dnaseq.replace("T", "U")
    print(f"RNA:  5' {rna} 3'")
    return rna

def translation(dnaseq, init_pos=0):
    dnaseq = dnaseq.replace("T", "U")
    return (RNA_codon_table(dnaseq(pos:pos + 3)) for pos in range(init_pos, len(dnaseq) - 2, 3))

def gen_reading_frames(dnaseq):
    frames = ()
    frames.append(translation(dnaseq, 0))
    frames.append(translation(dnaseq, 1))
    frames.append(translation(dnaseq, 2))
    frames.append(translation(DNA_to_cDNA(dnaseq)(::-1), 0))
    frames.append(translation(DNA_to_cDNA(dnaseq)(::-1), 1))
    frames.append(translation(DNA_to_cDNA(dnaseq)(::-1), 2))
    return frames

def prot_from_rf(aa_seq):
    prot1 = ()
    proteins = ()
    for aa in aa_seq:
        if aa == "_":
            proteins.extend(prot1)
            prot1 = ()
        else:
            if aa == "M":
                prot1.append("")
            for i in range(len(prot1)):
                prot1(i) += aa
    return proteins

def all_proteins_from_rfs(dnaseq, startReadPos=0, endReadPos=0):
    if endReadPos > startReadPos:
        rfs = gen_reading_frames(dnaseq(startReadPos: endReadPos))
    else:
        rfs = gen_reading_frames(dnaseq)
    all_proteins = ()
    for rf in rfs:
        prots = prot_from_rf(rf)
        for p in prots:
            all_proteins.append(p)
    return all_proteins

def protein_weight(protein):
    weights = (((amino_acid_weights(protein(pos: pos + 1)) for pos in range(0, len(protein)))))
    weight = round(sum(weights), 3)
    return weight

def printing(seq, frames):
    print("nThe 6 possible reading frames:")
    for i, frame in enumerate(frames):
        print(f'{i + 1}. {"".join(frame)}')
    print("nAll possible proteins:")
    list = ()
    for prot in all_proteins_from_rfs(seq):
        if prot not in list:
            list.append(prot)
    list.sort(key=len, reverse=True)
    for i, prot in enumerate(list):
        print(f'{i + 1}. {prot}: {protein_weight(prot)} Da')
    print("================================================n")

def run(seq, tag):
    if seq == 'seq' and tag == 'tag':
        seq, tag = user_selection()
    nuc_count(seq, tag)
    cseq = DNA_to_cDNA(seq)
    print(f"nDNA:  5' {seq} 3'")
    print(f"cDNA: 3' {cseq} 5'")
    transcription(seq)
    translation(seq)
    all_proteins_from_rfs(seq, startReadPos=0, endReadPos=0)
    frames = gen_reading_frames(seq)
    printing(seq, frames)

title_screen()
run('seq', 'tag')

Example output

================================================
THE ANALYSIS: >NC_000006.12:26156329-26157115 Homo sapiens chromosome 6, GRCh38.p13 Primary Assembly

Total: 787 bp

Nucleotide frequency:
A: 221 : 28.081321%
C: 246 : 31.257942%
G: 230 : 29.224905%
T: 90 : 11.435832%
GC content: 60.482846%

DNA:  5' CGAGTCCCGGCCAGTGCCTCTGCTTCCGGCTCGAATTGCTCTCGCTCACGCTTGCCTTCAACATGTCCGAGACTGCGCCTGCCGCGCCCGCTGCTCCGGCCCCTGCCGAGAAGACTCCCGTGAAGAAGAAGGCCCGCAAGTCTGCAGGTGCGGCCAAGCGCAAAGCGTCTGGGCCCCCGGTGTCCGAGCTCATTACTAAAGCTGTTGCCGCCTCCAAGGAGCGCAGCGGCGTATCTTTGGCCGCTCTCAAGAAAGCGCTGGCAGCCGCTGGCTATGACGTGGAGAAGAACAACAGCCGCATCAAGCTGGGTCTCAAGAGCCTGGTGAGCAAGGGCACCCTGGTGCAGACCAAGGGCACCGGCGCGTCGGGTTCCTTCAAACTCAACAAGAAGGCGGCCTCTGGGGAAGCCAAGCCTAAGGCTAAAAAGGCAGGCGCGGCCAAGGCCAAGAAGCCAGCAGGAGCGGCGAAGAAGCCCAAGAAGGCGACGGGGGCGGCCACCCCCAAGAAGAGCGCCAAGAAGACCCCAAAGAAGGCGAAGAAGCCGGCTGCAGCTGCTGGAGCCAAAAAAGCGAAAAGCCCGAAAAAGGCGAAAGCAGCCAAGCCAAAAAAGGCGCCCAAGAGCCCAGCGAAGGCCAAAGCAGTTAAACCCAAGGCGGCTAAACCAAAGACCGCCAAGCCCAAGGCAGCCAAGCCAAAGAAGGCGGCAGCCAAGAAAAAGTAGAAAGTTCCTTTGGCCAACTGCTTAGAAGCCCAACACAACCCAAAGGCTCTTTTCAGAGCCACCCA 3'
cDNA: 3' GCTCAGGGCCGGTCACGGAGACGAAGGCCGAGCTTAACGAGAGCGAGTGCGAACGGAAGTTGTACAGGCTCTGACGCGGACGGCGCGGGCGACGAGGCCGGGGACGGCTCTTCTGAGGGCACTTCTTCTTCCGGGCGTTCAGACGTCCACGCCGGTTCGCGTTTCGCAGACCCGGGGGCCACAGGCTCGAGTAATGATTTCGACAACGGCGGAGGTTCCTCGCGTCGCCGCATAGAAACCGGCGAGAGTTCTTTCGCGACCGTCGGCGACCGATACTGCACCTCTTCTTGTTGTCGGCGTAGTTCGACCCAGAGTTCTCGGACCACTCGTTCCCGTGGGACCACGTCTGGTTCCCGTGGCCGCGCAGCCCAAGGAAGTTTGAGTTGTTCTTCCGCCGGAGACCCCTTCGGTTCGGATTCCGATTTTTCCGTCCGCGCCGGTTCCGGTTCTTCGGTCGTCCTCGCCGCTTCTTCGGGTTCTTCCGCTGCCCCCGCCGGTGGGGGTTCTTCTCGCGGTTCTTCTGGGGTTTCTTCCGCTTCTTCGGCCGACGTCGACGACCTCGGTTTTTTCGCTTTTCGGGCTTTTTCCGCTTTCGTCGGTTCGGTTTTTTCCGCGGGTTCTCGGGTCGCTTCCGGTTTCGTCAATTTGGGTTCCGCCGATTTGGTTTCTGGCGGTTCGGGTTCCGTCGGTTCGGTTTCTTCCGCCGTCGGTTCTTTTTCATCTTTCAAGGAAACCGGTTGACGAATCTTCGGGTTGTGTTGGGTTTCCGAGAAAAGTCTCGGTGGGT 5'
RNA:  5' CGAGUCCCGGCCAGUGCCUCUGCUUCCGGCUCGAAUUGCUCUCGCUCACGCUUGCCUUCAACAUGUCCGAGACUGCGCCUGCCGCGCCCGCUGCUCCGGCCCCUGCCGAGAAGACUCCCGUGAAGAAGAAGGCCCGCAAGUCUGCAGGUGCGGCCAAGCGCAAAGCGUCUGGGCCCCCGGUGUCCGAGCUCAUUACUAAAGCUGUUGCCGCCUCCAAGGAGCGCAGCGGCGUAUCUUUGGCCGCUCUCAAGAAAGCGCUGGCAGCCGCUGGCUAUGACGUGGAGAAGAACAACAGCCGCAUCAAGCUGGGUCUCAAGAGCCUGGUGAGCAAGGGCACCCUGGUGCAGACCAAGGGCACCGGCGCGUCGGGUUCCUUCAAACUCAACAAGAAGGCGGCCUCUGGGGAAGCCAAGCCUAAGGCUAAAAAGGCAGGCGCGGCCAAGGCCAAGAAGCCAGCAGGAGCGGCGAAGAAGCCCAAGAAGGCGACGGGGGCGGCCACCCCCAAGAAGAGCGCCAAGAAGACCCCAAAGAAGGCGAAGAAGCCGGCUGCAGCUGCUGGAGCCAAAAAAGCGAAAAGCCCGAAAAAGGCGAAAGCAGCCAAGCCAAAAAAGGCGCCCAAGAGCCCAGCGAAGGCCAAAGCAGUUAAACCCAAGGCGGCUAAACCAAAGACCGCCAAGCCCAAGGCAGCCAAGCCAAAGAAGGCGGCAGCCAAGAAAAAGUAGAAAGUUCCUUUGGCCAACUGCUUAGAAGCCCAACACAACCCAAAGGCUCUUUUCAGAGCCACCCA 3'

The 6 possible reading frames:
1. RVPASASASGSNCSRSRLPSTCPRLRLPRPLLRPLPRRLP_RRRPASLQVRPSAKRLGPRCPSSLLKLLPPPRSAAAYLWPLSRKRWQPLAMTWRRTTAASSWVSRAW_ARAPWCRPRAPARRVPSNSTRRRPLGKPSLRLKRQARPRPRSQQERRRSPRRRRGRPPPRRAPRRPQRRRRSRLQLLEPKKRKARKRRKQPSQKRRPRAQRRPKQLNPRRLNQRPPSPRQPSQRRRQPRKSRKFLWPTA_KPNTTQRLFSEPP
2. ESRPVPLLPARIALAHACLQHVRDCACRARCSGPCREDSREEEGPQVCRCGQAQSVWAPGVRAHY_SCCRLQGAQRRIFGRSQESAGSRWL_RGEEQQPHQAGSQEPGEQGHPGADQGHRRVGFLQTQQEGGLWGSQA_G_KGRRGQGQEASRSGEEAQEGDGGGHPQEERQEDPKEGEEAGCSCWSQKSEKPEKGESSQAKKGAQEPSEGQSS_TQGG_TKDRQAQGSQAKEGGSQEKVESSFGQLLRSPTQPKGSFQSHP
3. SPGQCLCFRLELLSLTLAFNMSETAPAAPAAPAPAEKTPVKKKARKSAGAAKRKASGPPVSELITKAVAASKERSGVSLAALKKALAAAGYDVEKNNSRIKLGLKSLVSKGTLVQTKGTGASGSFKLNKKAASGEAKPKAKKAGAAKAKKPAGAAKKPKKATGAATPKKSAKKTPKKAKKPAAAAGAKKAKSPKKAKAAKPKKAPKSPAKAKAVKPKAAKPKTAKPKAAKPKKAAAKKK_KVPLANCLEAQHNPKALFRAT
4. WVALKRAFGLCWASKQLAKGTFYFFLAAAFFGLAALGLAVFGLAALGLTALAFAGLLGAFFGLAAFAFFGLFAFLAPAAAAGFFAFFGVFLALFLGVAAPVAFLGFFAAPAGFLALAAPAFLALGLASPEAAFLLSLKEPDAPVPLVCTRVPLLTRLLRPSLMRLLFFSTS_PAAASAFLRAAKDTPLRSLEAATALVMSSDTGGPDALRLAAPADLRAFFFTGVFSAGAGAAGAAGAVSDMLKASVSESNSSRKQRHWPGL
5. GWL_KEPLGCVGLLSSWPKELSTFSWLPPSLAWLPWAWRSLV_PPWV_LLWPSLGSWAPFLAWLLSPFSGFSLFWLQQLQPASSPSLGSSWRSSWGWPPPSPSWASSPLLLASWPWPRLPF_P_AWLPQRPPSC_V_RNPTRRCPWSAPGCPCSPGS_DPA_CGCCSSPRHSQRLPALS_ERPKIRRCAPWRRQQL___ARTPGAQTLCAWPHLQTCGPSSSRESSRQGPEQRARQAQSRTC_RQA_ARAIRAGSRGTGRDS
6. GGSEKSLWVVLGF_AVGQRNFLLFLGCRLLWLGCLGLGGLWFSRLGFNCFGLRWALGRLFWLGCFRLFRAFRFFGSSSCSRLLRLLWGLLGALLGGGRPRRLLGLLRRSCWLLGLGRACLFSLRLGFPRGRLLVEFEGTRRAGALGLHQGALAHQALETQLDAAVVLLHVIASGCQRFLESGQRYAAALLGGGNSFSNELGHRGPRRFALGRTCRLAGLLLHGSLLGRGRSSGRGRRSLGHVEGKREREQFEPEAEALAGT

All possible proteins:
1. MSETAPAAPAAPAPAEKTPVKKKARKSAGAAKRKASGPPVSELITKAVAASKERSGVSLAALKKALAAAGYDVEKNNSRIKLGLKSLVSKGTLVQTKGTGASGSFKLNKKAASGEAKPKAKKAGAAKAKKPAGAAKKPKKATGAATPKKSAKKTPKKAKKPAAAAGAKKAKSPKKAKAAKPKKAPKSPAKAKAVKPKAAKPKTAKPKAAKPKKAAAKKK: 21833.993 Da
2. MTWRRTTAASSWVSRAW: 2034.001 Da
3. MRLLFFSTS: 1082.558 Da
================================================

This is an update of my gene sequencing program from a previous post. If you need clarifications on what each function accomplishes feel free to ask or look at my previous post. Any tips to make the code more concise or efficient are much appreciated or functions that you think would be useful. I included the links to an example FASTA and GenBank file for your reference.
FASTA(https://www.ncbi.nlm.nih.gov/nuccore/NC_000006.12?report=fasta&from=26156329&to=26157115)
GenBank(https://www.ncbi.nlm.nih.gov/nuccore/NC_000006.12?report=genbank&from=26156329&to=26157115)

optimization – Search Space for Genetic Algorithms

I have not been able to find anywhere a general formula for the size of the search space for Genetic Algorithms (GA).

I would imagine that such a formula would involve the binomial coefficient — maybe Stars and Bars

https://en.wikipedia.org/wiki/Stars_and_bars_(combinatorics)

The reason I ask is because I have developed my own GA and would like to know the search space size as a means to motivate the need for, and usefulness of, metaheuristics like GAs for a manuscript that I am currently writing.

As an example, consider a simple binary (0/1) GA with string length of $L$ = 10 and population size (number of chromosomes) of $N$ = 100. Possible solutions could be:

0100001110

1011010010

etc.

Since 0/1 can be repeated in any given string (chromosome), there would be exactly $2^N$ possible configurations. This would generalize to $k^N$ for any $k$-ary problem.

I feel this isn’t the whole story.

If I had to guess a closed-form expression using Stars and Bars for the binary GA case, it might be something like $$binom{N + 2^N – 1}{N} = binom{N + 2^N – 1}{2^N – 1} = binom{100 + 2^{100} – 1}{100} = binom{100 + 2^{100} – 1}{2^{100} – 1} sim 10^{2852} $$

Is this the right line of thinking?

Any thoughts are greatly welcomed.

DreamProxies - Cheapest USA Elite Private Proxies 100 Cheapest USA Private Proxies Buy 200 Cheap USA Private Proxies 400 Best Private Proxies Cheap 1000 USA Private Proxies 2000 USA Private Proxies 5000 Cheap USA Private Proxies ExtraProxies.com - Buy Cheap Private Proxies Buy 50 Private Proxies Buy 100 Private Proxies Buy 200 Private Proxies Buy 500 Private Proxies Buy 1000 Private Proxies Buy 2000 Private Proxies ProxiesLive.com Proxies-free.com New Proxy Lists Every Day Proxies123.com Proxyti.com Buy Quality Private Proxies