## How to use genetic algorithm to calculate the maximum value of this function

How to use genetic algorithm to calculate the maximum value of this function Sin[x] / x in Mathematica

Thanx!

Posted on Categories Articles

## java – Giffler Thompson Algorithm for decoding individuals of a Genetic Algorithm

First of all: Hi, my name is Henrik and I am a mechanical engineer, who loves to code in his free time.
To get better in programming I wrote a genetic algorithm compined with a list planning algorithm (giffler thompsoon algorithm) to solve the fjssp (i have encountered this problem in my master thesis 2 years ago).

Although i would like to have help / reviews about the whole program, i think the biggest potential for improvement is in the class “Individuum”. Especially the methods “calculateStartingTimeMatrix” and “decodierung” (decoding) seem to me somehow cumbersome, inefficient and confusing (the latter is certainly also due to the length of the methods).

To the programm:
After reading a production process out of an excel file and creating a population with random individuals i create permissible productionsplans for each individual (“Individuum”). To be able to do that i use the giffler thompson algorithm to decode the individuals. This happens in the “decodierung” method. For the giffler thompson algorithm i need to calculate the starting times at the beginning and every time i add a operation to the production schedule. This happens in the calculateStartingTimeMatrix methode.

``````package planningalgorithm;

import java.util.Random;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class Individuum {
int number;
int birthGeneration;
int nOp;
int nMa;

int() indiAllocation;                   // Allocation of this Individual
int() indiSequence;                     // Sequenz of this Individual
int()() predecessorWorkingTimes;        // Do i need this?
int()() startingTimeMatrix;             // Do i need this?
int() timesProduction;
float timeFitness;
int susRank;
int tournamentWins;

List<Machine> indiMachines;                 // Machines of this Individual
List<Operationen> indiProcess;              // Process of this Individual

int() arrayA;
int() arrayB;
int operationDash;
int operationDashDash;
int operationStar;

//Konstruktor
Individuum(int num, int startgen, int nOp, int nMa){
number=num;
this.nOp = nOp;
this.nMa = nMa;
birthGeneration=startgen;

indiAllocation = new int(nOp);
indiSequence = new int(nOp);
predecessorWorkingTimes = new int(nOp)(nOp);
startingTimeMatrix = new int(nOp)(nOp+1);
timesProduction = new int(nOp);

// Liste der Maschinen erstellen
indiMachines = new ArrayList<>(nMa);
for (int i=0;i<nMa;i++) {
Machine newMachine = new Machine(i,0);
}

// Liste der Prozesse erstellen
indiProcess = new ArrayList<>(nOp);
for (int i=0;i<nOp;i++) {
Operationen newOperation = new Operationen(nOp,nMa);
}

arrayA = new int(nOp);
arrayB = new int(nOp);
}

// Zufallszahl
private double randomNumber(){
double ranNum;
Random zufallszahl = new Random();
ranNum = zufallszahl.nextDouble();
return ranNum;
}

private double round(double value, int decimalPoints) {
double d = Math.pow(10, decimalPoints);
return Math.round(value * d) / d;
}

private int max(int() fooArr) {
int maximum = -100;
for (int i=0;i<fooArr.length;i++) {
if (fooArr(i) > maximum) {
maximum = fooArr(i);
}
}
return maximum;
}

public void copyArr (int() arr, int() dest){
for (int i=0;i<arr.length;i++){
dest(i) = arr(i);
}
}

public int() copyArraySection (int() originArr, int startPointerArr, int() destArr, int startPointerDest, int lenghtCopy){
for (int i=0;i<lenghtCopy;i++){
destArr(startPointerDest+i) = originArr(startPointerArr+i);
}
return destArr;
}

public int() addOneToArray (int() arr, int what){
// Case 1: Array is empty
if (arr == null){
int() newArr = new int(1);
newArr(0) = what;
return newArr;
}
// Case 2: Array isnt empty
else {
int() newArr = new int(arr.length + 1);
for (int i=0;i<arr.length;i++){
newArr(i) = arr(i);
}
newArr(arr.length) = what;
return newArr;
}
}

public int() oneValue(int() barArr, int value){
int() oneValueArr = new int(barArr.length);
for (int i=0;i<barArr.length;i++){
oneValueArr(i) = value;
}
return oneValueArr;
}

public int() changeValue (int() arrArr, int oldValue, int newValue){
int() newArrArr = new int(arrArr.length);
for (int i=0;i<arrArr.length;i++){
if (arrArr(i) == oldValue){
arrArr(i) = newValue;
}
}
return newArrArr;
}

public int count (int() countArr, int value){
int countDoku = 0;
for (int i=0;i<countArr.length;i++){
if (countArr(i)==value){
countDoku++;
}
}
return countDoku;
}

public int()() copyMatrix (int()() arr) {
int()() copy = new int(arr.length)(arr(0).length);
for (int i = 0; i < arr.length; i++) {
for (int j = 0; j < arr(0).length; j++) {
copy(i)(j) = arr(i)(j);
}
}
return copy;
}

void calculateStartingTimeMatrix(){
//Reset startingTimeMatrix except the last column nOp which includes the occupation times
for (int r=0;r<nOp;r++){
for (int c=0;c<nOp;c++){
startingTimeMatrix(r)(c) = 0;
}
}

// Reset the status in every operation
for (int i=0;i<nOp;i++){
indiProcess.get(i).operationDone = false;
}

// Reset the remainingPredessors
for (int i=0;i<nOp;i++){
copyArr(indiProcess.get(i).predecessor,indiProcess.get(i).remainingPredecessors);
}

// Update StartingTimeMatrix with Operation, that are already planned
for (int i=0;i<nOp;i++){
if (arrayA(i) == -1){
indiProcess.get(i).operationDone = true;
for (int z2=0;z2<nOp;z2++){
if (indiProcess.get(z2).predecessor(i) == 1){
startingTimeMatrix(z2)(i) = indiProcess.get(i).timeEnd;
indiProcess.get(z2).remainingPredecessors(i) = 0;
}
}
}
}

for (int i=0;i<nOp;i++){
if (arrayA(i) == 1){
indiProcess.get(i).operationDone = false;
}
}

int operationsDone = 0;
while (operationsDone < nOp){

for (int z=0;z<nOp;z++){
int foundOp = 0;

// Search for starting Operation
foundOp = z;
for (int z2=0;z2<nOp;z2++){
if (indiProcess.get(z2).predecessor(foundOp) == 1){
startingTimeMatrix(z2)(foundOp) = max(startingTimeMatrix(foundOp)) + predecessorWorkingTimes(z2)(foundOp); //t_Op = t_PredecessorStart(from startingTimeMatrix) + t_PredecessorProcess(from predecessorTimes)
}
}

// Mark the found operation as done
for (int z3=0;z3<nOp;z3++){
if (indiProcess.get(z3).remainingPredecessors(foundOp) == 1){
indiProcess.get(z3).remainingPredecessors(foundOp) = 0; // FoundOp is done and is now no longer a predecessor on which other operations have to wait for
}
}

indiProcess.get(foundOp).operationDone = true;
}
}

//Calculate new  Starting Operations
for (int z=0;z<nOp;z++){

// Operation has no remaining Predecessor and wasnt done yet: Ready to Start
if (max(indiProcess.get(z).remainingPredecessors)==0 && indiProcess.get(z).operationDone == false){
}

// Operation has no remaining Predecessor, but is already done
if (max(indiProcess.get(z).remainingPredecessors)==0 && indiProcess.get(z).operationDone == true){
}

// Operation still has remaining Predecessors
if (max(indiProcess.get(z).remainingPredecessors)!=0){
indiProcess.get(z).operationDone = false;
}
}

//Abbruchbedingungen ermitteln
operationsDone = 0;
for (int z=0;z<nOp;z++){
if(indiProcess.get(z).operationDone == true){
operationsDone++;
}
}
}

//Fertigungszeiten berechnen
for (int z=0;z<nOp;z++){
timesProduction(z) = max(startingTimeMatrix(z)) + indiProcess.get(z).timeWorking;
}
}

void correctingAllocation (List<Operationen> operationsList){
// Liste der Maschinen erstellen
for (int i=0;i<nOp;i++){
indiProcess.get(i).availableMachines = new int(nMa);
copyArr(operationsList.get(i).availableMachines,indiProcess.get(i).availableMachines);
}

for (int i=0;i<nOp;i++){
int requestedMachine = indiAllocation(i);
if (indiProcess.get(i).availableMachines(requestedMachine) == 0){
// Requested Machine cant execute the operation
// Find available Machines
List<Integer> numbersAvailableMachines = new ArrayList<>();
for (int j=0;j<nMa;j++){
if (indiProcess.get(i).availableMachines(j) == 1){
}
}
// Pick random Machine of the available machines
double randomMachine =  round((numbersAvailableMachines.size()-1) * randomNumber(),0);
int newMachine = (int) randomMachine;

//Put new Machine in Allocation
indiAllocation(i) = numbersAvailableMachines.get(newMachine);
}
}
}

// Decodierung
void decodierung(int()() Vorrangmatrix, int()() Maschinenzeiten){

// VORBEREITUNG
//Prozesszeitenmatrix bestimmen aus Zuordnung und Maschinenzeiten und Maschinenzeit in Matrix der Vorgänger-Prozess-Zeiten eintragen
for (int z=0;z<nOp;z++){
indiProcess.get(z).workingMachine = indiAllocation(z);
indiProcess.get(z).timeWorking = Maschinenzeiten(z)(indiProcess.get(z).workingMachine);
}

// z=zeile/row
// s=spalte/column
for (int z=0;z<nOp;z++){
for (int s=0;s<nOp;s++){
if  (Vorrangmatrix(z)(s)==1){
predecessorWorkingTimes(z)(s) = indiProcess.get(s).timeWorking;
}
else{
predecessorWorkingTimes(z)(s) = 0;
}
}
}

// Get the predecessors of every Operation and save them under Process.get(x).Predecessors
for (int i=0;i<nOp;i++){
copyArr(Vorrangmatrix(i),indiProcess.get(i).predecessor);
}

int()() arrayV = copyMatrix(Vorrangmatrix);
for (int z=0;z<nOp;z++){
if (max(Vorrangmatrix(z)) < 0.1)
arrayA(z) = 1;
else{
arrayA(z) = 0;
}
}

calculateStartingTimeMatrix();

// Beginn Giffler-Thompson Algorithmus
int gtaBreakingCondition = 0;
while (gtaBreakingCondition < nOp){
operationDash = 0;
operationDashDash = 0;
operationStar = 0;

// GT - Step 2.H1: Get O' (Operation with earliest Productiontime, finsihed first)
int earliestProductiontime = 1000;
for (int z=0;z<nOp;z++){
if (arrayA(z) == 1 && timesProduction(z)< earliestProductiontime){
operationDash = z;
earliestProductiontime = timesProduction(z);
}
}

// GT - Step 2.H2: Get B, all Operations of A on the same Machine as O'
int currentMachine = indiAllocation(operationDash);

for (int z=0;z<nOp;z++){
if (arrayA(z)==1 && indiAllocation(z)==currentMachine){
arrayB(z)=1;
}
else{
arrayB(z) = 0;
}
}

// GT - Step 2.H3: Get O'' (Operation of B with ealiest starting Time)
int earliestStartingtime = 1000;
for (int z=0;z<nOp;z++){
if (arrayB(z)==1 && max(startingTimeMatrix(z))<earliestStartingtime){
operationDashDash = z;
earliestStartingtime = max(startingTimeMatrix(z));
}
}

// GT - Step 2.H4: Delete operations outside of the sigma window

//Check if earliest starttime is smaller then occupation
int starttime;
if (max(startingTimeMatrix(operationDashDash)) < indiMachines.get(currentMachine).timeOccupation){
starttime = indiMachines.get(currentMachine).timeOccupation;
}
else{
starttime = max(startingTimeMatrix(operationDashDash));
}

double sigma = 0.5;

// Delete Operations outside
for (int z=0;z<nOp;z++){
if (arrayB(z)==1 && max(startingTimeMatrix(z)) > (starttime + sigma * (timesProduction(operationDash) - starttime))){
arrayB(z) = 0;
}
}

//GT - Step 2.H5: Get O*
int permutation = 1000;
for (int z=0;z<nOp;z++){
if (arrayB(z)==1 && indiSequence(z)<permutation){
permutation = indiSequence(z);
operationStar = z;
}
}

// Mark OperationStern as done
arrayA(operationStar) = -1; // OperationStar is done, mark with -1 in A
for (int z=0;z<nOp;z++){
if (arrayA(z) != -1){
arrayV(z)(operationStar) = 0; // OperationStar is no longer predecessor of other operations
}
}
arrayV(operationStar) = oneValue(arrayV(operationStar), -1); // OperationStar is done, mark with -1 in V

// Set the starting time of operationStar
if (max(startingTimeMatrix(operationStar)) < indiMachines.get(currentMachine).timeOccupation){
indiProcess.get(operationStar).timeStart = indiMachines.get(currentMachine).timeOccupation;
}
if (max(startingTimeMatrix(operationStar)) >= indiMachines.get(currentMachine).timeOccupation){
indiProcess.get(operationStar).timeStart = max(startingTimeMatrix(operationStar));
}

// Set the ending time of operationStar
indiProcess.get(operationStar).timeEnd = indiProcess.get(operationStar).timeStart + indiProcess.get(operationStar).timeWorking;

// Set new occupation time of the current machine
indiMachines.get(currentMachine).timeOccupation = indiProcess.get(operationStar).timeEnd;

// GT - Schritt 4: Add successors of O* to A
for (int z=0;z<nOp;z++){

if (max(arrayV(z)) == 0){
arrayA(z) = 1;
}
if (max(arrayV(z)) == -1){
arrayA(z) = -1;
}
if (max(arrayV(z)) == 1){
arrayA(z) = 0;
}
}

//GT - Step 5: Update occupation time of the current machine to every operation which is also done on that machine
for (int z=0;z<nOp;z++){
if (arrayA(z) != -1 && indiAllocation(z) == currentMachine){
startingTimeMatrix(z)(nOp) = indiProcess.get(operationStar).timeEnd;
}
}

//Update startingTimeMatrix, because the occupation time in startingTimeMatrix(z)(nOp) was updated
calculateStartingTimeMatrix();

//Give the working Machine some Information about the Operation: Needed for the BarChart

// Calculate Breaking Condition
gtaBreakingCondition = count(arrayA, -1);
}
}

// 1-Bit-Mutation
void einbitmutation(int nOp, double mutProbability){

for (int i = 0; i<nOp; i++) {
double random = randomNumber();
if (random<mutProbability){
if (indiAllocation(i)==1){
indiAllocation(i)=0;
}
else {
indiAllocation(i)=1;
}
}
}
}

//Mixed Mutation
void mixedmutation(int nOp, float mixMutProbability, int typeCoding){

double random = randomNumber();
if(random > mixMutProbability){
int sectionStart = (int) round(randomNumber()*nOp,0);
int sectionEnd = (int) round(randomNumber()*nOp,0);
while (sectionEnd == sectionStart){
sectionEnd = (int) round(randomNumber()*nOp,0);
}
if (sectionEnd < sectionStart){
int temp = sectionStart;
sectionStart = sectionEnd;
sectionEnd = temp;
}

int() tempArr = new int(sectionEnd-sectionStart);
if (typeCoding == 0){
tempArr = copyArraySection(indiAllocation, sectionStart, tempArr, 0, sectionEnd-sectionStart);
List<Integer> tempList = new ArrayList<>();
for (int k=0;k<tempArr.length;k++){
}
Collections.shuffle(tempList);
int() mixedArr = tempList.stream().mapToInt(i->i).toArray();
indiAllocation = copyArraySection(mixedArr, 0, indiAllocation, sectionStart, sectionEnd-sectionStart);
}
else if (typeCoding == 1){
tempArr = copyArraySection(indiSequence, sectionStart, tempArr, 0, sectionEnd-sectionStart);
List<Integer> tempList = new ArrayList<>();
for (int k=0;k<tempArr.length;k++){
}
Collections.shuffle(tempList);
int() mixedArr = tempList.stream().mapToInt(i->i).toArray();
indiSequence = copyArraySection(mixedArr, 0, indiSequence, sectionStart, sectionEnd-sectionStart);
}
else{
System.out.println("Wrong Input for Coding Type of Mixed Mutation");
}
}
}

// Swap-Mutation
void swapmutation(int nOp, float mutProbability){

for (int i = 0; i<nOp; i++) {
double random = randomNumber();
if (random<mutProbability) {
double random2 = round(randomNumber()*(nOp-1),0);
int randomposition = (int)random2;
int saveNumber = indiSequence(randomposition);
indiSequence(randomposition) = indiSequence(i);
indiSequence(i) = saveNumber;
}
}
}

}

``````

I would like to describe the programm even a bit more, but i already wrote a lot. You can fine the whole program with more explanations, pictures and examples here:
https://github.com/enruk/Productionplaner

I know its not allowed to link to third party website but i really hesitate to copy every class in here. I don’t think this helps clarity and understanding.

Why I post it here?
My problem is that in self-taught programming, I only work up to the point where the program works. I miss a kind of mentor who tells me: Yes, this works, but it can be done better (which is i guess the point of this website).

As already mentioned, I am also happy to receive feedback on the remaining classes of the program or the entire program (but i know that this is not the purpose of this website). I know the program is quiet big to give a detailed feedback. What I look for are more general tips / advices. I appreciate the smallest advices.
What can i do better / on which topics should I work / research more?

• Structure of the program
• Basic algorithms
• Structure and use of methods
• Use of data structures

I know there are some german comments / varaibles names. Apologies. I will fix that asap.

Also I am happy about good recommendations for online courses.
I am currently doing this one:
https://www.udemy.com/course/java-training-ullenboom/

Posted on Categories Articles

## For classifier systems, how is the genetic algorithm run on delayed class?

I am reading Signals and Boundaries and came across Classifier Systems. In the book, John Holland seems to imply that the signals in the input, as they come from a live environment, don’t have an immediate output known to the system (there is delay between the classification and the action). In every explanation I have seen online (e.g., https://en.wikipedia.org/wiki/Learning_classifier_system#Rule/classifier/population), it seems that it is expected that the input data comes in tag/class pairs, which makes it not true online learning. My question is: given that we do not know the class of a sample data tag in runtime, how is the fitness computed for the genetic algorithm step of the CS during online learning?

Posted on Categories Articles

## javascript – Genetic Algorithm Typescript

javascript – Genetic Algorithm Typescript – Code Review Stack Exchange
Posted on Categories Articles

## python – Genetic algorithm to guess coefficient of a polynomial

I have tried to code a genetic algorithm to guess the coefficients of a degree 4 polynomial. The information initially provided is values of y = f(x) for different x using the original polynomial. I then generate 100 polynomials with randomly selected coefficients. These polynomials are then ranked based on the least square difference (LSD) and sorted such that lower LSD has higher rank. These ranks are then converted into nonlinear ranking and mapped to a line of unit length. Individuals are then selected and real valued crossover is used to generate progeny, who are also sorted according to LSD. The worst individuals in the starting pool are then replaced by progeny.
Please suggest improvements where possible. I am intermediate in Python.

``````"""
Genetic Algorithm implementation of finding coefficients of a polynomial
3.0 - 4.3 * x + 5.9 * x ** 2 - 5.2 * x ** 3 + 1.0 * x ** 4 = 0
"""

#%% import modules
import numpy as np
import matplotlib.pyplot as plt
from copy import deepcopy

# np.set_printoptions(linewidth=np.inf)

#%% define a function to return the polynomial
def poly(x):
return 3.0 - 4.3 * x + 5.9 * x ** 2 - 5.2 * x ** 3 + 1.0 * x ** 4

def gen_poly(array, x):
return (
array(0)
+ array(1) * x
+ array(2) * x ** 2
+ array(3) * x ** 3
+ array(4) * x ** 4
)

#%% define function to measure fitness
### takes an array as input and fills the last column with fitness values
### take care to always keep one column for fitness values in starting array
### output is an array sorted by fitness
def fitness(array):
# set initial values
initial = np.array(
(
(-5, 1447.0),
(-4, 703.4),
(-3, 290.4),
(-2, 92.8),
(-1, 19.4),
(0, 3),
(1, 0.4),
(2, -7.6),
(3, -16.2),
(4, 3.4),
(5, 104),
)
)
# score is the total square deviation from initial. lower is better
for ind in array:
fitness = 0
for x, y in zip(initial(:, 0), initial(:, 1)):
fitness += (
(
ind(0)
+ ind(1) * x
+ ind(2) * x ** 2
+ ind(3) * x ** 3
+ ind(4) * x ** 4
)
- y
) ** 2
ind(5) = fitness
return array(array(:, 5).argsort())

#%% function to give non linear rank to sorted pool
def nl_rank(array):
copy = deepcopy(array)
# normalization factor for selection pressure 3 and individuals 50
norm = 290.3359045831912
# setting ranks based on position in the pool
for i, j in enumerate(copy):
j(-1) = 50 * 1.06 ** (49 - i) / norm
return copy

#%% define function to carry out stochastic universal sampling
### takes non linear ranked array as input
### output is list of selected individuals
def sel_ind(nl_array):
copy = deepcopy(nl_array)
# normalize ranks
norm = sum(copy(:, -1))
copy(:, -1) = copy(:, -1) / norm

# map intervals on range (0, 1)
prob_list = list(copy(:, -1))
intervals = ()
start = 0
for prob in prob_list:
end = start + prob
intervals.append((start, end))
start = end

# selecting 6 individuals from the intervals
rng = np.random.default_rng()
points = (rng.uniform(0, 1 / 5))
for i in range(4):
points.append(points(-1) + 1 / 5)
index, i = (), 0
for point in points:
for j in range(i, len(intervals)):
if intervals(j)(0) < point < intervals(j)(1):
index.append(j)
i = j
break
return index

#%% define function to carry out mating. only unique pairings are considered
### each mating gives 2 children
def crossover(array, individuals):
rng = np.random.default_rng()
progeny = np.empty((0, 6))
for i in range(len(individuals) - 1):
for j in range(i + 1, len(individuals)):
mate1 = rng.uniform(-0.25, 1.25, (1, 5)).squeeze()
mate2 = rng.uniform(-0.25, 1.25, (1, 5)).squeeze()
mutation1 = rng.uniform(-0.025, 0.025, (1, 5)).squeeze()
mutation2 = rng.uniform(-0.025, 0.025, (1, 5)).squeeze()
baby1 = (
array(i, :5) * mate1 + array(j, :5) * (1 - mate1) + mutation1 * mate1
)
baby1 = np.append(baby1, 0)
progeny = np.append(progeny, (baby1), axis=0)
baby2 = (
array(j, :5) * mate2 + array(i, :5) * (1 - mate2) + mutation2 * mate2
)
baby2 = np.append(baby2, 0)
progeny = np.append(progeny, (baby2), axis=0)
return fitness(progeny)

#%% helper function to print arrays to log
def arr_print(arr, count):
print(f"#loop_count = {count}")
print(arr)

#%% main
if __name__ == "__main__":
# create rng instance
rng = np.random.default_rng()

# create parent pool
# parent pool has 5 columns for coefficients and one for fitness
pool = rng.uniform(-50, 50, (50, 6))

# measure fitness of each parent and sort in decreasing order of fitness
pool = fitness(pool)
starting_fitness = pool(0, 5)

# plotting the original curve
plt.ion()
fig, ax = plt.subplots()
x = np.linspace(-5, 5, 200)
ax.plot(x, poly(x), "r", lw=1, label="Original")
ax.set_xlabel("X axis")
ax.set_ylabel("Y axis")
ax.set_ylim(-1000, 1000)
ax.set_title("Comparision of curves")
ax.legend()

loop_count = 1

while True:
# plotting
if loop_count == 1:
ax.plot(x, gen_poly(pool(0, :), x), lw=1, label=f"Iteration = {loop_count}")
ax.legend()
fig.canvas.draw()
plt.pause(0.0001)
# arr_print(pool, loop_count)
elif loop_count % 100 == 0:
ax.plot(x, gen_poly(pool(0, :), x), lw=0.25, ls="solid")
fig.canvas.draw()
plt.pause(0.0000001)
# arr_print(pool, loop_count)
elif pool(0, 5) < 0.005:
ax.plot(
x,
gen_poly(pool(0, :), x),
"k",
lw=1.5,
label=f"Iteration = {loop_count}",
)
ax.set_xlim(-2, 5)
ax.set_ylim(-40, 75)
ax.legend()
fig.canvas.draw()
plt.pause(0.001)
# arr_print(pool, loop_count)
break
# rank parents based on non linear ranking
ranked = nl_rank(pool)

# select individuals
individuals = sel_ind(ranked)

# create progeny
progeny = crossover(ranked, individuals)

# remove 20 worst individuals from pool
pool = np.delete(pool, np.s_(-20:), axis=0)

# add progeny to the new pool
pool = np.vstack((pool, progeny(:20, :)))

# sort pool according to fitness
pool = pool(pool(:, 5).argsort())

loop_count += 1

print(starting_fitness, pool(0, 5))

$$```$$
``````

Posted on Categories Articles

## complexity – Genetic Algorithm – Software Engineering Stack Exchange

#### Stack Exchange Network

Stack Exchange network consists of 176 Q&A communities including Stack Overflow, the largest, most trusted online community for developers to learn, share their knowledge, and build their careers.

Visit Stack Exchange

Posted on Categories Articles

## genetic algorithms – Travelling Salesman Problem: Distance between solutions

I’m designing a genetic algorithm to solve the travelling salesman problem. So far, I’ve gotten fairly good results. I’m now trying to improve on them by implementing some sort of diversification scheme (like fitness sharing and crowding), although I’m struggling with the conceptualisation of the inter-solution distance a bit.

Solutions represent a path that goes through all cities, i.e. a permutation of the order in which they are visited. This is represented in my code by np.arrays. If I want to know how similar two solutions are, I basically want to find the distance between two permutations of n_cities elements. I currently have two ideas for the distance function.

1. Levenshtein distance, which is simply ‘how many atomic edits are two sequences removed from each other.
2. Hamming distance, which denotes the number of positions that are the same.

Note that, for each solution, I make sure to cycle it so it starts in the same position (city). Otherwise these metrics won’t make sense.

Which of them is more appropriate? Is there a better solution? I’ve browsed a number of articles, but haven’t really found an answer yet.

Posted on Categories Articles

## optimization – Dynamic Chromosome Length in Genetic Algorithms

My inquiry concerns the length of chromosomes employed in genetic algorithms (GA), and more broadly in other classes of evolutionary algorithms.

The chromosome length is fixed throughout a GA’s run. However, some work has explored the benefits of altering a related GA parameter: the size of the GA population, which is allowed to vary in size (increase or decrease) from one generation to the next in order to better explore and exploit the search space.

My question is now: given that the population size can change, is there any benefit to altering chromosome size? I haven’t really come across any work on this.

I imagine this would have applications in things like cruise stopovers and finding alternate routes to a given destination.

Adding another city for instance in a run of the Travelling Salesman Problem (TSP) using GA would lead to an increase in the search space that must be explored. Take for instance a tour with 10 cities: the size of the search space is 10! = 3628800. Adding an addition city would mean that there are now an additional 36288000 (11! – 10!) tours to consider.

Posted on Categories Articles

## python – Genetic sequence analyzer that reads FASTA and GenBank file formats and outputs all possible gene products

``````import random
import re
from collections import defaultdict

nucleotides = ("A", "C", "G", "T")
rnanucleotides = ("A", "C", "G", "U")
rev_compliment = {'A': 'T', 'T': 'A', 'G': 'C', 'C': 'G'}
RNA_codon_table = {"UUU": "F", "CUU": "L", "AUU": "I", "GUU": "V",
"UUC": "F", "CUC": "L", "AUC": "I", "GUC": "V",
"UUA": "L", "CUA": "L", "AUA": "I", "GUA": "V",
"UUG": "L", "CUG": "L", "AUG": "M", "UCU": "S",
"GUG": "V", "CCU": "P", "ACU": "T", "GCU": "A",
"UCC": "S", "CCC": "P", "ACC": "T", "GCC": "A",
"UCA": "S", "CCA": "P", "ACA": "T", "GCA": "A",
"UCG": "S", "CCG": "P", "ACG": "T", "GCG": "A",
"UAU": "Y", "CAU": "H", "AAU": "N", "GAU": "D",
"UAC": "Y", "CAC": "H", "AAC": "N", "GAC": "D",
"UAA": "_", "CAA": "Q", "AAA": "K", "GAA": "E",
"UAG": "_", "CAG": "Q", "AAG": "K", "GAG": "E",
"UGU": "C", "CGU": "R", "AGU": "S", "GGU": "G",
"UGC": "C", "CGC": "R", "AGC": "S", "GGC": "G",
"UGA": "_", "CGA": "R", "AGA": "R", "GGA": "G",
"UGG": "W", "CGG": "R", "AGG": "R", "GGG": "G"
}
amino_acid_weights = {
"A": 71.03711,
"C": 103.00919,
"D": 115.02694,
"E": 129.04259,
"F": 147.06841,
"G": 57.02146,
"H": 137.05891,
"I": 113.08406,
"K": 128.09496,
"L": 113.08406,
"M": 131.04049,
"N": 114.04293,
"P": 97.05276,
"Q": 128.05858,
"R": 156.10111,
"S": 87.03203,
"T": 101.04768,
"V": 99.06841,
"W": 186.07931,
"Y": 163.06333,
"_": 0,
}

def title_screen():
print("""-. .-.   .-. .-.   .-. .-.   .
||||| /|||||| /|||||| /|
|/ ||||||/ ||||||/ |||||
~   `-~ `-`   `-~ `-`   `-~ `-""")
print("DNA SEQUENCE ANALYZER by Ethan Hetrickn")

def user_selection():
response = input("What would you like to do?n"
"1. Input your own DNA sequence.n"
"2. Input your own RNA sequence.n"
"3. Generate random sequence data.n"
"4. Import a file in FASTA format.n"
"5. Import a file in GenBank format.n")
if response == '1':
seq = input("Input your DNA sequence here: n").upper().strip()
seq = validate_seq(seq)
return seq, tag
elif response == '3':
try:
x = int(input("How many nucleotides long do you want your sequence? Enter an integer.n"))
seq = ''.join((random.choice(nucleotides) for nuc in range(x)))
tag = "Random DNA sequence"
return seq, tag
except ValueError:
print("Invalid response.n")
run('seq', 'tag')
elif response == '2':
rna = input("Input your RNA sequence here: n").upper().strip()
rna = validate_seq(rna)
seq = rev_transcription(rna)
return seq, tag
elif response == '4':
try:
for i in fasta():
tag = i(0)
seq = i(1)
if "U" in seq:
seq = rev_transcription(seq)
return seq, tag
except FileNotFoundError:
elif response == '5':
try:
seq, tag = genbank_to_fasta()
return seq, tag
except FileNotFoundError:
else:
print("Invalid response.n")

def fasta():
sequences = defaultdict(str)
file = input(r'Input the path to your file: ')
with open(f'{file}') as f:
current_tag = None
list = ()
for line in lines:
m = re.match('^>(.+)', line)
if m:
current_tag = m.group(1)
else:
sequences(current_tag) += line.strip()
for tag, seq in sequences.items():
list.append((tag, seq))
return list

def genbank_to_fasta():
file = input(r'Input the path to your file: ')
with open(f'{file}') as f:
locus = re.search('NC_d+.d+', gb(3)).group()
region = re.search('(d+)?.+(d+)', gb(2))
definition = re.search('w.+', gb(1)(10:)).group()
definition = definition.replace(definition(-1), "")
tag = locus + ":" + region.group(1) + "-" + region.group(2) + " " + definition
sequence = ""
for line in (gb):
pattern = re.compile('(atgc){10}')
matches = pattern.finditer(line)
for match in matches:
sequence += match.group().upper()
end_pattern = re.search('(atgc){1,9}', gb(-3))
sequence += end_pattern.group().upper()
return sequence, tag

def validate_seq(dnaseq):
if len(dnaseq) == 0:
print("Empty sequence.n")
user_selection()
else:
for nuc in dnaseq:
if nuc not in nucleotides or rnanucleotides:
print("Invalid sequence.n")
user_selection()
else:
return dnaseq

def nuc_count(dnaseq, tag):
print("n================================================")
print(f'THE ANALYSIS: >{tag}n')
if len(dnaseq) >= 1000:
print(f'Total: {len(dnaseq)/1000} kbpn')
else:
print(f'Total: {len(dnaseq)} bpn')
print("Nucleotide frequency:")
for letter in nucleotides:
letter_total = dnaseq.count(letter)
letter_per = (letter_total / len(dnaseq))
print(f'{letter}: {letter_total} : {letter_per:%}')
GC = ((dnaseq.count("G") + dnaseq.count("C"))/(len(dnaseq)))
print(f'GC content: {GC:%}')

def rev_transcription(rnaseq):
dnaseq = rnaseq.replace("U", "T")
return dnaseq

def DNA_to_cDNA(dnaseq):
cdna = "".join((rev_compliment(nuc) for nuc in dnaseq))
return cdna

def transcription(dnaseq):
rna = dnaseq.replace("T", "U")
print(f"RNA:  5' {rna} 3'")
return rna

def translation(dnaseq, init_pos=0):
dnaseq = dnaseq.replace("T", "U")
return (RNA_codon_table(dnaseq(pos:pos + 3)) for pos in range(init_pos, len(dnaseq) - 2, 3))

frames = ()
frames.append(translation(dnaseq, 0))
frames.append(translation(dnaseq, 1))
frames.append(translation(dnaseq, 2))
frames.append(translation(DNA_to_cDNA(dnaseq)(::-1), 0))
frames.append(translation(DNA_to_cDNA(dnaseq)(::-1), 1))
frames.append(translation(DNA_to_cDNA(dnaseq)(::-1), 2))
return frames

def prot_from_rf(aa_seq):
prot1 = ()
proteins = ()
for aa in aa_seq:
if aa == "_":
proteins.extend(prot1)
prot1 = ()
else:
if aa == "M":
prot1.append("")
for i in range(len(prot1)):
prot1(i) += aa
return proteins

else:
all_proteins = ()
for rf in rfs:
prots = prot_from_rf(rf)
for p in prots:
all_proteins.append(p)
return all_proteins

def protein_weight(protein):
weights = (((amino_acid_weights(protein(pos: pos + 1)) for pos in range(0, len(protein)))))
weight = round(sum(weights), 3)
return weight

def printing(seq, frames):
for i, frame in enumerate(frames):
print(f'{i + 1}. {"".join(frame)}')
print("nAll possible proteins:")
list = ()
for prot in all_proteins_from_rfs(seq):
if prot not in list:
list.append(prot)
list.sort(key=len, reverse=True)
for i, prot in enumerate(list):
print(f'{i + 1}. {prot}: {protein_weight(prot)} Da')
print("================================================n")

def run(seq, tag):
if seq == 'seq' and tag == 'tag':
seq, tag = user_selection()
nuc_count(seq, tag)
cseq = DNA_to_cDNA(seq)
print(f"nDNA:  5' {seq} 3'")
print(f"cDNA: 3' {cseq} 5'")
transcription(seq)
translation(seq)
printing(seq, frames)

title_screen()
run('seq', 'tag')

Example output

================================================
THE ANALYSIS: >NC_000006.12:26156329-26157115 Homo sapiens chromosome 6, GRCh38.p13 Primary Assembly

Total: 787 bp

Nucleotide frequency:
A: 221 : 28.081321%
C: 246 : 31.257942%
G: 230 : 29.224905%
T: 90 : 11.435832%
GC content: 60.482846%

DNA:  5' CGAGTCCCGGCCAGTGCCTCTGCTTCCGGCTCGAATTGCTCTCGCTCACGCTTGCCTTCAACATGTCCGAGACTGCGCCTGCCGCGCCCGCTGCTCCGGCCCCTGCCGAGAAGACTCCCGTGAAGAAGAAGGCCCGCAAGTCTGCAGGTGCGGCCAAGCGCAAAGCGTCTGGGCCCCCGGTGTCCGAGCTCATTACTAAAGCTGTTGCCGCCTCCAAGGAGCGCAGCGGCGTATCTTTGGCCGCTCTCAAGAAAGCGCTGGCAGCCGCTGGCTATGACGTGGAGAAGAACAACAGCCGCATCAAGCTGGGTCTCAAGAGCCTGGTGAGCAAGGGCACCCTGGTGCAGACCAAGGGCACCGGCGCGTCGGGTTCCTTCAAACTCAACAAGAAGGCGGCCTCTGGGGAAGCCAAGCCTAAGGCTAAAAAGGCAGGCGCGGCCAAGGCCAAGAAGCCAGCAGGAGCGGCGAAGAAGCCCAAGAAGGCGACGGGGGCGGCCACCCCCAAGAAGAGCGCCAAGAAGACCCCAAAGAAGGCGAAGAAGCCGGCTGCAGCTGCTGGAGCCAAAAAAGCGAAAAGCCCGAAAAAGGCGAAAGCAGCCAAGCCAAAAAAGGCGCCCAAGAGCCCAGCGAAGGCCAAAGCAGTTAAACCCAAGGCGGCTAAACCAAAGACCGCCAAGCCCAAGGCAGCCAAGCCAAAGAAGGCGGCAGCCAAGAAAAAGTAGAAAGTTCCTTTGGCCAACTGCTTAGAAGCCCAACACAACCCAAAGGCTCTTTTCAGAGCCACCCA 3'
cDNA: 3' GCTCAGGGCCGGTCACGGAGACGAAGGCCGAGCTTAACGAGAGCGAGTGCGAACGGAAGTTGTACAGGCTCTGACGCGGACGGCGCGGGCGACGAGGCCGGGGACGGCTCTTCTGAGGGCACTTCTTCTTCCGGGCGTTCAGACGTCCACGCCGGTTCGCGTTTCGCAGACCCGGGGGCCACAGGCTCGAGTAATGATTTCGACAACGGCGGAGGTTCCTCGCGTCGCCGCATAGAAACCGGCGAGAGTTCTTTCGCGACCGTCGGCGACCGATACTGCACCTCTTCTTGTTGTCGGCGTAGTTCGACCCAGAGTTCTCGGACCACTCGTTCCCGTGGGACCACGTCTGGTTCCCGTGGCCGCGCAGCCCAAGGAAGTTTGAGTTGTTCTTCCGCCGGAGACCCCTTCGGTTCGGATTCCGATTTTTCCGTCCGCGCCGGTTCCGGTTCTTCGGTCGTCCTCGCCGCTTCTTCGGGTTCTTCCGCTGCCCCCGCCGGTGGGGGTTCTTCTCGCGGTTCTTCTGGGGTTTCTTCCGCTTCTTCGGCCGACGTCGACGACCTCGGTTTTTTCGCTTTTCGGGCTTTTTCCGCTTTCGTCGGTTCGGTTTTTTCCGCGGGTTCTCGGGTCGCTTCCGGTTTCGTCAATTTGGGTTCCGCCGATTTGGTTTCTGGCGGTTCGGGTTCCGTCGGTTCGGTTTCTTCCGCCGTCGGTTCTTTTTCATCTTTCAAGGAAACCGGTTGACGAATCTTCGGGTTGTGTTGGGTTTCCGAGAAAAGTCTCGGTGGGT 5'
RNA:  5' CGAGUCCCGGCCAGUGCCUCUGCUUCCGGCUCGAAUUGCUCUCGCUCACGCUUGCCUUCAACAUGUCCGAGACUGCGCCUGCCGCGCCCGCUGCUCCGGCCCCUGCCGAGAAGACUCCCGUGAAGAAGAAGGCCCGCAAGUCUGCAGGUGCGGCCAAGCGCAAAGCGUCUGGGCCCCCGGUGUCCGAGCUCAUUACUAAAGCUGUUGCCGCCUCCAAGGAGCGCAGCGGCGUAUCUUUGGCCGCUCUCAAGAAAGCGCUGGCAGCCGCUGGCUAUGACGUGGAGAAGAACAACAGCCGCAUCAAGCUGGGUCUCAAGAGCCUGGUGAGCAAGGGCACCCUGGUGCAGACCAAGGGCACCGGCGCGUCGGGUUCCUUCAAACUCAACAAGAAGGCGGCCUCUGGGGAAGCCAAGCCUAAGGCUAAAAAGGCAGGCGCGGCCAAGGCCAAGAAGCCAGCAGGAGCGGCGAAGAAGCCCAAGAAGGCGACGGGGGCGGCCACCCCCAAGAAGAGCGCCAAGAAGACCCCAAAGAAGGCGAAGAAGCCGGCUGCAGCUGCUGGAGCCAAAAAAGCGAAAAGCCCGAAAAAGGCGAAAGCAGCCAAGCCAAAAAAGGCGCCCAAGAGCCCAGCGAAGGCCAAAGCAGUUAAACCCAAGGCGGCUAAACCAAAGACCGCCAAGCCCAAGGCAGCCAAGCCAAAGAAGGCGGCAGCCAAGAAAAAGUAGAAAGUUCCUUUGGCCAACUGCUUAGAAGCCCAACACAACCCAAAGGCUCUUUUCAGAGCCACCCA 3'

1. RVPASASASGSNCSRSRLPSTCPRLRLPRPLLRPLPRRLP_RRRPASLQVRPSAKRLGPRCPSSLLKLLPPPRSAAAYLWPLSRKRWQPLAMTWRRTTAASSWVSRAW_ARAPWCRPRAPARRVPSNSTRRRPLGKPSLRLKRQARPRPRSQQERRRSPRRRRGRPPPRRAPRRPQRRRRSRLQLLEPKKRKARKRRKQPSQKRRPRAQRRPKQLNPRRLNQRPPSPRQPSQRRRQPRKSRKFLWPTA_KPNTTQRLFSEPP

All possible proteins:
2. MTWRRTTAASSWVSRAW: 2034.001 Da
3. MRLLFFSTS: 1082.558 Da
================================================
``````

This is an update of my gene sequencing program from a previous post. If you need clarifications on what each function accomplishes feel free to ask or look at my previous post. Any tips to make the code more concise or efficient are much appreciated or functions that you think would be useful. I included the links to an example FASTA and GenBank file for your reference.
FASTA(https://www.ncbi.nlm.nih.gov/nuccore/NC_000006.12?report=fasta&from=26156329&to=26157115)
GenBank(https://www.ncbi.nlm.nih.gov/nuccore/NC_000006.12?report=genbank&from=26156329&to=26157115)

Posted on Categories Articles

## optimization – Search Space for Genetic Algorithms

I have not been able to find anywhere a general formula for the size of the search space for Genetic Algorithms (GA).

I would imagine that such a formula would involve the binomial coefficient — maybe Stars and Bars

https://en.wikipedia.org/wiki/Stars_and_bars_(combinatorics)

The reason I ask is because I have developed my own GA and would like to know the search space size as a means to motivate the need for, and usefulness of, metaheuristics like GAs for a manuscript that I am currently writing.

As an example, consider a simple binary (0/1) GA with string length of $$L$$ = 10 and population size (number of chromosomes) of $$N$$ = 100. Possible solutions could be:

``````0100001110

1011010010

etc.
``````

Since 0/1 can be repeated in any given string (chromosome), there would be exactly $$2^N$$ possible configurations. This would generalize to $$k^N$$ for any $$k$$-ary problem.

I feel this isn’t the whole story.

If I had to guess a closed-form expression using Stars and Bars for the binary GA case, it might be something like $$binom{N + 2^N – 1}{N} = binom{N + 2^N – 1}{2^N – 1} = binom{100 + 2^{100} – 1}{100} = binom{100 + 2^{100} – 1}{2^{100} – 1} sim 10^{2852}$$

Is this the right line of thinking?

Any thoughts are greatly welcomed.

Posted on Categories Articles