Gene expression programming - Complex systems and AI

Gene expression programming

Gene expression programming is inspired by the replication and expression of the DNA molecule, especially at the gene level. Expression of a gene involves the transcription of its DNA into RNA, which is in turn translated into the chains of amino acids that make up proteins in the phenotype of an organism. The building blocks of DNA are subject to mechanisms of variation (mutations such as copying errors) as well as to recombination during sexual reproduction.

Gene expression programming uses a linear genome as the basis for genetic operators such as mutation, recombination, inversion, and transposition. The genome is made up of chromosomes, and each chromosome is made up of genes that are translated into an expression tree to solve a given problem. The robust gene definition means that genetic operators can be applied to the sub-symbolic representation without regard to the structure of the resulting gene expression, which keeps genotype and phenotype separate.

The goal of the gene expression programming algorithm is to improve the adaptive fit of an expressed program in the context of a problem-specific cost function. This is achieved through an evolutionary process that operates on a sub-symbolic representation of candidate solutions, using surrogates for the processes (descent with modification) and mechanisms (genetic recombination, mutation, inversion, transposition, and gene expression) of evolution.

A candidate solution is represented by a linear string of symbols called Karva notation or a K-expression, where each symbol corresponds to a function or terminal node. The linear representation is mapped to an expression tree in a breadth-first manner. A K-expression has a fixed length and includes one or more sub-expressions (genes), which are also defined with a fixed length.
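The breadth-first mapping from a linear string to an expression tree can be illustrated with a small sketch. The function set, the variable names x and y, and the helpers decode and evaluate are assumptions made for this example, not part of any reference implementation.

```python
from collections import deque
import operator

# Hypothetical function set: symbol -> (arity, implementation).
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
}

def decode(kexpr):
    """Build a nested [symbol, children] tree from a K-expression.

    Symbols are consumed left to right and attached level by level,
    so the string is read as a breadth-first listing of the tree.
    """
    symbols = iter(kexpr)
    root = [next(symbols), []]
    queue = deque([root])
    while queue:
        node = queue.popleft()
        arity = FUNCTIONS[node[0]][0] if node[0] in FUNCTIONS else 0
        for _ in range(arity):
            child = [next(symbols), []]
            node[1].append(child)
            queue.append(child)
    return root

def evaluate(node, env):
    """Evaluate a decoded tree; env maps terminal symbols to values."""
    symbol, children = node
    if symbol in FUNCTIONS:
        args = [evaluate(child, env) for child in children]
        return FUNCTIONS[symbol][1](*args)
    return env[symbol]

# "+*xxyxy" expresses (x * y) + x; the last two symbols are never read.
tree = decode("+*xxyxy")
value = evaluate(tree, {"x": 3, "y": 4})
```

Symbols left over once every function node has its arguments are simply ignored, which is why part of a gene can stay unexpressed.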

A gene is made up of two sections: a head, which can contain any function or terminal symbols, and a tail, which can contain only terminal symbols. Each gene will always result in a syntactically correct expression tree, because the tail provides a genetic buffer that ensures closure of the expression.

The length of a chromosome is determined by the number of genes, where the length of each gene is h + t. Here h is a user-defined parameter (such as 10), and t is defined as t = h(n - 1) + 1, where n is the maximum arity of the function nodes in the expression (such as 2 if the arithmetic functions *, /, -, + are used).
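The length formula can be checked with a few lines of arithmetic. The function names below are chosen for this sketch only. The tail length follows from the worst case: a head made entirely of functions of arity n gives h internal nodes, and such a tree needs h(n - 1) + 1 leaves.

```python
def tail_length(head_length, max_arity):
    """t = h(n - 1) + 1: enough terminals to close a head full of functions."""
    return head_length * (max_arity - 1) + 1

def gene_length(head_length, max_arity):
    """Length of one gene: head plus tail."""
    return head_length + tail_length(head_length, max_arity)

def chromosome_length(num_genes, head_length, max_arity):
    """Length of a chromosome made of equally sized genes."""
    return num_genes * gene_length(head_length, max_arity)

# With h = 10 and binary arithmetic functions (n = 2):
t = tail_length(10, 2)            # 11
g = gene_length(10, 2)            # 21
c = chromosome_length(3, 10, 2)   # 63 for a three-gene chromosome
```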

The mutation operator substitutes symbols along the genome, but it must respect the gene rules: in the head of a gene a symbol may be replaced by either a function or a terminal, while in the tail only terminals may be substituted.
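A minimal sketch of such a mutation operator follows. The symbol sets, the per-symbol mutation rate, and the function name mutate are assumptions for illustration.

```python
import random

# Hypothetical symbol sets for this sketch.
FUNCTIONS = ["+", "-", "*", "/"]
TERMINALS = ["x", "y"]

def mutate(gene, head_length, rate, rng=random):
    """Point-mutate a gene while keeping the head/tail rules.

    Head positions may receive any symbol; tail positions only terminals,
    so the mutated gene still decodes to a valid expression tree.
    """
    symbols = list(gene)
    for i in range(len(symbols)):
        if rng.random() < rate:
            pool = FUNCTIONS + TERMINALS if i < head_length else TERMINALS
            symbols[i] = rng.choice(pool)
    return "".join(symbols)
```

For example, mutate("+*xxyxy", head_length=3, rate=0.1) returns a string of the same length whose last four symbols are still terminals.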

Crossover occurs between two parents selected from the population and can be a one-point crossover, a two-point crossover, or a gene-based approach in which whole genes are selected from the parents with uniform probability.
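The three recombination modes can be sketched as follows, assuming both parents are strings of equal length built from genes of equal length. The function names are hypothetical. Because material is always exchanged at matching positions, head symbols stay in heads and tail symbols stay in tails.

```python
import random

def one_point(a, b, rng=random):
    """Swap everything after a single random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def two_point(a, b, rng=random):
    """Swap the segment between two distinct random cut points."""
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def gene_crossover(a, b, gene_length, rng=random):
    """For each gene slot, pick which parent supplies it with probability 0.5."""
    child1, child2 = [], []
    for start in range(0, len(a), gene_length):
        gene_a = a[start:start + gene_length]
        gene_b = b[start:start + gene_length]
        if rng.random() < 0.5:
            gene_a, gene_b = gene_b, gene_a
        child1.append(gene_a)
        child2.append(gene_b)
    return "".join(child1), "".join(child2)
```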

An inversion operator can be used with a low probability; it reverses a small sequence of symbols (1-3) within one section of a gene (head or tail). A transposition operator can be used in a number of different modes, including: copying a small sequence (1-3) from somewhere on a gene into the head, copying a small sequence on a gene to the root of the gene, and moving an entire gene within the chromosome. In the case of intragenic transpositions, the sequence in the head of the gene is shifted down to accommodate the copied sequence, and the head is truncated to its original length to keep gene sizes consistent.
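Inversion and the first transposition mode can be sketched as below. The function names and the choice to leave the root position untouched during transposition into the head are assumptions of this example; root transposition and gene transposition are not shown.

```python
import random

def invert(gene, head_length, rng=random, max_len=3):
    """Reverse a short run of symbols inside the head or inside the tail."""
    if rng.random() < 0.5:
        lo, hi = 0, head_length            # operate on the head
    else:
        lo, hi = head_length, len(gene)    # operate on the tail
    length = rng.randint(1, min(max_len, hi - lo))
    start = rng.randrange(lo, hi - length + 1)
    segment = gene[start:start + length]
    return gene[:start] + segment[::-1] + gene[start + length:]

def transpose_into_head(gene, head_length, rng=random, max_len=3):
    """Copy a short fragment from anywhere in the gene into the head.

    The head is shifted down to make room and then truncated to its
    original length, so the tail and the gene size are unchanged.
    Assumes head_length >= 2 so a non-root insertion point exists.
    """
    length = rng.randint(1, min(max_len, len(gene)))
    source = rng.randrange(0, len(gene) - length + 1)
    fragment = gene[source:source + length]
    target = rng.randrange(1, head_length)  # assumption: never overwrite the root
    head, tail = gene[:head_length], gene[head_length:]
    new_head = (head[:target] + fragment + head[target:])[:head_length]
    return new_head + tail
```

Both operators keep the head and tail boundaries intact, so the resulting gene still decodes to a valid expression tree.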

In gene expression programming, a '?' can be included in the terminal set to represent a numerical constant taken from an array that is evolved at the end of the genome. The constants are read from the end of the genome and substituted for each '?' as the expression tree is created (in breadth-first order).
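One possible reading of this substitution is sketched below. It assumes the constants are stored as a list and consumed in stored order, one per expressed '?'. The arity table and helper names are hypothetical. Since a K-expression is already a breadth-first listing of the tree, scanning its expressed prefix from left to right visits the '?' symbols in breadth-first order.

```python
# Hypothetical arity table; any symbol not listed is treated as a terminal.
ARITY = {"+": 2, "-": 2, "*": 2, "/": 2}

def coding_length(gene):
    """Number of leading symbols actually used by the expression tree."""
    needed, used = 1, 0
    while needed > 0:
        needed += ARITY.get(gene[used], 0) - 1
        used += 1
    return used

def bind_constants(gene, constants):
    """Replace each expressed '?' with the next constant from the array.

    Assumes the array holds at least as many constants as there are
    expressed '?' symbols.
    """
    supply = iter(constants)
    bound = []
    for symbol in gene[:coding_length(gene)]:
        bound.append(next(supply) if symbol == "?" else symbol)
    return bound

# Head "+?*", tail "x??x": only the first five symbols are expressed.
bound = bind_constants("+?*x??x", [0.5, 2.0, 7.0])
```

In this example the second '?' of the tail is never expressed, so the third constant is left unused.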

Several sub-expressions can be linked together on difficult problems when a single gene is not sufficient to solve the problem. The sub-expressions are joined using link expressions, which are function nodes that are either statically defined (such as a conjunction) or evolved on the genome along with the genes.
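A statically defined link can be sketched as a fold over the values of the individual genes. The example below assumes an arithmetic function set and uses addition as the default link function; for Boolean problems a conjunction would take its place. All names are chosen for this illustration.

```python
from functools import reduce
import operator

# Hypothetical function set: symbol -> (arity, implementation).
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
}

def evaluate_gene(gene, env):
    """Evaluate one gene directly from its breadth-first symbol string."""
    arities, needed, i = [], 1, 0
    while needed > 0:                      # find the expressed prefix
        arity = FUNCTIONS[gene[i]][0] if gene[i] in FUNCTIONS else 0
        arities.append(arity)
        needed += arity - 1
        i += 1
    values = [None] * len(arities)
    for k in reversed(range(len(arities))):
        if arities[k]:
            # In breadth-first layout the children of node k start here.
            first = 1 + sum(arities[:k])
            args = values[first:first + arities[k]]
            values[k] = FUNCTIONS[gene[k]][1](*args)
        else:
            values[k] = env[gene[k]]
    return values[0]

def evaluate_chromosome(chromosome, gene_length, env, link=operator.add):
    """Split the chromosome into genes and join their values with link."""
    genes = [chromosome[s:s + gene_length]
             for s in range(0, len(chromosome), gene_length)]
    return reduce(link, (evaluate_gene(g, env) for g in genes))

# Two genes of length 7: (x * y) + x and x - y, linked by addition.
total = evaluate_chromosome("+*xxyxy" + "-xyyxxy", 7, {"x": 3, "y": 4})
```

Passing a different link argument, such as operator.mul, changes how the sub-expressions are combined without touching the genes themselves.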

Here is the gene expression programming algorithm: