The gene expression process in nature evaluates the fitness
of a DNA through the production of different proteins in different cells.
The DNA sequence first produces the mRNA sequence and next the mRNA produces
the protein sequence by using a transformation called the genetic code.
Finally the protein sequence folds into a three-dimensional structure which
determines the fitness of the genome. This proposal points out that genetic
code-like transformations introduce very interesting properties to the
representation of a genetic fitness function. A recent result by the PI
showed that such transformations in binary sequence representation can
convert functions with exponentially large description to one that is highly
suitable for polynomial-size approximation under certain practical conditions.
Precisely speaking, there exists a class of genetic code-like transformations
that can construct a Fourier representation
with only a polynomial number of terms that are exponentially more significant than the rest when fitter chromosomes are given more copies through a redundant, equivalent representation. This is a very desirable property for efficient induction of function representation from data which is a fundamental problem in learning, data mining, and optimization.
The proposed research will extend this preliminary finding
and explore the computation in richer genetic code-like transformations
for non-binary representations that are used in natural DNA, mRNA, and
proteins. At this point, there exists no known strong techniques
to induce functions with exponentially large representation to functions
with only a polynomial number of terms. If the genetic code offers such
properties then it will have a significant impact on the theory of computation,
machine learning, data mining, pattern recognition, automated program induction,
and optimization. It will also help explaining the scalable adaptive behavior
of the natural genetic search.
The proposed research will have the following primary