Gene Expression Programming .NET Framework 1.0
Gene Expression Programming (GEP)
GEP is an evolutionary algorithm for function finding. It is explained in detail in a number of books and articles by Candida Ferreira, which you can find at
http://www.gene-expression-programming.com/.
What attracted me to GEP was its elegance. My goal was to build a prediction model, and my first choice was feed-forward neural networks (NNs). When I got deeper into NNs and started developing my project, I realized how many parameters there are: learning rate, number of layers, number of neurons per layer, jitter, annealing, parameters of the squashing functions, etc. What values should I use for those parameters? I wanted to combine NNs with genetic algorithms and evolve the parameters and structure of the networks. It was quite hard to come up with an elegant way to do that. Twice I started programming the genetic algorithms from scratch, and both times the code got so messy at some point that I stopped.

From that point of view GEP is superior. Every program is an expression tree that is easily encoded into a linear structure (the genome). This genome can then be subjected to different genetic operators and still produce valid offspring. In this library the genetic operators are not hardcoded into the genome and still need some parameters to be set in advance, but these are mostly probabilities, and good values for them are fairly intuitive. With this architecture the library can be upgraded and new genetic operators can be introduced.
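To make the "expression tree coded into a linear genome" idea concrete, here is a minimal sketch in Python (not the library's C# code, and not its API). It assumes the usual GEP convention that a gene is read breadth-first and consists of a head followed by a tail of terminals of length h * (n - 1) + 1, where h is the head length and n the largest function arity. The function set, the terminal names `x` and `a`, and the helper names `tail_length`, `decode` and `evaluate` are all made up for illustration.

```python
import operator

# Hypothetical function set: symbol -> (arity, implementation).
# Any symbol not listed here is treated as a terminal (arity 0).
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
}


def tail_length(head_length, max_arity):
    """Tail size that guarantees every head decodes to a complete tree."""
    return head_length * (max_arity - 1) + 1


def arity(symbol):
    return FUNCTIONS[symbol][0] if symbol in FUNCTIONS else 0


def decode(gene):
    """Build a tree of nested lists [symbol, child, ...] from a gene.

    The gene is consumed left to right while the tree is filled
    level by level (breadth-first). Symbols left over at the end
    are the non-coding part of the gene.
    """
    root = [gene[0]]
    queue = [root]
    pos = 1
    while queue:
        node = queue.pop(0)
        for _ in range(arity(node[0])):
            child = [gene[pos]]
            pos += 1
            node.append(child)
            queue.append(child)
    return root


def evaluate(node, env):
    """Evaluate a decoded tree; env maps terminal symbols to values."""
    symbol, children = node[0], node[1:]
    if symbol in FUNCTIONS:
        func = FUNCTIONS[symbol][1]
        return func(*(evaluate(child, env) for child in children))
    return env[symbol]


env = {"x": 3, "a": 1}

# Head "+*x" (length 3), tail "xxaa" (length 4): decodes to (x*x) + x.
print(evaluate(decode("+*xxxaa"), env))  # 12

# Point mutation in the head ("x" -> "+"): still a valid tree, (x*x) + (a+a).
print(evaluate(decode("+*+xxaa"), env))  # 11
```

The second call shows why the encoding is convenient: changing a head symbol only changes how much of the tail is used, so the mutated gene still decodes to a complete expression.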
There are two examples provided with the source code. The first is a simple parabola fitting. The second is a much more complex prediction of currency movement.
Here is a link, which explains this experiment.
Detailed documentation is available here.
The entire library is written in C#. It is created using .NET Framework 3.5. It does not come with any user interface. Creating a population of expression trees and starting the evolution takes about two pages of source code. This excludes data loading.
The framework is developed in such a way that distributed computing over networks can be implemented in the future.
At the beginning of development the goal was to have an option to choose between the float, double and decimal value types, depending on the requirements of the task at hand. Since C# has no macros as powerful as those in C++, there are three namespaces, one for each value type. The current version supports only float.
At the time this library was being developed, .NET multiprocessor support was still in beta. So far the library does not support multiprocessor machines, but this will be incorporated in the future. However, it is still possible to harness the power of multiprocessor computers by creating multiple populations and mixing them from time to time.
Most probably, when distributed computing over the network is implemented, this will be the way to keep constants synchronized. It will also reduce network traffic, since constants will be transferred only once, at the beginning.
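The idea of running several populations and mixing them from time to time can be sketched as follows. This is a Python illustration under my own assumptions, not code from the library: the names `migrate` and `run_islands`, the ring-shaped exchange, and the "best replace worst" rule are all hypothetical choices. Individuals are assumed to be immutable values (or copied by the caller), and `step` stands for whatever advances one population by one generation.

```python
def migrate(populations, fitness, k=1):
    """Ring migration: the k best of each population replace the k worst
    of the next one. Higher fitness is better. Lists are changed in place."""
    # Pick all emigrants before changing anything, so the exchange is simultaneous.
    emigrants = [sorted(pop, key=fitness, reverse=True)[:k] for pop in populations]
    for i, pop in enumerate(populations):
        incoming = emigrants[i - 1]  # population i receives from i-1 (wraps around)
        pop.sort(key=fitness, reverse=True)
        # Overwrite the tail (the worst individuals) with the newcomers.
        pop[len(pop) - len(incoming):] = incoming


def run_islands(populations, step, fitness, generations, interval, k=1):
    """Evolve the populations independently and mix them every `interval` generations."""
    for generation in range(1, generations + 1):
        # Each population advances on its own; this is the part that could be
        # handed to separate threads, processes, or machines.
        populations = [step(pop) for pop in populations]
        if generation % interval == 0:
            migrate(populations, fitness, k)
    return populations


# Toy usage: individuals are plain numbers and fitness is the number itself.
islands = [[1, 5, 3], [9, 2, 4], [7, 8, 6]]
migrate(islands, fitness=lambda individual: individual, k=1)
print(islands)  # [[5, 3, 8], [9, 4, 5], [8, 7, 9]]
```

Because the populations only interact during `migrate`, the expensive part (evaluating and breeding each population) needs no shared state between migrations, which is what makes this arrangement attractive for multiprocessor or networked setups.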