Implementation and Testing
5.1 Implementation methodology
Similar to the design
methodology, an incremental approach to the implementation of the GGAT was
used. A basic genetic algorithm engine was first implemented and after
being tested and debugged the other features were added to the system.
After sufficient number of features for the engine were implemented, work
on the GUI started. Every feature was tested individually and in
conjunction with the rest of the system as it was added.
Features had then added and
modified in both subsystems until a GGAT was produced, which matched the
specification.
5.2 Choice
of Language
The choice of language is an
important part of the design since it will play an important part of the
design and architecture of the system. The GGAT did not have any specific
execution time or hardware constraints, which usually are the reason for a
low-level language to be used: therefore using a high level language was
the obvious choice.
Genetic algorithms have been
commonly implemented in LISP. One of the main reasons for the popularity
of LISP in the area of genetic algorithms is because as a language it
allows for dynamic code creation, compilation and execution. Some
adaptations of genetic algorithms like the ones used in some automated
test data generating systems require this functionality. LISP is also a
very powerful high-level language with a lot of function libraries
available. LISP was initially the language of choice for implementing the
genetic algorithm engine in order to make the tool useful for further work
in genetic algorithms. The user interface was to be implemented in Java
because the GUI API in LISP was believed to be more difficult to work with
and the author had no previous experience in LISP. The actual GUI
implementation was to be evaluated for feasibility since it was planned to
control a LISP genetic algorithm engine. The initial plan was to use
common memory locations in order to pass messages from the user interface
to the genetic algorithm engine or to incorporate the LISP engine as a
native method in the Java GUI. It was considered that if these were not
feasible, a Java form could be used only as an easy way to select the
genetic algorithm engine’s parameters that will be saved as a text file
and read in tern by the engine before it starts.
LISP is different to
procedural languages. It requires a new approach of programming style and
planning. After the language fundamentals and functionality were studied
in some detail it became apparent that all the functionality of LISP (that
is necessary for the completion of the GGAT specification) are present in
other common high level languages like Java. For the time allocated for
this project only a part of the original specification would have been
implemented in LISP. The author had no previous knowledge of LISP and the
overheads of learning the language in detail, as the project progressed
would have reduced the project scope.
Java was eventually chosen as
implementation language hence making the GUI feasibly since the problem of
communication between the user interface and the engine disappears. Also
the author had a breadth of experience in Java programming including GUI
building.
5.3 A note
on packages
The system consists of two
sub-systems therefore they can be split into different packages. Sun has
released a set of guidelines on naming packages; they should be a
company’s name in reverse. As the author’s personal domain name is
karnig.co.uk the packages will start as uk.co.karnig.
The genetic algorithm engine is in the
uk.co.karnig.ggat.ga package and the GUI sub-system is in the
uk.co.karnig.ggat.gui package. All the functionality of the system is
in the genetic algorithm sub-system, making it re-usable without the GUI.
5.4 Software produced
A fully operation system was produced. A number of the
features designed were not implemented because of time constraints. They
will be discussed later in this chapter.
The GGAT tool was produced
with a GUI. The genetic algorithm parameters are set for every genetic
algorithm. As previously mentioned the genetic algorithm executions can be
logged to a file. What exactly is logged can be chosen. A particular
genetic algorithm can be executed a number of times, and each starts with
a different initial population and can have different logging settings. A
user manual for the GGAT is provided.
The implementation of the GUI
subsystem was done using the “Forte for Java 2.0” IDE. Therefore parts of
the code of the uk.co.karnig.ggat.gui package were
automatically generated but still the author developed the majority of the
code there.
For implementing window components in Java there were
two choices. The windowing package that the author was familiar with was
the Abstract Window Toolkit (AWT). It provided all the necessary
functionality needed for this project. In the latest release of Java, Sun
Microsystems emphasised the advantages of the newer Swing windowing
package, which included greater platform compatibility and greater
functionality. The author was not familiar with the Swing package, but
decided that it would be of his advantage to learn and use the Swing
package, since it would give him greater experience.
Certain basic features had to be implemented from
scratch since Java did not provide convenient packages that implemented
them. The binary decoding was one of these features. The java.math package
was indicated to provide similar capabilities but deciding from binary to
integer was only implemented, even without a integer to binary capability.
Therefore this feature had to be implemented from scratch.
Another such feature was the sorting of genotypes in a
population vector. The Vector class did not provide a method that can use
an attribute of the objects it contains, to use it for a key to sort the
objects.
The random number generator was another function that
was actually available, but did not behave in the desired way, and a
specialised function had to be implemented using the basic random number
generator in Java.
All packages used in the GGAT,
including the Swing package are core java packages and are part of the
Java Standard Development Kit. Because most Java virtual machines are
likely to have them, the packages used were not included with the program
provided on the CD-Rom with this project. There is no Java Virtual machine
provided with the GGAT as well, as it is expected by the used to have one.
However a Java Development Environment is provided for completeness on the
CD-Rom.
A web page was developed
about the GGAT, where the program can be downloaded. For the reasons
mentioned previously no packages other that the two GGAT packages are
provided. User documentation and API is also provided. A copy of the web
site is provided in Appendix A. The web page is aimed to provide this
program to developers who might find it useful for their research or work.
The source code is decided to be distributed on personal basis, rather
than being available on the web site.
5.5 Features not implemented
A number of features of the GGAT were not implemented
because of time constraints.
The character string schema for the genetic algorithm
engine was one of these features.
The travelling salesman problem was also not
implemented, because the string schema was not going to be available and
because the time for completion of the project was running out.
The knapsack problem was also not implemented because
of lack in time. There were already five other fitness functions to
demonstrate the capabilities of the genetic algorithm engine and it was
decided that they represent an acceptable spectrum of problems.
The genetic algorithm engine was designed to allow for
invalid genotype children (as in the example of the travelling salesman).
The five problems implemented with the GGAT however all used a continuous
schema and did not need the use of that feature. The time constraints also
contributed for this decision.
In the design for the genetic algorithm the setting of
every parameter for the genetic algorithm were supposed to be validated
before being set, and exceptions thrown when invalid setting is attempted.
A number of the genetic algorithm engine parameters are set from the GUI
using window components, like drop down menus, that make incorrect setting
impossible. Such parameters include the parent selection technique,
crossover technique, mutation technique etc. Because of time restrictions
validating functions were not implemented for those parameters. However
all those parameters that involved a textual input from the user, for
example the mutation rate, have such validating techniques implemented.
Hence the GGAT prevents wrong setting parameters wrongly, but the genetic
algorithm engine itself does not validate all of the parameters set.
The error handling for the genetic algorithm, as it
executes was also not implemented because of time constrains. The GGAT has
been tested for errors in the genetic algorithm, and it is believed that
only wrong parameters can cause the genetic algorithm engine generate
errors. As previously mentioned the parameter setting in the GGAT is
designed to prevent wrong parameter settings, therefore no errors in the
genetic algorithm engine are expected. However since the GGAT aims to
provide a base for further work on genetic algorithms, users that use the
genetic algorithm engine could benefit from such error handling, so that
they can easily debug their fitness evaluation functions.
5.6 Testing
As mentioned testing of the GGAT was done as it was
implemented. Every element of the system was individually tested for
statement coverage and its functionality verified. Features were only
integrated to the system after they passed the test criteria.
When a feature failed the
test criteria, the implementation was debugged and in some occasions the
design revised, until the cause of the error is found and removed. An
example of such feature is the random number generator. It was a core part
of the genetic algorithm engine, and could not be substituted easily.
A function utility that
generated a random integer for a specified range was available in Java.
However it seemed to behave in a very deterministic manner when compared
to the random number generator in java used, which only returns a real
number between 0 and 1(1 exclusive). After some effort was put into that
function, and its design modified, the feature was completed successfully.
When certain bugs are found,
and cannot be fixed, the solution is to be added as part of the
specification or to be removed from it. There has not been a feature that
generated bugs that were added to the specification of this project, but
some features have been removed from the specification for that reasons.
An example is the average population fitness, mentioned before in the
design chapter. After its design and redesign implementations failed to
deliver the aimed the functionality test, the feature was removed from the
specification.
Functional testing was
thoroughly performed on the system as a whole. The GUI was tested whether
the appropriate Kernel preferences are correctly set, whether the Kernel
execution is monitored correctly and if the execution is controlled
correctly.
Most of the functional
testing effort was concentrated around the correctness of the genetic
algorithm engine. Parameter passing from the Kernel to the other classes
was examined closely. Then different combination of parameters tested the
stability of the genetic algorithm engine. Some contradictions between
certain parameter settings that were missed at the initial design were
picked up at the testing stage, and the design altered to eradicate the
problem.
For those parameters of the
genetic algorithm engine where not all possible settings could be manually
tested, boundary and median values were used. For example the parameter
setting the number of evaluations before the genetic algorithm is
terminated was tested using test date of that kind. Testing for the
maximum and minimum possible value settings of some parameters helped the
discovery and debugging of some faults in the crossover techniques.
The genetic algorithm
execution log capabilities greatly contributed to the functional testing
and debugging of the genetic algorithm engine. Some of the logging options
of the genetic algorithm engine were at first implemented for testing
purposes only. However they were latter on added to the system
specification. When a genetic algorithm is executed and logging done in
full detail, the actual operations of the genetic algorithm can be traced,
even at binary level. This provided a very good way of stepping trough the
phases of the genetic algorithm and helped not only for testing, but also
for the design of some crossover features. The details that each log
option provides are described in the user documentation (Appendix C) and
an example log file is provided in Appendix D.
The GGAT was believed to be thoroughly tested at least
for functional completeness. Statement coverage testing of the GGAT as a
whole seemed to be complicated because the program is event driven, and
some functions were added to the system for completeness and future
extensions, like the binary to Gray code translator. Therefore statement
coverage was done on the individual feature level, but the code has been
checked for irrelevant statements and functions when extra comments were
added to it. However two very easy to miss bugs were not discovered up
until days before the completion of this report. The first one involved
the crossover procedures. When a chromosome was crossed over, the parent
chromosomes’ genes were copied across to the child chromosomes. But since
the genes were reference objects, the children actually ended up with a
reference to the genes of the parent chromosomes, instead of gaining new
gene objects with the value of their parent chromosomes.
The bug was easy to miss since an equals sign is
logical to be used, since each gene ultimately represented a number
primitive. The second bug involved a subtraction operator in the real
umber decoding function that had to be addition. With the bug in place the
actual real numbers’ range was not what it was set to be by the
parameters. This bug was discovered when the algorithm was being described
in the design chapter of this report.
These two bugs indicate a rule that is often forgotten
by software engineers, that in most cases a software product can never be
tested enough and bugs can always pass the development undetected. Hence
this system cannot be claimed to be error free, but a sufficient effort
has been put to prevent the presence of any bugs in it.