Ecosystem

Constructing a simple self-perpetuating society of adaptable agents

Home

Jason Brownlee
April 2003
AI Writing Competition

Adaptation - alteration in the structure or function of organisms which enables them to survive and multiply in a changed environment
- The Macquarie Dictionary

Forward
I had read a number of articles on seemingly complex systems derived from simple rule sets, but it wasn’t until I constructed a Wator [1] simulation for a C++ class at University that I truly appreciated the magic of these systems. Later I reread Craig Reynolds: Boids [2] documentation and built a simple 2D implementation in Java. Honestly, I was spellbound. I could stare at the simulations for hours, wholly engrossed in the effect small changes in rules at the individual level had on the system as a whole.

I became intensely interested in machine learning algorithms and data mining after taking a few courses in Intelligent Agents and Artificial Intelligence. After constructing a reasonable library of various neural network algorithms and playing with the plethora of toy classification and prediction problems I needed something I could sink my teeth into.

The result is this article and its associated software. The goal was to attempt the construction of a self-perpetuating system of non-playable characters (NPC’s) that would use learning algorithms to optimize elements of their behaviour to engender their species.

The system was designed to be simplistic, with the intent of reinforcing ideas. Although I believe the concepts involved are thought provoking nonetheless. I hope you’ll agree and possibly like myself spend countless hours of enjoyment and frustration in continually tweaking and observing the program’s parameters.

Introduction
The goal of this article is to discuss the design and the practical implementation details of two common artificial intelligence techniques used in the construction of a non-deterministic, self- perpetuating and self-optimizing ecosystem of simple NPC’s in a real-time environment. Practical considerations are emphasized over technique detail. Upon completing this article the reader will have an understanding of development issues involved in applying the selected techniques in a real-time game environment.

We will start with what the ecosystem is trying to achieve and what type of agents populate its universe. Next a development platform is selected and evaluated, with a focus on its strengths and weaknesses.

A review is then made of the plant NPC with a focus on parameters which directly influence the species ability to survive and interact with other agents. A genetic algorithm is proposed and constructed to give the agent the flexibility to adapt to its local surroundings.

Next the herbivore agent is examined. We look at critical sub-systems such as goal selection focus on the problem of feed goal evaluation. A perceptron solution is proposed and implemented, with a focus on management of herbivore memories and the affect dynamic learning has on goal selection.

The system is evaluated using concepts such as diversity, longevity and determinism. Interesting observations and behaviour are discussed and finally extensions and additional features are proposed.

We have an exciting journey ahead, so let’s gets straight into it.

What is Ecosystem?
To begin, please forgive my use of the word agent; I have used it in the terms of "A means by which something is done or caused", rather than the Artificial Intelligence meaning defined by "Agency, Intelligence and Mobility". Associate my usage with the term Non-Playable Character (NPC).

The vision of the ecosystem is a system of agents that use learning techniques, to directly or indirectly optimize their behaviour to meet challenges in their real-time environment with the goal of individual survival and reproduction. There are two agent types that exist in the ecosystem, which are highly dependent on each other for survival.

Plant
The plant is a basic organism which exists to consume food and act as a food source to the Herbivore. Life for a plant is started as a seed which bursts into existence and floats down to root. Seeds can be sticky so they may stick to surfaces, but they are delicate little things so if they touch the wrong surface they die. Food for the plant is extracted from the environment over time (concept of sun-light). A plant can reproduce, but it must be grown-up (of a certain age) and it must receive pollen (genetic material) from another plant. Pollen is attached to agents whenever they touch/eat from the plant. Given a special combination of plant genes, a plant can be poisonous to eat.

Herbivore
The herbivore is a simple creature that lives to eat plants. Herbivores start life as children able only to select plants to eat, and eat them. Once grown up, herbivores are able to consume plants at a faster rate and are able to reproduce. Herbivore’s live for a shorter time than plants, but have the unique ability of passing their instinct (regarding what makes a good plant to eat) onto their children.

System
The system is small and low in complexity and its clear there are strong relationships between the two agents. A plant requires herbivores to act as a delivery vehicles to distribute its genetic material to other plants. Plants also rely on the herbivore to expose them to new pollen, providing the only mechanism for reproduction and ultimately the continuation of the plant species.

Herbivores require plants to eat. Without plants, herbivores’ would die before they could grow up and reproduce. Herbivores need their population size maintained. Too many herbivores mean the food supply is exhausted before it can be replenished. Therefore plant population size has a relationship with the maximum number of herbivores it can sustain. An additional interesting feature is that not all plants are good for herbivores. There are some plants that have bad genes to cause them to be harmful to herbivores once eaten.

The combination of the above relationships defines the general rules which restrict the system. Both species are forever walking the tight-rope of extinction relying on the individual agent decisions and the combined total of the actions/parameters of all agents to determine the systems condition. Combined with this is the fact that the very environment they exist in is subject to constant change with things like hazards and having paths to goals blocked. Agent associations are defined by their behaviour and need to be continually adjusted to adapt to local conditions.

Lifecycle
Although each of the two distinct species perform different actions and have different behaviours they all share a common lifecycle. This lifecycle is a loose abstraction from nature:

The lifecycle specification is better demonstrated through the use of a diagram (see figure 1.0).


Figure 1.0 - The common agent lifecycle

Platform
I chose to build the simulation on the Quake II computer game platform [3] by id Software [4]. The following are reasons why I think this particular platform is a good starting point for experimentation:

The above points are great, but the selling point for me was the existing monster sub-systems. They provided me with a basic tool kit of monster behaviour which I could hook into as required, meaning I didn’t have to bother with detail outside the scope of what I was trying to achieve.

Framework
Each monster in the game is provided with processing time via a common interface. The approach could almost be called polymorphism if it were an object oriented implementation. Generic monster functionality is abstracted and specific behaviour is defined by implementing the required interfaces (in this case provided as function pointers). This elegant approach is used throughout the game’s code and provides a welcomed level of flexibility.

The monster framework provides an extensive collection of predefined generic monster behaviour. These subsystems include:

Where appropriate, the generic monster framework will call monster specific functionality to cope with a situation which generally in turn calls generic monster functions to deal with those conditions. The majority of the state transitions are managed for the monsters, all that is required is situation specific functions for things like attacking goals.

The monster framework is low in complexity, but provides a lot of flexibility in allowing the choice to use as much or as little of the existing functionality as is required. This, in conjunction with the excellent rendering engine, physics, and network code make it a recommended launch pad for creative AI simulations and experiments.

For some useful resources for programming on this platform, please refer to [7], [8] and [9].

Plant Agents
Plants are very simple organisms. This section will discuss the four states a plant may be in and the constraints and actions of those states. The issue of plant reproduction will be raised and addressed using a genetic algorithm inspired solution.

Plant States
A plant has a number of defined states it can be in at any one time. The state a plant agent instance is in narrows the set of actions it can perform and defines the rules which must be fulfilled to trigger a transition into other valid plant states.

The following is a list of valid plant states.


The following figure 1.1 provides a diagrammatical representation of the states for a plant and the valid state transitions.


Figure 1.1 - The 4 state of a plant and the triggers for each state transition

Plants live for a defined length of time as a sprout and then grow to a mature state. Once mature, plants received an increase in the amount of resources that they can extract from the environment per server frame. They are also allowed to reproduce if pollinated. There are constraints regarding the number of times a plant can reproduce, the number of seeds it will release, the length of time between reproductions and so on. These variables are critical to the plant as they directly influence its behaviour and thus its ability to survive.

To address the critical nature of plant parameters, many variables have been left as user adjustable constants in a configuration file. A small collection of critical variables were extracted and make up what is defined as the plants DNA. These parameters include:

Genetic Algorithm (GA)
Plants pass on genetic material to their seeds when they reproduce. A direct copy of this material would be clones of plants, meaning the species would be forever stuck in its random starting condition. The environments that the plants exist in are large and consist of varying terrain and hazards. There is a need for plants to be able to adapt their behaviour to meet local conditions; both local to the selected environment and local to the plants position and surroundings in the selected environment.

Genetic Algorithms (GA’s) are a method of computing inspired by the ideas of evolution and natural selection. GA’s simulate the evolution process for a given problem via processes of selection, mutation and reproduction. They are known as being an effective search/learning/optimization technique at finding close to optimal solutions. Please refer to the Genetic Algorithms FAQ [10] for thorough treatment.

The following provides an outline of the benefit of the various genetic operators to the plant species:


Figure 1.2 - Representation of the reproduction action of a plant using GA operators

Plant Fitness
Figure 1.2 shows the design and features of plant reproduction using genetic operators, though there is something important missing; Fitness Evaluation and Selection. This is the determination of which plant agents are given the chance to reproduce and how that choice is made, a critical component of which the above scheme would be irrelevant without.

Selection of which plants may reproduce is made by the herbivore agents. Herbivores use a selfish fitness function to evaluate plants in their local area and select one to eat from. Upon eating from a plant they pass on genetic material from the last plant they ate from, and then take genetic material away with them to their next food goal.

Reiterating a plants ability to reproduce along with the implications of using the herbivores as the selection mechanism, a plant must:


Sounds like an interesting self optimisation problem, but there is one more catch: the poison genes. Plants optimise their DNA ensuring they produce enough seeds, and that their seeds float through the air the right amount to survive and get eaten by herbivores. The downside is crossover and mutation also manipulates their poison genes and once activated will injure the herbivores they work so hard to optimise themselves for.

Generally when employing GA techniques, the problems involve a large search space, where unit populations counted in the hundreds or thousands, and where converge on a solution can occur over many hundreds of generations. In this case the problem is small and relatively simple and it is expected smaller population sizes and convergence time are required. So called "errors" in plant behaviour are welcomed as they provide the simulation with a level of non-determinism expected by the user. They are also required by the optimisation procedure for pushing out of locally optimal values, to traverse the search space for good plant DNA.

Herbivore Agents
The herbivore agent is more complex than the plant in that it makes use of a goal selection and evaluation system and moves about the environment pursuing its goals. This section discusses the states that a herbivore agent may have, as well as the actions it may perform in each state. The problem of flexible food selection is discussed and a Perceptron based solution implemented, along with a simple short term memory subsystem to capture and maintain training samples.

Herbivore States
A herbivore has a number of states that it may exist in, and each state defines a set of actions which a herbivore instance may perform.

The following is a brief description of the four valid states a herbivore may be in:


Figure 1.3 is a diagrammatic representation of the state lifecycle of a herbivore, depicting valid state transitions.


Figure 1.3 - Shows the lifecycle states of a herbivore agent and the triggers for each state transition

Herbivore Feeding Problem
Herbivores must eat plants to survive. The problem is not all plants are edible; some are poisonous which injure the herbivore when eaten. Herbivores cannot reproduce until they mature, but when they reproduce they pass a snapshot of their instincts onto their offspring, giving them the best start in life they can. Obviously only herbivores that are able to select enough good food to live until they reproduce will continue the species, providing the interesting situation of Darwinism (survival of the fittest or survival of the best classifiers). The combined result of this behaviour is that over a number of generations it is expected that the amount of accumulated knowledge regarding good food selection would continually increase.

Herbivores are asexual meaning they reproduce by themselves and are not required to mate. This means genetic operators cannot be used to combine genetic material of parents, so how is change introduced into the species? What is to stop herbivores continually refining their behaviour and passing it on to their children until they are near-perfect at selecting food? What affect would this have when all the "good" food is exhausted?

We have seen that plants require the herbivore to act as a fitness evaluation mechanism, selecting which plants may reproduce. We have also seen that the plants require a fitness evaluation that is flexible enough to not always make selections based on the same plant characteristics, in an attempt at maintain a level of plant diversity. Now we have seen that herbivores also require this flexibility. Both agent species have common requirements for the same functionality, so how can these problems be addressed?

Enter the Perceptron
Effectively, the plant attributes most relevant to the herbivores immediate needs (what can I eat right now) are the poison genes.

The plants poison genes are floating point values which must be above a threshold to be active. To make a plant poisonous for its lifetime, two of the threes genes must be active. This problem can be characterized by the following propositional rules:

(P1 ^ P2) v (P1 ^ P3) v (P2 ^ P3) => Not Edible
¬(P1 ^ P2) ^ ¬(P1 ^ P3) ^ ¬(P2 ^ P3) => Edible
Those not familiar with logic connectives: (^ is AND, v is OR, ¬ is NOT, => is Implies).

Below is the same problem expressed as a binary truth table:

Poison1

Poison2

Poison3

Edible

1

1

1

No

1

0

1

No

1

1

0

No

1

0

0

Yes

0

1

1

No

0

0

1

Yes

0

1

0

Yes

0

0

0

Yes

Table 1.0 - all possible combinations of plant poison genes and their affect on a plants edibility, one indicates active gene, and zero indicates inactive gene.


The herbivore is required to learn this classification problem, but retain the flexibility of not learning it perfectly. Once learned, the herbivore is required to apply the classification to unseen cases in the environment and predict whether or not the plants are edible. An effective classification and prediction algorithm suited to this problem is the Perceptron learning rule.

Why Perceptron
The Perceptron is a learning algorithm catered to learning problems with a small number of binary inputs and a single binary output. Basically a Perceptron is a single neuron where inputs are weighted. The sum of the inputs passes through a linear transfer function, determining whether or not the neuron will fire (output one) or not fire (output zero).



Figure 1.4 - Example Perceptron showing plant input variables and classification output

The algorithm has an advantage over other learning algorithms of similar nature which is its speed and simplicity, both welcome features in a real-time environment. A highly publicized weakness of this algorithm is its inability to learn problems which are not linearly separable, forever demonstrated by the XOR problem. Non-linearly separable problems are those in which outputs of the training cases cannot be separated by a straight line or hyperplane. Fortunately the plant poison problem is linearly separable.

For further understanding of this technique and its strengths and weaknesses please refer to [11], [12] and [13]. For a broader understanding of neural networks see [14] and [15].

Memory and Learning
The Perceptron is a supervised learning technique meaning that it learns by being exposed to samples from the problem domain along with expected outputs. The algorithm then trains itself to produce similar outputs from the same inputs. The problem now becomes; where do herbivores get example patterns from? To solve this a herbivore memory management system was designed to allowed herbivores to remember plants they eat from and the result of their eating (obtained sustenance or was poisoned).


Figure 1.5 - Shows how the memory system and Perceptron is used by the herbivore

The herbivore’s goal selection subsystem can be divided into three main sections:

  1. Memory storage
  2. Training from memory
  3. Plant Classification


Memory Storage
When a herbivore selects a food goal, it makes a copy of the plant’s details and stores it in its state. These details are


If the herbivore manages to navigate to its selected food goal, it will take a number of bites from the plant. Each bite will create a new memory. A memory consists of the above details of the current food goal combined with the result of taking a bite (gain or lose health).

Memories are maintained in a queue data structure where new memories are added to the end of the queue, possibly popping old memories from the front of the queue. This simulates the concept of herbivores remembering a handful of food goals and forgetting food goals from a long time ago. If the herbivore already has a memory for the plant the memory is updated, otherwise a new memory is created. This is not required functionality however it stops the generation of too many similar memories.

Training from Memory
This module is responsible for using the remembered goal selections and results to train the herbivore’s perceptron. Basically the memory is knowledge collected by the herbivore through experience in the simulation. Application of this knowledge is provided by teaching the perceptron structure to recognize good and bad food goals based on the value of a plants poison genes. Training is performed on the creation or update of each memory.

Plant selection - flexible food selection
When a herbivore is hungry it evaluates its surroundings and constructs a list of all plants near by. Each plant is then evaluated by the herbivores Perceptron in a read-only fashion. The Perceptron will classify one or more plants as good food goals and the herbivore will randomly select one of these good food goals to eat from. If the Perceptron fails to classify any food goals as edible a random plant is selected to eat from the list.

Training Time and Training Patterns
To get an idea of the poison plant problem and the Perceptrons ability to learn it, I performed a number of experiments using the Perceptron learning algorithm external to the proposed system. The goal was to gain an understanding into how difficult the problem was in terms of the number of training epochs and the number of patterns required to learn and generalize the problem.

This information was determined to be critical to the success of the Perceptron implementation. A herbivore can only capture so many patterns during its lifetime, and only so many of those patterns will be unique. The number of training epochs must be minimized. Large numbers of training iterations means that too much processor time is allocated to a single task, which may cause pauses in the simulation and restrict the maximum number of agents the ecosystem can support.

The following table is a listing of results averaged over 10 tests, each test stopping when all training patterns were learned successfully.

% of Training / Test Patterns Epochs Average % Correct (Test Patterns) Average % Correct (All Patterns)
25% / 75% 1.1 48.33% 61.25%
50% / 50% 4 57.50% 78.75%
75% / 25% 5 60% 90%
100% / 100% 5.4 100% 100%


The table shows that on average, 6 epochs are enough to generalize the problem sufficiently to classify approximately 50% of the unseen patterns correctly when using from between 25% to 75% of the possible unique gene combinations as training data.

It is expected the herbivore would be able to collect roughly 10 patterns during its lifetime and that at least half of those collected patters would be unique, providing the herbivore with an estimated 60% chance of classifying unseen plants correctly and a 90% chance of classifying any given plant correctly. These estimations are the best case scenarios. A herbivore that is able to only capture 2 unique training samples has a 60% an estimated chance of classifying any given plant correctly.

None of these estimations take the herbivores ability to inherit learned behaviour into account, which is expected to give the herbivore a jumpstart on good classifications and possibly inherit features which need to be unlearned.

Safe values for the max epochs and herbivore memory size were selected as follows:


The Ecosystem
An environment was constructed, specifically designed to test the ecosystem. I’m no artist so it’s laughably simple. It contains three points from which an initial population can be started. The level provides a high ceiling to allow seeds to power up into the air, and an observation platform is provided for the user to oversee the action.


Figure 1.6 - a screenshot taken of the test environment showing the central island structure and the observation tower.

The agents were implemented using game models and textures from the Quake II platform. A plant is represented as a round ball. Upon reaching maturity they transform into a small cylindrical shape. Poisonous plants are represented by a purple shell effect.

Herbivores have the appearance of a mechanical creature which runs around on all fours. When feeding from a plant they extend a tongue / tentacle. The herbivores current feed and roam goals are represented by a green line extending from the herbivore to its goal. Healthy herbivores are coated in a green shell, sick herbivores are shown by a red shell.


Figure 1.7 - Shows a number of plant’s in their seed state. Poisonous seeds are represented in purple. The scene contains two herbivores, one of which is sick; designated by a red shell.

Both the herbivores and plants have a lot of constants which are used to manage and manipulate various behaviours. These are stored in a configuration file called "agents.ini". Locating suitable values for these parameters is where half the fun lies in tweaking the simulation, and due to the flexibility of the system it is possible for one askew value to have a dramatic impact on the way the simulation progresses.

Ecosystem Observations
This section is devoted to the meat of this experiment - the observations. By far the best part of a project like this for me is in the tweak-and-observe cycle which I can get lost in for hours at a time. Small changes are all that is usually required to produce very different effects in things like food selection, population migration and diversity. The following are a number of small interesting situations observed during the evaluation of my many tweaks:


Figure 1.8 - Shows a dense population of plants and feeding herbivores.

Proposed Feature Enhancements
The program is not perfect. It was constructed over a short period of time and there are number of features and feature enhancements that could make for a more enjoyable experience. This section mentions a handful of features I’d have liked to see implemented:

Next Step: Learning in real time environments
Though this simulation was a trivial example it did show an example of how to achieve learning in a real-time environment and some of the issues involved such as pattern diversity and population diversity. Though the program is simple, the effectiveness of learning was demonstrated.

A next easy step could be to expand the current application. Allow herbivores to learn more about their food, parameters such as:

Plants could also benefit from encoding more relevant parameters such as:

The area that I find most fascinating is determining how much of an agents’ state machine can be replaced or intergraded with connectionist structures. The herbivore has only its goal classification managed by a single Perceptron, and proved to be most effective. It would not be too difficult a task to replace other subsystems and decision making function with small multiplayer networks. From there the very triggers between states could again be replaced by small networks allowing for learned response to stimuli.

Interesting, if not unoriginal ideas [16], but by using the concepts demonstrated in this article, they could be most achievable. The specific problems need to be identified, abstracted and solved effectively. This was shown with goal selection by testing the chosen structure externally from the application to determine an optimal number of training patterns to learn the problem and the number of iterations required to reproduce those training patterns.

Conclusion
We have covered a lot of ground from the design of the agents state to the implementation of learning algorithms, and finally to the system observations and ideas for the future. A simple society was constructed with relatively small population sizes compared to well known alife simulations [17]. Intuitive concepts for the managing of learning in a real-time environment were demonstrated through practical design and implementation of both a Genetic Algorithm and Perceptron.

Those readers with the computer game Quake II can download the program and all its source code from [18]. I’m interested to see what kind of wacky simulations readers are able to generate and the parameters which sparked them. For those with programming background you’ll find the source code commented like a novel, with a reasonable design, decoupling systems such as the herbivore from its goal selection, goal selection from its perception and the plants from the DNA manipulation. This will allow easy replacement of subsystems with little effect on existing code.

Download

Bibliography
[1] Wator, Not mine, but a great example applet, though I recommend reducing size to 5500: Link
[2] Craig Reynolds Boids: Link
[3] Quake II Computer Game: Link
[4] id Software: Link
[5] Quake II Computer Game Source Code: Link
[6] GNU General Public License (GPL): Link
[7] Inside 3D tutorials for Quake II: Link
[8] The Quake Developers Library Site Offline Tutorials: Link
[9] Open source Quake Development: Link
[10] Genetic Algorithms FAQ: Link
[11] Generation5.com, Perceptrons, Link
[12] Muchele D. Estebon, Perceptrons: An Associative Learning Network, 1997, Link
[13] Omri Weisman & Ziv Pollack, The Perceptron, 1995, Link
[14] Neural-nets Usenet News FAQ: Link
[15] AI-Depot.com, Neural Network Warehouse, Link
[16] [Finite State Machines and neural nets: the inception], McCulloch and Pitts’ neural logical calculus: Link
[17] Seeing around corner, Jonathan Ranch the Atlantic Online, April 2002, Link

 

 

Home