The goal of the classifier system is to maximize gain based on exposure to stimuli from a problem-specific environment. This is achieved by managing the assignment of credit to rules that prove useful and by searching for new rules and new variations on existing rules using an evolutionary process.
Actors in the classifier system include detectors, messages, effectors, feedback, and classifiers. Detectors are used by the system to perceive the state of the environment. Messages are the discrete packets of information passed from the detectors into the system. The system processes the information in messages, and messages can directly lead to actions in the environment.
Effectors control the system's actions on and within the environment. In addition to actively perceiving through its detectors, the system can also receive directed feedback from the environment (gain). Classifiers are condition-action rules that act as a filter for messages. If a message satisfies the condition part of a classifier, the classifier's action fires. Rules therefore act as message processors. A message is a bit string of fixed length.
A classifier is defined as a ternary string over the alphabet {1, 0, #}, where # is a "don't care" symbol that matches either 1 or 0.
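The matching rule for ternary conditions can be illustrated with a minimal sketch. The function name `matches` and the use of Python strings to represent messages and conditions are assumptions made for illustration; the text does not prescribe a representation.

```python
def matches(condition: str, message: str) -> bool:
    """Return True if a ternary condition matches a fixed-length bit string.

    Assumption: conditions use the alphabet {'0', '1', '#'} and messages
    use {'0', '1'}, both stored as plain strings of equal length.
    """
    if len(condition) != len(message):
        raise ValueError("condition and message must have the same length")
    # '#' matches either bit; any other symbol must equal the message bit.
    return all(c == "#" or c == m for c, m in zip(condition, message))
```

For example, `matches("1#0", "110")` and `matches("1#0", "100")` both return `True`, while `matches("1#0", "111")` returns `False` because the last position is specified as 0.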
The system processing loop is as follows:
- Environment messages are placed in the message list.
- The conditions for each classifier are checked to see if they are met by at least one message in the message list.
- All satisfied classifiers participate in a competition; the winners post their actions to the message list.
- All messages directed to effectors are executed (causing actions in the environment).
- All messages in the message list of the previous cycle are deleted (messages persist for only one cycle).
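One cycle of the loop above might be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: the `Classifier` fields, the `strength` value used to decide the competition, the `n_winners` parameter, and the choice to treat every posted message as directed to an effector are all hypothetical, since the text does not say how the competition is resolved.

```python
from dataclasses import dataclass


@dataclass
class Classifier:
    condition: str   # ternary string over {'0', '1', '#'}
    action: str      # message posted if the classifier wins
    strength: float  # hypothetical score used to rank competitors


def matches(condition: str, message: str) -> bool:
    """'#' matches either bit; other symbols must equal the message bit."""
    return len(condition) == len(message) and all(
        c == "#" or c == m for c, m in zip(condition, message)
    )


def run_cycle(env_messages, classifiers, n_winners=1):
    """Run one processing cycle and return (effector_actions, next_messages)."""
    # Step 1: environment messages are placed on the message list.
    message_list = list(env_messages)
    # Step 2: a classifier is satisfied if at least one message matches it.
    satisfied = [
        c for c in classifiers
        if any(matches(c.condition, m) for m in message_list)
    ]
    # Step 3: competition (assumed here: the highest strength wins).
    winners = sorted(satisfied, key=lambda c: c.strength, reverse=True)
    posted = [c.action for c in winners[:n_winners]]
    # Step 4: posted messages are handed to the effectors (assumed: all).
    effector_actions = list(posted)
    # Step 5: messages from this cycle are dropped; only the newly posted
    # messages carry over to the next cycle.
    return effector_actions, posted
```

A real system would typically use a stochastic or bid-based competition and would distinguish internal messages from effector messages; the deterministic ranking here only keeps the sketch short.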
Classifier systems are suited to problems with the following characteristics: perpetually novel events with significant noise, continuous real-time demands for action, implicitly or inexactly defined goals, and sparse gains or reinforcement that can only be obtained through long sequences of tasks.
The learning rate for the expected gain, error, and fitness of a classifier is usually in the range [0.1, 0.2]. The frequency of running the genetic algorithm should be in the range [25, 50]. The discount factor used in multi-step problems is usually around 0.71. The minimum error below which classifiers are considered to have equal accuracy is usually 10% of the maximum reward. The probability of crossover in the genetic algorithm is generally in the range [0.5, 1.0]. The probability of mutating a single position of a classifier in the genetic algorithm is usually in the range [0.01, 0.05].
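To make the roles of the learning rate and the discount factor concrete, the sketch below uses a simple incremental (delta-rule) update. The update form itself, the function names, and the symbols `beta` and `gamma` are assumptions chosen for illustration; the text gives only the typical parameter ranges, not the update equations.

```python
def update_estimate(current: float, target: float, beta: float = 0.2) -> float:
    """Move an estimate a fraction `beta` of the way toward a target.

    Assumption: a plain delta rule; `beta` plays the role of the learning
    rate, typically chosen in the range [0.1, 0.2].
    """
    return current + beta * (target - current)


def discounted_target(reward: float, best_next: float, gamma: float = 0.71) -> float:
    """Target for a multi-step problem: immediate reward plus the discounted
    best estimate for the following step (`gamma` is the discount factor)."""
    return reward + gamma * best_next


# Example: an expected gain that starts near zero and is updated
# toward a discounted target.
target = discounted_target(reward=0.0, best_next=1000.0)
expected_gain = update_estimate(0.0, target, beta=0.2)
```

With a small learning rate the estimate changes slowly and averages over noisy feedback; a larger value reacts faster but is less stable.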
The experience threshold for classifier deletion is usually around 20. The experience threshold for a classifier to be used in subsumption is usually around 20. The initial values for the expected gain, error, and fitness of a classifier are generally small and close to zero. The probability of selecting a random action for exploration purposes is usually close to 0.5. The minimum number of different actions that must be represented in a match set is usually the total number of possible actions in the environment for the input.
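The exploration probability can be illustrated with a short sketch of action selection. The function name `select_action`, the `predictions` dictionary mapping each action to its estimated gain, and the greedy choice in the non-exploring case are assumptions for illustration only.

```python
import random


def select_action(predictions, p_explore=0.5, rng=random):
    """Pick an action from a dict mapping action -> estimated gain.

    With probability `p_explore` a random action is chosen (exploration);
    otherwise the action with the highest estimate is chosen (assumed
    greedy exploitation). `rng` may be the `random` module or a
    `random.Random` instance for reproducible runs.
    """
    if rng.random() < p_explore:
        # Sort the keys so the choice is reproducible for a seeded rng.
        return rng.choice(sorted(predictions))
    return max(predictions, key=predictions.get)
```

Passing a seeded generator, for example `select_action(preds, rng=random.Random(1))`, makes experiments repeatable.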
Subsumption should be used on problem domains that contain well-defined rules for mapping inputs to outputs.