Data Mining Mehmed Kantardzic (good english books to read .txt) đź“–
- Author: Mehmed Kantardzic
Book online «Data Mining Mehmed Kantardzic (good english books to read .txt) 📖». Author Mehmed Kantardzic
The other steps of a GA applied to the TSP are unchanged. A GA based on the above operator outperforms a random search for TSP, but still leaves much room for improvement. Typical results from the algorithm, when applied to 100 randomly generated cities gave after 20,000 generations, a value of the whole tour 9.4% above minimum.
13.6 MACHINE LEARNING USING GAs
Optimization problems are one of the most common application categories of GAs. In general, an optimization problem attempts to determine a solution that, for example, maximizes the profit in an organization or minimizes the cost of production by determining values for selected features of the production process. Another typical area where GAs are applied is the discovery of input-to-output mapping for a given, usually complex, system, which is the type of problem that all machine-learning algorithms are trying to solve.
The basic idea of input-to-output mapping is to come up with an appropriate form of a function or a model, which is typically simpler than the mapping given originally—usually represented through a set of input–output samples. We believe that a function best describes this mapping. Measures of the term “best” depend on the specific application. Common measures are accuracy of the function, its robustness, and its computational efficiency. Generally, determining a function that satisfies all these criteria is not necessarily an easy task; therefore, a GA determines a “good” function, which can then be used successfully in applications such as pattern classification, control, and prediction. The process of mapping may be automated, and this automation, using GA technology, represents another approach to the formation of a model for inductive machine learning.
In the previous chapters of this book we described different algorithms for machine learning. Developing a new model (input-to-output abstract relation) based on some given set of samples is an idea that can also be implemented in the domain of GAs. There are several approaches to GA-based learning. We will explain the principles of the technique that is based on schemata and the possibilities of its application to classification problems.
Let us consider a simple database, which will be used as an example throughout this section. Suppose that the training or learning data set is described with a set of attributes where each attribute has its categorical range: a set of possible values. These attributes are given in Table 13.4.
TABLE 13.4. Attributes Ai with Possible Values for a Given Data Set sAttributesValuesA1x, y, zA2x, y, zA3y, nA4m, n, pA5r, s, t, uA6y, n
The description of a classification model with two classes of samples, C1 and C2, can be represented in an if-then form where on the left side is a Boolean expression of the input features’ values and on the right side its corresponding class:
These classification rules or classifiers can be represented in a more general way: as some strings over a given alphabet. For our data set with six inputs and one output, each classifier has the form
where pi denotes the value of the ith attribute (1 ≤ i ≤ 6) for the domains described in Table 13.4, and d is one of two classes. To describe the classification rules in a given form, it is necessary to include the “don’t care” symbol “*” into the set of values for each attribute. For example, the new set of values for attribute A1 is {x, y, z, *}. Similar extensions are given for other attributes. The rules that were given earlier for classes C1 and C2 can be decomposed to the segments under conjunction (AND logical operation) and expressed as
To simplify the example, we assume that the system has to classify into only two classes: C1 and notC1. Any system can be easily generalized to handle multiple classes (multiple classification). For a simple classification with a single rule C1 we can accept only two values for d: d = 1 (member of the class C1) and d = 0 (not a member of the class C1).
Let us assume that at some stage of the learning process, there is a small and randomly generated population of classifiers Q in the system, where each classifier is given with its strength s:
Strengths si are parameters that are computed based on the available training data set. They show the fitness of a rule to the training data set, and they are proportional to the percentage of the data set supported by the rule.
The basic iterative steps of a GA, including corresponding operators, are applied here to optimize the set of rules with respect to the fitness function of the rules to training data set. The operators used in this learning technique are, again, mutation and crossover. However, some modifications are necessary for mutation. Let us consider the first attribute A1 with its domain {x, y, z, *}. Thus, when mutation is called, we would change the mutated character (code) to one of the other three values that have equal probability. The strength of the offspring is usually the same as that of its parents. For example, if mutation is the rule Q3 on the randomly selected position 2, replacing the value y with a randomly selected value *, the new mutated classifier will be
The crossover does not require any modification. We take advantage of the fact that all classifiers are of equal length. Therefore, to crossover two selected parents, say Q1 and Q2
we generate a random crossover-position point. Suppose that we crossover after the third character in the string as marked; then the offspring are
The strength of the offspring is an average (possibly weighted) of the parents’ strengths. Now the system is ready to continue its learning process: starting another cycle, accepting further positive and negative samples from the training set, and modifying the strengths of classifiers as a measure of fitness. We can note that the training
Comments (0)