
Data Mining. Author: Mehmed Kantardzic



unique way. Consequently, the individual clusters are separated by sharp boundaries. In practice, such boundaries are often not very natural and can even be counterintuitive. The boundary of a single cluster and the transition between different clusters are usually “smooth” rather than abrupt. This is the main motivation underlying fuzzy extensions to clustering algorithms. In fuzzy clustering an object may belong to different clusters at the same time, at least to some extent, and the degree to which it belongs to a particular cluster is expressed in terms of a membership degree.

The most frequent application of fuzzy set theory in data mining is related to the adaptation of rule-based predictive models. This is hardly surprising, since rule-based models have always been a cornerstone of fuzzy systems and a central aspect of research in the field. A set of fuzzy rules can represent both classification and regression models. Instead of dividing quantitative attributes into fixed intervals, fuzzy rules employ linguistic terms to represent the revealed regularities. Therefore, no user-supplied thresholds are required, and quantitative values can be directly inferred from the rules. The linguistic representation leads to the discovery of natural and more understandable rules.

Decision tree induction includes well-known algorithms such as ID3, C4.5, C5.0, and Classification and Regression Trees (CART). Fuzzy variants of decision-tree induction have been under development for quite a while and remain a topic of interest today. In fact, these approaches provide a typical example of the “fuzzification” of standard predictive methods. In the case of decision trees, it is primarily the “crisp” thresholds used for defining splitting attributes at inner nodes, such as size > 181, that are fuzzified. Such thresholds lead to hard decision boundaries in the input space, which means that a slight variation of an attribute (e.g., size = 182 instead of size = 181) can entail a completely different classification of a sample. Usually, a decision in favor of one particular class label has to be made, even if the sample under consideration seems to have partial membership in several classes simultaneously. Moreover, the learning process becomes unstable in the sense that a slight variation of the training samples can change the induced decision tree drastically. In order to make the decision boundaries “soft,” an obvious idea is to apply fuzzy predicates at the nodes of a decision tree, for example, size = LARGE, where LARGE is a fuzzy set. In that case the sample is not assigned to exactly one successor node in a unique way, but perhaps to several successors, each with a certain degree. Also, for fuzzy classification solutions the consequent of a single rule is usually a class assignment represented by a singleton fuzzy set. Evaluating a rule-based model thus becomes trivial and simply amounts to “maximum matching,” that is, searching for the maximally supporting rule for each class.
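The contrast between a crisp threshold and a fuzzy predicate can be made concrete with a small sketch. The threshold 181 comes from the text; the shape of the fuzzy set LARGE (a linear ramp from 170 to 190), the use of the standard complement 1 − μ for the other branch, and all function names are assumptions chosen only for illustration.

```python
def crisp_large(size, threshold=181):
    """Crisp predicate size > threshold: membership is either 0 or 1."""
    return 1.0 if size > threshold else 0.0


def fuzzy_large(size, low=170.0, high=190.0):
    """Fuzzy predicate size = LARGE.

    Assumption: LARGE is a linear ramp, 0 below `low`, 1 above `high`.
    """
    if size <= low:
        return 0.0
    if size >= high:
        return 1.0
    return (size - low) / (high - low)


def soft_split(size):
    """Degrees with which a sample is passed to each successor node.

    Assumption: the second branch uses the standard fuzzy complement.
    """
    mu = fuzzy_large(size)
    return {"LARGE": mu, "NOT_LARGE": 1.0 - mu}


for size in (181, 182):
    print(size, crisp_large(size), soft_split(size))
```

With the crisp predicate, moving from size = 181 to size = 182 flips the branch entirely. With the fuzzy predicate the degree for LARGE changes only slightly, and the sample is propagated to both successors with complementary degrees.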

A particularly important trend in the field of fuzzy systems is the development of hybrid methods that combine fuzzy-set theory with other methodologies such as neural networks. In neuro-fuzzy methods the main idea is to encode a fuzzy system in a neural network, and to apply standard approaches such as backpropagation to train that network. In this way, neuro-fuzzy systems combine the representational advantages of fuzzy systems with the flexibility and adaptivity of neural networks. Interpretations of fuzzy membership include similarity, preference, and uncertainty. A primary motivation for fuzzy sets was to provide an interface between a numerical scale and a symbolic scale that is usually composed of linguistic terms. Thus, fuzzy sets have the capability to interface quantitative data with qualitative knowledge structures expressed in terms of natural language. In general, due to their closeness to human reasoning, solutions obtained using fuzzy approaches are easy to understand and to apply. This provides the user with comprehensive information, and often with a data summarization, for grasping the essence of what has been discovered from a large amount of information in a complex system.

14.8 REVIEW QUESTIONS AND PROBLEMS

1. Find some examples of fuzzy variables in daily life.

2. Show graphically and explain why the law of contradiction is violated in fuzzy-set theory.

3. The MF of a fuzzy set is defined as

(a) What will be a linguistic description of the fuzzy set A if x is the variable “age” in years?

(b) Give an analytical description for μB(x) if B is a fuzzy set “age is close to 60 years.”

4. Assume you were told that the room temperature is around 70 degrees Fahrenheit. How would you represent this information?

(a) by a set notation,

(b) by a fuzzy set notation.

5. Consider the fuzzy sets A, B, and C defined on the interval x = [0, 10] with corresponding μ functions:

Determine analytically and graphically:

(a) A′ and B′

(b) A ∪ C and A ∪ B

(c) A ∩ C and A ∩ B

(d) A ∪ B ∪ C

(e) A ∩ C′

(f) Calculate the α-cuts for A, B, and C if α = 0.2, α = 0.5, and α = 1.

6. Consider two fuzzy sets with triangular MFs A(x, 1, 2, 3) and B(x, 2, 2, 4). Find their intersection and union graphically, and express them analytically using the min and max operators.

7. If X = {3, 4, 5} and Y = {3, 4, 5, 6, 7}, and the binary fuzzy relation R = “Y is much greater than X” is defined by the analytical MF

what will be the corresponding relation matrix of R (for all discrete X and Y values)?

8. Apply the extension principle to the fuzzy set

where the mapping function f(x) = x² − 3.

(a) What is the resulting image B where B = f(A)?

(b) Sketch this transformation graphically.

9. Assume that the proposition “if x is A then y is B” is given where A and B are fuzzy sets:

Given a fact expressed by the proposition “x is A*,” where

derive the conclusion in the form “y is B*” using the generalized modus ponens inference rule.

10. Solve Problem number 9 by using

11. The test scores for the three students are given in the following table:
