Data Mining Mehmed Kantardzic (good english books to read .txt) 📖
- Author: Mehmed Kantardzic
Visual techniques that preserve some characteristics of the data set can be invaluable for obtaining good separators in a clustering process. In contrast to dimension-reduction approaches such as principal component analysis (PCA), this visual approach does not require that a single projection preserve all clusters. In some projections, clusters may overlap and therefore not be distinguishable, as in projection A in Figure 15.10. The algorithm only needs projections that separate the data set into at least two subsets without dividing any clusters. The subsets may then be refined using other projections and possibly partitioned further based on separators in those projections. Based on the visual representation of the projections, it is possible to find clusters with unexpected characteristics (shapes, dependencies) that would be very difficult or impossible to find by tuning the parameter settings of automatic-clustering algorithms.
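The idea of splitting a data set with one projection and refining the subsets with another can be illustrated with a small sketch. This is not the algorithm from the text: the analyst's visual judgment is replaced here by a simple "widest empty gap" rule, only axis-parallel projections are tried, and the names widest_gap, split_by_projections, and min_gap are hypothetical choices made for the illustration.

```python
import numpy as np

def widest_gap(values):
    """Return (separator, gap_width) for the largest empty interval
    between consecutive values of a 1-D projection."""
    v = np.sort(values)
    gaps = np.diff(v)
    i = int(np.argmax(gaps))
    return (v[i] + v[i + 1]) / 2.0, float(gaps[i])

def split_by_projections(points, min_gap=2.0):
    """Recursively partition points with axis-parallel projections.

    A projection is accepted as a separator only if its widest gap is at
    least min_gap (an assumed threshold standing in for visual judgment).
    Returns a list of index arrays, one per subset found.
    """
    def recurse(idx):
        if len(idx) < 2:
            return [idx]
        best = None
        for axis in range(points.shape[1]):
            sep, gap = widest_gap(points[idx, axis])
            if gap >= min_gap and (best is None or gap > best[2]):
                best = (axis, sep, gap)
        if best is None:
            # No projection separates this subset any further.
            return [idx]
        axis, sep, _ = best
        left = idx[points[idx, axis] < sep]
        right = idx[points[idx, axis] >= sep]
        return recurse(left) + recurse(right)

    return recurse(np.arange(len(points)))

# Synthetic data (made up for this sketch): three compact clusters.
# Projected on the x-axis, the first two overlap; projected on the
# y-axis, the first and third overlap. No single axis shows all three.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [0.0, 5.0], [6.0, 0.0]])
points = np.vstack([c + rng.uniform(-0.5, 0.5, size=(50, 2)) for c in centers])

subsets = split_by_projections(points)
print([len(s) for s in subsets])
```

Neither axis alone separates all three clusters, yet two successive splits recover them, which mirrors the point that a single projection does not have to preserve every cluster.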
In general, model visualization and exploratory data analysis (EDA) are data-mining tasks in which visualization techniques have played a major role. Model visualization is the process of using visual techniques to make the discovered knowledge understandable and interpretable by humans. Techniques range from simple scatter plots and histograms to sophisticated multidimensional visualizations and animations. These visualization techniques are used not only to make mining results more understandable to end users, but also to help them understand how the algorithm works. EDA, on the other hand, is the interactive exploration of usually graphical representations of a data set without heavy dependence on preconceived assumptions and models, thus attempting to identify interesting and previously unknown patterns. Visual data-exploration techniques are designed to take advantage of the powerful visual capabilities of human beings. They can support users in formulating hypotheses about the data that may be useful in further stages of the mining process.
15.7 REVIEW QUESTIONS AND PROBLEMS
1. Explain the power of n-dimensional visualization as a data-mining technique. What are the phases of data mining supported by data visualization?
2. What fundamental experiences in human perception should be built into effective visualization tools?
3. Discuss the differences between scientific visualization and information visualization.
4. The following is the data set X:
Although the following visualization techniques are not explained in enough detail in this book, use your knowledge from earlier studies of statistics and other courses to create 2-D presentations.
(a) Show a bar chart for the variable A.
(b) Show a histogram for the variable B.
(c) Show a line chart for the variable B.
(d) Show a pie chart for the variable A.
(e) Show a scatter plot for A and B variables.
5. Explain the concept of a data cube and where it is used for visualization of large data sets.
6. Use examples to discuss the differences between icon-based and pixel-oriented visualization techniques.
7. Given 7-D samples
(a) make a graphical representation of samples using the parallel-coordinates technique;
(b) are there any outliers in the given data set?
8. Derive formulas for radial visualization of
(a) 3-D samples
(b) 8-D samples
(c) using the formulas derived in (a) represent samples (2, 8, 3) and (8, 0, 0).
(d) using the formulas derived in (b) represent samples (2, 8, 3, 0, 7, 0, 0, 0) and (8, 8, 0, 0, 0, 0, 0, 0).
9. Implement a software tool supporting a radial-visualization technique.
10. Explain the requirements for full visual discovery in advanced visualization tools.
11. Search the Web to find the basic characteristics of publicly available or commercial software tools for visualization of n-dimensional samples. Document the results of your search.
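As a possible starting point for Problems 8 and 9, the following is a minimal sketch of one common formulation of radial visualization, not necessarily the exact formulas intended by the text. It assumes that the n dimensional anchors are spaced equally on the unit circle with the first anchor at angle 0, that all sample values are non-negative and used without further normalization, and that an all-zero sample is placed at the center. The function name radviz_point is hypothetical.

```python
import math

def radviz_point(sample):
    """Map an n-D sample with non-negative values to a 2-D point.

    Anchor j is assumed to sit at angle 2*pi*j/n on the unit circle.
    The point is the weighted mean of the anchors, with the sample
    values as weights (the equilibrium of the spring model).
    """
    n = len(sample)
    total = sum(sample)
    if total == 0:
        # Assumed convention: an all-zero sample goes to the center.
        return (0.0, 0.0)
    x = sum(v * math.cos(2 * math.pi * j / n) for j, v in enumerate(sample)) / total
    y = sum(v * math.sin(2 * math.pi * j / n) for j, v in enumerate(sample)) / total
    return (x, y)

# The samples listed in Problem 8, parts (c) and (d).
samples = [
    (2, 8, 3),
    (8, 0, 0),
    (2, 8, 3, 0, 7, 0, 0, 0),
    (8, 8, 0, 0, 0, 0, 0, 0),
]
for s in samples:
    x, y = radviz_point(s)
    print(s, "->", (round(x, 4), round(y, 4)))
```

Under these assumptions, a sample with a single nonzero value lands exactly on that dimension's anchor, and a sample with equal values in all dimensions lands at the center. A different anchor placement or a per-dimension normalization would change the resulting coordinates.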
15.8 REFERENCES FOR FURTHER STUDY
Draper, G. M., L. Y. Livnat, R. F. Riesenfeld, A Survey of Radial Methods for Information Visualization, IEEE Transactions on Visualization and Computer Graphics, Vol. 15, No. 5, 2009, pp. 759–776.
Radial visualization, or the practice of displaying data in a circular or elliptical pattern, is an increasingly common technique in information visualization research. In spite of its prevalence, little work has been done to study this visualization paradigm as a methodology in its own right. We provide a historical review of radial visualization, tracing it to its roots in centuries-old statistical graphics. We then identify the types of problem domains to which modern radial visualization techniques have been applied. A taxonomy for radial visualization is proposed in the form of seven design patterns encompassing nearly all recent works in this area. From an analysis of these patterns, we distill a series of design considerations that system builders can use to create new visualizations that address aspects of the design space that have not yet been explored. It is hoped that our taxonomy will provide a framework for facilitating discourse among researchers and stimulate the development of additional theories and systems involving radial visualization as a distinct design metaphor.
Fayyad, U., G. G. Grinstein, A. Wierse, Information Visualization in Data Mining and Knowledge Discovery, Morgan Kaufmann, San Francisco, CA, 2002.
Leading researchers from the fields of data mining, data visualization, and statistics present findings organized around topics introduced in two recent international knowledge-discovery and data-mining workshops. The book introduces the concepts and components of visualization, details current efforts to include visualization and user interaction in data mining, and explores the potential for further synthesis of data-mining algorithms and data-visualization techniques.
Ferreira de Oliveira, M. C., H. Levkowitz, From Visual Data Exploration to Visual Data Mining: A Survey, IEEE Transactions on Visualization and Computer Graphics, Vol. 9, No. 3, 2003, pp. 378–394.
The authors survey work on the different uses of graphical mapping and interaction techniques for visual data mining of large data sets represented as table data. Basic terminology related to data mining, data sets, and visualization is introduced. Previous work on information visualization is reviewed in light of different categorizations of techniques and systems. The role of interaction techniques is discussed, in addition to work addressing the question of selecting and evaluating visualization techniques. We review some representative work on the use of information visualization techniques (IVT) in the context of mining data. This includes both visual-data exploration and visually expressing the outcome of specific mining algorithms. We also review recent innovative approaches that attempt to integrate visualization into the DM/KDD process, using it to enhance user interaction and comprehension.
Gallaghar, R. S., Computer Visualization: