Introduction to Data Mining

Product category
Technical sciences » Computer science
ISBN
978-83-66364-25-7
Format
B5
Binding
Softcover
Number of pages
364
Publication date
2019
Description

The monograph examines how algorithms, mainly from the field of data mining, can be applied to information processing, which is currently the most important method of acquiring knowledge. The attached CD contains programs and training materials that complement selected chapters.



Contents

Introduction
Preface
1. Least-squares method
1.1. Ordinary least squares method
1.2. Linearization in the least squares method
1.3. Examples in Python
2. Principal Component Analysis
2.1. The main idea of PCA
2.2. An iterative scheme for computing PCA
2.3. Examples in Python
2.4. Optimal transition from the RGB model to an optimal three-component model
2.5. Fisher linear discriminant analysis
2.6. Multidimensional discriminant analysis (MDA)
3. Application of fuzzy logic in Data Mining
3.1. What is fuzzy thinking?
3.2. Fuzzy sets
3.3. Linguistic variables and linguistic gain
3.4. Operations on fuzzy sets
3.5. Properties of operations on fuzzy sets
3.6. Fuzzy inference rules
3.7. Defuzzification
3.8. The choice of alternatives using fuzzy inference rules
3.9. Ranking alternatives based on a heuristic approach
3.10. Fuzzy decision trees
4. Soft computing in data handling
4.1. Introduction to soft computing
4.2. Evolutionary calculations
4.2.1. General introduction
4.2.2. Genetic algorithm
4.2.3. Simple example of implementation of GA
4.2.4. Closer to reality, or the space crossover
4.2.5. Genetic programming
4.2.6. To be, or not to be...
4.2.7. Diophantine equation
4.3. Swarm intelligence
4.3.1. The use of the ant algorithm for the Traveling Salesman Problem
5. Clustering methods
5.1. Clustering. General concepts
5.2. Hierarchical methods
5.2.1. Hierarchical methods. Agglomerative algorithms
5.2.2. Hierarchical methods. Divisive algorithms
5.3. Examples in Python – hierarchical clustering methods
5.4. Nonhierarchical algorithms
5.4.1. K-means method
5.4.2. Fuzzy k-means
5.4.3. Gustafson-Kessel clustering
5.4.4. Method of correlation galaxies
5.4.5. Spectral clustering method
5.5. Examples in Python – nonhierarchical clustering methods
6. Classifiers
6.1. Definition of the classification problem
6.2. Main research directions on the classification problem
6.3. Stochastic classifiers
6.3.1. Use of Bayes' theorem for decision-making
6.4. Naive Bayesian classifier
6.4.1. Example of sale of the Naive Bayes classifier
6.4.2. EM algorithm
6.5. Linear discriminant analysis
6.5.1. Example 5.1
6.5.2. Example 5.2
6.5.3. Example 5.3
7. Use of genetic algorithms for the creation of vector classifiers
7.1. Use of Voronoi polyhedra in the problem of text classification
7.2. Checking the correctness of an existing classification
8. Support vector machines
8.1. The main idea
8.2. SVM for linearly separable sets
8.3. SVM for nonlinearly separable sets
8.4. Example
9. Visualization of multidimensional data
9.1. Multidimensional scaling technique
9.2. Kohonen self-organizing maps (SOM)
9.2.1. Initialization of the Kohonen map
9.2.2. The training algorithm
9.3. Examples
9.3.1. Showing similarity of objects
9.3.2. Showing similarity of European countries
9.3.3. The world map of poverty
9.3.4. The traveling salesman problem
10. Recommender systems
10.1. General structure of a recommendation system
10.1.1. Collaborative filtering
10.1.2. Content-oriented recommendations
10.1.3. Profiles of users
10.1.4. Training a user model
10.1.5. Hybrid approaches
10.2. Analysis of client environments
10.2.1. Examples of client environments
10.2.2. Retail chain stores
10.2.3. Mobile operators
10.2.4. Online stores of books, audio and video, or other products
10.2.5. Search engines
10.2.6. Parliamentary elections
10.2.7. Analysis of texts
10.2.8. Social networks
Appendix
Basic information
A.1. Background information on linear algebra
A.2. Background information on probability theory
References

Price
40.00 / 9.00€
To arrange international shipping details and costs, please contact wydawnictwa@agh.edu.pl