Summary: Data Mining deals with discovering hidden knowledge, unexpected patterns and rules in large databases. It can bring significant gains to organizations, for example, through better targeted marketing and enhanced internal performance. If you have large data sets (for example, large quantities of financial data, extensive customer databases or sales records) you can benefit from this newly emerging field. But setting up a data mining environment is not a trivial task. The long-term goal must be to create a self-learning organization that makes optimal use of the information it generates.
This is the first book to offer a comprehensive introduction to data mining. Its aim is to provide essential insights and guidelines to help you make the right decisions when setting up a data mining environment.
It offers answers to questions such as:
- What is Data Mining?
- Which techniques are suitable for my data?
- How do I set up a data mining environment?
- How do I justify the costs?
The whole data mining process, including data selection, cleaning, coding, different pattern recognition techniques and reporting, is illustrated by means of an extensive case study and numerous examples.
Audience
- General management and IT managers
Table of Contents: (Most chapters begin with "Introduction", while all chapters finish with "Conclusion".)
Preface.
Overview of the Book.
Acknowledgments.
1. Introduction.
An Expanding Universe of Data.
Information as a Production Factor.
Computer Systems That Can Learn.
Data Mining.
Data Mining Versus Query Tools.
Data Mining in Marketing.
Practical Applications of Data Mining.
2. What is Learning? Introduction.
What is Learning?
Self-Learning Computer Systems.
Machine Learning and the Methodology of Science.
Concept Learning.
A Kangaroo in Mist.
3. Data Mining and the Data Warehouse. Introduction.
What is a Data Warehouse and Why Do We Need It?
Designing Decision Support Systems.
Integration With Data Mining.
Client/Server and Data Warehousing.
Multi-Processing Machine.
Cost Justification.
4. The Knowledge Discovery Process. Introduction.
The Knowledge Discovery Process in Detail.
Data Selection.
Cleaning.
Enrichment.
Coding.
Data Mining.
Preliminary Analysis of the Data Set Using Traditional Query Tools.
Visualization Techniques.
Likelihood and Distance.
OLAP Tools.
K-Nearest Neighbour.
Decision Trees.
Association Rules.
Neural Networks.
Genetic Algorithms.
Reporting.
5. Setting Up a KDD Environment. Introduction.
Different Forms of Knowledge.
Getting Started.
Data Selection.
Cleaning.
Enrichment.
Coding.
Data Mining.
Reporting.
The KDD Environment.
Ten Golden Rules.
6. Some Real-Life Applications. Introduction.
Customer Profiling.
Predicting Bid Behaviour of Pilots.
Discovering Foreign Key Relationships.
Results.
7. Some Formal Aspects of Learning Algorithms. Introduction.
Learning as Compression of Data Sets.
The Information Content of a Message.
Noise and Redundancy.
The Significance of Noise.
Fuzzy Databases.
The Traditional Theory of the Relational Database.
From Relations To Tables.
From Keys To Statistical Dependencies.
Denormalization.
Data Mining Primitives.
Summary. Glossary. Further Reading. Index.