Buy Discovering Knowledge in Data by Daniel T. Larose
Discovering Knowledge in Data: An Introduction to Data Mining

About the Book

Learn Data Mining by doing data mining

Data mining can be revolutionary, but only when it's done right. The powerful black box data mining software now available can produce disastrously misleading results unless applied by a skilled and knowledgeable analyst. Discovering Knowledge in Data: An Introduction to Data Mining provides both the practical experience and the theoretical insight needed to reveal valuable information hidden in large data sets. Employing a "white box" methodology and with real-world case studies, this step-by-step guide walks readers through the various algorithms and statistical structures that underlie the software and presents examples of their operation on actual large data sets.

Principal topics include:
  • Data preprocessing and classification
  • Exploratory analysis
  • Decision trees
  • Neural and Kohonen networks
  • Hierarchical and k-means clustering
  • Association rules
  • Model evaluation techniques

Complete with scores of screenshots and diagrams to encourage graphical learning, Discovering Knowledge in Data: An Introduction to Data Mining gives students in Business, Computer Science, and Statistics as well as professionals in the field the power to turn any data warehouse into actionable knowledge.

An Instructor's Manual presenting detailed solutions to all the problems in the book is available online.

Table of Contents:
PREFACE

1 INTRODUCTION TO DATA MINING
  • What Is Data Mining?
  • Why Data Mining?
  • Need for Human Direction of Data Mining
  • Cross-Industry Standard Process: CRISP–DM
  • Case Study 1: Analyzing Automobile Warranty Claims: Example of the CRISP–DM Industry Standard Process in Action
  • Fallacies of Data Mining
  • What Tasks Can Data Mining Accomplish?
  • Description
  • Estimation
  • Prediction
  • Classification
  • Clustering
  • Association
  • Case Study 2: Predicting Abnormal Stock Market Returns Using Neural Networks
  • Case Study 3: Mining Association Rules from Legal Databases
  • Case Study 4: Predicting Corporate Bankruptcies Using Decision Trees
  • Case Study 5: Profiling the Tourism Market Using k-Means Clustering Analysis
  • References
  • Exercises

2 DATA PREPROCESSING
  • Why Do We Need to Preprocess the Data?
  • Data Cleaning
  • Handling Missing Data
  • Identifying Misclassifications
  • Graphical Methods for Identifying Outliers
  • Data Transformation
  • Min–Max Normalization
  • Z-Score Standardization
  • Numerical Methods for Identifying Outliers
  • References
  • Exercises

3 EXPLORATORY DATA ANALYSIS
  • Hypothesis Testing versus Exploratory Data Analysis
  • Getting to Know the Data Set
  • Dealing with Correlated Variables
  • Exploring Categorical Variables
  • Using EDA to Uncover Anomalous Fields
  • Exploring Numerical Variables
  • Exploring Multivariate Relationships
  • Selecting Interesting Subsets of the Data for Further Investigation
  • Binning
  • Summary
  • References
  • Exercises

4 STATISTICAL APPROACHES TO ESTIMATION AND PREDICTION
  • Data Mining Tasks in Discovering Knowledge in Data
  • Statistical Approaches to Estimation and Prediction
  • Univariate Methods: Measures of Center and Spread
  • Statistical Inference
  • How Confident Are We in Our Estimates?
  • Confidence Interval Estimation
  • Bivariate Methods: Simple Linear Regression
  • Dangers of Extrapolation
  • Confidence Intervals for the Mean Value of y Given x
  • Prediction Intervals for a Randomly Chosen Value of y Given x
  • Multiple Regression
  • Verifying Model Assumptions
  • References
  • Exercises

5 k-NEAREST NEIGHBOR ALGORITHM
  • Supervised versus Unsupervised Methods
  • Methodology for Supervised Modeling
  • Bias–Variance Trade-Off
  • Classification Task
  • k-Nearest Neighbor Algorithm
  • Distance Function
  • Combination Function
  • Simple Unweighted Voting
  • Weighted Voting
  • Quantifying Attribute Relevance: Stretching the Axes
  • Database Considerations
  • k-Nearest Neighbor Algorithm for Estimation and Prediction
  • Choosing k
  • Reference
  • Exercises

6 DECISION TREES
  • Classification and Regression Trees
  • C4.5 Algorithm
  • Decision Rules
  • Comparison of the C5.0 and CART Algorithms Applied to Real Data
  • References
  • Exercises

7 NEURAL NETWORKS
  • Input and Output Encoding
  • Neural Networks for Estimation and Prediction
  • Simple Example of a Neural Network
  • Sigmoid Activation Function
  • Back-Propagation
  • Gradient Descent Method
  • Back-Propagation Rules
  • Example of Back-Propagation
  • Termination Criteria
  • Learning Rate
  • Momentum Term
  • Sensitivity Analysis
  • Application of Neural Network Modeling
  • References
  • Exercises

8 HIERARCHICAL AND k-MEANS CLUSTERING
  • Clustering Task
  • Hierarchical Clustering Methods
  • Single-Linkage Clustering
  • Complete-Linkage Clustering
  • k-Means Clustering
  • Example of k-Means Clustering at Work
  • Application of k-Means Clustering Using SAS Enterprise Miner
  • Using Cluster Membership to Predict Churn
  • References
  • Exercises

9 KOHONEN NETWORKS
  • Self-Organizing Maps
  • Kohonen Networks
  • Example of a Kohonen Network Study
  • Cluster Validity
  • Application of Clustering Using Kohonen Networks
  • Interpreting the Clusters
  • Cluster Profiles
  • Using Cluster Membership as Input to Downstream Data Mining Models
  • References
  • Exercises

10 ASSOCIATION RULES
  • Affinity Analysis and Market Basket Analysis
  • Data Representation for Market Basket Analysis
  • Support, Confidence, Frequent Itemsets, and the A Priori Property
  • How Does the A Priori Algorithm Work (Part 1)? Generating Frequent Itemsets
  • How Does the A Priori Algorithm Work (Part 2)? Generating Association Rules
  • Extension from Flag Data to General Categorical Data
  • Information-Theoretic Approach: Generalized Rule Induction Method
  • J-Measure
  • Application of Generalized Rule Induction
  • When Not to Use Association Rules
  • Do Association Rules Represent Supervised or Unsupervised Learning?
  • Local Patterns versus Global Models
  • References
  • Exercises

11 MODEL EVALUATION TECHNIQUES
  • Model Evaluation Techniques for the Description Task
  • Model Evaluation Techniques for the Estimation and Prediction Tasks
  • Model Evaluation Techniques for the Classification Task
  • Error Rate, False Positives, and False Negatives
  • Misclassification Cost Adjustment to Reflect Real-World Concerns
  • Decision Cost/Benefit Analysis
  • Lift Charts and Gains Charts
  • Interweaving Model Evaluation with Model Building
  • Confluence of Results: Applying a Suite of Models
  • Reference
  • Exercises

EPILOGUE: "WE'VE ONLY JUST BEGUN"

INDEX

About the Author:
DANIEL T. LAROSE received his PhD in statistics from the University of Connecticut. An associate professor of statistics at Central Connecticut State University, he developed and directs Data Mining@CCSU, the world's first online master of science program in data mining. He has also worked as a data mining consultant for Connecticut-area companies. He is currently working on the next two books of his three-volume series on Data Mining: Data Mining Methods and Models and Data Mining the Web: Uncovering Patterns in Web Content, scheduled to publish respectively in 2005 and 2006.

Review:
  • "...an excellent introductory book of data mining. I recommend it for every one who wants to learn data mining." (Journal of Statistical Software, May 2006)
  • "...selected material is described in a simple, clear, and…precise way...case studies…examples, and screen shots has definitely added to the learning value of the book." (Journal of Biopharmaceutical Statistics, January/February 2006)
  • "...does a good job introducing data mining to novices...it skillfully previews some of the basic statistical issues needed to understand data mining techniques." (Journal of the American Statistical Association, December 2005)
  • "If you need a book to help colleagues understand your data mining procedures and results, this is the one you want to give them." (Technometrics, November 2005)
  • "…an excellent book…it should be useful for anyone interested in analysing epidemiological data." (Statistics in Medical Research, October 2005)
  • "...an excellent 'white-box' overview of established approaches for data analysis, in which readers are shown how, why, and when the methods work." (CHOICE, April 2005)
  • "Larose has the making of a good series of books on data mining…I, for one, look forward to the next two books in the series." (Computing Reviews.com, February 15, 2005)




Product Details
  • ISBN-13: 9780470361351
  • Publisher: John Wiley & Sons Inc
  • Publisher Imprint: Wiley-Interscience
  • Language: English
  • Subtitle: An Introduction to Data Mining
  • ISBN-10: 0470361352
  • Publication Date: 07 Feb 2008
  • Binding: Digital (delivered electronically)
  • No. of Pages: 222



