Microsoft Big Data Solutions





About the Book

Tap the power of Big Data with Microsoft technologies

Big Data is here, and Microsoft's new Big Data platform is a valuable tool to help your company get the very most out of it. This timely book shows you how to use HDInsight along with Hortonworks Data Platform for Windows to store, manage, analyze, and share Big Data throughout the enterprise. Focusing primarily on Microsoft and Hortonworks technologies but also covering open source tools, Microsoft Big Data Solutions explains best practices, covers on-premises and cloud-based solutions, and features valuable case studies. Best of all, it helps you integrate these new solutions with technologies you already know, such as SQL Server and Hadoop.

  • Walks you through how to integrate Big Data solutions in your company using Microsoft's HDInsight Server, Hortonworks Data Platform for Windows, and open source tools
  • Explores both on-premises and cloud-based solutions
  • Shows how to store, manage, analyze, and share Big Data throughout the enterprise
  • Covers topics such as Microsoft's approach to Big Data, installing and configuring Hortonworks Data Platform for Windows, integrating Big Data with SQL Server, visualizing data with Microsoft and Hortonworks BI tools, and more
  • Helps you build and execute a Big Data plan
  • Includes contributions from the Microsoft and Hortonworks Big Data product teams

If you need a detailed roadmap for designing and implementing a fully deployed Big Data solution, you'll want Microsoft Big Data Solutions.

Table of Contents:
Introduction

Part I What Is Big Data?

Chapter 1 Industry Needs and Solutions
  • What’s So Big About Big Data?; A Brief History of Hadoop; Google; Nutch; What Is Hadoop?; Derivative Works and Distributions; Hadoop Distributions; Core Hadoop Ecosystem; Important Apache Projects for Hadoop; The Future for Hadoop; Summary

Chapter 2 Microsoft’s Approach to Big Data
  • A Story of “Better Together”; Competition in the Ecosystem; SQL on Hadoop Today; Hortonworks and Stinger; Cloudera and Impala; Microsoft’s Contribution to SQL in Hadoop; Deploying Hadoop; Deployment Factors; Deployment Topologies; Deployment Scorecard; Summary

Part II Setting Up for Big Data with Microsoft

Chapter 3 Configuring Your First Big Data Environment
  • Getting Started; Getting the Install; Running the Installation; On-Premise Installation: Single-Node Installation; HDInsight Service: Installing in the Cloud; Windows Azure Storage Explorer Options; Validating Your New Cluster; Logging into HDInsight Service; Verify HDP Functionality in the Logs; Common Post-Setup Tasks; Loading Your First Files; Verifying Hive and Pig; Summary

Part III Storing and Managing Big Data

Chapter 4 HDFS, Hive, HBase, and HCatalog
  • Exploring the Hadoop Distributed File System; Explaining the HDFS Architecture; Interacting with HDFS; Exploring Hive: The Hadoop Data Warehouse Platform; Designing, Building, and Loading Tables; Querying Data; Configuring the Hive ODBC Driver; Exploring HCatalog: HDFS Table and Metadata Management; Exploring HBase: An HDFS Column-Oriented Database; Columnar Databases; Defining and Populating an HBase Table; Using Query Operations; Summary

Chapter 5 Storing and Managing Data in HDFS
  • Understanding the Fundamentals of HDFS; HDFS Architecture; NameNodes and DataNodes; Data Replication; Using Common Commands to Interact with HDFS; Interfaces for Working with HDFS; File Manipulation Commands; Administrative Functions in HDFS; Moving and Organizing Data in HDFS; Moving Data in HDFS; Implementing Data Structures for Easier Management; Rebalancing Data; Summary

Chapter 6 Adding Structure with Hive
  • Understanding Hive’s Purpose and Role; Providing Structure for Unstructured Data; Enabling Data Access and Transformation; Differentiating Hive from Traditional RDBMS Systems; Working with Hive; Creating and Querying Basic Tables; Creating Databases; Creating Tables; Adding and Deleting Data; Querying a Table; Using Advanced Data Structures with Hive; Setting Up Partitioned Tables; Loading Partitioned Tables; Using Views; Creating Indexes for Tables; Summary

Chapter 7 Expanding Your Capability with HBase and HCatalog
  • Using HBase; Creating HBase Tables; Loading Data into an HBase Table; Performing a Fast Lookup; Loading and Querying HBase; Managing Data with HCatalog; Working with HCatalog and Hive; Defining Data Structures; Creating Indexes; Creating Partitions; Integrating HCatalog with Pig and Hive; Using HBase or Hive as a Data Warehouse; Summary

Part IV Working with Your Big Data

Chapter 8 Effective Big Data ETL with SSIS, Pig, and Sqoop
  • Combining Big Data and SQL Server Tools for Better Solutions; Why Move the Data?; Transferring Data Between Hadoop and SQL Server; Working with SSIS and Hive; Connecting to Hive; Configuring Your Packages; Loading Data into Hadoop; Getting the Best Performance from SSIS; Transferring Data with Sqoop; Copying Data from SQL Server; Copying Data to SQL Server; Using Pig for Data Movement; Transforming Data with Pig; Using Pig and SSIS Together; Choosing the Right Tool; Use Cases for SSIS; Use Cases for Pig; Use Cases for Sqoop; Summary

Chapter 9 Data Research and Advanced Data Cleansing with Pig and Hive
  • Getting to Know Pig; When to Use Pig; Taking Advantage of Built-in Functions; Executing User-defined Functions; Using UDFs; Building Your Own UDFs for Pig; Using Hive; Data Analysis with Hive; Types of Hive Functions; Extending Hive with Map-reduce Scripts; Creating a Custom Map-reduce Script; Creating Your Own UDFs for Hive; Summary

Part V Big Data and SQL Server Together

Chapter 10 Data Warehouses and Hadoop Integration
  • State of the Union; Challenges Faced by Traditional Data Warehouse Architectures; Technical Constraints; Business Challenges; Hadoop’s Impact on the Data Warehouse Market; Keep Everything; Code First (Schema Later); Model the Value; Throw Compute at the Problem; Introducing Parallel Data Warehouse (PDW); What Is PDW?; Why Is PDW Important?; How PDW Works; Project Polybase; Polybase Architecture; Business Use Cases for Polybase Today; Speculating on the Future for Polybase; Summary

Chapter 11 Visualizing Big Data with Microsoft BI
  • An Ecosystem of Tools; Excel; PowerPivot; Power View; Power Map; Reporting Services; Self-service Big Data with PowerPivot; Setting Up the ODBC Driver; Loading Data; Updating the Model; Adding Measures; Creating Pivot Tables; Rapid Big Data Exploration with Power View; Spatial Exploration with Power Map; Summary

Chapter 12 Big Data Analytics
  • Data Science, Data Mining, and Predictive Analytics; Data Mining; Predictive Analytics; Introduction to Mahout; Building a Recommendation Engine; Getting Started; Running a User-to-user Recommendation Job; Running an Item-to-item Recommendation Job; Summary

Chapter 13 Big Data and the Cloud
  • Defining the Cloud; Exploring Big Data Cloud Providers; Amazon; Microsoft; Setting Up a Big Data Sandbox in the Cloud; Getting Started with Amazon EMR; Getting Started with HDInsight; Storing Your Data in the Cloud; Storing Data; Uploading Your Data; Exploring Big Data Storage Tools; Integrating Cloud Data; Other Cloud Data Sources; Summary

Chapter 14 Big Data in the Real World
  • Common Industry Analytics; Telco; Energy; Retail; Data Services; IT/Hosting Optimization; Marketing Social Sentiment; Operational Analytics; Failing Fast; A New Ecosystem of Technologies; User Audiences; Summary

Part VI Moving Your Big Data Forward

Chapter 15 Building and Executing Your Big Data Plan
  • Gaining Sponsor and Stakeholder Buy-In; Problem Definition; Scope Management; Stakeholder Expectations; Defining the Criteria for Success; Identifying Technical Challenges; Environmental Challenges; Challenges in Skillset; Identifying Operational Challenges; Planning for Setup/Configuration; Planning for Ongoing Maintenance; Going Forward; The HandOff to Operations; After Deployment; Summary

Chapter 16 Operational Big Data Management
  • Hybrid Big Data Environments: Cloud and On-Premise Solutions Working Together; Ongoing Data Integration with Cloud and On-Premise Solutions; Integration Thoughts for Big Data; Backups and High Availability in Your Big Data Environment; High Availability; Disaster Recovery; Big Data Solution Governance; Creating Operational Analytics; System Center Operations Manager for HDP; Installing the Ambari SCOM Management Pack; Monitoring with the Ambari SCOM Management Pack; Summary

Index

About the Authors:
  • Adam Jorgensen is the President of Pragmatic Works and the Executive Vice President of PASS. He has extensive experience with data warehousing, analytics, and NoSQL architectures.
  • James Rowland-Jones is a principal consultant for The Big Bang Data Company. He specializes in big data warehouse solutions that leverage SQL Server Parallel Data Warehouse and Hadoop ecosystems.
  • John Welch is Vice President of Software Development at Pragmatic Works, where he leads the development of a suite of BI and data products for SQL Server and related technologies.
  • Dan Clark is a senior BI consultant for Pragmatic Works. Dan has published several books and numerous articles on .NET programming and BI development.
  • Christopher Price is a senior consultant with Microsoft. His focus is on ETL, data integration, data quality, MDM, SSAS, SharePoint, and all things big data.
  • Brian Mitchell is the lead architect of the Microsoft Big Data Center of Expertise. He focuses exclusively on DW/BI solutions.




Product Details
  • ISBN-13: 9781118729083
  • Publisher: John Wiley & Sons Inc
  • Publisher Imprint: John Wiley & Sons Inc
  • Height: 236 mm
  • No of Pages: 408
  • Weight: 680 g
  • ISBN-10: 1118729080
  • Publisher Date: 04 Apr 2014
  • Binding: Paperback
  • Language: English
  • Spine Width: 20 mm
  • Width: 189 mm

