Virtualizing Hadoop
Virtualizing Hadoop: How to Install, Deploy, and Optimize Hadoop in a Virtualized Architecture (VMware Press Technology)


About the Book

Plan and Implement Hadoop Virtualization for Maximum Performance, Scalability, and Business Agility

Enterprises running Hadoop must absorb rapid changes in big data ecosystems, frameworks, products, and workloads. Virtualized approaches can offer important advantages in speed, flexibility, and elasticity. Now, a world-class team of enterprise virtualization and big data experts guides you through the choices, considerations, and tradeoffs surrounding Hadoop virtualization. The authors help you decide whether to virtualize Hadoop, deploy Hadoop in the cloud, or integrate conventional and virtualized approaches in a blended solution.

First, Virtualizing Hadoop reviews big data and Hadoop from the standpoint of the virtualization specialist. The authors demystify MapReduce, YARN, and HDFS and guide you through each stage of Hadoop data management. Next, they turn the tables, introducing big data experts to modern virtualization concepts and best practices.

Finally, they bring Hadoop and virtualization together, guiding you through the decisions you’ll face in planning, deploying, provisioning, and managing virtualized Hadoop. From security to multitenancy to day-to-day management, you’ll find reliable answers for choosing your best Hadoop strategy and executing it.
Coverage includes the following:
  • Reviewing the frameworks, products, distributions, use cases, and roles associated with Hadoop
  • Understanding YARN resource management, HDFS storage, and I/O
  • Designing data ingestion, movement, and organization for modern enterprise data platforms
  • Defining SQL engine strategies to meet strict SLAs
  • Considering security, data isolation, and scheduling for multitenant environments
  • Deploying Hadoop as a service in the cloud
  • Reviewing the essential concepts, capabilities, and terminology of virtualization
  • Applying current best practices, guidelines, and key metrics for Hadoop virtualization
  • Managing multiple Hadoop frameworks and products as one unified system
  • Virtualizing master and worker nodes to maximize availability and performance
  • Installing and configuring Linux for a Hadoop environment

Table of Contents:
Foreword
Preface

Part I: Introduction to Hadoop

Chapter 1 Understanding the Big Data World
  The Data Revolution
  Traditional Data Systems
    Semi-Structured and Unstructured Data
    Causation and Correlation
    Data Challenges
  The Modern Data Architecture
  Organizational Transformations
  Industry Transformation
  Summary

Chapter 2 Hadoop Fundamental Concepts
  Types of Data in Hadoop
  Use Cases
  What Is Hadoop?
  Hadoop Distributions
  Hadoop Frameworks
  NoSQL Databases
    What Is NoSQL?
  A Hadoop Cluster
  Hadoop Software Processes
    Hadoop Hardware Profiles
  Roles in the Hadoop Environment
  Summary

Chapter 3 YARN and HDFS
  A Hadoop Cluster Is Distributed
  Hadoop Directory Layouts
    Hadoop Operating System Users
  The Hadoop Distributed File System
    YARN Logging
    The NameNode
    The DataNode
    Block Placement
    NameNode Configurations and Managing Metadata
  Rack Awareness
    Block Management
    The Balancer
    Maintaining Data Integrity in the Cluster
  Quotas and Trash
  YARN and the YARN Processing Model
    Running Applications on YARN
    Resource Schedulers
    Benchmarking
    TeraSort Benchmarking Suite
  Summary

Chapter 4 The Modern Data Platform
  Designing a Hadoop Cluster
    Enterprise Data Movement
  Summary

Chapter 5 Data Ingestion
  Extraction, Loading, and Transformation (ELT)
    Sqoop: Data Movement with SQL Sources
    Flume: Streaming Data
    Oozie: Scheduling and Workflow
    Falcon: Data Lifecycle Management
    Kafka: Real-time Data Streaming
  Summary

Chapter 6 Hadoop SQL Engines
  Where SQL Was Born
  SQL in Hadoop
  Hadoop SQL Engines
    Selecting the SQL Tool For Hadoop
  Now Getting Groovy with Hive and Pig
    Hive
    HCatalog
    Pig
  Summary

Chapter 7 Multitenancy in Hadoop
  Securing the Access
    Authentication
    Auditing
    Authorization
    Data Protection
    Isolating the Data
    Isolating the Process
  Summary

Part II: Introduction to Virtualization

Chapter 8 Virtualization Fundamentals
  Why Virtualize Hadoop?
    Introduction to Virtualization
  Summary
  References

Chapter 9 Best Practices for Virtualizing Hadoop
  Running Virtualized Hadoop with Purpose and Discipline
    The Discipline of Purpose Starts with a Clear Target
    Virtualizing Different Tiers of Hadoop
    Industry Best Practices
  Summary

Part III: Virtualizing Hadoop

Chapter 10 Virtualizing Hadoop
  How Are Hadoop Ecosystems Going to Be Managed?
    Building an Enterprise Hadoop Platform That Is Agile and Flexible
    Clarification of Terms
    The Journey from Bare-Metal to Virtualization
  Why Consider Virtualizing Hadoop?
    Benefits of Virtualizing Hadoop
    Virtualized Hadoop Can Run as Fast or Faster Than Native
    Coordination and Cross-Purpose Specialization Is the Future
    Barriers Can Be Organizational
    Virtualization Is Not an All or Nothing Option
    Rapid Provisioning and Improving Quality of Development and Test Environments
    Improve High Availability with Virtualization
    Use Virtualization to Leverage Hadoop Workloads
    Hadoop in the Cloud
    Big Data Extensions
    The Path to Virtualization
    The Software-Defined Data Center
    Virtualizing the Network
    vRealize Suite
  Summary
  References

Chapter 11 Virtualizing Hadoop Master Servers
  Virtualizing Servers in a Hadoop Cluster
    Virtualizing the Environment Around Hadoop
    Virtualizing the Master Hadoop Servers
    Virtualizing Without the SAN
  Summary

Chapter 12 Virtualizing the Hadoop Worker Nodes
  A Brief Introduction to the Worker Nodes in Hadoop
  Deployment Models for Hadoop Clusters
    The Combined Model
    The Separated Model
    Network Effects of the Data-Compute Separation
    The Shared-Storage Approach to the Data-Compute Separated Model
    Local Disks for the Application’s Temporary Data
    The Shared Storage Architecture Model Using Network-Attached Storage (NAS)
    Deployment Model Summary
  Best Practices for Virtualizing Hadoop Workers
    Disk I/O
  The Hadoop Virtualization Extensions (HVE)
  Summary
  References
  Resources

Chapter 13 Deploying Hadoop as a Service in the Private Cloud
  The Cloud Context
    Stakeholders for Hadoop
    Overview of the Solution Architecture
  Summary
  References

Chapter 14 Understanding the Installation of Hadoop
  Map the Right Solutions to the Right Use Case
    Thoughts About Installing Hadoop
  Configuring Repositories
    Installing HDP 2.2
    Environment Preparation
  Setting Up the Hadoop Configuration
  Starting HDFS and YARN
    Start YARN
    Verifying MapReduce Functionality
  Installing and Configuring Hive
  Installing and Configuring MySQL Database
  Installing and Configuring Hive and HCatalog
  Summary

Chapter 15 Configuring Linux for Hadoop
  Supported Linux Platforms
  Different Deployment Models
  Linux Golden Templates
    Building a Linux Enterprise Hadoop Platform
    Selecting the Linux Distribution
  Optimal Linux Kernel Parameters and System Settings
    epoll
    Disable Swap Space
    Disable Security During Install
    IO Scheduler Tuning
    Check Transparent Huge Pages Configuration
    Limits.conf
    Partition Alignment for RDMs
    File System Considerations
    Lazy Count Parameter for XFS
    Mount Options
    I/O Scheduler
    Disk Read and Write Options
    Storage Benchmarking
    Java Version
    Set Up NTP
    Enable Jumbo Frames
    Additional Network Considerations
  Summary

Appendix A Hadoop Cluster Creation: A Prerequisite Checklist

Appendix B Big Data/Hadoop on VMware vSphere Reference Materials
  Deployment Guides
  Reference Architectures
  Customer Case Studies
  Performance
  vSphere Big Data Extensions (BDE)
  Other vSphere Features and Big Data




Product Details
  • ISBN-13: 9780133811117
  • Publisher: Pearson Education (US)
  • Publisher Imprint: VMware Press
  • Language: English
  • Series Title: VMware Press Technology
  • Weight: 1 gr
  • ISBN-10: 0133811115
  • Publisher Date: 04 Jul 2015
  • Binding: Digital download
  • No of Pages: 480
  • Sub Title: How to Install, Deploy, and Optimize Hadoop in a Virtualized Architecture



