Apache Spark in 24 Hours, Sams Teach Yourself

About the Book

Apache Spark is a fast, scalable, and flexible open source distributed processing engine for big data systems and is one of the most active open source big data projects to date. In just 24 lessons of one hour or less, Sams Teach Yourself Apache Spark in 24 Hours helps you build practical Big Data solutions that leverage Spark’s amazing speed, scalability, simplicity, and versatility.

This book’s straightforward, step-by-step approach shows you how to deploy, program, optimize, manage, integrate, and extend Spark–now, and for years to come. You’ll discover how to create powerful solutions encompassing cloud computing, real-time stream processing, machine learning, and more. Every lesson builds on what you’ve already learned, giving you a rock-solid foundation for real-world success.

Whether you are a data analyst, data engineer, data scientist, or data steward, learning Spark will help you to advance your career or embark on a new career in the booming area of Big Data.

Learn how to
  • Discover what Apache Spark does and how it fits into the Big Data landscape
  • Deploy and run Spark locally or in the cloud
  • Interact with Spark from the shell
  • Make the most of the Spark Cluster Architecture
  • Develop Spark applications with Scala and functional Python
  • Program with the Spark API, including transformations and actions
  • Apply practical data engineering/analysis approaches designed for Spark
  • Use Resilient Distributed Datasets (RDDs) for caching, persistence, and output
  • Optimize Spark solution performance
  • Use Spark with SQL (via Spark SQL) and with NoSQL (via Cassandra)
  • Leverage cutting-edge functional programming techniques
  • Extend Spark with streaming, R, and Sparkling Water
  • Start building Spark-based machine learning and graph-processing applications
  • Explore advanced messaging technologies, including Kafka
  • Preview and prepare for Spark’s next generation of innovations

Instructions walk you through common questions, issues, and tasks; Q-and-As, Quizzes, and Exercises build and test your knowledge; "Did You Know?" tips offer insider advice and shortcuts; and "Watch Out!" alerts help you avoid pitfalls. By the time you're finished, you'll be comfortable using Apache Spark to solve a wide spectrum of Big Data problems.

Table of Contents:

Preface

PART I: GETTING STARTED WITH APACHE SPARK

Hour 1: Introducing Apache Spark
  • What Is Spark?
  • What Sort of Applications Use Spark?
  • Programming Interfaces to Spark
  • Ways to Use Spark
  • Q&A
  • Workshop

Hour 2: Understanding Hadoop
  • Hadoop and a Brief History of Big Data
  • Hadoop Explained
  • Introducing HDFS
  • Introducing YARN
  • Anatomy of a Hadoop Cluster
  • How Spark Works with Hadoop
  • Q&A
  • Workshop

Hour 3: Installing Spark
  • Spark Deployment Modes
  • Preparing to Install Spark
  • Installing Spark in Standalone Mode
  • Exploring the Spark Install
  • Deploying Spark on Hadoop
  • Q&A
  • Workshop
  • Exercises

Hour 4: Understanding the Spark Application Architecture
  • Anatomy of a Spark Application
  • Spark Driver
  • Spark Executors and Workers
  • Spark Master and Cluster Manager
  • Spark Applications Running on YARN
  • Local Mode
  • Q&A
  • Workshop

Hour 5: Deploying Spark in the Cloud
  • Amazon Web Services Primer
  • Spark on EC2
  • Spark on EMR
  • Hosted Spark with Databricks
  • Q&A
  • Workshop

PART II: PROGRAMMING WITH APACHE SPARK

Hour 6: Learning the Basics of Spark Programming with RDDs
  • Introduction to RDDs
  • Loading Data into RDDs
  • Operations on RDDs
  • Types of RDDs
  • Q&A
  • Workshop

Hour 7: Understanding MapReduce Concepts
  • MapReduce History and Background
  • Records and Key Value Pairs
  • MapReduce Explained
  • Word Count: The “Hello, World” of MapReduce
  • Q&A
  • Workshop

Hour 8: Getting Started with Scala
  • Scala History and Background
  • Scala Basics
  • Object-Oriented Programming in Scala
  • Functional Programming in Scala
  • Spark Programming in Scala
  • Q&A
  • Workshop

Hour 9: Functional Programming with Python
  • Python Overview
  • Data Structures and Serialization in Python
  • Python Functional Programming Basics
  • Interactive Programming Using IPython
  • Q&A
  • Workshop

Hour 10: Working with the Spark API (Transformations and Actions)
  • RDDs and Data Sampling
  • Spark Transformations
  • Spark Actions
  • Key Value Pair Operations
  • Join Functions
  • Numerical RDD Operations
  • Q&A
  • Workshop

Hour 11: Using RDDs: Caching, Persistence, and Output
  • RDD Storage Levels
  • Caching, Persistence, and Checkpointing
  • Saving RDD Output
  • Introduction to Alluxio (Tachyon)
  • Q&A
  • Workshop

Hour 12: Advanced Spark Programming
  • Broadcast Variables
  • Accumulators
  • Partitioning and Repartitioning
  • Processing RDDs with External Programs
  • Q&A
  • Workshop

PART III: EXTENSIONS TO SPARK

Hour 13: Using SQL with Spark
  • Introduction to Spark SQL
  • Getting Started with Spark SQL DataFrames
  • Using Spark SQL DataFrames
  • Accessing Spark SQL
  • Q&A
  • Workshop

Hour 14: Stream Processing with Spark
  • Introduction to Spark Streaming
  • Using DStreams
  • State Operations
  • Sliding Window Operations
  • Q&A
  • Workshop

Hour 15: Getting Started with Spark and R
  • Introduction to R
  • Introducing SparkR
  • Using SparkR
  • Using SparkR with RStudio
  • Q&A
  • Workshop

Hour 16: Machine Learning with Spark
  • Introduction to Machine Learning and MLlib
  • Classification Using Spark MLlib
  • Collaborative Filtering Using Spark MLlib
  • Clustering Using Spark MLlib
  • Q&A
  • Workshop

Hour 17: Introducing Sparkling Water (H2O and Spark)
  • Introduction to H2O
  • Sparkling Water: H2O on Spark
  • Q&A
  • Workshop

Hour 18: Graph Processing with Spark
  • Introduction to Graphs
  • Graph Processing in Spark
  • Introduction to GraphFrames
  • Q&A
  • Workshop

Hour 19: Using Spark with NoSQL Systems
  • Introduction to NoSQL
  • Using Spark with HBase
  • Using Spark with Cassandra
  • Using Spark with DynamoDB and More
  • Q&A
  • Workshop

Hour 20: Using Spark with Messaging Systems
  • Overview of Messaging Systems
  • Using Spark with Apache Kafka
  • Spark, MQTT, and the Internet of Things
  • Using Spark with Amazon Kinesis
  • Q&A
  • Workshop

PART IV: MANAGING SPARK

Hour 21: Administering Spark
  • Spark Configuration
  • Administering Spark Standalone
  • Administering Spark on YARN
  • Q&A
  • Workshop

Hour 22: Monitoring Spark
  • Exploring the Spark Application UI
  • Spark History Server
  • Spark Metrics
  • Logging in Spark
  • Q&A
  • Workshop

Hour 23: Extending and Securing Spark
  • Isolating Spark
  • Securing Spark Communication
  • Securing Spark with Kerberos
  • Q&A
  • Workshop

Hour 24: Improving Spark Performance
  • Benchmarking Spark
  • Application Development Best Practices
  • Optimizing Partitions
  • Diagnosing Application Performance Issues
  • Q&A
  • Workshop

Index


Product Details
  • ISBN-13: 9780134445830
  • Publisher: Pearson Education (US)
  • Publisher Imprint: Addison Wesley
  • Language: English
  • Weight: 1 gr
  • ISBN-10: 013444583X
  • Publisher Date: 31 Aug 2016
  • Binding: Digital download
  • No of Pages: 445

