Hadoop: The Definitive Guide
Original price: $49.99. Current price: $23.06.
Discover how Apache Hadoop can unleash the power of your data. This comprehensive resource shows you how to build and maintain reliable, scalable, distributed systems with the Hadoop framework — an open source implementation of MapReduce, the algorithm on which Google built its empire. Programmers will find details for analyzing datasets of any size, and administrators will learn how to set up and run Hadoop clusters.
This revised edition covers recent changes to Hadoop, including new features such as Hive, Sqoop, and Avro. It also provides illuminating case studies that illustrate how Hadoop is used to solve specific problems. Looking to get the most out of your data? This is your book.
Use the Hadoop Distributed File System (HDFS) for storing large datasets, then run distributed computations over those datasets with MapReduce
Become familiar with Hadoop’s data and I/O building blocks for compression, data integrity, serialization, and persistence
Discover common pitfalls and advanced features for writing real-world MapReduce programs
Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud
Use Pig, a high-level query language for large-scale data processing
Analyze datasets with Hive, Hadoop’s data warehousing system
Take advantage of HBase, Hadoop’s database for structured and semi-structured data
Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems
“Now you have the opportunity to learn about Hadoop from a master — not only of the technology, but also of common sense and plain talk.”
–Doug Cutting, Cloudera
Publisher : Yahoo Press; Second edition (October 15, 2010)
Language : English
Paperback : 628 pages
ISBN-10 : 1449389732
ISBN-13 : 978-1449389734
Item Weight : 1.75 pounds
Dimensions : 7 x 1.5 x 9.19 inches
David Mark Schramm –
Excellent Hadoop Overview
This book provides an excellent in-depth overview of all aspects of Hadoop with how-to examples that are easy to follow. It is well written, thorough and exactly what I needed to architect and build a Hadoop-based solution. Related technologies such as Hive, HBase, Sqoop, Pig and ZooKeeper are also covered in decent depth.
Other reviewers gave poor reviews because the APIs are not up to date, which I think is unfair. The new APIs are still only available in early, unstable Hadoop versions, so current developers are best served by using the earlier APIs. The book gives samples with the new APIs and shows very clearly the API changes, which are minor. The concepts are identical, but a few classes have been combined into a more cohesive “Context” class in the new APIs.
So, for example, to write a data record you call “context.write(…);” rather than “output.collect(…);” with identical parameters. The structure of applications and the concepts are not changed. The changes to the syntax of the Java calls are trivial and covered in the book very clearly. What is the big deal? Understanding the concepts is the most important thing, and this book handles that very nicely.
I would recommend this book to anyone who is new to Hadoop and needs to learn it in depth.
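As a rough illustration of the API change this review describes, here is a minimal Java sketch. In the real libraries, the old API emits records through OutputCollector.collect(key, value) and the new API emits them through Context.write(key, value). The two interfaces below are simplified stand-ins written for this example so that it runs without Hadoop on the classpath; they are not the actual Hadoop classes, and ApiShapeDemo, mapOld, mapNew, runOldStyle and runNewStyle are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins, NOT the real Hadoop classes. They only mirror the
// method names of org.apache.hadoop.mapred.OutputCollector (old API) and
// org.apache.hadoop.mapreduce.Mapper.Context (new API).
public class ApiShapeDemo {

    // Stand-in for the old API's output object.
    interface OutputCollector<K, V> {
        void collect(K key, V value);
    }

    // Stand-in for the new API's context object.
    interface Context<K, V> {
        void write(K key, V value);
    }

    // Old-style map body: emits (word, 1) through output.collect(...).
    static void mapOld(String line, OutputCollector<String, Integer> output) {
        for (String word : line.split("\\s+")) {
            if (!word.isEmpty()) {
                output.collect(word, 1);
            }
        }
    }

    // New-style map body: same logic, same parameters, emitted via context.write(...).
    static void mapNew(String line, Context<String, Integer> context) {
        for (String word : line.split("\\s+")) {
            if (!word.isEmpty()) {
                context.write(word, 1);
            }
        }
    }

    // Runs the old-style body and captures each emitted record as "key<TAB>value".
    static List<String> runOldStyle(String line) {
        List<String> out = new ArrayList<>();
        mapOld(line, (k, v) -> out.add(k + "\t" + v));
        return out;
    }

    // Runs the new-style body and captures records in the same format.
    static List<String> runNewStyle(String line) {
        List<String> out = new ArrayList<>();
        mapNew(line, (k, v) -> out.add(k + "\t" + v));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(runOldStyle("to be or not to be"));
        System.out.println(runNewStyle("to be or not to be"));
    }
}
```

Both variants emit the same records; only the object and method used to emit them differ, which is the point the review makes about the change being mostly syntactic.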
Dimitri K –
Useful but hard to understand
Good to have a book about this system; it is much better to have at least some book than no book at all. Also good is the practical approach of the author, who guides you through all the steps of configuring, testing and running tasks in Hadoop.
One major drawback is the book’s somewhat vague and unclear language. When the author tries to explain a complicated matter, his explanations cannot be understood without further research, and sometimes not at all. The author surely knows what he wants to say, but cannot express it in a clear and precise manner.
Justin Vincent –
Makes Hadoop as easy as it should be
The documentation for Hadoop is really lacking. This fills that gap very well. I suggest reading the book through and then trying to code something up. All the examples in the book are on the author’s GitHub page.
The examples don’t include things like imports, so it’s useful to look at the GitHub page to figure out where classes actually reside. The unpublished 3rd edition examples are in there too, in case you get sick of deprecation warnings.
Eyal –
Good coverage of the subject
The book was used to introduce the subject. It gave a good overview and enough detail to get a project going. It is still referred to now; however, as Hadoop is such a moving target (there is now a later edition of the book), referring to the official online classes is more effective.
Working with the book in the early stages easily justified the cost, and more than one person used it this way.
Vova –
Three Stars
ok
Dan S. –
Excellent Hadoop reference
This book sits next to my laptop as my Hadoop reference guide/bible. Covering installation, administration and development, this is my go-to resource for understanding Hadoop as a relatively new (to me) technology. Agreed, the APIs are not up to date, but that is unavoidable for a book on emerging technology. The chapter on Pig alone is worth the price of the book, but I’ve gotten extensive use out of nearly the entire book.
Sourin Rao –
Awesome book. The bible for Big Data
Loved this book. This book is the best source for Hadoop information and for getting up to speed on the big wide world of big data.
Al –
Good overview
This is a good overview of Hadoop, how it works, and the software in the Hadoop ecosystem. It’s definitely a breadth book, not a depth book, so if you’re looking to be an expert on specific subsections of Hadoop, you should buy books on those specific topics.
(Note: there is a newer edition of the book now; you may want to get that one instead of this one.)
Justin Kamerman –
This is a great introduction to MapReduce, Hadoop, and HDFS. A programmer with basic Java knowledge could have most of the code examples up and running in a few hours. That said, it is a broad topic and impossible to cover in the scope of a single book. I would have preferred more coverage of the MapReduce paradigm and briefer coverage of the Hadoop add-on projects like Pig, Hive, and ZooKeeper. Also, the book left a few gaps for me with respect to preparing input data to leverage the distributed filesystem.
All in all, a well-written and very informative book. I found Data-Intensive Text Processing with MapReduce an excellent companion to this book for more detail on MapReduce.
S W SCOTT –
As a newcomer to Hadoop, I found this book really useful in helping me get to grips with this new way of thinking about non-relational databases. (I’ve developed several commercial applications with Hadoop now, all of which started from this book.)
Jose Antonio Garcia Varela –
Perhaps the best book I have read in this field. Thanks to it, I was able to earn an official certification.
Dia Ledesma –
It’s fair
Prairy Earth –
Learning Hadoop is a challenge. At the time I bought this book, it was one of the first out on the topic. Generally, I really like O’Reilly books, but this book was difficult to understand, and clearly showed its first mover status. I have not followed up to see if there are additional editions of this book; however, Hadoop has been supplanted by other technologies, such as Cassandra, so I don’t think I’ll ever find out if Tom White did a better job on future editions.