User:Ilya/hadoop

From Wikipedia, the free encyclopedia.

For those of you who are new to Hadoop, I strongly urge you to try Cloudera’s open source Distribution for Hadoop (http://www.cloudera.com/hadoop). It provides the stable base of Hadoop 0.18.3, with bug fixes and some new features back-ported, plus added hooks to support the Scribe log file aggregation service (http://scribeserver.wiki.sourceforge.net/). The Cloudera folks provide Amazon Machine Images (AMIs), Debian and RPM installer files, and an online tool for generating configuration files. If you are struggling with Hadoop 0.19 issues, or some of the 0.18.3 issues are biting you, consider switching to this distribution. It will reduce your pain.

The following are the stock Hadoop Core distributions at the time of this writing:

  • Hadoop 0.18.3 is a good distribution, but has a couple of issues related to file descriptor leakage and reduce task stalls.
  • Hadoop 0.19.0 should be avoided, as it has data corruption issues related to the append and sync changes.
  • Hadoop 0.19.1 looks to be a reasonably stable release with many useful features.
  • Hadoop 0.20.0 has some major API changes and is still unstable.
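
As a rough sketch, the notes in the list above could be encoded as a lookup that a deployment script consults before installing a release. The names `RELEASE_NOTES` and `check_release`, and the status labels "usable", "avoid", and "unstable", are hypothetical shorthand introduced here for illustration. They are not part of Hadoop or of any Cloudera tooling.

```python
# Release notes transcribed from the list above. The status labels are an
# illustrative shorthand (an assumption), not official Hadoop terminology.
RELEASE_NOTES = {
    "0.18.3": ("usable", "file descriptor leakage and reduce task stalls"),
    "0.19.0": ("avoid", "data corruption related to the append and sync changes"),
    "0.19.1": ("usable", "reasonably stable, many useful features"),
    "0.20.0": ("unstable", "major API changes"),
}


def check_release(version):
    """Return (status, note) for a stock Hadoop Core release.

    Versions not covered by the list come back as ("unknown", "").
    """
    return RELEASE_NOTES.get(version.strip(), ("unknown", ""))


if __name__ == "__main__":
    for v in ("0.18.3", "0.19.0", "0.21.0"):
        status, note = check_release(v)
        print(f"Hadoop {v}: {status} {note}".rstrip())
```

A script might refuse to proceed when the status is "avoid" and only warn otherwise. The table reflects the state of releases at the time of the original writing and would need updating for anything newer.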

Hive/GettingStarted

This is a prototype version of Hive and is NOT production quality. However, we are working hard to make Hive a production-quality system. So far, Hive has only been tested on Unix (Linux) and Mac systems using Java 1.6, although it may well work on other similar platforms. It does not currently work on Cygwin. Most of our testing has been on Hadoop 0.17.2, so we advise running it against that version of Hadoop, even though it may compile and work against other versions.