Preface
Elasticsearch for Apache Hadoop is an ‘umbrella’ project consisting of three similar yet independent sub-projects, each with its own dedicated section in the documentation:
- elasticsearch-hadoop proper
- interact with Elasticsearch from within a Hadoop environment. If you are using Map/Reduce, Hive, Pig, Apache Spark, Apache Storm, or Cascading, this project is for you. For feature requests or bugs, please open an issue in the Elasticsearch-Hadoop repository.
- repository-hdfs
- use HDFS as a back-end repository for snapshot/restore operations from/to Elasticsearch. For more information, refer to its home page. For feature requests or bugs, please open an issue in the Elasticsearch repository with the ":Plugin Repository HDFS" tag.
- Elasticsearch on YARN
- run Elasticsearch on top of YARN (see Elasticsearch on YARN). This project is in beta.
Thus, while all projects fall under the Hadoop umbrella, each covers a specific aspect of it, so please be sure to read the appropriate documentation. For general questions about any of these projects, the Elastic Discuss forum is a great place to interact with other users in the community.