Hadoop Versioning and Ecosystem Abstraction – The MapR Angle

As a follow-up to my previous article on Hadoop versioning, I noticed that I left out MapR. MapR versioning follows a similar logic and structure, again with some nuance worth explaining. When looking for the version of MapR you are using, a simple cat of MAPR_HOME/MapRBuildVersion will show you what you are working with.

This is a simple way to grab the version (5.0.0), the build (32987) and the release state (GA) of the MapR distribution in use.
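
As a rough sketch, the same three fields can be pulled apart in Python. This assumes the file holds a single dot-separated string such as 5.0.0.32987.GA, consistent with the values above, and that MAPR_HOME falls back to /opt/mapr when the environment variable is unset. The function and class names are my own, not part of any MapR tooling.

```python
import os
from typing import NamedTuple


class MapRBuild(NamedTuple):
    version: str        # e.g. "5.0.0"
    build: str          # e.g. "32987"
    release_state: str  # e.g. "GA"


def parse_build_version(text: str) -> MapRBuild:
    """Split a string like '5.0.0.32987.GA' into its three parts.

    Assumption: fields are dot-separated, with the build and the
    release state as the last two fields.
    """
    parts = text.strip().split(".")
    if len(parts) < 3:
        raise ValueError(f"unexpected MapRBuildVersion content: {text!r}")
    return MapRBuild(".".join(parts[:-2]), parts[-2], parts[-1])


def read_build_version(mapr_home: str = "") -> MapRBuild:
    """Read MAPR_HOME/MapRBuildVersion (hypothetical helper)."""
    home = mapr_home or os.environ.get("MAPR_HOME", "/opt/mapr")
    with open(os.path.join(home, "MapRBuildVersion")) as fh:
        return parse_build_version(fh.read())
```

Calling read_build_version() on a cluster node should return the same values that cat shows, just as named fields that are easier to compare in a script.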

But what about Ecosystem products? 

MapR categorizes the packages in a distribution into core packages and ecosystem packages. Hive, for example, is an ecosystem package. On RPM-based Linux systems you can simply use Yum to investigate what's installed or available.

Yum cleanly separates installed from available packages automatically, and you can tell core from ecosystem by the repo column. Again, the exact version is displayed in a similar format.
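
To tell core from ecosystem packages in a script rather than by eye, the listing can be grouped by its repo column. Here is a minimal sketch in Python, assuming the usual three-column layout of yum list output (name.arch, version, repo) with installed packages marked by a leading @ on the repo. The sample lines and the repo names MapR_Core and MapR_Ecosystem are made up for illustration and will differ on your system.

```python
from collections import defaultdict


def group_by_repo(listing: str) -> dict:
    """Group 'name.arch  version  repo' lines by repo.

    Lines without exactly three columns (section headers such as
    'Installed Packages', wrapped lines) are skipped.
    Each entry is (name, version, installed).
    """
    groups = defaultdict(list)
    for line in listing.splitlines():
        cols = line.split()
        if len(cols) != 3:
            continue
        name_arch, version, repo = cols
        name = name_arch.rsplit(".", 1)[0]  # drop '.x86_64' / '.noarch'
        installed = repo.startswith("@")    # assumption: '@repo' means installed
        groups[repo.lstrip("@")].append((name, version, installed))
    return dict(groups)


# Hypothetical sample, not real output from a cluster.
SAMPLE = """\
Installed Packages
mapr-core.x86_64      5.0.0.32987.GA-1        @MapR_Core
mapr-hive.noarch      0.13.201511180922-1     @MapR_Ecosystem
Available Packages
mapr-hive.noarch      1.0.201511180922-1      MapR_Ecosystem
"""

if __name__ == "__main__":
    for repo, packages in group_by_repo(SAMPLE).items():
        print(repo, packages)
```

In practice the listing would come from running yum yourself and passing its output to group_by_repo.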

Core products use a version string with three parts: X.X.X is the major version, Y is the build and Z is the release state.

The ecosystem packages are slightly different: their version string has two parts, where X.X.X is the version number from the corresponding Apache project and Y is the date.
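
For ecosystem packages, a small sketch along these lines separates the Apache version from the date. It assumes the fields are joined by dots with the date as the last, all-digit field, and that any RPM release suffix such as -1 can be dropped. The names are hypothetical, so check the layout against what Yum shows on your own cluster.

```python
from typing import NamedTuple


class EcosystemVersion(NamedTuple):
    apache_version: str  # version of the corresponding Apache project
    date: str            # trailing date field, kept as a string


def parse_ecosystem_version(text: str) -> EcosystemVersion:
    """Split an ecosystem version into the Apache version and the date.

    Assumption: dot-separated, with the date as the last field and
    made up of digits only. A release suffix after '-' is ignored.
    """
    version = text.strip().split("-", 1)[0]
    apache_version, sep, date = version.rpartition(".")
    if not sep or not apache_version or not date.isdigit():
        raise ValueError(f"unexpected ecosystem version: {text!r}")
    return EcosystemVersion(apache_version, date)
```

Keeping the date as a string avoids guessing at its exact layout, and two builds of the same Apache version can still be ordered by comparing the date fields as long as both have the same length.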

All that said, it's always good to know what version of a package you are using, but do yourself a favor and also read the release notes in the documentation.

One of the best things about MapR is the abstraction of the ecosystem layer from the core distribution. What does this mean in practice? It means that you have a range of ecosystem package versions to choose from within a single version of the core MapR distro. This is often overlooked, but it is a great example of the forethought that has gone into MapR, which is sometimes seen as “different”. If you have a mission-critical application running on Hive 0.13, there is NO reason to install a whole new cluster to run Hive 1.0. Ultimately this type of flexibility leaves the user in charge of their own cluster.