The MapR Converged Data Platform integrates Hadoop, Spark, and Apache Drill with real-time database capabilities, global event streaming, and scalable enterprise storage to power a new generation of big data applications. The MapR Platform delivers enterprise-grade security, reliability, and real-time performance while dramatically lowering both the hardware and operational costs of your most important applications and data.


With MapR, data does not need to be moved to specialized silos for processing; it can be processed in place. In fact, we have applied the concept of "Polyglot Persistence" to the MapR Platform, so you can work with multiple data types and formats directly, depending on your use case. The MapR Converged Data Platform enables direct processing of files, tables, and event streams. The MapR Platform also makes it easier to leverage existing applications and solutions by supporting POSIX-compliant, industry-standard NFS. Additionally, containerized applications can use the MapR Persistent Application Client Containers to securely access MapR Platform services (MapR-FS, MapR-DB, MapR Streams) as a persistent data store. Finally, organizations can capture, process, and analyze IoT data close to the source by leveraging MapR Edge.

To support a diverse set of applications and users, the platform also includes a range of enterprise-grade features: unified security; a global namespace; high availability; data protection and disaster recovery support; multi-tenancy and volume support; data and job placement control, so that applications can be selectively executed on the parts of a cluster with faster CPUs or SSDs; and support for heterogeneous hardware within a cluster.



High Availability

High availability (HA) is the ability of a system to remain up and running despite unforeseen failures, avoiding unplanned downtime or service disruption.

Real Multi-tenancy Including YARN

The MapR Platform lets you logically partition a physical cluster, with separate administrative control, data placement, job execution, and network access for each partition.

Real-time Streaming

MapR Streams is a global publish-subscribe event streaming system for big data. It is the only big data streaming system to reliably support global event replication at IoT scale.

Snapshots: Complete Data Protection

MapR Snapshots provide protection without duplicating the data. You can take a snapshot of a 1 PB cluster in seconds with no additional data storage.

Ease of Data Integration

MapR provides production-ready NFS access that is POSIX-compliant, highly available, and high-performance, with full support for random reads and writes.
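To make "random reads and writes" concrete, here is a minimal Python sketch that overwrites a few bytes in the middle of a file and reads them back, using only standard POSIX-style file calls. The file name `records.dat`, the helper names, and the `mount_dir` parameter are hypothetical; the sketch falls back to a local temporary directory, so nothing here is specific to MapR or requires an NFS mount.

```python
import os
import tempfile


def overwrite_at(path: str, offset: int, payload: bytes) -> None:
    """Overwrite bytes in place at `offset` without rewriting the whole file."""
    with open(path, "r+b") as f:   # "r+b": read/write, no truncation
        f.seek(offset)             # jump to an arbitrary position
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())       # ask the OS to push the write to storage


def read_at(path: str, offset: int, length: int) -> bytes:
    """Read `length` bytes starting at `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def demo(mount_dir=None) -> bytes:
    # Assumption: `mount_dir` would be a directory on an NFS mount.
    # With no argument, a local temporary directory stands in for it.
    base = mount_dir or tempfile.mkdtemp()
    path = os.path.join(base, "records.dat")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"A" * 16)                    # 16-byte file of 'A'
    overwrite_at(path, 4, b"ZZZZ")            # random write at byte 4
    return read_at(path, 0, 16)


if __name__ == "__main__":
    print(demo())
```

An existing application that already reads and writes files this way would, in principle, need only a different path to work against an NFS-mounted directory.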

Lowest Total Cost of Ownership

The architectural advantages of the MapR Platform translate into a 20-50% lower total cost of ownership (TCO), counting both capital and operational expenses.


Hadoop is built to process large amounts of data, from terabytes to petabytes and beyond. It delivers greater business impact when used as part of the MapR Converged Data Platform. The MapR Platform combines operational and analytical workloads to drive real-time business insights, which are not feasible in environments that depend on complex integrations between disparate data silos.


Apache Hadoop is a software package that includes a wide range of data processing engines on top of a distributed file system. It was designed to run a variety of computations, especially analytical jobs, on extremely large volumes of data in parallel across many commodity servers in a cluster. Example use cases on Hadoop include data lakes, customer 360-degree views, recommendations, security analytics, and clickstream analysis.
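The idea of running one computation in parallel over many partitions of data can be illustrated with a small map-and-merge sketch in plain Python. This is not Hadoop's API: threads stand in for servers, in-memory lists stand in for file blocks, and the `user,page` record format and function names are assumptions made up for this clickstream example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_partition(lines):
    """Count page views within one partition of the clickstream."""
    counts = Counter()
    for line in lines:
        _user, page = line.split(",")  # assumed record format: "user,page"
        counts[page] += 1
    return counts


def merge(a, b):
    """Combine two partial results into one."""
    return a + b


def page_views(partitions):
    """Process every partition independently, then merge the partial counts."""
    with ThreadPoolExecutor() as pool:  # threads stand in for cluster nodes
        partials = list(pool.map(map_partition, partitions))
    return reduce(merge, partials, Counter())


if __name__ == "__main__":
    sample = [
        ["u1,/home", "u2,/cart"],
        ["u1,/cart", "u3,/home", "u3,/home"],
    ]
    print(page_views(sample))
```

Because each partition is counted independently and the partial results are only merged at the end, adding more partitions (and more workers) does not change the logic, which is the property that lets this style of job spread across many servers.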