This is the RapidMiner Radoop documentation for version 10.2.
RapidMiner Radoop Compatibility
Supported Hadoop distributions
RapidMiner Radoop works with most popular Hadoop distributions. Refer to the provider's documentation for information on configuring the Hadoop cluster. The supported distributions are:
- Amazon Elastic MapReduce (EMR) 6.x
- Azure HDInsight 4.0, 5.0
- Cloudera Data Platform Private Cloud Base (CDP) 7.x
For CDP distributions, we only support the minor versions that are also supported by Cloudera. For HDInsight and Amazon EMR, operators related to model scoring are not available, because these distributions do not run Hive on Java 11.
Supported data warehouse systems (DWS)
RapidMiner Radoop supports the following data warehouse infrastructure:
- Hive 3.x (for scoring models, it must run on a Java 11 JVM to load Radoop UDFs)
Supported Spark versions
RapidMiner Radoop supports the following Spark versions:
- Apache Spark 3.x (only the Scala 2.12 distribution is supported on a Java 11 JVM)
Supported Java versions
On the Hadoop cluster, RapidMiner Radoop requires Oracle JDK 11 or OpenJDK 11 to be installed. The cluster nodes should have at least 32 GB of RAM. On the machine running the extension itself (either within RapidMiner Studio or AI Hub), RapidMiner Radoop requires Oracle Java 11 or OpenJDK Java 11.
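As a quick sanity check of the Java requirement, the following is a minimal sketch, not part of RapidMiner Radoop, that reads the output of `java -version` and reports whether the major version is 11. The helper names `parse_java_major` and `installed_java_major` are hypothetical, and the sketch assumes a `java` binary on the `PATH` of the machine being checked.

```python
import re
import subprocess
from typing import Optional


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major Java version from `java -version` output.

    Hypothetical helper for illustration; not a Radoop API.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Releases before Java 9 report themselves as 1.x (e.g. "1.8.0_292").
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major() -> Optional[int]:
    """Run `java -version` and return the major version, or None if unavailable."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return None
    # `java -version` conventionally writes its banner to stderr.
    return parse_java_major(result.stderr or result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major == 11:
        print("Java 11 detected.")
    else:
        print(f"Java 11 not detected (found: {major}).")
```

Running the script on each cluster node and on the Studio or AI Hub machine gives a first indication only; it does not distinguish between JDK vendors or verify which JVM Hive and Spark actually use.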
RapidMiner extension compatibility
RapidMiner Radoop is not compatible with the Parallel Processing Extension. This extension must be disabled when using Radoop. To disable it, select the Extensions > Manage Extensions... menu item and uncheck the box for Parallel Processing Extension.
