You are viewing the RapidMiner Radoop documentation for version 7.6.

What’s New in RapidMiner Radoop 7.0?

Enhancements and bug fixes

The following improvements are part of RapidMiner Radoop 7.0.

Enhancements

  • Added Single Process Pushdown operator
  • Added Decision Tree distributed learner operator (integrates Spark ML)
  • Added Random Forest distributed learner operator (integrates Spark ML)
  • Added Support Vector Machine distributed learner operator (integrates Spark MLlib)
  • Generate Attributes now has a new, intuitive UI that can also be used in the Filter Examples operator and on the Hadoop Data view
  • Added several new Hive functions to the Generate Attributes expression editor
  • Added support for Spark 1.6.0
  • Applied new operator grouping and new color for Radoop operators
  • Added an XML editor for Radoop connections on the Advanced Connection Properties dialog
  • Added a new convergence tolerance parameter to Linear Regression and Logistic Regression
  • Added new connection test: upload jar file required for user-defined functions to the cluster
  • Added new connection test: create permanent user-defined functions
  • Added new connection test: test Spark staging directory
  • A custom JDBC driver for Hive can now be specified in addition to the built-in drivers
  • Importing large tables into the cluster is now faster
  • Using Tez as the execution engine for Hive (Hive-on-Tez) is now possible, although not yet supported

Bug fixes

  • BUGFIX: Local temporary files are now always cleaned up
  • BUGFIX: Pivot no longer truncates the average value of integer attributes
  • BUGFIX: Temporary tables and directories are now also cleaned up if a special error occurs
  • BUGFIX: Hive command timeout is now applied to each command individually; in some cases it was previously applied across multiple commands
  • BUGFIX: In the Replace Missing operator, real values are no longer allowed for integer attributes
  • BUGFIX: Job kill for certain operators now works on secure clusters as well
  • BUGFIX: Hadoop Data view related panels are no longer listed in the Show Panel menu
  • BUGFIX: Fixed issues with non-default Hive null placeholder settings
  • BUGFIX: Long error messages no longer produce oversized pop-up windows
  • BUGFIX: The Move into new subprocess action on the Design canvas now adds a Radoop Subprocess operator
  • BUGFIX: Distributed model scoring on a new cluster or with a new user no longer fails when run from RapidMiner Server