A machine learning library of high-quality algorithms for Apache Spark. It supports the R, Python, Java, and Scala programming languages. It can run on Mesos, Hadoop, and Kubernetes, and can read data from a number of data sources, such as Hive, Cassandra, HDFS, and HBase.

First released: 2012
Developed by: Apache Software Foundation
Latest stable version: 3.x
Open-source: Yes
