Machine learning and data mining libraries
There are quite a few machine learning and data mining libraries available for Java and other JVM languages. Some of them are as follows:
- Weka (http://www.cs.waikato.ac.nz/ml/weka/) is probably the most famous data mining library in Java; it contains a lot of algorithms and has many extensions.
- JavaML (http://java-ml.sourceforge.net/) is quite an old and reliable ML library, but unfortunately it is no longer updated.
- Smile (http://haifengl.github.io/smile/) is a promising ML library that is under active development at the moment, with a lot of new methods being added.
- JSAT (https://github.com/EdwardRaff/JSAT) contains quite an impressive list of machine learning algorithms.
- H2O (http://www.h2o.ai/) is a framework for distributed ML. It is written in Java but is available for multiple languages, including Scala, R, and Python.
- Apache Mahout (http://mahout.apache.org/) is used for in-core (one machine) and distributed machine learning. The Mahout Samsara framework lets you write code in a framework-independent way and then execute it on Spark, Flink, or H2O.
There are also several libraries that specialize solely in neural networks:
- Encog (http://www.heatonresearch.com/encog/)
- DeepLearning4j (http://deeplearning4j.org/)
We will cover some of these libraries throughout the book.