LIBBLE-Spark

Introduction

LIBBLE-Spark is the variant of LIBBLE implemented on Apache Spark.

The current version of LIBBLE-Spark includes the following machine learning algorithms:

  • Classification
    • Logistic Regression (LR)
    • Logistic Regression with L1-norm Regularization
    • Support Vector Machine (SVM)
  • Regression
    • Linear Regression
    • Lasso
  • Collaborative Filtering
    • Matrix Factorization
  • Dimensionality Reduction
    • Principal Component Analysis (PCA)
    • Singular Value Decomposition (SVD)
  • Clustering
    • K-Means

Empirical Comparison

The main learning engine of LIBBLE-Spark is based on a distributed stochastic optimization algorithm called SCOPE (Scalable Composite OPtimization for lEarning). SCOPE is efficient in both computation and communication. Theoretical analysis shows that SCOPE converges at a linear rate when the objective function is strongly convex. Furthermore, empirical results on real datasets show that SCOPE can outperform other state-of-the-art distributed learning methods on Spark, including both batch learning methods and stochastic learning methods.
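To make the idea of distributed stochastic optimization more concrete, the following is a minimal single-process sketch. It is not the SCOPE implementation and does not use the LIBBLE-Spark API. It simulates a generic SVRG-style scheme with local updates and model averaging on a strongly convex ridge regression objective. The data, the function names (`make_shards`, `local_epoch`, `train`), the step size, and the regularization constant are all assumptions chosen for illustration. Each outer round needs only one gradient aggregation and one model average across workers, which is where the communication savings of this family of methods come from.

```python
import random

def grad_i(w, x, y, lam):
    """Gradient of 0.5*(w.x - y)^2 + 0.5*lam*||w||^2 for a single example."""
    err = sum(wj * xj for wj, xj in zip(w, x)) - y
    return [err * xj + lam * wj for xj, wj in zip(x, w)]

def objective(w, data, lam):
    """Average regularized squared loss over all examples."""
    loss = sum(0.5 * (sum(wj * xj for wj, xj in zip(w, x)) - y) ** 2
               for x, y in data) / len(data)
    return loss + 0.5 * lam * sum(wj * wj for wj in w)

def full_gradient(w, shards, lam):
    """Each simulated worker sums gradients over its shard; the sums are averaged."""
    n = sum(len(s) for s in shards)
    g = [0.0] * len(w)
    for shard in shards:
        for x, y in shard:
            g = [a + b for a, b in zip(g, grad_i(w, x, y, lam))]
    return [a / n for a in g]

def local_epoch(w_ref, z, shard, lam, eta, steps, rng):
    """Variance-reduced stochastic steps on one worker, using local data only."""
    u = list(w_ref)
    for _ in range(steps):
        x, y = shard[rng.randrange(len(shard))]
        gu = grad_i(u, x, y, lam)       # stochastic gradient at the local iterate
        gw = grad_i(w_ref, x, y, lam)   # same example, at the reference point
        u = [uj - eta * (a - b + zj) for uj, a, b, zj in zip(u, gu, gw, z)]
    return u

def train(shards, dim, lam=0.1, eta=0.05, outer=30, seed=0):
    """Outer loop: aggregate the full gradient, run local epochs, average models."""
    rng = random.Random(seed)
    all_data = [ex for s in shards for ex in s]
    w = [0.0] * dim
    history = [objective(w, all_data, lam)]
    for _ in range(outer):
        z = full_gradient(w, shards, lam)
        local_models = [local_epoch(w, z, s, lam, eta, len(s), rng) for s in shards]
        w = [sum(col) / len(local_models) for col in zip(*local_models)]
        history.append(objective(w, all_data, lam))
    return w, history

def make_shards(num_workers=4, per_worker=50, seed=1):
    """Synthetic linear data, split evenly across simulated workers."""
    rng = random.Random(seed)
    true_w = [1.0, -2.0, 0.5]
    shards = []
    for _ in range(num_workers):
        shard = []
        for _ in range(per_worker):
            x = [rng.uniform(-1.0, 1.0) for _ in true_w]
            y = sum(a * b for a, b in zip(true_w, x)) + rng.gauss(0.0, 0.01)
            shard.append((x, y))
        shards.append(shard)
    return shards

if __name__ == "__main__":
    model, hist = train(make_shards(), dim=3)
    print("objective before training:", hist[0])
    print("objective after training: ", hist[-1])
```

On Spark, the per-shard loops would run inside the partitions of an RDD and the two aggregation steps would become cluster-wide reductions. That mapping is only suggested here, not shown.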

Tutorial

Open Source

API

Please refer to the Application Programming Interface (API) documentation.

Development Team