2025

Document Similarity and Clustering Toolkit

Data Mining

Language:

  • Python

Libraries:

  • NumPy

  • PySpark

  • scikit-learn

Platform:

  • Spark

README.md