A Python machine learning toolkit for time series analysis with a scikit-learn-compatible API.
tslearn is a Python machine learning toolkit specifically designed for time series analysis. It provides algorithms for classification, clustering, regression, and preprocessing of temporal data, solving the problem of applying ML techniques to sequential datasets. The library follows scikit-learn's API conventions, making it familiar and easy to integrate into existing workflows.
tslearn is aimed at data scientists, researchers, and machine learning practitioners working with time series data in fields like finance, healthcare, IoT, and signal processing.
Developers choose tslearn because it offers a comprehensive, scikit-learn-compatible toolkit dedicated to time series analysis, with specialized algorithms not available in general-purpose ML libraries. Its consistent API and support for variable-length time series make it both powerful and practical.
The machine learning toolkit for time series analysis in Python
Follows scikit-learn API conventions, allowing seamless use with pipelines and hyper-parameter tuning, as demonstrated in the examples and documentation.
Offers a range of methods like KShape for clustering and LearningShapelets for classification, tailored specifically for temporal data analysis.
Handles datasets whose time series have different lengths in a number of its algorithms, so users do not have to pad or truncate series by hand, as highlighted in the variable-length support section of the README.
Provides extensive documentation with a gallery of examples and API reference hosted on Read the Docs, making it easier to get started and troubleshoot.
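To make the variable-length point concrete, here is a minimal pure-Python sketch of holding series of different lengths in one structure shaped (number of series, longest length, dimensions), with NaN as the placeholder for missing trailing values. The function name `to_padded_dataset` is hypothetical, and NaN padding is an assumption of this sketch. In practice tslearn ships its own helper, `tslearn.utils.to_time_series_dataset`, which builds a NumPy array for this purpose.

```python
import math

def to_padded_dataset(series_list):
    """Return nested lists shaped (n_ts, max_len, d), padded with NaN.

    Hypothetical helper for illustration only; assumes every series is
    non-empty and all series share the same number of dimensions.
    """
    # Promote scalar observations to 1-element feature vectors.
    normalized = [
        [list(p) if isinstance(p, (list, tuple)) else [float(p)] for p in ts]
        for ts in series_list
    ]
    max_len = max(len(ts) for ts in normalized)
    d = len(normalized[0][0])
    # Append NaN rows so that every series reaches max_len time steps.
    return [
        ts + [[math.nan] * d for _ in range(max_len - len(ts))]
        for ts in normalized
    ]

dataset = to_padded_dataset([[1, 2, 3, 4], [1, 2, 3], [2, 5, 6, 7, 8, 9]])
shape = (len(dataset), len(dataset[0]), len(dataset[0][0]))
print(shape)
```

The three input series have lengths 4, 3, and 6, so the result holds three series of six time steps and one dimension each.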
Requires time series to be formatted as 3D NumPy arrays of shape (n_ts, sz, d), that is, number of series by number of time steps by number of dimensions. This can be non-trivial to set up and may necessitate additional preprocessing steps, as noted in the getting started section.
Includes only basic neural networks like MLP, lacking advanced architectures such as RNNs or CNNs that are common in modern time series analysis, limiting its use for complex patterns.
Algorithms like Dynamic Time Warping have a computational cost that grows with the product of the lengths of the two series being compared, which can be prohibitive for large datasets or real-time processing, as acknowledged in the project's performance considerations.
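The cost of Dynamic Time Warping is easiest to see in a textbook dynamic-programming version. The sketch below is not tslearn's implementation. It is a plain-Python illustration for univariate series, and the name `dtw_distance` is chosen here for the example. It fills a table with one cell per pair of time steps, so comparing two series of lengths n and m takes on the order of n × m operations, and comparing every pair in a dataset multiplies that by the number of pairs.

```python
import math

def dtw_distance(a, b):
    """Textbook DTW between two univariate sequences (illustrative sketch)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated squared cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):          # n rows ...
        for j in range(1, m + 1):      # ... times m columns
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(
                cost[i - 1][j],        # a advances, b stays
                cost[i][j - 1],        # b advances, a stays
                cost[i - 1][j - 1],    # both advance
            )
    return math.sqrt(cost[n][m])

print(dtw_distance([1, 2, 3], [2, 2, 2, 3, 4]))
```

Because the alignment may repeat time steps, the two inputs do not need to have the same length.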