Most time series data mining algorithms use similarity search as a core subroutine, and thus the time taken for similarity search is the bottleneck for virtually all time series data mining algorithms. The difficulty of scaling search to large datasets largely explains why most academic work on time series data mining has plateaued at considering a few million time series objects, while much of industry and science sits on billions of time series objects waiting to be explored. In this work we show that by using a combination of four novel ideas we can search and mine truly massive time series for the first time. We demonstrate the following extremely unintuitive fact: in large datasets we can exactly search under DTW much more quickly than the current state-of-the-art search algorithms. We demonstrate our work on the largest set of time series experiments ever attempted. In particular, the largest dataset we consider is larger than the combined size of all of the time series datasets considered in all data mining papers ever published. We show that our ideas allow us to solve higher-level time series data mining problems such as motif discovery and clustering at scales that would otherwise be untenable. In addition to mining massive datasets, we will show that our ideas also have implications for real-time monitoring of data streams, allowing us to handle much faster arrival rates and/or use cheaper and lower-powered devices than are currently possible.
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6816304 | PMC |
http://dx.doi.org/10.1145/2339530.2339576 | DOI Listing |