Time series subsequence matching (or signal searching) has importance in a variety of areas in health care informatics. These areas include case-based diagnosis and treatment as well as the discovery of trends and correlations between data. Much of the traditional research in signal searching has focused on high dimensional -NN matching. However, the results of -NN are often small and yield minimal information gain; especially with higher dimensional data. This paper proposes a randomized Monte Carlo sampling method to broaden search criteria such that the query results are an accurate sampling of the complete result set. The proposed method is shown both theoretically and empirically to improve information gain. The number of query results are increased by several orders of magnitude over approximate exact matching schemes and fall within a Gaussian distribution. The proposed method also shows excellent performance as the majority of overhead added by sampling can be mitigated through parallelization. Experiments are run on both simulated and real-world biomedical datasets.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5016193PMC
http://dx.doi.org/10.1109/BIBM.2012.6392646DOI Listing

Publication Analysis

Top Keywords

monte carlo
8
time series
8
signal searching
8
proposed method
8
carlo approach
4
approach biomedical
4
biomedical time
4
series search
4
search time
4
series subsequence
4

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!