PartSpan: Parallel Sequence Mining of Trajectory Patterns

Trajectory pattern mining has recently attracted increasing attention. This paper addresses the parallel mining of trajectory patterns, together with newly proposed concepts in trajectory pattern mining. An efficient parallel trajectory sequential pattern mining algorithm (PartSpan) is proposed that incorporates three key techniques: prefix-projection, parallel formulation, and candidate pruning. The prefix-projection technique decomposes the search space and greatly reduces the number of candidate trajectory sequences. The parallel formulation integrates the data-parallel and task-parallel formulations to partition the computation and assign it to multiple processors in a way that reduces the communication cost across processors. Representative experiments are used to evaluate the performance of PartSpan. The results show that PartSpan outperforms GSP-based and SPADE-based parallel algorithms in mining very large trajectory databases.
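As an illustration of the prefix-projection idea only, the following is a minimal sequential sketch in the style of PrefixSpan, not the PartSpan algorithm itself. It assumes each trajectory has already been reduced to a sequence of region labels with one label per step; the function names `project` and `prefix_span` and the toy data are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def project(database, item):
    """Build the projected database for `item`.

    For every sequence containing `item`, keep only the suffix that
    follows its first occurrence. Sequences without `item` are dropped.
    """
    projected = []
    for seq in database:
        if item in seq:
            projected.append(seq[seq.index(item) + 1:])
    return projected


def prefix_span(database, min_support, prefix=()):
    """Yield (pattern, support) pairs by recursive prefix projection.

    Only items that are frequent in the current projected database are
    extended, so infrequent candidates are never generated.
    """
    # Count each item at most once per sequence.
    counts = defaultdict(int)
    for seq in database:
        for item in set(seq):
            counts[item] += 1

    for item in sorted(counts):
        support = counts[item]
        if support < min_support:
            continue  # prune: no extension of this prefix can be frequent
        pattern = prefix + (item,)
        yield pattern, support
        # Recurse into the smaller, projected search space.
        yield from prefix_span(project(database, item), min_support, pattern)


# Toy database: each tuple is one trajectory as a sequence of region labels.
trajectories = [
    ("A", "B", "C"),
    ("A", "C"),
    ("B", "C"),
    ("A", "B"),
]
patterns = dict(prefix_span(trajectories, min_support=2))
```

Each top-level projected database (one per frequent region) can be mined independently of the others, which is the kind of decomposition a task-parallel formulation could distribute across processors. How PartSpan actually partitions and assigns this work is not specified in the abstract.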
