Algorithm Engineering

We present a new technique to encode a deterministic finite automaton (DFA). Based on the specific properties of Glushkov's nondeterministic finite automaton (NFA) construction algorithm, we are able to encode the DFA using $(m+1)(2^{m+1} + |\Sigma|)$ bits, where $m$ is the number of characters (excluding operator symbols) in the regular expression and $\Sigma$ is the alphabet. This compares favorably against the worst case of $(m+1)\,2^{m+1}\,|\Sigma|$ bits needed by a classical DFA representation and the $m(2^{2m+1} + |\Sigma|)$ bits needed by the Wu and Manber approach implemented in Agrep. Our approach is practical and simple to implement, and, as we show experimentally, it searches for regular expressions of moderate size (which include most cases of interest) faster than any previously existing algorithm.
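To make the comparison concrete, the following minimal sketch evaluates the three space bounds for a given pattern size and alphabet size. It reads the bounds as $(m+1)(2^{m+1} + |\Sigma|)$ for the compact encoding, $(m+1)\,2^{m+1}\,|\Sigma|$ for the classical DFA table in the worst case, and $m(2^{2m+1} + |\Sigma|)$ for the Wu and Manber representation. The function names and the sample values `m = 10` and `sigma = 256` are illustrative assumptions, not taken from the paper.

```python
# Illustrative arithmetic only: function names and sample values are assumptions.

def compact_dfa_bits(m: int, sigma: int) -> int:
    """Bits for the compact Glushkov-based encoding: (m+1)(2^(m+1) + |Sigma|)."""
    return (m + 1) * (2 ** (m + 1) + sigma)


def classical_dfa_bits(m: int, sigma: int) -> int:
    """Worst-case bits for a classical DFA table: (m+1) * 2^(m+1) * |Sigma|."""
    return (m + 1) * 2 ** (m + 1) * sigma


def wu_manber_bits(m: int, sigma: int) -> int:
    """Bits for the Wu and Manber (Agrep) representation: m(2^(2m+1) + |Sigma|)."""
    return m * (2 ** (2 * m + 1) + sigma)


if __name__ == "__main__":
    m, sigma = 10, 256  # hypothetical pattern with 10 characters over a byte alphabet
    for name, fn in [
        ("compact", compact_dfa_bits),
        ("classical", classical_dfa_bits),
        ("wu-manber", wu_manber_bits),
    ]:
        bits = fn(m, sigma)
        print(f"{name:>10}: {bits} bits ({bits / 8 / 1024:.1f} KiB)")
```

Under these assumptions the compact encoding needs 25,344 bits, against 5,767,168 bits for the classical table and 20,974,080 bits for the Wu and Manber representation. The gap comes from adding the alphabet term to the exponential state term rather than multiplying the two.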
