Title page for ETD etd-03302011-195620


Document Type Doctoral Thesis
Author Watson, Bruce William
Email bruce@fastar.org
URN etd-03302011-195620
Document Title Constructing minimal acyclic deterministic finite automata
Degree PhD
Department Computer Science
Supervisor
Advisor Name Title
Prof D G Kourie Supervisor
Keywords
  • acyclic finite automata
  • minimization
  • taxonomy
  • algorithmics
  • correctness-by-construction
Date 2011-04-18
Availability unrestricted
Abstract
This thesis is submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Ph.D) in the FASTAR group of the Department of Computer Science, University of Pretoria, South Africa. I present a number of algorithms for constructing minimal acyclic deterministic finite automata (MADFAs), most of which I originally derived/designed or co-discovered. Being acyclic, such automata represent finite languages and have proven useful in applications such as spellchecking, virus-searching and text indexing. In many of those applications, the automata grow to billions of states, making them difficult to store without using various compression techniques the most important of which is minimization. Results from the late 1950s show that minimization yields a unique automaton (for a given language), and later results show that minimization of acyclic automata is possible in time linear in the number of states. These two results make for a rich area of algorithmics research; automata and algorithmics research are relatively old fields of computing science and the discovery/invention of new algorithms in the field is an exciting result.

I present both incremental and nonincremental algorithms. With nonincremental techniques, the unminimized acyclic deterministic finite automaton (ADFA) is first constructed and then minimized. As mentioned above, the unminimized ADFA can be very large indeed often even too large to fit within the virtual memory space of the computer. As a result, incremental techniques for minimization (i.e. the ADFA is minimized during its construction) become interesting. Incremental algorithms frequently have some overhead: if the unminimized ADFA fits easily within physical memory, it may still be faster to use nonincremental techniques.

The presentation used in this thesis has a few unusual characteristics:

  • Few other presentations follow a correctness-by-construction style for presenting and deriving algorithms. The presentations given here include correctness arguments or sketches thereof.
  • The presentation is taxonomic emphasizing the similarities and differences between the algorithms at a fundamental level.
  • While it is possible to present these algorithms in a formal-language-theoretic setting, this thesis remains somewhat closer to the actual implementation issues.
  • In several chapters, new algorithms and interesting new variants of existing algorithms are presented.
  • It gives new presentations of many existing algorithms all in a common format with common examples.
  • There are extensive links to the existing literature.

2010 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.

Please cite as follows:

Watson, BW 2010, Constructing minimal acyclic deterministic finite automata, PhD thesis, University of Pretoria, Pretoria, viewed yymmdd < http://upetd.up.ac.za/thesis/available/etd-03302011-195620/ >

C11/90/ag

Files
  Filename       Size       Approximate Download Time (Hours:Minutes:Seconds) 
 
 28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access 
  thesis.pdf 883.30 Kb 00:04:05 00:02:06 00:01:50 00:00:55 00:00:04

Browse All Available ETDs by ( Author | Department )

If you have more questions or technical problems, please Contact UPeTD.