Alumni Dissertations and Theses

 
 

  • LIGHTWEIGHT 3D MODELING OF URBAN BUILDINGS FROM RANGE DATA

    Author:
    Weihong Li
    Year of Dissertation:
    2011
    Program:
    Computer Science
    Advisor:
    George Wolberg
    Abstract:

    Laser range scanners are widely used to acquire accurate scene measurements. The massive point clouds they generate, however, present challenges to efficient modeling and visualization. State-of-the-art techniques for generating 3D models from voluminous range data are well known to demand large computational and storage resources. In this thesis, attention is directed to the modeling of urban buildings directly from range data. We present an efficient modeling algorithm that exploits the a priori knowledge that buildings can be modeled from cross-sectional contours using extrusion and tapering operations. Inspired by this simple workflow, we identify key cross-sectional slices within the point cloud. These slices capture changes across the building facade along the principal axes. Standard image processing algorithms are used to remove noise, fill holes, and vectorize the projected points into planar contours. Applying extrusion and tapering operations to these contours permits us to achieve dramatic geometry compression, making the resulting models suitable for web-based applications such as Google Earth or Microsoft Virtual Earth. This work has applications in architecture, urban design, virtual city touring, and online gaming. We present experimental results on the exteriors and interiors of urban building datasets to validate the proposed algorithm.
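    To make the extrusion-and-tapering idea concrete, here is a minimal sketch (not the thesis implementation; the function name and taper convention are invented for illustration): a planar contour is lifted into a 3D quad mesh, with a taper factor that uniformly scales the top cross-section.

```python
# Minimal sketch (not the thesis implementation): extruding a planar
# contour along the vertical axis, with an optional taper factor that
# scales the top cross-section -- the two operations the abstract names.

def extrude_contour(contour, height, taper=1.0):
    """contour: list of (x, y) vertices of a closed planar polygon.
    Returns (vertices, faces): 3D points and quad faces (vertex indices)."""
    n = len(contour)
    bottom = [(x, y, 0.0) for x, y in contour]
    # A taper of 1.0 is a straight extrusion; < 1.0 narrows the top.
    top = [(x * taper, y * taper, height) for x, y in contour]
    vertices = bottom + top
    # One quad per contour edge, wrapping around at the last vertex.
    faces = [(i, (i + 1) % n, (i + 1) % n + n, i + n) for i in range(n)]
    return vertices, faces

# A rectangular building footprint extruded to a 10-unit-tall slab.
verts, faces = extrude_contour([(0, 0), (4, 0), (4, 3), (0, 3)], height=10.0)
```

    A four-vertex footprint extruded this way yields an eight-vertex, four-quad wall mesh, which illustrates the geometry compression the abstract describes: a facade that a range scan samples with millions of points reduces to a handful of vertices.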

  • STRUCTURE-BASED SEARCH TO SOLVE CONSTRAINT SATISFACTION PROBLEMS

    Author:
    Xingjian Li
    Year of Dissertation:
    2011
    Program:
    Computer Science
    Advisor:
    Susan Epstein
    Abstract:

    Constraint satisfaction is a paradigm that applies to many challenging real-world tasks with direct practical relevance, such as planning and scheduling. It models these tasks as constraint satisfaction problems (CSPs) and then solves them with search. Compared to artificially-generated CSPs, real-world CSPs are more likely to have non-random structure. This dissertation shows the difficulties encountered by traditional and state-of-the-art search techniques when the structure of such a problem is overlooked. Here, a novel hybrid search algorithm for CSPs combines two fundamental search paradigms (global search and local search) to solve structured CSPs. It first uses local search to detect crucial structure (the structure identification stage), and then applies that structural knowledge to solve the problem by global search (the search stage). In the structure identification stage, we adapt a local search metaheuristic to find crucial structure within a CSP. This dissertation presents different structure detection methods based on different metrics. Static metrics from the original problem are sometimes available without any cost and are also straightforward to use, but they may not represent the real challenges within the problem. When dynamic metrics reveal the true crucial structure, they provide better guidance for search. On easier problems, however, this benefit does not justify the additional computational cost of the dynamic metrics. In the search stage, the structure identified in the previous stage guides global search for a solution. This dissertation presents various approaches to exploit this structure, which is presumably the most difficult area for search. Under this premise, search can either address the identified structure earlier than other areas of the problem, or allow the structure to direct inference, a technique that reduces search space for global search. 
In addition, this dissertation presents a new visualization tool for binary CSPs and the structures identified by our algorithms. The tool inspires, supports, and verifies the design and improvement of structure-based search. On benchmark and real-world CSPs, structure-based search not only improves search performance, but also provides users with direct explanations and visualization of the inherent challenges in each problem.
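    As an illustration of the two-stage idea, the sketch below runs backtracking search that addresses "structurally crucial" variables first. This is illustrative only: the function and variable names are invented, and the structure metric here is a simple static stand-in (constraint-graph degree, one of the static metrics the abstract contrasts) rather than the local-search identification developed in the dissertation.

```python
# Illustrative sketch: a tiny backtracking CSP solver that assigns
# high-degree (structurally central) variables before the rest.

def solve(variables, domains, constraints):
    """constraints: dict mapping (u, v) -> predicate(val_u, val_v)."""
    # Degree of each variable in the constraint graph (static metric).
    degree = {v: 0 for v in variables}
    for u, v in constraints:
        degree[u] += 1
        degree[v] += 1
    order = sorted(variables, key=lambda v: -degree[v])  # crucial first

    def consistent(assignment, var, val):
        for (u, v), pred in constraints.items():
            if u == var and v in assignment and not pred(val, assignment[v]):
                return False
            if v == var and u in assignment and not pred(assignment[u], val):
                return False
        return True

    def backtrack(i, assignment):
        if i == len(order):
            return dict(assignment)
        var = order[i]
        for val in domains[var]:
            if consistent(assignment, var, val):
                assignment[var] = val
                result = backtrack(i + 1, assignment)
                if result is not None:
                    return result
                del assignment[var]
        return None  # no consistent value: backtrack

    return backtrack(0, {})

# Graph coloring as a CSP: adjacent nodes must take different values.
ne = lambda a, b: a != b
sol = solve(["a", "b", "c"], {v: [1, 2] for v in "abc"},
            {("a", "b"): ne, ("b", "c"): ne})
```

    Variable "b" touches both constraints, so it is assigned first; on structured problems, settling such central variables early prunes the search tree sooner than a fixed left-to-right ordering would.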

  • ANALYSIS OF DNA MOTIFS IN THE HUMAN GENOME

    Author:
    Yupu Liang
    Year of Dissertation:
    2013
    Program:
    Computer Science
    Advisor:
    Dina Sokol
    Abstract:

    DNA motifs include repeat elements, promoter elements, and gene regulatory elements, and play a critical role in the human genome. This thesis describes a genome-wide computational study on two groups of motifs: tandem repeats and core promoter elements. Tandem repeats in DNA sequences are extremely relevant in biological phenomena and diagnostic tools. Computational programs that discover tandem repeats generate a huge volume of data, which can be difficult to decipher without further organization. A new method is presented here to organize and rank detected tandem repeats through clustering and classification. Our work presents multiple ways of expressing tandem repeats using the n-gram model with different clustering distance measures. Analysis of the clusters for the tandem repeats in the human genome shows that the method yields a well-defined grouping in which similarity among repeats is apparent. Our new, alignment-free method facilitates the analysis of the myriad tandem repeats found throughout the human genome. We believe that this work will lead to new discoveries on the roles, origins, and significance of tandem repeats. As with tandem repeats, promoter sequences of genes contain binding sites for proteins that play critical roles in mediating expression levels. Promoter region binding proteins and their co-factors influence the timing and context of transcription. Despite the critical regulatory role of these non-coding sequences, computational methods to identify and predict DNA binding sites are extremely limited. The work reported here analyzes the relative occurrence of core promoter elements (CPEs) in and around transcription start sites. We found that, across all data sets, 49%-63% of upstream regions have either TATA box or DPE elements. Our results suggest the possibility of predicting transcription start sites by combining CPE signals with other promoter signals such as CpG islands and clusters of specific transcription factor binding sites.
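    The n-gram representation can be sketched as follows (a minimal illustration, assuming overlapping n-gram counts and plain cosine distance as the measure; the thesis evaluates several representations and distance measures, which may differ from this one):

```python
# Alignment-free comparison of tandem repeats via n-gram count profiles.
from collections import Counter
from math import sqrt

def ngram_profile(seq, n=2):
    """Counts of overlapping n-grams in a DNA sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))

def cosine_distance(p, q):
    """1 - cosine similarity between two count profiles (Counters)."""
    keys = set(p) | set(q)
    dot = sum(p[k] * q[k] for k in keys)  # Counter returns 0 for missing keys
    norm = (sqrt(sum(v * v for v in p.values()))
            * sqrt(sum(v * v for v in q.values())))
    return 1.0 - dot / norm if norm else 1.0

# Repeats of the same motif land close; unrelated repeats land far apart.
d_same = cosine_distance(ngram_profile("ACACACAC"), ngram_profile("ACACAC"))
d_diff = cosine_distance(ngram_profile("ACACACAC"), ngram_profile("GGTTGGTT"))
```

    Repeats built from the same motif end up at near-zero distance from each other and at maximal distance from repeats sharing no n-grams, which is what allows a clustering algorithm to group them without any sequence alignment.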

  • Feature Selection for Error Detection and Recovery in Spoken Dialogue Systems

    Author:
    Tiziana Ligorio
    Year of Dissertation:
    2011
    Program:
    Computer Science
    Advisor:
    Susan Epstein
    Abstract:

    Spoken dialogue between people and machines is increasingly common, but not as flexible and complex as that between people. Spoken dialogue is susceptible to error: human speech is often loosely structured, and people change their minds mid-sentence, repeat themselves, pause, and produce non-speech sounds. A spoken dialogue system expected to handle complex and flexible dialogue like that between people must be robust to error, and employ strategies for error detection and recovery. People recover from understanding errors during spoken dialogue with little effort, and without disruption of the dialogue flow. To study how people would handle the same kinds of errors that a spoken dialogue system must contend with, the work recounted here embedded a person within a spoken dialogue system with noisy speech recognition, that is, the person and the system processed information concurrently. The person was also ablated, that is, had input and output restricted to that of a spoken dialogue system. Run-time system features were collected to train data-driven models of human error detection and recovery. Our results indicate that people have successful dialogue interactions even with very high levels of noise, and that they rely heavily on context and questions to disambiguate noisy speech recognition. Empirical results also demonstrate that data-driven models of human error detection and recovery strategies can be obtained from system features alone. These models can draw from a rich and varied collection of system features across all levels of spoken language processing, and careful selection from among them is essential for effective modeling. To this end, this research developed two novel feature selection algorithms, one general and one knowledge-guided. They effectively and efficiently identify features for a particular machine learning algorithm and data set, and, guided by domain knowledge, they support adequate learning of error detection and recovery models.
The resulting models of human error detection and recovery were then implemented within a spoken dialogue system. Empirical results indicate these strategies support dialogue management to address speech recognition noise, and improve overall task success. Our results demonstrate that successful dialogue with a spoken dialogue system is possible despite high levels of understanding error.
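    A generic wrapper-style forward-selection loop conveys the basic idea of feature selection (this is neither of the dissertation's two algorithms; the feature names and scoring function below are invented for illustration):

```python
# Greedy forward feature selection: repeatedly add the feature that most
# improves a scoring function, stopping when no addition helps.

def forward_select(features, score, max_features=None):
    """features: list of names; score(subset) -> float, higher is better."""
    selected, best = [], score([])
    limit = max_features or len(features)
    while len(selected) < limit:
        gains = [(score(selected + [f]), f)
                 for f in features if f not in selected]
        top_score, top_f = max(gains)
        if top_score <= best:  # stop when no remaining feature helps
            break
        selected.append(top_f)
        best = top_score
    return selected

# Hypothetical dialogue-system features with a toy scorer: each feature has
# a standalone value, and every extra feature beyond the first costs 2.5.
value = {"asr_confidence": 3.0, "turn_length": 2.0, "pause_count": 2.9}
toy_score = lambda subset: (sum(value[f] for f in subset)
                            - 2.5 * max(0, len(subset) - 1))
picked = forward_select(list(value), toy_score)
```

    With this toy scorer the loop picks the two strongest features and then stops, because adding the weakest one no longer pays for its cost; a real wrapper would use cross-validated model accuracy as the score.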

  • Data-Driven Synthesis of Animations of Spatially Inflected American Sign Language Verbs Using Human Data

    Author:
    Pengfei Lu
    Year of Dissertation:
    2014
    Program:
    Computer Science
    Advisor:
    Matt Huenerfauth
    Abstract:

    Techniques for producing realistic and understandable animations of American Sign Language (ASL) have accessibility benefits for signers with lower levels of written language literacy. Previous research in sign language animation did not address the specific linguistic issue of space use and verb inflection, due to a lack of sufficiently detailed and linguistically annotated ASL corpora, which are necessary for modern data-driven approaches. In this dissertation, a high-quality ASL motion capture corpus with ASL-specific linguistic structures is collected, annotated, and evaluated using carefully designed protocols and well-calibrated motion capture equipment. In addition, ASL animations are modeled, synthesized, and evaluated based on samples of ASL signs collected from native-signer animators or from signers recorded using motion capture equipment. Part I of this dissertation focuses on how an ASL corpus is collected, including unscripted ASL passages and ASL inflecting verbs, signs in which the location and orientation of the hands are influenced by the arrangement of locations in 3D space that represent entities under discussion. Native signers are recorded in a studio with motion capture equipment: cyber-gloves, body suit, head tracker, hand tracker, and eye tracker. Part II describes how ASL animation is synthesized using our corpus of ASL inflecting verbs. Specifically, mathematical models of hand movement are trained on animation data of signs produced by a native signer. This dissertation work demonstrates that mathematical models can be trained and built using movement data collected from humans. The evaluation studies with deaf native signer participants show that the verb animations synthesized from our models have similar understandability in subjective-rating and comprehension-question scores to animations produced by a human animator, or to animations driven by a human's motion capture data.
The modeling techniques in this dissertation are applicable to other types of ASL signs and to other sign languages used internationally. These models' parameterization of sign animations can increase the repertoire of generation systems and can automate the work of humans using sign language scripting systems.

  • Efficient Processing of Very Large XML Documents in Small Space

    Author:
    Matthew Meyer
    Year of Dissertation:
    2012
    Program:
    Computer Science
    Advisor:
    Ira Rudowsky
    Abstract:

    The focus of this research was to develop a highly efficient serialized data structure for use in Document Object Model (DOM) query systems of Extensible Markup Language (XML) documents, which are increasingly the dominant model for data representation across the World Wide Web. The space efficiency of the serialized data structure developed is shown to improve query response time and to eliminate the need to rebuild the DOM representation each time an existing XML document is queried. Moreover, this serialized data structure enables DOM modeling of extremely large XML documents at a low and fixed memory cost. New algorithms and software tools for working with these data structures have been developed, allowing for compatibility with standard DOM modeling. The structures, tools, and techniques presented can be readily adapted for use on the Semantic Web as it becomes a more prominent feature of the internet. The improved performance of the XML database storage and retrieval application developed in this thesis helps close the gap between relational database application performance and native XML database (NXD) application performance. In addition, the novel DOM querying technique that we have developed in this thesis has potential application on handheld devices, autonomous robots, and other devices with limited memory.

  • The Computational Complexity of Some Games and Puzzles With Theoretical Applications

    Author:
    Vasiliki Mitsou
    Year of Dissertation:
    2014
    Program:
    Computer Science
    Advisor:
    Amotz Bar-Noy
    Abstract:

    The subject of this thesis is the algorithmic properties of one- and two-player games people enjoy playing, such as Sudoku or Chess. Questions asked about puzzles and games in this context are of the following type: can we design efficient computer programs that play optimally given any opponent (for a two-player game), or solve any instance of the puzzle in question? We examine four games and puzzles and show algorithmic as well as intractability results. First, we study the wolf-goat-cabbage puzzle, where a man wants to transport a wolf, a goat, and a cabbage across a river by using a boat that can carry only one item at a time, making sure that no incompatible items are left alone together. We study generalizations of this puzzle, showing a close connection with the Vertex Cover problem that implies NP-hardness as well as inapproximability results. Second, we study the SET game, a card game where the objective is to form sets of cards that match in a certain sense using cards from a special deck. We study single- and multi-round variations of this game and establish interesting connections with other classical computational problems, such as Perfect Multi-Dimensional Matching, Set Packing, Independent Edge Dominating Set, and Arc Kayles. We prove algorithmic and hardness results in the classical and the parameterized sense. Third, we study the UNO game, a game of colored numbered cards where players take turns discarding cards that match either in color or in number. We extend results by Demaine et al. (2010 and 2014) that connected one- and two-player generalizations of the game to Edge Hamiltonian Path and Generalized Geography, proving that a solitaire version parameterized by the number of colors is fixed-parameter tractable and that a k-player generalization for k greater than or equal to 3 is PSPACE-hard. 
Finally, we study the Scrabble game, a word game where players are trying to form words in a crossword fashion by placing letter tiles on a grid board. We prove that a generalized version of Scrabble is PSPACE-hard, answering a question posed by Demaine and Hearn in 2008.
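    The base wolf-goat-cabbage puzzle itself is a small state-space search; a breadth-first sketch (illustrative only; the thesis's results concern generalized versions and their connection to Vertex Cover) finds the classic seven-crossing solution:

```python
# Breadth-first search over river-crossing states for the classic puzzle.
from collections import deque

ITEMS = frozenset({"wolf", "goat", "cabbage"})
BAD = [{"wolf", "goat"}, {"goat", "cabbage"}]  # pairs unsafe when unattended

def safe(bank):
    return not any(pair <= bank for pair in BAD)

def solve():
    start = (ITEMS, 0)  # (items on the start bank, farmer's bank: 0=start)
    queue, seen = deque([(start, [])]), {start}
    while queue:
        (left, side), path = queue.popleft()
        if not left and side == 1:           # everything across: done
            return path
        here = left if side == 0 else ITEMS - left
        for cargo in list(here) + [None]:    # carry one item or nothing
            moved = frozenset() if cargo is None else frozenset({cargo})
            new_left = left - moved if side == 0 else left | moved
            # The bank the farmer departs is left unattended.
            unattended = new_left if side == 0 else ITEMS - new_left
            state = (new_left, 1 - side)
            if safe(unattended) and state not in seen:
                seen.add(state)
                move = (cargo, "across" if side == 0 else "back")
                queue.append((state, path + [move]))
    return None

plan = solve()
```

    Because taking anything but the goat first strands an unsafe pair, every shortest plan starts by ferrying the goat, and BFS guarantees the seven-move minimum; the generalized versions studied in the thesis blow this tidy picture up into NP-hard territory.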

  • A Tiling Approach to Producing High Quality Solutions to Real World Timetabling Problems

    Author:
    Douglas Moody
    Year of Dissertation:
    2011
    Program:
    Computer Science
    Advisor:
    Amotz Bar-Noy
    Abstract:

    The problem of constructing timetables in a fast and efficient manner is faced by a wide variety of scheduling practitioners. These timetables may specify the weekly calendar of university courses for a semester or contain the dates and times of games for a sports league. Current areas of research in timetabling have focused on problem instances that are small and often not representative of real-world challenges. Hence, the solution approaches proposed often do not meet the needs of the scheduling practitioner. This thesis describes a robust and efficient technique using a new approach, referred to as "tiling," to provide high quality solutions for a variety of timetabling problems. The tiling method is an efficient and flexible method capable of managing the size and complexity of real-world timetables. The tiles in this method represent conjectures about the relationships among the events to be scheduled within a timetabling problem. A constructive approach is used to create an initial solution in phase one of the tiling method. A second phase of local search is used to tune the solution. We demonstrate the flexibility of the approach through its application to large and small problems in various timetabling areas. By applying our method to problems in previous studies, we are able to demonstrate how our tiles are found within the best known solutions to date, and are comparable in quality with respect to the objective function. Yet the tiling approach uses only a small fraction of the computing resources used in the best approaches. Using our tiling approach, we directly address real-world timetables to demonstrate its adaptability to real-world scenarios. For example, the tiling approach has successfully been applied to a youth sports league containing over five hundred teams. In addition to the number of teams involved, this scheduling problem has other layers of complexity not previously addressed in other studies. 
Importantly, such leagues are far more numerous than the professional leagues that have been the subject of most research to date. We introduce and define a new sports scheduling problem to encourage the further study of more common real-world timetabling challenges.