# Alumni Dissertations and Theses

#### Filter Dissertations and Theses By:

### COMPONENT TREES FOR THE EXPLORATION OF MACROMOLECULAR STRUCTURES IN BIOLOGY

Author:Lucas OliveiraYear of Dissertation:2014Program:Computer ScienceAdvisor:Gabor HermanAbstract:Understanding the three-dimensional structure of a macromolecular complex is essential for understanding its function. A component tree is a topological and geometric image descriptor that captures information regarding the structure of an image based on the connected components determined by different grayness thresholds. This dissertation presents a novel interactive framework for visual exploration of component trees of the density maps of macromolecular complexes, with the purpose of improved understanding of their structure. The interactive exploration of component trees together with a robust simplification methodology provide new insights in the study of macromolecular structures. An underlying mathematical theory is introduced and then is applied to studying digital pictures that represent objects at different resolutions. Illustrations of how component trees, and their simplifications, can help in the exploration of macromolecular structures include (i) identifying differences between two very similar viruses, (ii) showing how differences between the component trees reflect the fact that structures of mutant virus particles have varying sets of constituent proteins, (ii) utilizing component trees for density map segmentation in order to identify substructures within a macromolecular complex, (iv) showing how an appropriate component tree simplification may reveal the secondary structure in a protein, and (v) providing a potential strategy for docking a high-resolution representation of a substructure into a low-resolution representation of whole structure.

### Service Oriented Wireless Sensor Network

Author:Carolyn OrtegaYear of Dissertation:2013Program:Computer ScienceAdvisor:Theodore BrownAbstract:Quality of service (QoS) applications and network lifetime management are key challenges in Wireless Sensor Networks (WSNs). Service-level guarantees that are achievable in Service Oriented Architecture (SOA) can ensure reliability, availability and overall usability in any application. How to create the necessary structure to implement SOA in a WSN to achieve the required levels of quality of service is still a research topic. The SOA paradigm provides the methodology to improve these qualities in WSNs. We present a strategy that uses sensors in a cluster and dynamic service selection to achieve higher levels in quality of service for WSNs, while preserving network lifetime. The realization of these strategies will be achieved through a SOA middleware layer in the WSN. The concept of SOA dynamic service selection is transformed so that it can be applied to WSNs. The transformation requires that we equate a service in SOA to functions that are performed in a WSN. In our architecture, a service or function will be achieved through a group of sensors called a sensor cluster. Quality metrics will be captured at the sensor cluster level each time it is executed within the WSN. We present sensor cluster selection algorithms that will determine the most effective and efficient sensor clusters to participate in a composite function plan. The sensor cluster selection algorithms guarantee a minimum level of quality by selecting services that meet a client's quality constraints while taking into account the network routing requirements to achieve the operation in the shortest amount of time utilizing the least amount of energy. Additionally, we present methods for implementing service selection using the multiple choice, multiple dimension knapsack (MMKP) algorithm within a composite function that has complex workflow. We make use of the WSN's MMKP selection algorithm using dynamic programming as a proof of concept to demonstrate that there is a significant improvement in the energy utilization, network lifetime and overall quality.

### SOME ENHANCEMENT METHODS FOR BACKTRACKING-SEARCH IN SOLVING MULTIPLE PERMUTATION PROBLEMS

Author:Tayfun PayYear of Dissertation:2015Program:Computer ScienceAdvisor:James CoxAbstract:In this dissertation, we present some enhancement methods for backtracking-search in solving multiple permutation problems. Some well-known NP-complete multiple permutation problems are Quasigroup Completion Problem and Sudoku. Multiple permutation problems have been getting a lot of attention in the literature in the recent years due to having a highly structured nature and being a challenging combinatorial search problem. Furthermore, it has been shown that many real-world problems in scheduling and experimental design take the form of multiple permutation problems. Therefore, it has been suggested that they can be used as a benchmark problem to test various enhancement methods for solving constraint satisfaction problems. Then it is hoped that the insight gained from studying them can be applied to other hard structured as well as unstructured problems. Our supplementary and novel enhancement methods for backtracking-search in solving these multiple permutation problems can be summarized as follows: We came up with a novel way to encode multiple permutation problems and then we designed and developed an arc-consistency algorithm that is tailored towards this modeling. We implemented five versions of this arc-consistency algorithm where the last version eliminates almost all of the possible propagation redundancy. Then we introduced the novel notion of interlinking dynamic variable ordering with dynamic value ordering, where the dynamic value ordering is also used as a second tie-breaker for the dynamic variable ordering. We also proposed the concept of integrating dynamic variable ordering and dynamic value ordering into an arc-consistency algorithm by using greedy counting assertions. We developed the concept of enforcing local-consistency between variables from different redundant models of the problem. Finally, we introduced an embarrassingly parallel task distribution process at the beginning of the search. We theoretically proved that the limited form of the Hall's theorem is enforced by our modeling of the multiple permutation problems. We showed with our empirical results that the ``fail-first" principle is confirmed in terms of minimizing the total number of explored nodes, but is refuted in terms of minimizing the depth of the search tree when finding a single solution, which correlates with previously published results. We further showed that the performance (total number instances solved at the phase transition point within a given time limit) of a given search heuristic is closely related to the underlying pruning algorithm that is being employed to maintain some level of local-consistency during backtracking-search. We also extended the previously established hypothesis, which stated that the second peak of hardness for NP-complete problems is algorithm dependent, to second peak of hardness for NP-complete problems is also search-heuristic dependent. Then we showed with our empirical results that several of our enhancement methods on backtracking-search perform better than the constraint solvers MAC-LAH and Minion as well as the SAT solvers Satz and MiniSat for previously tested instances of multiple permutation problems on these solvers.

### Theory and Applications of Outsider Anonymity in Broadcast Encryption

Author:Irippuge PereraYear of Dissertation:2015Program:Computer ScienceAdvisor:Nelly FazioAbstract:Broadcast Encryption (BE) allows efficient one-to-many secret communication of data over a broadcast channel. In the standard setting of BE, information about receivers is transmitted in the clear together with ciphertexts. This could be a serious violation of recipient privacy since the identities of the users authorized to access the secret content in certain broadcast scenarios are as sensitive as the content itself. Anonymous Broadcast Encryption (AnoBe) prevents this leakage of recipient identities from ciphertexts but at a cost of a linear lower bound (in the number of receivers) on the length of ciphertexts. A linear ciphertext length is a highly undesirable bottleneck in any large-scale broadcast application. In this thesis, we propose a less stringent yet very meaningful notion of anonymity for anonymous broadcast encryption called Outsider-Anonymous Broadcast Encryption (oABE) that allows the creation of ciphertexts that are sublinear in the number of receivers. We construct several oABE schemes with varying security guarantees and levels of efficiency. We also present two very interesting cryptographic applications afforded by the efficiency of our oABE schemes. The first is Broadcast Steganography (BS), the extension of the state of the art setting of point-to-point steganography to the multi-recipient setting. The second is Oblivious Group Storage (OGS), the introduction of fine-grained data access control policies to the setting of multi-client oblivious cloud storage protocols.

### Secure Critical Care Resource Optimization based on Heterogeneous Vital Signs

Author:Mohamed SaadYear of Dissertation:2010Program:Computer ScienceAdvisor:Bilal KhanAbstract:Preventable, in-hospital errors account for a substantial number of deaths and injuries in the United States. Various studies estimate that such deaths number between 100,000 and 200,000 each year. One of the key challenges in critical care is a legacy of existing largely wired medical networks, which due to the complexity of their constituent heterogeneous medical devices, limit the ability to optimize the allocation of medical resources such as caregivers. The absence of reliable solutions which address the interoperability of different systems inside critical care units, is principally due to market concerns, since competing vendors do not embrace data sharing standards. In this work, we present a solution that integrates heterogeneous wired legacy systems within a backward compatible wireless interconnect system, providing mobility to caregivers, and the ability to coordinate and optimize their assignment to patients. The design and architecture is able to scale as needed in terms of system load and size. We demonstrate, through simulation, that the system is able to, through the optimization of caregiver assignment, significantly reduce total patient risk within health-care institutions. A prototype implementation of the system, demonstrates that the system has great promise in real-world field deployments, and can be instrumented to be compliant with site security requirements and the HIPAA privacy act.

### Geometric Graph Theory and Wireless Sensor Networks

Author:Deniz SariozYear of Dissertation:2012Program:Computer ScienceAdvisor:Janos PachAbstract:In this work, we apply geometric and combinatorial methods to explore a variety of problems motivated by wireless sensor networks. Imagine sensors capable of communicating along straight lines except through obstacles like buildings or barriers, such that the communication network topology of the sensors is their visibility graph. Using a standard distributed algorithm, the sensors can build common knowledge of their network topology. We first study the following inverse visibility problem: What positions of sensors and obstacles define the computed visibility graph, with fewest obstacles? This is the problem of finding a minimum obstacle representation of a graph. This minimum number is the obstacle number of the graph. Using tools from extremal graph theory and discrete geometry, we obtain for every constant $h$ that the number of $n$-vertex graphs that admit representations with $h$ obstacles is $2^{o(n^2)}$. We improve this bound to show that graphs requiring $\Omega(n / \log^{2} n)$ obstacles exist. We also study restrictions to convex obstacles, and to obstacles that are line segments. For example, we show that every outerplanar graph admits a representation with five convex obstacles, and that allowing obstacles to intersect sometimes decreases their required number. Finally, we study the corresponding problem for sensors equipped with GPS. Positional information allows sensors to establish common knowledge of their communication network geometry, hence we wish to compute a minimum obstacle representation of a given straight-line graph drawing. We prove that this problem is NP-complete, and provide a $O(\log OPT)$-approximation algorithm by bounding the Vapnik-Chervonenkis dimension for the corresponding hypergraph family.

### Information Transmission in Communication Games: Signaling with an Audience

Author:Farishta SatariYear of Dissertation:2013Program:Computer ScienceAdvisor:Rohit ParikhAbstract:Communication is a goal-oriented activity where interlocutors use language as a means to achieve an end while taking into account the goals and plans of others. Game theory, being the scientific study of strategically interactive decision-making, provides the mathematical tools for modeling language use among rational decision makers. When we speak of language use, it is obvious that questions arise about what someone knows and what someone believes. Such a treatment of statements as moves in a language game has roots in the philosophy of language and in economics. In the first, the idea is prominent with the work of Strawson, later Wittgenstein, Austin, Grice, and Lewis. In the second, the work of Crawford, Sobel, Rabin, and Farrell. We supplement the traditional model of signaling games with the following innovations: We consider the effect of the relationship whether close or distant among players. We consider the role that ethical considerations may play in communication. And finally, in our most significant innovation, we introduce an audience whose presence affects the sender's signal and/or the receiver's response. In our model, we no longer assume that the entire structure of the game is common knowledge as some of the priorities of the players and relationships among some of them might not be known to the other players.

### Optimization Problems in Sensor Network Data Collection

Author:Simon ShamounYear of Dissertation:2011Program:Computer ScienceAdvisor:Amotz Bar-NoyAbstract:Data collection is one of the most important tasks of many sensor networks. The data collected by sensors is used to monitor and analyze various systems, such as volcanoes, forests, and bridges. Large scale wireless sensor networks can provide timely access to a wealth of data, but obtaining this data is challenged by various resource constraints. This thesis proposes and analyzes solutions to three optimization problems that arise from the conflict between data collection and resource constraints: (1) maximize coverage by a set of sensors when the coverage they provide varies with location; (2) select a subset of the sensors, within some budget constraint, that best predict the data streams produced by all the sensors in the network; and (3) minimize the cost needed to find the top ranking sensor readings according to some criteria. The analyses of these problems use three different views of a sensor network: a coverage-centric view, in which each sensor is valued for its coverage ability; a data-centric view, in which each sensor is valued for the data it provides; and an agent-centric view, in which each sensor is viewed as an independent agent with information of value to the application. By choosing an appropriate view of the network, it is possible to separate the analysis from implementation details and apply well-established techniques from other domains to the problem solution. In this case, methodologies from stochastic and computational geometry, graph theory, and search theory are applied to the respective problems. This thesis presents optimal solutions to the coverage and search problems, approximation bounds on the best possible solution to the selection problem, and quantitative comparisons to alternative solutions to each problem in synthetic environments.

### ON THE PROBLEM OF PACKING STEINER TREES OF A GRAPH

Author:MOHAMMAD TALAFHAYear of Dissertation:2010Program:Computer ScienceAdvisor:LOUIS PETINGIAbstract:On The Problem of Packing Steiner Trees of a Graph Let G = (V, E) be an undirected graph with vertex-set V, edge-set E, and a subset of distinguished terminal vertices K of V, where |K| ≥ 2 (the vertices in V - K are called the non-terminal vertices). The degree of a vertex v of G is the number of edges incident at v in G. A K-Steiner tree T of G is a tree containing the terminal vertex-set K, and any vertex of degree one in T must belong to K. The K-edge-connectivity, denoted as λ

_{K}(G), of a connected graph G with terminal vertex-set K, is the minimum number of edges whose removal from G disconnects at least two vertices of K. The Steiner Tree Packing problem (STPP for short) is the problem of finding the maximum number of edge-disjoint K-Steiner trees, t_{K}(G), contained in G. Specifically, in this thesis, we are interested in finding a lower bound on t_{K}(G) with respect to the K-edge-connectivity, λ_{K}(G). The Steiner Tree Packing problem has attracted considerable attention from many researchers in different areas because of its wide applicability. It has applications in routing problems arising in VLSI circuit design, where an effective way of sharing different signals among cells in a circuit can be achieved by the use of edge-disjoint Steiner trees. It also has a variety of computer network applications such as multicasting, video-conferencing, and network information flow where simultaneous communications can be facilitated by using edge-disjoint Steiner trees. In 2003, Kriesell conjectured that any graph G = (V, E) with terminal vertex-set with subset of terminal vertices K of V has at least λ_{K}(G) / 2 edge-disjoint K-Steiner trees. In this thesis we give a background on this problem and we present most of the major results obtained so far. Moreover, we show that Kriesell's conjecture can be answered affirmatively if the edges of G can be partitioned into K-Steiner trees. This result yields bounds for the problem of packing K-Steiner trees with certain intersection properties in a graph. In addition we show that for any graph G with terminal vertex-set K, ; t_{K}(G) ≥ λ_{K}(G) - |V - K| / 2 - 1. Moreover, we study a vulnerability index called the edge-toughness, which connects the connectivity of a network with the problem of packing Steiner trees of a graph.### SEMIPARAMETRIC TEMPORAL CLUSTERING

Author:Suzanne TamangYear of Dissertation:2013Program:Computer ScienceAdvisor:Simon ParsonsAbstract:Although temporal data provides critical context for many real-world reasoning tasks, incorporating the temporal dimension into an analysis can present methodological challenges. Traditional methods from statistics are limited in their ability to process noisy, large-scale secondary data sources. Data mining approaches are better suited for these types of problems, but have primarily focused on static data sets. However, few real-world data sets are static, or measure stationary phenomena; rather, they are dynamic. To facilitate the meaningful use of abundant, unlabeled temporal data, I develop a new temporal clustering method that can assist in the preprocessing, exploration, and discovery of new knowledge from secondary data sources that are subject to arbitrary sampling schemes, and contain observation sequences of different durations. My approach builds on the semiparametric time series clustering framework, which has demonstrated clear benefits over fully parametric, or fully non-parametric methods. The framework combines beneficial parametric assumptions, such as the Markov or hidden-state assumption, to model temporal systems, with a more agnostic, nonparametric approach for clustering the embedded models. Using digital health data as a case study, I broaden the range of scenarios for which semiparametric clustering can be successfully applied. Specifically, I develop a method to use a state-of-the-art continuous-time Bayesian network to more naturally represent temporal information, addressing limitations of discrete- time methods. Also, as an alternative to spectral methods I pair model-based abstraction with a nonparametric Bayesian clustering technique that allows k to be expressed as a function of the size and complexity of the patient population, avoiding the requirement to prespecify the number of clusters using a heuristic. To demonstrate the ability of this approach to produce meaningful results, clusters are evaluated using intrinsic and extrinsic validation. In addition, I compare cluster assignments with that of temporal clustering systems reported in the research literature, showing a 20% relative improvement over the best system's performance and recognizable differences among the patient clusters that are detected.