Browsing by Author "Wonnacott, David G."
Now showing 1 - 20 of 35
Results Per Page
Sort Options
- ItemAn Empirical Study of the Performance of Scalable Locality on a Distributed Shared Memory System(2011) Douglas, Tim; Wonnacott, David G.We run an empirical study to determine the performance attainable using scalable locality on a distributed shared memory system. We utilize the PLUTO automatic parallelizer and locality optimizer on a one-dimensional Jacobi stencil with a distributed shared memory system provided by Intel’s Cluster OpenMP on commodity hardware. Unlike earlier works which base performance concerns on tile size, we discover tile sizes are less crucial, and instead threading issues and cache interference need particular attention. On our 56 processor cluster of 14 quad-core machines, with preliminary single-level tiling we are able to realize 121.3 GFLOPS, or 63% of linear speedup. Our data suggest that adding further cores would increase performance linearly.
- ItemAnalytical Computer Generated Holography and Vector Graphics(2012) McAllister, Takumi; Wonnacott, David G.In this paper, I investigate the problem of generating holograms with computers. Holograms are developing 3D display technology, superior to modern stereoscopic systems. After an introduction to the field, I survey some existing methods (Ping Pong, Coherent Ray Tracing, Diffraction Specific). Next, I examine two analytical algorithms in closer detail. Finally, I consider building on the analytical methods to make vector based holograms, and examine their possible merit. Upon examining a theoretical algorithm, I conclude that without further advances in vector displays, a vector hologram would be impractical.
- ItemAnalyzing Iteration-Space Performance Tuning Using the Deriche Recursive Filtering Algorithm(2018) Glaser, Mary; Wonnacott, David G.No matter how efficient or satisfactory an algorithm or program may be, we as computer scientists are always looking to improve our programs by means of speeding them up or making them more accurate. In this paper, we will focus on finding the best way to optimize programs by using iteration-space performance tuning methods. We hope to contribute towards the bigger picture of understanding whether it is possible to do iteration-space performance tuning by studying a version of the Deriche Recursive Filtering Algorithm in C, which applies edge detection and smoothing to 2D images. Among the many different existing types of optimization techniques, we will be focusing on classic optimization and iterator-based optimization. With Deriche, we will first compare runtime results of the original code in C with the performance of classic optimization techniques and PluTo (a specific type of iteration-space optimization). Then we will convert the Deriche algorithm to the language Chapel and apply more direct iterator-based optimization techniques to further study its performance.
- ItemAutomated Correctness Checking of Iteration Space Transformations Expressed As Chapel Iterators(2020) Mohr, Henry; Wonnacott, David G.In certain types of programs, re-ordering their execution can speed them up by optimizing memory accesses and parallelism. However, these re-orderings are constrained by the fact that updates depend on the results of earlier iterations. It is easy to generate and check the validity of simple re-orderings, but more complex cases are beyond the reach of existing tools. For these cases, programmers can hand-write the re-orderings, but we need to be able to automatically verify them to avoid programmer error. We propose two potential methods for verifying their correctness.
- ItemAutomatic Correctness Checking for Affine Constrained Loop Tiling Transformations(2020) Rivera, Russell; Wonnacott, David G.The growth in speed seen in modern processors over the past couple decades has created a need for programs that carefully manage the data fed to CPUs. Often, it is the case the processors are so fast, the memory used to supply them with information cannot keep up. Advanced iteration space transformations for stencil computations have found clever ways of parallelizing computationally heavy workloads while managing memory bandwidth through methods such as loop tiling, which streamline cache usage. Additionally, these methods offer scalable parallelism with respect to problem size. While automatic transformations for complex loop structures are still being researched, the same mathematical framework for checking the correctness of programmer implemented transformations should naturally enable such automation. This paper offers insight into the literature surrounding the Polyhedral Model, Iteration Space Transformations, and the mathematical background on which they are formed. Later, this paper seeks to motivate an investigation of whether non-symbolically-tiled, parameterized iteration space transformations, such as diamond tiling, can be verified for correctness with existing mathematics and programming techniques.
- ItemBeginners Understanding of Object Oriented Programming(2022) Majoni, Pelagia; Wonnacott, David G.Problems beginners face with understanding Object-Oriented Programming (OOP) concepts have been investigated and reported in a number of studies. In this paper, we present the results of comparing and contrasting the methods and results of eight studies on novice's misconceptions of OOP. We identify the types of assessment each study used the teaching methods researchers used and finally the types of content students in different studies were expected to produce and compared and contrasted the methodology and results of each of the researchers. In these studies, students were asked to draw diagrams depicting the notional machine. Researchers defined systematic ways to analyze student's drawings for OOP misconceptions. The main findings highlighted that a significant number of students fail to distinguish between classes and objects.
- ItemComputer Vision and Its Application in Self-Driving Cars(2012) Cutilli, Benjamin; Eaton, Eric; Wonnacott, David G.This thesis discusses the relationship between computer vision and self-driving cars. It discusses the status of fields of computer vision and self-driving cars, at the time of the thesis' writing, and it also proposes a method of using computer vision in high-performance cars. The thesis also contains code that was written based on this method, and the results of the code are shown in the thesis.
- ItemConcurrency in Multi-Core Processor Design(2007) Clancy, Patrick; Wonnacott, David G.It is possible to extend a microprocessor from a single core to a multiple cores by replicating the single core processor, and interfacing them to main memory bus via a bus arbitrator. With a multi-core processor, the possibilities for parallel programs are apparent, but the programmer must overcome those obstacles with lock based programming. Transactional Memory would provide a means for programmers to handle highly concurrent programming in a more forgiving environment. This thesis surveys these topics and discusses possible implementations in terms of the HERA architecture.
- ItemDynamic Balancing of Scientific HPC Applications on Institutional Clusters(2020) Lyle, Mac; Dougherty, John P.; Wonnacott, David G.In an age of ever expanding data sets, there is an increasing demand for high performance computing (HPC) in the scientific community. In order to maximize the performance of existing hardware, researchers have been looking for innovative and cost effective ways to expand and/or optimize their clusters. This thesis first provides a quick summary on the traditional methods that are currently in use in institutional clusters followed by some of the new services being offered to help researchers expand their computational resources. Most importantly, three recent papers propose new ways to optimize current hardware to increase total job throughput and utilization of a cluster. The methods proposed take advantage of dynamic allocation, both on the application level and the system level. The first paper (Nathaniel Kremer Herman and Thain 2018) finds that it is possible to dynamically size master-worker applications, freeing up significant resources for other applications to run on the system. The work done in (Feng Liu 2018) offers a method to integrate on-demand HPC requests into a traditional batch system. This created a massive boost in system throughput and reduction in batch wait time. The final paper (Suraj Prabhakaran 2015) looks into the potential of a new type of malleable job that can be run on HPC systems. By integrating these jobs into the batch scheduler, total system throughput was increased and lays a foundation for potential future workflows. This review is followed by the methods and results of a comparison between parallel jobs in the Chapel language and the MPI implementation. During my experience, Chapel was much easier to learn and to implement however, when scaled, it couldn't compare to the speed-up of MPI.
- ItemEfficient Handling of Dependence Analysis for Arrays(2001) Seater, Robert; Wonnacott, David G.In this paper, I discuss what array dependence analysis is, and how we can phrase questions about dependencies as questions about integer feasibility. I present the straight forward approaches to dealing with equalities, inequalities, and disequalities. I also give a sampling of the more subtle approaches to each; with a particular focus on improving Fourier's Method and on the LI(2)-unit subdomain. Furthermore, I present the workings of a new approach to reduce the exponential behavior of disequality analysis.
- ItemEvaluating Inherited Attributes Using Haskell and Lazy Evaluation(2005) Moss, William B.; Wonnacott, David G.Parser generators designed for imperative programming languages (C++, Java, etc), such as YACC and CUPS, inevitably run into difficulty handling inherited attributes. In the past, this difficulty has been handled in one of two ways: 1) the problem is ignored, and inherited attributes must be manually calculated later using extra passes over the parse tree; and 2) for the L-attributed class of grammars, the parser generator provides some sort of ”hack,” allowing the user to access previous elements of the stack with no safeguards preventing a user from accessing incorrect data. Many functional programming languages, however, provide an easy solution to this problem. Using lazy evaluation, a parser generator written for a function programming language can sidestep the entire problem, relying on the language system to evaluate attributes (regardless of whether they are inherited or synthesized) as it needs them. This paper discusses a method for evaluating inherited and synthesized attributes in Haskell, specifically using Happy, a parser generator for Haskell. This paper will present specific examples of parsing inherited attributes in both functional and imperative languages, followed by the presentation of an algorithm for computing any inherited attribute in Happy.
- ItemExploring Functional Graph Representations for Max Flow: The Abstraction of Functional Data Structure Design Techniques(2023) Weinstock, Harrison; Wonnacott, David G.This paper investigates the techniques involved in the design of functional data structures and how they are realized in a functional implementation of the Ford-Fulkerson max flow method. Specifically, we focus on functional graph representations and how some of the initial issues can be handled safely without affecting user facing code. We examine the difficulty the functional paradigm presents in working with cyclic structure and how this shows up in graph traversals. We review Erwig's induction graphs and Mokhov's algebraic graphs to see two potential solutions to the same problem of handling cyclic structures in a way that allows for easy traversal. Finally, we analyze functional implementations of the max-flow algorithm to exemplify how the tools for abstraction used in the data structure design carry over to an important and common algorithm implementation.
- ItemFactors Affecting Stencil Code Performance(2016) Zhou, Ting; Wonnacott, David G.In the field of scientific computation, loop tiling is an indispensable technique for improving cache performance, and thereby the overall performance of the code. Research so far has predominantly been focusing on optimizing the code of a particular tiling choice, under a specific problem setting. In this thesis, I wish to both statistically explore the most important factors in different tiling scenarios, as well as the role the problem parameters may play when making tiling decisions.
- ItemFactors Affecting the Performance of Tiled Numerical Codes(2014) Radwan, Besan A.; Wonnacott, David G.The topic I am investigating is High Performance Computing. I am investigating the factors affecting the outcome of speed of tiled dense array programs. The significance and applications of parallel computing are countless. As the field expands, different approaches are taken to optimize performance in HPC. It is paramount to not only investigate methods for continued innovation in the field, but to look at the methods already out there and how a combination of these methods can provide promising results. The ISCC software transformation tool was used with different tile arrangements to investigate which factors affected the speed at which programs could be run sequentially or in a parallel. In addition to this, the tile arrangements and distribution with different tile sizes were tested. The results obtained were significant and showed that using the certain combinations of tile shape and iteration performed a significant amount better than other tile arrangements.
- ItemFunctional Graph Algorithms in Syntax-directed Translation Part I Literature Review(2017) Yang, Qin; Wonnacott, David G.This paper is a literature review that examines applications of graph algorithms written in functional languages in attributes computation in syntaxdirected translation. Specifically, three paradigms of graph algorithms in the current literature are explored here: 1. Taking advantage of language features such as monads and lazy evaluations, 2. Inductive graphs, and 3. Structured graphs. This paper concludes the approach using inductive graphs seems to have the greatest advantage of all three, especially in operations over dependency graph.
- ItemHybrid Hexagonal/Classical Tiling for GPUs with Chapel Iterators(2020) Wu, Katherine; Wonnacott, David G.Iterative stencil computations are important in scientific computing and in the mobile domain among others. Tiling is an important method to improve computational efficiency in GPUs, enhancing data reuse and cache memory usage. Diamond tiling and hexagonal tiling both enable concurrent starts. While there are multiple combinations of different tilings, and Grosser et al. have proposed a diamond and hexagonal tiling, we will attempt to construct a combine tiling that uses Chapel parallel iterators.
- ItemImplementation and Utilization of Hardware Parallelism in Modern CPU Architectures(2024) Reed, Justin; Wonnacott, David G.For at least the past three decades, nearly all computer architectures have made use of pipelining, a concept that has proven indispensable in improving the efficiency of computer processors, regardless of the application. Pipelining improves the throughput of an instruction sequence by running instructions in mutually independent stages, thus achieving what is called instruction level parallelism (ILP). This paper provides an overview of how ILP has been utilized to develop more efficient processors, reviewing a selection of papers influential to its use. To this end, the tradeoffs between out-of-order and in-order execution are considered. Finally, the most recent work being done to improve instruction level parallelism is discussed.
- ItemLaziness in C++(2015) Mendelsohn, Richard; Wonnacott, David G.The goal of this thesis is to create a suitable standard for evaluating lazy evaluation techniques, and to use this standard to compare existing methods of lazy evaluation in C++. For our standard we will create a rigorous definition of what lazy evaluation should entail, and a set of varied benchmarks to test the efficiency of these distinct approaches. The methods we will analyze will include techniques using template meta programming, operator overloading coupled with unique object creation, and C++ 11 lambda expressions. We will also take into account existing libraries such as Boost and FC++. Our thesis should be able to act as a resource to correctly select the optimal method for lazy evaluation in C++ given a specific problem. This paper will also make some comments on the evolution of lazy evaluation approaches in C++, and thus provide an appropriate segueway to it's sister paper, which tackles the best ways of implementing laziness in a language, and how C++ compares to other languages in this respect.
- ItemLocality and Parallel Optimizations for Parallel Supercomputing(2003) Harrison, Ian; Wonnacott, David G.
- ItemMaking Computer Science Education Accessible to Blind and Visually Impaired Students(2023) De La Cruz, Marisleysis; Wonnacott, David G.There are numerous barriers for blind and visually impaired (BVI) learners in the computer science classroom. This review of the literature has identified potential strategies that have been developed to make programming accessible to blind and visually impaired students. Among these approaches are 3D printing, screen readers, braille displays, making text-based languages accessible, and creating accessible programming environments. A common problem documented in the literature is the screen reader being the most effective assistive technology, but falling short of helping visually impaired learners understand the overall structure of their code. Most of the other assistive technologies discussed are in their developmental and research phase, thus further work is needed to investigate potential methods for introducing visually impaired students to programming.