Computer Science
Browsing Computer Science by Title
Now showing 1 - 20 of 188
- Item: 1-D TV: A Computational Investigation into the Lexical Representation of Black Womanhood in Reality Television News (2021). Habtu, Blien; Grissom, Alvin; Payne, Amanda. It is well-established that when categorizing lexical associations of words in news corpora, women and minorities tend to be associated with negative terms (Manzini et al. 2019). This harm is also carried through other forms of media. For instance, Black women on television have been historically depicted as one-dimensional characters, often forced into categories of strict binaries. Commonly, they are either extremely educated or they dropped out of school, either they are ambitious or they have lost all enthusiasm, either they are completely desexualized or they are hypersexualized, either they are always compliant or they are always aggressive (Boylorn 2008). While these depictions are known to cause harm, racism and sexism are not necessarily so overt, and more work is needed to quantify the effects and spread of stereotypes relating to intersections of identities. Through this thesis, I use the context of reality television to examine how racial representations in media can influence people's perceptions of Black womanhood. I begin with background information on some of the effects of media consumption and with a high-level computational overview of how words can be represented as vectors to quantify prejudicial bias in text representations. Then I conduct a literature review exploring some of the ways previous researchers (Parthasarthi et al. 2019; Garg et al. 2018) have measured bias in digital media both through text and over time. To conclude, I propose an experiment to examine the ways in which Black female reality television contestants are talked about in news article headlines using the Word2Vec algorithm and vector representation tools.
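As a rough illustration of the vector-association idea this entry describes, here is a minimal sketch of comparing a target term's embedding to "positive" versus "negative" attribute words via cosine similarity. The toy vectors, vocabulary, and word lists are placeholders, not the thesis's actual Word2Vec model or headline corpus.

```python
# Minimal sketch: association of a target word with positive vs. negative
# attribute word sets, using cosine similarity over word vectors.
# The toy embeddings below stand in for vectors trained on news headlines.
import numpy as np

def cosine(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association_score(target, positive, negative, vectors):
    """Mean similarity to positive words minus mean similarity to negative words."""
    pos = np.mean([cosine(vectors[target], vectors[w]) for w in positive])
    neg = np.mean([cosine(vectors[target], vectors[w]) for w in negative])
    return pos - neg

rng = np.random.default_rng(0)
vocab = ["contestant", "ambitious", "educated", "aggressive", "hostile"]
vectors = {w: rng.normal(size=4) for w in vocab}   # placeholder 4-d embeddings

score = association_score("contestant", ["ambitious", "educated"],
                          ["aggressive", "hostile"], vectors)
print(f"association score: {score:+.3f}")  # >0 leans positive, <0 leans negative
```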
- Item: A Comparison of Fairness-Aware Machine Learning Algorithms (2018). Roth, Derek; Friedler, Sorelle. Fairness, as it applies to algorithms, implies that the decisions made by an algorithm are being made such that there is no discrimination against individuals or groups in labeled data sets. In this paper, I give a summary of the relevant findings by computer scientists regarding fair algorithms, discuss three techniques used to reduce/remove bias in algorithms, and examine three case studies that clearly demonstrate the importance of this field of study. The goal of this thesis is to determine the best practices for case studies on this topic and to discover ways of developing algorithms that are unbiased. I apply existing algorithms to four data sets and compare their results in order to determine which are the most useful in a specific situation.
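For context on the kind of comparison this entry describes, here is a minimal sketch of one standard group-fairness measure, the disparate impact ratio. It is only one of many metrics such comparisons report; the predictions and group labels below are toy placeholders, not any of the thesis's four data sets.

```python
# Minimal sketch: disparate impact ratio (the "four-fifths rule" measure).
import numpy as np

def disparate_impact(y_pred, group):
    """P(positive prediction | unprivileged) / P(positive prediction | privileged)."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rate_unpriv = y_pred[group == 0].mean()
    rate_priv = y_pred[group == 1].mean()
    return rate_unpriv / rate_priv

y_pred = [1, 0, 1, 1, 0, 1, 0, 0]   # model decisions on toy individuals
group  = [0, 0, 0, 0, 1, 1, 1, 1]   # 0 = unprivileged group, 1 = privileged group
print(disparate_impact(y_pred, group))  # ratios well below ~0.8 are commonly flagged
```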
- Item: A General Framework for Proving Sauer-Shelah Type Lemmas (2019). Yutong, Li; Bhaskar, Siddharth. We present two versions of "shatter functions" and show that they are upper-bounded by the same polynomial, known as the Sauer-Shelah dichotomy. Then, we analyze a third setting due to Chase and Freitag [3], the $(n, k)$-banned binary sequence problems (BBSPs). Our purpose is to give an accessible exposition of their result and show that this is not only a generalization of both theorems, but also of both proofs. In particular, a new proof is obtained for the thicket case following this generalization.
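For reference, the classical form of the polynomial bound this entry alludes to is the Sauer-Shelah lemma for set systems; the two shatter-function versions in the thesis generalize this statement.

```latex
% Classical Sauer--Shelah bound: for a set system \mathcal{F} \subseteq 2^{[n]}
% of VC dimension d, the number of sets is polynomially bounded in n:
\[
  |\mathcal{F}| \;\le\; \sum_{i=0}^{d} \binom{n}{i} \;=\; O\!\left(n^{d}\right).
\]
```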
- Item: A Rule Learning Approach to Discovering Contexts of Discrimination (2017). Tionney, Nix; Friedler, Sorelle. In fairness-aware data mining, discrimination discovery refers to determining if social discrimination against certain individuals or groups of individuals exists in labeled data sets or in learned models. In this thesis, I focus on the problem of discovering contexts or niches of discrimination in data sets, i.e. revealing groups of features in a given data set that, when considered together, have a greater degree of discriminatory influence in the data than when any one feature is examined individually. Our approach to this problem involves using the CN2 and CN2-SD rule learning algorithms to identify groups in the data that have significant predictive ability, and then using the Gradient Feature Algorithm to quantify and examine each group's discrimination potential. This approach shows promising results for identifying contexts of discrimination.
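To make the notion of a "context of discrimination" concrete, here is a minimal sketch that measures the gap in favorable-outcome rates between protected and unprotected individuals inside a rule-defined subgroup. The rule, column names, and rows are illustrative placeholders, not the CN2-SD or Gradient Feature Algorithm machinery the thesis actually uses.

```python
# Minimal sketch: discrimination inside a subgroup defined by a learned rule.
import numpy as np

def subgroup_discrimination(rows, rule, protected_col, outcome_col):
    """Favorable-outcome rate of unprotected minus protected members of the rule's subgroup."""
    members = [r for r in rows if rule(r)]
    prot   = [r[outcome_col] for r in members if r[protected_col] == 1]
    unprot = [r[outcome_col] for r in members if r[protected_col] == 0]
    return np.mean(unprot) - np.mean(prot)

rows = [  # toy loan-style records
    {"age_under_25": 1, "prior_default": 0, "protected": 1, "approved": 0},
    {"age_under_25": 1, "prior_default": 0, "protected": 0, "approved": 1},
    {"age_under_25": 1, "prior_default": 1, "protected": 1, "approved": 0},
    {"age_under_25": 0, "prior_default": 0, "protected": 0, "approved": 1},
]
rule = lambda r: r["age_under_25"] == 1 and r["prior_default"] == 0
print(subgroup_discrimination(rows, rule, "protected", "approved"))  # gap within the context
```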
- Item: Abstract Interpretation of Algorithmic Fairness (2018). Lytle, Becky; Micinski, Kristopher. Currently, many societally impactful decisions are made by machine learning algorithms. As these algorithms are given more power, we want to ensure that they are not making discriminatory or biased decisions. The past literature on algorithmic fairness has displayed weaknesses which probabilistic program analysis may be suited to address. My thesis looks into the topic of verifying fairness properties of decision-making algorithms using this type of program analysis. I will delve into the background behind algorithmic fairness and program analysis, and then I will go on to compare a variety of program analysis based methods for understanding the fairness of a decision-making algorithm. Ultimately, I will weigh in on which approach contributes most significantly to the field. I then go on to explain my independent work and how I have built off of algorithmic fairness research in order to aid an existing fairness-checking algorithm using an approach which analyzes the abstract syntax tree of a program.
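As a small taste of analyzing a program's abstract syntax tree, here is a sketch that flags any syntactic reference to a named sensitive attribute inside a decision function, using Python's standard `ast` module. This is only a surface-level check, far weaker than the abstract-interpretation analysis the thesis discusses; the source snippet and attribute names are placeholders.

```python
# Minimal sketch: walk a program's AST and report references to sensitive attributes.
import ast

SENSITIVE = {"race", "gender"}   # attribute names we want to flag (illustrative)

source = """
def decide(applicant):
    if applicant.gender == "F" and applicant.income < 40000:
        return "deny"
    return "approve"
"""

tree = ast.parse(source)
uses = [node.attr for node in ast.walk(tree)
        if isinstance(node, ast.Attribute) and node.attr in SENSITIVE]
print("sensitive attributes referenced:", uses)   # ['gender']
```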
- Item: Active Meta-Learning (2020). Nicholas, Gareth; Friedler, Sorelle. This thesis outlines the methods used in machine learning to generate models which are effective on a variety of tasks. We begin with a quick overview of the field of machine learning, covering the topics necessary to understand more complex learning algorithms. We then discuss the field of transfer learning and the common strategies used to train models so that they can function in different domains and solve different tasks. We then continue with current research in the field of meta-learning, which aims to find optimal solutions to the problem of learning how to learn. We introduce model-agnostic meta-learning, or MAML, as an algorithm which addresses the difficulty of few-shot learning on a variety of tasks. We then consider PLATIPUS as a tool for reasoning about model uncertainty in meta-learned models. Using this uncertainty, we explore active learning as a means of selecting the optimal examples to train our models. Using PLATIPUS and active learning, we seek to address the problem of exploring the space of chemical reactions efficiently.
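To ground the inner-loop/outer-loop structure of MAML, here is a minimal sketch on toy 1-D linear-regression tasks with manually computed gradients. It uses the first-order approximation (ignoring second derivatives) to stay short, and does not show PLATIPUS's probabilistic extension; all tasks, step sizes, and iteration counts are illustrative.

```python
# Minimal first-order MAML sketch: learn an initialization w that adapts quickly
# to tasks y = slope * x after a single inner-loop gradient step.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 20)

def loss_grad(w, slope):
    """Squared-error loss and its gradient for the model y_hat = w * x."""
    err = w * x - slope * x
    return np.mean(err ** 2), np.mean(2 * err * x)

w, alpha, beta = 0.0, 0.1, 0.01            # meta-parameter, inner and outer step sizes
for step in range(500):
    meta_grad = 0.0
    for slope in rng.uniform(-2, 2, size=5):   # sample a batch of tasks
        _, g = loss_grad(w, slope)
        w_adapted = w - alpha * g               # inner-loop adaptation step
        _, g_adapted = loss_grad(w_adapted, slope)
        meta_grad += g_adapted                  # first-order outer gradient
    w -= beta * meta_grad / 5
print("meta-learned initialization w =", round(w, 3))
```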
- Item: Active Task and Instance Selection for Lifelong Learning (2015). Xiong, Fangyu; Eaton, Eric. Active learning improves the efficiency of machine learning in situations where labels are acquired at a cost. In this paper, I explore how active learning techniques can be used in a lifelong machine learning setting, where an agent faces a pool of tasks and acquires knowledge incrementally by learning a sequence of tasks. I develop two active learning frameworks for the Efficient Lifelong Learning Algorithm (ELLA). The first, ELLA-ATAL, is a flexible framework that combines active task selection and active instance selection and bridges lifelong learning algorithms with traditional single-task active learning methods. The second, ELLA-ATMIS, enhances ELLA-ATAL by taking into account how each instance contributes to tasks that it does not belong to. I evaluate these approaches on two data sets and compare them with other lifelong learning methods. My results demonstrate the potential of active learning when used in combination with lifelong learning.
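For readers new to active learning, here is the simplest query strategy it builds on, uncertainty sampling: request labels for the unlabeled points whose predicted probability is closest to 0.5. The probabilities below are placeholders; ELLA-ATAL and ELLA-ATMIS add task-level selection on top of instance-level ideas like this.

```python
# Minimal sketch of uncertainty sampling for active instance selection.
import numpy as np

def query_most_uncertain(probabilities, k):
    """Indices of the k unlabeled examples whose P(y=1) is closest to 0.5."""
    uncertainty = -np.abs(np.asarray(probabilities) - 0.5)
    return np.argsort(uncertainty)[-k:]

probs = [0.95, 0.52, 0.10, 0.47, 0.80, 0.33]   # model's P(y=1) on an unlabeled pool
print(query_most_uncertain(probs, 2))            # picks the 0.47 and 0.52 examples
```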
- Item: An Analytical Approach to Higman’s Lemma (2024). McMahon, Eleanor; Lindell, Steven. Higman’s lemma is a fundamental result in computer science and mathematics, with applications in areas such as graph theory and other data structures. One proof of this lemma provides an interesting use of the minimally bad sequence proof technique, originated by Nash-Williams. Furthermore, this lemma sets the foundation for the finite basis property, a property focused on the potential definable nature of infinite sets. However, this lemma is rarely examined analytically rather than algebraically. By taking an analytical approach to Higman’s lemma through topological and analytical notions, new perspectives and insights may be uncovered, shedding light on other possible applications of this lemma.
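For readers unfamiliar with the lemma itself, here is its standard statement in terms of the subsequence-embedding order on finite sequences, which is the form usually used in the minimally-bad-sequence proof.

```latex
% Higman's lemma. For a quasi-order (A, \le), order finite sequences by embedding.
\[
  (A, \le) \text{ a well-quasi-order} \;\Longrightarrow\; (A^{*}, \preceq) \text{ is a well-quasi-order,}
\]
\[
  \text{where } (a_1,\dots,a_m) \preceq (b_1,\dots,b_n) \iff
  \exists\, 1 \le i_1 < \dots < i_m \le n \text{ with } a_j \le b_{i_j} \text{ for all } j.
\]
```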
- Item: An Empirical Study of the Performance of Scalable Locality on a Distributed Shared Memory System (2011). Douglas, Tim; Wonnacott, David G. We run an empirical study to determine the performance attainable using scalable locality on a distributed shared memory system. We utilize the PLUTO automatic parallelizer and locality optimizer on a one-dimensional Jacobi stencil with a distributed shared memory system provided by Intel’s Cluster OpenMP on commodity hardware. Unlike earlier works which base performance concerns on tile size, we discover tile sizes are less crucial, and instead threading issues and cache interference need particular attention. On our 56-processor cluster of 14 quad-core machines, with preliminary single-level tiling we are able to realize 121.3 GFLOPS, or 63% of linear speedup. Our data suggest that adding further cores would increase performance linearly.
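For readers unfamiliar with the kernel being tuned, here is the untiled reference form of a one-dimensional Jacobi stencil: each interior point is repeatedly replaced by the average of its neighbors. PLUTO's contribution is to tile this loop nest across both the time and space dimensions for locality; only the plain version is shown, and the array size and step count are arbitrary.

```python
# Minimal sketch of an untiled 1-D Jacobi stencil (the reference computation).
import numpy as np

def jacobi_1d(a, steps):
    for _ in range(steps):
        new = a.copy()
        new[1:-1] = 0.5 * (a[:-2] + a[2:])   # average of left and right neighbors
        a = new
    return a

a = np.zeros(1000)
a[0] = a[-1] = 100.0          # fixed boundary values
print(jacobi_1d(a, 200)[:5])
```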
- Item: An Explication of "A New Algorithm for Transitive Closures and Computation of Recursion in Relational Databases" by Yangjun Chen (2008). Miller, Anne. In this paper, we discuss the algorithm for computing the transitive closure of a directed, acyclic graph of bounded degree given in Yangjun Chen’s paper, "A New Algorithm for Transitive Closures and Computation of Recursion in Relational Databases" [2]. We point out several errors, one in the correctness of the algorithm and one in the complexity, and modify the algorithm slightly to repair the former. (The error in complexity does not seem to be as easily fixable.) We then prove the correctness of the altered algorithm.
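For comparison with the relational algorithm discussed above, here is the textbook way to compute transitive closure on a DAG: memoized depth-first search, so each node's reachable set is computed once from its successors. This is not Chen's encoding; the example graph is a placeholder.

```python
# Minimal sketch: transitive closure of a DAG via memoized reachability.
from functools import lru_cache

edges = {"a": ("b", "c"), "b": ("d",), "c": ("d",), "d": ()}   # toy DAG

@lru_cache(maxsize=None)
def reach(node):
    """Set of nodes reachable from `node`, excluding `node` itself."""
    result = set()
    for succ in edges[node]:
        result.add(succ)
        result |= reach(succ)
    return frozenset(result)

for v in edges:
    print(v, "->", sorted(reach(v)))
```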
- Item: An Introduction to Ray Tracing and its Optimization Methods with a Focus on Denoising Through Filters (2022). Huang, Haosong; Xu, Dianna. Rasterization and ray tracing are two common techniques used in computer graphics rendering, which displays complex 3D scenes containing billions of objects on our computers. Compared to rasterization, ray tracing can generate higher-quality and even photo-realistic images by simulating the behavior of physical light, at the cost of tremendously higher computational time. Researchers have taken decades to increase the efficiency of ray tracing and reduce the rendering time of one image from one day to several seconds. One research goal is to investigate the possibility of real-time ray tracing, whose rendering time is below 1/24 second, so people can see animated photo-realistic images in games or simulations. Although real-time ray tracing is a hot topic in rendering, it is very difficult for beginners to understand ray tracing through reading hundreds of pages of textbooks. In this literature review we provide a summarized top-down approach to explain the basic concepts of ray tracing and some of the optimization methods, so readers may obtain some insights about ray tracing without reading too many pages of rigorous math and code. This literature review starts with the basic framework and some of the early methods of ray tracing, ranging from ray casting, recursive ray tracing, and Monte Carlo ray tracing to path tracing. The second section lists some existing optimization methods for ray tracing at each stage of the framework as examples. The third section looks at a specific optimization method called denoising and gives a detailed explanation of the classical filtering approach to denoising, using the spatio-temporal variance-guided filter (SVGF) as an example. Some machine learning applications in ray tracing optimization are also mentioned.
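To give beginners a concrete toehold, here is the single geometric primitive at the heart of ray casting: intersecting a ray with a sphere by solving a quadratic in the ray parameter. A full path tracer and the SVGF denoiser build far more on top of this; the scene values are placeholders.

```python
# Minimal sketch: ray-sphere intersection, the core test of ray casting.
import numpy as np

def ray_sphere(origin, direction, center, radius):
    """Nearest positive hit distance t along the ray, or None if it misses.
    `direction` is assumed to be unit length, so the quadratic's leading coefficient is 1."""
    oc = origin - center
    b = 2.0 * np.dot(direction, oc)
    c = np.dot(oc, oc) - radius ** 2
    disc = b * b - 4.0 * c
    if disc < 0:
        return None
    t = (-b - np.sqrt(disc)) / 2.0
    return t if t > 0 else None

origin = np.array([0.0, 0.0, 0.0])
direction = np.array([0.0, 0.0, 1.0])                        # unit-length ray direction
print(ray_sphere(origin, direction, np.array([0.0, 0.0, 5.0]), 1.0))   # ~4.0
```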
- Item: Analytical Computer Generated Holography and Vector Graphics (2012). McAllister, Takumi; Wonnacott, David G. In this paper, I investigate the problem of generating holograms with computers. Holograms are a developing 3D display technology, superior to modern stereoscopic systems. After an introduction to the field, I survey some existing methods (Ping Pong, Coherent Ray Tracing, Diffraction Specific). Next, I examine two analytical algorithms in closer detail. Finally, I consider building on the analytical methods to make vector-based holograms, and examine their possible merit. Upon examining a theoretical algorithm, I conclude that without further advances in vector displays, a vector hologram would be impractical.
- Item: Analyzing Android App Permissions using ReRedexer (2018). Todd, Chandler; Micinski, Kristopher. For my thesis, I will be researching ways to analyze Android apps and the permissions that they have, in order to look at how apps are using and misusing the permissions that they are given. I will be doing this by building a tool using Soot and tools that are built using Soot as a base. This will be done by using both static and dynamic analysis in order to track how apps are using or misusing the permissions that they have. I will personally be digging into the Google+ API to see if it is ever being called in an app, and if it is being called, whether or not it is being used without the user's permission. Along with that, I will be looking specifically into either fixing some bugs in Soot that may cause issues with our tool or beginning to put together all of the code that we need to start building the tool.
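As a crude stand-in for the static side of the analysis this entry describes, here is a sketch that scans decompiled app sources for references to Google+ ("plus") API classes and reports which files touch them. The directory layout and class-name pattern are assumptions made for illustration; the thesis builds its tool on Soot, not on text search.

```python
# Rough sketch: textual scan of decompiled sources for Google+ API references.
import re
from pathlib import Path

PLUS_API = re.compile(r"com[./]google[./]android[./]gms[./]plus")  # assumed package pattern

def scan_sources(root):
    hits = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in {".java", ".smali"}:
            count = len(PLUS_API.findall(path.read_text(errors="ignore")))
            if count:
                hits[str(path)] = count
    return hits

if __name__ == "__main__":
    for file, count in scan_sources("decompiled_app/").items():   # hypothetical directory
        print(f"{count:3d} Google+ API reference(s) in {file}")
```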
- Item: Analyzing Energy Efficiency in Neural Networks (2020). Susai, Silvia; Friedler, Sorelle. Recent advances in deep learning have led to the development of state-of-the-art models with remarkable accuracy; however, previous work has shown that these results incur a high environmental cost due to their significant energy usage. Nevertheless, accuracy remains the predominant evaluation criterion for neural network performance, so much so that computationally-expensive techniques such as neural architecture searches are oftentimes employed to only moderate success. What is more, the relationship between energy usage and accuracy has been shown to be non-linear. Thus, an increase in energy usage may not necessarily lead to an increase in accuracy. This thesis surveys the current literature pertaining to energy efficiency in deep learning and proposes that future work should examine how energy usage is a distinct trade-off in neural network models.
- Item: Analyzing Iteration-Space Performance Tuning Using the Deriche Recursive Filtering Algorithm (2018). Glaser, Mary; Wonnacott, David G. No matter how efficient or satisfactory an algorithm or program may be, we as computer scientists are always looking to improve our programs by means of speeding them up or making them more accurate. In this paper, we will focus on finding the best way to optimize programs by using iteration-space performance tuning methods. We hope to contribute towards the bigger picture of understanding whether it is possible to do iteration-space performance tuning by studying a version of the Deriche Recursive Filtering Algorithm in C, which applies edge detection and smoothing to 2D images. Among the many different existing types of optimization techniques, we will be focusing on classic optimization and iterator-based optimization. With Deriche, we will first compare runtime results of the original code in C with the performance of classic optimization techniques and PluTo (a specific type of iteration-space optimization). Then we will convert the Deriche algorithm to the language Chapel and apply more direct iterator-based optimization techniques to further study its performance.
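As a simplified stand-in for the recursive filtering being tuned, here is a first-order smoothing pass run forward and then backward over each image row. The loop-carried dependence (each output needs the previous output) is exactly what makes iteration-space tuning of such code interesting; the coefficient and image are placeholders, not Deriche's actual higher-order filter.

```python
# Simplified recursive (IIR) row filter, a toy analogue of Deriche smoothing.
import numpy as np

def smooth_rows(image, alpha=0.25):
    out = image.astype(float).copy()
    rows, cols = out.shape
    for i in range(rows):
        for j in range(1, cols):                 # causal (left-to-right) pass
            out[i, j] = alpha * out[i, j] + (1 - alpha) * out[i, j - 1]
        for j in range(cols - 2, -1, -1):        # anti-causal (right-to-left) pass
            out[i, j] = alpha * out[i, j] + (1 - alpha) * out[i, j + 1]
    return out

image = np.random.default_rng(0).integers(0, 256, size=(4, 8))
print(smooth_rows(image).round(1))
```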
- Item: Analyzing Passing Patterns and Player Space Occupation to Determine if They are Key to a Soccer Team’s Victory (2023). Silverman, Samantha; Farias Sales Rocha Neto, Jeova. We analyze passing patterns between players and how players occupy space during a soccer game to determine if they are key characteristics in predicting the outcome of a soccer game. We identify three main approaches that experts have studied to solve this problem. The first approach counts the passing patterns that teams use and relates those counts to success; it also considers how the number of passes and the duration of passing patterns relate to success. The second approach uses representative scores, or indicator scores, to better summarize a group of statistics that investigates the success of teams. These authors find that teams with higher indicator scores have a better chance of winning. We can also combine common statistics by looking at how the defense moves after a pass to better understand the effect of the defense on successful passes. Finally, the third approach seeks to combine common statistics and even use machine learning to better represent and predict the success of teams. Authors create scores that combine statistics from both offense and defense, as well as use machine learning to predict the winner of the game as early as possible. Surprisingly, 67.9% of the time, Goes et al. correctly predicted the winner 22.5 minutes (or a quarter) into the soccer game [3, p. 529]. Within each approach, we tie the relationship between our topic and the research back together to make progress in solving the problem. We find that no single strategy helps teams win the game; instead, teams must focus on their own players' strengths and weaknesses and how to best attack their opponents. In the future, we wish to study a variety of leagues over a long period of time to determine if strategies have shifted or differ between leagues, and how teams best use different strategies for success.
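To make the "counting passing patterns" approach concrete, here is a minimal sketch that treats a possession as an ordered sequence of players on the ball and counts how often each length-3 chain occurs. The pass log is a toy placeholder, not data from any of the cited studies.

```python
# Minimal sketch: count length-3 passing patterns (player n-grams) in a possession.
from collections import Counter

def pass_patterns(possession, length=3):
    return Counter(tuple(possession[i:i + length])
                   for i in range(len(possession) - length + 1))

possession = ["LB", "CM", "RW", "CM", "RW", "ST"]   # ordered list of players on the ball
for pattern, count in pass_patterns(possession).most_common():
    print(" -> ".join(pattern), ":", count)
```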
- Item: Analyzing the Aeneid and its Translations with Topic Models and Word Embeddings (2022). Langen, Carter; Kuper, Charles; Grissom, Alvin. We review new advances in word embeddings and apply them to cross-lingual literary analysis of Latin and English translations of Latin. We introduce word embeddings and summarize in detail the developments that allow them to be trained on small data, on morphologically rich languages, and cross-lingually. We also review Latent Dirichlet Allocation and Polylingual Topic Models. We then use these models to analyze Vergil's Aeneid and the John Dryden, John Conington, and Theodore Williams translations into English.
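For orientation, here is a minimal sketch of the two tool families this entry reviews, using the gensim library (assuming a gensim 4.x API): a tiny Word2Vec model and a tiny LDA topic model trained on toy tokenized lines. Real use on the Aeneid would require lemmatization and far more text; everything below is illustrative, not the thesis's pipeline.

```python
# Minimal sketch: word embeddings and topic models on toy tokenized documents.
from gensim.models import Word2Vec, LdaModel
from gensim.corpora import Dictionary

docs = [["arma", "virumque", "cano", "troiae"],
        ["arms", "and", "the", "man", "i", "sing"],
        ["litora", "multum", "ille", "et", "terris"]]

# Small-data Word2Vec (skip toy hyperparameters in real work).
w2v = Word2Vec(sentences=docs, vector_size=16, window=3, min_count=1, epochs=50)
print(w2v.wv.most_similar("arma", topn=2))

# LDA over a bag-of-words corpus built from the same documents.
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]
lda = LdaModel(corpus, num_topics=2, id2word=dictionary, passes=10)
print(lda.print_topics())
```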
- Item: Analyzing the COMPAS Algorithm in Criminal Defendant Risk Assessment (2019). Ayad, Yasmine; Friedler, Sorelle. For my thesis, I analyzed the COMPAS recidivism prediction tool made by Equivant, which aims to predict how likely a defendant charged with a crime is to re-offend, assigning a score from 1 to 10 where 1 indicates lowest risk and 10 indicates highest risk; the tool is used by many states in the country. ProPublica's dataset consists of re-arrest data and the COMPAS predictions for 6,172 people made between 2012 and 2014, with which ProPublica showed that COMPAS was more likely to falsely label African-American defendants as high risk than White defendants, and more likely to falsely label White defendants as low risk than African-American defendants. I examine ProPublica's dataset along with Jai Nimgaonkar's dataset, which follows up on ProPublica's data to record whether these people were convicted of a crime, in order to see whether bias persists from re-arrest data to conviction data when looking at intersectionality between sex and race and at different fairness-aware algorithms.
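The error-rate comparison at the heart of the ProPublica analysis can be illustrated with a minimal sketch that compares false positive rates across groups for a binarized risk score. The rows below are toy placeholders, not the actual COMPAS data.

```python
# Minimal sketch: false positive rate of a binarized risk score, by group.
import numpy as np

def false_positive_rate(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    negatives = y_true == 0                 # people who did not re-offend
    return (y_pred[negatives] == 1).mean()

# high_risk = 1 when the decile score is above some chosen threshold (toy values).
records = {
    "African-American": {"reoffended": [0, 0, 0, 1, 1], "high_risk": [1, 1, 0, 1, 1]},
    "White":            {"reoffended": [0, 0, 0, 1, 1], "high_risk": [0, 0, 1, 1, 0]},
}
for group, r in records.items():
    print(group, "FPR =", round(false_positive_rate(r["reoffended"], r["high_risk"]), 3))
```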
- Item: Analyzing the Flow of Sensitive Data and Facebook Permissions in Android Applications (2018). Ellenburg, Skyler; Micinski, Kristopher. When Android applications request permission to use certain data, the extent to which they use that data and where it goes is often vague, deceptive, or completely ignored by the user. Researchers in this field examine the use of permissions in Android applications to analyze what data is being used when, where it comes from, and where it is going. To complete this analysis, I must understand the specific calls to Facebook's Graph API, or application program interface, in order to search for them in the application and compile a list of the fields and permissions to which Facebook grants access. I built my own application to test my detection script, and I compiled a list of other applications that access the Graph API in order to validate my permission analysis. The ultimate purpose is to contribute to an effective tool that takes an Android application as input and checks which social media permissions it uses and for what purpose, returning an analysis of the flow of sensitive data through the application.
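As a rough illustration of the field-extraction step, here is a sketch that scans decompiled sources for Graph API request strings and pulls out the comma-separated "fields" lists they ask Facebook for. The regex (targeting calls like `putString("fields", "id,name,email")`) and file layout are assumptions for illustration, not the thesis's actual detection script.

```python
# Rough sketch: collect requested Graph API fields from decompiled app sources.
import re
from pathlib import Path

FIELDS_PARAM = re.compile(r'"fields"\s*,\s*"([^"]+)"')   # assumed call pattern

def graph_api_fields(root):
    requested = set()
    for path in Path(root).rglob("*.java"):
        for match in FIELDS_PARAM.findall(path.read_text(errors="ignore")):
            requested.update(field.strip() for field in match.split(","))
    return sorted(requested)

if __name__ == "__main__":
    print(graph_api_fields("decompiled_app/"))   # hypothetical directory of decompiled code
```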
- Item: Ancestral DNA Reconstruction Using Pedigrees (2021). Wiggers, Alton; Mathieson, Sara. In this thesis we look at the purpose, process, and application of DNA reconstruction using a known family tree, known as a pedigree. First, we see the applications of having a pedigree with DNA sequences for all individuals, since the ultimate goal of DNA reconstruction is to recreate the genome of every member of the pedigree. We do this to better understand trait inheritance, particularly in humans. With a family tree complete with DNA, we can study things such as heritable diseases. Then, to better understand how DNA reconstruction is possible, we look at how family relatedness between people in a pedigree can help us understand the connections within their DNA. From that understanding, we will be able to see how DNA reconstruction can be performed on ancestors in a pedigree, given the genomes of their descendants. Focusing on an algorithm called "Thread," we see how DNA reconstruction is being applied today. For this work, we evaluate the performance of Thread in comparison to Merlin, an implementation of an older algorithm for DNA reconstruction. The experimentation done here reconfirms Thread's much more efficient execution time but exposes weaknesses in its performance under certain circumstances.
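A toy illustration of the Mendelian constraint that pedigree-based reconstruction exploits: at a biallelic site, any child allele that the known parent could not have transmitted must have come from the other parent. The genotypes below are placeholders; Thread and Merlin work genome-wide with phasing and recombination, which this sketch ignores.

```python
# Toy sketch: infer alleles an unknown parent must carry, from one known parent
# and the children's genotypes at a single site.
def inferred_alleles(known_parent, children):
    """Genotypes are sets of alleles, e.g. {'A', 'G'}. Any child allele absent
    from the known parent must have been transmitted by the other parent."""
    must_carry = set()
    for child in children:
        must_carry |= child - known_parent
    return must_carry

known_parent = {"A"}                              # homozygous A/A
children = [{"A", "G"}, {"A"}, {"A", "G"}]
print(inferred_alleles(known_parent, children))   # {'G'}: the other parent carries G
```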