SYS 2202: Data and Information Engineering
Overview
Spring Semester Undergraduate Course at UVA
This course provides an introduction to a fundamental aspect of data science and engineering - working with data. Learn skills to efficiently and effectively obtain, manipulate, store, and analyze data (i.e., convert data to information) to support decision making and future data modeling (e.g., regression, data mining, machine learning) efforts. Emphasis on obtaining, cleaning, combining, and wrangling the data into a more usable form. Learn how to break up a large data set into manageable pieces and then use a variety of quantitative and visual tools to summarize and extract information from it. The challenges of big data (e.g., size, streaming data, mixed variable types) will be addressed throughout the course. As an introductory course, the focus will be on understanding basic concepts and how to implement them in R, a leading data science language.
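As a rough illustration of the "break up a large data set into manageable pieces, then summarize" workflow mentioned above, here is a minimal sketch in base R. It is not taken from the course materials: the built-in `mtcars` data set, the grouping by `cyl`, and the names `pieces`, `summaries`, and `result` are all assumptions chosen for the example.

```r
# Built-in example data (an assumption for illustration);
# any data frame with a grouping column would work the same way.
data(mtcars)

# Split: break the data set into pieces, one per cylinder count.
pieces <- split(mtcars, mtcars$cyl)

# Apply: compute a few summary statistics for each piece.
summaries <- lapply(pieces, function(df) {
  data.frame(
    cyl      = df$cyl[1],
    n        = nrow(df),
    mean_mpg = mean(df$mpg),
    sd_mpg   = sd(df$mpg)
  )
})

# Combine: stack the per-group summaries into one table.
result <- do.call(rbind, summaries)
print(result)

# A quick visual summary of the same grouping.
boxplot(mpg ~ cyl, data = mtcars,
        xlab = "Cylinders", ylab = "Miles per gallon")
```

The same split, apply, combine pattern can be written more compactly with `aggregate()` or with packages such as dplyr; the explicit version is shown here only to make each step visible.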
Course Outline:
Introduction to Data Science and Engineering
Data Collection
Getting to Know Your Data
Data Types
Basic Statistical Descriptions of Data
Data Visualization
Measuring Data Similarity and Dissimilarity
Data Preprocessing
Data Cleaning
Data Integration
Data Reduction and Transformation
Dimensionality Reduction
Spring 2021 Projects
Vaccination Status
The COVID-19 vaccine rollout has been different in every part of the US. This group explored the effects of income per capita, political party, and population on a state's vaccine rollout.
Success Patterns in the NBA
An analysis of the five most important player statistics in NBA basketball and their effect on win/loss outcomes in games.
Substance Abuse
This project analyzed societal factors such as geography, legalization, age, family structure, education, and employment and how they correspond with substance use, abuse, and recovery.
Sports Betting Analysis
With sports betting recently legalized in some states but not yet in others, this project explored sports betting over/under and spread, and how weather and game status affect these metrics.
Socioeconomic Status
This study aims to investigate socioeconomic disparities in Virginia, specifically looking at education attainment, employment, income, poverty, and degree of urbanization.
Social Media Analysis
Influencers, advertisers, and companies that want to promote their products effectively on social media need to understand how users interact with them. This project analyzed how a company can optimize the popularity and engagement of its posts on Facebook.
Racial and Gender Bias Analysis
This project explores how race affects different aspects of American society as a result of systemic racial bias.
Exploring Pandemic Trends
This analysis addresses the various factors that have influenced the spread of COVID-19 across the US since early 2020. While there are numerous underlying factors, the focus of this analysis is on vaccines, mask policies, variants, and change of virus “hotspots” over time, as well as a comparison of the spread of COVID-19 in the USA to other countries and regions of the world.
Money and Sports Performance
Is there an optimal way to distribute money in order to get the most successful athletic team? This group took a closer look at whether spending more money can mean more success for sports teams.
Mental Health Analysis
Mental health is a pressing subject that can deeply affect anyone, anywhere. This project explores location, housing, demographics, COVID-19, and technology industries, observing how they affect mental health.
Cybersecurity Analysis
This group's analysis investigates cybersecurity trends to detect possible vulnerabilities that must be attended to when designing and implementing new cybersecurity measures.
Climate Change Analysis
Rising temperatures have created concerns among the scientific community regarding sea levels and the ways that communities and infrastructure will be affected by rising sea levels.
This project performed an analysis of ocean level rise, gross domestic product, disastrous weather, Arctic ice concentration, and crop yields.
Spring 2020 Projects
Coronavirus Live Tracker
As COVID-19 began to spread throughout the world in March 2020, this project created a Coronavirus live tracker that focuses on state testing data to understand how states are impacted differently from one another.
Twitter Word Cloud
This group used the expansive data on Twitter to create a word cloud that visually represents trends in topics relevant to users in specific locations around the United States.
Stock Market Analysis: Effects of the Coronavirus
As the coronavirus pandemic has drastically affected the US economy, this project attempts to discover underlying relationships between various sectors of the stock market and coronavirus data, both within the United States and globally.
Student Productivity and Wellbeing
The purpose of this project is to characterize behavior patterns of anonymized UVA undergraduate students including movement, social communication, and activities from Aware data and identify the relationships with their corresponding productivity and wellbeing levels.
Crime Data Analysis
In this study, the team researched and analyzed crime in their town of Charlottesville, VA by creating an interactive map from a crime dataset from Charlottesville Open Data.