Data Visualization

Data Analysis and Visualization in Python, R, and JavaScript

In 2019, I started the Data Science specialization in Northwestern's MS in Information Systems program. The foundational courses included linear algebra, calculus, and applied statistics (in R and Python) to prepare us for study in data engineering, machine learning, and artificial intelligence. These courses taught me to make a habit of exploratory data analysis (EDA) and data visualization to understand a dataset and draw insights from it. Doing so is the first step in any analytics or engineering practice.

Since then, I've used data visualization libraries in R, Python, and JavaScript. Many of my data visualization samples are part of data engineering workflows published on my GitHub account. Below are an interactive dashboard built with Plotly and Dash and a formal statistical report written in R.

Interactive Graphs and Dashboards

Precious Metals Prices over Time

View dashboard | View GitHub repository

This interactive graph displays precious metals prices in US dollars per ounce over the three years from 2018 to 2021.

The graph is created with Plotly, whose JavaScript library (Plotly.js) draws the figure in the browser.

The web page is created with Dash, which launches a server, uses React.js behind the scenes to manage the state of the data and the graph components, and uses callbacks to respond to user requests to change the view of the data.

The dashboard is deployed on Google Cloud App Engine.

Exploratory Data Analysis

Protecting the Abalone Fishing Industry

View full report | View GitHub repository

This EDA in R addresses a conservation problem in fisheries and is a common academic project in data science. The question posed is how best to regulate the abalone harvest to maintain both a healthy abalone population and a healthy fishing industry.

The dataset included 1086 observations of abalone caught in the wild and 10 variables: sex, length, diameter, height, whole, shuck, rings, class, volume, and ratio.

The analysis hinged on finding the variable that best classifies the abalone as adult or infant. The exploration found that the best indicator of age was volume. I recommended three options ("cutoffs") to fisheries that could maximize the percentage of adults and minimize the number of infants in the harvest by taking only abalone whose volume exceeded the recommended cutoff.
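The trade-off behind a volume cutoff can be illustrated with a short Python sketch. The report itself was written in R, and the sample volumes, labels, candidate cutoffs, and the function names harvest_rates and best_cutoff below are made up for illustration. The selection rule shown (largest gap between the share of adults and the share of infants harvested) is one plausible criterion, assumed here, and not necessarily the one used in the report.

```python
def harvest_rates(volumes, is_adult, cutoff):
    """Return (share of adults harvested, share of infants harvested)
    when only abalone with volume above `cutoff` are taken."""
    adults = [v for v, adult in zip(volumes, is_adult) if adult]
    infants = [v for v, adult in zip(volumes, is_adult) if not adult]
    adult_rate = sum(v > cutoff for v in adults) / len(adults)
    infant_rate = sum(v > cutoff for v in infants) / len(infants)
    return adult_rate, infant_rate


def best_cutoff(volumes, is_adult, candidates):
    """Pick the candidate with the largest gap between the adult and
    infant harvest shares (an assumed criterion for this sketch)."""
    def gap(cutoff):
        adult_rate, infant_rate = harvest_rates(volumes, is_adult, cutoff)
        return adult_rate - infant_rate
    return max(candidates, key=gap)


# Made-up sample data for illustration only; NOT the abalone dataset.
volumes = [50, 120, 200, 260, 310, 330, 400, 450, 520]
is_adult = [False, False, False, True, False, True, True, True, True]

for cutoff in (100, 250, 350):
    adult_rate, infant_rate = harvest_rates(volumes, is_adult, cutoff)
    print(f"cutoff={cutoff}: adults={adult_rate:.2f}, infants={infant_rate:.2f}")

print("best:", best_cutoff(volumes, is_adult, [100, 250, 350]))
```

A low cutoff harvests every adult but also most infants, while a high cutoff protects infants at the cost of leaving adults unharvested. Offering several cutoffs lets a fishery choose where to sit on that trade-off.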