Although all of my academic work was experimental, it always included a computational component. At first I was tracking and analyzing colloidal particle trajectories across tens of thousands of images; later I was solving a system of differential equations to determine the interfacial tension from images of an oil-water interface. Fully immersing myself in data analysis seemed like the obvious next step, so I took an immersive, project-based course. I then built on my initial projects to make them more comprehensive while learning new techniques along the way. In this section, I describe the thought process and inspiration behind some of my work. Accompanying notebooks are available on my GitHub.
I received a take-home assessment after applying for an analytics position with a baseball team. Anyone who knows me knows this would be something of a dream job, since I have played softball basically my whole life. I dove right into the assignment, which came with the stipulation of spending only 1-2 hours on it. I did try to stay within that limit, but since the subject matter was one of those magical marriages between personal and professional interests, I continued on afterwards to practice 1) scaling and sampling techniques and 2) creating useful visualizations of the results.
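As a minimal sketch of what I mean by scaling and sampling (the features and values here are invented for illustration, not from the actual assessment), standardizing features and drawing a random train/test split might look like:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical player metrics: rows are players, columns are features
# (say, exit velocity and launch angle) -- made up for this sketch.
X = rng.normal(loc=[88.0, 12.0], scale=[5.0, 8.0], size=(100, 2))

# Standard scaling: zero mean, unit variance per feature.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# Random sampling into an 80/20 train/test split.
idx = rng.permutation(len(X_scaled))
split = int(0.8 * len(X_scaled))
X_train, X_test = X_scaled[idx[:split]], X_scaled[idx[split:]]
```

Shuffling indices rather than rows keeps the split reproducible and avoids copying the data twice.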
In an attempt to data science my way to solutions to the world’s problems, I tried to predict results from professional men’s tennis. More specifically, I used existing rankings and statistics to determine whether a given player would make the round of 16 (R16) of a Grand Slam tournament. This wasn’t even for gambling purposes - I’m too cheap to gamble - but rather for the motivation of seeing my favorite players in person. Ticket prices increase with each round and go on sale before the tournament starts, so it’s a gamble to spring for pricey later-round seats without knowing who the players will be. If I knew with greater certainty that my favorites would still be playing, perhaps I’d be willing to spend more.
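A naive baseline for this kind of prediction (the data below is invented for illustration, not from my actual dataset) is to predict that a player reaches the R16 whenever their ranking is inside the top 16, then measure how often that rule holds:

```python
# Toy data: (ranking, reached_R16) pairs -- invented for this sketch.
players = [
    (1, True), (3, True), (7, False), (12, True),
    (20, False), (35, True), (50, False), (80, False),
]

# Baseline rule: predict R16 for anyone ranked in the top 16.
predictions = [rank <= 16 for rank, _ in players]
actual = [reached for _, reached in players]

# Fraction of players where the rule matched what actually happened.
accuracy = sum(p == a for p, a in zip(predictions, actual)) / len(players)
```

Any model built from richer statistics has to beat this kind of rankings-only baseline to be worth the ticket money.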
The second project was one of our own choosing. I (finally) settled on trying to model the rental rates of Airbnb listings in Toronto using the details of each rental unit. As someone who frequently uses Airbnb, I am quite familiar with the website and its offerings. Choosing the location was easy - having spent the last six years in Toronto, I know the city well, which in theory would come in handy during the analysis.
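A first pass at modeling rates from listing details might be an ordinary least-squares fit; the features below (bedrooms, distance from downtown) and their values are stand-ins I invented, not the actual Toronto dataset:

```python
import numpy as np

# Hypothetical listings: [bedrooms, km from downtown] -> nightly rate.
# All values invented for illustration.
X = np.array([[1, 2.0], [2, 1.0], [3, 5.0], [1, 8.0], [2, 3.5]])
y = np.array([110.0, 180.0, 200.0, 75.0, 150.0])

# Add an intercept column and solve the least-squares problem.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Predicted rate for a hypothetical 2-bedroom unit 4 km from downtown.
pred = np.array([1.0, 2.0, 4.0]) @ coef
```

In practice the listing details are mostly categorical (neighbourhood, room type, amenities), so the real work is encoding those into a design matrix before any fit like this one.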
The first project at Metis involved using Python and Pandas to examine the turnstile data that the Metropolitan Transportation Authority makes available online. Through some introductory data cleaning and analysis, we were expected to determine MTA ridership at different stations and over different time periods.
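The core of that analysis was turning cumulative turnstile counters into ridership counts. A sketch with a made-up miniature of the turnstile format (the real files have many more columns and readings every few hours):

```python
import pandas as pd

# Invented stand-in for the MTA turnstile format: each row is a
# cumulative entry count for one turnstile at one point in time.
df = pd.DataFrame({
    "STATION": ["59 ST", "59 ST", "59 ST", "CANAL ST", "CANAL ST", "CANAL ST"],
    "UNIT": ["R051", "R051", "R051", "R062", "R062", "R062"],
    "ENTRIES": [1000, 1450, 2100, 500, 620, 900],
})

# Counters are cumulative, so ridership per interval is the difference
# between consecutive readings of the same turnstile.
df["RIDERS"] = df.groupby(["STATION", "UNIT"])["ENTRIES"].diff()

# Total ridership by station over the period.
totals = df.groupby("STATION")["RIDERS"].sum()
```

Grouping before taking the difference matters: a plain `diff()` across the whole frame would subtract one station's counter from another's at every boundary.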