Hello! Thanks for visiting my webpage. My name is Michelle Tat, and I am currently a principal data scientist at the City of Boston. I encourage you to browse my GitHub repositories (click the “View on GitHub” button above, or click here). I’ve started working on a few fun things lately, including:
- A script exploring cryptocurrency forecasting with autoregressive models
- A module that will allow individuals to easily access Boston 311 data for analysis
All of my work-related code can be seen on the City of Boston GitHub. Unfortunately, many of the analytics and data science projects I am working on are in private repos. They will become public as our team completes those projects.
I also have a blog, where I recently wrote an introductory tutorial on Random Forest in R. Feel free to check it out!
Links to my Insight Projects
This project takes a PubMed keyword search and uses natural language processing to clean the retrieved text and run topic modeling on it.
Notably, this project was an exercise in applying more computer science principles, such as object-oriented programming, unit testing, and version control with GitHub.
Notebooks and scripts covering my development process for Happy Helper. Happy Helper was my initial Insight Project, which I demoed at various Boston-area companies.
This project pulls 90,000 reddit comments from Google BigQuery (specifically from the subreddits /r/anxiety and /r/depression), cleans them up, and runs a classification analysis on the text using various models (e.g., Support Vector Machines, Naive Bayes).
A web app came out of this, where a user can input a chunk of text and have it classified as more similar to anxious or depressive text. The user is then referred to links in the appropriate reddit support forum, chosen by matching the user's input against posts in that forum by similarity. Check out Happy Helper for an example.
The web app was developed using the Flask framework and deployed on Amazon EC2.
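For readers curious what the classification step might look like, here is a minimal sketch of a multinomial Naive Bayes text classifier written from scratch in plain Python. It is not the Happy Helper code, which is not shown on this page and most likely relied on a library. The class name `NaiveBayesTextClassifier`, the `tokenize` helper, and the four training sentences are all invented for illustration; the real project trained on reddit comments.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (a stand-in for the cleaning step)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTextClassifier:
    """Hypothetical multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior for each label."""
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[label][t] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training data, invented for this sketch only.
train_texts = [
    "my heart races and I panic before every meeting",
    "I worry constantly and cannot stop overthinking",
    "I feel empty and have no energy to get out of bed",
    "nothing feels enjoyable anymore and I feel hopeless",
]
train_labels = ["anxiety", "anxiety", "depression", "depression"]

clf = NaiveBayesTextClassifier().fit(train_texts, train_labels)
print(clf.predict("I panic and worry about everything"))
print(clf.predict("I feel hopeless and empty"))
```

With only four training sentences this is purely a demonstration of the mechanics. A real pipeline would need far more data, a held-out test set, and a comparison against other models such as Support Vector Machines.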
Other Fun Stuff
- This was an exploratory project, where I learned to put together a basic dashboard in RShiny, using the Boston 311 Requests dataset.
- This is an ongoing project where I use natural language processing (e.g., frequency analyses, document classification, word embeddings) to examine the concerns of the transgender community. You can read more about it on my blog. This project is written in Python.
- Cleaned up webpage and personal repos.