Pauline Yue

Data Scientist + Social Impact


I'm a data scientist interested in using machine learning to make a social impact and improve the well-being of communities.

Python, R, SQL, Predictive Modeling, Machine Learning, Data Visualization, Google Cloud Platform


Predictive Modeling

Secure the Bag

A platform that helps users gain a deeper understanding of food insecurity in California. Predictive models drive data visualizations of historical and projected food security trends across the state. Interactive machine learning tools let users gauge their own current food security status. The platform also lists resources that give users direct access to food aid, benefits, and more.

Python, Data Visualization, Machine Learning, Predictive Models

Data Visualization

informARTive museum

An immersive virtual experience for viewing artwork from a data-centric point of view. The featured pieces come from the publicly available MET dataset. Through visualizations, the website shows how the styles of popular artists change over time, where the museum's art pieces originated, which materials are used most often in artwork, and more.

Tableau, HTML, Python, pandas, NumPy

Machine Learning

California Wildfires

Wildfires are becoming more common and more destructive in California. This project aims to classify and predict how large a wildfire may grow by training models on a variety of weather conditions, providing information that can help prevent large wildfires.

Python, Machine Learning, Predictive Modeling

Natural Language Processing

Fake Review Detection and Sentiment Analysis

Fake reviews often damage the reputation and integrity of services such as Yelp. In this project, fake reviews were created using natural language processing, and n-gram and GPT-2 models were used to distinguish real reviews from fake ones. A sentiment analysis of the reviews added context around the business use case, including how often fake reviews are negative. The models used pretrained weights from BERT and DistilBERT to improve accuracy.

NLP, TensorFlow, Keras, Neural Networks
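
As a rough illustration of how an n-gram model could help separate real reviews from fake ones, here is a minimal sketch in plain Python. It is not the project's actual code or data: the toy reviews, the names `BigramModel` and `classify`, the choice of bigrams, and the add-one smoothing are all assumptions made for the example. The idea is to fit one bigram model per class and label a new review by whichever model assigns it the higher average log-probability.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and add sentence boundary markers."""
    return ["<s>"] + text.lower().split() + ["</s>"]


def build_vocab(*corpora):
    """Collect every token seen in any of the given corpora."""
    return {tok for corpus in corpora for text in corpus for tok in tokenize(text)}


class BigramModel:
    """Bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, corpus, vocab_size):
        self.vocab_size = vocab_size
        self.contexts = Counter()  # how often each token appears as a context
        self.bigrams = Counter()   # how often each (previous, current) pair appears
        for text in corpus:
            tokens = tokenize(text)
            self.contexts.update(tokens[:-1])
            self.bigrams.update(zip(tokens, tokens[1:]))

    def log_prob(self, text):
        """Average smoothed log-probability per bigram of the text."""
        tokens = tokenize(text)
        pairs = list(zip(tokens, tokens[1:]))
        total = 0.0
        for prev, cur in pairs:
            numerator = self.bigrams[(prev, cur)] + 1
            denominator = self.contexts[prev] + self.vocab_size
            total += math.log(numerator / denominator)
        return total / len(pairs)


def classify(text, real_model, fake_model):
    """Label text by whichever class model scores it higher."""
    if real_model.log_prob(text) >= fake_model.log_prob(text):
        return "real"
    return "fake"


# Hypothetical toy corpora, invented purely for illustration.
real_reviews = [
    "the food was great and the staff were friendly",
    "we waited a long time but the pasta was worth it",
]
fake_reviews = [
    "best place ever best food ever best service ever",
    "amazing amazing experience best restaurant ever",
]

# Both models share one vocabulary size (plus one slot for unseen words)
# so that their smoothed probabilities are comparable.
vocab_size = len(build_vocab(real_reviews, fake_reviews)) + 1
real_model = BigramModel(real_reviews, vocab_size)
fake_model = BigramModel(fake_reviews, vocab_size)

label = classify("the staff were friendly", real_model, fake_model)
```

A model this small only picks up on repeated word pairs, so it is best read as a baseline for intuition. Neural approaches such as the BERT and DistilBERT weights mentioned above capture far more context.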