Explore thousands of R data science projects and get inspired

Malignant Tumor Identification in Breast Tissue

Utilized machine learning techniques to aid in identifying malignant tumors.
Claire Harris

Mining Spatial & Text Data for Adoptable Dogs Dataset

By integrating county-level location data with a dataset of adoptable dogs, I cleaned the web-scraped data and visualized the dogs' locations at the national and state levels for the US.
John Trygier, M.S.

Time Series Forecasting w/ Self-Optimizing ARIMA Models

Using WDI (World Development Indicators) data, I go step by step through the creation of ARIMA models for time-series forecasting and use machine learning to create an optimized model using the BIC and AIC. The result is an ARIMA model optimally fitted to GDP growth data for France, the UK, and India.
John Trygier, M.S.

How Do Holidays Impact COVID-19 Cases?

Using Facebook Prophet and ARIMA, I develop time series models that fit and predict the trend of COVID case numbers, generate time-series cross-validation performance metrics, and analyze the impact of holidays.
John Trygier, M.S.

Propensity to churn

Development of two models. The first, propensity of email engagement, investigates the effectiveness of an email-based marketing campaign. The second, propensity to churn, aims to predict which consumers will cease to be customers of the company.
Alberto Monaco

Employee attrition investigation

Analysis of the factors and relationships that lead to employee attrition, using Bayesian networks.
Alberto Monaco

Principal Component Analysis via mathematical definition

PCA (Principal Component Analysis) is a powerful mathematical method that can substantially reduce the size of a dataset while retaining most of the explanatory power of the original. In the real world, among the 100M rows of the original data set, the PCA uses only 60% of the original data to...
Hosung Lee

Yelp Restaurant Reviews Analysis

We looked at a subset of the restaurant reviews on Yelp and attempted to draw conclusions about the relationship between words in the review text and the star rating using text mining. The objective is to identify which reviews are positive and negative based on the...
Eduardo Herrera

Sentiment Analysis of NFL Quarterback Using Twitter APIs and R

Sports fans across the country regularly discuss their favorite teams, players, coaches, etc. throughout each sports season. After all, a common passion connects people, and sports often bring people together. With so many opinions across different teams, what if we wanted to better understand...
Brianna King

Rennes Trafic

This shiny web application visualizes data that concerns the state of traffic in Rennes. The data are provided to Rennes Metropole by the company Autoroutes Trafic. The main objective of this project is to showcase some dashboarding skills using R and Shiny.
Abdessamad. BAAHMED

Visualize interest vs capacity for undergraduate programs in Indonesia (SBMPTN data)

Part of a project on the importance of education choices. (1) Web-scraped data from multiple pages of a government website: https://sidata-ptn.ltmpt.ac.id/ptn_sb.php? (2) Processed data on 3369 majors into (a) a nested list and (b) its transformed tibble version (3) Made insightful visualization by...
Salsabila Mahdi

Exploration of feature selection techniques

This notebook focuses on the different feature selection techniques in order to identify the most suitable method.
Valentin Joly

University Twitter Analysis

In this project, I worked with a group to analyze the Twitter page for the University of Waterloo, comparing it to two other universities, the University of Toronto and the University of Western Ontario. The focus of the analysis was to create recommendations on how the University of Waterloo...
Richard Bao

Sales Analysis

In this analysis project, past sales figures from two companies, Tesla and Apple, were analyzed. From this data, future sales figures were extrapolated using ARIMA.
Richard Bao

Sentiment Analysis of Twitter Hashtag

I used almost 250,000 tweets under the #StandWithUkraine hashtag and analysed the sentiment associated with them over the course of the first two weeks of the war. Real-world events were also mapped onto the sentiment timeline to attempt to provide explanatory reasoning for the shifts...
Gareth Moen

Forecasting Public Transport Usage

I use a linear regression model and an auto ARIMA model and compare their MAE and RMSE results from the training data. With the auto ARIMA model producing marginally better results and the linear model possibly overfitting, the ARIMA model was trained on the full dataset and used for the final...
Gareth Moen

Nobel Prize Analysis

The Nobel Prize is perhaps the world's best-known scientific award. Every year it is given to scientists and scholars in chemistry, literature, physics, medicine, economics, and peace. The first Nobel Prize was handed out in 1901, and at that time the prize was Eurocentric and male-focused,...
Rita Vo

Modeling Wine Preferences

The modified dataset of 1,599 red wine samples from the north of Portugal is used to model wine quality based on physicochemical tests. The analysis was implemented on a random sample, containing 440 values from each variable.
Rita Vo

Education in Bangladesh during COVID-19

Report produced as part of a project in the first semester of my studies in statistics and business intelligence.
Enzo Berreur

Movie Rating Prediction

This project utilized predictive analytics to tease apart what factors contribute to the classification of “good” and “bad” movies based on their ratings. Variables such as director, cast, and budget are considered for the prediction. Understanding the relationship between movie quality and...
Mandy Guo