Fake news is a strange problem, and a surprisingly recent one. Look at the Google Trends screenshot below:

Google Trends for the search term 'fake news'
Before 2017, the term was barely searched at all. With the problem this new and still largely unsolved, my fellowship friends and I thought it would be fun to take a stab at it for the 2017 Global AI Hackathon.

We were lucky enough to find a dataset on Kaggle that had fake news stories. However, we struggled to find a good dataset for real news, so we decided to create our own. We built a web scraper that grabbed articles from a variety of reliable sources across the spectrum: liberal, moderate, and conservative.
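The repo has the real scraper; here is a minimal sketch of the general approach, assuming `requests` and `BeautifulSoup`. The source URLs and the article-link heuristic are placeholders, not the actual sites or selectors we used.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical outlets spanning the political spectrum (placeholders).
SOURCES = [
    "https://example-liberal-news.com",
    "https://example-moderate-news.com",
    "https://example-conservative-news.com",
]

def scrape_articles(source_url):
    """Fetch a source's front page and pull out the text of linked articles."""
    html = requests.get(source_url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    articles = []
    for link in soup.select("a[href]"):
        href = link["href"]
        if "/article/" in href:  # site-specific heuristic; varies per outlet
            page = BeautifulSoup(requests.get(href, timeout=10).text, "html.parser")
            paragraphs = [p.get_text(" ", strip=True) for p in page.find_all("p")]
            articles.append({"url": href, "text": "\n".join(paragraphs)})
    return articles

real_news = [article for src in SOURCES for article in scrape_articles(src)]
```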

From there, we developed a Naive Bayes classifier that could predict whether an article was fake with 85% accuracy. The model was trained on 4,000 articles and tested on 1,100 articles.
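For a sense of what that looks like in code, here is a sketch of a Naive Bayes text classifier using a scikit-learn pipeline. The TF-IDF bag-of-words features are one common choice and an assumption here, not necessarily the exact features we used, and the toy data stands in for our scraped corpus.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score

# Toy placeholders for the corpus: in the real project these were
# ~4,000 training articles and ~1,100 test articles (1 = fake, 0 = real).
train_texts = [
    "aliens secretly control the senate, sources say",
    "the senate passed the budget bill today",
]
train_labels = [1, 0]
test_texts = ["the president signed the bill into law"]
test_labels = [0]

# Bag-of-words (TF-IDF) features feeding a multinomial Naive Bayes model.
model = make_pipeline(TfidfVectorizer(stop_words="english"), MultinomialNB())
model.fit(train_texts, train_labels)

predictions = model.predict(test_texts)
print(f"Accuracy: {accuracy_score(test_labels, predictions):.2%}")
```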

Model Overview

With a working model in hand, we set up a web demo with a really simple user flow:

  • Enter the URL of the page in question

  • We fetch the page at that URL

  • We parse the information on the page and feed it into the model

  • We return the model's prediction, along with a sentiment analysis and a similarity score between the headline and the article body (one key indicator of fake news); a sketch of this flow is below
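Here is roughly what that pipeline looks like end to end. The use of TextBlob for sentiment and cosine similarity over TF-IDF vectors for the headline-vs-body check are assumptions for illustration; `model` is a trained classifier pipeline like the one sketched earlier.

```python
import requests
from bs4 import BeautifulSoup
from textblob import TextBlob
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def analyze_url(url, model):
    """Fetch a page, classify it, and compute the extra signals we surface."""
    # Fetch and parse the page (assumes the headline lives in an <h1> tag).
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    headline = soup.find("h1").get_text(" ", strip=True)
    body = " ".join(p.get_text(" ", strip=True) for p in soup.find_all("p"))

    # Classifier prediction (class 1 assumed to mean "fake").
    fake_probability = model.predict_proba([body])[0][1]

    # Sentiment of the article body.
    sentiment = TextBlob(body).sentiment.polarity

    # Similarity between the headline and the body text.
    tfidf = TfidfVectorizer(stop_words="english").fit_transform([headline, body])
    similarity = cosine_similarity(tfidf[0], tfidf[1])[0][0]

    return {
        "fake_probability": fake_probability,
        "sentiment": sentiment,
        "headline_body_similarity": similarity,
    }
```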

Web Demo UI

We ended up winning the 2017 Seattle Global AI Hackathon. Check out the demo for yourself here and the GitHub repo here!