Under the hood of Newshound's news aggregation platform

In the coming weeks, we will explain how our news aggregator works under the hood. Please comment on the articles below; your feedback helps us tackle our technical challenges better.

#1: Gathering and ranking news stories
#2: Clustering news stories at scale
#3: Discovering the biggest newsmakers of the day
#4: Searching news archives
#5: Analyzing news sentiment for fun and profit

#1: Gathering and ranking news stories

A news aggregator collects many news stories from many publishers, which raises the question: how do we surface the most important stories of the day? To answer it, we use Natural Language Processing techniques such as the bag-of-words model and approximate string matching algorithms.
↣ Click here to read the detailed blog post.
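
To make the idea concrete, here is a minimal sketch of how bag-of-words similarity and approximate string matching could be combined to rank headlines. It is an illustration under our own assumptions, not Newshound's production code: the function names, the 0.5 threshold, and the "count how many other headlines resemble this one" scoring rule are all hypothetical, and Python's standard difflib.SequenceMatcher stands in for whichever approximate string matching algorithm is actually used.

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher


def bag_of_words(text):
    """Lowercase the text and count how often each word occurs."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(bag_a, bag_b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    dot = sum(bag_a[word] * bag_b[word] for word in bag_a.keys() & bag_b.keys())
    norm_a = math.sqrt(sum(count * count for count in bag_a.values()))
    norm_b = math.sqrt(sum(count * count for count in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def fuzzy_ratio(a, b):
    """Approximate string match score from difflib (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank_headlines(headlines, threshold=0.5):
    """Score each headline by the number of other headlines that resemble it.

    Assumption for this sketch: a story covered by many publishers is
    more important, so a higher count means a higher rank.
    """
    bags = [bag_of_words(headline) for headline in headlines]
    scored = []
    for i, headline in enumerate(headlines):
        score = 0
        for j, other in enumerate(headlines):
            if i == j:
                continue
            similarity = max(
                cosine_similarity(bags[i], bags[j]),
                fuzzy_ratio(headline, other),
            )
            if similarity >= threshold:
                score += 1
        scored.append((score, headline))
    # sorted() is stable, so ties keep their original order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    sample = [
        "Central bank raises interest rates",
        "Interest rates raised by central bank",
        "Central bank raises rates again",
        "Local team wins championship",
    ]
    for score, headline in rank_headlines(sample):
        print(score, headline)
```

In this sketch the bag-of-words score catches headlines that share words in a different order, while the fuzzy ratio catches near-identical strings with small spelling differences. Comparing every pair is quadratic in the number of headlines, so it only suits small batches.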

#2: Clustering news stories at scale

A news aggregator consumes thousands of news stories a day. We use data mining algorithms such as Locality-sensitive Hashing with MinHash to cluster similar news stories at scale. Hashing,