An Initial, Failed Solution For The Event Detection Task

In this post, we validate our initial solution for the event detection task. If you’re not familiar with the task, you can refer to our previous post, “How to” Event Detection in Media using NLP and AI. To apply event detection to news articles, we plan to do the following: (1) represent each article as a vector of expressive features; (2) feed the vectorized articles into a sequential clustering model that aggregates the ones talking about the same event. In this post, we solve the problem of event detection in the Arabic language. Features Effectiveness: In this …
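As a rough illustration of the first step in the plan above (representing each article as a feature vector), here is a minimal sketch that uses TF-IDF weights. The excerpt does not say which features the post actually uses, so TF-IDF, the whitespace tokenizer, the smoothed IDF formula, and the names `tokenize` and `tfidf_vectors` are all assumptions made for illustration. The sample articles are in English only for readability; real Arabic text would need proper normalization and segmentation.

```python
import math
from collections import Counter


def tokenize(text):
    # Placeholder tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


def tfidf_vectors(articles):
    """Return one sparse TF-IDF vector (dict: term -> weight) per article."""
    tokenized = [tokenize(article) for article in articles]
    n_docs = len(tokenized)

    # Document frequency: in how many articles each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty articles
        vectors.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


articles = [
    "earthquake hits coastal city",
    "coastal city earthquake damage",
    "football final tonight",
]
vectors = tfidf_vectors(articles)
```

Terms that occur in fewer articles (such as "hits" here) receive a higher weight than terms shared across articles (such as "earthquake"), which is what makes the vectors useful for telling events apart.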

Building a Test Collection for Event Detection Systems Evaluation

Before we start, if you’re not familiar with the Event Detection task in NLP, you can refer to our previous post on this topic. So you’ve built a system to detect events in the media… now what? While building a system is a key step, how the system performs on real-world data is equally important. We need to know whether it actually works and whether we can trust its decisions. So we need to evaluate our system before putting it to use. Evaluation is a highly important step in the development of any type of system, as it allows the …

News Stream Clustering – Sequential Clustering in Action

In a previous post, we talked about “How to” Event Detection in Media using NLP and AI. In another post, we presented sequential clustering. Today we’re introducing an online (sequential) clustering algorithm specialized in aggregating news articles into fine-grained story clusters. Problem Formulation: We focus on the clustering of a stream of documents, where the number of clusters is not fixed but learned automatically. We denote by D the (potentially infinite) space of documents. We are interested in associating each document with a cluster via the function C(d) ∈ N, which returns the cluster label given a document. For each …
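To make the formulation above concrete, here is a minimal sketch of an online clusterer whose `assign` method plays the role of C(d): it returns an integer cluster label for each incoming document and opens a new cluster when nothing is similar enough. This is a generic threshold-based scheme, not necessarily the algorithm of the post; the cosine similarity, the summed-vector centroids, the `threshold` value, and the class name `SequentialClusterer` are assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors (dict: term -> weight)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


class SequentialClusterer:
    def __init__(self, threshold=0.4):
        self.threshold = threshold  # assumed value; would need tuning
        self.centroids = []         # one summed vector per cluster

    def assign(self, vector):
        """Return the cluster label for one document, updating the clusters."""
        best_label, best_sim = None, -1.0
        for label, centroid in enumerate(self.centroids):
            sim = cosine(vector, centroid)
            if sim > best_sim:
                best_label, best_sim = label, sim

        # No cluster is similar enough: open a new one for this document.
        if best_label is None or best_sim < self.threshold:
            self.centroids.append(dict(vector))
            return len(self.centroids) - 1

        # Otherwise fold the document into the closest cluster's centroid.
        centroid = self.centroids[best_label]
        for term, weight in vector.items():
            centroid[term] = centroid.get(term, 0.0) + weight
        return best_label


# A tiny hand-made stream of document vectors (illustrative data only).
stream = [
    {"earthquake": 1.0, "city": 1.0},
    {"earthquake": 1.0, "damage": 1.0},
    {"football": 1.0, "final": 1.0},
    {"city": 1.0, "damage": 1.0},
]
clusterer = SequentialClusterer(threshold=0.4)
labels = [clusterer.assign(doc) for doc in stream]
```

Each document is seen exactly once and the number of clusters grows with the stream, which matches the setting where the number of clusters is learned rather than fixed in advance.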

An Implementation of a News Stream Sequence Clustering Algorithm

In a previous post, we discussed a News Stream Sequential Clustering Algorithm. In this post, we discuss the details of implementing this algorithm with minimal tuning and show the results produced by this implementation. Along with this post, we’re evaluating …