Fetching And Indexing Tweets in AWS ElasticSearch

Here at Almeta we strive to be a source of all valuable content on the Arabic side of the web. While our current effort focuses on indexing and analysing news articles and blog posts, we believe that social outlets remain a vital part of the whole news experience of any internet user. In this post we present our plan to fetch, analyze and index tweets in a simple and efficient manner. Twitter API Twitter has 2 major free APIs (asynchronous and streaming) with the following main differences between them: Functionality: the async API allows you to search, filter and inspect …

User Profiling Using AWS ElasticSearch – RomCom use case.

Personal differences and preferences mark a very important part of our identity, and optimizing the user experience based on them can be a great tool for improving user engagement. In our previous post we tackled the issue of personalized recommendations and how ElasticSearch can make the process extremely simple. However, in order to build a robust personal recommendation system it is paramount to have an idea of each user: who are they and what do they like? This is commonly referred to as a user profile. In this post we will present a road map to enabling user profiling with …

Initial Genre Classification Experiments

The ability to filter your news feed by genre is a critical component of any news aggregator: users usually want to read only sports or political news, not just the most recent or hottest news. In this post, we will explore in great detail our initial genre classification system. Let’s start with the data. In the following experiments, we used an in-house data set. The data set is composed of 190307 HTML documents crawled from the following domains [Aljazira, Alarabia, Aljadeed, RT Arabic, BBC Arabic]. For each of the documents we tried to extract the following features: …

Initial Experiments on Measuring Informativity of an Arabic Content – Data Collection

In one of our previous articles we suggested a method to build an initial system for informativeness detection. This system should take a small set of manually annotated pairwise comparisons, use Snorkel to expand these annotations automatically into a larger training set, and then train a model on this set to estimate article informativeness. In this article, we will go into the details of the implementation of this plan. Data Annotation As noted above, Snorkel will need 3 types of training data: a small manually annotated test set to evaluate the results of the model; a smaller manually annotated …

Contrary view detection based on VODUM

While reading the news each one of us perceives it in a different manner. We have our own biases and we tend to search for information that confirms our previous beliefs. Thus different people might have drastically different viewpoints of …

From Sentiment to Political Bias in the Arab World and the Arabic Content

The rise of the political bias problem across several news anchors presents a real threat to free and independent journalism and is a major factor in shifting the populace’s conception of the world. Several NGOs, research centres and private organizations are working …

How to Rank Articles Based on How Informative They Are

How to Rank Articles Based on How Informative They Are – Using Snorkel

Let’s start with a simple question: what constitutes an informative article? Based on Oxford’s dictionary: informative /ɪnˈfɔːmətɪv/ adjective: providing useful or interesting information. However, this is still an abstract concept. Yes, it is much simpler to flag an article as spammy …

Informativity Detection, Our Research Gist

Informativity Detection – Almeta’s Research Gist

Let’s start with a simple question: what constitutes an informative article? Based on Oxford’s dictionary: informative /ɪnˈfɔːmətɪv/ adjective: providing useful or interesting information. However, this is still an abstract concept. The question of measuring how informative a piece of news is …

What Makes an Article Informative and How Computers Can Measure Informativeness

What Makes an Article Informative – And How Computers Can Measure Informativity of a Text Content

The concept of an informative text is really abstract, and it is hard to come up with a definitive formula to measure it. In this article we will explore some of the features that we believe can make an article …