Here at Almeta we are striving to be a source of all valuable content on the Arabic side of the web, and while our current effort focuses on indexing and analysing news articles and blog posts, we believe that social outlets remain a vital part of the whole news experience of any internet user. In this post we present our plan to fetch, analyze and index tweets in a simple and efficient manner. Twitter API Twitter has 2 major free APIs (asynchronous and streaming) with the following main differences between them: Functionality: async API allows you to search, filter and inspect … Continue reading Fetching And Indexing Tweets in AWS ElasticSearch
Personal differences and preferences mark a very important part of our identity, and optimizing the user experience based on them can be a great tool for improving user engagement. In our previous post we tackled the issue of personalized recommendations and how ElasticSearch can make the process much simpler. However, in order to build a robust personal recommendation system it is paramount to have an idea of each user: who they are and what they like. This is commonly referred to as a user profile. In this post we will present a road map to enabling user profiling with … Continue reading User Profiling Using AWS ElasticSearch – RomCom use case.
The ability to filter your news feed based on genre is a critical component of any news aggregator. Users usually want to read only sports or political news, not just the most recent or hottest news. In this post, we will explore in great detail our initial genre classification system. Let’s start with the... data. In the following experiments, we used an in-house data set. The data set is composed of 190307 HTML documents crawled from the following domains [Aljazira, Alarabia, Aljadeed, RT Arabic, BBC Arabic]. For each of the documents we tried to extract the following features: … Continue reading Initial Genre Classification Experiments
In one of our previous articles we suggested a method to build an initial system for informativeness detection. This system should use a small set of manually annotated pairwise comparisons, use Snorkel to expand these annotations automatically into a larger training set, and then train the model to estimate article informativeness using this set. In this article, we will go into the details of the implementation of this plan. Data Annotation As noted above, Snorkel will need 3 types of training data: A small manually annotated test set to evaluate the results of the model A smaller manually annotated … Continue reading Intial Experiments on Measuring Informativity of an Arabic Content – Data Collection
Are you looking for an NLP Hero to implement your million-dollar idea? Here at Almeta we value our employees and our customers alike and we always try to attract top talent. In this article we will give a template … Continue reading Almeta’s Interviews – A Guide into NLP Research Engineering Position
The increasing amount of text data in the digital age calls for methods to reduce reading time while maintaining information content. The process of summarization achieves this by deleting, generalizing or paraphrasing fragments of the input text to create a … Continue reading Abstractive Summarization in Underresourced Languages
News stories are created every day at many news agencies. Users may receive news streams from multiple sources. Browsing in large-scale information spaces without guidance is not effective. Suppose, for example, a person who has returned from a long vacation … Continue reading Event Detection – Almeta’s Research Gist
In the words of The Daily Show host Trevor Noah, there is currently “so much news, so little time”. In fact, the issue of information explosion extends beyond the realm of news and covers all aspects of our lives. … Continue reading Five Automatic Ways To Build Data For Abstractive Summarization
While reading the news each one of us perceives it in a different manner. We have our own biases and we tend to search for information that confirms our previous beliefs. Thus different people might have drastically different viewpoints of … Continue reading Contrary view detection based on VODUM
There have been some recent advancements in Dialectical Arabic processing across various NLP tasks. The goal of this article won’t be to explore any particular task but to explore as many tasks as possible and give an … Continue reading Major Tasks in Dialectical Arabic Processing
Most of us are not good writers, and if you are like me you may have sometimes struggled to communicate your ideas in writing. The new field of automated paraphrasing can provide a solution to this issue. In … Continue reading Automatic Sentence Paraphrasing
While reading the news, you are likely to encounter several articles that describe the same event or incident, each coming from a different news anchor and providing a different viewpoint on the event. However, most … Continue reading Multi-document summarization. The What, Why and How
The rise of social media has undoubtedly changed the way marketing works. Not only does it allow companies to easily access massive numbers of potential customers, but it also allows them to segment the market, figure out what is … Continue reading Can AI Guess How Many Likes Your Facebook Post Will Get Before You Post It?
Many data scientists employ Artificial Intelligence to solve various tasks in the realm of computer vision, the field that focuses on giving machines the ability to “see” and understand images the way humans do. Many of these algorithms … Continue reading Instagram Graph API details
Social media constitutes a major part of our day-to-day life. We share our ideas, dreams and, most importantly, views through these media, and these views can be extremely helpful for brands that wish to better understand their customer base and thus … Continue reading Twitter API Drill Down Analysis
The rise in social media coverage and usage has transformed various aspects of our lives. In the marketing sector, social media marketing is an integral part of modern marketing campaigns. However, just like any marketing effort, it is critical to … Continue reading How To Measure Social Engagement
Social media today presents a massive source of information for marketers and decision-makers to both better understand user trends and influence these users’ decisions. With the field of AI and NLP conquering various aspects of our day-to-day life, … Continue reading Smart Services For Social Media Marketing
What is the common denominator between a news aggregator, an electronic shopping site and a music streaming service? All of these applications collect data, and for all of them the users would want to have access to a search service … Continue reading AWS Cloud Search Service
AWS provides 2 different managed search services to add search capability to your application. However, if you are an AWS enthusiast you will have to choose between the older AWS CloudSearch service and the newer AWS ElasticSearch. In a previous … Continue reading AWS CloudSearch VS AWS ElasticSearch
After you read a news article on your favourite news aggregator or your news site of choice, most current news aggregators allow you to read other articles that are related to the one you have already read. These suggestions … Continue reading Contrary View Detection Based On Document Similarity
Many web-based applications follow a simple pattern: collect data, process it to get some value and then allow users to access the analysis results. In most of these applications users would want to have access to a search … Continue reading AWS ElasticSearch – Implementation Plan
Many sites on the internet allow their users to specify tags for their content. The most famous example of such sites is Tumblr where each post on this social network can hold a manually selected set of tags. These tags … Continue reading Auto-Tagging Content with NLP
Yes, understandably you might be thinking: is this related to Rick and Morty? Well, unfortunately no. But you should really continue reading, because multidimensional topic modeling is really cool. In this short piece we will explore the fundamental idea behind … Continue reading Multidimensional Topic Modelling. The What? and The How?
In one of our previous articles, we discussed the idea of multi-dimensional topic modelling, and no it is not related to Star Wars, so if you thought it is, go here and give it a good read. Back from Alderaan. … Continue reading Viewpoint, Topic and Opinion Discovery in an Opinionated Document
One of the services we provide at Almeta is estimating the political bias of a piece of news. In other pieces we have discussed the technical details of this feature, but in this piece we will go through the … Continue reading How to Visualize a Political Bias Data Metric
The rise of the political bias problem across several news anchors presents a real threat to free and independent journalism and is a major factor in shifting the populace’s conception of the world. Several NGOs, research centres and private organizations are working … Continue reading From Sentiment to Political Bias in the Arab World and the Arabic Content
First, a motivational example: Many products on the internet allow the user to leave some feedback. This feedback is usually reviewed manually to figure out what users like or dislike about the product, what are the features they … Continue reading Aspect-level Vs Entity-level Sentiment Analysis
This article is a part of our series on political bias detection. We will hopefully introduce you to the various aspects of our political bias detection system, and you can learn about: How can we predict the political orientation behind … Continue reading Stance Detection – State of the Art
If you don’t know what stance detection is, make sure to check our article on it. Are we on the same page? Cool, let’s go. First, a motivational example: Many products on the internet allow the user to leave some … Continue reading Subjective Stance Detection What is it? and How to do it?
While some news anchors try to stay professional and objective in all of their articles, most of the news we consume is published to push a specific agenda, especially when it comes to politics. In our effort to battle news … Continue reading Political Orientation Detection – AI and NLP Approach
This is the first article from our series on political bias detection. We will hopefully introduce you to the various aspects of our political bias detection system, and you can learn about: How can we predict the political orientation behind … Continue reading What Constitutes a “Bad” News Article?
In this task the goal is to assign a given piece of text a tag (or number) representing the level of informativeness or detail this text holds, usually by training a model to do that. Here, we rely on the … Continue reading Automatically Tagging Data for Content Informativity Scoring
Let’s start with a simple question: what constitutes an informative article? Based on Oxford’s dictionary: informative /ɪnˈfɔːmətɪv/ (adjective): providing useful or interesting information. However, this is still an abstract concept. Yes, it is much simpler to flag an article as spammy … Continue reading How to Rank Articles Based on How Informative They Are – Using Snorkel
In a previous article (see next paragraph) we explored how to approximate an article’s informativeness in a supervised fashion. Such a method requires training data. In this article we will explore one way to get this data, one very … Continue reading Can you measure a text Informativeness using its summary?
Let’s start with a question: given 2 articles A and B that talk about the exact same thing, what makes one of them more informative than the other? Is it the ease of reading? The amount of detail? Or is it … Continue reading Supervised Article Informativeness Prediction – The What and the How
In our effort at Almeta to provide Arabic readers with the articles of the highest informative value, we have employed several methods to measure the informativeness of a piece of news. In this article we will shed light on … Continue reading Term Informativeness Estimation in the Arabic Language
In a previous article, we talked about the various factors that make an article more informative; using cliches was not one of them. This article is a part of our research on measuring text informativeness. If you are interested, jump … Continue reading How to Detect Cliches in Text
Let’s start with a simple question: what constitutes an informative article? Based on Oxford’s dictionary: informative /ɪnˈfɔːmətɪv/ (adjective): providing useful or interesting information. However, this is still an abstract concept. The question of measuring how informative a piece of news is … Continue reading Informativity Detection – Almeta’s Research Gist
The concept of an informative text is really abstract, and it is hard to come up with a definitive formula to measure it. In this article we will explore some of the features that we believe can make an article … Continue reading What Makes an Article Informative – And How Computers Can Measure Informativity of a Text Content
In our effort to provide the best news feed out there, one of the goals we are trying to achieve here at Almeta is to capture the interaction between different news outlets and how the coverage of the same event … Continue reading Aspect Detection and Named Entity Linking (NEL): Using SPARQL and DBpedia
What is git-submodules? The basic principle that makes many professional tech companies professional is the simple principle of domain engineering: basically, working for a long period of time on a small set of domains with the hope that you will … Continue reading Git Submodules in the Python World
Before we talk about dependency management: Python is awesome, it simply is. But in my opinion what truly makes Python a great language is not the syntax or the dynamic nature or any other of these features, but rather … Continue reading Packaging and Dependency Management in Python