***
Abstract:
We describe a state-of-the-art sentiment analysis system that detects (a) the sentiment of short informal textual messages such as tweets and SMS (message-level task) and (b) the sentiment of a word or a phrase within a message (term-level task).
The system is based on a supervised statistical text classification approach leveraging a variety of surface-form, semantic, and sentiment features.
The sentiment features are primarily derived from novel high-coverage tweet-specific sentiment lexicons.
These lexicons are automatically generated from tweets with sentiment-word hashtags and from tweets with emoticons.
To adequately capture the sentiment of words in negated contexts, a separate sentiment lexicon is generated for negated words.
The system ranked first in the SemEval-2013 shared task 'Sentiment Analysis in Twitter' (Task 2), obtaining an F-score of 69.02 in the message-level task and 88.93 in the term-level task.
Post-competition improvements boost the performance to an F-score of 70.45 (message-level task) and 89.50 (term-level task).
The system also obtains state-of-the-art performance on two additional datasets: the SemEval-2013 SMS test set and a corpus of movie review excerpts.
The ablation experiments demonstrate that the use of the automatically generated lexicons results in performance gains of up to 6.5 absolute percentage points.
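To make the lexicon-generation idea concrete, here is a minimal sketch of how word scores could be derived from hashtag-labelled tweets, with words in negated contexts scored in a separate lexicon. It assumes a PMI-style association score (the abstract does not specify the scoring function), whitespace tokenization, and a simple "negator until next punctuation" rule. The seed hashtags, negator list, and all function names are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math
from collections import Counter

# Hypothetical seed hashtags and negators; the abstract does not list the real ones.
POSITIVE_TAGS = {"#good", "#happy"}
NEGATIVE_TAGS = {"#bad", "#angry"}
NEGATORS = {"not", "no", "never"}
PUNCTUATION = {".", ",", "!", "?", ";", ":"}


def label_from_hashtags(tokens):
    """Return 'pos' or 'neg' when the tweet has seed hashtags of one polarity only."""
    has_pos = any(t in POSITIVE_TAGS for t in tokens)
    has_neg = any(t in NEGATIVE_TAGS for t in tokens)
    if has_pos == has_neg:  # no seed hashtag, or conflicting ones
        return None
    return "pos" if has_pos else "neg"


def split_by_context(tokens):
    """Yield (word, is_negated); negation runs from a negator to the next punctuation token."""
    negated = False
    for t in tokens:
        if t in PUNCTUATION:
            negated = False
        elif t in NEGATORS:
            negated = True
        elif not t.startswith("#"):  # hashtags are labels, not lexicon entries
            yield t, negated


def build_lexicons(tweets, smoothing=1.0):
    """Return (affirmative_lexicon, negated_lexicon) mapping word -> score.

    Assumed score: PMI(w, pos) - PMI(w, neg), which simplifies to
    log2((count(w, pos) * N_neg) / (count(w, neg) * N_pos)), with smoothed counts.
    """
    counts = {"pos": Counter(), "neg": Counter()}
    for text in tweets:
        tokens = text.lower().split()
        label = label_from_hashtags(tokens)
        if label is not None:
            counts[label].update(split_by_context(tokens))

    n_pos = sum(counts["pos"].values())
    n_neg = sum(counts["neg"].values())
    affirmative, negated = {}, {}
    if n_pos == 0 or n_neg == 0:
        return affirmative, negated  # scores are undefined without both classes

    for key in set(counts["pos"]) | set(counts["neg"]):
        word, is_negated = key
        pos = counts["pos"][key] + smoothing
        neg = counts["neg"][key] + smoothing
        score = math.log2((pos * n_neg) / (neg * n_pos))
        (negated if is_negated else affirmative)[word] = score
    return affirmative, negated
```

A positive score means the word co-occurs more with positive seed hashtags, a negative score the opposite. Keeping negated occurrences in their own lexicon lets a word such as "good" carry a different score after "not" than it does in an affirmative context.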
***
https://dl.acm.org/doi/10.5555/2693068.2693087
https://www.jair.org/index.php/jair/article/view/10896/25984