Incorporating Semantic Information to FastText Word Vectors for Improved Sentiment Analysis

Document Type

Theses, Masters

Rights

This item is available under a Creative Commons License for non-commercial use only

Disciplines

Computer Sciences

Publication Details

A dissertation submitted in partial fulfillment of the requirements of Technological University Dublin for the degree of M.Sc. in Computing (Data Analytics)

Abstract

Developments in natural language processing (NLP) have led to words being represented as dense, low-dimensional vectors that capture semantic and syntactic relations. These vectors are learned from the distributional statistics of large corpora. Recently, researchers have released large pre-trained word vector models for use in further research. This gives others the opportunity to use these high-quality vectors, which have led to state-of-the-art results in a number of NLP tasks such as sentiment analysis, machine translation and natural language generation. However, there are drawbacks to using pre-trained vectors. One problem is the issue of out-of-vocabulary words: the model's vocabulary contains no vectors for words that were not seen during its training.
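
The out-of-vocabulary problem described above can be illustrated with a minimal sketch. The vocabulary, the three-dimensional vectors and the helper names (lookup, char_ngrams) are made up for illustration and are not taken from the thesis. The character n-gram helper shows the kind of subword unit that fastText-style models are generally understood to build word vectors from; it is an assumption about how an unseen word might still be broken into known pieces, not a description of the author's method.

```python
# Toy illustration only: the words and 3-dimensional vectors are invented.
pretrained = {
    "good": [0.9, 0.1, 0.0],
    "bad": [-0.8, 0.2, 0.1],
    "movie": [0.0, 0.7, 0.3],
}


def lookup(tokens, vectors):
    """Split tokens into those with a stored vector and those without one."""
    found = {t: vectors[t] for t in tokens if t in vectors}
    oov = [t for t in tokens if t not in vectors]
    return found, oov


def char_ngrams(word, n=3):
    """Return character n-grams of the word wrapped in boundary markers."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


tokens = ["good", "movie", "goood"]
found, oov = lookup(tokens, pretrained)

# "goood" was never seen, so a plain lookup table has nothing to return for it.
print("out-of-vocabulary:", oov)

# A subword view still shares pieces with the known word "good".
shared = set(char_ngrams("goood")) & set(char_ngrams("good"))
print("shared n-grams:", sorted(shared))
```

With a plain lookup table, the misspelled token simply has no representation, which is the drawback the abstract identifies. The n-gram overlap hints at why subword information can help in such cases.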

This document is currently not available here.

