Aspect Based Sentiment Analysis with Machine Learning
BUILDING A MODEL TO EXTRACT INSIGHTS FROM AMAZON REVIEWS
The paper aims to extract opinions about different aspects of a product from its online reviews and to find areas for product improvement.
Why this Paper?
Giving a bad review for a product seldom means that it is bad in every aspect. If an Amazon delivery was delayed by a week, the bad review would not necessarily reflect product quality. It is important for customers and sellers to understand what exactly the negative review was about.
Consumers and sellers spend a large amount of time reading through long reviews to find out what is perceived as good and bad about a product. Amazon currently has a feature that lets users filter reviews by popular keywords, but this is still tedious and time-consuming: users have to read through numerous reviews to get the information they need about a product. Amazon sellers and new entrants also need to find gaps in the product features for a particular category. For example, if the sentiment around “the taste” of all the products in a particular category is negative, there is potential to develop and introduce a new product with a better taste. For marketers, aspect-based opinions can indicate which aspects to emphasize or downplay in an advertisement, based on how many people talk about them in reviews.
In the example below, we can see why the Bottle aspect is negative.
When we step back and think about the different steps involved in the process, the pipeline seems very complicated. The intuition behind our model is that the aspects extracted from a set of reviews of a product can be similar or related to one another, because users may discuss the same feature of a product in different words. Treating such related aspects as one also ensures that there is no redundancy in the results.
We broke the whole process down into submodules:
01. Getting the Data
In this step, we scrape review data from Amazon.
02. Identifying Aspects
The objective of this step was to extract the product aspects mentioned in the reviews, together with the opinion expressed about each aspect.
03. Visualizing the results
With the aim of developing an end product, we built an interactive dashboard so that sellers and users can gain insights from reviews.
This section provides a bird's-eye view of the whole model we used. To make it intuitive, we follow one sample review through the whole model.
Data is one of the most important aspects of any machine learning problem. The review text that we extracted from Amazon contained a lot of noise, so we wrote a cleaning script for the dataset. It removed unnecessary characters, hyperlinks, symbols, excess spaces, and other patterns of text that could not be processed by our system. From the cleaned dataset, we extracted the review text description for our analysis.
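The cleaning step described above could look roughly like the following. This is a minimal sketch, not the actual script used in the project: the function name `clean_review` and the exact set of characters that are kept are assumptions chosen for illustration.

```python
import html
import re

# Patterns are illustrative assumptions, not the project's actual rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")    # hyperlinks
TAG_RE = re.compile(r"<[^>]+>")                  # leftover HTML tags
SYMBOL_RE = re.compile(r"[^A-Za-z0-9\s.,!?'-]")  # emoji, stars, stray symbols
SPACE_RE = re.compile(r"\s+")                    # runs of whitespace


def clean_review(text: str) -> str:
    """Return the review text with links, tags, symbols and extra spaces removed."""
    text = html.unescape(text)        # turn entities such as &amp; into characters
    text = text.replace("\u2019", "'")  # normalize curly apostrophes
    text = URL_RE.sub(" ", text)
    text = TAG_RE.sub(" ", text)
    text = SYMBOL_RE.sub(" ", text)
    return SPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    print(clean_review("Great   taste!! <br/> See https://example.com/x ★★★"))
```

Keeping basic punctuation lets a later step split the review into sentences; a stricter filter may suit other pipelines.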
The objective of this step was to extract instances of product aspects and the modifiers that express an opinion about each aspect. We used Python’s spaCy, NLTK, and ABSA to extract the aspects and their respective sentiments. The output of this step was a list of such nouns with their respective sentiments and the reviews they came from.
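To make the shape of this step concrete, here is a deliberately tiny, dependency-free sketch. It is a stand-in for the spaCy/NLTK/ABSA pipeline, not a reproduction of it: the `ASPECTS`, `OPINIONS`, and `NEGATIONS` word lists and the function `extract_aspect_sentiments` are hypothetical, and the nearest-opinion-word rule is an assumption made only for illustration.

```python
import re

# Hypothetical hand-picked lexicons; a real pipeline would derive aspects from
# noun chunks and use a trained sentiment model instead.
ASPECTS = {"taste", "bottle", "price", "smell", "delivery"}
OPINIONS = {"good": 1, "great": 1, "love": 1, "nice": 1,
            "bad": -1, "awful": -1, "terrible": -1, "leaky": -1, "broken": -1}
NEGATIONS = {"not", "no", "never"}


def extract_aspect_sentiments(review):
    """Pair each known aspect with the nearest opinion word in its sentence."""
    results = []
    for sentence in re.split(r"[.!?]+", review.lower()):
        tokens = re.findall(r"[a-z']+", sentence)
        opinion_positions = [i for i, t in enumerate(tokens) if t in OPINIONS]
        if not opinion_positions:
            continue  # no opinion word, so nothing to attach to an aspect
        for i, token in enumerate(tokens):
            if token not in ASPECTS:
                continue
            # Choose the opinion word closest to the aspect mention.
            j = min(opinion_positions, key=lambda p: abs(p - i))
            score = OPINIONS[tokens[j]]
            # Flip the polarity when the opinion word is directly negated.
            if j > 0 and tokens[j - 1] in NEGATIONS:
                score = -score
            results.append((token, "positive" if score > 0 else "negative"))
    return results


if __name__ == "__main__":
    print(extract_aspect_sentiments("The taste is great. The bottle arrived broken!"))
```

The output is a list of `(aspect, sentiment)` pairs per review, which matches the kind of list described above, although the real system attaches the originating review as well.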
Quick Overview of Dashboard
In the dashboard above, enter the ASIN (Amazon Standard Identification Number) in the search box. After some time, you will see the scraped data with a download option for all the reviews, followed by a bar chart of positive and negative aspects and a Result option for checking the output manually.
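The bar chart needs per-aspect counts of positive and negative mentions. One way to compute them is sketched below, assuming the extraction step yields `(aspect, sentiment)` pairs. Both the input format and the function name `count_aspect_sentiments` are assumptions, not details taken from the dashboard itself.

```python
from collections import Counter


def count_aspect_sentiments(pairs):
    """Count positive and negative mentions per aspect.

    `pairs` is assumed to be an iterable of (aspect, sentiment) tuples,
    where sentiment is "positive" or "negative".
    """
    counts = {}
    for aspect, label in pairs:
        counts.setdefault(aspect, Counter())[label] += 1
    return counts


if __name__ == "__main__":
    sample = [("taste", "negative"), ("taste", "negative"),
              ("taste", "positive"), ("bottle", "negative")]
    for aspect, c in count_aspect_sentiments(sample).items():
        print(aspect, c["positive"], c["negative"])
```

Using a `Counter` per aspect means a missing label simply reads as zero, which is convenient when feeding the values to a plotting library.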
Conclusions & Way Forward
Extracting aspects turned out to be the major bottleneck of the whole process. Once the aspects are extracted, we can find the product improvement areas, and manufacturers can either work on the product or replace it with a different one.
In our case, the product is Apple Cider Vinegar. Some people don’t like its taste, so taste is an improvement area that the producer can work on directly.
- Creating an interactive UI and developing a web plugin for it, so that users do not have to visit a new environment to access our model.
- Using Amazon S3 buckets instead of the scraping framework.