Detecting Sentiment and Entities in Amazon Reviews

In this sub-section of the lab, we will create a new table, amazon_reviews_with_text_analysis, with two new columns: sentiment and entities. It is built from the previously created amazon_reviews_with_language table.

The create script for the new table is available in the “Saved queries” section of your workgroup. Similar to previous labs, you can execute the query manually or from the saved queries.

Please follow the appropriate link below.

Instructions to execute saved query

Instructions to execute manually

The above step creates the new table, amazon_reviews_with_text_analysis.
Let us run the below query to preview the table.
 
SELECT * from default.amazon_reviews_with_text_analysis limit 10;
Scroll to the right to see the newly created columns. They contain JSON strings with nested structures and fields.

You will see that both columns contain nested JSON and are not yet ready for analysis. Let us use the JSON functions in Athena to prepare these columns for analysis.

Prepare sentiment for analysis

Let us start with the sentiment column and apply Athena's JSON functions to convert it into a more consumable form.
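To illustrate the kind of JSON extraction involved, here is a minimal sketch that runs on an inline sample row instead of the lab table. The sample JSON shape (a sentiment label plus a nested sentimentScore object), the review_id column, and the output column names are assumptions for illustration; the saved query in your workgroup is the authoritative version and may differ.

```sql
-- Hypothetical sample row: the JSON shape below is an assumption,
-- not copied from the lab's amazon_reviews_with_text_analysis table.
SELECT
    review_id,
    -- Pull the top-level label out of the JSON string as a VARCHAR
    json_extract_scalar(sentiment, '$.sentiment') AS sentiment_label,
    -- Pull nested scores and cast them from VARCHAR to DOUBLE
    CAST(json_extract_scalar(sentiment, '$.sentimentScore.positive') AS DOUBLE) AS positive_score,
    CAST(json_extract_scalar(sentiment, '$.sentimentScore.negative') AS DOUBLE) AS negative_score
FROM (
    VALUES
        ('R1', '{"sentiment":"POSITIVE","sentimentScore":{"positive":0.98,"negative":0.01,"neutral":0.01,"mixed":0.0}}')
) AS demo_rows(review_id, sentiment);
```

json_extract_scalar returns a VARCHAR, which is why the scores are cast explicitly before they can be used in numeric comparisons or aggregations.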

Instructions to execute saved query

Instructions to execute manually

Once the query is executed, a new table, sentiment_results_final, is created. Let us preview it using the query below:
 
SELECT * from default.sentiment_results_final limit 10;
Does the sentiment generally align with the text of the review_body field? How does it correlate with the star_rating? If you spot any dubious sentiment assignments, check the confidence scores to see if the sentiment was assigned with a low confidence.
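One way to explore the questions above is to count reviews per star rating and sentiment. The following is a sketch that assumes sentiment_results_final exposes columns named star_rating and sentiment; adjust the names if your table differs.

```sql
-- Assumes columns star_rating and sentiment exist in sentiment_results_final.
SELECT
    star_rating,
    sentiment,
    count(*) AS review_count
FROM default.sentiment_results_final
GROUP BY star_rating, sentiment
ORDER BY star_rating, review_count DESC;
```

If sentiment tracks the rating, you would expect positive labels to dominate the higher star ratings and negative labels the lower ones; rows that break this pattern are good candidates for checking the confidence scores.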

Prepare entities for analysis

Now that we have prepared the sentiment column for analysis, let us work on the entities column to unnest it and make it available for consumption.
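As a rough sketch of what unnesting looks like in Athena, the query below parses a JSON array string and expands it so that each entity becomes its own row. It uses an inline sample row; the JSON field names (text, type, score), the review_id column, and the output column names are assumptions for illustration, and the saved query in your workgroup remains the authoritative version.

```sql
-- Hypothetical sample row: the entity field names are assumptions,
-- not copied from the lab's amazon_reviews_with_text_analysis table.
SELECT
    review_id,
    json_extract_scalar(entity_json, '$.text') AS entity,
    json_extract_scalar(entity_json, '$.type') AS category,
    CAST(json_extract_scalar(entity_json, '$.score') AS DOUBLE) AS score
FROM (
    VALUES
        ('R1', '[{"text":"Kindle","type":"COMMERCIAL_ITEM","score":0.97},{"text":"Amazon","type":"ORGANIZATION","score":0.99}]')
) AS demo_rows(review_id, entities)
-- Parse the string into JSON, cast it to an array, and emit one row per element
CROSS JOIN UNNEST(CAST(json_parse(entities) AS ARRAY(JSON))) AS t(entity_json);
```

Because the join is a CROSS JOIN UNNEST, a review whose entities array is empty produces no output rows in this sketch.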

Instructions to execute saved query

Instructions to execute manually

Now that we have executed the query, let us preview the contents of the new table, entities_results_final:
 
SELECT * FROM default.entities_results_final limit 10;
With that, we have come to the end of the Text Analysis labs. For more information on different use cases of text analysis in Athena, check this blog post.