Simple Sentiment Analysis

This notebook shows a simple way to analyze passages such as tweets for sentiment. It works through a single tweet; the same steps could be applied to each passage in a collection.

This is based on Neal Caron's An introduction to text analysis with Python, Part 1.

Setting up our data

Here we define the tweet that will be tested against our positive and negative dictionaries.

In [6]:
theTweet = "No food is good food. Ha. I'm on a diet and the food is awful and lame."
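
The counting cells later in this notebook refer to positive_words and negative_words, which are not defined in the cells shown here. A minimal sketch, assuming two tiny hand-made lists; the contents are hypothetical and chosen only so that the example tweet has something to match, and a real analysis would load a much larger sentiment lexicon:

```python
# Hypothetical word lists for illustration only; these are an assumption,
# not the notebook's original dictionaries.
positive_words = ["awesome", "good", "nice", "super", "fun", "delightful"]
negative_words = ["awful", "lame", "horrible", "bad"]

print(len(positive_words), "positive words;", len(negative_words), "negative words")
```

With these lists the example tweet matches one positive word ("good") and two negative words ("awful" and "lame").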

Tokenizing the text

Now we will tokenize the text: we lowercase it and split it into word tokens, keeping hyphens that occur inside a word.

In [8]:
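import re
# Lowercase the tweet, then pull out each run of word characters (hyphens allowed inside a word)
theTokens = re.findall(r'\b\w[\w-]*\b', theTweet.lower())
print(theTokens[:10])  # show the first ten tokens
['no', 'food', 'is', 'good', 'food', 'ha', 'i', 'm', 'on', 'a']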
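
To see what the pattern does at word edges, here is a small sketch that wraps the same regular expression in a helper (the name tokenize is hypothetical) and runs it on a made-up sentence. Apostrophes are not word characters, so "I'm" splits into two tokens, while a hyphen inside a word is kept.

```python
import re

TOKEN_PATTERN = r'\b\w[\w-]*\b'  # same pattern as in the cell above

def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return re.findall(TOKEN_PATTERN, text.lower())

# Made-up sentence, used only to illustrate the edge cases
print(tokenize("I'm on a well-known diet."))
# ['i', 'm', 'on', 'a', 'well-known', 'diet']
```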

Calculating positive words

Now we will count the number of positive words.

In [14]:
numPosWords = 0
# Add one for every token that appears in the positive dictionary
for word in theTokens:
    if word in positive_words:
        numPosWords += 1

Calculating negative words

Now we will count the number of negative words.

In [10]:
numNegWords = 0
for word in theTokens:
    if word in negative_words:
        numNegWords += 1
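
The two counting loops follow the same pattern, so they could be folded into one helper. This is a sketch under assumptions: count_matches is a hypothetical name, and the two short word lists are stand-ins for the real dictionaries.

```python
import re

def count_matches(tokens, lexicon):
    """Count the tokens that appear in the lexicon; repeated tokens count each time."""
    lexicon_set = set(lexicon)  # set membership tests are fast
    return sum(1 for token in tokens if token in lexicon_set)

# Hypothetical stand-ins for the positive and negative dictionaries
sample_positive = ["good", "nice", "fun"]
sample_negative = ["awful", "lame", "bad"]

sample_text = "No food is good food. Ha. I'm on a diet and the food is awful and lame."
sample_tokens = re.findall(r'\b\w[\w-]*\b', sample_text.lower())

print(count_matches(sample_tokens, sample_positive))  # 1
print(count_matches(sample_tokens, sample_negative))  # 2
```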

Calculating percentages

Now we calculate the percentages of positive and negative words among all the tokens.

In [11]:
numWords = len(theTokens)
percntPos = numPosWords / numWords
percntNeg = numNegWords / numWords
print("Positive: " + "{:.0%}".format(percntPos) + "  Negative: " + "{:.0%}".format(percntNeg))
Positive: 6%  Negative: 11%
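
The figures above come from 18 tokens with 1 positive and 2 negative matches: 1/18 is about 0.056, which rounds to 6%, and 2/18 is about 0.111, which rounds to 11%. The sketch below reproduces that arithmetic, assuming Python 3 division; the helper name share is hypothetical, and it adds a guard for an empty text, which the cell above does not handle.

```python
def share(count, total):
    """Return count / total, or 0.0 when there are no tokens at all."""
    return count / total if total else 0.0

# Values from the example tweet: 18 tokens, 1 positive match, 2 negative matches
print("{:.0%}".format(share(1, 18)))  # 6%
print("{:.0%}".format(share(2, 18)))  # 11%
print("{:.0%}".format(share(0, 0)))   # 0% (empty text, no division by zero)
```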

Deciding if it is positive or negative

We are going to assume that a simple majority decides whether the tweet is positive or negative.

In [12]:
if numPosWords > numNegWords:
    print("Positive " + str(numPosWords) + ":" + str(numNegWords))
elif numNegWords > numPosWords:
    print("Negative " + str(numPosWords) + ":" + str(numNegWords))
else:  # the two counts are equal
    print("Neither " + str(numPosWords) + ":" + str(numNegWords))
Negative 1:2
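
The majority rule can also be written as a small function so that it is easy to try with different counts. A minimal sketch; sentiment_label is a hypothetical name rather than part of the original notebook.

```python
def sentiment_label(num_pos, num_neg):
    """Apply the simple-majority rule to a pair of word counts."""
    if num_pos > num_neg:
        return "Positive"
    if num_neg > num_pos:
        return "Negative"
    return "Neither"  # equal counts, including the case of no matches at all

print(sentiment_label(1, 2))  # Negative (the example tweet)
print(sentiment_label(3, 1))  # Positive
print(sentiment_label(0, 0))  # Neither
```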

Next Steps

Let's try another utility example, this time looking at more Complex Sentiment Analysis.

CC BY-SA From The Art of Literary Text Analysis by Stéfan Sinclair & Geoffrey Rockwell. Edited and revised by Melissa Mony.
Created August 8, 2014 (Jupyter 4.2.1)
