This notebook shows how to analyze a collection of passages, such as Tweets, for sentiment.
It is based on Neal Caron's "An introduction to text analysis with Python, Part 1".
As a first step, this notebook analyzes a single Tweet.
Here we define the text to test and our lists of positive and negative words.
theTweet = "No food is good food. Ha. I'm on a diet and the food is awful and lame."
positive_words = ['awesome', 'good', 'nice', 'super', 'fun', 'delightful']
negative_words = ['awful', 'lame', 'horrible', 'bad']
type(positive_words)
list
Now we will tokenize the text.
import re
theTokens = re.findall(r'\b\w[\w-]*\b', theTweet.lower())
print(theTokens[:10])
['no', 'food', 'is', 'good', 'food', 'ha', 'i', 'm', 'on', 'a']
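To see why "I'm" shows up as two tokens above, it can help to try the same pattern on a sentence containing both an apostrophe and a hyphen. The sketch below reuses the regular expression from the cell above. The helper name `tokenize` and the sample sentence are illustrative assumptions, not part of the original notebook.

```python
import re

# Same pattern as in the cell above: one word character, followed by
# any run of word characters or hyphens, bounded by word boundaries.
TOKEN_PATTERN = r'\b\w[\w-]*\b'

def tokenize(text):
    """Lowercase the text and return its tokens as a list (hypothetical helper)."""
    return re.findall(TOKEN_PATTERN, text.lower())

sample_sentence = "I'm on a low-carb diet."  # made-up sample
print(tokenize(sample_sentence))
# ['i', 'm', 'on', 'a', 'low-carb', 'diet']
```

The hyphenated word stays together because the pattern allows hyphens inside a token, while the apostrophe is not allowed and therefore splits "I'm" into "i" and "m".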
Now we will count the number of positive words.
numPosWords = 0
for word in theTokens:
    if word in positive_words:
        numPosWords += 1
print(numPosWords)
1
Now we will count the number of negative words.
numNegWords = 0
for word in theTokens:
    if word in negative_words:
        numNegWords += 1
print(numNegWords)
2
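The two counting loops above follow the same pattern, so they could be folded into one small function. This is a minimal sketch of one way to do that, not the notebook's own method. The names `count_matches`, `pos_set`, `neg_set`, and `sample_tokens` are hypothetical, and sets are used in place of lists only because membership tests on sets are faster.

```python
# Word lists from above, stored as sets (an assumption made for this sketch).
pos_set = {'awesome', 'good', 'nice', 'super', 'fun', 'delightful'}
neg_set = {'awful', 'lame', 'horrible', 'bad'}

def count_matches(tokens, vocabulary):
    """Count the tokens that appear in vocabulary; repeated tokens count each time."""
    return sum(1 for token in tokens if token in vocabulary)

# The tokens produced from the Tweet above, written out so the sketch stands alone.
sample_tokens = ['no', 'food', 'is', 'good', 'food', 'ha', 'i', 'm', 'on', 'a',
                 'diet', 'and', 'the', 'food', 'is', 'awful', 'and', 'lame']

print(count_matches(sample_tokens, pos_set))  # 1
print(count_matches(sample_tokens, neg_set))  # 2
```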
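A quick aside on types: a number must be converted to a string with str() before it can be joined to other strings, as in the final print statements below.
v1 = "0"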
v2 = 0
v3 = str(v2)
v1 == v3
True
Now we calculate the percentage of positive and negative words.
numWords = len(theTokens)
percntPos = numPosWords / numWords
percntNeg = numNegWords / numWords
print("Positive: " + "{:.0%}".format(percntPos) + " Negative: " + "{:.0%}".format(percntNeg))
Positive: 6% Negative: 11%
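The Tweet has 18 tokens, so the figures above are 1/18 (about 5.6%, shown as 6%) and 2/18 (about 11.1%, shown as 11%). The division would fail on a text with no tokens, so a guarded version might look like the sketch below. The function name `percent_of` and the "n/a" fallback are assumptions made for illustration.

```python
def percent_of(count, total):
    """Format count/total as a whole-number percentage, or 'n/a' if total is 0."""
    if total == 0:
        # An empty text has no tokens, so there is no meaningful percentage.
        return "n/a"
    return f"{count / total:.0%}"

# Counts taken from the cells above: 1 positive and 2 negative words out of 18 tokens.
print("Positive: " + percent_of(1, 18) + " Negative: " + percent_of(2, 18))
# Positive: 6% Negative: 11%
```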
We will assume that a simple majority determines whether the Tweet is positive or negative.
if numPosWords > numNegWords:
    print("Positive " + str(numPosWords) + ":" + str(numNegWords))
elif numNegWords > numPosWords:
    print("Negative " + str(numPosWords) + ":" + str(numNegWords))
elif numNegWords == numPosWords:
    print("Neither " + str(numPosWords) + ":" + str(numNegWords))
print()
Negative 1:2
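To move from one Tweet toward a collection, the steps above (tokenize, count, compare) can be wrapped in a function and applied to each passage in a list. The following is a rough sketch under stated assumptions: the function name `classify`, the constant names, and the second and third sample Tweets are invented for illustration and are not part of the original notebook.

```python
import re

# Word lists from above, stored as sets for this sketch.
POSITIVE = {'awesome', 'good', 'nice', 'super', 'fun', 'delightful'}
NEGATIVE = {'awful', 'lame', 'horrible', 'bad'}

def classify(text):
    """Return (label, positive count, negative count) for one passage."""
    tokens = re.findall(r'\b\w[\w-]*\b', text.lower())
    num_pos = sum(1 for token in tokens if token in POSITIVE)
    num_neg = sum(1 for token in tokens if token in NEGATIVE)
    # Simple majority rule, as in the cell above.
    if num_pos > num_neg:
        label = "Positive"
    elif num_neg > num_pos:
        label = "Negative"
    else:
        label = "Neither"
    return label, num_pos, num_neg

sample_tweets = [
    "No food is good food. Ha. I'm on a diet and the food is awful and lame.",
    "What a fun, delightful day!",  # made-up example
    "Going to the store.",          # made-up example
]

for tweet in sample_tweets:
    label, num_pos, num_neg = classify(tweet)
    print(f"{label} {num_pos}:{num_neg}")
# Negative 1:2
# Positive 2:0
# Neither 0:0
```

Returning the counts along with the label keeps the function reusable: the caller decides how to print or store the result.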