Here, we will build a spam filter that classifies messages as spam or non-spam. The filter works in the following steps:
Learns how humans classify messages.
Uses that human knowledge to estimate probabilities for new messages: the probability of spam and the probability of non-spam.
Classifies a new message based on these probability values: if the probability of spam is greater, it classifies the message as spam; if the probability of non-spam is greater, it classifies the message as non-spam. If the two probabilities are equal, we may need a human to classify the message.
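The decision rule in the last step can be sketched in a few lines. This is only an illustration: the function name `classify_from_probabilities` and the returned strings are hypothetical, and the two probabilities are assumed to have been estimated already.

```python
def classify_from_probabilities(p_spam, p_ham):
    """Pick a label by comparing the two estimated probabilities."""
    if p_spam > p_ham:
        return 'spam'
    if p_ham > p_spam:
        return 'ham'
    # equal probabilities: the filter cannot decide on its own
    return 'needs human classification'


print(classify_from_probabilities(0.7, 0.3))
print(classify_from_probabilities(0.2, 0.8))
print(classify_from_probabilities(0.5, 0.5))
```

Keeping the tie as a third outcome makes the "ask a human" case explicit instead of silently defaulting to one label.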
# importing the dependencies
import pandas as pd
import numpy as np
spam_clctn = pd.read_csv('SMSSpamCollection', sep = '\t', header = None ,names=['Label', 'SMS'] )
spam_clctn.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5572 entries, 0 to 5571
Data columns (total 2 columns):
Label    5572 non-null object
SMS      5572 non-null object
dtypes: object(2)
memory usage: 87.1+ KB
spam_clctn.head(4)
| | Label | SMS |
---|---|---|
0 | ham | Go until jurong point, crazy.. Available only ... |
1 | ham | Ok lar... Joking wif u oni... |
2 | spam | Free entry in 2 a wkly comp to win FA Cup fina... |
3 | ham | U dun say so early hor... U c already then say... |
spam_clctn.tail(4)
| | Label | SMS |
---|---|---|
5568 | ham | Will ü b going to esplanade fr home? |
5569 | ham | Pity, * was in mood for that. So...any other s... |
5570 | ham | The guy did some bitching but I acted like i'd... |
5571 | ham | Rofl. Its true to its name |
# calculating the percentage of spam vs. non-spam messages
spam_clctn['Label'].value_counts(normalize = True) * 100
ham     86.593683
spam    13.406317
Name: Label, dtype: float64
Here, ham means non-spam messages and spam means spam messages. So the dataset contains about 86.59% non-spam messages and 13.41% spam messages.
We're going to keep 80% of our dataset for training, and 20% for testing (we want to train the algorithm on as much data as possible, but we also want to have enough test data). The dataset has 5,572 messages, which means that:
The training set will have 4,458 messages (about 80% of the dataset).
The test set will have 1,114 messages (about 20% of the dataset).
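The two sizes above follow directly from the 80/20 split. A minimal sketch of the arithmetic, where the variable names are illustrative and not part of the notebook:

```python
total_messages = 5572   # number of rows reported by spam_clctn.info()
train_fraction = 0.8

# 5572 * 0.8 = 4457.6, which rounds to 4458
train_size = round(total_messages * train_fraction)
test_size = total_messages - train_size

print(train_size, test_size)  # 4458 1114
```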
# randomising the entire dataset (random_state makes the shuffle reproducible)
spam_clctn = spam_clctn.sample(frac = 1 , random_state= 1)
spam_clctn
| | Label | SMS |
---|---|---|
1078 | ham | Yep, by the pretty sculpture |
4028 | ham | Yes, princess. Are you going to make me moan? |
958 | ham | Welp apparently he retired |
4642 | ham | Havent. |
4674 | ham | I forgot 2 ask ü all smth.. There's a card on ... |
5461 | ham | Ok i thk i got it. Then u wan me 2 come now or... |
4210 | ham | I want kfc its Tuesday. Only buy 2 meals ONLY ... |
4216 | ham | No dear i was sleeping :-P |
1603 | ham | Ok pa. Nothing problem:-) |
1504 | ham | Ill be there on <#> ok. |
1783 | ham | My uncles in Atlanta. Wish you guys a great se... |
3465 | ham | My phone |
5534 | ham | Ok which your another number |
4267 | ham | The greatest test of courage on earth is to be... |
2498 | ham | Dai what this da.. Can i send my resume to thi... |
4259 | ham | I am late. I will be there at |
147 | spam | FreeMsg Why haven't you replied to my text? I'... |
141 | ham | K, text me when you're on the way |
4517 | spam | Congrats! 2 mobile 3G Videophones R yours. cal... |
3053 | ham | Please leave this topic..sorry for telling that.. |
5392 | ham | Ooooooh I forgot to tell u I can get on yovill... |
2346 | ham | Hi this is yijue, can i meet u at 11 tmr? |
1242 | ham | I want to show you the world, princess :) how ... |
3224 | ham | Well that must be a pain to catch |
4872 | ham | Well. You know what i mean. Texting |
3044 | ham | Your bill at 3 is £33.65 so thats not bad! |
1660 | ham | Yeah, where's your class at? |
3214 | ham | What's ur pin? |
501 | ham | Fighting with the world is easy, u either win ... |
1827 | ham | Dude. What's up. How Teresa. Hope you have bee... |
... | ... | ... |
1031 | ham | Can not use foreign stamps in this country. Go... |
1110 | ham | S s..first time..dhoni rocks... |
1888 | spam | Urgent! Please call 09061743811 from landline.... |
3550 | ham | I got like $ <#> , I can get some more l... |
1527 | ham | Wow ... I love you sooo much, you know ? I can... |
753 | ham | Dont gimme that lip caveboy |
3049 | ham | Die... Now i have e toot fringe again... |
2628 | ham | I know I'm lacking on most of this particular ... |
562 | ham | Thanx 4 e brownie it's v nice... |
4764 | ham | Prepare to be pleasured :) |
3562 | spam | Text BANNEDUK to 89555 to see! cost 150p texto... |
252 | ham | Wen ur lovable bcums angry wid u, dnt take it ... |
2516 | ham | Bognor it is! Should be splendid at this time ... |
2962 | ham | I'm doing da intro covers energy trends n pros... |
4453 | ham | I've told you everything will stop. Just dont ... |
5374 | ham | Do u konw waht is rael FRIENDSHIP Im gving yuo... |
5396 | ham | As in i want custom officer discount oh. |
1202 | ham | I know she called me |
3462 | ham | K.. I yan jiu liao... Sat we can go 4 bugis vi... |
2797 | ham | Tell your friends what you plan to do on Valen... |
4225 | ham | Double eviction this week - Spiral and Michael... |
144 | ham | I know you are. Can you pls open the back? |
5056 | ham | Am on a train back from northampton so i'm afr... |
2895 | ham | K...k...yesterday i was in cbe . |
2763 | ham | ARR birthday today:) i wish him to get more os... |
905 | ham | We're all getting worried over here, derek and... |
5192 | ham | Oh oh... Den muz change plan liao... Go back h... |
3980 | ham | CERI U REBEL! SWEET DREAMZ ME LITTLE BUDDY!! C... |
235 | spam | Text & meet someone sexy today. U can find a d... |
5157 | ham | K k:) sms chat with me. |
5572 rows × 2 columns
# taking the first 4,458 shuffled rows as the training set
training_set = spam_clctn.iloc[:4458, :]
training_set
| | Label | SMS |
---|---|---|
1078 | ham | Yep, by the pretty sculpture |
4028 | ham | Yes, princess. Are you going to make me moan? |
958 | ham | Welp apparently he retired |
4642 | ham | Havent. |
4674 | ham | I forgot 2 ask ü all smth.. There's a card on ... |
5461 | ham | Ok i thk i got it. Then u wan me 2 come now or... |
4210 | ham | I want kfc its Tuesday. Only buy 2 meals ONLY ... |
4216 | ham | No dear i was sleeping :-P |
1603 | ham | Ok pa. Nothing problem:-) |
1504 | ham | Ill be there on <#> ok. |
1783 | ham | My uncles in Atlanta. Wish you guys a great se... |
3465 | ham | My phone |
5534 | ham | Ok which your another number |
4267 | ham | The greatest test of courage on earth is to be... |
2498 | ham | Dai what this da.. Can i send my resume to thi... |
4259 | ham | I am late. I will be there at |
147 | spam | FreeMsg Why haven't you replied to my text? I'... |
141 | ham | K, text me when you're on the way |
4517 | spam | Congrats! 2 mobile 3G Videophones R yours. cal... |
3053 | ham | Please leave this topic..sorry for telling that.. |
5392 | ham | Ooooooh I forgot to tell u I can get on yovill... |
2346 | ham | Hi this is yijue, can i meet u at 11 tmr? |
1242 | ham | I want to show you the world, princess :) how ... |
3224 | ham | Well that must be a pain to catch |
4872 | ham | Well. You know what i mean. Texting |
3044 | ham | Your bill at 3 is £33.65 so thats not bad! |
1660 | ham | Yeah, where's your class at? |
3214 | ham | What's ur pin? |
501 | ham | Fighting with the world is easy, u either win ... |
1827 | ham | Dude. What's up. How Teresa. Hope you have bee... |
... | ... | ... |
3117 | ham | Uncle Abbey! Happy New Year. Abiola |
2020 | ham | From tomorrow onwards eve 6 to 3 work. |
4827 | ham | Haha, just what I was thinkin |
2974 | ham | Happy New Year Princess! |
1275 | ham | Let me know how to contact you. I've you settl... |
2413 | spam | I don't know u and u don't know me. Send CHAT ... |
528 | ham | Yes! How is a pretty lady like you single? |
4404 | ham | Just getting back home |
1505 | ham | Oh my God. I'm almost home |
3598 | spam | Congratulations YOU'VE Won. You're a Winner in... |
3681 | ham | I cant pick the phone right now. Pls send a me... |
1318 | spam | Win the newest “Harry Potter and the Order of ... |
3380 | ham | Dear umma she called me now :-) |
3034 | ham | Aight, lemme know what's up |
2380 | ham | Good evening Sir, hope you are having a nice d... |
4311 | spam | Someone U know has asked our dating service 2 ... |
2616 | ham | 2marrow only. Wed at <#> to 2 aha. |
1166 | ham | Haha yeah I see that now, be there in a sec |
3471 | ham | aathi..where are you dear.. |
4695 | ham | Pls give her the food preferably pap very slow... |
5293 | ham | I donno its in your genes or something |
2871 | spam | YOUR CHANCE TO BE ON A REALITY FANTASY SHOW ca... |
1222 | ham | Prakesh is there know. |
4749 | ham | The beauty of life is in next second.. which h... |
4255 | ham | How about clothes, jewelry, and trips? |
1982 | ham | Sorry, I'll call later in meeting any thing re... |
5180 | ham | Babe! I fucking love you too !! You know? Fuck... |
4020 | spam | U've been selected to stay in 1 of 250 top Bri... |
371 | ham | Hello my boytoy ... Geeee I miss you already a... |
3482 | ham | Wherre's my boytoy ? :-( |
4458 rows × 2 columns
# renumber the rows from 0 and discard the old shuffled index
training_set = training_set.reset_index(drop=True)
training_set.head(5)
| | Label | SMS |
---|---|---|
0 | ham | Yep, by the pretty sculpture |
1 | ham | Yes, princess. Are you going to make me moan? |
2 | ham | Welp apparently he retired |
3 | ham | Havent. |
4 | ham | I forgot 2 ask ü all smth.. There's a card on ... |
training_set.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4458 entries, 0 to 4457
Data columns (total 2 columns):
Label    4458 non-null object
SMS      4458 non-null object
dtypes: object(2)
memory usage: 69.7+ KB
# taking the remaining 1,114 shuffled rows as the test set
testing_set = spam_clctn.iloc[4458:, :]
# renumber the rows from 0 and discard the old shuffled index
testing_set = testing_set.reset_index(drop=True)
testing_set.head(5)
| | Label | SMS |
---|---|---|
0 | ham | Later i guess. I needa do mcat study too. |
1 | ham | But i haf enuff space got like 4 mb... |
2 | spam | Had your mobile 10 mths? Update to latest Oran... |
3 | ham | All sounds good. Fingers . Makes it difficult ... |
4 | ham | All done, all handed in. Don't know if mega sh... |
testing_set.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1114 entries, 0 to 1113
Data columns (total 2 columns):
Label    1114 non-null object
SMS      1114 non-null object
dtypes: object(2)
memory usage: 17.5+ KB
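Because the data was shuffled before slicing, both sets should contain a similar share of spam. The plain-Python sketch below illustrates the idea on made-up labels (87 ham, 13 spam, chosen to resemble the dataset's mix); the names `labels`, `cut`, and `spam_share` are hypothetical. In the notebook itself, the same check could be done by calling `value_counts(normalize=True)` on the `Label` column of each set.

```python
import random

# toy labels, not the real dataset
labels = ['ham'] * 87 + ['spam'] * 13

rng = random.Random(1)   # fixed seed, playing the role of random_state=1
rng.shuffle(labels)

cut = round(len(labels) * 0.8)            # 80% for training
train_labels = labels[:cut]
test_labels = labels[cut:]


def spam_share(part):
    """Percentage of 'spam' labels in a list of labels."""
    return 100 * part.count('spam') / len(part)


print(len(train_labels), len(test_labels))
print(spam_share(train_labels), spam_share(test_labels))
```

If the two percentages differed a lot, that would suggest the split is not representative and the shuffle step should be checked.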
import re
def rmv_punc(x):
    # replace every non-word character (punctuation, whitespace) with a space
    x = re.sub(r'\W', ' ', x)
    return x
training_set['SMS'] = training_set['SMS'].apply(rmv_punc)
training_set['SMS'] = training_set['SMS'].str.lower()
training_set.head(5)
| | Label | SMS |
---|---|---|
0 | ham | yep by the pretty sculpture |
1 | ham | yes princess are you going to make me moan |
2 | ham | welp apparently he retired |
3 | ham | havent |
4 | ham | i forgot 2 ask ü all smth there s a card on ... |
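To see what the cleaning step does to a single message, here is a small standalone sketch. It combines the `\W` substitution, lowercasing, and a whitespace split in one hypothetical helper, `clean_and_tokenize`, which is not part of the notebook.

```python
import re


def clean_and_tokenize(message):
    """Replace non-word characters with spaces, lowercase, and split into words."""
    no_punctuation = re.sub(r'\W', ' ', message)
    return no_punctuation.lower().split()


print(clean_and_tokenize("Ok lar... Joking wif u oni..."))
# ['ok', 'lar', 'joking', 'wif', 'u', 'oni']
print(clean_and_tokenize("Don't know"))
# ['don', 't', 'know']
```

Note that `\W` also removes apostrophes, so contractions are split into two tokens; digits and accented letters such as `ü` count as word characters and are kept.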
testing_set['SMS'] = testing_set['SMS'].apply(rmv_punc)
testing_set['SMS'] = testing_set['SMS'].str.lower()
testing_set.head(5)
| | Label | SMS |
---|---|---|
0 | ham | later i guess i needa do mcat study too |
1 | ham | but i haf enuff space got like 4 mb |
2 | spam | had your mobile 10 mths update to latest oran... |
3 | ham | all sounds good fingers makes it difficult ... |
4 | ham | all done all handed in don t know if mega sh... |
training_set['SMS'] = training_set['SMS'].astype(str)
training_set['SMS'] = training_set['SMS'].str.split()
vocabulary = []
for each in training_set['SMS']:
    for i in each:
        vocabulary.append(i)
# deduplicate so that each word appears only once
vocabulary = list(set(vocabulary))
len(vocabulary)
7783
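The vocabulary is simply the set of distinct words across all training messages. A minimal sketch on made-up, already tokenized messages, where `toy_messages` and `toy_vocabulary` are illustrative names only:

```python
# three made-up messages, already cleaned and split into words
toy_messages = [
    ['free', 'entry', 'to', 'win'],
    ['ok', 'see', 'you', 'to', 'night'],
    ['win', 'free', 'cash'],
]

# a set comprehension collects each distinct word once;
# sorting only makes the result easier to read
toy_vocabulary = sorted({word for sms in toy_messages for word in sms})

print(toy_vocabulary)
print(len(toy_vocabulary))  # 9
```

The set comprehension does the same job as the append-then-`set` loop, without building the intermediate list of repeated words.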
# one list of zeros per vocabulary word, with one slot per message
word_counts_per_sms = {unique_word: [0] * len(training_set['SMS'])
                       for unique_word in vocabulary}
# count how often each word occurs in each message
for index, sms in enumerate(training_set['SMS']):
    for word in sms:
        word_counts_per_sms[word][index] += 1
word_counts = pd.DataFrame(word_counts_per_sms)
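The counting logic is easier to follow on a tiny example. The sketch below repeats the same dictionary-of-lists approach on three made-up messages; the names `toy_messages`, `toy_vocabulary`, and `counts` are hypothetical.

```python
# three made-up messages, already cleaned and split into words
toy_messages = [
    ['win', 'free', 'win'],
    ['ok', 'free'],
    ['ok'],
]
toy_vocabulary = sorted({word for sms in toy_messages for word in sms})

# one list per word, one position per message, all starting at 0
counts = {word: [0] * len(toy_messages) for word in toy_vocabulary}

for index, sms in enumerate(toy_messages):
    for word in sms:
        counts[word][index] += 1

for word in toy_vocabulary:
    print(word, counts[word])
# free [1, 1, 0]
# ok [0, 1, 1]
# win [2, 0, 0]
```

Each dictionary key becomes a column and each list position a row when the dictionary is passed to `pd.DataFrame`, which is why every list must be as long as the number of messages.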
pd.options.display.max_columns = 1000
word_counts.head(3)
[Output omitted: the first three rows of word_counts. The table has one column per vocabulary word (0, 00, 000, ..., zyada, é, ü), and almost every entry is 0 because each message uses only a few of the 7,783 words.]
0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 
| 0 | 0 |
3 rows × 7783 columns
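The word-count table is joined to the training messages column-wise in the next step. As a minimal sketch of how that kind of join behaves, `pd.concat` with `axis=1` matches rows by index label, and `join='inner'` keeps only the labels present in both frames. The toy frames `messages` and `counts` below are made-up stand-ins, not the notebook's data.

```python
import pandas as pd

# Hypothetical stand-in for training_set: a few tokenized messages whose
# index labels (10, 11, 12) were left over from an earlier shuffle.
messages = pd.DataFrame(
    {
        "Label": ["ham", "spam", "ham"],
        "SMS": [["yep", "by"], ["win", "win", "prize"], ["ok"]],
    },
    index=[10, 11, 12],
)

# Hypothetical stand-in for word_counts: one column per vocabulary word,
# built separately, so it carries the default index 0, 1, 2.
counts = pd.DataFrame(
    {
        "by": [1, 0, 0],
        "ok": [0, 0, 1],
        "prize": [0, 1, 0],
        "win": [0, 2, 0],
        "yep": [1, 0, 0],
    }
)

# axis=1 aligns on index labels; join='inner' keeps only shared labels.
# The two indexes have nothing in common here, so every row is dropped.
misaligned = pd.concat([messages, counts], axis=1, join="inner")

# Resetting the index first gives both frames the labels 0, 1, 2.
aligned = pd.concat(
    [messages.reset_index(drop=True), counts], axis=1, join="inner"
)

print(misaligned.shape)  # (0, 7)
print(aligned.shape)     # (3, 7)
```

If the training set still carries shuffled index labels from sampling, resetting its index before the join (as in `aligned`) avoids silently losing rows.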
# joining the training set with its word-count columns (rows are matched on the index)
result = pd.concat([training_set, word_counts], axis=1, join='inner')
result.head()
Label | SMS | 0 | 00 | 000 | 000pes | 008704050406 | 0089 | 01223585334 | 02 | 0207 | 02072069400 | 02073162414 | 02085076972 | 021 | 03 | 04 | 0430 | 05 | 050703 | 0578 | 06 | 07008009200 | 07046744435 | 07090201529 | 07090298926 | 07099833605 | 07123456789 | 0721072 | 07734396839 | 07742676969 | 07753741225 | 07781482378 | 07786200117 | 077xxx | 078 | 07801543489 | 07808726822 | 07815296484 | 07821230901 | 078498 | 07880867867 | 0789xxxxxxx | 07946746291 | 0796xxxxxx | 07973788240 | 07xxxxxxxxx | 08 | 0800 | 08000407165 | 08000776320 | 08000839402 | 08000930705 | 08000938767 | 08001950382 | 08002888812 | 08002986030 | 08002986906 | 08002988890 | 08006344447 | 0808 | 08081263000 | 08081560665 | 0825 | 083 | 0844 | 08448350055 | 08448714184 | 0845 | 08450542832 | 08452810073 | 08452810075over18 | 0870 | 08700435505150p | 08700469649 | 08700621170150p | 08701237397 | 08701417012 | 08701417012150p | 0870141701216 | 087016248 | 087018728737 | 0870241182716 | 08702490080 | 08702840625 | 08704050406 | 08704439680ts | 08706091795 | 0870737910216yrs | 08707500020 | 08707509020 | 0870753331018 | 08708034412 | 08708800282 | 08709222922 | 08709501522 | 0871 | 087104711148 | 08712101358 | 08712103738 | 0871212025016 | 08712300220 | 087123002209am | 08712317606 | 08712400200 | 08712400602450p | 08712400603 | 08712402050 | 08712402578 | 08712402779 | 08712402902 | 08712402972 | 08712405020 | 08712405022 | 08712460324 | 08712466669 | 0871277810810 | 08714342399 | 087147123779am | 08714712388 | 08714712394 | 08714712412 | 08714714011 | 08715203649 | 08715203652 | 08715203656 | 08715203677 | 08715203685 | 08715205273 | 08715500022 | 08715705022 | 08717111821 | 08717168528 | 08717205546 | 0871750 | 08717507382 | 08717509990 | 08717890890 | 08717895698 | 08717898035 | 08718711108 | 08718720201 | 08718723815 | 08718725756 | 08718726270 | 087187262701 | 08718726970 | 08718726971 | 08718726978 | 087187272008 | 08718727868 | 08718727870 | 08718727870150ppm | 08718730555 | 08718730666 | 
08718738001 | 08718738034 | 08719180219 | 08719180248 | 08719181259 | 08719181503 | 08719181513 | 08719839835 | 09 | 09041940223 | 09050000301 | 09050000332 | 09050000460 | 09050000878 | 09050000928 | 09050001295 | 09050001808 | 09050002311 | 09050003091 | 09050005321 | 09050090044 | 09050280520 | 09053750005 | 09056242159 | 09057039994 | 09058091854 | 09058091870 | 09058094454 | 09058094455 | 09058094565 | 09058094583 | 09058094597 | 09058094599 | 09058095107 | 09058095201 | 09058097189 | 09058097218 | 09058099801 | 09061104276 | 09061104283 | 09061209465 | 09061213237 | 09061221061 | 09061221066 | 09061701444 | 09061701461 | 09061701851 | 09061701939 | 09061702893 | 09061743386 | 09061743806 | 09061744553 | 09061790121 | 09061790125 | 09061790126 | 09063440451 | 09063442151 | 09063458130 | 0906346330 | 09064011000 | 09064012103 | 09064012160 | 09064015307 | 09064017295 | 09064017305 | 09064018838 | 09064019014 | 09065069154 | 09065171142 | 09065174042 | 09065394973 | 09065989182 | 09066350750 | 09066358152 | 09066361921 | 09066362206 | 09066362220 | 09066362231 | 09066364311 | 09066364349 | 09066364589 | 09066368327 | 09066368470 | 09066368753 | 09066380611 | 09066382422 | 09066612661 | 09066649731from | 09066660100 | 09071512432 | 09071512433 | 09071517866 | 09077818151 | 09090900040 | 09094100151 | 09094646631 | 09094646899 | 09095350301 | 09096102316 | 09099725823 | 09099726395 | 09099726429 | 09099726481 | 09111030116 | 09111032124 | 09701213186 | 0quit | 1 | 10 | 100 | 1000 | 1000call | 1000s | 100p | 100percent | 100txt | 1013 | 10am | 10k | 10p | 10ppm | 11 | 1120 | 113 | 1131 | 114 | 1146 | 116 | 1172 | 118p | 11mths | 11pm | 12 | 1205 | 120p | 121 | 1225 | 123 | 1250 | 125gift | 128 | 12hours | 12hrs | 12mths | 13 | 130 | 1327 | 139 | 14 | 140 | 1405 | 140ppm | 145 | 1450 | 14tcr | 14thmarch | 15 | 150 | 1500 | 150p | 150p16 | 150pm | 150ppermesssubscription | 150ppm | 150ppmpobox10183bhamb64xe | 150ppmsg | 150pw | 151 | 153 | 15541 | 15pm | 16 | 165 | 
1680 | 169 | 177 | 18 | 180 | 1843 | 18p | 18yrs | 195 | 1956669 | 1b6a5ecef91ff9 | 1da | 1er | 1hr | 1mega | 1million | 1pm | 1st | 1stone | 1thing | 1win150ppmx3 | 1winaweek | 1x150p | 2 | 20 | 200 | 2000 | 2003 | 2004 | 2005 | 2006 | 2007 | 200p | 20p | 21 | 21870000 | 21st | 22 | 220 | 220cm2 | 2309 | 23f | 23g | 24 | 24hrs | 24m | 24th | 25 | 250 | 250k | 255 | 25p | 26 | 2667 | 26th | 27 | 28 | 2814032 | 28days | 28th | 29 | 2bold | 2c | 2channel | 2day | 2end | 2exit | 2ez | 2find | 2geva | 2go | 2gthr | 2hook | 2i | 2lands | 2marrow | 2moro | 2morow | 2morro | 2morrow | 2morrowxxxx | 2mro | 2mrw | 2mwen | 2nd | 2nhite | 2nights | 2nite | 2optout | 2p | 2price | 2px | 2rcv | 2stop | 2stoptx | 2stoptxt | 2u | 2u2 | 2watershd | 2waxsto | 2wks | 2wt | 2wu | 2years | 2yr | 3 | 30 | 300 | 3000 | 300603 | 300603t | 300p | 3030 | 30ish | 30pm | 30pp | 30s | 30th | 31 | 3100 | 310303 | 31p | 32000 | 3230 | 32323 | 33 | 330 | 350 | 3510i | 35p | 3650 | 36504 | 3680 | 373 | 3750 | 37819 | 38 | 391784 | 3aj | 3d | 3days | 3g | 3gbp | 3hrs | 3lp | 3mins | 3mobile | 3optical | 3qxj9 | 3rd | 3ss | 3uz | 3wks | 3x | 3xx | 4 | 40 | 400 | 400mins | 400thousad | 402 | 4041 | 40533 | 40gb | 40mph | 41782 | 420 | 42049 | 4217 | 42478 | 42810 | 430 | ... 
| vat | vava | vco | vday | ve | vegas | vegetables | veggie | vehicle | velachery | velly | velusamy | venaam | verified | verify | verifying | version | versus | very | vettam | vewy | via | vibrant | vibrate | vibrator | victoria | victors | vid | video | videochat | videophones | videos | videosound | videosounds | vijay | vijaykanth | vikky | vilikkam | villa | village | vinobanagar | violated | violence | violet | vip | vipclub4u | virgil | virgin | virgins | virtual | visa | visionsms | visit | visiting | visitor | visitors | vital | vitamin | viva | vivek | vivekanand | vl | voda | vodafone | vodka | voice | voicemail | voila | volcanoes | vomit | vomitin | vomiting | vote | vouch4me | voucher | vouchers | vpod | vry | vs | vu | w | w1 | w111wx | w1a | w1j | w1j6hl | w1jhl | w1t1jy | w45wq | w8in | wa | wa14 | waaaat | wadebridge | wah | wahala | wahay | waheed | wahleykkum | waht | wait | waited | waitin | waiting | wake | waking | wales | waliking | walk | walked | walkin | walking | walks | wall | wallet | wallpaper | walls | walmart | walsall | wamma | wan | wan2 | wana | wanna | wannatell | want | want2come | wanted | wanting | wants | wap | waqt | warm | warming | warned | warner | warning | warranty | warwick | was | washob | wasn | wasnt | waste | wasted | wasting | wat | watch | watches | watchin | watching | watchng | water | watever | watevr | wating | wats | watts | wavering | waves | way | way2sms | waz | wc1n | wc1n3xx | we | weak | weakness | wear | wearing | weaseling | weasels | weather | web | web2mobile | webadres | webeburnin | webpage | website | wed | weddin | wedding | weddingfriend | wedlunch | wednesday | weds | wee | weed | week | weekend | weekends | weekly | weeks | weigh | weighed | weight | weird | weirdest | weirdo | weirdy | weiyi | welcome | welcomes | well | wellda | welp | wen | wendy | wenever | went | wenwecan | wer | were | weren | werethe | wesley | wesleys | west | western | westlife | westonzoyland | westshore | wet 
| wewa | whassup | what | whatever | whats | whatsup | wheel | when | whenever | whenevr | whens | where | wherever | wherevr | wherre | whether | which | while | whilltake | whispers | white | whn | who | whoever | whole | whom | whore | whos | whose | whr | why | wicket | wicklow | wid | widelive | wif | wife | wifi | wihtuot | wikipedia | wil | wild | wildest | wildlife | will | willing | willpower | win | wind | window | windows | winds | windy | wine | wined | wings | wining | winner | winning | wins | winterstone | wipe | wipro | wisdom | wise | wish | wisheds | wishes | wishin | wishing | wiskey | wit | with | withdraw | wither | within | without | witot | witout | wiv | wizzle | wk | wkend | wkent | wkg | wkly | wknd | wks | wlcome | wld | wml | wn | wnt | wo | woah | wocay | woke | woken | woman | womdarfull | women | won | wondarfull | wonder | wonderful | wondering | wonders | wont | woo | woodland | woods | woohoo | woot | woould | woozles | worc | word | words | work | workage | workand | workin | working | workout | works | world | worlds | worms | worried | worries | worry | worrying | worse | worth | worthless | wot | wotu | wotz | woul | would | woulda | wouldn | wow | wrc | wrecked | wrench | wright | write | writhing | wrk | wrkin | wrking | wrks | wrld | wrnog | wrong | wrongly | wrote | ws | wt | wtc | wtf | wth | wthout | wtlp | wud | wuld | wun | www | wylie | x | x2 | x49 | xam | xavier | xchat | xclusive | xin | xmas | xuhui | xx | xxsp | xxuk | xxx | xxxx | xxxxx | xxxxxxx | xxxxxxxx | xxxxxxxxxxxxxx | xy | y | ya | yah | yahoo | yalrigu | yalru | yan | yar | yarasu | yards | yavnt | yaxx | yaxxx | yay | yck | yeah | year | years | yeh | yelling | yelow | yeovil | yep | yer | yes | yest | yesterday | yet | yetty | yetunde | yhl | yi | yifeng | yijue | ym | ymca | yo | yoga | yogasana | yor | yorge | you | youdoing | youi | young | younger | youphone | your | youre | yourinclusive | yourjob | yours | yourself | youuuuu | youwanna | yoville 
| yowifes | yoyyooo | yr | yrs | ystrday | ything | yummy | yun | yunny | yuo | yuou | yup | yupz | z | zac | zaher | zealand | zebra | zed | zeros | zhong | zindgi | zoe | zogtorius | zouk | zyada | é | ú1 | ü | 〨ud | 鈥 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | ham | [yep, by, the, pretty, sculpture] | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | ham | [yes, princess, are, you, going, to, make, me,... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2 | ham | [welp, apparently, he, retired] | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
3 | ham | [havent] | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 
| 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 
0 | 0 | 0 | 0 | 0 |
4 | ham | [i, forgot, 2, ask, ü, all, smth, there, s, a,... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 |
5 rows × 7785 columns
Now that we're done with data cleaning and have a training set to work with, we can begin creating the spam filter. The Naive Bayes algorithm will need to know the probability values of the two equations below to be able to classify new messages:
\begin{equation} P(Spam | w_1,w_2, ..., w_n) \propto P(Spam) \cdot \prod_{i=1}^{n}P(w_i|Spam) \\ P(Ham | w_1,w_2, ..., w_n) \propto P(Ham) \cdot \prod_{i=1}^{n}P(w_i|Ham) \end{equation}
To calculate P(w_i|Spam) and P(w_i|Ham) inside the formulas above, we use the following equations, which apply additive (Laplace) smoothing through the parameter \alpha:
\begin{equation} P(w_i|Spam) = \frac{N_{w_i|Spam} + \alpha}{N_{Spam} + \alpha \cdot N_{Vocabulary}} \\ P(w_i|Ham) = \frac{N_{w_i|Ham} + \alpha}{N_{Ham} + \alpha \cdot N_{Vocabulary}} \end{equation}
Some of the terms in the four equations above will have the same value for every new message. As a start, let's first calculate:
P(Spam) and P(Ham)
N_Spam, N_Ham, N_Vocabulary
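To make the four equations concrete before applying them to the real data, here is a minimal sketch on a made-up three-message corpus. The toy messages and the helper names `p_word` and `score` are assumptions introduced only for illustration; they are not part of this notebook's pipeline.

```python
from collections import Counter

# Hypothetical toy training data (NOT taken from the SMS dataset).
train = [
    ("spam", ["win", "money", "now"]),
    ("ham", ["see", "you", "now"]),
    ("ham", ["call", "you", "later"]),
]
alpha = 1  # additive (Laplace) smoothing

# N_Vocabulary: number of unique words in the training messages
vocab = sorted({w for _, words in train for w in words})
n_vocab = len(vocab)

# Per-class word counts and per-class message counts
counts = {"spam": Counter(), "ham": Counter()}
n_msgs = Counter()
for label, words in train:
    counts[label].update(words)
    n_msgs[label] += 1

# P(Spam), P(Ham) and N_Spam, N_Ham (total words per class)
prior = {label: n_msgs[label] / len(train) for label in counts}
n_words = {label: sum(counts[label].values()) for label in counts}

def p_word(word, label):
    """Smoothed estimate of P(word | label)."""
    return (counts[label][word] + alpha) / (n_words[label] + alpha * n_vocab)

def score(words, label):
    """Quantity proportional to P(label | words); unknown words are skipped."""
    s = prior[label]
    for w in words:
        if w in vocab:
            s *= p_word(w, label)
    return s

message = ["win", "money"]
print(score(message, "spam"), score(message, "ham"))
```

In this toy corpus the spam score for `["win", "money"]` comes out larger than the ham score, so the message would be labelled spam. The word `money` never occurs in a ham message, yet smoothing keeps its ham probability above zero, so the product does not collapse to zero.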
# calculating p_spam
p_spam = result[result['Label'] == 'spam'].shape[0]/result.shape[0]
p_spam
0.13458950201884254
# calculating p_not_spam
p_not_spam = 1 - p_spam
p_not_spam
0.8654104979811574
# calculating n_spam & n_not_spam (total number of words in each class)
n_spam = 0
n_not_spam = 0
a = training_set[training_set['Label'] == 'spam']
b = training_set[training_set['Label'] == 'ham']
# each entry of 'SMS' is a list of words, so len() counts the words in a message
for each in a['SMS']:
    n_spam += len(each)
for each in b['SMS']:
    n_not_spam += len(each)
print(n_spam, n_not_spam)
15190 57237
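The same totals can also be obtained without explicit loops. The sketch below assumes, as in the cleaned training set, that each `SMS` entry is a list of words; the `toy` frame and its contents are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for training_set: 'SMS' holds lists of words.
toy = pd.DataFrame({
    "Label": ["spam", "ham", "ham"],
    "SMS": [
        ["win", "money", "now"],
        ["see", "you", "now"],
        ["call", "you", "later", "ok"],
    ],
})

# Words per message, then summed within each label.
words_per_label = toy["SMS"].apply(len).groupby(toy["Label"]).sum()

n_spam_toy = int(words_per_label["spam"])
n_not_spam_toy = int(words_per_label["ham"])
print(n_spam_toy, n_not_spam_toy)
```

Note that these are word counts, not message counts: N_Spam and N_Ham in the smoothing formula refer to the total number of words in each class.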
# calculating n_vocab
n_vocab = len(vocabulary)
print(n_vocab)
7783
# initializing the smoothing parameter alpha (alpha = 1 is Laplace smoothing)
alpha = 1
We have 7,783 words in our vocabulary, which means we'll need to calculate a total of 15,566 probabilities: for each word, we need both P(w_i|Spam) and P(w_i|Ham).
In more technical language, the probability values that P(w_i|Spam) and P(w_i|Ham) take are called parameters.
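As a quick sanity check on the scale of these parameters, the smoothing formula can be evaluated by hand with the values already reported above (N_Spam = 15190, N_Vocabulary = 7783, alpha = 1). The helper name `smoothed` is an assumption introduced here for illustration.

```python
import math

# Values reported earlier in this notebook.
n_spam = 15190   # total number of words in spam messages
n_vocab = 7783   # number of unique words in the vocabulary
alpha = 1        # Laplace smoothing

def smoothed(count, n_class=n_spam):
    """P(w | class) for a word that occurs `count` times in that class."""
    return (count + alpha) / (n_class + alpha * n_vocab)

p_unseen = smoothed(0)  # word that never occurs in a spam message
p_once = smoothed(1)    # word that occurs exactly once in spam messages
print(p_unseen, p_once)
```

Because of smoothing, no parameter is ever exactly zero: a word that never occurs in spam still gets 1 / (15190 + 7783), roughly 4.35e-05. Every spam parameter is a multiple of this value, which is why the same few numbers recur in the spam dictionary.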
# initializing the spam and non-spam parameter dictionaries
spam_prob = {unique_word: 0 for unique_word in vocabulary}
non_spam_prob = {unique_word: 0 for unique_word in vocabulary}
# word-count rows of the training data, split by label
p = result[result['Label'] == 'spam']
q = result[result['Label'] == 'ham']
# smoothed estimates of P(w_i|Spam) and P(w_i|Ham) for every word in the vocabulary
for each in vocabulary:
    spam = (p[each].sum() + alpha) / (n_spam + alpha * n_vocab)
    spam_prob[each] = spam
    non_spam = (q[each].sum() + alpha) / (n_not_spam + alpha * n_vocab)
    non_spam_prob[each] = non_spam
spam_prob
{'lifebook': 4.3529360553693465e-05, 'chef': 4.3529360553693465e-05, 'studyn': 4.3529360553693465e-05, 'aburo': 4.3529360553693465e-05, 'resort': 4.3529360553693465e-05, 'betta': 4.3529360553693465e-05, 'consensus': 4.3529360553693465e-05, 'sathy': 4.3529360553693465e-05, '08712460324': 0.00039176424498324117, 'oja': 8.705872110738693e-05, 'pop': 4.3529360553693465e-05, 'open': 4.3529360553693465e-05, 'receipts': 4.3529360553693465e-05, 'definitely': 8.705872110738693e-05, 'water': 4.3529360553693465e-05, '8007': 0.0007835284899664823, 'guys': 0.0002611761633221608, 'samus': 4.3529360553693465e-05, 'gibbs': 4.3529360553693465e-05, 'lyfu': 4.3529360553693465e-05, 'textbuddy': 0.0001305880816610804, '60': 4.3529360553693465e-05, '05': 0.0002611761633221608, 'hv': 4.3529360553693465e-05, 'cloth': 4.3529360553693465e-05, 'executive': 4.3529360553693465e-05, 'major': 4.3529360553693465e-05, 'ability': 4.3529360553693465e-05, 'hubby': 0.0001305880816610804, 'rentl': 0.00021764680276846734, 'buzy': 4.3529360553693465e-05, 'sweetest': 4.3529360553693465e-05, 'goverment': 4.3529360553693465e-05, 'wow': 0.0001305880816610804, 'psychic': 8.705872110738693e-05, 'comb': 4.3529360553693465e-05, 'singles': 0.0002611761633221608, 'monoc': 8.705872110738693e-05, 'beauty': 4.3529360553693465e-05, 'tmorrow': 4.3529360553693465e-05, '09061701444': 8.705872110738693e-05, 'huai': 4.3529360553693465e-05, 'finishes': 4.3529360553693465e-05, 'whatever': 4.3529360553693465e-05, 'such': 4.3529360553693465e-05, 'reception': 4.3529360553693465e-05, 'dancing': 4.3529360553693465e-05, 'proove': 4.3529360553693465e-05, 'turns': 4.3529360553693465e-05, 'dependents': 4.3529360553693465e-05, 'wan': 0.0001305880816610804, 'completed': 4.3529360553693465e-05, 'fieldof': 4.3529360553693465e-05, 'recd': 0.0001305880816610804, 'often': 4.3529360553693465e-05, 'blind': 0.0001305880816610804, 'heal': 4.3529360553693465e-05, 'million': 4.3529360553693465e-05, 'reward': 0.0004352936055369347, 'amazing': 
0.0001305880816610804, 'cl': 4.3529360553693465e-05, 'haha': 4.3529360553693465e-05, 'warranty': 0.0001305880816610804, 'maximize': 0.0003482348844295477, 'woohoo': 4.3529360553693465e-05, 'hasbro': 4.3529360553693465e-05, 'fireplace': 4.3529360553693465e-05, 'waht': 4.3529360553693465e-05, '9758': 8.705872110738693e-05, 'twilight': 0.0001305880816610804, 'done': 0.00021764680276846734, 'questions': 0.00017411744221477386, 'adp': 8.705872110738693e-05, 'andres': 4.3529360553693465e-05, 'sf': 8.705872110738693e-05, '2006': 8.705872110738693e-05, 'tasts': 4.3529360553693465e-05, 'showing': 4.3529360553693465e-05, 'salam': 4.3529360553693465e-05, 'unfortunately': 4.3529360553693465e-05, 'drinkin': 4.3529360553693465e-05, 'sense': 4.3529360553693465e-05, 'musthu': 4.3529360553693465e-05, 'amnow': 4.3529360553693465e-05, 'bluff': 4.3529360553693465e-05, 'canada': 4.3529360553693465e-05, 'singapore': 4.3529360553693465e-05, 'trusting': 4.3529360553693465e-05, 'art': 4.3529360553693465e-05, 'loud': 4.3529360553693465e-05, 'enter': 0.00030470552387585427, 'sufficient': 8.705872110738693e-05, 'dining': 0.0001305880816610804, '36504': 0.0002611761633221608, 'things': 8.705872110738693e-05, 'margin': 8.705872110738693e-05, 'addie': 4.3529360553693465e-05, 'slowing': 4.3529360553693465e-05, 'railway': 4.3529360553693465e-05, 'thursday': 4.3529360553693465e-05, 'styles': 4.3529360553693465e-05, 'hill': 4.3529360553693465e-05, 'prakesh': 4.3529360553693465e-05, 'pobox365o4w45wq': 8.705872110738693e-05, 'yhl': 8.705872110738693e-05, 'issues': 0.0001305880816610804, 'rofl': 4.3529360553693465e-05, '4brekkie': 4.3529360553693465e-05, 'dobby': 4.3529360553693465e-05, 'ileave': 4.3529360553693465e-05, 'platt': 4.3529360553693465e-05, 'vl': 4.3529360553693465e-05, 'tok': 4.3529360553693465e-05, 'diff': 4.3529360553693465e-05, '1327': 0.0003482348844295477, 'confidence': 4.3529360553693465e-05, 'rules': 4.3529360553693465e-05, 'workin': 4.3529360553693465e-05, 'inconsiderate': 
4.3529360553693465e-05, 'nattil': 4.3529360553693465e-05, 'soundtrack': 8.705872110738693e-05, 'fox': 4.3529360553693465e-05, 'sore': 4.3529360553693465e-05, 'im': 0.0003482348844295477, 'solved': 4.3529360553693465e-05, 'thm': 4.3529360553693465e-05, 'ecstacy': 8.705872110738693e-05, 'tx': 4.3529360553693465e-05, 'efreefone': 8.705872110738693e-05, 'fans': 8.705872110738693e-05, '4few': 4.3529360553693465e-05, 'census': 4.3529360553693465e-05, 'improved': 8.705872110738693e-05, 'bet': 4.3529360553693465e-05, 'note': 4.3529360553693465e-05, 'jorge': 8.705872110738693e-05, 'icmb3cktz8r7': 8.705872110738693e-05, 'lik': 4.3529360553693465e-05, 'resizing': 4.3529360553693465e-05, 'lucyxx': 4.3529360553693465e-05, 'accordin': 4.3529360553693465e-05, 'spinout': 4.3529360553693465e-05, 'sundayish': 4.3529360553693465e-05, 'retrieve': 0.0001305880816610804, 'portege': 4.3529360553693465e-05, 'fetching': 4.3529360553693465e-05, '07946746291': 8.705872110738693e-05, 'pieces': 4.3529360553693465e-05, 'fones': 8.705872110738693e-05, 'must': 0.0002611761633221608, 'pert': 4.3529360553693465e-05, 'hun': 4.3529360553693465e-05, 'ghodbandar': 4.3529360553693465e-05, 'ramen': 4.3529360553693465e-05, 'teletext': 8.705872110738693e-05, 'blur': 4.3529360553693465e-05, 'favor': 4.3529360553693465e-05, 'woah': 4.3529360553693465e-05, 'speed': 8.705872110738693e-05, 'tones2u': 0.0001305880816610804, 'apologise': 4.3529360553693465e-05, 'plumbers': 4.3529360553693465e-05, 'dl': 4.3529360553693465e-05, 'premier': 8.705872110738693e-05, 'realise': 4.3529360553693465e-05, 'bullshit': 4.3529360553693465e-05, 'px3748': 4.3529360553693465e-05, '3gbp': 0.00017411744221477386, 'subpoly': 0.00017411744221477386, 'musicnews': 8.705872110738693e-05, 'opted': 4.3529360553693465e-05, 'havn': 4.3529360553693465e-05, 'off': 0.00021764680276846734, 'there': 0.0006094110477517085, '150ppmpobox10183bhamb64xe': 8.705872110738693e-05, 'ntt': 0.00047882296609062813, 'themes': 4.3529360553693465e-05, 'nino': 
4.3529360553693465e-05, 'hmm': 4.3529360553693465e-05, 'giving': 8.705872110738693e-05, 'north': 4.3529360553693465e-05, 'email': 4.3529360553693465e-05, 'bcs': 4.3529360553693465e-05, 'quizclub': 8.705872110738693e-05, 'slo': 0.0001305880816610804, '300': 0.0002611761633221608, 'got': 0.00021764680276846734, '4d': 4.3529360553693465e-05, 'la3': 8.705872110738693e-05, 'pride': 4.3529360553693465e-05, '6': 0.0003482348844295477, 'teasing': 4.3529360553693465e-05, 'kiosk': 0.0001305880816610804, 'xmas': 0.000652940408305402, 'ntimate': 4.3529360553693465e-05, 'icic': 4.3529360553693465e-05, 'leaving': 4.3529360553693465e-05, 'versus': 4.3529360553693465e-05, 'isaiah': 4.3529360553693465e-05, 'aft': 4.3529360553693465e-05, 'irulinae': 4.3529360553693465e-05, 'tiny': 4.3529360553693465e-05, 'problum': 4.3529360553693465e-05, 'f4q': 8.705872110738693e-05, 'fooled': 4.3529360553693465e-05, 'std': 0.00030470552387585427, 'possession': 4.3529360553693465e-05, 'chillin': 4.3529360553693465e-05, 'shld': 4.3529360553693465e-05, 'busty': 8.705872110738693e-05, '6669': 8.705872110738693e-05, 'shoot': 4.3529360553693465e-05, 'sucks': 4.3529360553693465e-05, 'cum': 0.0002611761633221608, 'king': 0.0001305880816610804, 'arcade': 0.00021764680276846734, 'glo': 4.3529360553693465e-05, 'liked': 4.3529360553693465e-05, 'euro2004': 0.00021764680276846734, 'twittering': 4.3529360553693465e-05, 'thnx': 4.3529360553693465e-05, 'losing': 4.3529360553693465e-05, 'fails': 4.3529360553693465e-05, 'agidhane': 4.3529360553693465e-05, 'wins': 4.3529360553693465e-05, 'med': 0.00017411744221477386, 'guitar': 4.3529360553693465e-05, 'freezing': 4.3529360553693465e-05, 'class': 4.3529360553693465e-05, 'handset': 0.00030470552387585427, '250': 0.000652940408305402, 'west': 4.3529360553693465e-05, '128': 4.3529360553693465e-05, 'local': 0.0002611761633221608, 'munsters': 0.0001305880816610804, 'exorcist': 0.0001305880816610804, 'jod': 4.3529360553693465e-05, 'scoring': 0.0001305880816610804, 
'everyone': 0.0001305880816610804, 'misss': 4.3529360553693465e-05, 'priscilla': 4.3529360553693465e-05, '3qxj9': 0.00021764680276846734, 'fantastic': 0.00030470552387585427, 'mutations': 4.3529360553693465e-05, 'swann': 4.3529360553693465e-05, 'morefrmmob': 8.705872110738693e-05, 'gsoh': 8.705872110738693e-05, 'dance': 4.3529360553693465e-05, 'sw7': 0.0001305880816610804, 'lovin': 4.3529360553693465e-05, 'twelve': 4.3529360553693465e-05, 'shaved': 4.3529360553693465e-05, 'pizza': 4.3529360553693465e-05, 'daywith': 4.3529360553693465e-05, 'bird': 4.3529360553693465e-05, 'angry': 4.3529360553693465e-05, 'turkeys': 4.3529360553693465e-05, 'darkest': 8.705872110738693e-05, 'kip': 4.3529360553693465e-05, '09061221061': 0.0001305880816610804, 'multimedia': 4.3529360553693465e-05, 'didnt': 4.3529360553693465e-05, 'stops': 4.3529360553693465e-05, 'education': 4.3529360553693465e-05, '1450': 0.0001305880816610804, 'christmassy': 4.3529360553693465e-05, 'eppolum': 4.3529360553693465e-05, 'outgoing': 8.705872110738693e-05, 'violence': 4.3529360553693465e-05, 'bslvyl': 4.3529360553693465e-05, '08707500020': 8.705872110738693e-05, 'maneesha': 4.3529360553693465e-05, 'algorithms': 4.3529360553693465e-05, 'completely': 8.705872110738693e-05, 'bluray': 4.3529360553693465e-05, 'childrens': 4.3529360553693465e-05, '08718720201': 0.0003482348844295477, 'guesses': 4.3529360553693465e-05, '220': 0.00017411744221477386, 'chinky': 4.3529360553693465e-05, 'chuckin': 4.3529360553693465e-05, 'soc': 4.3529360553693465e-05, 'rcd': 8.705872110738693e-05, '07786200117': 8.705872110738693e-05, 'monos': 8.705872110738693e-05, 'individual': 4.3529360553693465e-05, 'rush': 4.3529360553693465e-05, 'juicy': 4.3529360553693465e-05, 'telephone': 8.705872110738693e-05, 'il': 4.3529360553693465e-05, 'paris': 0.00021764680276846734, '08709501522': 8.705872110738693e-05, 'daytime': 8.705872110738693e-05, 'asking': 4.3529360553693465e-05, 'ranjith': 4.3529360553693465e-05, 'entropication': 
4.3529360553693465e-05, 'bud': 4.3529360553693465e-05, 'fassyole': 4.3529360553693465e-05, 'clocks': 4.3529360553693465e-05, 'bilo': 4.3529360553693465e-05, 'most': 8.705872110738693e-05, '80182': 0.0001305880816610804, 'dem': 4.3529360553693465e-05, 'astronomer': 4.3529360553693465e-05, 'analysis': 4.3529360553693465e-05, 'helens': 4.3529360553693465e-05, 'abi': 4.3529360553693465e-05, 'ml': 4.3529360553693465e-05, 'permissions': 4.3529360553693465e-05, 'dignity': 4.3529360553693465e-05, 'duo': 4.3529360553693465e-05, '4719': 8.705872110738693e-05, 'was': 0.0002611761633221608, '50perwksub': 0.00017411744221477386, 'somtimes': 4.3529360553693465e-05, 'edge': 4.3529360553693465e-05, 'flat': 4.3529360553693465e-05, '69696': 8.705872110738693e-05, 'tp': 4.3529360553693465e-05, 'chinese': 4.3529360553693465e-05, 'rcv': 0.0001305880816610804, 'careers': 0.0001305880816610804, 'da': 4.3529360553693465e-05, 'dude': 4.3529360553693465e-05, 'declare': 4.3529360553693465e-05, 'mom': 4.3529360553693465e-05, 'sef': 4.3529360553693465e-05, 'march': 4.3529360553693465e-05, '10p': 0.0009576459321812563, 'x': 0.00039176424498324117, 'sue': 0.0001305880816610804, 'greeting': 4.3529360553693465e-05, 'slow': 8.705872110738693e-05, 'nevamind': 4.3529360553693465e-05, 'tallent': 4.3529360553693465e-05, '69969': 8.705872110738693e-05, 'wc1n3xx': 0.00021764680276846734, 'bedroom': 8.705872110738693e-05, 'sonyericsson': 0.00021764680276846734, 'goes': 4.3529360553693465e-05, 'condition': 8.705872110738693e-05, '69101': 8.705872110738693e-05, 'forever': 4.3529360553693465e-05, 'wisheds': 4.3529360553693465e-05, 'arranging': 4.3529360553693465e-05, 'tongued': 4.3529360553693465e-05, '7mp': 8.705872110738693e-05, 'fonin': 4.3529360553693465e-05, 'chocolate': 4.3529360553693465e-05, 'trishul': 4.3529360553693465e-05, '2waxsto': 4.3529360553693465e-05, 'formal': 4.3529360553693465e-05, 'sic': 8.705872110738693e-05, 'trek': 4.3529360553693465e-05, '08702490080': 8.705872110738693e-05, 
'listener': 4.3529360553693465e-05, 'kz': 4.3529360553693465e-05, 'meets': 4.3529360553693465e-05, '4years': 4.3529360553693465e-05, 'past': 4.3529360553693465e-05, 'consider': 4.3529360553693465e-05, 'looking': 0.0003482348844295477, 'indeed': 4.3529360553693465e-05, 'leadership': 8.705872110738693e-05, 'houseful': 4.3529360553693465e-05, 'yan': 4.3529360553693465e-05, 'lambu': 4.3529360553693465e-05, 'text': 0.004091759892047186, 'useless': 4.3529360553693465e-05, '2optout': 0.0002611761633221608, '09111032124': 8.705872110738693e-05, 'mi': 4.3529360553693465e-05, 'priority': 4.3529360553693465e-05, 'poyyarikatur': 4.3529360553693465e-05, 'soiree': 8.705872110738693e-05, 'greatness': 4.3529360553693465e-05, 'temp': 4.3529360553693465e-05, 'rite': 4.3529360553693465e-05, 'pouch': 4.3529360553693465e-05, 'fb': 4.3529360553693465e-05, 'relieved': 4.3529360553693465e-05, 'chasing': 8.705872110738693e-05, 'gee': 4.3529360553693465e-05, 'suggestions': 4.3529360553693465e-05, 'dogbreath': 4.3529360553693465e-05, 'settle': 4.3529360553693465e-05, 'rushing': 4.3529360553693465e-05, 'sen': 4.3529360553693465e-05, 'stash': 4.3529360553693465e-05, 'imat': 4.3529360553693465e-05, 'aaniye': 4.3529360553693465e-05, '83021': 8.705872110738693e-05, 'srt': 4.3529360553693465e-05, '69669': 8.705872110738693e-05, '1000call': 8.705872110738693e-05, 'notified': 8.705872110738693e-05, '08452810075over18': 8.705872110738693e-05, 'call09050000327': 0.0001305880816610804, 'behalf': 4.3529360553693465e-05, 'putting': 4.3529360553693465e-05, 'aig': 4.3529360553693465e-05, 'responce': 4.3529360553693465e-05, '08002988890': 8.705872110738693e-05, 'alle': 4.3529360553693465e-05, 'nagar': 4.3529360553693465e-05, 'mgs': 8.705872110738693e-05, 'upstairs': 4.3529360553693465e-05, 'humans': 4.3529360553693465e-05, '09061702893': 8.705872110738693e-05, 'notifications': 8.705872110738693e-05, 'captain': 4.3529360553693465e-05, '09057039994': 8.705872110738693e-05, 'else': 8.705872110738693e-05, 
'sptv': 0.0001305880816610804, 'age': 0.00030470552387585427, 'wendy': 4.3529360553693465e-05, 'lv': 4.3529360553693465e-05, 'remove': 0.00017411744221477386, 'wylie': 4.3529360553693465e-05, 'visionsms': 8.705872110738693e-05, 'fights': 4.3529360553693465e-05, 'convey': 4.3529360553693465e-05, 'administrator': 4.3529360553693465e-05, 'ela': 4.3529360553693465e-05, 'placement': 4.3529360553693465e-05, 'classmates': 4.3529360553693465e-05, 'pataistha': 4.3529360553693465e-05, 'birds': 4.3529360553693465e-05, 'messaging': 0.0001305880816610804, 'rays': 4.3529360553693465e-05, '09058091870': 8.705872110738693e-05, 'mobypobox734ls27yf': 8.705872110738693e-05, '391784': 8.705872110738693e-05, 'casualty': 4.3529360553693465e-05, 'imma': 4.3529360553693465e-05, 'flute': 4.3529360553693465e-05, 'sagamu': 4.3529360553693465e-05, 'ovarian': 4.3529360553693465e-05, '430': 4.3529360553693465e-05, 'raj': 4.3529360553693465e-05, 'allalo': 4.3529360553693465e-05, 'domain': 4.3529360553693465e-05, '7th': 4.3529360553693465e-05, 'murderer': 4.3529360553693465e-05, 'unspoken': 4.3529360553693465e-05, 'whispers': 4.3529360553693465e-05, 'where': 0.0001305880816610804, 'icicibank': 4.3529360553693465e-05, 'digital': 0.0003482348844295477, 'birth': 4.3529360553693465e-05, 'celebrations': 4.3529360553693465e-05, 'gage': 8.705872110738693e-05, 'lounge': 4.3529360553693465e-05, 'becaus': 4.3529360553693465e-05, 'natural': 4.3529360553693465e-05, '83600': 0.00021764680276846734, 'sum1': 4.3529360553693465e-05, 'panties': 4.3529360553693465e-05, 'inc': 0.0002611761633221608, 'sp': 0.0002611761633221608, 'nusstu': 4.3529360553693465e-05, 'confirmd': 4.3529360553693465e-05, 'babygoodbye': 8.705872110738693e-05, 'timings': 4.3529360553693465e-05, 'rocking': 4.3529360553693465e-05, 'increments': 0.0001305880816610804, 'jerry': 4.3529360553693465e-05, 'fixed': 8.705872110738693e-05, 'premium': 0.0001305880816610804, 'clear': 4.3529360553693465e-05, 'lush': 4.3529360553693465e-05, 'added': 
0.0001305880816610804, 'slide': 8.705872110738693e-05, 'lucy': 0.0001305880816610804, 'afternoons': 4.3529360553693465e-05, 'urmom': 4.3529360553693465e-05, 'chad': 4.3529360553693465e-05, 'tired': 4.3529360553693465e-05, 'num': 4.3529360553693465e-05, 'bridal': 8.705872110738693e-05, 'references': 4.3529360553693465e-05, 'anthony': 4.3529360553693465e-05, 'features': 8.705872110738693e-05, 'jabo': 4.3529360553693465e-05, 'seem': 8.705872110738693e-05, 'derek': 4.3529360553693465e-05, 'thank': 8.705872110738693e-05, 'elephant': 4.3529360553693465e-05, 'getsleep': 4.3529360553693465e-05, 'vpod': 8.705872110738693e-05, 'mundhe': 4.3529360553693465e-05, 'boytoy': 4.3529360553693465e-05, '9996': 8.705872110738693e-05, 'packing': 4.3529360553693465e-05, 'important': 0.00047882296609062813, 'acknowledgement': 4.3529360553693465e-05, 'perform': 4.3529360553693465e-05, '69876': 8.705872110738693e-05, 'yahoo': 0.0001305880816610804, 'stylish': 4.3529360553693465e-05, 'overdid': 4.3529360553693465e-05, 'stream': 8.705872110738693e-05, 'wuld': 4.3529360553693465e-05, 'accenture': 4.3529360553693465e-05, 'unusual': 4.3529360553693465e-05, 'buyer': 4.3529360553693465e-05, '08718723815': 8.705872110738693e-05, 'everythin': 4.3529360553693465e-05, 'sugababes': 8.705872110738693e-05, 'flirtparty': 8.705872110738693e-05, 'yorge': 4.3529360553693465e-05, 'engagement': 4.3529360553693465e-05, 'collection': 0.0009141165716275627, 'thru': 4.3529360553693465e-05, 'events': 4.3529360553693465e-05, 'ikno': 4.3529360553693465e-05, 'fraction': 8.705872110738693e-05, 'necessary': 4.3529360553693465e-05, 'wit': 4.3529360553693465e-05, 'mag': 4.3529360553693465e-05, 'studying': 4.3529360553693465e-05, 'ec2a': 0.0001305880816610804, 'christians': 4.3529360553693465e-05, 'accumulation': 4.3529360553693465e-05, 'adam': 0.0001305880816610804, 'row': 0.000652940408305402, 'cheat': 4.3529360553693465e-05, 'wildlife': 4.3529360553693465e-05, 'for': 0.006703521525268794, 'male': 
0.00017411744221477386, 'bids': 8.705872110738693e-05, 'tohar': 4.3529360553693465e-05, 'deficient': 4.3529360553693465e-05, 'gnt': 4.3529360553693465e-05, 'coz': 4.3529360553693465e-05, 'if': 0.001305880816610804, 'anybody': 4.3529360553693465e-05, 'forms': 4.3529360553693465e-05, 'gep': 4.3529360553693465e-05, '5wb': 0.0003482348844295477, 'tag': 4.3529360553693465e-05, 'mcfly': 8.705872110738693e-05, 'befor': 4.3529360553693465e-05, 'problems': 4.3529360553693465e-05, 'symptoms': 4.3529360553693465e-05, 'deal': 4.3529360553693465e-05, 'toa': 4.3529360553693465e-05, 'persolvo': 8.705872110738693e-05, 'particularly': 4.3529360553693465e-05, 'portions': 4.3529360553693465e-05, 'orig': 8.705872110738693e-05, 'ericson': 4.3529360553693465e-05, 'rejected': 4.3529360553693465e-05, '2bold': 4.3529360553693465e-05, 'flyng': 8.705872110738693e-05, 'availa': 8.705872110738693e-05, 'china': 4.3529360553693465e-05, '09063458130': 0.0001305880816610804, 'misfits': 4.3529360553693465e-05, 'sign': 8.705872110738693e-05, 'datebox1282essexcm61xn': 8.705872110738693e-05, 'lyk': 4.3529360553693465e-05, 'gon': 4.3529360553693465e-05, 'crammed': 4.3529360553693465e-05, 'see': 0.0006094110477517085, 'site': 4.3529360553693465e-05, 'sos': 4.3529360553693465e-05, 'dogg': 4.3529360553693465e-05, 'upping': 4.3529360553693465e-05, 'watts': 4.3529360553693465e-05, 'fighting': 4.3529360553693465e-05, 'now1': 8.705872110738693e-05, 'amplikater': 4.3529360553693465e-05, 'doing': 4.3529360553693465e-05, 'speedchat': 0.0001305880816610804, 'workout': 4.3529360553693465e-05, 'ny': 8.705872110738693e-05, 'broken': 4.3529360553693465e-05, 'player': 0.0004352936055369347, 'loans': 8.705872110738693e-05, 'ctagg': 4.3529360553693465e-05, 'physics': 4.3529360553693465e-05, 'sim': 4.3529360553693465e-05, 'evng': 4.3529360553693465e-05, 'relation': 4.3529360553693465e-05, 'lotr': 0.0001305880816610804, 'kiss': 4.3529360553693465e-05, 'boye': 4.3529360553693465e-05, 'archive': 4.3529360553693465e-05, 
'acc': 4.3529360553693465e-05, 'morn': 4.3529360553693465e-05, 'brdget': 4.3529360553693465e-05, 'babes': 0.0001305880816610804, 'bloody': 4.3529360553693465e-05, 'club4mobiles': 8.705872110738693e-05, 'predictive': 4.3529360553693465e-05, 'required': 4.3529360553693465e-05, 'jules': 4.3529360553693465e-05, 'harlem': 4.3529360553693465e-05, 'discreet': 0.0001305880816610804, 'blah': 4.3529360553693465e-05, 'wo': 4.3529360553693465e-05, 'contention': 4.3529360553693465e-05, 'removal': 0.0001305880816610804, 'fingers': 4.3529360553693465e-05, '83049': 8.705872110738693e-05, 'she': 4.3529360553693465e-05, 'pobox36504w45wq': 0.0002611761633221608, 'howu': 4.3529360553693465e-05, 'billing': 8.705872110738693e-05, 'mo': 4.3529360553693465e-05, 'phb1': 8.705872110738693e-05, '09099726395': 8.705872110738693e-05, 'mat': 8.705872110738693e-05, 'gbp': 0.0001305880816610804, 'm8s': 0.0001305880816610804, 'dime': 4.3529360553693465e-05, '92h': 8.705872110738693e-05, 'commercial': 4.3529360553693465e-05, 'mine': 4.3529360553693465e-05, 'internet': 8.705872110738693e-05, 'wither': 8.705872110738693e-05, 'brain': 4.3529360553693465e-05, 'appointment': 4.3529360553693465e-05, 'skilgme': 0.00017411744221477386, 'thats': 4.3529360553693465e-05, 'ihave': 4.3529360553693465e-05, 'students': 4.3529360553693465e-05, '89938': 8.705872110738693e-05, 'dryer': 4.3529360553693465e-05, 'uve': 8.705872110738693e-05, 'ure': 4.3529360553693465e-05, 'neglect': 4.3529360553693465e-05, 'onwards': 4.3529360553693465e-05, 'pushes': 4.3529360553693465e-05, 'babyjontet': 4.3529360553693465e-05, 'witot': 4.3529360553693465e-05, 'outrageous': 4.3529360553693465e-05, '83118': 8.705872110738693e-05, 'showered': 4.3529360553693465e-05, 'common': 4.3529360553693465e-05, 'messenger': 4.3529360553693465e-05, 'é': 4.3529360553693465e-05, 'pushbutton': 8.705872110738693e-05, 'corporation': 4.3529360553693465e-05, 'tke': 4.3529360553693465e-05, 'eng': 0.00021764680276846734, 'quite': 4.3529360553693465e-05, 
'scrappy': 4.3529360553693465e-05, 'atm': 4.3529360553693465e-05, 'flights': 0.00017411744221477386, 'ones': 0.0001305880816610804, 'onlyfound': 4.3529360553693465e-05, 'crore': 4.3529360553693465e-05, 'normally': 4.3529360553693465e-05, 'g': 0.0003482348844295477, 'oops': 4.3529360553693465e-05, 'inch': 4.3529360553693465e-05, '3x': 8.705872110738693e-05, 'belly': 4.3529360553693465e-05, 'slightly': 4.3529360553693465e-05, 'looks': 4.3529360553693465e-05, 'embarassing': 4.3529360553693465e-05, 'hyde': 4.3529360553693465e-05, 'conacted': 8.705872110738693e-05, 'fromm': 0.00017411744221477386, 'rajas': 4.3529360553693465e-05, 'tsandcs': 0.0001305880816610804, 'mquiz': 8.705872110738693e-05, 'reserved': 4.3529360553693465e-05, 'unfortuntly': 4.3529360553693465e-05, 'ours': 4.3529360553693465e-05, '08712405020': 0.0001305880816610804, 'computers': 4.3529360553693465e-05, 'h': 4.3529360553693465e-05, '33': 8.705872110738693e-05, 'v': 0.0002611761633221608, '2mro': 4.3529360553693465e-05, 'mahfuuz': 4.3529360553693465e-05, 'simulate': 4.3529360553693465e-05, 'gaze': 4.3529360553693465e-05, 'app': 0.00017411744221477386, 'lennon': 4.3529360553693465e-05, 'initiate': 4.3529360553693465e-05, 'jazz': 4.3529360553693465e-05, 'skyped': 4.3529360553693465e-05, 'mokka': 4.3529360553693465e-05, 'dependable': 4.3529360553693465e-05, 'stitch': 4.3529360553693465e-05, 'adjustable': 4.3529360553693465e-05, 'lovers': 4.3529360553693465e-05, 'gist': 4.3529360553693465e-05, 'search': 0.0001305880816610804, 'ppm': 0.0001305880816610804, 'complete': 4.3529360553693465e-05, 'frnd': 0.00021764680276846734, 'cry': 4.3529360553693465e-05, 'along': 8.705872110738693e-05, 'kicks': 4.3529360553693465e-05, 'nipost': 4.3529360553693465e-05, 'kissing': 4.3529360553693465e-05, 'print': 4.3529360553693465e-05, 'cancer': 4.3529360553693465e-05, '3uz': 8.705872110738693e-05, 'lanka': 4.3529360553693465e-05, 'conected': 4.3529360553693465e-05, 'gentleman': 4.3529360553693465e-05, 'wife': 
8.705872110738693e-05, 'veggie': 4.3529360553693465e-05, 'bishan': 4.3529360553693465e-05, 'pocketbabe': 0.0001305880816610804, 'remember': 4.3529360553693465e-05, 'robs': 4.3529360553693465e-05, 'suffering': 4.3529360553693465e-05, 'urgently': 4.3529360553693465e-05, 'pub': 8.705872110738693e-05, 'tomorw': 4.3529360553693465e-05, 'announcement': 0.0002611761633221608, 'spose': 4.3529360553693465e-05, 'cme': 4.3529360553693465e-05, 'terrific': 4.3529360553693465e-05, 'here': 0.00021764680276846734, 'lo': 4.3529360553693465e-05, 'amrita': 4.3529360553693465e-05, 'lay': 4.3529360553693465e-05, 'herself': 0.0001305880816610804, 'argh': 4.3529360553693465e-05, 'fudge': 4.3529360553693465e-05, 'town': 0.00017411744221477386, 'tests': 4.3529360553693465e-05, 'm': 0.0007399991294127889, 'callin': 4.3529360553693465e-05, 'gucci': 4.3529360553693465e-05, 'vivek': 4.3529360553693465e-05, 'aight': 4.3529360553693465e-05, 'deviousbitch': 4.3529360553693465e-05, 'shirts': 4.3529360553693465e-05, 'roommates': 4.3529360553693465e-05, 'nite': 4.3529360553693465e-05, '2end': 0.0001305880816610804, 'dat': 4.3529360553693465e-05, 'kaiez': 4.3529360553693465e-05, 'henry': 0.0001305880816610804, 'silver': 4.3529360553693465e-05, 'on': 0.004918817742567362, '88222': 0.0001305880816610804, 'urself': 4.3529360553693465e-05, 'flood': 4.3529360553693465e-05, 'propsd': 4.3529360553693465e-05, 'db': 4.3529360553693465e-05, '25': 0.00017411744221477386, 'tv': 0.0004352936055369347, 'expected': 4.3529360553693465e-05, 'basq': 4.3529360553693465e-05, 'nuther': 4.3529360553693465e-05, 'radiator': 4.3529360553693465e-05, 'well': 0.0002611761633221608, 'relationship': 4.3529360553693465e-05, 'safely': 4.3529360553693465e-05, 'dao': 4.3529360553693465e-05, 'deus': 4.3529360553693465e-05, 'malaria': 4.3529360553693465e-05, '40533': 8.705872110738693e-05, 'hand': 4.3529360553693465e-05, 'xxsp': 8.705872110738693e-05, 'wan2': 8.705872110738693e-05, 'tablet': 4.3529360553693465e-05, 'understanding': 
4.3529360553693465e-05, 'swiss': 4.3529360553693465e-05, 'vote': 8.705872110738693e-05, 'terrible': 4.3529360553693465e-05, 'sat': 0.0001305880816610804, 'bangb': 8.705872110738693e-05, 'fat': 4.3529360553693465e-05, 'constantly': 4.3529360553693465e-05, 'stopped': 4.3529360553693465e-05, 'yep': 4.3529360553693465e-05, 'ptbo': 4.3529360553693465e-05, 'xchat': 0.00017411744221477386, 'position': 4.3529360553693465e-05, 'serena': 4.3529360553693465e-05, 'pixels': 4.3529360553693465e-05, 'bb': 4.3529360553693465e-05, 'um': 4.3529360553693465e-05, '077xxx': 8.705872110738693e-05, '0871': 0.00017411744221477386, 'mountains': 4.3529360553693465e-05, 'meat': 4.3529360553693465e-05, 'cloud': 4.3529360553693465e-05, 'allowed': 4.3529360553693465e-05, 'rolled': 4.3529360553693465e-05, 'influx': 4.3529360553693465e-05, 'thesedays': 4.3529360553693465e-05, '88066': 8.705872110738693e-05, '1000s': 0.0001305880816610804, 'violated': 4.3529360553693465e-05, 'gopalettan': 4.3529360553693465e-05, 'knw': 4.3529360553693465e-05, 'london': 0.00021764680276846734, 'opt': 0.0006094110477517085, 'east': 4.3529360553693465e-05, 'barred': 4.3529360553693465e-05, 'diapers': 4.3529360553693465e-05, 'satanic': 4.3529360553693465e-05, 'upon': 4.3529360553693465e-05, 'enjoying': 4.3529360553693465e-05, 'shanghai': 4.3529360553693465e-05, 'option': 4.3529360553693465e-05, 'tallahassee': 4.3529360553693465e-05, 'preferably': 4.3529360553693465e-05, 'wrnog': 4.3529360553693465e-05, 'songs': 4.3529360553693465e-05, 'unless': 4.3529360553693465e-05, '4fil': 0.0001305880816610804, 'itz': 4.3529360553693465e-05, 'tell': 0.0006094110477517085, 'baig': 4.3529360553693465e-05, 'president': 8.705872110738693e-05, 'shivratri': 4.3529360553693465e-05, 'callfreefone': 8.705872110738693e-05, 'within': 0.00030470552387585427, 'sexy': 0.0006094110477517085, 'videophones': 0.00021764680276846734, '50ea': 8.705872110738693e-05, 'speak': 0.0003482348844295477, 'province': 4.3529360553693465e-05, 'ummma': 
4.3529360553693465e-05, 'clos1': 4.3529360553693465e-05, 'short': 4.3529360553693465e-05, 'george': 8.705872110738693e-05, 'restrict': 4.3529360553693465e-05, 'payed2day': 4.3529360553693465e-05, 'sink': 4.3529360553693465e-05, 'korean': 4.3529360553693465e-05, 'blocked': 4.3529360553693465e-05, 'ringtones': 0.00039176424498324117, 'ree': 8.705872110738693e-05, 'elaborate': 4.3529360553693465e-05, 'pose': 4.3529360553693465e-05, 'this': 0.003090584599312236, 'stand': 4.3529360553693465e-05, 'enjoy': 0.0004352936055369347, 'craziest': 4.3529360553693465e-05, 'love': 0.0004352936055369347, 'some1': 4.3529360553693465e-05, 'audrie': 4.3529360553693465e-05, 'treatin': 4.3529360553693465e-05, '2morro': 8.705872110738693e-05, 'mobsi': 8.705872110738693e-05, 'athletic': 4.3529360553693465e-05, 'log': 0.00030470552387585427, 'faith': 4.3529360553693465e-05, 'trains': 4.3529360553693465e-05, 'uploaded': 4.3529360553693465e-05, 'box39822': 0.00021764680276846734, 'requires': 4.3529360553693465e-05, 'downloads': 0.0001305880816610804, 'voila': 4.3529360553693465e-05, 'outstanding': 4.3529360553693465e-05, 'beggar': 4.3529360553693465e-05, 'console': 8.705872110738693e-05, 'unsub': 0.00030470552387585427, 'ibhltd': 0.00017411744221477386, 'hands': 4.3529360553693465e-05, '7876150ppm': 8.705872110738693e-05, 'panasonic': 8.705872110738693e-05, 'cos': 4.3529360553693465e-05, '09066358152': 0.0001305880816610804, 'regarding': 4.3529360553693465e-05, 'second': 4.3529360553693465e-05, '75': 0.0001305880816610804, 'coming': 4.3529360553693465e-05, 'anand': 4.3529360553693465e-05, 'social': 4.3529360553693465e-05, 'printed': 4.3529360553693465e-05, 'early': 4.3529360553693465e-05, 'driver': 4.3529360553693465e-05, 'rang': 4.3529360553693465e-05, 'wrc': 0.0001305880816610804, '09064018838': 8.705872110738693e-05, 'sort': 8.705872110738693e-05, 'manege': 4.3529360553693465e-05, 'noooooooo': 4.3529360553693465e-05, 'dehydration': 4.3529360553693465e-05, '86688': 0.0007399991294127889, 
'400thousad': 4.3529360553693465e-05, 'election': 4.3529360553693465e-05, 'clean': 4.3529360553693465e-05, 'granite': 0.0001305880816610804, 'box61': 8.705872110738693e-05, 'apt': 4.3529360553693465e-05, 'wed': 8.705872110738693e-05, 'ali': 4.3529360553693465e-05, 'english': 4.3529360553693465e-05, 'id': 0.00017411744221477386, 'fuelled': 4.3529360553693465e-05, 'matches': 0.00039176424498324117, 'unsold': 4.3529360553693465e-05, 'nìte': 4.3529360553693465e-05, 'wrk': 4.3529360553693465e-05, 'joys': 4.3529360553693465e-05, '08709222922': 0.0001305880816610804, 'xclusive': 8.705872110738693e-05, 'unmits': 4.3529360553693465e-05, 'renewal': 8.705872110738693e-05, 'onam': 4.3529360553693465e-05, 'surprise': 0.00021764680276846734, 'fake': 4.3529360553693465e-05, 'buffet': 4.3529360553693465e-05, 'decades': 4.3529360553693465e-05, '2day': 0.00017411744221477386, 'intentions': 4.3529360553693465e-05, 'live': 0.0011752927349497236, 'poo': 4.3529360553693465e-05, 'follow': 8.705872110738693e-05, 'sweetheart': 4.3529360553693465e-05, 'ecstasy': 4.3529360553693465e-05, 'dust': 4.3529360553693465e-05, 'theplace': 4.3529360553693465e-05, 'spelled': 4.3529360553693465e-05, 'courage': 4.3529360553693465e-05, 'twins': 4.3529360553693465e-05, 'stuff42moro': 4.3529360553693465e-05, '1winaweek': 0.0001305880816610804, 'planning': 4.3529360553693465e-05, '83222': 0.00017411744221477386, '09061701851': 8.705872110738693e-05, 'fair': 4.3529360553693465e-05, 'cheap': 0.00021764680276846734, 'felt': 4.3529360553693465e-05, 'sneham': 4.3529360553693465e-05, '09053750005': 8.705872110738693e-05, 'potential': 8.705872110738693e-05, 'feb': 8.705872110738693e-05, 'lets': 8.705872110738693e-05, 'certainly': 4.3529360553693465e-05, 'argentina': 4.3529360553693465e-05, 'result': 4.3529360553693465e-05, 'handing': 4.3529360553693465e-05, 'draw': 0.001305880816610804, 'browse': 8.705872110738693e-05, 'nevr': 4.3529360553693465e-05, 'oyster': 4.3529360553693465e-05, 'hockey': 
8.705872110738693e-05, 'tolerance': 4.3529360553693465e-05, 'complaint': 4.3529360553693465e-05, 'tot': 4.3529360553693465e-05, 'sehwag': 4.3529360553693465e-05, 'game': 0.00030470552387585427, 'talking': 4.3529360553693465e-05, 'wotz': 4.3529360553693465e-05, 'ge': 4.3529360553693465e-05, 'y': 4.3529360553693465e-05, 'hopeful': 4.3529360553693465e-05, 'hesitate': 4.3529360553693465e-05, 'tom': 8.705872110738693e-05, 'renewing': 8.705872110738693e-05, 'mind': 8.705872110738693e-05, 'long': 4.3529360553693465e-05, '89693': 0.0001305880816610804, 'sleeping': 4.3529360553693465e-05, 'nottingham': 4.3529360553693465e-05, 'lab': 4.3529360553693465e-05, 'congrats': 0.0004352936055369347, 'thy': 4.3529360553693465e-05, 'hcl': 4.3529360553693465e-05, 'surrounded': 4.3529360553693465e-05, 'street': 8.705872110738693e-05, 'simonwatson5120': 8.705872110738693e-05, 'sterm': 4.3529360553693465e-05, 'yck': 4.3529360553693465e-05, 'exe': 4.3529360553693465e-05, 'cough': 4.3529360553693465e-05, 'some': 0.0002611761633221608, 'science': 4.3529360553693465e-05, 'stalking': 4.3529360553693465e-05, 'occasion': 4.3529360553693465e-05, 'spouse': 4.3529360553693465e-05, 'carlin': 4.3529360553693465e-05, 'sullivan': 8.705872110738693e-05, 'undrstnd': 4.3529360553693465e-05, 'soil': 4.3529360553693465e-05, 'nicky': 4.3529360553693465e-05, 'textin': 4.3529360553693465e-05, 'jenxxx': 4.3529360553693465e-05, 'eg': 0.0004352936055369347, 'gone': 4.3529360553693465e-05, 'sabarish': 4.3529360553693465e-05, 'shldxxxx': 4.3529360553693465e-05, '8wp': 0.0001305880816610804, 'bam': 4.3529360553693465e-05, 'thing': 8.705872110738693e-05, 'butting': 4.3529360553693465e-05, 'laughing': 4.3529360553693465e-05, '750': 0.0007399991294127889, 'ts': 0.0004352936055369347, '2nd': 0.0008270578505201758, 'wouldn': 4.3529360553693465e-05, 'goodnite': 4.3529360553693465e-05, 'percent': 4.3529360553693465e-05, ...}
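The full dictionary is hard to read when printed as a whole, so it can help to look at only the highest-probability words. Below is a minimal sketch, assuming `spam_prob` is a plain dict mapping each word to its probability, as the output above suggests. `top_words` and `sample_spam_prob` are hypothetical names introduced for illustration; the sample holds a handful of values copied from the output above, not the full dictionary.

```python
def top_words(prob, n=5):
    """Return the n (word, probability) pairs with the highest probability."""
    # Sort by probability, largest first, and keep the first n pairs.
    return sorted(prob.items(), key=lambda kv: kv[1], reverse=True)[:n]


# A few entries copied from the spam_prob output above (illustrative subset only).
sample_spam_prob = {
    'on': 0.004918817742567362,
    'this': 0.003090584599312236,
    'draw': 0.001305880816610804,
    'live': 0.0011752927349497236,
    'scrappy': 4.3529360553693465e-05,
}

# Print the three most probable words in the sample.
for word, p in top_words(sample_spam_prob, n=3):
    print(f"{word}: {p:.6f}")
```

The same helper could be called on the real dictionaries, for example `top_words(spam_prob, 20)`, to compare which words carry the most weight in each class.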
non_spam_prob
{'lifebook': 3.075976622577668e-05, 'chef': 3.075976622577668e-05, 'studyn': 3.075976622577668e-05, 'aburo': 4.6139649338665025e-05, 'resort': 3.075976622577668e-05, 'betta': 3.075976622577668e-05, 'consensus': 3.075976622577668e-05, 'sathy': 3.075976622577668e-05, '08712460324': 1.537988311288834e-05, 'oja': 1.537988311288834e-05, 'pop': 4.6139649338665025e-05, 'open': 0.0001845585973546601, 'receipts': 4.6139649338665025e-05, 'definitely': 9.227929867733005e-05, 'water': 0.00013841894801599507, '8007': 1.537988311288834e-05, 'guys': 0.0004306367271608736, 'samus': 3.075976622577668e-05, 'gibbs': 3.075976622577668e-05, 'lyfu': 4.6139649338665025e-05, 'textbuddy': 1.537988311288834e-05, '60': 3.075976622577668e-05, '05': 1.537988311288834e-05, 'hv': 6.151953245155337e-05, 'cloth': 3.075976622577668e-05, 'executive': 3.075976622577668e-05, 'major': 6.151953245155337e-05, 'ability': 3.075976622577668e-05, 'hubby': 3.075976622577668e-05, 'rentl': 1.537988311288834e-05, 'buzy': 6.151953245155337e-05, 'sweetest': 6.151953245155337e-05, 'goverment': 3.075976622577668e-05, 'wow': 0.00013841894801599507, 'psychic': 1.537988311288834e-05, 'comb': 3.075976622577668e-05, 'singles': 1.537988311288834e-05, 'monoc': 1.537988311288834e-05, 'beauty': 3.075976622577668e-05, 'tmorrow': 3.075976622577668e-05, '09061701444': 1.537988311288834e-05, 'huai': 3.075976622577668e-05, 'finishes': 6.151953245155337e-05, 'whatever': 0.00019993848046754843, 'such': 3.075976622577668e-05, 'reception': 3.075976622577668e-05, 'dancing': 4.6139649338665025e-05, 'proove': 3.075976622577668e-05, 'turns': 9.227929867733005e-05, 'dependents': 3.075976622577668e-05, 'wan': 0.0006920947400799754, 'completed': 4.6139649338665025e-05, 'fieldof': 3.075976622577668e-05, 'recd': 1.537988311288834e-05, 'often': 6.151953245155337e-05, 'blind': 1.537988311288834e-05, 'heal': 3.075976622577668e-05, 'million': 3.075976622577668e-05, 'reward': 1.537988311288834e-05, 'amazing': 6.151953245155337e-05, 'cl': 
3.075976622577668e-05, 'haha': 0.0005998154414026453, 'warranty': 1.537988311288834e-05, 'maximize': 1.537988311288834e-05, 'woohoo': 3.075976622577668e-05, 'hasbro': 3.075976622577668e-05, 'fireplace': 3.075976622577668e-05, 'waht': 3.075976622577668e-05, '9758': 1.537988311288834e-05, 'twilight': 1.537988311288834e-05, 'done': 0.0005690556751768687, 'questions': 0.0001076591817902184, 'adp': 1.537988311288834e-05, 'andres': 3.075976622577668e-05, 'sf': 1.537988311288834e-05, '2006': 1.537988311288834e-05, 'tasts': 3.075976622577668e-05, 'showing': 9.227929867733005e-05, 'salam': 4.6139649338665025e-05, 'unfortunately': 4.6139649338665025e-05, 'drinkin': 4.6139649338665025e-05, 'sense': 9.227929867733005e-05, 'musthu': 3.075976622577668e-05, 'amnow': 3.075976622577668e-05, 'bluff': 4.6139649338665025e-05, 'canada': 4.6139649338665025e-05, 'singapore': 3.075976622577668e-05, 'trusting': 3.075976622577668e-05, 'art': 6.151953245155337e-05, 'loud': 4.6139649338665025e-05, 'enter': 4.6139649338665025e-05, 'sufficient': 3.075976622577668e-05, 'dining': 1.537988311288834e-05, '36504': 1.537988311288834e-05, 'things': 0.0005998154414026453, 'margin': 1.537988311288834e-05, 'addie': 6.151953245155337e-05, 'slowing': 3.075976622577668e-05, 'railway': 6.151953245155337e-05, 'thursday': 9.227929867733005e-05, 'styles': 3.075976622577668e-05, 'hill': 6.151953245155337e-05, 'prakesh': 3.075976622577668e-05, 'pobox365o4w45wq': 1.537988311288834e-05, 'yhl': 1.537988311288834e-05, 'issues': 3.075976622577668e-05, 'rofl': 4.6139649338665025e-05, '4brekkie': 3.075976622577668e-05, 'dobby': 3.075976622577668e-05, 'ileave': 3.075976622577668e-05, 'platt': 3.075976622577668e-05, 'vl': 6.151953245155337e-05, 'tok': 3.075976622577668e-05, 'diff': 6.151953245155337e-05, '1327': 1.537988311288834e-05, 'confidence': 9.227929867733005e-05, 'rules': 3.075976622577668e-05, 'workin': 0.00013841894801599507, 'inconsiderate': 3.075976622577668e-05, 'nattil': 3.075976622577668e-05, 'soundtrack': 
1.537988311288834e-05, 'fox': 3.075976622577668e-05, 'sore': 7.689941556444171e-05, 'im': 0.0009996924023377423, 'solved': 4.6139649338665025e-05, 'thm': 4.6139649338665025e-05, 'ecstacy': 1.537988311288834e-05, 'tx': 3.075976622577668e-05, 'efreefone': 1.537988311288834e-05, 'fans': 1.537988311288834e-05, '4few': 3.075976622577668e-05, 'census': 3.075976622577668e-05, 'improved': 1.537988311288834e-05, 'bet': 6.151953245155337e-05, 'note': 7.689941556444171e-05, 'jorge': 1.537988311288834e-05, 'icmb3cktz8r7': 1.537988311288834e-05, 'lik': 7.689941556444171e-05, 'resizing': 3.075976622577668e-05, 'lucyxx': 3.075976622577668e-05, 'accordin': 3.075976622577668e-05, 'spinout': 3.075976622577668e-05, 'sundayish': 3.075976622577668e-05, 'retrieve': 1.537988311288834e-05, 'portege': 3.075976622577668e-05, 'fetching': 3.075976622577668e-05, '07946746291': 1.537988311288834e-05, 'pieces': 6.151953245155337e-05, 'fones': 1.537988311288834e-05, 'must': 0.00023069824669332513, 'pert': 3.075976622577668e-05, 'hun': 7.689941556444171e-05, 'ghodbandar': 3.075976622577668e-05, 'ramen': 3.075976622577668e-05, 'teletext': 1.537988311288834e-05, 'blur': 4.6139649338665025e-05, 'favor': 6.151953245155337e-05, 'woah': 3.075976622577668e-05, 'speed': 4.6139649338665025e-05, 'tones2u': 1.537988311288834e-05, 'apologise': 4.6139649338665025e-05, 'plumbers': 3.075976622577668e-05, 'dl': 3.075976622577668e-05, 'premier': 1.537988311288834e-05, 'realise': 3.075976622577668e-05, 'bullshit': 3.075976622577668e-05, 'px3748': 3.075976622577668e-05, '3gbp': 1.537988311288834e-05, 'subpoly': 1.537988311288834e-05, 'musicnews': 1.537988311288834e-05, 'opted': 3.075976622577668e-05, 'havn': 3.075976622577668e-05, 'off': 0.0006459550907413103, 'there': 0.0026760996616425714, '150ppmpobox10183bhamb64xe': 1.537988311288834e-05, 'ntt': 1.537988311288834e-05, 'themes': 3.075976622577668e-05, 'nino': 3.075976622577668e-05, 'hmm': 0.0002153183635804368, 'giving': 9.227929867733005e-05, 'north': 
3.075976622577668e-05, 'email': 0.00016917871424177176, 'bcs': 3.075976622577668e-05, 'quizclub': 1.537988311288834e-05, 'slo': 1.537988311288834e-05, '300': 1.537988311288834e-05, 'got': 0.002660719778529683, '4d': 3.075976622577668e-05, 'la3': 1.537988311288834e-05, 'pride': 4.6139649338665025e-05, '6': 0.0005229160258382037, 'teasing': 0.0001076591817902184, 'kiosk': 1.537988311288834e-05, 'xmas': 0.0001845585973546601, 'ntimate': 3.075976622577668e-05, 'icic': 3.075976622577668e-05, 'leaving': 0.00027683789603199013, 'versus': 3.075976622577668e-05, 'isaiah': 3.075976622577668e-05, 'aft': 0.00027683789603199013, 'irulinae': 3.075976622577668e-05, 'tiny': 3.075976622577668e-05, 'problum': 3.075976622577668e-05, 'f4q': 1.537988311288834e-05, 'fooled': 3.075976622577668e-05, 'std': 1.537988311288834e-05, 'possession': 4.6139649338665025e-05, 'chillin': 6.151953245155337e-05, 'shld': 3.075976622577668e-05, 'busty': 1.537988311288834e-05, '6669': 1.537988311288834e-05, 'shoot': 7.689941556444171e-05, 'sucks': 0.00012303906490310673, 'cum': 0.0001076591817902184, 'king': 0.0001076591817902184, 'arcade': 1.537988311288834e-05, 'glo': 3.075976622577668e-05, 'liked': 7.689941556444171e-05, 'euro2004': 1.537988311288834e-05, 'twittering': 3.075976622577668e-05, 'thnx': 3.075976622577668e-05, 'losing': 6.151953245155337e-05, 'fails': 4.6139649338665025e-05, 'agidhane': 3.075976622577668e-05, 'wins': 6.151953245155337e-05, 'med': 1.537988311288834e-05, 'guitar': 3.075976622577668e-05, 'freezing': 6.151953245155337e-05, 'class': 0.0005229160258382037, 'handset': 1.537988311288834e-05, '250': 1.537988311288834e-05, 'west': 4.6139649338665025e-05, '128': 3.075976622577668e-05, 'local': 1.537988311288834e-05, 'munsters': 1.537988311288834e-05, 'exorcist': 1.537988311288834e-05, 'jod': 3.075976622577668e-05, 'scoring': 1.537988311288834e-05, 'everyone': 0.0001845585973546601, 'misss': 3.075976622577668e-05, 'priscilla': 3.075976622577668e-05, '3qxj9': 1.537988311288834e-05, 
'fantastic': 4.6139649338665025e-05, 'mutations': 3.075976622577668e-05, 'swann': 3.075976622577668e-05, 'morefrmmob': 1.537988311288834e-05, 'gsoh': 1.537988311288834e-05, 'dance': 4.6139649338665025e-05, 'sw7': 1.537988311288834e-05, 'lovin': 3.075976622577668e-05, 'twelve': 4.6139649338665025e-05, 'shaved': 3.075976622577668e-05, 'pizza': 0.00013841894801599507, 'daywith': 3.075976622577668e-05, 'bird': 3.075976622577668e-05, 'angry': 0.0002153183635804368, 'turkeys': 3.075976622577668e-05, 'darkest': 1.537988311288834e-05, 'kip': 3.075976622577668e-05, '09061221061': 1.537988311288834e-05, 'multimedia': 3.075976622577668e-05, 'didnt': 0.00038449707782220856, 'stops': 4.6139649338665025e-05, 'education': 3.075976622577668e-05, '1450': 1.537988311288834e-05, 'christmassy': 3.075976622577668e-05, 'eppolum': 3.075976622577668e-05, 'outgoing': 1.537988311288834e-05, 'violence': 6.151953245155337e-05, 'bslvyl': 0.0001845585973546601, '08707500020': 1.537988311288834e-05, 'maneesha': 6.151953245155337e-05, 'algorithms': 3.075976622577668e-05, 'completely': 0.00012303906490310673, 'bluray': 3.075976622577668e-05, 'childrens': 3.075976622577668e-05, '08718720201': 1.537988311288834e-05, 'guesses': 3.075976622577668e-05, '220': 1.537988311288834e-05, 'chinky': 3.075976622577668e-05, 'chuckin': 3.075976622577668e-05, 'soc': 3.075976622577668e-05, 'rcd': 1.537988311288834e-05, '07786200117': 1.537988311288834e-05, 'monos': 1.537988311288834e-05, 'individual': 3.075976622577668e-05, 'rush': 6.151953245155337e-05, 'juicy': 6.151953245155337e-05, 'telephone': 3.075976622577668e-05, 'il': 0.00013841894801599507, 'paris': 1.537988311288834e-05, '08709501522': 1.537988311288834e-05, 'daytime': 1.537988311288834e-05, 'asking': 0.00013841894801599507, 'ranjith': 6.151953245155337e-05, 'entropication': 3.075976622577668e-05, 'bud': 7.689941556444171e-05, 'fassyole': 3.075976622577668e-05, 'clocks': 3.075976622577668e-05, 'bilo': 3.075976622577668e-05, 'most': 0.0002922177791448785, 
'80182': 1.537988311288834e-05, 'dem': 6.151953245155337e-05, 'astronomer': 3.075976622577668e-05, 'analysis': 3.075976622577668e-05, 'helens': 3.075976622577668e-05, 'abi': 6.151953245155337e-05, 'ml': 3.075976622577668e-05, 'permissions': 3.075976622577668e-05, 'dignity': 4.6139649338665025e-05, 'duo': 3.075976622577668e-05, '4719': 1.537988311288834e-05, 'was': 0.0028452783758843433, '50perwksub': 1.537988311288834e-05, 'somtimes': 3.075976622577668e-05, 'edge': 4.6139649338665025e-05, 'flat': 7.689941556444171e-05, '69696': 1.537988311288834e-05, 'tp': 3.075976622577668e-05, 'chinese': 7.689941556444171e-05, 'rcv': 1.537988311288834e-05, 'careers': 1.537988311288834e-05, 'da': 0.0017379267917563826, 'dude': 0.0002614580129191018, 'declare': 3.075976622577668e-05, 'mom': 0.00019993848046754843, 'sef': 3.075976622577668e-05, 'march': 0.0001845585973546601, '10p': 1.537988311288834e-05, 'x': 0.0005690556751768687, 'sue': 1.537988311288834e-05, 'greeting': 3.075976622577668e-05, 'slow': 0.00013841894801599507, 'nevamind': 3.075976622577668e-05, 'tallent': 3.075976622577668e-05, '69969': 1.537988311288834e-05, 'wc1n3xx': 1.537988311288834e-05, 'bedroom': 0.00012303906490310673, 'sonyericsson': 1.537988311288834e-05, 'goes': 0.00033835742848354353, 'condition': 3.075976622577668e-05, '69101': 1.537988311288834e-05, 'forever': 0.0001076591817902184, 'wisheds': 3.075976622577668e-05, 'arranging': 3.075976622577668e-05, 'tongued': 3.075976622577668e-05, '7mp': 1.537988311288834e-05, 'fonin': 3.075976622577668e-05, 'chocolate': 4.6139649338665025e-05, 'trishul': 3.075976622577668e-05, '2waxsto': 4.6139649338665025e-05, 'formal': 3.075976622577668e-05, 'sic': 1.537988311288834e-05, 'trek': 3.075976622577668e-05, '08702490080': 1.537988311288834e-05, 'listener': 3.075976622577668e-05, 'kz': 4.6139649338665025e-05, 'meets': 4.6139649338665025e-05, '4years': 3.075976622577668e-05, 'past': 9.227929867733005e-05, 'consider': 3.075976622577668e-05, 'looking': 
0.00024607812980621346, 'indeed': 3.075976622577668e-05, 'leadership': 1.537988311288834e-05, 'houseful': 3.075976622577668e-05, 'yan': 6.151953245155337e-05, 'lambu': 3.075976622577668e-05, 'text': 0.0008458935712088588, 'useless': 3.075976622577668e-05, '2optout': 1.537988311288834e-05, '09111032124': 1.537988311288834e-05, 'mi': 3.075976622577668e-05, 'priority': 3.075976622577668e-05, 'poyyarikatur': 3.075976622577668e-05, 'soiree': 3.075976622577668e-05, 'greatness': 3.075976622577668e-05, 'temp': 6.151953245155337e-05, 'rite': 0.0002922177791448785, 'pouch': 4.6139649338665025e-05, 'fb': 0.0001076591817902184, 'relieved': 3.075976622577668e-05, 'chasing': 3.075976622577668e-05, 'gee': 7.689941556444171e-05, 'suggestions': 3.075976622577668e-05, 'dogbreath': 3.075976622577668e-05, 'settle': 4.6139649338665025e-05, 'rushing': 3.075976622577668e-05, 'sen': 7.689941556444171e-05, 'stash': 3.075976622577668e-05, 'imat': 3.075976622577668e-05, 'aaniye': 3.075976622577668e-05, '83021': 1.537988311288834e-05, 'srt': 3.075976622577668e-05, '69669': 1.537988311288834e-05, '1000call': 1.537988311288834e-05, 'notified': 1.537988311288834e-05, '08452810075over18': 1.537988311288834e-05, 'call09050000327': 1.537988311288834e-05, 'behalf': 3.075976622577668e-05, 'putting': 0.0001076591817902184, 'aig': 3.075976622577668e-05, 'responce': 4.6139649338665025e-05, '08002988890': 1.537988311288834e-05, 'alle': 3.075976622577668e-05, 'nagar': 3.075976622577668e-05, 'mgs': 1.537988311288834e-05, 'upstairs': 4.6139649338665025e-05, 'humans': 3.075976622577668e-05, '09061702893': 1.537988311288834e-05, 'notifications': 1.537988311288834e-05, 'captain': 6.151953245155337e-05, '09057039994': 1.537988311288834e-05, 'else': 0.0002922177791448785, 'sptv': 1.537988311288834e-05, 'age': 7.689941556444171e-05, 'wendy': 3.075976622577668e-05, 'lv': 3.075976622577668e-05, 'remove': 9.227929867733005e-05, 'wylie': 4.6139649338665025e-05, 'visionsms': 1.537988311288834e-05, 'fights': 
4.6139649338665025e-05, 'convey': 0.00013841894801599507, 'administrator': 4.6139649338665025e-05, 'ela': 4.6139649338665025e-05, 'placement': 4.6139649338665025e-05, 'classmates': 3.075976622577668e-05, 'pataistha': 3.075976622577668e-05, 'birds': 6.151953245155337e-05, 'messaging': 1.537988311288834e-05, 'rays': 9.227929867733005e-05, '09058091870': 1.537988311288834e-05, 'mobypobox734ls27yf': 1.537988311288834e-05, '391784': 1.537988311288834e-05, 'casualty': 3.075976622577668e-05, 'imma': 9.227929867733005e-05, 'flute': 3.075976622577668e-05, 'sagamu': 3.075976622577668e-05, 'ovarian': 3.075976622577668e-05, '430': 3.075976622577668e-05, 'raj': 4.6139649338665025e-05, 'allalo': 3.075976622577668e-05, 'domain': 3.075976622577668e-05, '7th': 7.689941556444171e-05, 'murderer': 0.00012303906490310673, 'unspoken': 3.075976622577668e-05, 'whispers': 3.075976622577668e-05, 'where': 0.0015072285450630576, 'icicibank': 4.6139649338665025e-05, 'digital': 1.537988311288834e-05, 'birth': 4.6139649338665025e-05, 'celebrations': 3.075976622577668e-05, 'gage': 1.537988311288834e-05, 'lounge': 3.075976622577668e-05, 'becaus': 3.075976622577668e-05, 'natural': 6.151953245155337e-05, '83600': 1.537988311288834e-05, 'sum1': 4.6139649338665025e-05, 'panties': 3.075976622577668e-05, 'inc': 1.537988311288834e-05, 'sp': 1.537988311288834e-05, 'nusstu': 3.075976622577668e-05, 'confirmd': 4.6139649338665025e-05, 'babygoodbye': 1.537988311288834e-05, 'timings': 3.075976622577668e-05, 'rocking': 3.075976622577668e-05, 'increments': 1.537988311288834e-05, 'jerry': 4.6139649338665025e-05, 'fixed': 0.0001076591817902184, 'premium': 1.537988311288834e-05, 'clear': 6.151953245155337e-05, 'lush': 4.6139649338665025e-05, 'added': 6.151953245155337e-05, 'slide': 3.075976622577668e-05, 'lucy': 1.537988311288834e-05, 'afternoons': 3.075976622577668e-05, 'urmom': 3.075976622577668e-05, 'chad': 3.075976622577668e-05, 'tired': 0.00016917871424177176, 'num': 9.227929867733005e-05, 'bridal': 
1.537988311288834e-05, 'references': 3.075976622577668e-05, 'anthony': 4.6139649338665025e-05, 'features': 1.537988311288834e-05, 'jabo': 3.075976622577668e-05, 'seem': 4.6139649338665025e-05, 'derek': 4.6139649338665025e-05, 'thank': 0.0003691171947093202, 'elephant': 3.075976622577668e-05, 'getsleep': 3.075976622577668e-05, 'vpod': 1.537988311288834e-05, 'mundhe': 3.075976622577668e-05, 'boytoy': 0.00023069824669332513, '9996': 1.537988311288834e-05, 'packing': 3.075976622577668e-05, 'important': 0.00016917871424177176, 'acknowledgement': 3.075976622577668e-05, 'perform': 3.075976622577668e-05, '69876': 1.537988311288834e-05, 'yahoo': 0.0001076591817902184, 'stylish': 9.227929867733005e-05, 'overdid': 3.075976622577668e-05, 'stream': 1.537988311288834e-05, 'wuld': 4.6139649338665025e-05, 'accenture': 3.075976622577668e-05, 'unusual': 3.075976622577668e-05, 'buyer': 3.075976622577668e-05, '08718723815': 1.537988311288834e-05, 'everythin': 3.075976622577668e-05, 'sugababes': 1.537988311288834e-05, 'flirtparty': 1.537988311288834e-05, 'yorge': 3.075976622577668e-05, 'engagement': 3.075976622577668e-05, 'collection': 1.537988311288834e-05, 'thru': 0.0001076591817902184, 'events': 3.075976622577668e-05, 'ikno': 3.075976622577668e-05, 'fraction': 1.537988311288834e-05, 'necessary': 4.6139649338665025e-05, 'wit': 0.0002153183635804368, 'mag': 4.6139649338665025e-05, 'studying': 0.00013841894801599507, 'ec2a': 1.537988311288834e-05, 'christians': 3.075976622577668e-05, 'accumulation': 3.075976622577668e-05, 'adam': 1.537988311288834e-05, 'row': 6.151953245155337e-05, 'cheat': 4.6139649338665025e-05, 'wildlife': 3.075976622577668e-05, 'for': 0.006336511842509997, 'male': 1.537988311288834e-05, 'bids': 1.537988311288834e-05, 'tohar': 3.075976622577668e-05, 'deficient': 3.075976622577668e-05, 'gnt': 4.6139649338665025e-05, 'coz': 0.00030759766225776686, 'if': 0.004260227622270071, 'anybody': 6.151953245155337e-05, 'forms': 3.075976622577668e-05, 'gep': 
3.075976622577668e-05, '5wb': 1.537988311288834e-05, 'tag': 3.075976622577668e-05, 'mcfly': 1.537988311288834e-05, 'befor': 6.151953245155337e-05, 'problems': 9.227929867733005e-05, 'symptoms': 3.075976622577668e-05, 'deal': 0.00013841894801599507, 'toa': 4.6139649338665025e-05, 'persolvo': 1.537988311288834e-05, 'particularly': 3.075976622577668e-05, 'portions': 3.075976622577668e-05, 'orig': 1.537988311288834e-05, 'ericson': 3.075976622577668e-05, 'rejected': 3.075976622577668e-05, '2bold': 3.075976622577668e-05, 'flyng': 1.537988311288834e-05, 'availa': 1.537988311288834e-05, 'china': 4.6139649338665025e-05, '09063458130': 1.537988311288834e-05, 'misfits': 3.075976622577668e-05, 'sign': 9.227929867733005e-05, 'datebox1282essexcm61xn': 1.537988311288834e-05, 'lyk': 4.6139649338665025e-05, 'gon': 3.075976622577668e-05, 'crammed': 3.075976622577668e-05, 'see': 0.0018302060904337126, 'site': 9.227929867733005e-05, 'sos': 3.075976622577668e-05, 'dogg': 3.075976622577668e-05, 'upping': 3.075976622577668e-05, 'watts': 3.075976622577668e-05, 'fighting': 9.227929867733005e-05, 'now1': 1.537988311288834e-05, 'amplikater': 3.075976622577668e-05, 'doing': 0.0010458320516764073, 'speedchat': 1.537988311288834e-05, 'workout': 3.075976622577668e-05, 'ny': 3.075976622577668e-05, 'broken': 3.075976622577668e-05, 'player': 6.151953245155337e-05, 'loans': 1.537988311288834e-05, 'ctagg': 3.075976622577668e-05, 'physics': 3.075976622577668e-05, 'sim': 9.227929867733005e-05, 'evng': 7.689941556444171e-05, 'relation': 7.689941556444171e-05, 'lotr': 3.075976622577668e-05, 'kiss': 0.00039987696093509687, 'boye': 4.6139649338665025e-05, 'archive': 3.075976622577668e-05, 'acc': 6.151953245155337e-05, 'morn': 6.151953245155337e-05, 'brdget': 3.075976622577668e-05, 'babes': 4.6139649338665025e-05, 'bloody': 4.6139649338665025e-05, 'club4mobiles': 1.537988311288834e-05, 'predictive': 3.075976622577668e-05, 'required': 3.075976622577668e-05, 'jules': 3.075976622577668e-05, 'harlem': 
3.075976622577668e-05, 'discreet': 1.537988311288834e-05, 'blah': 6.151953245155337e-05, 'wo': 4.6139649338665025e-05, 'contention': 3.075976622577668e-05, 'removal': 1.537988311288834e-05, 'fingers': 9.227929867733005e-05, '83049': 1.537988311288834e-05, 'she': 0.002122423869578591, 'pobox36504w45wq': 1.537988311288834e-05, 'howu': 3.075976622577668e-05, 'billing': 1.537988311288834e-05, 'mo': 0.00012303906490310673, 'phb1': 1.537988311288834e-05, '09099726395': 1.537988311288834e-05, 'mat': 1.537988311288834e-05, 'gbp': 1.537988311288834e-05, 'm8s': 1.537988311288834e-05, 'dime': 4.6139649338665025e-05, '92h': 1.537988311288834e-05, 'commercial': 3.075976622577668e-05, 'mine': 0.00024607812980621346, 'internet': 7.689941556444171e-05, 'wither': 1.537988311288834e-05, 'brain': 4.6139649338665025e-05, 'appointment': 7.689941556444171e-05, 'skilgme': 1.537988311288834e-05, 'thats': 0.0005536757920639803, 'ihave': 3.075976622577668e-05, 'students': 0.0001076591817902184, '89938': 1.537988311288834e-05, 'dryer': 3.075976622577668e-05, 'uve': 1.537988311288834e-05, 'ure': 6.151953245155337e-05, 'neglect': 3.075976622577668e-05, 'onwards': 4.6139649338665025e-05, 'pushes': 3.075976622577668e-05, 'babyjontet': 3.075976622577668e-05, 'witot': 3.075976622577668e-05, 'outrageous': 3.075976622577668e-05, '83118': 1.537988311288834e-05, 'showered': 3.075976622577668e-05, 'common': 6.151953245155337e-05, 'messenger': 4.6139649338665025e-05, 'é': 7.689941556444171e-05, 'pushbutton': 1.537988311288834e-05, 'corporation': 3.075976622577668e-05, 'tke': 3.075976622577668e-05, 'eng': 3.075976622577668e-05, 'quite': 0.0004460166102737619, 'scrappy': 3.075976622577668e-05, 'atm': 6.151953245155337e-05, 'flights': 1.537988311288834e-05, 'ones': 0.0001076591817902184, 'onlyfound': 3.075976622577668e-05, 'crore': 4.6139649338665025e-05, 'normally': 6.151953245155337e-05, 'g': 0.0002614580129191018, 'oops': 0.00013841894801599507, 'inch': 4.6139649338665025e-05, '3x': 
1.537988311288834e-05, 'belly': 7.689941556444171e-05, 'slightly': 3.075976622577668e-05, 'looks': 0.0001076591817902184, 'embarassing': 3.075976622577668e-05, 'hyde': 3.075976622577668e-05, 'conacted': 1.537988311288834e-05, 'fromm': 1.537988311288834e-05, 'rajas': 3.075976622577668e-05, 'tsandcs': 1.537988311288834e-05, 'mquiz': 1.537988311288834e-05, 'reserved': 3.075976622577668e-05, 'unfortuntly': 3.075976622577668e-05, 'ours': 3.075976622577668e-05, '08712405020': 1.537988311288834e-05, 'computers': 3.075976622577668e-05, 'h': 4.6139649338665025e-05, '33': 3.075976622577668e-05, 'v': 0.0005998154414026453, '2mro': 3.075976622577668e-05, 'mahfuuz': 3.075976622577668e-05, 'simulate': 3.075976622577668e-05, 'gaze': 3.075976622577668e-05, 'app': 3.075976622577668e-05, 'lennon': 3.075976622577668e-05, 'initiate': 3.075976622577668e-05, 'jazz': 7.689941556444171e-05, 'skyped': 4.6139649338665025e-05, 'mokka': 4.6139649338665025e-05, 'dependable': 3.075976622577668e-05, 'stitch': 3.075976622577668e-05, 'adjustable': 3.075976622577668e-05, 'lovers': 3.075976622577668e-05, 'gist': 4.6139649338665025e-05, 'search': 0.00024607812980621346, 'ppm': 1.537988311288834e-05, 'complete': 7.689941556444171e-05, 'frnd': 0.00015379883112888343, 'cry': 9.227929867733005e-05, 'along': 4.6139649338665025e-05, 'kicks': 4.6139649338665025e-05, 'nipost': 3.075976622577668e-05, 'kissing': 3.075976622577668e-05, 'print': 4.6139649338665025e-05, 'cancer': 7.689941556444171e-05, '3uz': 1.537988311288834e-05, 'lanka': 3.075976622577668e-05, 'conected': 3.075976622577668e-05, 'gentleman': 4.6139649338665025e-05, 'wife': 0.00041525684404798523, 'veggie': 3.075976622577668e-05, 'bishan': 6.151953245155337e-05, 'pocketbabe': 1.537988311288834e-05, 'remember': 0.0003691171947093202, 'robs': 3.075976622577668e-05, 'suffering': 3.075976622577668e-05, 'urgently': 4.6139649338665025e-05, 'pub': 0.00019993848046754843, 'tomorw': 3.075976622577668e-05, 'announcement': 1.537988311288834e-05, 'spose': 
3.075976622577668e-05, 'cme': 4.6139649338665025e-05, 'terrific': 3.075976622577668e-05, 'here': 0.0014764687788372808, 'lo': 3.075976622577668e-05, 'amrita': 3.075976622577668e-05, 'lay': 3.075976622577668e-05, 'herself': 1.537988311288834e-05, 'argh': 3.075976622577668e-05, 'fudge': 3.075976622577668e-05, 'town': 0.00027683789603199013, 'tests': 4.6139649338665025e-05, 'm': 0.005121501076591818, 'callin': 4.6139649338665025e-05, 'gucci': 3.075976622577668e-05, 'vivek': 4.6139649338665025e-05, 'aight': 0.0004767763764995386, 'deviousbitch': 3.075976622577668e-05, 'shirts': 4.6139649338665025e-05, 'roommates': 3.075976622577668e-05, 'nite': 0.00030759766225776686, '2end': 1.537988311288834e-05, 'dat': 0.0004460166102737619, 'kaiez': 4.6139649338665025e-05, 'henry': 1.537988311288834e-05, 'silver': 3.075976622577668e-05, 'on': 0.004783143648108275, '88222': 1.537988311288834e-05, 'urself': 0.00015379883112888343, 'flood': 3.075976622577668e-05, 'propsd': 3.075976622577668e-05, 'db': 3.075976622577668e-05, '25': 3.075976622577668e-05, 'tv': 0.0002614580129191018, 'expected': 3.075976622577668e-05, 'basq': 3.075976622577668e-05, 'nuther': 4.6139649338665025e-05, 'radiator': 3.075976622577668e-05, 'well': 0.0013995693632728391, 'relationship': 3.075976622577668e-05, 'safely': 3.075976622577668e-05, 'dao': 3.075976622577668e-05, 'deus': 4.6139649338665025e-05, 'malaria': 0.00012303906490310673, '40533': 1.537988311288834e-05, 'hand': 0.0002153183635804368, 'xxsp': 1.537988311288834e-05, 'wan2': 1.537988311288834e-05, 'tablet': 3.075976622577668e-05, 'understanding': 6.151953245155337e-05, 'swiss': 4.6139649338665025e-05, 'vote': 7.689941556444171e-05, 'terrible': 6.151953245155337e-05, 'sat': 0.00032297754537065517, 'bangb': 1.537988311288834e-05, 'fat': 0.0001076591817902184, 'constantly': 4.6139649338665025e-05, 'stopped': 7.689941556444171e-05, 'yep': 0.00015379883112888343, 'ptbo': 4.6139649338665025e-05, 'xchat': 1.537988311288834e-05, 'position': 
3.075976622577668e-05, 'serena': 3.075976622577668e-05, 'pixels': 3.075976622577668e-05, 'bb': 0.00013841894801599507, 'um': 3.075976622577668e-05, '077xxx': 1.537988311288834e-05, '0871': 1.537988311288834e-05, 'mountains': 3.075976622577668e-05, 'meat': 3.075976622577668e-05, 'cloud': 3.075976622577668e-05, 'allowed': 4.6139649338665025e-05, 'rolled': 3.075976622577668e-05, 'influx': 3.075976622577668e-05, 'thesedays': 3.075976622577668e-05, '88066': 1.537988311288834e-05, '1000s': 3.075976622577668e-05, 'violated': 3.075976622577668e-05, 'gopalettan': 3.075976622577668e-05, 'knw': 0.00019993848046754843, 'london': 4.6139649338665025e-05, 'opt': 1.537988311288834e-05, 'east': 3.075976622577668e-05, 'barred': 3.075976622577668e-05, 'diapers': 3.075976622577668e-05, 'satanic': 3.075976622577668e-05, 'upon': 3.075976622577668e-05, 'enjoying': 3.075976622577668e-05, 'shanghai': 3.075976622577668e-05, 'option': 6.151953245155337e-05, 'tallahassee': 3.075976622577668e-05, 'preferably': 7.689941556444171e-05, 'wrnog': 3.075976622577668e-05, 'songs': 6.151953245155337e-05, 'unless': 0.00013841894801599507, '4fil': 1.537988311288834e-05, 'itz': 6.151953245155337e-05, 'tell': 0.0015226084281759458, 'baig': 3.075976622577668e-05, 'president': 1.537988311288834e-05, 'shivratri': 3.075976622577668e-05, 'callfreefone': 1.537988311288834e-05, 'within': 9.227929867733005e-05, 'sexy': 0.00019993848046754843, 'videophones': 1.537988311288834e-05, '50ea': 1.537988311288834e-05, 'speak': 0.00030759766225776686, 'province': 3.075976622577668e-05, 'ummma': 3.075976622577668e-05, 'clos1': 6.151953245155337e-05, 'short': 0.00013841894801599507, 'george': 1.537988311288834e-05, 'restrict': 3.075976622577668e-05, 'payed2day': 3.075976622577668e-05, 'sink': 3.075976622577668e-05, 'korean': 3.075976622577668e-05, 'blocked': 3.075976622577668e-05, 'ringtones': 1.537988311288834e-05, 'ree': 1.537988311288834e-05, 'elaborate': 3.075976622577668e-05, 'pose': 3.075976622577668e-05, 'this': 
0.0031682559212549985, 'stand': 0.0001076591817902184, 'enjoy': 0.00041525684404798523, 'craziest': 3.075976622577668e-05, 'love': 0.002522300830513688, 'some1': 0.00012303906490310673, 'audrie': 3.075976622577668e-05, 'treatin': 3.075976622577668e-05, '2morro': 3.075976622577668e-05, 'mobsi': 1.537988311288834e-05, 'athletic': 3.075976622577668e-05, 'log': 0.0001076591817902184, 'faith': 4.6139649338665025e-05, 'trains': 4.6139649338665025e-05, 'uploaded': 3.075976622577668e-05, 'box39822': 1.537988311288834e-05, 'requires': 3.075976622577668e-05, 'downloads': 1.537988311288834e-05, 'voila': 3.075976622577668e-05, 'outstanding': 6.151953245155337e-05, 'beggar': 3.075976622577668e-05, 'console': 1.537988311288834e-05, 'unsub': 1.537988311288834e-05, 'ibhltd': 1.537988311288834e-05, 'hands': 7.689941556444171e-05, '7876150ppm': 1.537988311288834e-05, 'panasonic': 1.537988311288834e-05, 'cos': 0.0009689326361119656, '09066358152': 1.537988311288834e-05, 'regarding': 4.6139649338665025e-05, 'second': 0.00027683789603199013, '75': 1.537988311288834e-05, 'coming': 0.000630575207628422, 'anand': 3.075976622577668e-05, 'social': 6.151953245155337e-05, 'printed': 4.6139649338665025e-05, 'early': 0.00041525684404798523, 'driver': 4.6139649338665025e-05, 'rang': 4.6139649338665025e-05, 'wrc': 1.537988311288834e-05, '09064018838': 1.537988311288834e-05, 'sort': 0.00013841894801599507, 'manege': 3.075976622577668e-05, 'noooooooo': 3.075976622577668e-05, 'dehydration': 3.075976622577668e-05, '86688': 1.537988311288834e-05, '400thousad': 3.075976622577668e-05, 'election': 3.075976622577668e-05, 'clean': 0.0001845585973546601, 'granite': 1.537988311288834e-05, 'box61': 1.537988311288834e-05, 'apt': 3.075976622577668e-05, 'wed': 6.151953245155337e-05, 'ali': 4.6139649338665025e-05, 'english': 9.227929867733005e-05, 'id': 0.00023069824669332513, 'fuelled': 3.075976622577668e-05, 'matches': 1.537988311288834e-05, 'unsold': 6.151953245155337e-05, 'nìte': 3.075976622577668e-05, 'wrk': 
4.6139649338665025e-05, 'joys': 3.075976622577668e-05, '08709222922': 1.537988311288834e-05, 'xclusive': 1.537988311288834e-05, 'unmits': 3.075976622577668e-05, 'renewal': 3.075976622577668e-05, 'onam': 3.075976622577668e-05, 'surprise': 9.227929867733005e-05, 'fake': 3.075976622577668e-05, 'buffet': 4.6139649338665025e-05, 'decades': 3.075976622577668e-05, '2day': 9.227929867733005e-05, 'intentions': 3.075976622577668e-05, 'live': 0.00024607812980621346, 'poo': 3.075976622577668e-05, 'follow': 3.075976622577668e-05, 'sweetheart': 3.075976622577668e-05, 'ecstasy': 3.075976622577668e-05, 'dust': 3.075976622577668e-05, 'theplace': 3.075976622577668e-05, 'spelled': 3.075976622577668e-05, 'courage': 4.6139649338665025e-05, 'twins': 3.075976622577668e-05, 'stuff42moro': 3.075976622577668e-05, '1winaweek': 1.537988311288834e-05, 'planning': 0.00013841894801599507, '83222': 1.537988311288834e-05, '09061701851': 1.537988311288834e-05, 'fair': 4.6139649338665025e-05, 'cheap': 9.227929867733005e-05, 'felt': 0.0001845585973546601, 'sneham': 3.075976622577668e-05, '09053750005': 1.537988311288834e-05, 'potential': 4.6139649338665025e-05, 'feb': 7.689941556444171e-05, 'lets': 0.00019993848046754843, 'certainly': 3.075976622577668e-05, 'argentina': 3.075976622577668e-05, 'result': 4.6139649338665025e-05, 'handing': 3.075976622577668e-05, 'draw': 7.689941556444171e-05, 'browse': 1.537988311288834e-05, 'nevr': 3.075976622577668e-05, 'oyster': 3.075976622577668e-05, 'hockey': 3.075976622577668e-05, 'tolerance': 3.075976622577668e-05, 'complaint': 3.075976622577668e-05, 'tot': 0.0002614580129191018, 'sehwag': 3.075976622577668e-05, 'game': 0.0001845585973546601, 'talking': 0.0001076591817902184, 'wotz': 3.075976622577668e-05, 'ge': 0.0001845585973546601, 'y': 0.0005690556751768687, 'hopeful': 3.075976622577668e-05, 'hesitate': 3.075976622577668e-05, 'tom': 4.6139649338665025e-05, 'renewing': 1.537988311288834e-05, 'mind': 0.0005229160258382037, 'long': 0.0005536757920639803, 
'89693': 1.537988311288834e-05, 'sleeping': 0.0002922177791448785, 'nottingham': 3.075976622577668e-05, 'lab': 6.151953245155337e-05, 'congrats': 0.00013841894801599507, 'thy': 3.075976622577668e-05, 'hcl': 3.075976622577668e-05, 'surrounded': 3.075976622577668e-05, 'street': 0.00013841894801599507, 'simonwatson5120': 1.537988311288834e-05, 'sterm': 3.075976622577668e-05, 'yck': 3.075976622577668e-05, 'exe': 4.6139649338665025e-05, 'cough': 3.075976622577668e-05, 'some': 0.0013995693632728391, 'science': 4.6139649338665025e-05, 'stalking': 3.075976622577668e-05, 'occasion': 3.075976622577668e-05, 'spouse': 3.075976622577668e-05, 'carlin': 3.075976622577668e-05, 'sullivan': 1.537988311288834e-05, 'undrstnd': 3.075976622577668e-05, 'soil': 3.075976622577668e-05, 'nicky': 3.075976622577668e-05, 'textin': 3.075976622577668e-05, 'jenxxx': 3.075976622577668e-05, 'eg': 1.537988311288834e-05, 'gone': 0.0002153183635804368, 'sabarish': 3.075976622577668e-05, 'shldxxxx': 3.075976622577668e-05, '8wp': 1.537988311288834e-05, 'bam': 3.075976622577668e-05, 'thing': 0.0007382343894186404, 'butting': 3.075976622577668e-05, 'laughing': 3.075976622577668e-05, '750': 1.537988311288834e-05, 'ts': 1.537988311288834e-05, '2nd': 0.00016917871424177176, 'wouldn': 7.689941556444171e-05, 'goodnite': 6.151953245155337e-05, 'percent': 3.075976622577668e-05, ...}
The spam filter can be understood as a function that:
Takes in as input a new message (w1, w2, ..., wn)
Calculates P(Spam|w1, w2, ..., wn) and P(Ham|w1, w2, ..., wn)
Compares the values of P(Spam|w1, w2, ..., wn) and P(Ham|w1, w2, ..., wn), and:
If P(Ham|w1, w2, ..., wn) > P(Spam|w1, w2, ..., wn), then the message is classified as ham.
If P(Ham|w1, w2, ..., wn) < P(Spam|w1, w2, ..., wn), then the message is classified as spam.
If P(Ham|w1, w2, ..., wn) = P(Spam|w1, w2, ..., wn), then the algorithm may request human help.
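The two quantities compared above can be written out as follows, under the Naive Bayes assumption that words are conditionally independent given the class. This is a sketch of the standard form. In the code, P(Spam) and P(Ham) correspond to p_spam and p_not_spam, and the per-word terms are the values looked up in spam_prob and non_spam_prob.

```latex
P(\text{Spam} \mid w_1, w_2, \dots, w_n) \propto P(\text{Spam}) \prod_{i=1}^{n} P(w_i \mid \text{Spam})

P(\text{Ham} \mid w_1, w_2, \dots, w_n) \propto P(\text{Ham}) \prod_{i=1}^{n} P(w_i \mid \text{Ham})
```

Both expressions omit the same normalizing constant, so comparing the two products is enough to decide which class is more probable.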
Below is a rough sketch of what the spam filter function might look like:
# creating classify() function which will act as spam filter
import re

def classify(message):
    # Strip punctuation, lowercase, and split the message into words
    message = re.sub(r'\W', ' ', message)
    message = message.lower()
    message = message.split()

    # Start from the class priors computed earlier
    p_spam_given_message = p_spam
    p_ham_given_message = p_not_spam

    # Multiply in the per-word probabilities; words outside the vocabulary are skipped
    for each in message:
        if each in spam_prob:
            p_spam_given_message *= spam_prob[each]
        if each in non_spam_prob:
            p_ham_given_message *= non_spam_prob[each]

    # print('P(Spam|message):', p_spam_given_message)
    # print('P(Ham|message):', p_ham_given_message)

    if p_ham_given_message > p_spam_given_message:
        return 'ham'
    elif p_ham_given_message < p_spam_given_message:
        return 'spam'
    else:
        return 'needs human classification'
print(classify('WINNER!! This is the secret code to unlock the money: C3421.'))
spam
print(classify("Sounds good, Tom, then see u there"))
ham
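One caveat with multiplying many probabilities below 1 is that the product can underflow to 0.0 for very long messages, which would make both classes tie. A common workaround is to sum log-probabilities instead. The sketch below assumes small toy values for p_spam, p_not_spam, spam_prob and non_spam_prob (the real ones are computed earlier in the notebook), and classify_log is a hypothetical name, not part of the original code. It also assumes every stored probability is strictly positive, so the logarithm is defined.

```python
import math
import re

# Toy stand-ins for the values computed earlier in the notebook (assumed, not real).
p_spam, p_not_spam = 0.13, 0.87
spam_prob = {'winner': 0.02, 'money': 0.03, 'see': 0.001, 'u': 0.004}
non_spam_prob = {'winner': 0.0005, 'money': 0.001, 'see': 0.01, 'u': 0.02}

def classify_log(message):
    """Same decision rule as classify(), using summed log-probabilities."""
    words = re.sub(r'\W', ' ', message).lower().split()

    # log(a * b) = log(a) + log(b), so products become sums
    log_spam = math.log(p_spam)
    log_ham = math.log(p_not_spam)
    for word in words:
        if word in spam_prob:
            log_spam += math.log(spam_prob[word])
        if word in non_spam_prob:
            log_ham += math.log(non_spam_prob[word])

    if log_ham > log_spam:
        return 'ham'
    if log_ham < log_spam:
        return 'spam'
    return 'needs human classification'

print(classify_log('WINNER!! Money'))
print(classify_log('see u'))
```

Because the logarithm is monotonic, the comparison gives the same answer as comparing the raw products whenever those products do not underflow.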
testing_set['predicted'] = testing_set['SMS'].apply(classify)
testing_set.head()
| | Label | SMS | predicted |
|---|---|---|---|
| 0 | ham | later i guess i needa do mcat study too | ham |
| 1 | ham | but i haf enuff space got like 4 mb | ham |
| 2 | spam | had your mobile 10 mths update to latest oran... | spam |
| 3 | ham | all sounds good fingers makes it difficult ... | ham |
| 4 | ham | all done all handed in don t know if mega sh... | ham |
x = (testing_set['Label'] == testing_set['predicted']).sum()
x
1100
accuracy = x/testing_set.shape[0]
print(str(accuracy*100) + " %")
98.74326750448833 %
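Since the dataset is mostly ham, a single accuracy figure can hide how the errors split between the two classes. The sketch below shows one way to break the result down with a confusion table and a majority-class baseline. It runs on a small made-up DataFrame named toy (an assumption standing in for testing_set), so the numbers it prints are illustrative only.

```python
import pandas as pd

# Toy stand-in for testing_set (assumed data, not the real test set).
toy = pd.DataFrame({
    'Label':     ['ham', 'ham', 'spam', 'spam', 'ham'],
    'predicted': ['ham', 'ham', 'spam', 'ham', 'ham'],
})

# Rows are the true labels, columns are the predictions.
confusion = pd.crosstab(toy['Label'], toy['predicted'])
print(confusion)

# Accuracy is the mean of the boolean "prediction is correct" column.
accuracy = (toy['Label'] == toy['predicted']).mean()

# Baseline: the score obtained by always predicting the most common label.
baseline = toy['Label'].value_counts(normalize=True).max()

print(f'accuracy: {accuracy:.2%}, majority-class baseline: {baseline:.2%}')
```

Applied to the real testing_set, the same calls would show how many of the errors are spam let through and how many are ham wrongly flagged.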
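In this project, we built a spam filter for SMS messages using the multinomial Naive Bayes algorithm. The filter classified 1,100 of the 1,114 test messages correctly, an accuracy of 98.74%. For comparison, labelling every message as ham would score only about 87%, since that is roughly the share of ham messages in the dataset.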
Future Scope:
Isolate the 14 messages that were classified incorrectly and try to figure out why the algorithm reached the wrong conclusions.
Make the filtering process more complex by making the algorithm sensitive to letter case.
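As a starting point for the first item above, the misclassified rows can be pulled out with a boolean mask. This is a minimal sketch on a made-up DataFrame named toy (an assumption standing in for testing_set with its predicted column); the messages and labels in it are invented for illustration.

```python
import pandas as pd

# Toy stand-in for testing_set after predictions were added (assumed data).
toy = pd.DataFrame({
    'Label':     ['ham', 'spam', 'spam', 'ham'],
    'SMS':       ['see u later', 'win cash now', 'call me back pls', 'free for lunch'],
    'predicted': ['ham', 'spam', 'ham', 'spam'],
})

# Keep only the rows where the prediction disagrees with the human label.
errors = toy[toy['Label'] != toy['predicted']]
print(errors)

# Count each kind of mistake (true label, predicted label).
print(errors.groupby(['Label', 'predicted']).size())
```

Reading the isolated messages next to their per-word probabilities is one way to see which words pushed the filter toward the wrong class.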