Analysis and Visualization of Today's Birthday Data of NairaLand.com forum Members!
This experiment is valid for today only.
That means only the NairaLand forum members whose birthday is today, and who registered on NairaLand before now, will have their data analyzed and visualized.
Note: If you run this script tomorrow, you will get different data and results.
Since a birthday comes only once a year, different members celebrate their birthdays on different days.
Author: Umay Yusuf. Read the blog post here.
Scrape birthday data from the Nairaland home page
Clean the data into a friendly format
Analyze and Visualize the data
Note: You can skip this section by copying and editing the data manually, at the cost of a heavier workload.
The home page URL is http://www.nairaland.com/home#featured. If you scroll down the page, you will see the list of members having their birthday today!
The list is in this format: rodbel(29), Sirolad(29), mokei(27)... Each entry is the member's username followed by the member's age in parentheses. That is: member_username(age)
The format above isn't convenient to work with in Python, so we need to clean it into a tabular format.
Note: If you inspect the HTML of the Birthday list, you should see that it is contained in a table cell (<td> ... </td>).
See a sample Birthday list from nairaland.com below:
In Summary: We want to scrape data in the format "rodbel(29), Sirolad(29), mokei(27)" and turn it into a tabular format.
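To make the target concrete, here is a minimal sketch of that transformation on the three sample entries quoted above. It uses only the standard `re` module; the variable names (`sample`, `pattern`, `rows`) are illustrative and not part of the original script.

```python
import re

# Sample text in the same shape as the Birthday list on the home page.
sample = "rodbel(29), Sirolad(29), mokei(27)"

# Two capture groups: the username (word characters) and the age (digits in parentheses).
pattern = re.compile(r"(\w+)\((\d+)\)")

# With two groups, findall returns one (username, age) tuple per match.
rows = [(name, int(age)) for name, age in pattern.findall(sample)]

# Print the rows as a small two-column table.
print("username  age")
for name, age in rows:
    print(f"{name:<9} {age}")
```

Each tuple in `rows` is one table row, which is the shape a CSV file or a pandas DataFrame expects.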
# import the libraries we are going to use
# libraries for Scraping and Cleaning the data
import re
import requests
from bs4 import BeautifulSoup
# libraries for Analyzing and Visualizing the data
import pandas as pd
from datetime import datetime
# Fetch the raw HTML of the Nairaland home page
url = "http://www.nairaland.com/home"
raw_html = requests.get(url) # response object holding the complete HTML of the page
# print(raw_html.text)
raw_data = raw_html.text # keep the HTML text in a variable
soup_data = BeautifulSoup(raw_data, "lxml") # parse the HTML with the lxml parser and keep the parsed tree
# Let's display only the part of the data we need. It is contained in a table cell (<td>)
soup_data("td")
[<td class="grad"><h1><a class="g" href="http://www.nairaland.com/" title="Nairaland Nigerian Forum">₦airaland Forum</a></h1> Welcome, <b>Guest</b>: <b><a href="/register">Join Nairaland</a></b> / <a href="/login">Login</a> / <a href="/trending">Trending</a> / <a href="/recent">Recent</a> / <a href="/topics">New</a><br/><b>Stats: </b>1,637,780 members, 3,033,792 topics. <b>Date</b>: Friday, 19 August 2016 at 12:02 PM<p></p><form action="/search"> <input name="q" size="32" type="text"/> <input name="search" type="submit" value="Search"/></form> </td>, <td class="l w"><a href="/nairaland" title=" class=g"><b>Nairaland / General</b></a>: <a href="/politics" title="Our country Nigeria is the giant of Africa!"><b>Politics</b></a>, <a href="/crime" title=""><b>Crime</b></a>, <a href="/romance" title="Discuss dating, courtship, and romance in marriage."><b>Romance</b></a>, <a href="/jobs" title="Job/Employment Opportunities; Vacancies In Nigeria!"><b>Jobs/Vacancies</b></a>, <a href="/career" title="Talk about workplace experiences and incidents, professional training and certification, career management, etc. Employed, self-employed, job seekers."><b>Career</b></a>, <a href="/business" title="Entrepreneurship, Startups, Economics, etc"><b>Business</b></a>, <a href="/investment" title="For Investors. Nigerian Stock Exchange (NSE) Stocks, Bonds, T-bills, Real-Estate, etc."><b>Investment</b></a>, <a href="/nysc" title="National Youth Service Corps (NYSC) Discussions. 
Corpers, etc."><b>NYSC</b></a>, <a href="/education" title="Nairaland Students Forum: Primary and secondary school, universities, polytechnics, et cetera."><b>Education</b></a>, <a href="/autos" title="Cars, motorbikes, airplanes, et cetera."><b>Autos</b></a>, <a href="/cartalk" title="Let's talk about cars here"><b>Car Talk</b></a>, <a href="/properties" title="Real Estate: Land, Houses, Flats, Etc."><b>Properties</b></a>, <a href="/health" title="Health related topics."><b>Health</b></a>, <a href="/travel" title="Tourism, travel. Interesting destinations within Nigeria and abroad. And motoring!"><b>Travel</b></a>, <a href="/family" title="Marriage/Family issues including husband to wife, parent to child, sibling and extended family relationships"><b>Family</b></a>, <a href="/culture" title="Nigerian languages, traditions, practices, et cetera."><b>Culture</b></a>, <a href="/religion" title="Share your faith and belief in God or higher powers here."><b>Religion</b></a>, <a href="/food" title="Delicious foods and how to prepare them. Anything about food or drink goes here. This is the Nairaland kitchen!"><b>Food</b></a>, <a href="/diaries" title="Write about your life and let the whole world read you. You won't even be interrupted by comments. Nairaland members only!"><b>Diaries</b></a>, <a href="/ads" title="Nairaland Direct Adverts"><b>Nairaland Ads</b></a>, <a href="/pets" title="Discuss dogs/puppies, cats/kittens, and other pets. Buy and sell them"><b>Pets</b></a>, <a href="/agriculture" title="The business and science of crop and livestock production, Agric Science, Agric Economics, etc"><b>Agriculture</b></a></td>, <td class="l "><a href="/entertainment" title="Entertainment threads that won't fit into any child board. 
class=g"><b>Entertainment</b></a>: <a href="/jokes" title="Riddles and jokes that fellow Nigerians can appreciate."><b>Jokes Etc</b></a>, <a href="/tv-movies" title="Local television programmes, local and foreign movies available in Nigeria."><b>TV/Movies</b></a>, <a href="/music-radio" title="Talk about your favorite songs, music albums, artists and bands."><b>Music/Radio</b></a>, <a href="/celebs" title="Celebrity obsession, Nollywood gossip, etc."><b>Celebrities</b></a>, <a href="/fashion" title="Clothes, dresses, make-up routines, and modeling."><b>Fashion</b></a>, <a href="/events" title="Birthdays, Weddings, other Occasions. Planning, Announcement, Gifts."><b>Events</b></a>, <a href="/sports" title="We are Nigerians; and we love soccer and many other sports!"><b>Sports</b></a>, <a href="/gaming" title="Welcome to the world of computer, Internet, video, and board games. Xbox, PS2, Chess, whatever!"><b>Gaming</b></a>, <a href="/forum-games" title="Play various addictive forum games with fellow members of Nairaland."><b>Forum Games</b></a>, <a href="/literature" title="For writers and lovers of books."><b>Literature</b></a></td>, <td class="l w"><a href="/science" title=" class=g"><b>Science/Technology</b></a>: <a href="/programming" title="Software programming, development of applications.."><b>Programming</b></a>, <a href="/webmasters" title="Website design and development, management of forums, blogs, wikis, and all sorts of websites."><b>Webmasters</b></a>, <a href="/computers" title="Personal Computing, etc."><b>Computers</b></a>, <a href="/phones" title="Nigerian GSM networks, telephone companies, et cetera. ISPs, Modems, Websites, etc"><b>Phones</b></a>, <a href="/graphics-video" title="Digital Video and Film, Computer Graphics and Animation. 
Tips, Tricks & Tools."><b>Art, Graphics & Video</b></a>, <a href="/techmarket" title="Buy and sell Phones, Computers and PC accessories here."><b>Technology Market</b></a></td>, <td><img alt="" src="/icons/smiley.gif"/> <b><a href="/links">Featured Links</a></b> / <b><a href="http://twitter.com/nairaland">Twitter</a></b> / <b><a href="http://facebook.com/nigerianforum">Facebook</a></b> / <b><a href="http://www.nairaland.com/1049481/how-place-targeted-ads-nairaland">How To Advertise</a></b> <img src="/icons/smiley.gif"/></td>, <td class="featured w"> » <a href="http://www.nairaland.com/3297707/usain-bolt-wins-200m-gold" rel="noopener"><b>Usain Bolt Wins 200m Gold At The RIO 2016 Olympics: His 8th Olympic Gold</b> (<b>Photos</b>)</a> «<br/>» <a href="http://www.nairaland.com/3297659/popular-actress-suffering-severe-skin" rel="noopener"><b>See How Tattoo Damaged The Skin Of Actress Anita Joseph</b> (<b>Pics</b>)</a> «<br/>» <a href="http://www.nairaland.com/3297616/naira-sinks-all-time-low-365.25" rel="noopener"><b>Naira Sinks To All-Time Low Of 365.25/Dollar</b></a> «<br/>» <a href="http://www.nairaland.com/3294684/see-little-bushmeat-friend-killed" rel="noopener"><b>"See The Bush Meat A Friend And I Killed" - Hillarie</b> (<b>Photos</b>)</a> «<br/>» <a href="http://www.nairaland.com/3297072/john-kerry-visit-nigeria-next" rel="noopener"><b>US Secretary Of State, John Kerry, To Visit Nigeria Next Week</b></a> «<br/>» <a href="http://www.nairaland.com/3297829/police-arrest-kidnappers-sani-bello" rel="noopener"><b>Photo Of The 3 Herdsmen Who Kidnapped Law Maker, Sani Bello & The Money Recovered</b></a> «<br/>» <a href="http://www.nairaland.com/3297347/brother-organises-robbers-attack-pregnant" rel="noopener"><b>A Brother Organised Robbers To Attack His Pregnant Sister In Lagos, Maid Raped</b> (<b>Pics</b>)</a> «<br/>» <a href="http://www.nairaland.com/3297333/photos-veteran-actors-actresses-storm" rel="noopener"><b>Actors And Actresses Storm Comic Star Actor, Aluwe’s 
Mom's Burial</b> (<b>Pics</b>)</a> «<br/>» <a href="http://www.nairaland.com/3297331/5-keys-long-term-success" rel="noopener"><b>"5 Keys To Long Term Success In Your Business"</b></a> «<br/>» <a href="http://www.nairaland.com/3297295/dino-melaye-children-tour-georgia" rel="noopener"><b>Senator Dino Melaye And His Children Tour Georgia</b> (<b>Photos</b>)</a> «<br/>» <a href="http://www.nairaland.com/3257201/these-been-nigerian-names-these" rel="noopener"><b>"These Would Have Been The Nigerian Names Of These Foreign Celebrities"</b> (<b>Photos</b>)</a> «<br/>» <a href="http://www.nairaland.com/3296988/team-nigeria-kits-arrive-rio" rel="noopener"><b>Team Nigeria Kits Arrive 3 Days To End Of Olympics Games</b></a> «<br/>» <a href="http://www.nairaland.com/3296981/kill-yourself-dont-like-me" rel="noopener"><b>“Kill Yourself If You Don’t Like Me” – Actress Angela Okorie Tells Critics</b> (<b>Pics</b>)</a> «<br/>» <a href="http://www.nairaland.com/3297439/photos-only-those-went-secondary" rel="noopener"><b>Only Those Who Went To Secondary School In Nigeria Will Understand These</b> (<b>Pics</b>)</a> «<br/>» <a href="http://www.nairaland.com/3297082/dogged-ill-health-marital-feuds" rel="noopener"><b>Dogged By Ill Health, Marital Feuds, Emeka Offor’s Crisis Deepens - Sahara Reporters</b></a> «<br/>» <a href="http://www.nairaland.com/3295363/living-unhappy-marriage" rel="noopener">"<b>I Am Living</b> In <b>An Unhappy Marriage</b>" - <b>Please Advice</b></a> «<br/>» <a href="http://www.nairaland.com/3297454/reasons-why-memorising-quran-good" rel="noopener"><b>Reasons Why Memorising</b> The <b>Quran</b> Is <b>Good</b> For <b>Your Brain</b></a> «<br/>» <a href="http://www.nairaland.com/3297642/10-tips-concentration-prayer-salah" rel="noopener"><b>10 Tips</b> For <b>Concentration</b> In <b>Prayer</b> </a> «<br/>» <a href="http://www.nairaland.com/3297224/declare-republic-face-treason-police" rel="noopener"><b>Declare Niger Delta Republic And Face Treason - Police Tell 
Militants</b></a> «<br/>» <a href="http://www.nairaland.com/3297698/what-does-love-look-like" rel="noopener"><b>What Does Love Look Like</b> To <b>You</b>?</a> «<br/>» <a href="http://www.nairaland.com/3297403/see-what-funaab-students-did" rel="noopener"><b>See What FUNAAB Students Did To A Thief Who Stole Tecno M3</b></a> «<br/>» <a href="http://www.nairaland.com/3297300/how-nairalander-mauled-after-writing" rel="noopener"><b>See What They Did To A Nairalander After Writing His Final Exams In AAU, Ekpoma</b></a> «<br/>» <a href="http://www.nairaland.com/3297386/excited-woman-couldnt-hold-herself" rel="noopener"><b>This Excited Woman Couldn't Hold Herself As Oshiomhole Passed By</b> (<b>Photos</b>)</a> «<br/>» <a href="http://www.nairaland.com/3296460/anything-wrong-photo-father-step" rel="noopener"><b>Anything Wrong With This Photo Of A Father And His Step Daughter?</b></a> «<br/>» <a href="http://www.nairaland.com/3296889/haircut-head-cut-photos" rel="noopener"><b>Is This A Haircut Or Head Cut?</b> (<b>Photos</b>)</a> «<br/>» <a href="http://www.nairaland.com/3297062/should-pay-him-n500000-job" rel="noopener"><b>"Should I Pay Him N500,000 For This Job?" 
- Judeibro</b></a> «<br/>» <a href="http://www.nairaland.com/3297253/funke-adesiyan-uses-phone-cover" rel="noopener"><b>Actress Funke Adesiyan Covers Her Private Part With Phone During Selfie, Fans Go Gaga</b></a> «<br/>» <a href="http://www.nairaland.com/3296216/uti-nwachukwu-throws-shot-timi" rel="noopener"><b>Uti Nwachukwu Throws Shot At Timi Dakolo Over Expensive Marriage Joke</b> (<b>Pics</b>)</a> «<br/>» <a href="http://www.nairaland.com/3297389/standup-comedian-a.y-shares-lovely" rel="noopener"><b>Comedian AY Shares Lovely Photos In Celebration Of His 38th Birthday</b></a> «<br/>» <a href="http://www.nairaland.com/3297407/nollywood-child-star-looks-unrecognizable" rel="noopener"><b>Nollywood Child Star Looks Almost Unrecognizable 15 Years After</b> (<b>Photos</b>)</a> «<br/>» <a href="http://www.nairaland.com/3297392/economy-buhari-rejects-imf-world" rel="noopener"><b>Economy: Buhari Rejects IMF And World Bank Prescriptions</b></a> «<br/>» <a href="http://www.nairaland.com/3297365/lagos-ranks-worlds-3rd-worst" rel="noopener"><b>Lagos Ranks World’s 3rd Worst City To Live In By Economist Intelligence Unit</b></a> «<br/>» <a href="http://www.nairaland.com/3297015/some-people-exported-stones-claim" rel="noopener"><b>"Some People Exported Stones To Claim Export Grant" – Adeosun</b></a> «<br/>» <a href="http://www.nairaland.com/3297244/airforce-expecting-12-attack-helicopter" rel="noopener"><b>Terrorism: Airforce Expecting 12 Attack Helicopter Gunships From Russia</b></a> «<br/>» <a href="http://www.nairaland.com/3296992/outgoing-egyptian-amb-travelled-road" rel="noopener"><b>"Outgoing Egyptian Ambassador Travelled By Road From Maiduguri To Yobe" - Buhari</b></a> «<br/>» <a href="http://www.nairaland.com/3296427/breaking-news-several-prisoners-killed" rel="noopener"><b>Several Prisoners Killed In Abakiliki Foiled Jail Break</b></a> «<br/>» <a href="http://www.nairaland.com/3297367/abubakar-abusidiq-usman-true-story" rel="noopener"><b>Abubakar ‘Abusidiq’ Usman: The 
True Story Of My Arrest By EFCC</b></a> «<br/>» <a href="http://www.nairaland.com/3296798/navy-rescues-hijacked-british-vessel" rel="noopener"><b>Navy Rescues Hijacked British Vessel, ‘MT VECTIS OSPREY’ From Sea Pirates</b> (<b>Picture</b>)</a> «<br/>» <a href="http://www.nairaland.com/3297092/firs-seals-senator-akumes-hotel" rel="noopener"><b>FIRS Seals Senator Akume’s Hotel Over N13.5 Million Unpaid Taxes</b></a> «<br/>» <a href="http://www.nairaland.com/3295884/ogun-emerges-nigerias-mining-capital" rel="noopener"><b>Ogun Emerges Nigeria’s Mining Capital</b></a> «<br/>» <a href="http://www.nairaland.com/3296922/apc-blasts-pdp-lack-moral" rel="noopener"><b>APC Blasts PDP: "You Lack The Moral Basis To Comment On Nigerian Economy"</b></a> «<br/>» <a href="http://www.nairaland.com/3297207/budget-padding-10-principal-officers" rel="noopener"><b>Budget Padding: 10 Principal Officers Disown Jibrin</b></a> «<br/>» <a href="http://www.nairaland.com/3296499/brent-crude-oil-rose-47.06" rel="noopener"><b>Brent Crude Oil Rise From $47.06 To $50 Per Barrel</b></a> «<br/>» <a href="http://www.nairaland.com/3296797/five-things-learn-jobless" rel="noopener"><b>"Five Things I Would Learn To Do If I Was Jobless"</b></a> «<br/>» <a href="http://www.nairaland.com/3296907/opera-max-how-does-it" rel="noopener"><b>Opera Max: How Does It Work?</b></a> «<br/>» <a href="http://www.nairaland.com/3297294/babcock-university-set-graduate-set" rel="noopener"><b>Babcock University Set To Graduate Set Of Maiden Doctors</b></a> «<br/>» <a href="http://www.nairaland.com/3293752/how-much-bakery-worker-paid" rel="noopener"><b>How Much Are Bakery Workers Paid?</b></a> «<br/>» <a href="http://www.nairaland.com/3297068/lagos-govt-plans-50-housing" rel="noopener"><b>Lagos Govt Plans 50 Housing Units In Every LGA</b></a> «<br/>» <a href="http://www.nairaland.com/3295766/seven-types-drivers-nigeria-which" rel="noopener"><b>The Seven Types Of Drivers In Nigeria: Which One Are You?</b></a> «<br/>» <a 
href="http://www.nairaland.com/3297194/how-website-errors-affect-search" rel="noopener"><b>How Website Errors Affect Search Engine Rankings</b></a> «<br/>» <a href="http://www.nairaland.com/2888716/bloodshot-short-romance-story" rel="noopener"><b>"Bloodshot" A Story By Godmother</b></a> «<br/>» <a href="http://www.nairaland.com/3296437/husband-impregnated-sister-matrimonial-home" rel="noopener"><b>"My Husband Impregnated His 'Sister' In Our Matrimonial Home" - Wife</b></a> «<br/>» <a href="http://www.nairaland.com/3296883/actress-rukky-sanda-shows-off" rel="noopener"><b>Actress Rukky Sanda Shows Off Her Living Room</b> (<b>Photo</b>)</a> «<br/>» <a href="http://www.nairaland.com/3296782/kelechi-iheanacho-sign-new-5-year" rel="noopener"><b>Kelechi Iheanacho Signs New 5-year Deal With Man City</b></a> «<br/>» <a href="http://www.nairaland.com/3296519/v.i.s-officials-just-melted-injustice" rel="noopener"><b>"Help! VIS Officials Just Meted An Injustice On Me" - Drabeey</b> (<b>Pics</b>)</a> «<br/>» <a href="http://www.nairaland.com/3296762/student-shot-during-protest-funaab" rel="noopener"><b>"Student Shot During Protest In FUNAAB Is Not Dead, He's Receiving Treatment"</b></a> «<br/>» <a href="http://www.nairaland.com/3296677/obinim-accused-sleeping-pastors-wife" rel="noopener"><b>Pastor Who Flogged Girl For Having Sex Accused Of Sleeping With Junior Pastor's Wife</b></a> «<br/>» <a href="http://www.nairaland.com/3296933/area-boys-go-shop-shop" rel="noopener"><b>Area Boys Raid Shops After Bloody Clash In Abule-Ado, Lagos</b> (<b>Photos</b>)</a> «<br/>» <a href="http://www.nairaland.com/3296565/gej-denies-being-probed-alledged" rel="noopener"><b>Jonathan Reacts To Reports Linking Him To Militants</b></a> «<br/>» <a href="http://www.nairaland.com/3296848/okorocha-relaxing-grandchildren-daughter-uloma" rel="noopener"><b>Photo Of Governor Okorocha Relaxing With His Grandchildren & Daughter</b></a> «<br/>» <a 
href="http://www.nairaland.com/3296723/muma-gee-prince-eke-welcome" rel="noopener"><b>Singer Muma Gee And Actor Prince Eke Welcome Baby Girl!</b> (<b>Photos</b>)</a> «<br/>» <a href="http://www.nairaland.com/3296698/nnamdi-kanu-writes-british-government" rel="noopener"><b>Nnamdi Kanu Writes British Government</b></a> «<br/>» <a href="http://www.nairaland.com/3296730/true-love-very-young-girl" rel="noopener"><b>Is This True Love? Young Girl Flaunts Her Aged White Husband</b> (<b>Photos</b>)</a> «<br/>» <a href="http://www.nairaland.com/3296485/flash-inec-announce-results-cancelled" rel="noopener"><b>INEC Announces Results Of Cancelled Tai Election, Rivers, Declares APC Winner</b></a> «<br/>» <a href="http://www.nairaland.com/3296693/dj-cuppy-billionaire-dad-femi" rel="noopener"><b>Dj Cuppy And Her Billionaire Dad, Femi Otedola Step Out For Lunch In Luxurious Style</b></a> «</td>, <td>(<b>0</b>) <a href="/links/1">(<b>1</b>)</a> <a href="/links/2">(<b>2</b>)</a> <a href="/links/3">(<b>3</b>)</a> <a href="/links/4">(<b>4</b>)</a> <a href="/links/5">(<b>5</b>)</a> <a href="/links/6">(<b>6</b>)</a> <a href="/links/7">(<b>7</b>)</a> <a href="/links/8">(<b>8</b>)</a> <a href="/links/9">(<b>9</b>)</a> <a href="/links/10">(<b>10</b>)</a> </td>, <td class="w"><h3>Members Online:</h3> (<b>2944 Members</b> and <b>6196 Guests</b> online in <b>last 5 minutes</b>!)</td>, <td class="homeuser"><h3>Birthdays:</h3>gabng(<span class="m">31</span>), olaeffect(<span class="m">40</span>), TadeDada, wildchild1, KMB, seunny4lif(<span class="m">29</span>), Samdurance(<span class="m">33</span>), Nazcoj(<span class="m">29</span>), wallex1983, queenesthr, uyilee(<span class="m">32</span>), yusuf01(<span class="m">32</span>), meetdopi(<span class="m">47</span>), daprophet(<span class="m">83</span>), melifew213, adebiyiait, jaittofidelix1(<span class="m">28</span>), Sijo01, TheRector(<span class="m">39</span>), faisal00(<span class="m">30</span>), globigpun(<span class="m">36</span>), 
Nerosoft19(<span class="m">22</span>), Havilah93(<span class="m">23</span>), passthem(<span class="m">28</span>), Bolt2011(<span class="m">29</span>), debbianah(<span class="f">25</span>), elemzyfinest(<span class="m">23</span>), Ayodeji1908(<span class="m">32</span>), aitanofi(<span class="m">36</span>), jibolarazor(<span class="m">24</span>), julius2825, LARRYDKING, Bibings, Seylad2009, wemicoal(<span class="m">24</span>), Mexyz(<span class="m">24</span>), cedaraustine, amyboy(<span class="m">26</span>), yemcoguy(<span class="m">31</span>), jendoslim(<span class="m">29</span>), zaye, oluebubesyd(<span class="m">20</span>), mustymatic(<span class="m">24</span>), markson48, izy4all(<span class="m">95</span>), omoga1908(<span class="m">32</span>), mojibbz(<span class="m">21</span>), endibe(<span class="m">24</span>), nolaniyonu(<span class="m">28</span>), shinacollins(<span class="m">38</span>), tolam4skywd(<span class="m">21</span>), Kenkesh(<span class="m">28</span>), Chibaba247(<span class="m">29</span>), emperorhenry(<span class="m">26</span>), haywhyze, samuelkingz(<span class="m">21</span>), Giofresh1, makaveli902, lordkizzy3(<span class="m">18</span>), funnysaint(<span class="m">37</span>), dungas30, lilryder(<span class="m">24</span>), Nikapetrelli, Ade001ng(<span class="m">38</span>), Gabriel6(<span class="m">22</span>), obami007(<span class="m">27</span>), jhorel(<span class="m">22</span>), Oketwin(<span class="m">30</span>), SeanRainfall(<span class="m">25</span>), obawolea(<span class="m">21</span>), kensyno(<span class="m">30</span>), ololaderhoda(<span class="m">24</span>), scarred9jan(<span class="m">33</span>), Hifijen(<span class="m">24</span>), LilyHomes(<span class="-">18</span>), MizTyna(<span class="f">26</span>), Edehngene(<span class="m">29</span>), maroedeks, TrippleA19, richyrichlady, Bonatti, ja2ken, KizzyyRae(<span class="f">19</span>), noblebirth, algonfidish(<span class="f">32</span>), Thobiy, Essienblaze(<span class="m">22</span>), 
nicekid4u(<span class="m">26</span>), hujjat(<span class="m">28</span>), Josephamstrong1(<span class="m">26</span>), LezDiva, dejavuh0007(<span class="m">21</span>), Cossie0000001(<span class="m">29</span>), olaxy2, Damfostopper(<span class="m">24</span>), kayve, brian08(<span class="m">30</span>), lindahelda(<span class="f">22</span>), Browndipson(<span class="m">20</span>), Quace(<span class="m">17</span>), Lorddj4real(<span class="m">40</span>), Sodijan(<span class="m">24</span>), uthlaw, princewill911, SimplyIFE(<span class="m">25</span>), Treazoure(<span class="m">28</span>), Proxy4ever(<span class="m">30</span>), ItzStone(<span class="m">25</span>), michael9ja(<span class="m">26</span>), dynasty231(<span class="m">91</span>), GudPpleG8Nation, easyreal(<span class="m">30</span>), Oluwashola01(<span class="m">22</span>), toulah, octal2003(<span class="m">35</span>), seeker63, SABA2002, egbostan, biggie73(<span class="m">23</span>), Globallords(<span class="m">21</span>), kcyarn(<span class="m">42</span>), Max124, moralex(<span class="m">37</span>), chukz999(<span class="m">26</span>), Hauwwyy21(<span class="f">24</span>), sirbendit(<span class="m">17</span>), Paulgracious, amainus01(<span class="m">28</span>), Kingsleyjoel44(<span class="m">19</span>), launchx431(<span class="f">34</span>), Quteezy(<span class="m">30</span>), tkpumping(<span class="m">23</span>), Kelchines(<span class="m">19</span>), taiwoakinlabi(<span class="m">24</span>), Dinma1908(<span class="f">23</span>), saintrita(<span class="f">20</span>), Lordsinger(<span class="m">21</span>)</td>, <td class="w"><iframe allowtransparency="true" frameborder="0" scrolling="no" src="//www.facebook.com/plugins/likebox.php?href=http%3A%2F%2Fwww.facebook.com%2Fnigerianforum&width=960&height=170&colorscheme=light&show_faces=true&header=false&stream=false&show_border=false&appId=214922901863083" style="border:none; overflow:hidden; width:960px; height:170;"></iframe></td>, <td class="small w 
grad"><p></p><form action="/search"> <input name="q" size="32" type="text"/> <input name="search" type="submit" value="Search"/></form>Sections: <a href="/politics">politics</a> <a href="/politics/1">(1)</a> <a href="/business">business</a> <a href="/autos">autos</a> <a href="/autos/1">(1)</a> <a href="/jobs">jobs</a> <a href="/jobs/1">(1)</a> <a href="/career">career</a> <a href="/education">education</a> <a href="/education/1">(1)</a> <a href="/romance">romance</a> <a href="/computers">computers</a> <a href="/phones">phones</a> <a href="/travel">travel</a> <a href="/sports">sports</a> <a href="/fashion">fashion</a> <a href="/health">health</a> <br/> <a href="/religion">religion</a> <a href="/celebs">celebs</a> <a href="/tv-movies">tv-movies</a> <a href="/music-radio">music-radio</a> <a href="/literature">literature</a> <a href="/webmasters">webmasters</a> <a href="/programming">programming</a> <a href="/techmarket">techmarket</a> <p>Links: <a href="/links">(0)</a> <a href="/links/1">(1)</a> <a href="/links/2">(2)</a> <a href="/links/3">(3)</a> <a href="/links/4">(4)</a> <a href="/links/5">(5)</a> <a href="/links/6">(6)</a> <a href="/links/7">(7)</a> <a href="/links/8">(8)</a> <a href="/links/9">(9)</a> </p><p><b><a href="/" title="Nigerian Forum">Nairaland</a></b> - Copyright © 2005 - 2016 <a href="http://www.seunosewa.com" title="Seun">Oluwaseun Osewa</a>. All rights reserved. See <a href="http://www.nairaland.com/1049481/how-place-targeted-ads-nairaland">How To Advertise</a>. 3<br/><b>Disclaimer</b>: Every Nairaland member is <b>solely responsible</b> for <b>anything</b> that he/she <b>posts</b> or <b>uploads</b> on Nairaland.</p></td>]
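In the output above, the Birthday list is the only cell carrying the class "homeuser", so it can also be picked out directly instead of scanning every cell. The sketch below is an assumption-laden illustration, not the author's method: it swaps BeautifulSoup for the standard-library `html.parser` so it runs without third-party packages, `BirthdayCellParser` is a hypothetical helper, and `sample_html` is a shortened stand-in for the real page.

```python
from html.parser import HTMLParser


class BirthdayCellParser(HTMLParser):
    """Collect the text found inside <td class="homeuser"> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.inside = False  # True while we are inside the target cell
        self.parts = []      # text fragments collected from that cell

    def handle_starttag(self, tag, attrs):
        if tag == "td" and dict(attrs).get("class") == "homeuser":
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.parts.append(data)


# A shortened stand-in for the real page, built from a few entries shown above.
sample_html = (
    '<table><tr><td class="w">Members Online</td>'
    '<td class="homeuser"><h3>Birthdays:</h3>gabng(<span class="m">31</span>), '
    'TadeDada, debbianah(<span class="f">25</span>)</td></tr></table>'
)

parser = BirthdayCellParser()
parser.feed(sample_html)
birthday_text = "".join(parser.parts)
print(birthday_text)
```

Targeting the cell by class depends on the site keeping that class name, which is an assumption based only on the HTML captured here.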
Let's strip out all the irrelevant text and keep only the birthday list in the format: username, age, so it can be saved in a CSV file.
# Let's print only the text of each table cell, ignoring the tags
for data in soup_data("td"):
    print(data.text)
₦airaland Forum Welcome, Guest: Join Nairaland / Login / Trending / Recent / NewStats: 1,637,780 members, 3,033,792 topics. Date: Friday, 19 August 2016 at 12:02 PM Nairaland / General: Politics, Crime, Romance, Jobs/Vacancies, Career, Business, Investment, NYSC, Education, Autos, Car Talk, Properties, Health, Travel, Family, Culture, Religion, Food, Diaries, Nairaland Ads, Pets, Agriculture Entertainment: Jokes Etc, TV/Movies, Music/Radio, Celebrities, Fashion, Events, Sports, Gaming, Forum Games, Literature Science/Technology: Programming, Webmasters, Computers, Phones, Art, Graphics & Video, Technology Market Featured Links / Twitter / Facebook / How To Advertise » Usain Bolt Wins 200m Gold At The RIO 2016 Olympics: His 8th Olympic Gold (Photos) «» See How Tattoo Damaged The Skin Of Actress Anita Joseph (Pics) «» Naira Sinks To All-Time Low Of 365.25/Dollar «» "See The Bush Meat A Friend And I Killed" - Hillarie (Photos) «» US Secretary Of State, John Kerry, To Visit Nigeria Next Week «» Photo Of The 3 Herdsmen Who Kidnapped Law Maker, Sani Bello & The Money Recovered «» A Brother Organised Robbers To Attack His Pregnant Sister In Lagos, Maid Raped (Pics) «» Actors And Actresses Storm Comic Star Actor, Aluwe’s Mom's Burial (Pics) «» "5 Keys To Long Term Success In Your Business" «» Senator Dino Melaye And His Children Tour Georgia (Photos) «» "These Would Have Been The Nigerian Names Of These Foreign Celebrities" (Photos) «» Team Nigeria Kits Arrive 3 Days To End Of Olympics Games «» “Kill Yourself If You Don’t Like Me” – Actress Angela Okorie Tells Critics (Pics) «» Only Those Who Went To Secondary School In Nigeria Will Understand These (Pics) «» Dogged By Ill Health, Marital Feuds, Emeka Offor’s Crisis Deepens - Sahara Reporters «» "I Am Living In An Unhappy Marriage" - Please Advice «» Reasons Why Memorising The Quran Is Good For Your Brain «» 10 Tips For Concentration In Prayer «» Declare Niger Delta Republic And Face Treason - Police Tell Militants «» What 
Does Love Look Like To You? «» See What FUNAAB Students Did To A Thief Who Stole Tecno M3 «» See What They Did To A Nairalander After Writing His Final Exams In AAU, Ekpoma «» This Excited Woman Couldn't Hold Herself As Oshiomhole Passed By (Photos) «» Anything Wrong With This Photo Of A Father And His Step Daughter? «» Is This A Haircut Or Head Cut? (Photos) «» "Should I Pay Him N500,000 For This Job?" - Judeibro «» Actress Funke Adesiyan Covers Her Private Part With Phone During Selfie, Fans Go Gaga «» Uti Nwachukwu Throws Shot At Timi Dakolo Over Expensive Marriage Joke (Pics) «» Comedian AY Shares Lovely Photos In Celebration Of His 38th Birthday «» Nollywood Child Star Looks Almost Unrecognizable 15 Years After (Photos) «» Economy: Buhari Rejects IMF And World Bank Prescriptions «» Lagos Ranks World’s 3rd Worst City To Live In By Economist Intelligence Unit «» "Some People Exported Stones To Claim Export Grant" – Adeosun «» Terrorism: Airforce Expecting 12 Attack Helicopter Gunships From Russia «» "Outgoing Egyptian Ambassador Travelled By Road From Maiduguri To Yobe" - Buhari «» Several Prisoners Killed In Abakiliki Foiled Jail Break «» Abubakar ‘Abusidiq’ Usman: The True Story Of My Arrest By EFCC «» Navy Rescues Hijacked British Vessel, ‘MT VECTIS OSPREY’ From Sea Pirates (Picture) «» FIRS Seals Senator Akume’s Hotel Over N13.5 Million Unpaid Taxes «» Ogun Emerges Nigeria’s Mining Capital «» APC Blasts PDP: "You Lack The Moral Basis To Comment On Nigerian Economy" «» Budget Padding: 10 Principal Officers Disown Jibrin «» Brent Crude Oil Rise From $47.06 To $50 Per Barrel «» "Five Things I Would Learn To Do If I Was Jobless" «» Opera Max: How Does It Work? «» Babcock University Set To Graduate Set Of Maiden Doctors «» How Much Are Bakery Workers Paid? «» Lagos Govt Plans 50 Housing Units In Every LGA «» The Seven Types Of Drivers In Nigeria: Which One Are You? 
«» How Website Errors Affect Search Engine Rankings «» "Bloodshot" A Story By Godmother «» "My Husband Impregnated His 'Sister' In Our Matrimonial Home" - Wife «» Actress Rukky Sanda Shows Off Her Living Room (Photo) «» Kelechi Iheanacho Signs New 5-year Deal With Man City «» "Help! VIS Officials Just Meted An Injustice On Me" - Drabeey (Pics) «» "Student Shot During Protest In FUNAAB Is Not Dead, He's Receiving Treatment" «» Pastor Who Flogged Girl For Having Sex Accused Of Sleeping With Junior Pastor's Wife «» Area Boys Raid Shops After Bloody Clash In Abule-Ado, Lagos (Photos) «» Jonathan Reacts To Reports Linking Him To Militants «» Photo Of Governor Okorocha Relaxing With His Grandchildren & Daughter «» Singer Muma Gee And Actor Prince Eke Welcome Baby Girl! (Photos) «» Nnamdi Kanu Writes British Government «» Is This True Love? Young Girl Flaunts Her Aged White Husband (Photos) «» INEC Announces Results Of Cancelled Tai Election, Rivers, Declares APC Winner «» Dj Cuppy And Her Billionaire Dad, Femi Otedola Step Out For Lunch In Luxurious Style « (0) (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) Members Online: (2944 Members and 6196 Guests online in last 5 minutes!) 
Birthdays:gabng(31), olaeffect(40), TadeDada, wildchild1, KMB, seunny4lif(29), Samdurance(33), Nazcoj(29), wallex1983, queenesthr, uyilee(32), yusuf01(32), meetdopi(47), daprophet(83), melifew213, adebiyiait, jaittofidelix1(28), Sijo01, TheRector(39), faisal00(30), globigpun(36), Nerosoft19(22), Havilah93(23), passthem(28), Bolt2011(29), debbianah(25), elemzyfinest(23), Ayodeji1908(32), aitanofi(36), jibolarazor(24), julius2825, LARRYDKING, Bibings, Seylad2009, wemicoal(24), Mexyz(24), cedaraustine, amyboy(26), yemcoguy(31), jendoslim(29), zaye, oluebubesyd(20), mustymatic(24), markson48, izy4all(95), omoga1908(32), mojibbz(21), endibe(24), nolaniyonu(28), shinacollins(38), tolam4skywd(21), Kenkesh(28), Chibaba247(29), emperorhenry(26), haywhyze, samuelkingz(21), Giofresh1, makaveli902, lordkizzy3(18), funnysaint(37), dungas30, lilryder(24), Nikapetrelli, Ade001ng(38), Gabriel6(22), obami007(27), jhorel(22), Oketwin(30), SeanRainfall(25), obawolea(21), kensyno(30), ololaderhoda(24), scarred9jan(33), Hifijen(24), LilyHomes(18), MizTyna(26), Edehngene(29), maroedeks, TrippleA19, richyrichlady, Bonatti, ja2ken, KizzyyRae(19), noblebirth, algonfidish(32), Thobiy, Essienblaze(22), nicekid4u(26), hujjat(28), Josephamstrong1(26), LezDiva, dejavuh0007(21), Cossie0000001(29), olaxy2, Damfostopper(24), kayve, brian08(30), lindahelda(22), Browndipson(20), Quace(17), Lorddj4real(40), Sodijan(24), uthlaw, princewill911, SimplyIFE(25), Treazoure(28), Proxy4ever(30), ItzStone(25), michael9ja(26), dynasty231(91), GudPpleG8Nation, easyreal(30), Oluwashola01(22), toulah, octal2003(35), seeker63, SABA2002, egbostan, biggie73(23), Globallords(21), kcyarn(42), Max124, moralex(37), chukz999(26), Hauwwyy21(24), sirbendit(17), Paulgracious, amainus01(28), Kingsleyjoel44(19), launchx431(34), Quteezy(30), tkpumping(23), Kelchines(19), taiwoakinlabi(24), Dinma1908(23), saintrita(20), Lordsinger(21) Sections: politics (1) business autos (1) jobs (1) career education (1) romance computers 
phones travel sports fashion health religion celebs tv-movies music-radio literature webmasters programming techmarket Links: (0) (1) (2) (3) (4) (5) (6) (7) (8) (9) Nairaland - Copyright © 2005 - 2016 Oluwaseun Osewa. All rights reserved. See How To Advertise. 3Disclaimer: Every Nairaland member is solely responsible for anything that he/she posts or uploads on Nairaland.
# Obviously, we don't need all of the text above, so use the 're' module to extract only the relevant birthday list
# Note: I will ignore members whose ages are not displayed, so that we don't have to deal with NaN values in our data
member_found = None
re_match = r"\w+\(\d+\)" # one or more word characters, followed by '(', one or more digits, then ')'
for data in soup_data("td"):
    data_found = re.findall(re_match, data.text)
    if data_found:
        member_found = data_found # keep the matches from the last table cell that contains any
print (member_found)
['gabng(31)', 'olaeffect(40)', 'seunny4lif(29)', 'Samdurance(33)', 'Nazcoj(29)', 'uyilee(32)', 'yusuf01(32)', 'meetdopi(47)', 'daprophet(83)', 'jaittofidelix1(28)', 'TheRector(39)', 'faisal00(30)', 'globigpun(36)', 'Nerosoft19(22)', 'Havilah93(23)', 'passthem(28)', 'Bolt2011(29)', 'debbianah(25)', 'elemzyfinest(23)', 'Ayodeji1908(32)', 'aitanofi(36)', 'jibolarazor(24)', 'wemicoal(24)', 'Mexyz(24)', 'amyboy(26)', 'yemcoguy(31)', 'jendoslim(29)', 'oluebubesyd(20)', 'mustymatic(24)', 'izy4all(95)', 'omoga1908(32)', 'mojibbz(21)', 'endibe(24)', 'nolaniyonu(28)', 'shinacollins(38)', 'tolam4skywd(21)', 'Kenkesh(28)', 'Chibaba247(29)', 'emperorhenry(26)', 'samuelkingz(21)', 'lordkizzy3(18)', 'funnysaint(37)', 'lilryder(24)', 'Ade001ng(38)', 'Gabriel6(22)', 'obami007(27)', 'jhorel(22)', 'Oketwin(30)', 'SeanRainfall(25)', 'obawolea(21)', 'kensyno(30)', 'ololaderhoda(24)', 'scarred9jan(33)', 'Hifijen(24)', 'LilyHomes(18)', 'MizTyna(26)', 'Edehngene(29)', 'KizzyyRae(19)', 'algonfidish(32)', 'Essienblaze(22)', 'nicekid4u(26)', 'hujjat(28)', 'Josephamstrong1(26)', 'dejavuh0007(21)', 'Cossie0000001(29)', 'Damfostopper(24)', 'brian08(30)', 'lindahelda(22)', 'Browndipson(20)', 'Quace(17)', 'Lorddj4real(40)', 'Sodijan(24)', 'SimplyIFE(25)', 'Treazoure(28)', 'Proxy4ever(30)', 'ItzStone(25)', 'michael9ja(26)', 'dynasty231(91)', 'easyreal(30)', 'Oluwashola01(22)', 'octal2003(35)', 'biggie73(23)', 'Globallords(21)', 'kcyarn(42)', 'moralex(37)', 'chukz999(26)', 'Hauwwyy21(24)', 'sirbendit(17)', 'amainus01(28)', 'Kingsleyjoel44(19)', 'launchx431(34)', 'Quteezy(30)', 'tkpumping(23)', 'Kelchines(19)', 'taiwoakinlabi(24)', 'Dinma1908(23)', 'saintrita(20)', 'Lordsinger(21)']
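To see what the pattern keeps and what it drops, here is a small sketch run on a short excerpt of the list above instead of the live page. The variable names `sample` and `pattern` are only for illustration.

```python
import re

# A short excerpt of the Birthdays text; some members show an age, some don't
sample = "gabng(31), olaeffect(40), TadeDada, wildchild1, seunny4lif(29)"

# One or more word characters, then '(', one or more digits, then ')'
pattern = r"\w+\(\d+\)"

matches = re.findall(pattern, sample)
print(matches)  # ['gabng(31)', 'olaeffect(40)', 'seunny4lif(29)']
```

Members without an age, such as TadeDada and wildchild1, are skipped because nothing in brackets follows their usernames.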
# Let's further clean up the list to separate the usernames from the ages
# Use a list comprehension to remove the closing parenthesis ")" from each item in member_found above
member_found_replaced = [x.replace(")", "") for x in member_found] # replaces ")" with ""
print (member_found_replaced)
['gabng(31', 'olaeffect(40', 'seunny4lif(29', 'Samdurance(33', 'Nazcoj(29', 'uyilee(32', 'yusuf01(32', 'meetdopi(47', 'daprophet(83', 'jaittofidelix1(28', 'TheRector(39', 'faisal00(30', 'globigpun(36', 'Nerosoft19(22', 'Havilah93(23', 'passthem(28', 'Bolt2011(29', 'debbianah(25', 'elemzyfinest(23', 'Ayodeji1908(32', 'aitanofi(36', 'jibolarazor(24', 'wemicoal(24', 'Mexyz(24', 'amyboy(26', 'yemcoguy(31', 'jendoslim(29', 'oluebubesyd(20', 'mustymatic(24', 'izy4all(95', 'omoga1908(32', 'mojibbz(21', 'endibe(24', 'nolaniyonu(28', 'shinacollins(38', 'tolam4skywd(21', 'Kenkesh(28', 'Chibaba247(29', 'emperorhenry(26', 'samuelkingz(21', 'lordkizzy3(18', 'funnysaint(37', 'lilryder(24', 'Ade001ng(38', 'Gabriel6(22', 'obami007(27', 'jhorel(22', 'Oketwin(30', 'SeanRainfall(25', 'obawolea(21', 'kensyno(30', 'ololaderhoda(24', 'scarred9jan(33', 'Hifijen(24', 'LilyHomes(18', 'MizTyna(26', 'Edehngene(29', 'KizzyyRae(19', 'algonfidish(32', 'Essienblaze(22', 'nicekid4u(26', 'hujjat(28', 'Josephamstrong1(26', 'dejavuh0007(21', 'Cossie0000001(29', 'Damfostopper(24', 'brian08(30', 'lindahelda(22', 'Browndipson(20', 'Quace(17', 'Lorddj4real(40', 'Sodijan(24', 'SimplyIFE(25', 'Treazoure(28', 'Proxy4ever(30', 'ItzStone(25', 'michael9ja(26', 'dynasty231(91', 'easyreal(30', 'Oluwashola01(22', 'octal2003(35', 'biggie73(23', 'Globallords(21', 'kcyarn(42', 'moralex(37', 'chukz999(26', 'Hauwwyy21(24', 'sirbendit(17', 'amainus01(28', 'Kingsleyjoel44(19', 'launchx431(34', 'Quteezy(30', 'tkpumping(23', 'Kelchines(19', 'taiwoakinlabi(24', 'Dinma1908(23', 'saintrita(20', 'Lordsinger(21']
# Now split each item of "member_found_replaced" on the '(' between the username and the age
# We use a for loop to go through each item of the "member_found_replaced" list above
for y in member_found_replaced:
    member_cleaned = y.split("(")
    print (member_cleaned)
# Each "member_cleaned" is an individual list with two elements: the username and the age
# Let's combine all the lists into a dictionary
['gabng', '31'] ['olaeffect', '40'] ['seunny4lif', '29'] ['Samdurance', '33'] ['Nazcoj', '29'] ['uyilee', '32'] ['yusuf01', '32'] ['meetdopi', '47'] ['daprophet', '83'] ['jaittofidelix1', '28'] ['TheRector', '39'] ['faisal00', '30'] ['globigpun', '36'] ['Nerosoft19', '22'] ['Havilah93', '23'] ['passthem', '28'] ['Bolt2011', '29'] ['debbianah', '25'] ['elemzyfinest', '23'] ['Ayodeji1908', '32'] ['aitanofi', '36'] ['jibolarazor', '24'] ['wemicoal', '24'] ['Mexyz', '24'] ['amyboy', '26'] ['yemcoguy', '31'] ['jendoslim', '29'] ['oluebubesyd', '20'] ['mustymatic', '24'] ['izy4all', '95'] ['omoga1908', '32'] ['mojibbz', '21'] ['endibe', '24'] ['nolaniyonu', '28'] ['shinacollins', '38'] ['tolam4skywd', '21'] ['Kenkesh', '28'] ['Chibaba247', '29'] ['emperorhenry', '26'] ['samuelkingz', '21'] ['lordkizzy3', '18'] ['funnysaint', '37'] ['lilryder', '24'] ['Ade001ng', '38'] ['Gabriel6', '22'] ['obami007', '27'] ['jhorel', '22'] ['Oketwin', '30'] ['SeanRainfall', '25'] ['obawolea', '21'] ['kensyno', '30'] ['ololaderhoda', '24'] ['scarred9jan', '33'] ['Hifijen', '24'] ['LilyHomes', '18'] ['MizTyna', '26'] ['Edehngene', '29'] ['KizzyyRae', '19'] ['algonfidish', '32'] ['Essienblaze', '22'] ['nicekid4u', '26'] ['hujjat', '28'] ['Josephamstrong1', '26'] ['dejavuh0007', '21'] ['Cossie0000001', '29'] ['Damfostopper', '24'] ['brian08', '30'] ['lindahelda', '22'] ['Browndipson', '20'] ['Quace', '17'] ['Lorddj4real', '40'] ['Sodijan', '24'] ['SimplyIFE', '25'] ['Treazoure', '28'] ['Proxy4ever', '30'] ['ItzStone', '25'] ['michael9ja', '26'] ['dynasty231', '91'] ['easyreal', '30'] ['Oluwashola01', '22'] ['octal2003', '35'] ['biggie73', '23'] ['Globallords', '21'] ['kcyarn', '42'] ['moralex', '37'] ['chukz999', '26'] ['Hauwwyy21', '24'] ['sirbendit', '17'] ['amainus01', '28'] ['Kingsleyjoel44', '19'] ['launchx431', '34'] ['Quteezy', '30'] ['tkpumping', '23'] ['Kelchines', '19'] ['taiwoakinlabi', '24'] ['Dinma1908', '23'] ['saintrita', '20'] ['Lordsinger', '21']
# We first declare "member_cleaned" as an empty dictionary, so we can add each username/age pair above to it
member_cleaned = {}
for y in member_found_replaced:
    temp_data = y.split("(")
    member_cleaned[temp_data[0]] = int(temp_data[1])
print (member_cleaned)
{'dejavuh0007': 21, 'ololaderhoda': 24, 'mustymatic': 24, 'scarred9jan': 33, 'shinacollins': 38, 'globigpun': 36, 'sirbendit': 17, 'Kenkesh': 28, 'faisal00': 30, 'omoga1908': 32, 'samuelkingz': 21, 'amainus01': 28, 'TheRector': 39, 'Gabriel6': 22, 'Lorddj4real': 40, 'Kelchines': 19, 'easyreal': 30, 'Proxy4ever': 30, 'taiwoakinlabi': 24, 'jhorel': 22, 'jendoslim': 29, 'wemicoal': 24, 'ItzStone': 25, 'mojibbz': 21, 'Dinma1908': 23, 'Ayodeji1908': 32, 'Nazcoj': 29, 'jaittofidelix1': 28, 'seunny4lif': 29, 'brian08': 30, 'gabng': 31, 'Cossie0000001': 29, 'hujjat': 28, 'meetdopi': 47, 'SeanRainfall': 25, 'algonfidish': 32, 'nicekid4u': 26, 'amyboy': 26, 'Oketwin': 30, 'kcyarn': 42, 'elemzyfinest': 23, 'Quace': 17, 'Ade001ng': 38, 'Quteezy': 30, 'dynasty231': 91, 'lindahelda': 22, 'Lordsinger': 21, 'obawolea': 21, 'obami007': 27, 'kensyno': 30, 'Mexyz': 24, 'biggie73': 23, 'izy4all': 95, 'octal2003': 35, 'Sodijan': 24, 'oluebubesyd': 20, 'michael9ja': 26, 'SimplyIFE': 25, 'Edehngene': 29, 'Oluwashola01': 22, 'launchx431': 34, 'Samdurance': 33, 'Nerosoft19': 22, 'Havilah93': 23, 'Josephamstrong1': 26, 'emperorhenry': 26, 'Kingsleyjoel44': 19, 'Bolt2011': 29, 'olaeffect': 40, 'chukz999': 26, 'Essienblaze': 22, 'tkpumping': 23, 'yemcoguy': 31, 'saintrita': 20, 'moralex': 37, 'LilyHomes': 18, 'daprophet': 83, 'aitanofi': 36, 'Browndipson': 20, 'lilryder': 24, 'tolam4skywd': 21, 'KizzyyRae': 19, 'Globallords': 21, 'uyilee': 32, 'Chibaba247': 29, 'Hifijen': 24, 'endibe': 24, 'funnysaint': 37, 'Treazoure': 28, 'Damfostopper': 24, 'Hauwwyy21': 24, 'nolaniyonu': 28, 'lordkizzy3': 18, 'jibolarazor': 24, 'MizTyna': 26, 'passthem': 28, 'yusuf01': 32, 'debbianah': 25}
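As an alternative to the replace-then-split steps above, the same dictionary could probably be built in one pass with capture groups in the regular expression. This is only a sketch on a short excerpt of the list, not the method used in this notebook; `sample`, `pairs` and `member_ages` are names made up for the example.

```python
import re

# A short excerpt of the Birthdays text
sample = "gabng(31), olaeffect(40), TadeDada, seunny4lif(29)"

# The two pairs of parentheses without a backslash are capture groups:
# group 1 is the username, group 2 is the age
pairs = re.findall(r"(\w+)\((\d+)\)", sample)

# Build the username -> age dictionary, converting each age to int
member_ages = {username: int(age) for username, age in pairs}
print(member_ages)  # {'gabng': 31, 'olaeffect': 40, 'seunny4lif': 29}
```

With capture groups, `re.findall` returns a list of `(username, age)` tuples, so no string clean-up is needed afterwards.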
# Convert the dictionary "member_cleaned" above into a pandas DataFrame
# Note: in Python 3, we have to convert the dictionary items into a list to work with a pandas DataFrame
# define the column names
columns_name = ["Username", "Age"]
# df = pd.DataFrame(member_cleaned.items(), columns = columns_name ) # this is for python 2
df = pd.DataFrame(list(member_cleaned.items()), columns = columns_name )
df
| | Username | Age |
---|---|---|
0 | dejavuh0007 | 21 |
1 | ololaderhoda | 24 |
2 | mustymatic | 24 |
3 | scarred9jan | 33 |
4 | shinacollins | 38 |
5 | globigpun | 36 |
6 | sirbendit | 17 |
7 | Kenkesh | 28 |
8 | faisal00 | 30 |
9 | omoga1908 | 32 |
10 | samuelkingz | 21 |
11 | amainus01 | 28 |
12 | TheRector | 39 |
13 | Gabriel6 | 22 |
14 | Lorddj4real | 40 |
15 | Kelchines | 19 |
16 | easyreal | 30 |
17 | Proxy4ever | 30 |
18 | taiwoakinlabi | 24 |
19 | jhorel | 22 |
20 | jendoslim | 29 |
21 | wemicoal | 24 |
22 | ItzStone | 25 |
23 | mojibbz | 21 |
24 | Dinma1908 | 23 |
25 | Ayodeji1908 | 32 |
26 | Nazcoj | 29 |
27 | jaittofidelix1 | 28 |
28 | seunny4lif | 29 |
29 | brian08 | 30 |
... | ... | ... |
68 | olaeffect | 40 |
69 | chukz999 | 26 |
70 | Essienblaze | 22 |
71 | tkpumping | 23 |
72 | yemcoguy | 31 |
73 | saintrita | 20 |
74 | moralex | 37 |
75 | LilyHomes | 18 |
76 | daprophet | 83 |
77 | aitanofi | 36 |
78 | Browndipson | 20 |
79 | lilryder | 24 |
80 | tolam4skywd | 21 |
81 | KizzyyRae | 19 |
82 | Globallords | 21 |
83 | uyilee | 32 |
84 | Chibaba247 | 29 |
85 | Hifijen | 24 |
86 | endibe | 24 |
87 | funnysaint | 37 |
88 | Treazoure | 28 |
89 | Damfostopper | 24 |
90 | Hauwwyy21 | 24 |
91 | nolaniyonu | 28 |
92 | lordkizzy3 | 18 |
93 | jibolarazor | 24 |
94 | MizTyna | 26 |
95 | passthem | 28 |
96 | yusuf01 | 32 |
97 | debbianah | 25 |
98 rows × 2 columns
# Let's add a column for today's date
# using the datetime module
todays_date = datetime.now().date()
df["Date"] = todays_date
df
| | Username | Age | Date |
---|---|---|---|
0 | dejavuh0007 | 21 | 2016-08-19 |
1 | ololaderhoda | 24 | 2016-08-19 |
2 | mustymatic | 24 | 2016-08-19 |
3 | scarred9jan | 33 | 2016-08-19 |
4 | shinacollins | 38 | 2016-08-19 |
5 | globigpun | 36 | 2016-08-19 |
6 | sirbendit | 17 | 2016-08-19 |
7 | Kenkesh | 28 | 2016-08-19 |
8 | faisal00 | 30 | 2016-08-19 |
9 | omoga1908 | 32 | 2016-08-19 |
10 | samuelkingz | 21 | 2016-08-19 |
11 | amainus01 | 28 | 2016-08-19 |
12 | TheRector | 39 | 2016-08-19 |
13 | Gabriel6 | 22 | 2016-08-19 |
14 | Lorddj4real | 40 | 2016-08-19 |
15 | Kelchines | 19 | 2016-08-19 |
16 | easyreal | 30 | 2016-08-19 |
17 | Proxy4ever | 30 | 2016-08-19 |
18 | taiwoakinlabi | 24 | 2016-08-19 |
19 | jhorel | 22 | 2016-08-19 |
20 | jendoslim | 29 | 2016-08-19 |
21 | wemicoal | 24 | 2016-08-19 |
22 | ItzStone | 25 | 2016-08-19 |
23 | mojibbz | 21 | 2016-08-19 |
24 | Dinma1908 | 23 | 2016-08-19 |
25 | Ayodeji1908 | 32 | 2016-08-19 |
26 | Nazcoj | 29 | 2016-08-19 |
27 | jaittofidelix1 | 28 | 2016-08-19 |
28 | seunny4lif | 29 | 2016-08-19 |
29 | brian08 | 30 | 2016-08-19 |
... | ... | ... | ... |
68 | olaeffect | 40 | 2016-08-19 |
69 | chukz999 | 26 | 2016-08-19 |
70 | Essienblaze | 22 | 2016-08-19 |
71 | tkpumping | 23 | 2016-08-19 |
72 | yemcoguy | 31 | 2016-08-19 |
73 | saintrita | 20 | 2016-08-19 |
74 | moralex | 37 | 2016-08-19 |
75 | LilyHomes | 18 | 2016-08-19 |
76 | daprophet | 83 | 2016-08-19 |
77 | aitanofi | 36 | 2016-08-19 |
78 | Browndipson | 20 | 2016-08-19 |
79 | lilryder | 24 | 2016-08-19 |
80 | tolam4skywd | 21 | 2016-08-19 |
81 | KizzyyRae | 19 | 2016-08-19 |
82 | Globallords | 21 | 2016-08-19 |
83 | uyilee | 32 | 2016-08-19 |
84 | Chibaba247 | 29 | 2016-08-19 |
85 | Hifijen | 24 | 2016-08-19 |
86 | endibe | 24 | 2016-08-19 |
87 | funnysaint | 37 | 2016-08-19 |
88 | Treazoure | 28 | 2016-08-19 |
89 | Damfostopper | 24 | 2016-08-19 |
90 | Hauwwyy21 | 24 | 2016-08-19 |
91 | nolaniyonu | 28 | 2016-08-19 |
92 | lordkizzy3 | 18 | 2016-08-19 |
93 | jibolarazor | 24 | 2016-08-19 |
94 | MizTyna | 26 | 2016-08-19 |
95 | passthem | 28 | 2016-08-19 |
96 | yusuf01 | 32 | 2016-08-19 |
97 | debbianah | 25 | 2016-08-19 |
98 rows × 3 columns
# Let's save the DataFrame to a CSV file
# We name the CSV file after the current date, e.g. 14/08/2016 becomes 20160814 in the file name
csv_name = todays_date.strftime("%Y%m%d")
df.to_csv(csv_name + ".csv")
After you have collected a full month of data, you can merge all the CSV files for that month into one file using the pandas concat() function. concat() takes a list of DataFrames (one per CSV file) and joins them together.
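The merge could look roughly like the sketch below. It first writes two tiny sample files into a temporary folder so that it runs on its own; the rows, the second date and the name `monthly_df` are made up for illustration. It also assumes the daily files were written with `index=False`; files saved with the default index will carry an extra unnamed column.

```python
import glob
import os
import tempfile

import pandas as pd

# Create two small daily files (sample rows only) named in the YYYYMMDD.csv style
folder = tempfile.mkdtemp()
day_1 = pd.DataFrame({"Username": ["gabng"], "Age": [31], "Date": ["2016-08-19"]})
day_2 = pd.DataFrame({"Username": ["rodbel"], "Age": [29], "Date": ["2016-08-20"]})
day_1.to_csv(os.path.join(folder, "20160819.csv"), index=False)
day_2.to_csv(os.path.join(folder, "20160820.csv"), index=False)

# Collect every CSV file for August 2016, in date order
daily_files = sorted(glob.glob(os.path.join(folder, "201608*.csv")))

# Read each file into a DataFrame and stack them into one
monthly_df = pd.concat([pd.read_csv(f) for f in daily_files], ignore_index=True)
print(monthly_df)
```

`ignore_index=True` renumbers the rows from 0, so the merged data does not repeat each day's row numbers.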
To analyze and visualize our data, below are some of the questions we are going to answer:
a) How many members are celebrating their birthdays today?
b) Who are the oldest and youngest members celebrating their birthdays today?
c) What is the average age of the celebrants?
d) How old will each celebrant be in 10 years?
e) How old was each celebrant when NairaLand was established?
# Checking the statistical summary of the age column
df.describe()
| | Age |
---|---|
count | 98.000000 |
mean | 29.010204 |
std | 12.404268 |
min | 17.000000 |
25% | 23.000000 |
50% | 26.000000 |
75% | 30.750000 |
max | 95.000000 |
From the summary above, you will see the count. This is the number of members celebrating their birthdays today, which is the same as the number of rows (records) in our data. Note that it covers only members whose ages are displayed, since the others were skipped during scraping.
From the summary above, we can also see the minimum (youngest) age and the maximum (oldest) age. To find the usernames, we use the "sort_values" method to sort in ascending and descending order.
# The 10 oldest members celebrating
df.sort_values(by="Age", ascending=False)[:10]
| | Username | Age | Date |
---|---|---|---|
52 | izy4all | 95 | 2016-08-19 |
44 | dynasty231 | 91 | 2016-08-19 |
76 | daprophet | 83 | 2016-08-19 |
33 | meetdopi | 47 | 2016-08-19 |
39 | kcyarn | 42 | 2016-08-19 |
68 | olaeffect | 40 | 2016-08-19 |
14 | Lorddj4real | 40 | 2016-08-19 |
12 | TheRector | 39 | 2016-08-19 |
4 | shinacollins | 38 | 2016-08-19 |
42 | Ade001ng | 38 | 2016-08-19 |
# First 10 youngest members celebrating
df.sort_values(by="Age", ascending=True)[:10]
| | Username | Age | Date |
---|---|---|---|
6 | sirbendit | 17 | 2016-08-19 |
41 | Quace | 17 | 2016-08-19 |
75 | LilyHomes | 18 | 2016-08-19 |
92 | lordkizzy3 | 18 | 2016-08-19 |
66 | Kingsleyjoel44 | 19 | 2016-08-19 |
15 | Kelchines | 19 | 2016-08-19 |
81 | KizzyyRae | 19 | 2016-08-19 |
55 | oluebubesyd | 20 | 2016-08-19 |
73 | saintrita | 20 | 2016-08-19 |
78 | Browndipson | 20 | 2016-08-19 |
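If you only need the single oldest and youngest member rather than a sorted table, a sketch like the one below may be enough. It uses a small sample DataFrame (`sample_df`, a name made up here) with a few rows taken from the data above.

```python
import pandas as pd

# A few rows from the data above, just for illustration
sample_df = pd.DataFrame({
    "Username": ["izy4all", "sirbendit", "gabng", "meetdopi"],
    "Age": [95, 17, 31, 47],
})

# idxmax()/idxmin() return the row label of the largest/smallest age
oldest = sample_df.loc[sample_df["Age"].idxmax()]
youngest = sample_df.loc[sample_df["Age"].idxmin()]
print(oldest["Username"], oldest["Age"])      # izy4all 95
print(youngest["Username"], youngest["Age"])  # sirbendit 17

# nlargest() is a shortcut for "sort descending and take the first n rows"
top_2 = sample_df.nlargest(2, "Age")
print(top_2)
```

When several members share the same age, `idxmax()` and `idxmin()` return only the first one, so the sorted tables above are still the better choice for seeing ties.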
From the summary above, you will see the mean. This is the average age of the celebrants in our DataFrame.
To answer this, let's add 10 to the Age column and save the result in a new column, "Age_10_Plus".
# To answer: how old will each celebrant be in 10 years?
df["Age_10_Plus"] = df["Age"] + 10
df
| | Username | Age | Date | Age_10_Plus |
---|---|---|---|---|
0 | dejavuh0007 | 21 | 2016-08-19 | 31 |
1 | ololaderhoda | 24 | 2016-08-19 | 34 |
2 | mustymatic | 24 | 2016-08-19 | 34 |
3 | scarred9jan | 33 | 2016-08-19 | 43 |
4 | shinacollins | 38 | 2016-08-19 | 48 |
5 | globigpun | 36 | 2016-08-19 | 46 |
6 | sirbendit | 17 | 2016-08-19 | 27 |
7 | Kenkesh | 28 | 2016-08-19 | 38 |
8 | faisal00 | 30 | 2016-08-19 | 40 |
9 | omoga1908 | 32 | 2016-08-19 | 42 |
10 | samuelkingz | 21 | 2016-08-19 | 31 |
11 | amainus01 | 28 | 2016-08-19 | 38 |
12 | TheRector | 39 | 2016-08-19 | 49 |
13 | Gabriel6 | 22 | 2016-08-19 | 32 |
14 | Lorddj4real | 40 | 2016-08-19 | 50 |
15 | Kelchines | 19 | 2016-08-19 | 29 |
16 | easyreal | 30 | 2016-08-19 | 40 |
17 | Proxy4ever | 30 | 2016-08-19 | 40 |
18 | taiwoakinlabi | 24 | 2016-08-19 | 34 |
19 | jhorel | 22 | 2016-08-19 | 32 |
20 | jendoslim | 29 | 2016-08-19 | 39 |
21 | wemicoal | 24 | 2016-08-19 | 34 |
22 | ItzStone | 25 | 2016-08-19 | 35 |
23 | mojibbz | 21 | 2016-08-19 | 31 |
24 | Dinma1908 | 23 | 2016-08-19 | 33 |
25 | Ayodeji1908 | 32 | 2016-08-19 | 42 |
26 | Nazcoj | 29 | 2016-08-19 | 39 |
27 | jaittofidelix1 | 28 | 2016-08-19 | 38 |
28 | seunny4lif | 29 | 2016-08-19 | 39 |
29 | brian08 | 30 | 2016-08-19 | 40 |
... | ... | ... | ... | ... |
68 | olaeffect | 40 | 2016-08-19 | 50 |
69 | chukz999 | 26 | 2016-08-19 | 36 |
70 | Essienblaze | 22 | 2016-08-19 | 32 |
71 | tkpumping | 23 | 2016-08-19 | 33 |
72 | yemcoguy | 31 | 2016-08-19 | 41 |
73 | saintrita | 20 | 2016-08-19 | 30 |
74 | moralex | 37 | 2016-08-19 | 47 |
75 | LilyHomes | 18 | 2016-08-19 | 28 |
76 | daprophet | 83 | 2016-08-19 | 93 |
77 | aitanofi | 36 | 2016-08-19 | 46 |
78 | Browndipson | 20 | 2016-08-19 | 30 |
79 | lilryder | 24 | 2016-08-19 | 34 |
80 | tolam4skywd | 21 | 2016-08-19 | 31 |
81 | KizzyyRae | 19 | 2016-08-19 | 29 |
82 | Globallords | 21 | 2016-08-19 | 31 |
83 | uyilee | 32 | 2016-08-19 | 42 |
84 | Chibaba247 | 29 | 2016-08-19 | 39 |
85 | Hifijen | 24 | 2016-08-19 | 34 |
86 | endibe | 24 | 2016-08-19 | 34 |
87 | funnysaint | 37 | 2016-08-19 | 47 |
88 | Treazoure | 28 | 2016-08-19 | 38 |
89 | Damfostopper | 24 | 2016-08-19 | 34 |
90 | Hauwwyy21 | 24 | 2016-08-19 | 34 |
91 | nolaniyonu | 28 | 2016-08-19 | 38 |
92 | lordkizzy3 | 18 | 2016-08-19 | 28 |
93 | jibolarazor | 24 | 2016-08-19 | 34 |
94 | MizTyna | 26 | 2016-08-19 | 36 |
95 | passthem | 28 | 2016-08-19 | 38 |
96 | yusuf01 | 32 | 2016-08-19 | 42 |
97 | debbianah | 25 | 2016-08-19 | 35 |
98 rows × 4 columns
Nairaland was established in 2005, so from 2005 to 2016 is 11 years.
Now, to determine the age of each celebrant when NairaLand was established, we subtract 11 years from the celebrant's age and save the result in a new column, "Age_at_2005".
# age at 2005 when NairaLand was established
df["Age_at_2005"] = df["Age"] - 11
df
| | Username | Age | Date | Age_10_Plus | Age_at_2005 |
---|---|---|---|---|---|
0 | dejavuh0007 | 21 | 2016-08-19 | 31 | 10 |
1 | ololaderhoda | 24 | 2016-08-19 | 34 | 13 |
2 | mustymatic | 24 | 2016-08-19 | 34 | 13 |
3 | scarred9jan | 33 | 2016-08-19 | 43 | 22 |
4 | shinacollins | 38 | 2016-08-19 | 48 | 27 |
5 | globigpun | 36 | 2016-08-19 | 46 | 25 |
6 | sirbendit | 17 | 2016-08-19 | 27 | 6 |
7 | Kenkesh | 28 | 2016-08-19 | 38 | 17 |
8 | faisal00 | 30 | 2016-08-19 | 40 | 19 |
9 | omoga1908 | 32 | 2016-08-19 | 42 | 21 |
10 | samuelkingz | 21 | 2016-08-19 | 31 | 10 |
11 | amainus01 | 28 | 2016-08-19 | 38 | 17 |
12 | TheRector | 39 | 2016-08-19 | 49 | 28 |
13 | Gabriel6 | 22 | 2016-08-19 | 32 | 11 |
14 | Lorddj4real | 40 | 2016-08-19 | 50 | 29 |
15 | Kelchines | 19 | 2016-08-19 | 29 | 8 |
16 | easyreal | 30 | 2016-08-19 | 40 | 19 |
17 | Proxy4ever | 30 | 2016-08-19 | 40 | 19 |
18 | taiwoakinlabi | 24 | 2016-08-19 | 34 | 13 |
19 | jhorel | 22 | 2016-08-19 | 32 | 11 |
20 | jendoslim | 29 | 2016-08-19 | 39 | 18 |
21 | wemicoal | 24 | 2016-08-19 | 34 | 13 |
22 | ItzStone | 25 | 2016-08-19 | 35 | 14 |
23 | mojibbz | 21 | 2016-08-19 | 31 | 10 |
24 | Dinma1908 | 23 | 2016-08-19 | 33 | 12 |
25 | Ayodeji1908 | 32 | 2016-08-19 | 42 | 21 |
26 | Nazcoj | 29 | 2016-08-19 | 39 | 18 |
27 | jaittofidelix1 | 28 | 2016-08-19 | 38 | 17 |
28 | seunny4lif | 29 | 2016-08-19 | 39 | 18 |
29 | brian08 | 30 | 2016-08-19 | 40 | 19 |
... | ... | ... | ... | ... | ... |
68 | olaeffect | 40 | 2016-08-19 | 50 | 29 |
69 | chukz999 | 26 | 2016-08-19 | 36 | 15 |
70 | Essienblaze | 22 | 2016-08-19 | 32 | 11 |
71 | tkpumping | 23 | 2016-08-19 | 33 | 12 |
72 | yemcoguy | 31 | 2016-08-19 | 41 | 20 |
73 | saintrita | 20 | 2016-08-19 | 30 | 9 |
74 | moralex | 37 | 2016-08-19 | 47 | 26 |
75 | LilyHomes | 18 | 2016-08-19 | 28 | 7 |
76 | daprophet | 83 | 2016-08-19 | 93 | 72 |
77 | aitanofi | 36 | 2016-08-19 | 46 | 25 |
78 | Browndipson | 20 | 2016-08-19 | 30 | 9 |
79 | lilryder | 24 | 2016-08-19 | 34 | 13 |
80 | tolam4skywd | 21 | 2016-08-19 | 31 | 10 |
81 | KizzyyRae | 19 | 2016-08-19 | 29 | 8 |
82 | Globallords | 21 | 2016-08-19 | 31 | 10 |
83 | uyilee | 32 | 2016-08-19 | 42 | 21 |
84 | Chibaba247 | 29 | 2016-08-19 | 39 | 18 |
85 | Hifijen | 24 | 2016-08-19 | 34 | 13 |
86 | endibe | 24 | 2016-08-19 | 34 | 13 |
87 | funnysaint | 37 | 2016-08-19 | 47 | 26 |
88 | Treazoure | 28 | 2016-08-19 | 38 | 17 |
89 | Damfostopper | 24 | 2016-08-19 | 34 | 13 |
90 | Hauwwyy21 | 24 | 2016-08-19 | 34 | 13 |
91 | nolaniyonu | 28 | 2016-08-19 | 38 | 17 |
92 | lordkizzy3 | 18 | 2016-08-19 | 28 | 7 |
93 | jibolarazor | 24 | 2016-08-19 | 34 | 13 |
94 | MizTyna | 26 | 2016-08-19 | 36 | 15 |
95 | passthem | 28 | 2016-08-19 | 38 | 17 |
96 | yusuf01 | 32 | 2016-08-19 | 42 | 21 |
97 | debbianah | 25 | 2016-08-19 | 35 | 14 |
98 rows × 5 columns
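The 11 used above is tied to the year 2016. If the notebook is run in another year, the offset could be worked out from the date instead of being typed in, as in this sketch. The function name `years_since_founding` and the constant are made up for the example, and the calculation compares years only, ignoring months.

```python
from datetime import date

NAIRALAND_FOUNDED_YEAR = 2005  # the founding year stated above

def years_since_founding(on_date):
    """Return the whole-year difference between on_date and the founding year."""
    return on_date.year - NAIRALAND_FOUNDED_YEAR

# The date of the data in this notebook
offset = years_since_founding(date(2016, 8, 19))
print(offset)  # 11

# Apply the offset to a couple of ages from the table above
ages = {"gabng": 31, "sirbendit": 17}
age_at_2005 = {username: age - offset for username, age in ages.items()}
print(age_at_2005)  # {'gabng': 20, 'sirbendit': 6}
```

In the notebook itself, `date.today()` could be passed in place of the fixed date.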
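# The 10 youngest members celebrating
# .copy() gives an independent DataFrame, so adding a column to it later does not trigger pandas' SettingWithCopyWarning
youngest_10 = df.sort_values(by="Age", ascending=True)[:10].copy()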
# To display the plot within the Jupyter notebook
%matplotlib inline
youngest_10.plot(x="Username", y="Age", kind="bar", title="10 Youngest Members Celebrating")
youngest_10.plot(x="Username", y="Age", kind="barh", title="10 Youngest Members Celebrating")
[Figure: vertical and horizontal bar charts, "10 Youngest Members Celebrating"]
# Let's see the data for the 10 youngest members
youngest_10
| | Username | Age | Date | Age_10_Plus | Age_at_2005 |
---|---|---|---|---|---|
6 | sirbendit | 17 | 2016-08-19 | 27 | 6 |
41 | Quace | 17 | 2016-08-19 | 27 | 6 |
75 | LilyHomes | 18 | 2016-08-19 | 28 | 7 |
92 | lordkizzy3 | 18 | 2016-08-19 | 28 | 7 |
66 | Kingsleyjoel44 | 19 | 2016-08-19 | 29 | 8 |
15 | Kelchines | 19 | 2016-08-19 | 29 | 8 |
81 | KizzyyRae | 19 | 2016-08-19 | 29 | 8 |
55 | oluebubesyd | 20 | 2016-08-19 | 30 | 9 |
73 | saintrita | 20 | 2016-08-19 | 30 | 9 |
78 | Browndipson | 20 | 2016-08-19 | 30 | 9 |
# Let's find the sum of the ages
sum_youngest_10 = youngest_10["Age"].sum()
sum_youngest_10
187
# Let's find each of the 10 youngest members' share of the summed ages, as a percentage, and save it in a new column "Percentage"
youngest_10["Percentage"] = (youngest_10["Age"] * 100) / (sum_youngest_10)
# Now let's check the new DataFrame of the 10 youngest members
youngest_10
| | Username | Age | Date | Age_10_Plus | Age_at_2005 | Percentage |
---|---|---|---|---|---|---|
6 | sirbendit | 17 | 2016-08-19 | 27 | 6 | 9.090909 |
41 | Quace | 17 | 2016-08-19 | 27 | 6 | 9.090909 |
75 | LilyHomes | 18 | 2016-08-19 | 28 | 7 | 9.625668 |
92 | lordkizzy3 | 18 | 2016-08-19 | 28 | 7 | 9.625668 |
66 | Kingsleyjoel44 | 19 | 2016-08-19 | 29 | 8 | 10.160428 |
15 | Kelchines | 19 | 2016-08-19 | 29 | 8 | 10.160428 |
81 | KizzyyRae | 19 | 2016-08-19 | 29 | 8 | 10.160428 |
55 | oluebubesyd | 20 | 2016-08-19 | 30 | 9 | 10.695187 |
73 | saintrita | 20 | 2016-08-19 | 30 | 9 | 10.695187 |
78 | Browndipson | 20 | 2016-08-19 | 30 | 9 | 10.695187 |
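As a quick check of the Percentage column, the shares should add up to 100. The sketch below repeats the calculation on just the ten ages from the table above; `ages`, `total` and `percentage` are names made up for the example.

```python
import pandas as pd

# The ages of the 10 youngest members, taken from the table above
ages = pd.Series([17, 17, 18, 18, 19, 19, 19, 20, 20, 20])

total = ages.sum()
print(total)  # 187

# Each age as a percentage of the summed ages
percentage = ages * 100 / total
print(round(percentage.iloc[0], 6))  # 9.090909

# The shares should add up to 100 (allowing for floating-point rounding)
print(abs(percentage.sum() - 100) < 1e-9)  # True
```

Keep in mind that this is each member's share of the summed ages, not a share of the members, which is why the slices of the pie chart come out nearly equal.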
# To plot the pie chart of the Percentage column above
youngest_10["Percentage"].plot.pie(autopct='%.2f', fontsize=15, figsize=(6, 6), title="Pie Chart for 10 Youngest Members Celebrating")
[Figure: pie chart, "Pie Chart for 10 Youngest Members Celebrating"]
# Box plot of df's three numeric columns; if there are outliers you will see them
"""In statistics, an outlier is an observation point that is distant from other observations.
An outlier may be due to variability in the measurement or it may indicate experimental error;
the latter are sometimes excluded from the data set."""
df.plot.box()
[Figure: box plot of the numeric columns of df]
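A box plot usually marks as outliers the points that lie more than 1.5 times the interquartile range (IQR) beyond the quartiles. The sketch below applies that rule by hand to a small sample of ages. The sample and the variable names are made up for illustration, and the 1.5 factor is the common convention (and matplotlib's default whisker length), not something set in this notebook.

```python
import pandas as pd

# A small sample of ages, mostly typical with a few very large values
ages = pd.Series([17, 19, 21, 22, 23, 24, 24, 25, 26, 28, 29, 30, 32, 47, 83, 95])

# Quartiles and interquartile range
q1 = ages.quantile(0.25)
q3 = ages.quantile(0.75)
iqr = q3 - q1

# Fences: anything outside [lower, upper] is treated as an outlier
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

outliers = ages[(ages < lower) | (ages > upper)]
print(outliers.tolist())  # [47, 83, 95]
```

Using the quartiles from the describe() summary above (23 and 30.75), the upper fence is 30.75 + 1.5 × 7.75 = 42.375, so the ages 47, 83, 91 and 95 in our data would be flagged in the same way.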
# Area plot, just to compare the three columns
df.plot.area()
[Figure: area plot of the numeric columns of df]
To read the blog post about this notebook, visit: http://umar-yusuf.blogspot.com.ng/2016/08/Data-Srapping-Analysis-and-Visualization-with-Python.html