Using R package yorkr - A quick overview

In this notebook I use my R package yorkr to analyze

  1. The ODI match between Australia and India on 12 Feb 2012
  2. Virat Kohli's batting in the IPL

The data for both analyses are available on Github. To know more about my R package 'yorkr', do take a look at my blog Giga thoughts. yorkr uses data from Cricsheet and can handle ODI, T20 and IPL matches. The data in Cricsheet are in YAML format. I have already converted the data of the individual matches to .RData files and uploaded them to Github at yorkrData, from where you can download them or load them directly. The details on how to use this data are also available in my blog posts.

In [1]:
install.packages("yorkr")
library(yorkr)
library(rpart)
library(dplyr)
Installing package into '/gpfs/global_fs01/sym_shared/YPProdSpark/user/sc50-a979762c5b05ec-d2b8f26cd8fd/R/libs'
(as 'lib' is unspecified)

Attaching package: 'dplyr'

The following objects are masked from 'package:SparkR':

    arrange, between, collect, contains, count, cume_dist, dense_rank,
    desc, distinct, explain, filter, first, group_by, intersect, lag,
    last, lead, mutate, n, n_distinct, ntile, percent_rank, rename,
    row_number, sample_frac, select, sql, summarize

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

Disclaimer: This article represents the author’s viewpoint only and doesn’t necessarily represent IBM’s positions, strategies or opinions

Load the match data for the Australia-India ODI match in 2012 at Sydney from Github

The data for all matches are available on Github at yorkrData. Note: Here I use the data of one particular match. You can use this notebook with data from other matches. yorkrData has data for ODI, T20 and IPL matches.

In [2]:
load(url("https://github.com/tvganesh/yorkrData/raw/master/ODI/ODI-matches/Australia-India-2012-02-12.RData"))
aus_ind <- overs
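Note that load() restores objects under the names they were saved with (here `overs`), so it can overwrite a variable of the same name in the workspace. A minimal sketch of loading into a separate environment instead is shown below. It uses a small temporary stand-in file rather than the real match data, and the names `match_env`, `aus_ind_demo` and the `placeholder` column are hypothetical.

```r
# Create a tiny stand-in .RData file containing an object named `overs`
# (hypothetical contents; the real file comes from yorkrData)
tmp <- tempfile(fileext = ".RData")
overs <- data.frame(placeholder = 1:3)
save(overs, file = tmp)
rm(overs)

# Load into a separate environment so nothing in the workspace is overwritten
match_env <- new.env()
loaded <- load(tmp, envir = match_env)  # load() returns the restored object names
print(loaded)

# Fetch the object under a name of our choosing
aus_ind_demo <- get(loaded[1], envir = match_env)
```

The same pattern should work with the real data by passing url(...) as the first argument of load() in place of `tmp`.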
In [3]:
# Display the batting scorecard of Australia
teamBattingScorecardMatch(aus_ind,'Australia')
Total= 260 
Out[3]:
  batsman      ballsPlayed fours sixes runs
1 DA Warner             23     2     0   18
2 RT Ponting            13     1     0    6
3 MJ Clarke             43     5     0   38
4 PJ Forrest            83     5     2   66
5 DJ Hussey             76     5     0   72
6 DT Christian          36     2     0   39
7 MS Wade               17     1     0   16
8 RJ Harris              2     0     0    2
9 CJ McKay               3     0     0    3
In [4]:
# Display the batting scorecard of India
teamBattingScorecardMatch(aus_ind,'India')
Total= 258 
Out[4]:
  batsman   ballsPlayed fours sixes runs
1 G Gambhir         110     7     0   92
2 V Sehwag           20     3     0   20
3 V Kohli            28     1     0   18
4 RG Sharma          41     1     1   33
5 SK Raina           30     3     1   38
6 MS Dhoni           57     0     1   44
7 RA Jadeja           8     0     0   12
8 R Ashwin            2     0     0    1
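The scorecard columns are enough to derive a strike rate (runs per 100 balls) for each batsman. The sketch below is not a yorkr function; it retypes the India scorecard shown above into a data frame by hand (the column split is my reading of the output) and uses base R only. The names `india` and `strikeRate` are introduced here for illustration.

```r
# India batting scorecard, retyped from the output above
india <- data.frame(
  batsman     = c("G Gambhir", "V Sehwag", "V Kohli", "RG Sharma",
                  "SK Raina", "MS Dhoni", "RA Jadeja", "R Ashwin"),
  ballsPlayed = c(110, 20, 28, 41, 30, 57, 8, 2),
  fours       = c(7, 3, 1, 1, 3, 0, 0, 0),
  sixes       = c(0, 0, 0, 1, 1, 1, 0, 0),
  runs        = c(92, 20, 18, 33, 38, 44, 12, 1),
  stringsAsFactors = FALSE
)

# Strike rate = runs scored per 100 balls faced
india$strikeRate <- round(100 * india$runs / india$ballsPlayed, 1)

# Show batsmen ordered from fastest to slowest scorer
print(india[order(-india$strikeRate),
            c("batsman", "runs", "ballsPlayed", "strikeRate")])
```

In the notebook itself the same calculation could be applied directly to the data frame returned by teamBattingScorecardMatch(), assuming it carries the columns displayed above.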
In [5]:
# Plot the batting partnerships of India
teamBatsmenPartnershipMatch(aus_ind,"India","Australia")
In [6]:
# Plot the performance of Australian batsmen against Indian bowlers
teamBatsmenVsBowlersMatch(aus_ind,'Australia',"India", plot=TRUE)
In [7]:
# Display the bowling scorecard of India
teamBowlingScorecardMatch(aus_ind,'India')
Out[7]:
  bowler        overs maidens runs wickets
1 Z Khan           10       0   46       1
2 R Vinay Kumar    10       1   58       5
3 RA Jadeja        10       0   50       0
4 UT Yadav         10       1   49       2
5 R Ashwin          8       0   47       0
6 RG Sharma         2       0   15       0
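From the bowling scorecard one can similarly derive the economy rate (runs conceded per over). This is a small base R sketch, not part of yorkr; the data frame `india_bowling` is retyped by hand from the output above (the column split is my reading of it), and the `economy` column is added for illustration.

```r
# India bowling scorecard, retyped from the output above
india_bowling <- data.frame(
  bowler  = c("Z Khan", "R Vinay Kumar", "RA Jadeja",
              "UT Yadav", "R Ashwin", "RG Sharma"),
  overs   = c(10, 10, 10, 10, 8, 2),
  maidens = c(0, 1, 0, 1, 0, 0),
  runs    = c(46, 58, 50, 49, 47, 15),
  wickets = c(1, 5, 0, 2, 0, 0),
  stringsAsFactors = FALSE
)

# Economy rate = runs conceded per over bowled
india_bowling$economy <- round(india_bowling$runs / india_bowling$overs, 2)

# Most economical bowlers first
print(india_bowling[order(india_bowling$economy), ])
```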
In [8]:
# Plot the wicket kinds vs runs conceded by Australian bowlers against India
teamBowlingWicketKindMatch(aus_ind,"Australia","India")
In [9]:
# Plot the runs conceded by Indian bowlers against Australian batsmen
teamBowlersVsBatsmenMatch(aus_ind,"India","Australia")
In [10]:
# Plot the worm chart of the match
matchWormGraph(aus_ind,'Australia',"India")

Using yorkr with IPL data. Analysis of batting performance of Virat Kohli

In this part of the notebook I analyze the performance of Virat Kohli of Royal Challengers Bangalore (RCB). For this I use the data of all RCB matches. As I mentioned above, the data of all the IPL teams are available at yorkrData. You can use this data for other players of RCB. yorkrData also has data for other IPL teams like CSK, MI, DD etc. You can import this notebook, change the URL and make a few other minor changes to analyze the performance of a player of that team, e.g. M S Dhoni of Chennai Super Kings or Rohit Sharma of Mumbai Indians.

Note: This post uses data for IPL matches. However, yorkrData also has data for ODI and T20 matches, so you can do a similar analysis of other batsmen or bowlers in ODI or T20 matches. For details on how to use the R package 'yorkr', please see my blog Giga thoughts and check out the posts on yorkr for ODI, T20 and IPL matches. All posts can be seen at Index of Posts.

In [11]:
# Load the RCB Batting details data from yorkrData
load(url("https://github.com/tvganesh/yorkrData/raw/master/battingBowlingDetails/Royal%20Challengers%20Bangalore-BattingDetails.RData"))
save(battingDetails,file="Royal Challengers Bangalore-BattingDetails.RData")
# Get the current directory
d <- getwd()
# Call getBatsmanDetails() and set the path where the .RData file is available
kohli <- getBatsmanDetails(team="Royal Challengers Bangalore",name="V Kohli",dir=d)
dim(kohli)
[1] "/gpfs/global_fs01/sym_shared/YPProdSpark/user/sc50-a979762c5b05ec-d2b8f26cd8fd/notebook/work/Royal Challengers Bangalore-BattingDetails.RData"
Out[11]:
  1. 120
  2. 15
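Judging by the path printed above, getBatsmanDetails() appears to look in `dir` for a file named "<team>-BattingDetails.RData", which is why the data is first saved under that name. This is inferred from the output, not from the package documentation. A small sketch that builds the same path and checks that the file is present before calling the function (the helper `batting_details_path` is hypothetical, not part of yorkr):

```r
# Hypothetical helper: build the file path that the printed output suggests
# getBatsmanDetails() reads from
batting_details_path <- function(team, dir = getwd()) {
  file.path(dir, paste0(team, "-BattingDetails.RData"))
}

f <- batting_details_path("Royal Challengers Bangalore")
print(f)

# TRUE only if the save() step has already been run in this directory
print(file.exists(f))
```

When analyzing another team, checking file.exists() this way gives an early hint that the team name and the saved file name do not match.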
In [12]:
# Plot the runs scored vs deliveries faced by Virat Kohli in IPL matches
batsmanRunsVsDeliveries(kohli,"V Kohli")
In [13]:
# Plot the fours and sixes hit by Virat Kohli in IPL matches
kohli46 <- select(kohli,batsman,ballsPlayed,fours,sixes,runs)
batsmanFoursSixes(kohli46,"V Kohli")
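As a rough numeric complement to the plot, the columns selected above (fours, sixes, runs) allow computing the share of runs that came from boundaries. The function `boundary_share` below is a hypothetical helper, not part of yorkr. Since the Kohli data frame is not reproduced here, the demonstration uses three rows retyped from the India scorecard of the ODI match earlier in this notebook.

```r
# Hypothetical helper: percentage of runs scored in fours and sixes.
# Expects a data frame with columns `fours`, `sixes` and `runs`.
boundary_share <- function(df) {
  boundary_runs <- 4 * df$fours + 6 * df$sixes
  # Avoid division by zero for innings with no runs
  ifelse(df$runs > 0, round(100 * boundary_runs / df$runs, 1), NA_real_)
}

# Demonstration rows retyped from the India scorecard shown earlier
demo <- data.frame(
  batsman = c("V Sehwag", "V Kohli", "MS Dhoni"),
  fours   = c(3, 1, 0),
  sixes   = c(0, 0, 1),
  runs    = c(20, 18, 44),
  stringsAsFactors = FALSE
)
demo$boundaryPct <- boundary_share(demo)
print(demo)
```

In the notebook, boundary_share(kohli46) would give one value per innings, assuming kohli46 has the columns named in the select() call.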
In [14]:
# Plot the different dismissal types of Virat Kohli in IPL matches
batsmanDismissals(kohli,"V Kohli")