Cricket Match Winning Prediction

A Mini Project Report

Submitted by

Mr. Harsh Sadashiv Swami 1841064
Mr. Darshan Shivaji Waman 1841007
Mr. Deepak Arvind Khamkar 1841032

In partial fulfillment for the requirement of Laboratory Practice-II of
Bachelor of Computer Engineering

Under the guidance of
Prof. Mr. Digambar Padulkar (Assistant Professor)

Department of Computer Engineering
Vidya Pratishthan’s Kamalnayan Bajaj Institute of Engineering and Technology
Bhigawan Road, Vidyanagari, Baramati-413133
2018-2019

Certificate

This is to certify that the following students

Mr. Harsh Sadashiv Swami 1841064
Mr. Darshan Shivaji Waman 1841007
Mr. Deepak Arvind Khamkar 1841032

have successfully completed their project work on Cricket Match Winning Prediction during the academic year 2018-2019 in partial fulfillment towards the completion of Laboratory Practice-II in Computer Engineering.

Project Guide: Mr. Digambar Padulkar
HoD, Deptt. of Comp. Engg.: Prof. Mrs. S. S. Nandgaonkar
Principal: Dr. R. S. Bichkar
Internal Examiner          External Examiner

Acknowledgments

The success and final outcome of the project we have implemented required a lot of guidance and assistance from many people, and we are extremely privileged to have received it throughout the completion of our project. All that we have done is due to such supervision and guidance, and we would not forget to thank them.

We respect and thank Prof. Mr. Digambar Padulkar for giving us the opportunity to do the project work in Laboratory Practice-II and for giving us all the support and guidance that enabled us to complete the project on time. We are extremely thankful to him for providing such support and guidance despite his busy schedule.

We are thankful for, and fortunate to have received, constant encouragement, support, and guidance from all the teaching staff of the Computer Department, which helped us in successfully completing our project work.
We would also like to extend our sincere thanks to all the laboratory staff for their timely support.

Mr. Harsh Sadashiv Swami
Mr. Darshan Shivaji Waman
Mr. Deepak Arvind Khamkar

Abstract

Winning is the goal in any sport, and cricket is one of the most frequently watched sports today. Winning in cricket depends on various factors such as home-crowd advantage, past performances, the experience the players bring to matches, performance at the specific venue, performance against the specific team, the toss decision, and the current form of the team and its players. Over the past few years, many research papers have been published that measure the performance of players and predict match winners.
In this work we present a model that predicts the winning team. We maintain information such as the number of matches the two teams have played against each other, the toss winner, the venue and city where the match was played, and the umpires. The prediction depends mainly on the teams playing the match, who wins the toss, and what that team decides to do after winning the toss.
It also depends on the venue and the city where the match is played. The prediction methods have been implemented using Logistic Regression, K Nearest Neighbour, and Gaussian Naive Bayes classifiers.

Contents

Acknowledgments
Abstract
List of Figures
1 Introduction
1.1 Overview
1.2 Brief Description
1.3 Problem Definition
2 Literature Survey
3 Dataset Description
3.1 Introduction
3.1.1 Purpose
3.1.2 Project Scope
3.1.3 Design and Implementation Constraints
3.1.4 Assumptions and Dependencies
4 Data Preprocessing and Visualization
4.0.1 Steps in Data Preprocessing
4.0.2 Visualization
5 Classification
5.1 Logistic Regression
5.2 KNN Classifier
5.3 Gaussian Naive Bayes Classifier
6 Confusion Matrix
6.0.1 Analysis of Confusion Matrix
6.0.2 Compare Classifier
7 Result Analysis
7.0.1 Result for Logistic Regression
7.0.2 Result for KNN Classifier
7.0.3 Result for GNB Classifier
8 Conclusion and Future Work
8.0.1 Conclusion
8.0.2 Future Work
Bibliography

List of Figures

1 Histogram
2 Pie Chart
3 Bar Graph
4 Bar Graph
4.1 Bar Graph of KKR v/s RR
6.1 Confusion Matrix of Logistic Regression
6.2 Confusion Matrix of KNN Classifier
6.3 Confusion Matrix of GNB Classifier

Figure 1: Histogram. Shows the count of wins for each team.
Figure 2: Pie Chart. Shows the share of matches in which the toss winner won the match versus those in which the toss winner lost the match.
Figure 3: Bar Graph. Shows the performance of two teams, CSK and RCB.
Figure 4: Bar Graph. Shows the performance of two teams, KKR and RR.

1 Introduction

1.1 Overview

Cricket is played globally across the 106 member states of the International Cricket Council (ICC), with an estimated 1.5 billion fans worldwide (ICC, 2012-2013). However, much of the global finance and interest is focused on the 10 full ICC member nations, and more specifically on the "big three" of England, Australia, and India, whose leagues are very well known. The Indian Premier League (IPL) in particular has gained a lot of popularity over the years.

1.2 Brief Description

In this project we develop a model to predict the outcomes of Indian Premier League matches over the years 2008-2016.
We used a multi-step approach to analyze the data, which comprises over 500 records. The attributes used in the project are: the id number; the season in which the match was played (2008 to 2016); the city where the match was played; the date of the match; the names of the two participating teams; the toss winner; the toss decision; and the result of the match, which can be normal, tie, or no result. A no result occurs when the game is interrupted, most commonly by rain. Another attribute records whether the D/L (Duckworth-Lewis) method was applied, which comes into play when rain has affected the match. The remaining attributes are the winner of the game; the margin of victory in runs if the winning team batted first, or in wickets if the winning team batted second; the player of the match (man of the match); the venue, i.e. the name of the stadium where the match was played; and the names of the two standing umpires on the field.

Before prediction, some of the features are eliminated as part of data cleaning. For the prediction we have used three classifiers: Logistic Regression, K Nearest Neighbour, and Gaussian Naive Bayes. The prediction is made mainly on the participating teams, the toss winner, the toss decision, the city, and the venue.

1.3 Problem Definition

Indian cricket fans have seen the growing popularity of the Indian Premier League (IPL). There is always some discussion about the IPL in world cricket, and predictions are always being made about who will be the winner. These predictions come from the general public, the media, celebrities, and others, and there is always a debate about whose prediction is more likely to be correct. We have therefore decided to make the same type of prediction using statistical records and classifiers.

2 Literature Survey

Cricket has long been the most popular sport in India.
To combine cricket with entertainment, the BCCI started the IPL (Indian Premier League). Today the popularity of the IPL is at its peak. Many business tycoons and Bollywood actors want to invest in an IPL team, and every team has a large number of sponsors. Owning an IPL team has become a dream for many millionaires. There is a great deal of craze and buzz around the IPL in India, and people of all ages, from children to their parents and grandparents, take part in the event in one way or another. Every team has a large fan base cheering and supporting it. The IPL has become a big event, so everyone likes to guess the results of IPL matches, and news channels often organize debates on match predictions. The primary motivation behind this project is therefore the increasing popularity of the IPL. The project will be of interest to the biggest IPL fans and to those who like to guess the results of matches.

3 Dataset Description

3.1 Introduction

The dataset contains more than 500 records and more than 15 attributes. The attributes are as follows:

1) id: Id number of the match.
2) season: Season in which the match was played.
3) city: City in which the match was played.
4) date: Date on which the match was played.
5) team1: First team participating in the match.
6) team2: Second team participating in the match.
7) toss_winner: Winner of the toss.
8) toss_decision: Decision made by the toss winner.
9) result: Result of the match: normal, tie, or no result (interrupted for some reason).
10) dl_applied: Whether the Duckworth-Lewis (D/L) method was applied.
11) winner: Winner of the match.
12) win_by_runs: Number of runs by which the match was won.
13) win_by_wickets: Number of wickets by which the match was won.
14) player_of_match: Player of the match.
15) venue: Venue, i.e. the name of the stadium where the match was played.
16) umpire1: First standing umpire in the match.
17) umpire2: Second standing umpire in the match.

3.1.1 Purpose

The purpose is to predict the winner of the match using some statistical
records and some classifiers.

3.1.2 Project Scope

With this project, the cricketing world can begin to rely on predictions based on statistical records rather than on purely theoretical reasoning, which makes it easier to predict the winner.

3.1.3 Design and Implementation Constraints

The prediction depends on only a few attributes; without these attributes it is difficult to predict the winner.

3.1.4 Assumptions and Dependencies

Assumptions:
1) We have assumed that a player's form is temporary, so the model does not depend on any individual player. We follow the saying in the cricketing world that "form is temporary, but class is permanent."
2) We have assumed that the third umpire does not play a vital role in the match, even though in real matches the third umpire's role is often crucial: his decision can completely change the course of a match.

Dependencies: The project depends entirely on a few attributes: the two teams playing the match, the city, the venue, the toss winner, and the toss decision made by the toss-winning team.

4 Data Preprocessing and Visualization

4.0.1 Steps in Data Preprocessing

1. Import the libraries.
2. Import the dataset.
3. Check for missing values.
4. Identify the categorical values.
5. Split the dataset into training and test sets.

4.0.2 Visualization

Figure 4.1: Bar Graph of KKR v/s RR

5 Classification

5.1 Logistic Regression

Logistic regression is a statistical method for analyzing a dataset in which one or more independent variables determine an outcome. The outcome is measured with a dichotomous variable, that is, a dependent variable with only two possible values, coded as 1 or 0. The binary logistic model is used to estimate the probability of a binary response based on one or more predictor variables.
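As an illustration of how a binary logistic model turns predictor values into a probability, the following minimal sketch applies the logistic (sigmoid) function to a weighted sum of predictors. It is not the code used in this project: the weights, the bias, and the two 0/1 features (`won_toss`, `home_venue`) are hypothetical values chosen only for the example.

```python
import math


def sigmoid(z: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    # Two branches keep math.exp from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def logit(p: float) -> float:
    """Log-odds of a probability p in (0, 1); the inverse of sigmoid."""
    return math.log(p / (1.0 - p))


def predict_proba(weights, bias, features) -> float:
    """Probability that the binary response is 1 for one observation."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)


# Hypothetical coefficients for two 0/1 predictors: [won_toss, home_venue].
weights = [0.4, 0.8]
bias = -0.5

p = predict_proba(weights, bias, [1, 1])  # linear score = 0.7
label = 1 if p >= 0.5 else 0              # threshold at 0.5
```

In a fitted model the weights and bias would be estimated from the training set rather than chosen by hand; the 0.5 threshold is a common default, not a requirement.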
The goal of logistic regression is to find the best-fitting model to describe the relationship between the dichotomous characteristic of interest and a set of independent (predictor or explanatory) variables. In the logistic regression equation, p is the probability of presence of the characteristic of interest. The logistic transformation is defined as the logged odds:

$$\mathrm{Odds} = \frac{p}{1-p}, \qquad \mathrm{Logit}(p) = \ln\left(\frac{p}{1-p}\right)$$

5.2 KNN Classifier

In the classification setting, the K-nearest-neighbour algorithm essentially boils down to forming a majority vote among the K instances most similar to a given unseen observation. Similarity is defined according to a distance metric between two data points. A popular choice is the Euclidean distance, given by

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$

5.3 Gaussian Naive Bayes Classifier

The Naive Bayes algorithm is a classification technique based on Bayes' theorem with an assumption of independence among predictors. In simple terms, a Naive Bayes classifier assumes that, within a class, the presence of one feature is unrelated to the presence of any other feature. A Naive Bayes model is easy to build and particularly useful for very large data sets. Despite its simplicity, Naive Bayes can be competitive with more sophisticated classification methods on some problems. Formula:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

In decision analysis, a decision tree can be used to visually and explicitly represent decisions and decision making. In data mining, a decision tree describes data (but the resulting classification tree can be an input for decision making).

6 Confusion Matrix

Figure 6.1: Confusion Matrix of Logistic Regression

6.0.1 Analysis of Confusion Matrix

Logistic Regression
Accuracy: 27.58%
Precision: 0.25
Recall: 0.28

KNN Classifier
Accuracy: 40%
Precision: 0.38
Recall: 0.40

GNB Classifier
Accuracy: 17.24%
Precision: 0.36
Recall: 0.17

Figure 6.2: Confusion Matrix of KNN Classifier
Figure 6.3: Confusion Matrix of GNB Classifier

6.0.2 Compare Classifier

The accuracy of the three classifiers differs considerably. The KNN classifier has the highest accuracy of the three and is therefore the most useful.

7 Result Analysis

7.0.1 Result for Logistic Regression

The accuracy of Logistic Regression is greater than that of the GNB classifier and less than that of the KNN classifier. Accuracy is 27.58%.

7.0.2 Result for KNN Classifier

The accuracy of the KNN classifier is the highest, at 40%.

7.0.3 Result for GNB Classifier

The accuracy of the GNB classifier is the lowest of the three, below both KNN and Logistic Regression. Accuracy is 17.24%.

8 Conclusion and Future Work

8.0.1 Conclusion

Our Cricket Match Winning Prediction project should be useful in the future. Predictions are made from statistical records using the proposed models, specifically the KNN classifier, giving people a data-driven prediction to consult. Our results show that KNN gives the most accurate predictions of the three classifiers tested, at 40% accuracy.

8.0.2 Future Work

The main focus will be on increasing the accuracy of the model. We also plan to consider some of the main factors that we have not yet considered in this project. For example, we have not considered the role umpires play in the match, although in reality their role is very crucial. We can also include the role played by the third umpire.

Bibliography

1. Brooks, R. D., Faff, R. W., Sokulsky, D. (2002). An ordered response model of test cricket performance. Applied Economics, 34(18), 2353-2365.
2. ICC. (2012-2013). ICC Annual Report.
3. Bandulasiri, A. (2008). Predicting the winner in one day international cricket. Journal of Mathematical Sciences & Mathematics Education, 3(1), 6-17.