Fight Odds Analysis
Description
This script analyzes UFC fight odds data.
Libraries
library(tidyverse)
library(knitr)
Examine Data
Load data.
load("./Datasets/df_master.RData")
Get summary.
summary(df_master)
## NAME Date Event City
## Length:5986 Length:5986 Length:5986 Length:5986
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## State Country FightWeightClass Round
## Length:5986 Length:5986 Length:5986 Min. :1.00
## Class :character Class :character Class :character 1st Qu.:1.00
## Mode :character Mode :character Mode :character Median :3.00
## Mean :2.43
## 3rd Qu.:3.00
## Max. :5.00
##
## Method Winner_Odds Loser_Odds Sex
## Length:5986 Length:5986 Length:5986 Length:5986
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## fight_id Result FighterWeight FighterWeightClass
## Min. : 1 Length:5986 Min. :115.0 Length:5986
## 1st Qu.: 749 Class :character 1st Qu.:135.0 Class :character
## Median :1497 Mode :character Median :155.0 Mode :character
## Mean :1497 Mean :163.8
## 3rd Qu.:2245 3rd Qu.:185.0
## Max. :2993 Max. :265.0
##
## REACH SLPM SAPM STRA
## Min. :58.00 Min. : 0.000 Min. : 0.100 Min. :0.0000
## 1st Qu.:69.00 1st Qu.: 2.680 1st Qu.: 2.630 1st Qu.:0.3900
## Median :72.00 Median : 3.440 Median : 3.230 Median :0.4400
## Mean :71.77 Mean : 3.531 Mean : 3.435 Mean :0.4417
## 3rd Qu.:75.00 3rd Qu.: 4.250 3rd Qu.: 4.030 3rd Qu.:0.4900
## Max. :84.00 Max. :11.140 Max. :23.330 Max. :0.8800
## NA's :215
## STRD TD TDA TDD
## Min. :0.0900 Min. : 0.000 Min. :0.0000 Min. :0.0000
## 1st Qu.:0.5100 1st Qu.: 0.560 1st Qu.:0.2700 1st Qu.:0.5100
## Median :0.5600 Median : 1.210 Median :0.3700 Median :0.6400
## Mean :0.5527 Mean : 1.518 Mean :0.3745 Mean :0.6157
## 3rd Qu.:0.6000 3rd Qu.: 2.160 3rd Qu.:0.5000 3rd Qu.:0.7600
## Max. :0.9200 Max. :14.190 Max. :1.0000 Max. :1.0000
##
## SUBA
## Min. : 0.0000
## 1st Qu.: 0.1000
## Median : 0.4000
## Mean : 0.5516
## 3rd Qu.: 0.8000
## Max. :12.1000
##
Redefine variables.
df_master$NAME = as.factor(df_master$NAME)
df_master$Date = as.Date(df_master$Date)
df_master$Event = as.factor(df_master$Event)
df_master$City= as.factor(df_master$City)
df_master$State = as.factor(df_master$State)
df_master$Country = as.factor(df_master$Country)
df_master$FightWeightClass = as.factor(df_master$FightWeightClass)
df_master$Method = as.factor(df_master$Method)
df_master$Winner_Odds = as.numeric(df_master$Winner_Odds)
df_master$Loser_Odds = as.numeric(df_master$Loser_Odds)
df_master$fight_id = as.factor(df_master$fight_id)
df_master$Sex = as.factor(df_master$Sex)
df_master$Result = as.factor(df_master$Result)
df_master$FighterWeightClass = as.factor(df_master$FighterWeightClass)
Summarize again… There are infinite odds and overturned / DQ fight outcomes. These will have to be removed.
summary(df_master)
## NAME Date
## Donald Cerrone : 24 Min. :2013-04-27
## Ovince Saint Preux: 21 1st Qu.:2015-08-23
## Jim Miller : 19 Median :2017-05-28
## Neil Magny : 19 Mean :2017-06-19
## Derrick Lewis : 18 3rd Qu.:2019-04-20
## Tim Means : 18 Max. :2021-02-06
## (Other) :5867
## Event City
## UFC Fight Night: Chiesa vs. Magny : 28 Las Vegas :1246
## UFC Fight Night: Poirier vs. Gaethje: 28 Abu Dhabi : 258
## UFC Fight Night: Whittaker vs. Till : 28 Boston : 124
## UFC 190: Rousey vs Correia : 26 Rio de Janeiro: 124
## UFC 193: Rousey vs Holm : 26 Chicago : 118
## UFC 210: Cormier vs. Johnson 2 : 26 Newark : 114
## (Other) :5824 (Other) :4002
## State Country FightWeightClass
## Nevada :1246 USA :3464 Welterweight : 986
## Abu Dhabi : 258 Brazil : 532 Lightweight : 984
## Texas : 256 Canada : 378 Bantamweight : 852
## New York : 252 United Arab Emirates: 258 Featherweight: 724
## California: 250 Australia : 236 Middleweight : 654
## Florida : 176 United Kingdom : 184 Flyweight : 498
## (Other) :3548 (Other) : 934 (Other) :1288
## Round Method Winner_Odds Loser_Odds Sex
## Min. :1.00 DQ : 14 Min. :1.06 Min. :1.07 Female: 766
## 1st Qu.:1.00 KO/TKO :1910 1st Qu.:1.42 1st Qu.:1.77 Male :5220
## Median :3.00 M-DEC : 34 Median :1.71 Median :2.38
## Mean :2.43 Overturned: 20 Mean : Inf Mean : Inf
## 3rd Qu.:3.00 S-DEC : 628 3rd Qu.:2.33 3rd Qu.:3.36
## Max. :5.00 SUB :1060 Max. : Inf Max. : Inf
## U-DEC :2320
## fight_id Result FighterWeight FighterWeightClass
## 1 : 2 Loser :2993 Min. :115.0 Welterweight :1007
## 2 : 2 Winner:2993 1st Qu.:135.0 Lightweight : 980
## 3 : 2 Median :155.0 Bantamweight : 799
## 4 : 2 Mean :163.8 Featherweight: 731
## 5 : 2 3rd Qu.:185.0 Middleweight : 659
## 6 : 2 Max. :265.0 Flyweight : 561
## (Other):5974 (Other) :1249
## REACH SLPM SAPM STRA
## Min. :58.00 Min. : 0.000 Min. : 0.100 Min. :0.0000
## 1st Qu.:69.00 1st Qu.: 2.680 1st Qu.: 2.630 1st Qu.:0.3900
## Median :72.00 Median : 3.440 Median : 3.230 Median :0.4400
## Mean :71.77 Mean : 3.531 Mean : 3.435 Mean :0.4417
## 3rd Qu.:75.00 3rd Qu.: 4.250 3rd Qu.: 4.030 3rd Qu.:0.4900
## Max. :84.00 Max. :11.140 Max. :23.330 Max. :0.8800
## NA's :215
## STRD TD TDA TDD
## Min. :0.0900 Min. : 0.000 Min. :0.0000 Min. :0.0000
## 1st Qu.:0.5100 1st Qu.: 0.560 1st Qu.:0.2700 1st Qu.:0.5100
## Median :0.5600 Median : 1.210 Median :0.3700 Median :0.6400
## Mean :0.5527 Mean : 1.518 Mean :0.3745 Mean :0.6157
## 3rd Qu.:0.6000 3rd Qu.: 2.160 3rd Qu.:0.5000 3rd Qu.:0.7600
## Max. :0.9200 Max. :14.190 Max. :1.0000 Max. :1.0000
##
## SUBA
## Min. : 0.0000
## 1st Qu.: 0.1000
## Median : 0.4000
## Mean : 0.5516
## 3rd Qu.: 0.8000
## Max. :12.1000
##
How many events does the dataset include?
length(unique(df_master$Event))
## [1] 261
How many fights?
length(unique(df_master$fight_id))
## [1] 2993
Over what time frame?
range(sort(unique(df_master$Date)))
## [1] "2013-04-27" "2021-02-06"
Analyse Odds
Make copy for analysis.
df_odds = df_master
rm(df_master)
Filter out controversial results and infinite odds.
df_odds %>%
dplyr::filter(
(Method != "DQ") & (Method != "Overturned")
, is.finite(Winner_Odds)
, is.finite(Loser_Odds)
) -> df_odds
Get rid of fighter-specifics so that we can spread the data frame. This will give us one event per row.
df_odds %>%
dplyr::select(-c(FighterWeight:SUBA)) %>%
spread(Result, NAME) -> df_odds_short
How often were the (best) odds equal?
mean(df_odds$Winner_Odds == df_odds$Loser_Odds)
## [1] 0.005410889
sum(df_odds$Winner_Odds == df_odds$Loser_Odds)
## [1] 32
Filter out equal odds and identify if Favorite won the fight.
df_odds_short %>%
dplyr::filter(Winner_Odds != Loser_Odds) %>% # filter out equal odds
dplyr::mutate(
Favorite_was_Winner = ifelse(Winner_Odds < Loser_Odds, T, F)
, Favorite_Unit_Profit = ifelse(Favorite_was_Winner, Winner_Odds - 1, -1)
, Underdog_Unit_Profit = ifelse(!Favorite_was_Winner, Winner_Odds - 1, -1)
) -> df_odds_short
What was the mean unit profit (i.e. ROI) if one bet solely on the Favorite?
mean(df_odds_short$Favorite_Unit_Profit)
## [1] -0.02309419
What was the mean unit profit if one bet solely on the Underdog?
mean(df_odds_short$Underdog_Unit_Profit)
## [1] -0.002040122
What proportion of the time does the Favorite win?
mean(df_odds_short$Favorite_was_Winner)
## [1] 0.6460388
Calculate implied probability of each fight based on odds.
df_odds_short %>% dplyr::mutate(
Favorite_Probability = ifelse(Favorite_was_Winner, 1/Winner_Odds, 1/Loser_Odds)
, Underdog_Probability = ifelse(!Favorite_was_Winner, 1/Winner_Odds, 1/Loser_Odds)
) -> df_odds_short
Calculate overround for each fight.
NOTE: these odds are the best available odds for each fight / fighter. Therefore, this is not overround in the traditional sense (looking at one particular odds maker).
df_odds_short %>%
dplyr::mutate(
Total_Probability = Favorite_Probability + Underdog_Probability
, Overround = Total_Probability - 1
) -> df_odds_short
There is very little overround. This is because we are picking the best odds for each fight / fighter. By picking the best odds, we are counteracting the built-in overround of any particular odds-maker (typically around 5% as a rough estimate).
mean(df_odds_short$Overround)
## [1] 0.004461755
mean(df_odds_short$Total_Probability)
## [1] 1.004462
Odds Performance
Add year as variable.
df_odds_short %>%
dplyr::mutate(
Year = format(Date,"%Y")
) -> df_odds_short
Compute Adjusted Implied Probability to account for the overround and get an unbiased estimate of the probability of victory implied by the odds.
df_odds_short %>%
dplyr::mutate(
Adjusted_Favorite_Probability = Favorite_Probability - Overround/2
, Adjusted_Underdog_Probability = Underdog_Probability - Overround/2
, Adjusted_Total_Probability = Adjusted_Favorite_Probability + Adjusted_Underdog_Probability
) -> df_odds_short
Looking at summary, we see that Adjusted Total Probability is always equal to 100%. Moreover, the Favorite Probability never dips below 50%, whereas the Underdog Probability never exceeds it.
summary(df_odds_short)
## Date Event
## Min. :2013-04-27 UFC Fight Night: Chiesa vs. Magny : 14
## 1st Qu.:2015-08-23 UFC Fight Night: Poirier vs. Gaethje: 14
## Median :2017-05-13 UFC Fight Night: Whittaker vs. Till : 14
## Mean :2017-06-17 UFC 190: Rousey vs Correia : 13
## 3rd Qu.:2019-04-20 UFC 193: Rousey vs Holm : 13
## Max. :2021-02-06 UFC 210: Cormier vs. Johnson 2 : 13
## (Other) :2860
## City State Country
## Las Vegas : 607 Nevada : 607 USA :1699
## Abu Dhabi : 127 Abu Dhabi : 127 Brazil : 258
## Rio de Janeiro: 60 Texas : 127 Canada : 187
## Boston : 59 California: 123 United Arab Emirates: 127
## Chicago : 57 New York : 123 Australia : 117
## Newark : 57 Florida : 88 United Kingdom : 92
## (Other) :1974 (Other) :1746 (Other) : 461
## FightWeightClass Round Method Winner_Odds
## Welterweight :486 Min. :1.000 DQ : 0 Min. : 1.060
## Lightweight :484 1st Qu.:2.000 KO/TKO : 942 1st Qu.: 1.420
## Bantamweight :420 Median :3.000 M-DEC : 17 Median : 1.710
## Featherweight:355 Mean :2.435 Overturned: 0 Mean : 1.975
## Middleweight :316 3rd Qu.:3.000 S-DEC : 312 3rd Qu.: 2.300
## Flyweight :246 Max. :5.000 SUB : 521 Max. :12.990
## (Other) :634 U-DEC :1149
## Loser_Odds Sex fight_id Loser
## Min. : 1.070 Female: 378 1 : 1 Jim Miller : 10
## 1st Qu.: 1.760 Male :2563 2 : 1 Ross Pearson : 10
## Median : 2.380 3 : 1 Angela Hill : 9
## Mean : 2.813 4 : 1 Donald Cerrone : 9
## 3rd Qu.: 3.350 5 : 1 Gian Villante : 9
## Max. :14.050 6 : 1 Jeremy Stephens: 9
## (Other):2935 (Other) :2885
## Winner Favorite_was_Winner Favorite_Unit_Profit
## Donald Cerrone : 15 Mode :logical Min. :-1.00000
## Derrick Lewis : 14 FALSE:1041 1st Qu.:-1.00000
## Francisco Trinaldo: 13 TRUE :1900 Median : 0.31000
## Neil Magny : 13 Mean :-0.02309
## Dustin Poirier : 12 3rd Qu.: 0.57000
## Max Holloway : 12 Max. : 1.10000
## (Other) :2862
## Underdog_Unit_Profit Favorite_Probability Underdog_Probability
## Min. :-1.00000 Min. :0.4000 Min. :0.07117
## 1st Qu.:-1.00000 1st Qu.:0.5780 1st Qu.:0.27397
## Median :-1.00000 Median :0.6410 Median :0.35971
## Mean :-0.00204 Mean :0.6579 Mean :0.34658
## 3rd Qu.: 1.30000 3rd Qu.:0.7299 3rd Qu.:0.42553
## Max. :11.99000 Max. :0.9434 Max. :0.52356
##
## Total_Probability Overround Year
## Min. :0.7639 Min. :-0.236148 Length:2941
## 1st Qu.:0.9988 1st Qu.:-0.001198 Class :character
## Median :1.0085 Median : 0.008472 Mode :character
## Mean :1.0045 Mean : 0.004462
## 3rd Qu.:1.0147 3rd Qu.: 0.014713
## Max. :1.0684 Max. : 0.068376
##
## Adjusted_Favorite_Probability Adjusted_Underdog_Probability
## Min. :0.5012 Min. :0.0673
## 1st Qu.:0.5780 1st Qu.:0.2725
## Median :0.6408 Median :0.3592
## Mean :0.6557 Mean :0.3443
## 3rd Qu.:0.7275 3rd Qu.:0.4220
## Max. :0.9327 Max. :0.4988
##
## Adjusted_Total_Probability
## Min. :1
## 1st Qu.:1
## Median :1
## Mean :1
## 3rd Qu.:1
## Max. :1
##
Create function to graphically assess over performance as a function of several variables. These are not inferential analyses but are instead meant to visualize the data to observe trends for further analysis. Use adjusted implied probabilities along with unit profits derived from non-adjusted odds to simulate what one actually would have won using best available odds.
gauge_over_performance = function(num_bin = 10, min_bin_size = 30, variable = NULL) {
# get bins for Favorite
df_odds_short$Favorite_Probability_Bin = cut(df_odds_short$Adjusted_Favorite_Probability, num_bin)
# get bins for Underdog
df_odds_short$Underdog_Probability_Bin = cut(df_odds_short$Adjusted_Underdog_Probability, num_bin)
if (is.null(variable)) {
# check over/under performance for Favorites
df_odds_short %>%
dplyr::group_by(Favorite_Probability_Bin) %>%
dplyr::summarise(
Prop_of_Victory = mean(Favorite_was_Winner)
, Size_of_Bin = length(Favorite_was_Winner)
, ROI = mean(Favorite_Unit_Profit)
) -> fav_perf
} else {
# create dummy variable for function
df_odds_short$Dummy = df_odds_short[
,which(colnames(df_odds_short) == sprintf("%s", variable))
]
# check over/under performance for Favorites
df_odds_short %>%
dplyr::group_by(Favorite_Probability_Bin, Dummy) %>%
dplyr::summarise(
Prop_of_Victory = mean(Favorite_was_Winner)
, Size_of_Bin = length(Favorite_was_Winner)
, ROI = mean(Favorite_Unit_Profit)
) -> fav_perf
}
# extract bins
fav_labs <- as.character(fav_perf$Favorite_Probability_Bin)
fav_bins = as.data.frame(
cbind(
lower = as.numeric( sub("\\((.+),.*", "\\1", fav_labs) )
, upper = as.numeric( sub("[^,]*,([^]]*)\\]", "\\1", fav_labs) )
)
)
# get value in middle of bin
fav_bins %>% dplyr::mutate(mid_bin = (lower + upper)/2 ) -> fav_bins
# add mid bin column
fav_perf$Mid_Bin = fav_bins$mid_bin
# add Over performance column
fav_perf %>% dplyr::mutate(Over_Performance = Prop_of_Victory - Mid_Bin) -> fav_perf
if (is.null(variable)) {
# plot over/under performance
fav_perf %>%
dplyr::filter(Size_of_Bin >= min_bin_size) %>%
ggplot(aes(x=Mid_Bin*100, y=Over_Performance * 100))+
geom_point()+
geom_smooth(se=F)+
geom_hline(yintercept = 0, linetype = "dotted")+
ylab("Over Performance (%)")+
xlab("Adjusted Implied Probability (%)")+
ggtitle("Favorites")->gg
print(gg)
# plot over/under performance
fav_perf %>%
dplyr::filter(Size_of_Bin >= min_bin_size) %>%
ggplot(aes(x=Mid_Bin * 100, y=Prop_of_Victory*100))+
geom_point()+
geom_smooth(se=F)+
ylab("Probability of Victory (%)")+
xlab("Adjusted Implied Probability (%)")+
geom_abline(slope=1, intercept=0, linetype = "dotted")+
ggtitle("Favorites")->gg
print(gg)
# plot ROI - only real difference is scale along y axis
fav_perf %>%
dplyr::filter(Size_of_Bin >= min_bin_size) %>%
ggplot(aes(x=Mid_Bin*100, y= ROI* 100))+
geom_point()+
geom_smooth(se=F)+
geom_hline(yintercept = 0, linetype = "dotted")+
ylab("ROI (%)")+
xlab("Adjusted Implied Probability (%)")+
ggtitle("Favorites") -> gg
print(gg)
} else {
# plot over/under performance
fav_perf %>%
dplyr::filter(Size_of_Bin >= min_bin_size) %>%
ggplot(aes(x=Mid_Bin*100, y=Over_Performance * 100, group=Dummy, colour = Dummy))+
geom_point()+
geom_smooth(se=F)+
geom_hline(yintercept = 0, linetype = "dotted")+
ylab("Over Performance (%)")+
xlab("Adjusted Implied Probability (%)")+
ggtitle("Favorites")+
labs(color=sprintf("%s", variable)) -> gg
print(gg)
# plot ROI - only real difference is scale along y axis
fav_perf %>%
dplyr::filter(Size_of_Bin >= min_bin_size) %>%
ggplot(aes(x=Mid_Bin*100, y= ROI* 100, group=Dummy, colour = Dummy))+
geom_point()+
geom_smooth(se=F)+
geom_hline(yintercept = 0, linetype = "dotted")+
ylab("ROI (%)")+
xlab("Adjusted Implied Probability (%)")+
ggtitle("Favorites")+
labs(color=sprintf("%s", variable)) -> gg
print(gg)
}
if (is.null(variable)) {
# check over/under performance for Underdogs
df_odds_short %>%
dplyr::group_by(Underdog_Probability_Bin) %>%
dplyr::summarise(
Prop_of_Victory = mean(!Favorite_was_Winner)
, Size_of_Bin = length(!Favorite_was_Winner)
, ROI = mean(Underdog_Unit_Profit)
) -> under_perf
} else {
# check over/under performance for Underdogs
df_odds_short %>%
dplyr::group_by(Underdog_Probability_Bin, Dummy) %>%
dplyr::summarise(
Prop_of_Victory = mean(!Favorite_was_Winner)
, Size_of_Bin = length(!Favorite_was_Winner)
, ROI = mean(Underdog_Unit_Profit)
) -> under_perf
}
# extract bins
under_labs <- as.character(under_perf$Underdog_Probability_Bin)
under_bins = as.data.frame(
cbind(
lower = as.numeric( sub("\\((.+),.*", "\\1", under_labs) )
, upper = as.numeric( sub("[^,]*,([^]]*)\\]", "\\1", under_labs) )
)
)
# get value in middle of bin
under_bins %>% dplyr::mutate(mid_bin = (lower + upper)/2 ) -> under_bins
# add mid bin column
under_perf$Mid_Bin = under_bins$mid_bin
# add Over performance column
under_perf %>% dplyr::mutate(Over_Performance = Prop_of_Victory - Mid_Bin) -> under_perf
if (is.null(variable)) {
# plot over/under performance
under_perf %>%
dplyr::filter(Size_of_Bin >= min_bin_size) %>%
ggplot(aes(x=Mid_Bin*100, y=Over_Performance * 100))+
geom_point()+
geom_smooth(se=F)+
geom_hline(yintercept = 0, linetype = "dotted")+
ylab("Over Performance (%)")+
xlab("Adjusted Implied Probability (%)")+
ggtitle("Underdogs")->gg
print(gg)
# plot over/under performance
under_perf %>%
dplyr::filter(Size_of_Bin >= min_bin_size) %>%
ggplot(aes(x=Mid_Bin * 100, y=Prop_of_Victory*100))+
geom_point()+
geom_smooth(se=F)+
ylab("Probability of Victory (%)")+
xlab("Adjusted Implied Probability (%)")+
geom_abline(slope=1, intercept=0, linetype = "dotted")+
ggtitle("Underdogs")->gg
print(gg)
under_perf %>%
dplyr::filter(Size_of_Bin >= min_bin_size) %>%
ggplot(aes(x=Mid_Bin*100, y=ROI * 100))+
geom_point()+
geom_smooth(se=F)+
geom_hline(yintercept = 0, linetype = "dotted")+
ylab("ROI (%)")+
xlab("Adjusted Implied Probability (%)")+
ggtitle("Underdogs")-> gg
print(gg)
} else {
# plot over/under performance
under_perf %>%
dplyr::filter(Size_of_Bin >= min_bin_size) %>%
ggplot(aes(x=Mid_Bin*100, y=Over_Performance * 100, group=Dummy, colour = Dummy))+
geom_point()+
geom_smooth(se=F)+
geom_hline(yintercept = 0, linetype = "dotted")+
ylab("Over Performance (%)")+
xlab("Adjusted Implied Probability (%)")+
ggtitle("Underdogs")+
labs(color=sprintf("%s", variable)) -> gg
print(gg)
under_perf %>%
dplyr::filter(Size_of_Bin >= min_bin_size) %>%
ggplot(aes(x=Mid_Bin*100, y=ROI * 100, group=Dummy, colour = Dummy))+
geom_point()+
geom_smooth(se=F)+
geom_hline(yintercept = 0, linetype = "dotted")+
ylab("ROI (%)")+
xlab("Adjusted Implied Probability (%)")+
ggtitle("Underdogs")+
labs(color=sprintf("%s", variable)) -> gg
print(gg)
}
# process to return()
under_perf$Is_Fav = F
under_perf %>%
rename(Probability_Bin = Underdog_Probability_Bin) -> under_perf
fav_perf$Is_Fav = T
fav_perf %>%
rename(Probability_Bin = Favorite_Probability_Bin) -> fav_perf
return(rbind(fav_perf, under_perf))
}
Look at how expected performance predicts over performance.
odds_perf = gauge_over_performance(num_bin = 10, min_bin_size = 100, variable = NULL)
kable(odds_perf)
Probability_Bin | Prop_of_Victory | Size_of_Bin | ROI | Mid_Bin | Over_Performance | Is_Fav |
---|---|---|---|---|---|---|
(0.501,0.544] | 0.5160494 | 405 | -0.0155802 | 0.52250 | -0.0064506 | TRUE |
(0.544,0.587] | 0.5361050 | 457 | -0.0585558 | 0.56550 | -0.0293950 | TRUE |
(0.587,0.631] | 0.5376984 | 504 | -0.1169643 | 0.60900 | -0.0713016 | TRUE |
(0.631,0.674] | 0.6410256 | 390 | -0.0204615 | 0.65250 | -0.0114744 | TRUE |
(0.674,0.717] | 0.7078947 | 380 | 0.0163947 | 0.69550 | 0.0123947 | TRUE |
(0.717,0.76] | 0.7589577 | 307 | 0.0231922 | 0.73850 | 0.0204577 | TRUE |
(0.76,0.803] | 0.8384279 | 229 | 0.0692576 | 0.78150 | 0.0569279 | TRUE |
(0.803,0.846] | 0.8271605 | 162 | -0.0021605 | 0.82450 | 0.0026605 | TRUE |
(0.846,0.89] | 0.8961039 | 77 | 0.0325974 | 0.86800 | 0.0281039 | TRUE |
(0.89,0.933] | 0.9333333 | 30 | 0.0236667 | 0.91150 | 0.0218333 | TRUE |
(0.0669,0.11] | 0.0666667 | 30 | -0.2100000 | 0.08845 | -0.0217833 | FALSE |
(0.11,0.154] | 0.1038961 | 77 | -0.1935065 | 0.13200 | -0.0281039 | FALSE |
(0.154,0.197] | 0.1728395 | 162 | -0.0545062 | 0.17550 | -0.0026605 | FALSE |
(0.197,0.24] | 0.1615721 | 229 | -0.2699127 | 0.21850 | -0.0569279 | FALSE |
(0.24,0.283] | 0.2410423 | 307 | -0.0922150 | 0.26150 | -0.0204577 | FALSE |
(0.283,0.326] | 0.2921053 | 380 | -0.0502105 | 0.30450 | -0.0123947 | FALSE |
(0.326,0.369] | 0.3589744 | 390 | 0.0181795 | 0.34750 | 0.0114744 | FALSE |
(0.369,0.413] | 0.4623016 | 504 | 0.1802976 | 0.39100 | 0.0713016 | FALSE |
(0.413,0.456] | 0.4638950 | 457 | 0.0675055 | 0.43450 | 0.0293950 | FALSE |
(0.456,0.499] | 0.4839506 | 405 | 0.0109136 | 0.47750 | 0.0064506 | FALSE |
Is there any stability across years? Need to reduce minimum bin size to get estimates. As a result, estimates will be more noisy.
odds_perf_by_year = gauge_over_performance(num_bin = 10, min_bin_size = 30, variable = "Year")
kable(odds_perf_by_year)
Probability_Bin | Dummy | Prop_of_Victory | Size_of_Bin | ROI | Mid_Bin | Over_Performance | Is_Fav |
---|---|---|---|---|---|---|---|
(0.501,0.544] | 2013 | 0.6000000 | 10 | 0.1580000 | 0.52250 | 0.0775000 | TRUE |
(0.501,0.544] | 2014 | 0.6086957 | 23 | 0.1791304 | 0.52250 | 0.0861957 | TRUE |
(0.501,0.544] | 2015 | 0.4791667 | 48 | -0.0906250 | 0.52250 | -0.0433333 | TRUE |
(0.501,0.544] | 2016 | 0.5777778 | 90 | 0.0936667 | 0.52250 | 0.0552778 | TRUE |
(0.501,0.544] | 2017 | 0.5869565 | 46 | 0.1126087 | 0.52250 | 0.0644565 | TRUE |
(0.501,0.544] | 2018 | 0.3859649 | 57 | -0.2568421 | 0.52250 | -0.1365351 | TRUE |
(0.501,0.544] | 2019 | 0.4383562 | 73 | -0.1578082 | 0.52250 | -0.0841438 | TRUE |
(0.501,0.544] | 2020 | 0.5714286 | 56 | 0.0882143 | 0.52250 | 0.0489286 | TRUE |
(0.501,0.544] | 2021 | 0.5000000 | 2 | -0.0250000 | 0.52250 | -0.0225000 | TRUE |
(0.544,0.587] | 2013 | 0.4444444 | 9 | -0.1977778 | 0.56550 | -0.1210556 | TRUE |
(0.544,0.587] | 2014 | 0.4324324 | 37 | -0.2456757 | 0.56550 | -0.1330676 | TRUE |
(0.544,0.587] | 2015 | 0.4848485 | 66 | -0.1515152 | 0.56550 | -0.0806515 | TRUE |
(0.544,0.587] | 2016 | 0.4590164 | 61 | -0.1957377 | 0.56550 | -0.1064836 | TRUE |
(0.544,0.587] | 2017 | 0.5915493 | 71 | 0.0374648 | 0.56550 | 0.0260493 | TRUE |
(0.544,0.587] | 2018 | 0.6000000 | 55 | 0.0556364 | 0.56550 | 0.0345000 | TRUE |
(0.544,0.587] | 2019 | 0.6025641 | 78 | 0.0608974 | 0.56550 | 0.0370641 | TRUE |
(0.544,0.587] | 2020 | 0.5479452 | 73 | -0.0369863 | 0.56550 | -0.0175548 | TRUE |
(0.544,0.587] | 2021 | 0.4285714 | 7 | -0.2457143 | 0.56550 | -0.1369286 | TRUE |
(0.587,0.631] | 2013 | 0.5000000 | 18 | -0.1722222 | 0.60900 | -0.1090000 | TRUE |
(0.587,0.631] | 2014 | 0.6046512 | 43 | -0.0016279 | 0.60900 | -0.0043488 | TRUE |
(0.587,0.631] | 2015 | 0.4545455 | 66 | -0.2554545 | 0.60900 | -0.1544545 | TRUE |
(0.587,0.631] | 2016 | 0.4791667 | 96 | -0.2204167 | 0.60900 | -0.1298333 | TRUE |
(0.587,0.631] | 2017 | 0.6557377 | 61 | 0.0716393 | 0.60900 | 0.0467377 | TRUE |
(0.587,0.631] | 2018 | 0.5000000 | 84 | -0.1680952 | 0.60900 | -0.1090000 | TRUE |
(0.587,0.631] | 2019 | 0.5068493 | 73 | -0.1689041 | 0.60900 | -0.1021507 | TRUE |
(0.587,0.631] | 2020 | 0.6315789 | 57 | 0.0371930 | 0.60900 | 0.0225789 | TRUE |
(0.587,0.631] | 2021 | 0.8333333 | 6 | 0.3666667 | 0.60900 | 0.2243333 | TRUE |
(0.631,0.674] | 2013 | 0.6666667 | 15 | 0.0280000 | 0.65250 | 0.0141667 | TRUE |
(0.631,0.674] | 2014 | 0.5744681 | 47 | -0.1161702 | 0.65250 | -0.0780319 | TRUE |
(0.631,0.674] | 2015 | 0.6029412 | 68 | -0.0769118 | 0.65250 | -0.0495588 | TRUE |
(0.631,0.674] | 2016 | 0.7500000 | 56 | 0.1391071 | 0.65250 | 0.0975000 | TRUE |
(0.631,0.674] | 2017 | 0.5476190 | 42 | -0.1669048 | 0.65250 | -0.1048810 | TRUE |
(0.631,0.674] | 2018 | 0.6400000 | 50 | -0.0222000 | 0.65250 | -0.0125000 | TRUE |
(0.631,0.674] | 2019 | 0.6938776 | 49 | 0.0622449 | 0.65250 | 0.0413776 | TRUE |
(0.631,0.674] | 2020 | 0.6491228 | 57 | -0.0077193 | 0.65250 | -0.0033772 | TRUE |
(0.631,0.674] | 2021 | 0.6666667 | 6 | 0.0016667 | 0.65250 | 0.0141667 | TRUE |
(0.674,0.717] | 2013 | 0.8235294 | 17 | 0.2058824 | 0.69550 | 0.1280294 | TRUE |
(0.674,0.717] | 2014 | 0.7254902 | 51 | 0.0450980 | 0.69550 | 0.0299902 | TRUE |
(0.674,0.717] | 2015 | 0.6551724 | 58 | -0.0601724 | 0.69550 | -0.0403276 | TRUE |
(0.674,0.717] | 2016 | 0.6956522 | 69 | -0.0079710 | 0.69550 | 0.0001522 | TRUE |
(0.674,0.717] | 2017 | 0.7272727 | 33 | 0.0348485 | 0.69550 | 0.0317727 | TRUE |
(0.674,0.717] | 2018 | 0.8372093 | 43 | 0.2097674 | 0.69550 | 0.1417093 | TRUE |
(0.674,0.717] | 2019 | 0.6250000 | 40 | -0.1007500 | 0.69550 | -0.0705000 | TRUE |
(0.674,0.717] | 2020 | 0.7142857 | 63 | 0.0231746 | 0.69550 | 0.0187857 | TRUE |
(0.674,0.717] | 2021 | 0.3333333 | 6 | -0.5216667 | 0.69550 | -0.3621667 | TRUE |
(0.717,0.76] | 2013 | 0.9411765 | 17 | 0.2811765 | 0.73850 | 0.2026765 | TRUE |
(0.717,0.76] | 2014 | 0.8250000 | 40 | 0.1090000 | 0.73850 | 0.0865000 | TRUE |
(0.717,0.76] | 2015 | 0.7187500 | 32 | -0.0362500 | 0.73850 | -0.0197500 | TRUE |
(0.717,0.76] | 2016 | 0.7872340 | 47 | 0.0582979 | 0.73850 | 0.0487340 | TRUE |
(0.717,0.76] | 2017 | 0.7500000 | 48 | 0.0143750 | 0.73850 | 0.0115000 | TRUE |
(0.717,0.76] | 2018 | 0.7380952 | 42 | -0.0042857 | 0.73850 | -0.0004048 | TRUE |
(0.717,0.76] | 2019 | 0.7352941 | 34 | -0.0094118 | 0.73850 | -0.0032059 | TRUE |
(0.717,0.76] | 2020 | 0.6666667 | 42 | -0.0973810 | 0.73850 | -0.0718333 | TRUE |
(0.717,0.76] | 2021 | 0.8000000 | 5 | 0.0600000 | 0.73850 | 0.0615000 | TRUE |
(0.76,0.803] | 2013 | 0.9090909 | 11 | 0.1609091 | 0.78150 | 0.1275909 | TRUE |
(0.76,0.803] | 2014 | 0.7500000 | 24 | -0.0437500 | 0.78150 | -0.0315000 | TRUE |
(0.76,0.803] | 2015 | 0.7560976 | 41 | -0.0409756 | 0.78150 | -0.0254024 | TRUE |
(0.76,0.803] | 2016 | 0.9090909 | 33 | 0.1600000 | 0.78150 | 0.1275909 | TRUE |
(0.76,0.803] | 2017 | 0.8857143 | 35 | 0.1317143 | 0.78150 | 0.1042143 | TRUE |
(0.76,0.803] | 2018 | 0.8214286 | 28 | 0.0521429 | 0.78150 | 0.0399286 | TRUE |
(0.76,0.803] | 2019 | 0.8400000 | 25 | 0.0748000 | 0.78150 | 0.0585000 | TRUE |
(0.76,0.803] | 2020 | 0.8709677 | 31 | 0.1067742 | 0.78150 | 0.0894677 | TRUE |
(0.76,0.803] | 2021 | 1.0000000 | 1 | 0.2900000 | 0.78150 | 0.2185000 | TRUE |
(0.803,0.846] | 2013 | 0.8000000 | 10 | -0.0350000 | 0.82450 | -0.0245000 | TRUE |
(0.803,0.846] | 2014 | 0.8571429 | 21 | 0.0380952 | 0.82450 | 0.0326429 | TRUE |
(0.803,0.846] | 2015 | 0.8205128 | 39 | -0.0102564 | 0.82450 | -0.0039872 | TRUE |
(0.803,0.846] | 2016 | 0.7368421 | 19 | -0.1110526 | 0.82450 | -0.0876579 | TRUE |
(0.803,0.846] | 2017 | 0.7619048 | 21 | -0.0819048 | 0.82450 | -0.0625952 | TRUE |
(0.803,0.846] | 2018 | 0.8636364 | 22 | 0.0309091 | 0.82450 | 0.0391364 | TRUE |
(0.803,0.846] | 2019 | 0.8666667 | 15 | 0.0506667 | 0.82450 | 0.0421667 | TRUE |
(0.803,0.846] | 2020 | 0.9166667 | 12 | 0.1108333 | 0.82450 | 0.0921667 | TRUE |
(0.803,0.846] | 2021 | 1.0000000 | 3 | 0.2200000 | 0.82450 | 0.1755000 | TRUE |
(0.846,0.89] | 2013 | 1.0000000 | 6 | 0.1500000 | 0.86800 | 0.1320000 | TRUE |
(0.846,0.89] | 2014 | 0.8000000 | 15 | -0.0713333 | 0.86800 | -0.0680000 | TRUE |
(0.846,0.89] | 2015 | 0.8823529 | 17 | 0.0088235 | 0.86800 | 0.0143529 | TRUE |
(0.846,0.89] | 2016 | 1.0000000 | 5 | 0.1420000 | 0.86800 | 0.1320000 | TRUE |
(0.846,0.89] | 2017 | 0.8571429 | 7 | -0.0142857 | 0.86800 | -0.0108571 | TRUE |
(0.846,0.89] | 2018 | 1.0000000 | 10 | 0.1530000 | 0.86800 | 0.1320000 | TRUE |
(0.846,0.89] | 2019 | 0.8750000 | 8 | 0.0150000 | 0.86800 | 0.0070000 | TRUE |
(0.846,0.89] | 2020 | 0.8888889 | 9 | 0.0300000 | 0.86800 | 0.0208889 | TRUE |
(0.89,0.933] | 2014 | 1.0000000 | 4 | 0.0900000 | 0.91150 | 0.0885000 | TRUE |
(0.89,0.933] | 2015 | 0.8750000 | 8 | -0.0487500 | 0.91150 | -0.0365000 | TRUE |
(0.89,0.933] | 2016 | 1.0000000 | 3 | 0.0933333 | 0.91150 | 0.0885000 | TRUE |
(0.89,0.933] | 2017 | 1.0000000 | 3 | 0.1100000 | 0.91150 | 0.0885000 | TRUE |
(0.89,0.933] | 2018 | 1.0000000 | 5 | 0.1080000 | 0.91150 | 0.0885000 | TRUE |
(0.89,0.933] | 2019 | 1.0000000 | 4 | 0.1050000 | 0.91150 | 0.0885000 | TRUE |
(0.89,0.933] | 2020 | 0.6666667 | 3 | -0.2766667 | 0.91150 | -0.2448333 | TRUE |
(0.0669,0.11] | 2014 | 0.0000000 | 4 | -1.0000000 | 0.08845 | -0.0884500 | FALSE |
(0.0669,0.11] | 2015 | 0.1250000 | 8 | 0.6237500 | 0.08845 | 0.0365500 | FALSE |
(0.0669,0.11] | 2016 | 0.0000000 | 3 | -1.0000000 | 0.08845 | -0.0884500 | FALSE |
(0.0669,0.11] | 2017 | 0.0000000 | 3 | -1.0000000 | 0.08845 | -0.0884500 | FALSE |
(0.0669,0.11] | 2018 | 0.0000000 | 5 | -1.0000000 | 0.08845 | -0.0884500 | FALSE |
(0.0669,0.11] | 2019 | 0.0000000 | 4 | -1.0000000 | 0.08845 | -0.0884500 | FALSE |
(0.0669,0.11] | 2020 | 0.3333333 | 3 | 2.5700000 | 0.08845 | 0.2448833 | FALSE |
(0.11,0.154] | 2013 | 0.0000000 | 6 | -1.0000000 | 0.13200 | -0.1320000 | FALSE |
(0.11,0.154] | 2014 | 0.2000000 | 15 | 0.5980000 | 0.13200 | 0.0680000 | FALSE |
(0.11,0.154] | 2015 | 0.1176471 | 17 | -0.1276471 | 0.13200 | -0.0143529 | FALSE |
(0.11,0.154] | 2016 | 0.0000000 | 5 | -1.0000000 | 0.13200 | -0.1320000 | FALSE |
(0.11,0.154] | 2017 | 0.1428571 | 7 | -0.0671429 | 0.13200 | 0.0108571 | FALSE |
(0.11,0.154] | 2018 | 0.0000000 | 10 | -1.0000000 | 0.13200 | -0.1320000 | FALSE |
(0.11,0.154] | 2019 | 0.1250000 | 8 | 0.0937500 | 0.13200 | -0.0070000 | FALSE |
(0.11,0.154] | 2020 | 0.1111111 | 9 | -0.1088889 | 0.13200 | -0.0208889 | FALSE |
(0.154,0.197] | 2013 | 0.2000000 | 10 | 0.0550000 | 0.17550 | 0.0245000 | FALSE |
(0.154,0.197] | 2014 | 0.1428571 | 21 | -0.2428571 | 0.17550 | -0.0326429 | FALSE |
(0.154,0.197] | 2015 | 0.1794872 | 39 | -0.0174359 | 0.17550 | 0.0039872 | FALSE |
(0.154,0.197] | 2016 | 0.2631579 | 19 | 0.4815789 | 0.17550 | 0.0876579 | FALSE |
(0.154,0.197] | 2017 | 0.2380952 | 21 | 0.3666667 | 0.17550 | 0.0625952 | FALSE |
(0.154,0.197] | 2018 | 0.1363636 | 22 | -0.2586364 | 0.17550 | -0.0391364 | FALSE |
(0.154,0.197] | 2019 | 0.1333333 | 15 | -0.3406667 | 0.17550 | -0.0421667 | FALSE |
(0.154,0.197] | 2020 | 0.0833333 | 12 | -0.5541667 | 0.17550 | -0.0921667 | FALSE |
(0.154,0.197] | 2021 | 0.0000000 | 3 | -1.0000000 | 0.17550 | -0.1755000 | FALSE |
(0.197,0.24] | 2013 | 0.0909091 | 11 | -0.5772727 | 0.21850 | -0.1275909 | FALSE |
(0.197,0.24] | 2014 | 0.2500000 | 24 | 0.1258333 | 0.21850 | 0.0315000 | FALSE |
(0.197,0.24] | 2015 | 0.2439024 | 41 | 0.1065854 | 0.21850 | 0.0254024 | FALSE |
(0.197,0.24] | 2016 | 0.0909091 | 33 | -0.5712121 | 0.21850 | -0.1275909 | FALSE |
(0.197,0.24] | 2017 | 0.1142857 | 35 | -0.4914286 | 0.21850 | -0.1042143 | FALSE |
(0.197,0.24] | 2018 | 0.1785714 | 28 | -0.1767857 | 0.21850 | -0.0399286 | FALSE |
(0.197,0.24] | 2019 | 0.1600000 | 25 | -0.2760000 | 0.21850 | -0.0585000 | FALSE |
(0.197,0.24] | 2020 | 0.1290323 | 31 | -0.4500000 | 0.21850 | -0.0894677 | FALSE |
(0.197,0.24] | 2021 | 0.0000000 | 1 | -1.0000000 | 0.21850 | -0.2185000 | FALSE |
(0.24,0.283] | 2013 | 0.0588235 | 17 | -0.7847059 | 0.26150 | -0.2026765 | FALSE |
(0.24,0.283] | 2014 | 0.1750000 | 40 | -0.3487500 | 0.26150 | -0.0865000 | FALSE |
(0.24,0.283] | 2015 | 0.2812500 | 32 | 0.0665625 | 0.26150 | 0.0197500 | FALSE |
(0.24,0.283] | 2016 | 0.2127660 | 47 | -0.1972340 | 0.26150 | -0.0487340 | FALSE |
(0.24,0.283] | 2017 | 0.2500000 | 48 | -0.0447917 | 0.26150 | -0.0115000 | FALSE |
(0.24,0.283] | 2018 | 0.2619048 | 42 | 0.0047619 | 0.26150 | 0.0004048 | FALSE |
(0.24,0.283] | 2019 | 0.2647059 | 34 | -0.0150000 | 0.26150 | 0.0032059 | FALSE |
(0.24,0.283] | 2020 | 0.3333333 | 42 | 0.2304762 | 0.26150 | 0.0718333 | FALSE |
(0.24,0.283] | 2021 | 0.2000000 | 5 | -0.2200000 | 0.26150 | -0.0615000 | FALSE |
(0.283,0.326] | 2013 | 0.1764706 | 17 | -0.4088235 | 0.30450 | -0.1280294 | FALSE |
(0.283,0.326] | 2014 | 0.2745098 | 51 | -0.1156863 | 0.30450 | -0.0299902 | FALSE |
(0.283,0.326] | 2015 | 0.3448276 | 58 | 0.1263793 | 0.30450 | 0.0403276 | FALSE |
(0.283,0.326] | 2016 | 0.3043478 | 69 | -0.0256522 | 0.30450 | -0.0001522 | FALSE |
(0.283,0.326] | 2017 | 0.2727273 | 33 | -0.1075758 | 0.30450 | -0.0317727 | FALSE |
(0.283,0.326] | 2018 | 0.1627907 | 43 | -0.4579070 | 0.30450 | -0.1417093 | FALSE |
(0.283,0.326] | 2019 | 0.3750000 | 40 | 0.2240000 | 0.30450 | 0.0705000 | FALSE |
(0.283,0.326] | 2020 | 0.2857143 | 63 | -0.0673016 | 0.30450 | -0.0187857 | FALSE |
(0.283,0.326] | 2021 | 0.6666667 | 6 | 1.1216667 | 0.30450 | 0.3621667 | FALSE |
(0.326,0.369] | 2013 | 0.3333333 | 15 | -0.0546667 | 0.34750 | -0.0141667 | FALSE |
(0.326,0.369] | 2014 | 0.4255319 | 47 | 0.2155319 | 0.34750 | 0.0780319 | FALSE |
(0.326,0.369] | 2015 | 0.3970588 | 68 | 0.1205882 | 0.34750 | 0.0495588 | FALSE |
(0.326,0.369] | 2016 | 0.2500000 | 56 | -0.2896429 | 0.34750 | -0.0975000 | FALSE |
(0.326,0.369] | 2017 | 0.4523810 | 42 | 0.2807143 | 0.34750 | 0.1048810 | FALSE |
(0.326,0.369] | 2018 | 0.3600000 | 50 | 0.0230000 | 0.34750 | 0.0125000 | FALSE |
(0.326,0.369] | 2019 | 0.3061224 | 49 | -0.1281633 | 0.34750 | -0.0413776 | FALSE |
(0.326,0.369] | 2020 | 0.3508772 | 57 | -0.0136842 | 0.34750 | 0.0033772 | FALSE |
(0.326,0.369] | 2021 | 0.3333333 | 6 | -0.0133333 | 0.34750 | -0.0141667 | FALSE |
(0.369,0.413] | 2013 | 0.5000000 | 18 | 0.2916667 | 0.39100 | 0.1090000 | FALSE |
(0.369,0.413] | 2014 | 0.3953488 | 43 | 0.0104651 | 0.39100 | 0.0043488 | FALSE |
(0.369,0.413] | 2015 | 0.5454545 | 66 | 0.3907576 | 0.39100 | 0.1544545 | FALSE |
(0.369,0.413] | 2016 | 0.5208333 | 96 | 0.3155208 | 0.39100 | 0.1298333 | FALSE |
(0.369,0.413] | 2017 | 0.3442623 | 61 | -0.1122951 | 0.39100 | -0.0467377 | FALSE |
(0.369,0.413] | 2018 | 0.5000000 | 84 | 0.2789286 | 0.39100 | 0.1090000 | FALSE |
(0.369,0.413] | 2019 | 0.4931507 | 73 | 0.2667123 | 0.39100 | 0.1021507 | FALSE |
(0.369,0.413] | 2020 | 0.3684211 | 57 | -0.0591228 | 0.39100 | -0.0225789 | FALSE |
(0.369,0.413] | 2021 | 0.1666667 | 6 | -0.5983333 | 0.39100 | -0.2243333 | FALSE |
(0.413,0.456] | 2013 | 0.5555556 | 9 | 0.3000000 | 0.43450 | 0.1210556 | FALSE |
(0.413,0.456] | 2014 | 0.5675676 | 37 | 0.3216216 | 0.43450 | 0.1330676 | FALSE |
(0.413,0.456] | 2015 | 0.5151515 | 66 | 0.1771212 | 0.43450 | 0.0806515 | FALSE |
(0.413,0.456] | 2016 | 0.5409836 | 61 | 0.2350820 | 0.43450 | 0.1064836 | FALSE |
(0.413,0.456] | 2017 | 0.4084507 | 71 | -0.0650704 | 0.43450 | -0.0260493 | FALSE |
(0.413,0.456] | 2018 | 0.4000000 | 55 | -0.0750909 | 0.43450 | -0.0345000 | FALSE |
(0.413,0.456] | 2019 | 0.3974359 | 78 | -0.0793590 | 0.43450 | -0.0370641 | FALSE |
(0.413,0.456] | 2020 | 0.4520548 | 73 | 0.0404110 | 0.43450 | 0.0175548 | FALSE |
(0.413,0.456] | 2021 | 0.5714286 | 7 | 0.3157143 | 0.43450 | 0.1369286 | FALSE |
(0.456,0.499] | 2013 | 0.4000000 | 10 | -0.1610000 | 0.47750 | -0.0775000 | FALSE |
(0.456,0.499] | 2014 | 0.3913043 | 23 | -0.1865217 | 0.47750 | -0.0861957 | FALSE |
(0.456,0.499] | 2015 | 0.5208333 | 48 | 0.0939583 | 0.47750 | 0.0433333 | FALSE |
(0.456,0.499] | 2016 | 0.4222222 | 90 | -0.1254444 | 0.47750 | -0.0552778 | FALSE |
(0.456,0.499] | 2017 | 0.4130435 | 46 | -0.1436957 | 0.47750 | -0.0644565 | FALSE |
(0.456,0.499] | 2018 | 0.6140351 | 57 | 0.3015789 | 0.47750 | 0.1365351 | FALSE |
(0.456,0.499] | 2019 | 0.5616438 | 73 | 0.1806849 | 0.47750 | 0.0841438 | FALSE |
(0.456,0.499] | 2020 | 0.4285714 | 56 | -0.1198214 | 0.47750 | -0.0489286 | FALSE |
(0.456,0.499] | 2021 | 0.5000000 | 2 | 0.0200000 | 0.47750 | 0.0225000 | FALSE |
Does the method of victory affect the relationship between odds and outcome? Reduce number of bins (compared to Year comparison above) to stabilize estimates. Graphs do not tell whole story due to number of data points available across bins.
odds_perf_by_method = gauge_over_performance(num_bin = 5, min_bin_size = 30, variable = "Method")
kable(odds_perf_by_method)
Probability_Bin | Dummy | Prop_of_Victory | Size_of_Bin | ROI | Mid_Bin | Over_Performance | Is_Fav |
---|---|---|---|---|---|---|---|
(0.501,0.587] | KO/TKO | 0.5240000 | 250 | -0.0403200 | 0.54400 | -0.0200000 | TRUE |
(0.501,0.587] | M-DEC | 0.6666667 | 9 | 0.1844444 | 0.54400 | 0.1226667 | TRUE |
(0.501,0.587] | S-DEC | 0.4385965 | 114 | -0.2028947 | 0.54400 | -0.1054035 | TRUE |
(0.501,0.587] | SUB | 0.5548387 | 155 | 0.0037419 | 0.54400 | 0.0108387 | TRUE |
(0.501,0.587] | U-DEC | 0.5419162 | 334 | -0.0062874 | 0.54400 | -0.0020838 | TRUE |
(0.587,0.674] | KO/TKO | 0.6028881 | 277 | -0.0437906 | 0.63050 | -0.0276119 | TRUE |
(0.587,0.674] | M-DEC | 0.6666667 | 6 | 0.0666667 | 0.63050 | 0.0361667 | TRUE |
(0.587,0.674] | S-DEC | 0.4107143 | 112 | -0.3425893 | 0.63050 | -0.2197857 | TRUE |
(0.587,0.674] | SUB | 0.5144928 | 138 | -0.1839130 | 0.63050 | -0.1160072 | TRUE |
(0.587,0.674] | U-DEC | 0.6454294 | 361 | 0.0236842 | 0.63050 | 0.0149294 | TRUE |
(0.674,0.76] | KO/TKO | 0.7268722 | 227 | 0.0123789 | 0.71700 | 0.0098722 | TRUE |
(0.674,0.76] | M-DEC | 0.0000000 | 1 | -1.0000000 | 0.71700 | -0.7170000 | TRUE |
(0.674,0.76] | S-DEC | 0.5714286 | 63 | -0.1866667 | 0.71700 | -0.1455714 | TRUE |
(0.674,0.76] | SUB | 0.7304348 | 115 | 0.0180000 | 0.71700 | 0.0134348 | TRUE |
(0.674,0.76] | U-DEC | 0.7722420 | 281 | 0.0755516 | 0.71700 | 0.0552420 | TRUE |
(0.76,0.846] | KO/TKO | 0.8014184 | 141 | -0.0039716 | 0.80300 | -0.0015816 | TRUE |
(0.76,0.846] | M-DEC | 1.0000000 | 1 | 0.2600000 | 0.80300 | 0.1970000 | TRUE |
(0.76,0.846] | S-DEC | 0.4500000 | 20 | -0.4270000 | 0.80300 | -0.3530000 | TRUE |
(0.76,0.846] | SUB | 0.8426966 | 89 | 0.0495506 | 0.80300 | 0.0396966 | TRUE |
(0.76,0.846] | U-DEC | 0.9142857 | 140 | 0.1424286 | 0.80300 | 0.1112857 | TRUE |
(0.846,0.933] | KO/TKO | 0.8510638 | 47 | -0.0374468 | 0.88950 | -0.0384362 | TRUE |
(0.846,0.933] | S-DEC | 1.0000000 | 3 | 0.1533333 | 0.88950 | 0.1105000 | TRUE |
(0.846,0.933] | SUB | 0.9583333 | 24 | 0.0820833 | 0.88950 | 0.0688333 | TRUE |
(0.846,0.933] | U-DEC | 0.9393939 | 33 | 0.0772727 | 0.88950 | 0.0498939 | TRUE |
(0.0669,0.154] | KO/TKO | 0.1489362 | 47 | 0.3393617 | 0.11045 | 0.0384862 | FALSE |
(0.0669,0.154] | S-DEC | 0.0000000 | 3 | -1.0000000 | 0.11045 | -0.1104500 | FALSE |
(0.0669,0.154] | SUB | 0.0416667 | 24 | -0.7025000 | 0.11045 | -0.0687833 | FALSE |
(0.0669,0.154] | U-DEC | 0.0606061 | 33 | -0.5239394 | 0.11045 | -0.0498439 | FALSE |
(0.154,0.24] | KO/TKO | 0.1985816 | 141 | -0.0098582 | 0.19700 | 0.0015816 | FALSE |
(0.154,0.24] | M-DEC | 0.0000000 | 1 | -1.0000000 | 0.19700 | -0.1970000 | FALSE |
(0.154,0.24] | S-DEC | 0.5500000 | 20 | 1.6125000 | 0.19700 | 0.3530000 | FALSE |
(0.154,0.24] | SUB | 0.1573034 | 89 | -0.2049438 | 0.19700 | -0.0396966 | FALSE |
(0.154,0.24] | U-DEC | 0.0857143 | 140 | -0.5875714 | 0.19700 | -0.1112857 | FALSE |
(0.24,0.326] | KO/TKO | 0.2731278 | 227 | -0.0531718 | 0.28300 | -0.0098722 | FALSE |
(0.24,0.326] | M-DEC | 1.0000000 | 1 | 2.0300000 | 0.28300 | 0.7170000 | FALSE |
(0.24,0.326] | S-DEC | 0.4285714 | 63 | 0.5015873 | 0.28300 | 0.1455714 | FALSE |
(0.24,0.326] | SUB | 0.2695652 | 115 | -0.0703478 | 0.28300 | -0.0134348 | FALSE |
(0.24,0.326] | U-DEC | 0.2277580 | 281 | -0.2165836 | 0.28300 | -0.0552420 | FALSE |
(0.326,0.413] | KO/TKO | 0.3971119 | 277 | 0.0597473 | 0.36950 | 0.0276119 | FALSE |
(0.326,0.413] | M-DEC | 0.3333333 | 6 | -0.0916667 | 0.36950 | -0.0361667 | FALSE |
(0.326,0.413] | S-DEC | 0.5892857 | 112 | 0.5446429 | 0.36950 | 0.2197857 | FALSE |
(0.326,0.413] | SUB | 0.4855072 | 138 | 0.2974638 | 0.36950 | 0.1160072 | FALSE |
(0.326,0.413] | U-DEC | 0.3545706 | 361 | -0.0556510 | 0.36950 | -0.0149294 | FALSE |
(0.413,0.499] | KO/TKO | 0.4760000 | 250 | 0.0419600 | 0.45600 | 0.0200000 | FALSE |
(0.413,0.499] | M-DEC | 0.3333333 | 9 | -0.2344444 | 0.45600 | -0.1226667 | FALSE |
(0.413,0.499] | S-DEC | 0.5614035 | 114 | 0.2307018 | 0.45600 | 0.1054035 | FALSE |
(0.413,0.499] | SUB | 0.4451613 | 155 | -0.0121935 | 0.45600 | -0.0108387 | FALSE |
(0.413,0.499] | U-DEC | 0.4580838 | 334 | 0.0074251 | 0.45600 | 0.0020838 | FALSE |
How does fight finishing method vary with implied probability of vegas odds?
odds_perf_by_method %>%
dplyr::filter(Is_Fav == T) %>%
ggplot(aes(x=Mid_Bin, y=Size_of_Bin, group = Dummy, color = Dummy))+
geom_point()+
geom_smooth(se=F)+
ylab("Count")+
xlab("Adjusted Implied Probability (%)")+
ggtitle("Favorites")+
labs(color="Method")
odds_perf_by_method %>%
dplyr::filter(Is_Fav == F) %>%
ggplot(aes(x=Mid_Bin, y=Size_of_Bin, group = Dummy, color = Dummy))+
geom_point()+
geom_smooth(se=F)+
ylab("Count")+
xlab("Adjusted Implied Probability (%)")+
ggtitle("Underdogs")+
labs(color="Method")
Calculate the proportion of fights that end by various methods as a function of implied probability of fight odds.
odds_perf_by_method %>%
group_by(Is_Fav, Mid_Bin) %>%
summarise(Total_Count = sum(Size_of_Bin)) -> total_count
odds_perf_by_method %>%
group_by(Is_Fav, Mid_Bin, Dummy) %>%
summarise(Count= Size_of_Bin) -> single_count
method_count_by_odds = merge(single_count, total_count)
method_count_by_odds %>%
dplyr::mutate(Method_Prop = Count / Total_Count ) -> method_count_by_odds
method_count_by_odds %>%
dplyr::filter(Is_Fav == T) %>%
ggplot(aes(x=Mid_Bin*100, y=Method_Prop*100, group = Dummy, color=Dummy))+
geom_point()+
geom_smooth(se=F)+
ylab("Probability of Method (%)")+
xlab("Adjusted Implied Probability (%)")+
ggtitle("Favorites")+
labs(color="Method")
method_count_by_odds %>%
dplyr::filter(Is_Fav == F) %>%
ggplot(aes(x=Mid_Bin*100, y=Method_Prop*100, group = Dummy, color=Dummy))+
geom_point()+
geom_smooth(se=F)+
ylab("Probability of Method (%)")+
xlab("Adjusted Implied Probability (%)")+
ggtitle("Underdogs")+
labs(color="Method")
Fighter Odds
Convert short back to long format.
df_odds_short %>%
gather(key = "Result", value = "NAME", Loser:Winner) -> df_odds_long
Identify if fighter was favortie to assign proper Implied Probability.
df_odds_long %>%
dplyr::mutate(
Was_Favorite = ifelse(
(Favorite_was_Winner & (Result == "Winner")) | (!Favorite_was_Winner & (Result == "Loser"))
, T
, F
)
) -> df_odds_long
summary(df_odds_long[, "Was_Favorite"])
## Mode FALSE TRUE
## logical 2941 2941
Identify Implied Probability of each fighter.
df_odds_long %>%
dplyr::mutate(
Implied_Probability = ifelse(
Was_Favorite
, Favorite_Probability
, Underdog_Probability
)
, Adjusted_Implied_Probability = ifelse(
Was_Favorite
, Adjusted_Favorite_Probability
, Adjusted_Underdog_Probability
)
) -> df_odds_long
summary(df_odds_long[,c("Implied_Probability", "Adjusted_Implied_Probability")])
## Implied_Probability Adjusted_Implied_Probability
## Min. :0.07117 Min. :0.0673
## 1st Qu.:0.35971 1st Qu.:0.3593
## Median :0.50000 Median :0.5000
## Mean :0.50223 Mean :0.5000
## 3rd Qu.:0.64103 3rd Qu.:0.6407
## Max. :0.94340 Max. :0.9327
Get rid of useless columns.
df_odds_long %>% dplyr::select(
c(
NAME
, Event
, Date
, Result
, Implied_Probability
, Adjusted_Implied_Probability
)
) -> df_odds_long
Summarize data.
summary(df_odds_long)
## NAME Event
## Length:5882 UFC Fight Night: Chiesa vs. Magny : 28
## Class :character UFC Fight Night: Poirier vs. Gaethje: 28
## Mode :character UFC Fight Night: Whittaker vs. Till : 28
## UFC 190: Rousey vs Correia : 26
## UFC 193: Rousey vs Holm : 26
## UFC 210: Cormier vs. Johnson 2 : 26
## (Other) :5720
## Date Result Implied_Probability
## Min. :2013-04-27 Length:5882 Min. :0.07117
## 1st Qu.:2015-08-23 Class :character 1st Qu.:0.35971
## Median :2017-05-13 Mode :character Median :0.50000
## Mean :2017-06-17 Mean :0.50223
## 3rd Qu.:2019-04-20 3rd Qu.:0.64103
## Max. :2021-02-06 Max. :0.94340
##
## Adjusted_Implied_Probability
## Min. :0.0673
## 1st Qu.:0.3593
## Median :0.5000
## Mean :0.5000
## 3rd Qu.:0.6407
## Max. :0.9327
##
Add Win and Log Odds columns.
df_odds_long %>%
dplyr::mutate(
Won = ifelse(Result == "Winner", T, F)
, Logit_Prob = qlogis(Implied_Probability)
, Adjusted_Logit_Prob = qlogis(Adjusted_Implied_Probability)
) -> df_odds_long
summary(df_odds_long[, c("Won", "Logit_Prob", "Adjusted_Logit_Prob")])
## Won Logit_Prob Adjusted_Logit_Prob
## Mode :logical Min. :-2.56879 Min. :-2.6289
## FALSE:2941 1st Qu.:-0.57661 1st Qu.:-0.5786
## TRUE :2941 Median : 0.00000 Median : 0.0000
## Mean : 0.01186 Mean : 0.0000
## 3rd Qu.: 0.57982 3rd Qu.: 0.5786
## Max. : 2.81341 Max. : 2.6289
Get performance and odds for each fighter using Adjusted Implied Probability.
df_odds_long %>%
dplyr::group_by(NAME) %>%
dplyr::summarise(
Exp_Prop = mean(Adjusted_Implied_Probability)
, Logit_Exp_Prop = mean(Adjusted_Logit_Prob)
, Win_Prop = mean(Won)
, N_Fights = length(Won)
, Over_Performance = Win_Prop - Exp_Prop
, Logit_Over = qlogis(Win_Prop) - Logit_Exp_Prop
, Back_Trans_Exp = plogis(Logit_Exp_Prop)
) -> df_odds_long_fighters
Look at which fights were included in the dataset for a specific fighter.
df_odds_long %>%
dplyr::filter(NAME == "Roxanne Modafferi") -> df_roxy
kable(df_roxy)
NAME | Event | Date | Result | Implied_Probability | Adjusted_Implied_Probability | Won | Logit_Prob | Adjusted_Logit_Prob |
---|---|---|---|---|---|---|---|---|
Roxanne Modafferi | UFC Fight Night: Dos Anjos vs. Edwards | 2019-07-20 | Loser | 0.4385965 | 0.4399686 | FALSE | -0.2468601 | -0.2412893 |
Roxanne Modafferi | UFC Fight Night: Blaydes vs. Volkov | 2020-06-20 | Loser | 0.4651163 | 0.4694002 | FALSE | -0.1397619 | -0.1225522 |
Roxanne Modafferi | The Ultimate Fighter: Team Rousey vs. Team Tate Finale | 2013-11-30 | Loser | 0.1937984 | 0.2000738 | FALSE | -1.4255151 | -1.3858330 |
Roxanne Modafferi | UFC Fight Night: Chiesa vs. Magny | 2021-01-20 | Loser | 0.2777778 | 0.2657546 | FALSE | -0.9555114 | -1.0162702 |
Roxanne Modafferi | UFC 230: Cormier vs. Lewis | 2018-11-03 | Loser | 0.1724138 | 0.1695402 | FALSE | -1.5686159 | -1.5888892 |
Roxanne Modafferi | UFC Fight Night: Waterson vs. Hill | 2020-09-12 | Winner | 0.2777778 | 0.2657546 | TRUE | -0.9555114 | -1.0162702 |
Roxanne Modafferi | UFC Fight Night: Overeem vs. Oleinik | 2019-04-20 | Winner | 0.2666667 | 0.2683698 | TRUE | -1.0116009 | -1.0029092 |
Roxanne Modafferi | UFC 246: McGregor vs. Cowboy | 2020-01-18 | Winner | 0.1246883 | 0.1159156 | TRUE | -1.9487632 | -2.0316905 |
Top 10 over-performers with at least 5 fights where number of fights is simply number available in the dataset (see above).
df_odds_long_fighters %>%
dplyr::filter(N_Fights >= 5) %>%
dplyr::arrange(desc(Over_Performance)) %>%
head(10) -> df_top_over_perform
# now with logit
df_odds_long_fighters %>%
dplyr::filter(N_Fights >= 5) %>%
dplyr::arrange(desc(Logit_Over)) %>%
head(10) -> df_top_over_perform_logit
kable(df_top_over_perform, caption = "Top 10 Over Performers with at least 5 Fights")
NAME | Exp_Prop | Logit_Exp_Prop | Win_Prop | N_Fights | Over_Performance | Logit_Over | Back_Trans_Exp |
---|---|---|---|---|---|---|---|
Leonardo Santos | 0.4454486 | -0.2777403 | 1.0000000 | 5 | 0.5545514 | Inf | 0.4310079 |
Robert Whittaker | 0.4996490 | 0.0065223 | 1.0000000 | 10 | 0.5003510 | Inf | 0.5016306 |
Brandon Moreno | 0.4399010 | -0.2686787 | 0.8571429 | 7 | 0.4172418 | 2.060438 | 0.4332315 |
Arnold Allen | 0.5867006 | 0.3757653 | 1.0000000 | 6 | 0.4132994 | Inf | 0.5928513 |
Brian Ortega | 0.4823820 | -0.0684020 | 0.8750000 | 8 | 0.3926180 | 2.014312 | 0.4829062 |
Alexander Volkanovski | 0.6101305 | 0.5177296 | 1.0000000 | 8 | 0.3898695 | Inf | 0.6266167 |
Bryan Caraway | 0.4194964 | -0.3415057 | 0.8000000 | 5 | 0.3805036 | 1.727800 | 0.4154438 |
Yan Xiaonan | 0.6240270 | 0.5391744 | 1.0000000 | 5 | 0.3759730 | Inf | 0.6316203 |
Amanda Nunes | 0.5507052 | 0.2811614 | 0.9166667 | 12 | 0.3659615 | 2.116734 | 0.5698309 |
Joaquim Silva | 0.4575904 | -0.1889315 | 0.8000000 | 5 | 0.3424096 | 1.575226 | 0.4529071 |
kable(df_top_over_perform_logit, caption = "Logit Scale: Top 10 Over Performers with at least 5 Fights")
NAME | Exp_Prop | Logit_Exp_Prop | Win_Prop | N_Fights | Over_Performance | Logit_Over | Back_Trans_Exp |
---|---|---|---|---|---|---|---|
Alexander Volkanovski | 0.6101305 | 0.5177296 | 1 | 8 | 0.3898695 | Inf | 0.6266167 |
Arnold Allen | 0.5867006 | 0.3757653 | 1 | 6 | 0.4132994 | Inf | 0.5928513 |
Demetrious Johnson | 0.8609483 | 1.8803058 | 1 | 9 | 0.1390517 | Inf | 0.8676462 |
Israel Adesanya | 0.7002859 | 0.8678323 | 1 | 7 | 0.2997141 | Inf | 0.7042944 |
Jon Jones | 0.7892464 | 1.3952860 | 1 | 7 | 0.2107536 | Inf | 0.8014348 |
Kamaru Usman | 0.6925314 | 0.8901129 | 1 | 10 | 0.3074686 | Inf | 0.7089135 |
Khabib Nurmagomedov | 0.7598282 | 1.2002959 | 1 | 9 | 0.2401718 | Inf | 0.7685774 |
Kyung Ho Kang | 0.6633446 | 0.7104716 | 1 | 6 | 0.3366554 | Inf | 0.6705054 |
Leonardo Santos | 0.4454486 | -0.2777403 | 1 | 5 | 0.5545514 | Inf | 0.4310079 |
Petr Yan | 0.8144451 | 1.5256823 | 1 | 5 | 0.1855549 | Inf | 0.8213737 |
Top 10 under performers with at least 5 fights.
df_odds_long_fighters %>%
dplyr::filter(N_Fights >= 5) %>%
dplyr::arrange(Over_Performance) %>%
head(10) -> df_top_under_perform
# with logit
df_odds_long_fighters %>%
dplyr::filter(N_Fights >= 5) %>%
dplyr::arrange(Logit_Over) %>%
head(10) -> df_top_under_perform_logit
kable(df_top_under_perform, caption = "Top 10 Under Performers with at least 5 Fights")
NAME | Exp_Prop | Logit_Exp_Prop | Win_Prop | N_Fights | Over_Performance | Logit_Over | Back_Trans_Exp |
---|---|---|---|---|---|---|---|
Kailin Curran | 0.5404624 | 0.1811195 | 0.1428571 | 7 | -0.3976052 | -1.972879 | 0.5451565 |
Joshua Burkman | 0.3760531 | -0.5400292 | 0.0000000 | 7 | -0.3760531 | -Inf | 0.3681808 |
Hyun Gyu Lim | 0.5720479 | 0.3587458 | 0.2000000 | 5 | -0.3720479 | -1.745040 | 0.5887368 |
Alexander Gustafsson | 0.6271431 | 0.5898086 | 0.2857143 | 7 | -0.3414288 | -1.506099 | 0.6433212 |
Gray Maynard | 0.5072171 | 0.0245074 | 0.1666667 | 6 | -0.3405504 | -1.633945 | 0.5061265 |
Junior Albini | 0.5325358 | 0.1453508 | 0.2000000 | 5 | -0.3325358 | -1.531645 | 0.5362739 |
Rashad Evans | 0.5236378 | 0.1041933 | 0.2000000 | 5 | -0.3236378 | -1.490488 | 0.5260248 |
Andrea Lee | 0.7055647 | 0.8841184 | 0.4000000 | 5 | -0.3055647 | -1.289583 | 0.7076749 |
Johny Hendricks | 0.5509110 | 0.2250002 | 0.2500000 | 8 | -0.3009110 | -1.323613 | 0.5560140 |
Anderson Silva | 0.4249640 | -0.3485824 | 0.1428571 | 7 | -0.2821068 | -1.443177 | 0.4137262 |
kable(df_top_under_perform_logit, caption ="Logit Scale: Top 10 Under Performers with at least 5 Fights" )
NAME | Exp_Prop | Logit_Exp_Prop | Win_Prop | N_Fights | Over_Performance | Logit_Over | Back_Trans_Exp |
---|---|---|---|---|---|---|---|
Joshua Burkman | 0.3760531 | -0.5400292 | 0.0000000 | 7 | -0.3760531 | -Inf | 0.3681808 |
Kailin Curran | 0.5404624 | 0.1811195 | 0.1428571 | 7 | -0.3976052 | -1.972879 | 0.5451565 |
Hyun Gyu Lim | 0.5720479 | 0.3587458 | 0.2000000 | 5 | -0.3720479 | -1.745040 | 0.5887368 |
Gray Maynard | 0.5072171 | 0.0245074 | 0.1666667 | 6 | -0.3405504 | -1.633945 | 0.5061265 |
Junior Albini | 0.5325358 | 0.1453508 | 0.2000000 | 5 | -0.3325358 | -1.531645 | 0.5362739 |
Alexander Gustafsson | 0.6271431 | 0.5898086 | 0.2857143 | 7 | -0.3414288 | -1.506099 | 0.6433212 |
Rashad Evans | 0.5236378 | 0.1041933 | 0.2000000 | 5 | -0.3236378 | -1.490488 | 0.5260248 |
Anderson Silva | 0.4249640 | -0.3485824 | 0.1428571 | 7 | -0.2821068 | -1.443177 | 0.4137262 |
Ronda Rousey | 0.8322407 | 1.8077087 | 0.6000000 | 5 | -0.2322407 | -1.402244 | 0.8590847 |
Brad Pickett | 0.3826965 | -0.5461462 | 0.1250000 | 8 | -0.2576965 | -1.399764 | 0.3667590 |
Most favored fighters with at least 5 fights
df_odds_long_fighters %>%
dplyr::filter(N_Fights >= 5) %>%
dplyr::arrange(desc(Exp_Prop)) %>%
head(10) -> df_most_fav
# with logit
df_odds_long_fighters %>%
dplyr::filter(N_Fights >= 5) %>%
dplyr::arrange(desc(Logit_Exp_Prop)) %>%
head(10) -> df_most_fav_logit
kable(df_most_fav)
NAME | Exp_Prop | Logit_Exp_Prop | Win_Prop | N_Fights | Over_Performance | Logit_Over | Back_Trans_Exp |
---|---|---|---|---|---|---|---|
Demetrious Johnson | 0.8609483 | 1.880306 | 1.0000000 | 9 | 0.1390517 | Inf | 0.8676462 |
Ronda Rousey | 0.8322407 | 1.807709 | 0.6000000 | 5 | -0.2322407 | -1.4022436 | 0.8590847 |
Cristiane Justino | 0.8252814 | 1.703941 | 0.8571429 | 7 | 0.0318615 | 0.0878185 | 0.8460487 |
Petr Yan | 0.8144451 | 1.525682 | 1.0000000 | 5 | 0.1855549 | Inf | 0.8213737 |
Zabit Magomedsharipov | 0.8050291 | 1.485833 | 1.0000000 | 6 | 0.1949709 | Inf | 0.8154520 |
Tatiana Suarez | 0.7972730 | 1.391506 | 1.0000000 | 5 | 0.2027270 | Inf | 0.8008325 |
Jon Jones | 0.7892464 | 1.395286 | 1.0000000 | 7 | 0.2107536 | Inf | 0.8014348 |
Magomed Ankalaev | 0.7647264 | 1.211622 | 0.8000000 | 5 | 0.0352736 | 0.1746719 | 0.7705859 |
Khabib Nurmagomedov | 0.7598282 | 1.200296 | 1.0000000 | 9 | 0.2401718 | Inf | 0.7685774 |
Mairbek Taisumov | 0.7342042 | 1.072182 | 0.7777778 | 9 | 0.0435736 | 0.1805805 | 0.7450117 |
kable(df_most_fav_logit)
NAME | Exp_Prop | Logit_Exp_Prop | Win_Prop | N_Fights | Over_Performance | Logit_Over | Back_Trans_Exp |
---|---|---|---|---|---|---|---|
Demetrious Johnson | 0.8609483 | 1.880306 | 1.0000000 | 9 | 0.1390517 | Inf | 0.8676462 |
Ronda Rousey | 0.8322407 | 1.807709 | 0.6000000 | 5 | -0.2322407 | -1.4022436 | 0.8590847 |
Cristiane Justino | 0.8252814 | 1.703941 | 0.8571429 | 7 | 0.0318615 | 0.0878185 | 0.8460487 |
Petr Yan | 0.8144451 | 1.525682 | 1.0000000 | 5 | 0.1855549 | Inf | 0.8213737 |
Zabit Magomedsharipov | 0.8050291 | 1.485833 | 1.0000000 | 6 | 0.1949709 | Inf | 0.8154520 |
Jon Jones | 0.7892464 | 1.395286 | 1.0000000 | 7 | 0.2107536 | Inf | 0.8014348 |
Tatiana Suarez | 0.7972730 | 1.391506 | 1.0000000 | 5 | 0.2027270 | Inf | 0.8008325 |
Magomed Ankalaev | 0.7647264 | 1.211622 | 0.8000000 | 5 | 0.0352736 | 0.1746719 | 0.7705859 |
Khabib Nurmagomedov | 0.7598282 | 1.200296 | 1.0000000 | 9 | 0.2401718 | Inf | 0.7685774 |
Mairbek Taisumov | 0.7342042 | 1.072182 | 0.7777778 | 9 | 0.0435736 | 0.1805805 | 0.7450117 |
Least favored fighters with at least 5 fights.
df_odds_long_fighters %>%
dplyr::filter(N_Fights >= 5) %>%
dplyr::arrange(Exp_Prop) %>%
head(10) -> df_least_fav
# with logit
df_odds_long_fighters %>%
dplyr::filter(N_Fights >= 5) %>%
dplyr::arrange(Logit_Exp_Prop) %>%
head(10) -> df_least_fav_logit
kable(df_least_fav, caption = "Top 10 Least Favored Fighters with at least 5 Fights")
NAME | Exp_Prop | Logit_Exp_Prop | Win_Prop | N_Fights | Over_Performance | Logit_Over | Back_Trans_Exp |
---|---|---|---|---|---|---|---|
Roxanne Modafferi | 0.2743472 | -1.0507130 | 0.3750000 | 8 | 0.1006528 | 0.5398874 | 0.2590882 |
Daniel Kelly | 0.2769185 | -0.9737988 | 0.6000000 | 10 | 0.3230815 | 1.3792639 | 0.2741240 |
Jessica Aguilar | 0.2859562 | -0.9707245 | 0.2000000 | 5 | -0.0859562 | -0.4155698 | 0.2747361 |
Dan Henderson | 0.2887631 | -0.9309147 | 0.5000000 | 6 | 0.2112369 | 0.9309147 | 0.2827392 |
Thibault Gouti | 0.2982523 | -0.9293738 | 0.1666667 | 6 | -0.1315857 | -0.6800641 | 0.2830518 |
Anthony Perosh | 0.2985353 | -0.9384366 | 0.4000000 | 5 | 0.1014647 | 0.5329715 | 0.2812162 |
Leslie Smith | 0.3018944 | -0.9783537 | 0.4000000 | 5 | 0.0981056 | 0.5728886 | 0.2732186 |
Garreth McLellan | 0.3067211 | -0.8343938 | 0.2000000 | 5 | -0.1067211 | -0.5519005 | 0.3027168 |
Yaotzin Meza | 0.3076578 | -0.8634526 | 0.4000000 | 5 | 0.0923422 | 0.4579875 | 0.2966185 |
Takanori Gomi | 0.3093210 | -0.8732262 | 0.2000000 | 5 | -0.1093210 | -0.5130682 | 0.2945834 |
kable(df_least_fav_logit, caption = "Logit Scale: Top 10 Least Favored Fighters with at least 5 Fights")
NAME | Exp_Prop | Logit_Exp_Prop | Win_Prop | N_Fights | Over_Performance | Logit_Over | Back_Trans_Exp |
---|---|---|---|---|---|---|---|
Roxanne Modafferi | 0.2743472 | -1.0507130 | 0.3750000 | 8 | 0.1006528 | 0.5398874 | 0.2590882 |
Leslie Smith | 0.3018944 | -0.9783537 | 0.4000000 | 5 | 0.0981056 | 0.5728886 | 0.2732186 |
Daniel Kelly | 0.2769185 | -0.9737988 | 0.6000000 | 10 | 0.3230815 | 1.3792639 | 0.2741240 |
Jessica Aguilar | 0.2859562 | -0.9707245 | 0.2000000 | 5 | -0.0859562 | -0.4155698 | 0.2747361 |
Anthony Perosh | 0.2985353 | -0.9384366 | 0.4000000 | 5 | 0.1014647 | 0.5329715 | 0.2812162 |
Dan Henderson | 0.2887631 | -0.9309147 | 0.5000000 | 6 | 0.2112369 | 0.9309147 | 0.2827392 |
Thibault Gouti | 0.2982523 | -0.9293738 | 0.1666667 | 6 | -0.1315857 | -0.6800641 | 0.2830518 |
Takanori Gomi | 0.3093210 | -0.8732262 | 0.2000000 | 5 | -0.1093210 | -0.5130682 | 0.2945834 |
Yaotzin Meza | 0.3076578 | -0.8634526 | 0.4000000 | 5 | 0.0923422 | 0.4579875 | 0.2966185 |
Julian Erosa | 0.3202609 | -0.8581763 | 0.2000000 | 5 | -0.1202609 | -0.5281181 | 0.2977205 |
Examine odds for specific fighters.
# Israel Adesanya
df_odds_long_fighters %>% dplyr::filter(NAME == "Israel Adesanya") -> df_Izzy
kable(df_Izzy)
NAME | Exp_Prop | Logit_Exp_Prop | Win_Prop | N_Fights | Over_Performance | Logit_Over | Back_Trans_Exp |
---|---|---|---|---|---|---|---|
Israel Adesanya | 0.7002859 | 0.8678323 | 1 | 7 | 0.2997141 | Inf | 0.7042944 |
# Anthony Smith
df_odds_long_fighters %>% dplyr::filter(NAME == "Anthony Smith") -> df_Smith
kable(df_Smith)
NAME | Exp_Prop | Logit_Exp_Prop | Win_Prop | N_Fights | Over_Performance | Logit_Over | Back_Trans_Exp |
---|---|---|---|---|---|---|---|
Anthony Smith | 0.4539811 | -0.2286408 | 0.6428571 | 14 | 0.1888761 | 0.8164275 | 0.4430875 |