The Anatomy of Debate Land's Rankings

The OTR (3.0) Score

We’ve built a metric for ranking competitors that we colloquially refer to as the OTR score. OTR awards points based on scaled tournament attendance, strong performance (e.g. deep elimination round runs), and success against top-ranked opponents. Every system has flaws, including ours. We’ve worked hard to open-source OTR and explain our methodology. We always welcome feedback and are interested in hearing how we might make OTR more robust.

For our fellow math nerds, here’s the OTR formula:

OTR_{comp}=(pWP*opWPM)+(\sum RxR_{comp})*(BreakBoost)*(rnpScore)

Here’s a detailed breakdown, piece by piece.

The OTR Composite Formula (Tournament x Tournament)

OTR composite

OTR_{comp}

This is the OTR composite score on a tournament-by-tournament basis. Local tournaments earn fewer points than national circuit tournaments. Smaller tournaments earn fewer points than larger tournaments. This formula doesn't change by circuit. A higher score is better; in other words, a higher OTR score means a competitor performed better over the time being measured.

Prelim Win Percentage

Prelim win percentage (“pWP”) is calculated as

\frac{numPrelimWins}{numPrelimRounds}

pWP is derived by dividing the number of preliminary round wins by the total number of preliminary rounds in a tournament. It rewards teams that win more rounds in prelims. This number is always a decimal ranging from 0 to 1.00, where 1 means a competitor has won 100% of their prelim debates and 0 means a competitor has won no prelim rounds.

One implication of pWP is that competitors aren’t rewarded for debating top-ranked competitors and losing, even if those debates were competitive and close. We think this is robust because being randomly assigned a high-OTR opponent is common in rounds that are not power-matched.

Opponent Win Percentage Metric

opWPM

Opponent win percentage metric is the average of

{\frac{opNumPrelimWins}{numPrelimRounds}}

This metric is often generated by Tabroom. It represents the average prelim win percentage of a team's opponents: each opponent’s prelim wins are divided by the number of prelim rounds, and the results are averaged across all prelim opponents. More than anything, this indicates which prelim bracket a team ended in. All teams in the 3-win bracket will have a similar opWPM (it can differ slightly because of pre-matched rounds). This metric ranges from 0 to 1.00.

Take two teams, Team Yellow and Team Red. Team Yellow faced two one-win teams in their pre-matched rounds. Team Red faced two five-win teams in their pre-matched rounds. In this scenario, Team Yellow will receive a lower opWPM score than Team Red.
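As a rough sketch of the two prelim metrics above, the Python below computes pWP and opWPM for the Team Yellow and Team Red scenario. The function names, the six-round prelim schedule, and the win counts of the four remaining opponents are assumptions made for illustration; none of them come from Debate Land's own code.

```python
def prelim_win_pct(num_prelim_wins: int, num_prelim_rounds: int) -> float:
    """pWP: share of prelim rounds won, from 0.0 to 1.0."""
    if num_prelim_rounds <= 0:
        raise ValueError("num_prelim_rounds must be positive")
    return num_prelim_wins / num_prelim_rounds


def opp_win_pct_metric(opp_prelim_wins: list[int], num_prelim_rounds: int) -> float:
    """opWPM: average of each opponent's prelim win percentage."""
    if not opp_prelim_wins:
        raise ValueError("at least one opponent is required")
    pcts = [prelim_win_pct(wins, num_prelim_rounds) for wins in opp_prelim_wins]
    return sum(pcts) / len(pcts)


# Hypothetical six-round tournament (assumed): two pre-matched rounds,
# then four rounds against three-win opponents for both teams.
NUM_ROUNDS = 6
yellow_opponents = [1, 1, 3, 3, 3, 3]  # prelim wins of each opponent
red_opponents = [5, 5, 3, 3, 3, 3]

yellow_opwpm = opp_win_pct_metric(yellow_opponents, NUM_ROUNDS)
red_opwpm = opp_win_pct_metric(red_opponents, NUM_ROUNDS)
print(f"Team Yellow opWPM: {yellow_opwpm:.3f}")
print(f"Team Red opWPM:    {red_opwpm:.3f}")
```

With these assumed schedules, the only difference between the two teams is the pre-matched draw, and that alone is enough to give Team Red the higher opWPM.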

Round-by-Round Composite

RxR_{comp}

The round-by-round comp is one of the Elo-like elements of the OTR formula:

RxR_{comp}=1+(\frac{x^5}{1+\sqrt{x^9}})

where x = opPWP - pWP (the opponent's prelim win percentage minus the team's own)

Note: If x is negative, then a composite of 0 is assigned.

This is our underdog modifier. A team receives a boost when it beats a team with a relatively higher prelim win percentage. The boost is relative to the difference in prelim win percentages calculated by logarithmic regression. Losing to a team with a higher WP will not decrease the RxR score.

For example, suppose a one-win team beats a six-win team in round two. Although the one-win team’s tournament composite score will likely be low, that one great round gives them a slight boost that is reflected in their score, which was not the case in versions 1 and 2 of the OTR formula.
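The round-by-round composite can be sketched in a few lines of Python. This is a minimal illustration of the formula as written above, not Debate Land's implementation; the function name `rxr_composite` and the six-round tournament used in the example are assumptions.

```python
import math


def rxr_composite(own_pwp: float, opp_pwp: float) -> float:
    """Round-by-round composite for a single round won.

    x is the opponent's prelim win percentage minus the team's own.
    Per the note above, a negative x is assigned a composite of 0.
    """
    x = opp_pwp - own_pwp
    if x < 0:
        return 0.0
    return 1 + x**5 / (1 + math.sqrt(x**9))


# Hypothetical six-round tournament (assumed): a one-win team
# beats a team that finishes prelims with six wins.
underdog = rxr_composite(own_pwp=1 / 6, opp_pwp=6 / 6)
print(f"Underdog RxR composite: {underdog:.3f}")
```

Under this reading, the composite for a win runs from 1 (both teams have the same prelim win percentage) up to 1.5 (x = 1), so the underdog boost stays modest, consistent with the "slight boost" described above.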

Break Boost

BreakBoost

BreakBoost gives credit to teams who reach and win elim rounds. It is calculated as the number of the last elim round won (counting from the tournament's first elim round) plus one. It is one of the largest sources of a team’s OTR comp. Teams who progress furthest at any given tournament will likely receive the highest tournament composite scores.

In a tournament where the first elim round is octafinals, a team that loses in quarterfinals will receive a break boost of 2, doubling their score. A team that loses in semifinals will receive a break boost of 3, tripling their score, and so on.
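Putting the pieces together, the sketch below computes a break boost and then a tournament composite from the OTR formula exactly as it is written at the top of this page. Several things here are assumptions: the function names, the treatment of a team that wins no elim rounds (boost of 1), and the value of rnpScore, which this page does not define and which is therefore passed in as a plain parameter. Note also that, read literally, the formula applies BreakBoost to the RxR term only.

```python
def break_boost(elim_rounds_won: int) -> int:
    """Break boost: number of elim rounds won plus one.

    Assumption: a team that wins no elim rounds gets a boost of 1.
    """
    if elim_rounds_won < 0:
        raise ValueError("elim_rounds_won cannot be negative")
    return elim_rounds_won + 1


def tournament_composite(
    pwp: float,
    opwpm: float,
    rxr_scores: list[float],
    elim_rounds_won: int,
    rnp_score: float,
) -> float:
    """Literal reading of (pWP * opWPM) + (sum RxR) * BreakBoost * rnpScore."""
    rxr_term = sum(rxr_scores) * break_boost(elim_rounds_won) * rnp_score
    return pwp * opwpm + rxr_term


# Octafinals is the first elim round: losing in quarterfinals means
# one elim win, losing in semifinals means two.
print(break_boost(1))  # quarterfinalist
print(break_boost(2))  # semifinalist

# Hypothetical inputs; rnp_score=1.0 is a placeholder, not a real value.
example = tournament_composite(
    pwp=0.5, opwpm=0.5, rxr_scores=[1.0, 1.0], elim_rounds_won=1, rnp_score=1.0
)
print(example)
```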

Final OTR Score Calculation

\frac{(\sum OTR_{comp}) * avgNumTournamentsAttended}{tournamentsAttended}

The OTR composites are summed together and multiplied by a deflator (a statistic that decreases the overall OTR score as a function of the number of tournaments attended). Simply put, going to more tournaments doesn't induce a disproportionate increase in score. This technique was implemented to reduce the bias toward debaters at large schools or those who can afford to attend tournaments frequently.

For example, Team Yellow attends 12 tournaments with an OTR sum of 20. Team Red attends 6 tournaments with an OTR sum of 15. With the computation above (assuming the average number of tournaments attended is 4), Team Yellow (12 tournaments) is assigned a final OTR of about 6.67 (20 * 4 / 12). Team Red (6 tournaments) earns 10 (15 * 4 / 6). Thus, attending many more tournaments no longer induces disproportionately high scores.
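The deflator arithmetic in this example can be checked with a short Python sketch. The function name `final_otr` is an assumption for illustration, and the sketch omits the normalizing constants, which are not specified here.

```python
def final_otr(
    composite_sum: float,
    tournaments_attended: int,
    avg_tournaments_attended: float,
) -> float:
    """Sum of OTR composites, scaled by average attendance over own attendance."""
    if tournaments_attended <= 0:
        raise ValueError("tournaments_attended must be positive")
    return composite_sum * avg_tournaments_attended / tournaments_attended


AVG_ATTENDED = 4  # assumed average from the example above

yellow = final_otr(20, tournaments_attended=12, avg_tournaments_attended=AVG_ATTENDED)
red = final_otr(15, tournaments_attended=6, avg_tournaments_attended=AVG_ATTENDED)
print(f"Team Yellow: {yellow:.2f}")
print(f"Team Red:    {red:.2f}")
```

Team Yellow has the larger composite sum, but because it attended three times the average number of tournaments, its final score ends up below Team Red's.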

The constants are placeholders to make the number smaller, between 0 and 5. Constants will change at the beginning and end of the season to normalize. Some years are “harder” than others (more bids among different teams). This is part of our retroactive normalization procedure each year.

Why OTR is Unique

Our formula is unique because of our continuous efforts to refine and improve our platform. It is designed to account for different tournament attendance frequencies, ensuring that all debaters, regardless of how often they compete, are represented. These enhancements reflect our dedication to offering a balanced and comprehensive experience for everyone involved in competitive debate.

1. A linear deflator for the number of tournaments attended. Simply put, going to more tournaments will no longer induce a disproportionate increase in score. This deflator was implemented to reduce the bias toward debaters at large schools or those who can afford to attend tournaments frequently.

2. A round-by-round score. The V6 scraper can view round-by-round results and compile metrics based on the data. We like to call it the underdog modifier. Teams receive a boost when they beat a higher-rated team. The boost is relative to the difference in win percentages in a logarithmic regression model. This composite is calculated tournament by tournament and incorporates an Elo-like model into our holistic metric. The RxR composite was introduced to reflect teams that show significant improvement.

3. Local circuit insights. Along with additions to the formula itself, Debate Land has added local circuits to the site. This includes tournament insights, competitor data, and judge data. This allows more debaters, especially those without the funding or ability to compete on the national circuit, to view their stats.

4. Three debate types. Debate Land now includes PF, LD, and Policy rankings. The new debate types were added to allow more debaters to gain insight into their statistics.

Why Rankings?

Our rankings are merely a statistical representation of results, not skill. One should not interpret their or any other debater’s stats or ranking as an analysis of skill. Debate Land serves to democratize information on Tabroom.com and serves as a database for future studies and self-improvement. Statistics are not answers. Just a tool.

So why does Debate Land have a leaderboard? Doesn't that promote inequalities and elitism in debate?

While rankings themselves can’t directly address the deeper inequities present in debate, they also don't discriminate based on demographics. The lack of diversity among top-ranked teams highlights the existing disparities within the activity and underscores the need for continued efforts to address them. However, rankings can offer some benefits. Debaters from underrepresented groups can leverage these rankings to showcase their achievements, especially in contexts like college applications. Uniquely, Debate Land's rankings are less dependent on frequent attendance than any other model.

We've built this platform because, as former debaters, we recognized the fun of rankings in an inherently competitive activity and wished that thoughtful, open-source rankings were around when we were debating. Our overarching goal is to democratize access to information in the high school debate community and provide resources for all competitors to analyze and promote their results. We believe rankings are an important step in that direction.