killer_of_giants wrote: ↑Wed Dec 11, 2019 7:12 am
PR0v3 wrote: ↑Tue Dec 10, 2019 10:17 am
The data is based on 2-point conversion rates over the history of the NFL. By this point, the sample size should be large enough to be statistically significant.
no, because the sample is not homogeneous: if you sample the heights of men and women from around the world over the past century, that sample tells you very little about the height you would expect from a caucasian male born after 2000.
regardless, that's not even the variance you need to consider. even if you assume 50% or 60% conversion rates (or whatever) with no uncertainty at all, you still have to consider the variance from repeating the same "experiment" (the 2pt conversion) a number of times (say 60 times a season if your offense is good) at those same exact expected conversion rates.
and this already assumes that the conditions of the "experiment" are the same, which they aren't, which adds to the variance.
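the point about variance from repeating the trial can be sketched with basic binomial math. the numbers here (60 attempts a season, a 50% conversion rate) are the hypothetical figures from the posts above, not measured NFL data:

```python
import math

# Hypothetical figures from the thread: 60 two-point attempts in a season,
# with an assumed true conversion rate of 50% (taken as known exactly,
# i.e. no uncertainty in the rate itself).
n, p = 60, 0.50

mean_conversions = n * p                     # expected successes over the season
sd_conversions = math.sqrt(n * p * (1 - p))  # binomial standard deviation

# Each conversion is worth 2 points, so even with the rate fixed,
# the season-long 2pt scoring still swings noticeably:
mean_points = 2 * mean_conversions
swing = 2 * (2 * sd_conversions)  # a rough 2-sigma band on points

print(f"expected conversions: {mean_conversions:.0f} +/- {sd_conversions:.2f}")
print(f"expected 2pt points: {mean_points:.0f}, typical 2-sigma swing: +/- {swing:.1f}")
```

with these toy numbers the 2-sigma band is roughly +/- 15 points around the 60-point average, purely from repeating the trial, before any uncertainty in the rate itself is added.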
jenkins.math wrote: ↑Tue Dec 10, 2019 9:55 am
Yes, PR0v3's analysis is correct.
it lacks an analysis of the uncertainties. i would expect someone with "math" in his username to pick up on that.
As someone with a math degree who specialized in probability and statistics, and who uses this on a daily basis, I'm well aware of any shortcomings it may have. If you really want to go down the rabbit hole mathematically, we can take this as far as you can handle.
The problem with what you're asking for is that it hasn't been done. No team has gone for 2 sixty times in a season, so you have to use the data that is available. That doesn't make it perfect, but it doesn't discredit it either. It would be awesome if we had that data, but we don't.

The data used was 2pt conversion percentages over the history of the game, which doesn't care about weather, dome, home/road splits, offense rank vs defense rank in that particular game, etc. The variance is essentially already baked into the sample size, which makes those individual breakdowns irrelevant for the discussion at hand. Unless you are in a lab environment where you have full control over everything in your experiment, there will always be some level of variance. Since the NFL is played all over the country (and outside the country a few times a year), officiated by humans, and played by humans, your "variance" stance basically forces you into a corner proclaiming that all data and advanced analytics in sports are completely irrelevant and false. I highly doubt you are trying to die on that hill.
The original premise was that you would have to convert at the historical league average or higher for this to be effective. Obviously I would assume that a team with a good offense that can utilize both the run and the pass (the Ravens, 49ers, and Seahawks immediately come to mind) would be much more likely to hit the historical league average than a team like the Dolphins. If you want to say his original projection of somewhere between 90 and 120 points from going for 2 every single time might have a lower floor overall, I could see that stance, given your "variance" claim. I would also say those numbers would only hold for top-half offenses (maybe only top 10, I'm not sure since I haven't gone that in depth), and that bottom-tier teams would have a much larger range of outcomes, with a much lower floor.
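For what it's worth, the "wider range of outcomes, lower floor" claim is easy to sketch with a quick Monte Carlo simulation. The conversion rates and the 60-attempt season below are made-up illustrative figures, not measured NFL splits:

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

def season_2pt_points(p, attempts=60, trials=5000):
    """Simulate many seasons of 2pt scoring for a team converting at rate p."""
    totals = []
    for _ in range(trials):
        made = sum(random.random() < p for _ in range(attempts))
        totals.append(2 * made)
    return totals

# Hypothetical conversion rates: a top offense vs a bottom-tier one.
for label, p in [("top offense", 0.55), ("bottom-tier offense", 0.40)]:
    totals = sorted(season_2pt_points(p))
    floor = totals[len(totals) // 20]  # 5th percentile as a rough "floor"
    mean = sum(totals) / len(totals)
    print(f"{label}: mean {mean:.1f} points, 5th-percentile floor {floor}")
```

The simulation just confirms the intuition: the weaker offense's whole distribution shifts down, so its bad seasons land well below the good offense's bad seasons.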
I'm well aware there are plenty of breakpoints within the data that would make for some interesting talking points, but I have neither the time nor the inclination to do all that work. If you have that much free time, feel free to get after it, because I would be interested to see the results. But just pooping on the premise because the data isn't broken down enough for you, and throwing around the term "variance", doesn't mean the data is irrelevant or the math is meaningless, as you're trying to imply.