moishetreats wrote: ↑Fri Jun 07, 2019 10:32 am @Dynasty DeLorean: PLEASE publicly correct me if I'm way off or off in a small way. And thank you, thank you, thank you again for sharing your thoughts AND for so fastidiously replying to all the questions, comments, and critiques.
----------
My understanding of this report is that it is looking for correlative indicators of success. By entering many points of concrete data (e.g., height, weight, draft slot, etc.) and then by determining what "success" means (e.g., 1,000 yards rushing), you can begin to identify which data points correlate with success.
I think that many people on this thread are confusing correlation with causation. Just because a player's data does or does not fit the successful profile (i.e., correlate) does NOT mean that that player will or will not succeed (i.e., causation). DD's report isn't telling you that Player X will be successful and that Player Y will not be successful. Rather, DD's report is telling you that Player X's concrete data gives him a high, middle, or low likelihood of success based on how other players with similar metrics ultimately performed.
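To make the "players with similar metrics" idea concrete, here is a rough Python sketch. It is only an illustration of the general approach, not DD's actual method: the `Back` record, the `similar_hit_rate` helper, the tolerance values, and every number in `HISTORY` are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Back:
    name: str
    weight_lb: float
    forty_s: float
    hit_1k: bool  # did the player ever post a 1,000-yard rushing season?


# Made-up illustrative records, NOT real players or real combine numbers.
HISTORY = [
    Back("A", 225, 4.45, True),
    Back("B", 228, 4.50, True),
    Back("C", 222, 4.48, False),
    Back("D", 198, 4.40, False),
    Back("E", 200, 4.62, False),
    Back("F", 230, 4.47, True),
]


def similar_hit_rate(weight_lb, forty_s, history, weight_tol=5.0, forty_tol=0.05):
    """Share of historically similar backs who reached 1,000 yards.

    "Similar" here is an assumed definition: within weight_tol pounds and
    forty_tol seconds of the prospect. Returns (rate, sample_size), or
    (None, 0) when nobody in the history is comparable.
    """
    peers = [
        b for b in history
        if abs(b.weight_lb - weight_lb) <= weight_tol
        and abs(b.forty_s - forty_s) <= forty_tol
    ]
    if not peers:
        return None, 0
    hits = sum(b.hit_1k for b in peers)
    return hits / len(peers), len(peers)


rate, n = similar_hit_rate(226, 4.47, HISTORY)
print(f"{rate:.0%} of {n} comparable backs hit 1k")
```

The output is a likelihood drawn from comparable players, which is a statement about the group and not a guarantee about any one prospect. With a sample this small the number would be very noisy, which is why more data matters.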
Note that this kind of data analysis does not answer "why" questions. Why did Player X over-perform or under-perform? Why does Data Point A correlate to predicting success but not Data Point B? How do new schemes and play-calling affect metrics? These are not the questions that DD's report will answer. He's using a data-based approach to predict which RBs profile as more or less likely to succeed.
Indeed, one strength of this model is the ability for correlative indicators to change with more data. That's a good thing!! If there is the occasional outlier, then the correlative indicators won't be affected in anything more than a minimal way. But if there are numerous outliers and/or some players that entirely break the model, then the correlative indicators for success would change. Again, that's a good thing: the correlative indicators change because there is now more data to confirm or potentially reject the previous correlative assumptions. That makes the newly updated correlative indicators MORE reliable!
For those who look at tape, schemes, coaching fit, etc. (i.e., subjective analysis), DD's report is likely not going to be your starting place or even necessarily something on which you would rely heavily. For those who look to survey methodology and data (i.e., concrete information), this is gold.
------------
@DD: Is this close, far off? Helpful, in your estimation, or just confusing people even more? My hope is the former!! And thank you again for your contributions!!
I started off trying to see if anything correlated with success. Most things didn't, but I did indeed find a few things that did. I think what I stumbled upon that maybe nobody else really thought of is that you can't make one big blanket statement. So for example (and I'm not saying I do this exactly, it's just an example), instead of saying "agility is important", you could ask: is agility as important for a bigger back as it is for a smaller back? Another example: if you have good speed, do you need good agility? What about the inverse: if you have good agility, do you need good speed? I don't think many people have asked these types of questions before, and that's why I don't believe there's been anything out there like what I do.
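For anyone who wants to see what a conditional question like "is agility as important for a bigger back as for a smaller back?" could look like in practice, here is a minimal Python sketch. It is explicitly not the model in the report. The `hit_rates_by_group` helper, the 215 lb and 7.00 s cutoffs, and all rows in `ROWS` are assumptions invented for illustration.

```python
from collections import defaultdict

# (weight_lb, three_cone_s, hit_1k): made-up rows, NOT real players.
ROWS = [
    (230, 6.90, True), (228, 7.20, True), (232, 7.25, True), (226, 6.95, False),
    (200, 6.85, True), (198, 6.80, True), (202, 7.20, False), (196, 7.30, False),
]


def hit_rates_by_group(rows, weight_cut=215, cone_cut=7.00):
    """1k-season hit rate for each (size, agility) bucket.

    The cutoffs are arbitrary assumptions: weight >= weight_cut counts as
    "big", and a three-cone time <= cone_cut counts as "agile".
    """
    counts = defaultdict(lambda: [0, 0])  # bucket -> [hits, total]
    for weight, cone, hit in rows:
        size = "big" if weight >= weight_cut else "small"
        agility = "agile" if cone <= cone_cut else "not_agile"
        bucket = counts[(size, agility)]
        bucket[0] += int(hit)
        bucket[1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}


rates = hit_rates_by_group(ROWS)
for key in sorted(rates):
    print(key, rates[key])
```

In this invented data, agility separates the small backs cleanly but tells you nothing useful about the big ones, so a blanket "agility is important" statement would hide the real pattern. The same bucketing works for the speed-versus-agility questions by swapping in a 40 time for weight.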
As for whether this would have worked in the past or will work in the future: I have my doubts that this same exact model would have worked 30-40 years ago. These days everything is much more standardized, we have a lot more data, nutrition is better, workouts and strength and conditioning are optimized, and from a team's perspective scouting and data analysis are better, with more of an emphasis on efficiency and maybe to some extent the passing game. Scouts have a lot more access to smaller-school players now than they did before. Years ago there was a bigger emphasis on the running game, and "lesser" RBs were probably getting more work then than they would now. I know I looked back at a few RBs that had long careers and their YPC was in the toilet, and I wonder how long they would have lasted in today's game. If we fast-forward into the future, let's say teams decide to go from a 50/50 or 40/60 run/pass split to a 30/70 or 20/80 run/pass split. The 1k threshold would probably be largely irrelevant. Will new rules be introduced that affect the run game? Who knows. So idk how long this will last. Does it work right now? Yeah, I'm pretty convinced it does, because it's so simple and so effective and it's been the same for a long stretch of years (15 or so, whatever I have data for). Is it possible it's just a giant fluke coincidence? Possible but unlikely imo.
Again, I really want to say that I'm not wildly changing things on a yearly basis. The lists have not really changed over the years. I would say that as I gather more data I'm able to simplify things rather than complicate or change them. I think there is a distinct difference. I'm sure there will be outliers here and there, and disappointments. Ideally a player on the studs list has 3+ 1k-yard seasons, so guys like Ryan Mathews and DMC were kind of disappointments already. I'm sure there will be more. As for the scenario you describe, where numerous outliers show up and then I change everything, I think that would sort of be impossible.
If we look at the "studs" and "semi studs" (which are essentially the main part of the report) that the model has predicted since its inception (2015), every pre-2018 player on that list, with the exception of Foreman (who had the injury, of course), has at least one 1k-yard season under his belt. I'd say that is a good indication that it's working as intended. Only time will tell if it continues on that course or not.