Friday, July 15, 2016

Ultimate Frisbee

The concept of Fantasy Football + computer simulation + statistical analysis + a game accessible to all students = the Ultimate Frisbee Draft.

Back in my student teaching year, I went a bit off the rails with project design in my Stats class.  After a rough first semester of Stats that turned around with a successful Minute to Win It project, I decided to tie every content area into some project.  My favorite, and the longest-lasting one over seven semesters, is the Ultimate Frisbee Draft.

We lead off the unit by watching one of my favorite movies, Moneyball, as a class.  Even students who have already seen it get a chance to watch it from the perspective of better understanding sports analytics while newbies get introduced to the power of data analysis in the sports world.  This intro event sets up a class discussion around which statistics matter and how someone could use those to predict overall value in any industry.

From there, I introduce students to our sport, Ultimate Frisbee.  I very intentionally didn't choose baseball, football, or another common sport -- these have tons of existing analytical approaches already covering the web that students could simply adopt without deep thought.  In addition, I have a number of students who are not motivated by sports.  Ultimate Frisbee, however, is one of those rare activities that nerds (my robotics students LOVE Ultimate), chatters, and intense athletes can all find really engaging.  To get an intuitive sense of the game, we spend a day outside or in the gym playing short games during class.  I use the time after games to reflect on the last match and encourage students to track specific stats of their own during the next game.

Finally, we get to the data.  I wrote a computer simulation that plays games of Ultimate between virtual players and tracks all throws, catches, and drops, per player, during the game.  After each virtual player plays 30 games (a number large enough for analysis but small enough for lots of noise to creep into the data), I dump the data of all 98 players into a giant spreadsheet.  From here, teams of students start to dig into the overwhelming pile of numbers to figure out who they want to draft for the ultimate Ultimate team.
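To get a rough feel for how noisy 30 games is, here is a minimal sketch that uses the standard error of a proportion. The 50% win rate, the 80% catch rate, and the 600 catch attempts are made-up illustrative numbers, not output from the actual simulator.

```python
import math

def proportion_standard_error(p, n):
    """Standard error of an observed proportion p measured over n trials."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical player: wins about half of 30 games.
se_wins = proportion_standard_error(0.5, 30)

# Same hypothetical player: catches about 80% of 600 passes (assumed count).
se_catches = proportion_standard_error(0.8, 600)

print(f"Win rate uncertainty over 30 games:     +/- {se_wins:.3f}")
print(f"Catch rate uncertainty over 600 passes: +/- {se_catches:.3f}")
```

Under these assumptions the win rate is uncertain by roughly nine percentage points, while the catch rate is uncertain by under two, which is why per-throw stats are easier to trust than a win count.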

Once students have the hook and a chance to explore, I introduce the statistical tools that will become their friends.  Depending on when I teach it, I like to start with a search for the stats that actually matter, using a few rounds of multiple regression with the most important team stat, winning, against nearly everything else.  Since there are only 30 games, picking a player based on their number of wins has far too much noise to be reliable.  However, if you find that something like short catch percentage or the number of successful long throws is highly correlated with winning across all players, you can use those much larger counts and sort by categories with less noise.
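As a simpler stand-in for multiple regression, students can start by checking how strongly each stat correlates with wins on its own. The sketch below does that with a hand-rolled Pearson correlation. The column names and the six rows of numbers are hypothetical, invented for illustration rather than taken from the real spreadsheet.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up numbers for six hypothetical players (not real simulator output).
players = {
    "wins":            [12, 15, 18, 20, 22, 25],
    "short_catch_pct": [0.70, 0.74, 0.80, 0.83, 0.88, 0.91],
    "jersey_number":   [7, 42, 3, 19, 88, 11],
}

def rank_stats_by_correlation(table, target="wins"):
    """Return (stat, r) pairs sorted by |r| with the target, strongest first."""
    results = [
        (name, pearson_r(values, table[target]))
        for name, values in table.items()
        if name != target
    ]
    return sorted(results, key=lambda pair: abs(pair[1]), reverse=True)

ranked = rank_stats_by_correlation(players)
for name, r in ranked:
    print(f"{name}: r = {r:.2f}")
```

A stat that floats to the top of this list is a candidate for the regression. One that sits near zero, like the jersey number here, can be dropped early.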

We also talk about tools for sorting individuals.  You could use the raw stat, a rank order, a percentile, or a z-score.  Each has its own benefits.  When students want to combine two stats into one "power stat", I usually recommend the use of the z-score since it makes both values unitless while retaining the magnitude relative to the mean.  Since there is no obvious right way to do all of this, it leads to fantastic student questions and discussions.
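The z-score "power stat" idea can be sketched in a few lines. The player names and numbers below are invented for illustration, and equal weighting of the two stats is just one assumption; students could reasonably weight them differently.

```python
import statistics

def z_scores(values):
    """Convert raw values to z-scores using the population standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def power_ranking(names, stat_a, stat_b):
    """Add the z-scores of two stats per player and sort best-first."""
    totals = [a + b for a, b in zip(z_scores(stat_a), z_scores(stat_b))]
    return sorted(zip(names, totals), key=lambda pair: pair[1], reverse=True)

# Hypothetical players and stats (not from the real spreadsheet).
names = ["Ana", "Ben", "Cy", "Dee"]
short_catch_pct = [0.90, 0.80, 0.70, 0.80]   # a proportion
long_throws = [50, 60, 20, 30]               # a raw count

ranking = power_ranking(names, short_catch_pct, long_throws)
for name, total in ranking:
    print(f"{name}: {total:+.2f}")
```

Because both stats are converted to "standard deviations from the mean" before being added, a percentage and a raw count can share one scale without either dominating just because of its units.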

Some groups of students take my recommended approach of finding key stats, getting z-scores, adding the totals, and sorting.  Others modify this to break players into categories / positions, choosing a strong thrower or two with a supporting cast of receivers.  Others put their money on defense, looking for players who effectively deny the Frisbee on the logic that it ends drives quickly and gives them a short field for offense.

On the practice draft day, student teams are assigned a random draft order.  From there, they have 30 seconds to pick an unchosen player when they are up.  I only do 2-4 rounds in the practice draft and then assign the remainder of the slots on their teams with equivalent players.  Once teams are selected, I run their teams back through the original simulator to have them square off against one another.  The simulator will actually play through a full game, simulating one throw at a time based on probabilities from the throwing player, their defender, the match-ups of the receivers, the selected receiver to get the pass, and whether or not the pass was caught based on ability, defense, and other factors.  Fortunately, computers are crazy fast, so this takes fractions of a second.  The simulator, looking old-school fancy with its command-line spewing of text, prints out the records of each team as they play every other team 21 separate times, each up to 15 touchdowns.  It then enters into a "tournament mode" where teams go into a bracket (based on their "regular season" rank) and face off in a single match.  The 21 games per competitor series gives useful feedback on which teams are statistically solid, while the playoffs are a chance for even the worst of underdogs to eke out an occasional win.  This is analogous to many real sports, and it leads to more interesting discussions than a lesson on the "Law of Large Numbers" could ever do.
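The gap between a 21-game series and a single playoff game can be illustrated with a much cruder model than the real simulator. The sketch below is an assumption-heavy simplification: it collapses all the throw-by-throw detail into one hypothetical number, the chance that the stronger team wins any given point (0.55 here), and plays games to 15.

```python
import random

def play_game(p_point, rng, target=15):
    """Return True if the favorite reaches `target` points first.

    p_point is the (assumed) chance the favorite wins any single point.
    """
    favorite, underdog = 0, 0
    while favorite < target and underdog < target:
        if rng.random() < p_point:
            favorite += 1
        else:
            underdog += 1
    return favorite == target

def underdog_wins_series(p_point, rng, games=21):
    """Return True if the underdog wins more than half of `games` games."""
    underdog_wins = sum(not play_game(p_point, rng) for _ in range(games))
    return underdog_wins > games // 2

def upset_rates(p_point=0.55, trials=2000, seed=1):
    """Estimate how often the underdog wins one game vs. a 21-game series."""
    rng = random.Random(seed)
    single = sum(not play_game(p_point, rng) for _ in range(trials)) / trials
    series = sum(underdog_wins_series(p_point, rng) for _ in range(trials)) / trials
    return single, series

single_rate, series_rate = upset_rates()
print(f"Underdog wins a single playoff game: {single_rate:.1%}")
print(f"Underdog wins a 21-game series:      {series_rate:.1%}")
```

Even with a small per-point edge, the underdog's chances shrink sharply as the number of games grows, which is the Law of Large Numbers showing up in the standings.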

All of this is just a practice round -- I do this to help students become familiar with the format.  However, it also has the nice side effect of motivating teams with awful strategies and analysis to pull things together.  I have seen a number of teams make a strong turn-around based on lessons learned in the practice round.  When students share out their strategies, I have them focus on process, not their "secret sauce", since they tend to get pretty competitive.

By the final game day, teams are ready with spreadsheets, have a process to cross out the selected players as they get drafted, and get pretty audibly upset when someone takes their star player right in front of them.  Because teams come in much more organized, the draft usually takes only 5-10 minutes for all 7 rounds to complete.  After running through the regular season, I hand out awards/candy to the top teams, but then save the ultimate treat for the playoffs.  I click through just one game at a time here, encouraging teams to cheer and show their pride before revealing their fate.  In the end, we all have a lot of fun, including many students who usually don't get into sports or anything that smells like fantasy football.

Altogether, I spend 6-7 days in a block class with this unit.  It is long, but it is always memorable for students, creates fantastic student questions, and often inspires interesting projects when I provide an open-ended opportunity later in the course.

See videos and spreadsheet links for students to reference / download.

How to get up and running with the simulation:

  • Download and install Python 2.7 for Windows or Mac.  This is the environment that runs the code.  It is a simple "click next" type of installer.  You may want to reboot at the end if things are not running properly in later steps.
  • After installation, download this zip file and unzip it somewhere on your computer.
  • Double click "" to run the drafting program -- this is where you will set up a class, name the teams, and input the results of the 7-round draft.  Whatever name you give your "period" will be the same name you use later when running the simulator.
  • Double click "" to run the season + tournament simulator.  Whenever the screen pauses, just press enter to make the computer do the next thing.  This can be run as many times as you like using the same draft data.
  • Tweet at me, email me, or comment below if things don't work the way they should.  I would love to help you get this running!  Thanks to @stoodle for encouraging me to finally write this up after years of procrastinating!

Friday, July 8, 2016

GCD Update: Feedback and Research

Since my last post, I have been heavily researching and designing the new Grand Challenge Design course with the help of a huge support network. Here is my most succinct summary of the need and solution that make up the course:

Students are entering a world with a variety of Grand Challenges, problems that are interdisciplinary, interconnected, and extremely complex. They span topic areas like the environment, healthcare, security, and urban infrastructure. A growing technology trend, developing "smart" devices that are being connected to the "internet of things", and then developing software platforms that make sense of all of the data that is generated, offers a powerful starting point for rethinking many of the most intractable problems. Grand Challenge Design was proposed as a course that gives students the tools and skills to create smart devices and exposure to the Grand Challenges where they can be applied for the greatest benefit to humanity.
To facilitate students immersing into the problem areas, the course will run as a year-long simulated world. Picture a 5'x13' wooden table with 4" squares marking out territories and waterways. Students will own territories and manage a growing society of virtual citizens. They will start with simple farms (fast-growing plants in solo cups), harvesting enough to feed their people and selling off the excess to grow their cash supplies. From there, they will advance to create powered city plots, running actual data lines from their Raspberry Pi and Arduino under the table to LEDs on their plots. As the societies advance, students will improve their farms with automated irrigation and moisture sensors. They will create factories with working actuators and engage in commerce with overseas markets to sell their products. While they build out these basic game elements, I will introduce new challenges such as polluted water supplies, disease outbreaks, unstable market prices, rising populations, and other twists that can only be addressed through effective student cooperation and the design of continuously smarter cities. As students all advance in skills, the challenges will become increasingly realistic with the help of community experts coming to class to pose the next round of problems and test out potential solutions. Even for simple things, like getting a loan to build an automated factory, I will require students to create a plan and pitch it to real investors before handing out a dime from my virtual bank.
Students will live through dozens of problems in healthcare, security, infrastructure, and the environment. They will experience building water treatment devices, factory assembly lines, disease tracking apps, automated irrigation and lighting systems for farms, and other smart systems that take real-world actions based on processed sensor data. They will interact with adults working on similar problems to see how their new skills can provide real value immediately, and through consulting challenges mid-game, actually build devices for real-world use. With the help of this course and the network of amazing people supporting it, we will build the next wave of Grand Challenge Designers.

The most exciting part of the design process thus far has been engaging with the feedback of those willing to reach out, particularly those who were critical of various concepts proposed in the last post. Discussing the challenges and refining the ideas is an ongoing process. This process ramps up next week with a few more in-person planning sessions. Things will start to get finalized in early August when I have a couple of design sessions with students enrolled in the course. More than anyone else, I want students to be on board with the plan before it gets etched in stone.

I also wanted to document some of the research I have been poring over recently. Looking back through my search history over the past week reveals two major themes: world problems and sensors.

While researching the problem areas, I was honestly concerned I would get added to the FBI's watchlist: what teacher looks up details around urban infrastructure vulnerabilities? I read about bridge collapses, problems in the aging electrical distribution system, threats from terrorism, threats from hackers, water treatment systems, and other similarly riveting reads. I broadened the problem space by looking at the 14 Engineering Grand Challenges (where the course got its name) and past topics from the Future Problem Solving Program (topics ranging from "nutrition" to "space junk" to "virtual corporations"). I also looked at related solution spaces such as the smart grid, targeted irrigation (and how to build an in-home version), and simulating disease vectors through the cooperative game "Pandemic". As I tried to imagine what it would look like in a game setting, I watched an hour of Civilization 5 game play on YouTube.

With the sensors, I spent most of my time with my jaw on the floor at the insanely low cost of anything made and shipped from China via eBay. Arduinos are known for being that dirt cheap electronics prototyping platform at under $30/board. Given the open-source hardware design, they are also open to knock-offs. Somehow, it is possible to make them, list them on eBay, and ship them to the US from Hong Kong for $3.60/board. WHAT?! Given that it takes over a month for them to get here, I went on a shopping spree for these, moisture and current sensors, water pumps, relays, and other key components for our projects at insanely low prices. I found sample code with a number of the sensors that interfaces simply with the Arduino. Based on this research, most of our devices will be a two-step solution connecting a Raspberry Pi 3 to an Arduino to the sensor / actuator. This allows us to have a full computer gathering data and running multiple processes while also having a dedicated device running simple, continuous loops with existing software.

There are a lot of logistics to work through as we turn this concept into a working, playable simulation with its huge physical footprint and its digital infrastructure. Even if it is rocky, the learning that I have along the way will better prepare me with the tech and problem area knowledge that will be essential to be an awesome facilitator of this course. And it will make for a fun summer!