weighted randomization

how randomization can improve user-experience

April 16, 2008

10 min, 1913 words

Many years ago one of my children hit a wall with Math in elementary school. He had always been bright, and quick to pick things up, so this was something of a surprise to him as well as to me. His class was in the early stages of learning multiplication, and it turned out that he had initially been able to add numbers in his head quickly enough that he hadn't needed to memorize the times-tables. This worked fine for him with lower numbers like 5x2, but began to break down pretty quickly with higher numbers like 9x7 and especially 23x76.

Flashcard app

So I ended up doing something I had sworn I'd never do: I created a flashcard program on the computer to help him memorize his basic times-tables. I had always thought that using computer programs for rote memorization was a travesty -- knowing that computers could be used for interesting and mind-expanding purposes rather than boring, mind-numbing ones. But this became a fascinating project. I started out implementing it in FileMaker Pro, which I had begun using extensively for some database projects, and eventually reimplemented the program in REALbasic, a wonderful program (at least at the time; I haven't used it in years) that introduced me to object-oriented programming, and through the sheer elegance of its interface and structure nurtured my growing sense that programming was about Art in addition to Logic.

In addition to wanting to present a nice interface, I also knew that I wanted the questions presented to my son to flow in such a way that he would be quizzed more often on the questions he answered incorrectly, and less often on the ones he answered correctly. In any given work-session, I also wanted him to be able to work with a narrow slice of the whole set of 'difficult' problems so that he would perceive some sense of gaining mastery over material.

To clarify this last point, imagine a problem set of 10,000 questions, 5,000 of which are very easy, and 5,000 of which are very hard. Imagine you sit down to a 10-minute 'memorization-session'. If the program utilizes a very simple algorithm which chooses randomly from among the 5,000 difficult questions, chances are that after the 10 minutes you will have learned nothing and will be quite discouraged: you'll likely not have seen the same question twice, and you'll have answered everything wrong. This would be bad enough if you were an adult committed to learning the material, but if you were a kid in elementary school, this would be likely to crush any flickering desire to learn the material.

And I wanted my program to be fun.

Details

My solution was to use what I call 'weighted-randomization'. Sometime I'll do a thorough search for the files I used those many years ago to see if I can find the code I created -- for posterity, amusement, and fond reflection. The basic idea consisted of two steps. First, from among all the... I guess 169 possible times-table queries (0x0 through 12x12), I chose a small subset: 10 as I recall, although eventually I made this number selectable via a preferences-pane (remember, I made this for a young kid, and assumed two 5-minute practice-sessions a day -- the latter one being optional but often completed because the app was fun). Second, I presented the queries. But of course the joy was in the details.

For the selection-step, my recollection is that I iterated through each possibility, and assigned it a 'selection-number', then sorted the list on selection-number and took the top 10. Before I describe the selection-number, I must first note that each possible query was initially assigned a 'score' of 0. When a query was answered correctly, the score was incremented; when answered incorrectly, the score was decremented. I eventually put the increment and decrement values into the app's preferences-pane to play around with their effect on the operation of the application, and as I recall usually had them set to increment correct answers by 1, and decrement incorrect answers by 2.

So why the selection-number rigamarole? Why not just sort on score, and take the 10 lowest numbers? That certainly would achieve the goal of presenting the user with the most-problematic problem-set. I did do that initially, but added the selection-number for two reasons. First, the simple presentation of only the hardest problems felt, well... boring. Second, I thought if I occasionally threw in some problems 'mostly' known, I would encourage crucial reinforcement to take place. And if one or two of the problem-set were 'easy', well, those queries' scores would rapidly increase and be less likely to be selected as part of the subset next time. Thus the selection-number. I don't remember the exact details, which I changed and experimented with extensively, but the basic idea: I took the lowest-score and the highest score, and split the range between them into, say, 10 equal segments. I then iterated through the queryset. For each score, I determined the segment it belonged to, then created a certain number of random-numbers, took the highest one, and assigned that as the selection-number. A more specific concrete example... Say the lowest score was -25 and the highest score was 5. That's a difference of 30. Splitting that into 10 segment-ranges yields segment ranges of 3 values. Thus the bottom segment-range would be -25 through -23, while the top segment would be 3 through 5. During the iteration step, if I came across -23 (in selection-range '10'), I would create '10' random numbers, and the highest one would become that query's selection-number. If I came across '3' (in selection-range '1'), I would create '1' random number; it would become that query's selection-number. Thus sorting then on selection-number and choosing the ten highest numbers would usually offer a subset of all queries consisting of mostly the hardest queries, mixed with some somewhat-hard queries, with an occasional easy query thrown in. Beautiful. Again, I don't remember the specific details, and do remember that I played frequently with the number of ranges and the number of random-numbers assigned for each range, but this description reflects the general approach to the slice-selection step.

The program was a hit, and did exactly what I had hoped; it helped my child memorize his times-tables. Shortly after I built it, I was immersed in learning Italian, and realized that with just a few tweaks this would be a terrific tool to help me memorize vocabulary. That also worked well, especially the ability to focus a review-sesion on a subset of all the words, because I built up a vocabulary set of hundreds of words. But abstracted from the tool, I kept ruminating on what made 'weighted-randomization' so compelling.

The idea

The concept binds together polar notions in a way that feels 'right'. Variety, as we've all heard, is the spice of life; morphing random-chance into likelihood offers possibilities for combining intentionality with variability to create joyful experience, as any gamer knows. The game Dungeons & Dragons wonderfully codified realms of probability in spirals of detail useful to both the casual gamer and the addict: a strong character will be more 'likely' to defeat a weaker character, but more granularly, a particular delivery of a particular blow will be more or less likely to be successful based on the quality of the attacking weapon and the defending shield, the skill of the attacker and defender, etc. The varied dice used to calculate these percentages seem themselves talismans. But on a simpler level, any player of Risk or Backgammon perceives these same issues. The roll of the dice is pure chance, but the strength of one's position makes the outcome of a battle or game dependent on the same order-and-chance qualities embodied in weighted-randomization.

Making the world a better place

I find it interesting to think about applying randomization to other systems.

LifeBalance is a wonderful little to-do list program I miss using since I gave up my Palm for an iPhone. I don't have direct knowledge of its internals, but I suspect it incorporates weighted-randomization in it's preparation of one's task-list. One part of the program offers the ability to define broad areas of life (i.e. work, maintaining friendships, family-life, etc.) and assign a rough percentage of time one would ideally like to devote to each area. Since to-do tasks end up being subsets of these areas, and since each task can be assigned a rough amount of time it will take to accomplish, the program tracks how much time is actually being expended in the broad areas, and optionally adjusts the current to-do list to maintain the preferred balance. The result is that there may be occasions in which a task of a slightly lower priority can appear higher on a to-do list than a task with a higher priority, if the lower-priority task will better-serve the less-immediate goal of achieving the life-area balance specified. This approach could conceivably be implemented using algorithms that don't utilize weighted-randomization, but this is just the kind of situation that weighted-randomization would be good for -- injecting a bit of flexibility into a system while still maintaining overall goals.

As a final thought experiment, imagine possibilities for injecting weighted-randomization into political systems.

I have a vague memory from a college class, likely jumbled by time, of hearing that in 15th Florence, a cohort of possible city-council leaders was chosen from among the populace much like those chosen for jury-duty today. As I recall, each individual in the cohort was then voted on by the populace, but only whether the individual was fit to serve: a yes or no vote. If a person received enough yes-votes, his name was put into a hat from which the new leaders were drawn at random. This unusual form of weighted-randomization suggests that when injected into political systems, it might reduce corrupting influences and by inference improve quality. Usually, and reasonably, years-of-service or merit-evaluations or test-scores are the institutional barriers to corruption. Weighted-randomization could be another tool worth exploring.

Imagine that of all bills a legislative committee debates publicly, 80% are selected for debate according to usual political processes, and 20% via a weighted-randomization scheme akin to my math flash-card program (i.e., perhaps a manual ranking of preference followed by randomizations based on those rankings). Obviously no one would want chance to factor into whether a bill actually becomes law, but since so many good ideas die for want of being scheduled for committee debate (often for questionable political reasons), injecting a bit of weighted-randomization into such a process could be a good thing, certainly worth trying.

I may add to this post other realms in which weighted-randomization might offer systemic improvements. Feel free to suggest ideas; I may post a few of them here.