Friday, November 30, 2012

The recent big lottery brings up randomness again. Lotteries are a very rare case, maybe the only situation in common experience, where a mathematical randomizer shows up in PURE or raw form.

The vast majority of "randomized" events in our lives are THRESHOLDED random.

Let's see if I can illustrate the difference.

This first animation shows some bars rising and falling by a pure mathematical random process. The height of each bar is decided separately at each frame of the movie. Notice that this set of motions looks fully random; you can't see any pattern in it.

Now I've added a THRESHOLD across the bars. This makes the situation far more realistic, corresponding to many random-driven events in ordinary life. From this angle it still looks unpatterned.

Now we're looking at the same THRESHOLDED situation from the top. Now we can see all sorts of patterns! At each moment we can see CLUSTERS of bars that have popped above the threshold, and we can't see the bars that are below. Most importantly, we don't see the continuously variable heights any more; we only see the DECISION. Each bar has turned into a yes-no vote.
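If you'd rather poke at this yourself than watch the animation, here is a minimal Python sketch of the top view. The function names, the 40 bars, the uniform random heights and the 0.7 threshold are all assumptions chosen for illustration; they are not the settings behind the animations.

```python
import random

# Hypothetical helper names; uniform heights in [0, 1) are an assumption.

def random_heights(n_bars, rng):
    """One frame: every bar gets its own independent random height."""
    return [rng.random() for _ in range(n_bars)]

def threshold_frame(heights, threshold):
    """Top view: keep only the yes-no vote, True where a bar is above the line."""
    return [h > threshold for h in heights]

def render(votes):
    """Draw one frame as text: '#' for a visible bar, '.' for a hidden one."""
    return "".join("#" if v else "." for v in votes)

rng = random.Random(2012)  # fixed seed so the run is repeatable
for _ in range(5):
    frame = random_heights(40, rng)
    print(render(threshold_frame(frame, 0.7)))
```

Each printed row is one frame seen from the top. The heights are thrown away and only the decision survives, which is where the apparent clumps come from.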

Everything we sense is thresholded. These bars might represent sounds coming from all sorts of things (crickets, doors, cars, dogs in your yard, dogs in China, rivers in Argentina). All of those things are in the air, but you only hear the nearest and strongest. Same with points of light, or weights on your hand, or differences in income and status between you and your neighbor. You only sense values that pop up above your internal threshold.

The most direct analogy for this image might be a field of grass seeds popping through the soil. They are driven by temperature and moisture, so they will tend to sprout within a limited range of time; but each one has a unique micro-climate depending on shadows, bacteria, earthworms, etc.

Another prime example: Cancer clusters. Each bar corresponds to one person, with a varying number of cancerous cells. Everyone has some cancerous cells all the time, but we don't register a case of cancer until the number of cells pops through the threshold of a screening test. Each frame in the animation might correspond to a map of cancer cases in one year. Some of the frames show very definite clusters of cancer cases! Better look for known carcinogens where those clusters formed! Is there a power line? A kerosene lamp? A cell phone? No, it's most likely just random stuff.

Or we could be talking about weather events. Rivers rise and fall all the time, but we don't call it a flood until a river rises above the line of the nearest occupied land. Some of these bars seem to be flooding several years in a row! It's global warming! No, it's most likely just random stuff.
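The "several years in a row" effect is easy to check with a rough sketch. Assume each river gets an independent random level every year and floods when the level tops a fixed line. The 0.8 flood line, 100 years, 1000 rivers and the function names are all made up for illustration.

```python
import random

# Hypothetical names and numbers; nothing here is fitted to real rivers.

def simulate_river(n_years, flood_level, rng):
    """One river: True for each year its random level tops the flood line."""
    return [rng.random() > flood_level for _ in range(n_years)]

def longest_streak(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best

rng = random.Random(30)  # fixed seed so the run is repeatable
streaks = [longest_streak(simulate_river(100, 0.8, rng)) for _ in range(1000)]
share = sum(s >= 3 for s in streaks) / len(streaks)
print(f"rivers with 3 or more flood years in a row: {share:.0%}")
```

Nothing in this sketch links one year to the next, so any streaks it reports come from chance alone.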

But not always. In some cases a repeated flood is just part of this clustering effect, but repetition is actually more likely than plain clustering would imply. Everything in Nature depends in infinitely complex ways on previous events. If Wildcat Creek floods in March, the ground is still wetter than usual in May, so it takes less rain than usual to bring the creek up to flood stage. The threshold has moved. There are also long-term trends like sunspot cycles and El Niño / La Niña ocean oscillations. If conditions favor big rains this year, the trends are likely to favor big rains next year as well. You'll probably have to wait several years until the cycles return to a dry phase.

To illustrate, I've moved the threshold up and down in a sine wave. First as seen from the side, just to show what's really happening:

Now from the top. Wow! We got 500-year floods everywhere, for several years in a row! And then we have terrible droughts everywhere, for several years in a row! This can't be random!

Yes, it can. The bars are still driven by the very same random process; the driving forces haven't changed. It's just that the conditions for popping each event above the threshold are changing from year to year, as they do in Nature.
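Here is a rough sketch of the sine-wave version, again in Python. The base level, swing, 20-year period and the name sine_threshold are assumptions for illustration, not the values behind the animation.

```python
import math
import random

# Hypothetical name and parameters, chosen only to make the effect visible.

def sine_threshold(year, base=0.7, swing=0.25, period=20):
    """Threshold that drifts up and down in a sine wave over the years."""
    return base + swing * math.sin(2 * math.pi * year / period)

rng = random.Random(500)  # fixed seed so the run is repeatable
for year in range(40):
    level = sine_threshold(year)
    # The bars are drawn exactly as before: independent uniform heights.
    heights = [rng.random() for _ in range(40)]
    row = "".join("#" if h > level else "." for h in heights)
    print(f"{level:.2f} {row}")
```

Each row is one year seen from the top, with that year's threshold printed at the left. The heights come from the same unchanging random draw every year; only the line they are compared against moves.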

People who see life through the prism of statistics have trouble handling thresholds. Abstract academics have to shoehorn life into closed-form real-number equations, and you can't use a threshold in that context. Thresholding is perfectly natural to a binary computer. An on-off choice is easy to write as code, and the computer can handle it more precisely than a continuous number. But this naturalness doesn't penetrate the academic mind. If you can't write a continuous function suitable for a slide rule, you can't begin to think about the problem.