Random Forest

In general, Machine Learning (ML) systems are decision-making machines. Self-driving cars decide when to accelerate, turn, and brake. Language translators home in on a word based on whatever context is provided. Whatever the application, decisions are at the heart of an ML system’s functionality.

Trees are a handy and frequent way to represent decisions. Consider the taxonomic Tree of Life: the trunk represents life as a whole. Each branching point represents a classification decision: does this organism have gills, fur, a backbone? Accurate decisions lead to accurate identifications.

Imagine an algorithm designed to generate 3D models of oak trees. If you simulate the sun, soil, air and water accurately, the algorithm will function essentially like DNA. The DNA inside every cell of an acorn is a decision-making machine that generates an oak tree. Each cell’s unique environmental stimuli determine its growth cycle. Some cells become roots, some bark, some leaves. Where a plant receives water and sunlight shapes its growth. The better an algorithm accounts for all these factors, the more accurate its depiction of an oak tree will be.

There is a common ML strategy called a Random Forest. Instead of trusting a single decision tree, the machine grows many of them, each from a different random sample of the data, as if running a simulation of the problem over and over until a reliable pattern emerges. The result can be pictured as a forest, where each tree is a series of decisions played out. The outcomes of the trees, averaged or put to a vote across large numbers, reveal the decision patterns that lead to the desired results.
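
As a rough illustration of the forest idea above, here is a minimal sketch in Python. It assumes the usual formulation of a random forest, in which each tree is trained on a random resample of the data and the forest answers by majority vote. The organisms, the two features (has_gills, has_fur), and every function name are invented for the example, and each "tree" is cut down to a single question to keep the code short.

```python
import random
from collections import Counter


def train_stump(rows, labels, rng):
    """Grow a one-question 'tree': pick a random feature, then its best split."""
    feature = rng.randrange(len(rows[0]))
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        if not left or not right:
            continue  # this threshold does not actually split the sample
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        correct = left.count(left_label) + right.count(right_label)
        if best is None or correct > best[0]:
            best = (correct, threshold, left_label, right_label)
    if best is None:
        # No usable split in this sample: always answer with the majority label.
        majority = Counter(labels).most_common(1)[0][0]
        return (feature, rows[0][feature], majority, majority)
    _, threshold, left_label, right_label = best
    return (feature, threshold, left_label, right_label)


def predict_stump(stump, row):
    """Answer the stump's single question for one organism."""
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on its own bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        picks = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in picks]
        sample_labels = [labels[i] for i in picks]
        forest.append(train_stump(sample_rows, sample_labels, rng))
    return forest


def predict_forest(forest, row):
    """Let every tree vote and return the most common answer."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: each row is [has_gills, has_fur].
rows = [[1, 0], [1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1], [0, 1]]
labels = ["fish"] * 4 + ["mammal"] * 4

forest = train_forest(rows, labels)
print(predict_forest(forest, [1, 0]))  # an organism with gills and no fur
```

No single stump sees all of the data, and a few may be grown from a lopsided sample, but the vote across the forest smooths those individual mistakes out.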

A simulation of a complex problem space must be limited in scope, and establishing those limitations changes the results of the experiment. How are the problems defined? And the solutions? What are the metrics and thresholds of success? It’s the machine’s taste in information that determines which data gets captured. The scope must be limited to what’s accountable and calculable.

Imagine a machine built to optimize a forest. What outcome is it trained to seek? What’s best for the forest, a well-balanced mosaic of species, is probably not the same as what’s best for the forester. But it is possible, even likely over the long term, for the interests of the trees and the loggers to align.
