AI Alignment Is Already a Problem for Humanity

Stable Diffusion, Prompt: “Benevolent robot overlords creating a Golden Age on Earth. Soviet propaganda poster.”

At Stitch Fix, I lead the teams that built our in-house recommender systems. These systems were incredibly powerful, and misunderstood by leadership. We were facing the AI alignment problem head on. That problem is getting larger every day, engulfing the tech industry and society at large. AI alignment is not a problem for the future: it's already here.

Recommender System Alignment

Three observations, taken together, explain why recommender system alignment is already a problem:

  1. Well-tuned recommenders are extremely competent. When we tested replacing human expert stylists with mature algorithms, the algorithms typically outperformed them, often by a wide margin across many measures.
  2. Recommender systems give you exactly what you ask for. When we tuned the system to maximize the number of items sold per box, we sold a lot of items. When we changed it to maximize revenue, we could drive a lot of dollars. Decisions about the algorithm's objective function were crucial to business operations, and changing that configuration could create enormous operational headaches.
  3. It is hard to align recommender systems with our true values. Because of how our systems were configured, it was easy to maximize outcomes tied to a single item, such as sales or revenue. It was hard to train directly on the thing we really cared about: valuable client relationships. Sometimes humans disagreed with what the algorithm was doing and would add hard-coded rules and constraints to address perceived problems. These typically made things worse: the cost to algorithm efficacy was large, while the underlying problems weren't actually fixed.
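The objective-function point above can be sketched with a toy ranker. All of the item names, prices, and keep probabilities below are made up for illustration; the real systems are far more complex. The only point is that swapping the objective flips which item gets recommended:

```python
# Toy catalog: each item has a price and a (hypothetical) probability
# that the client keeps it. Numbers are invented for illustration.
items = [
    {"name": "basic tee",    "price": 25.0,  "p_keep": 0.90},
    {"name": "denim jacket", "price": 120.0, "p_keep": 0.35},
    {"name": "silk blouse",  "price": 80.0,  "p_keep": 0.55},
]

def score(item, objective):
    """Score an item under a given objective function."""
    if objective == "items_sold":
        return item["p_keep"]                  # expected units kept
    if objective == "revenue":
        return item["p_keep"] * item["price"]  # expected dollars
    raise ValueError(f"unknown objective: {objective}")

def recommend(items, objective):
    """Return the name of the top-scoring item under the objective."""
    return max(items, key=lambda it: score(it, objective))["name"]

print(recommend(items, "items_sold"))  # -> basic tee (highest keep rate)
print(recommend(items, "revenue"))     # -> silk blouse (highest expected $)
```

Same catalog, same model of client behavior; the only thing that changed is one line of configuration, and the system now "wants" something different.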

The combination of the above means that you must be careful what you wish for when you configure an algorithm. These systems are both extremely powerful and hard to align with human values. As a corollary, improving their alignment is the current frontier of making them more useful, even in the short term.
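The cost of hand-written constraints mentioned above can also be sketched in a few lines. Assuming a ranker already tuned to maximize some objective (scores below are hypothetical), any hard-coded rule that removes candidates can only leave that objective the same or worse:

```python
# Hypothetical expected-revenue scores for a candidate set (made-up numbers).
scores = {"denim jacket": 48.0, "silk blouse": 44.0, "basic tee": 22.5}

def best(scores, allowed=None):
    """Pick the top-scoring item, optionally restricted to an allowed set."""
    pool = scores if allowed is None else {k: v for k, v in scores.items() if k in allowed}
    return max(pool, key=pool.get)

# Unconstrained: the tuned objective picks its top item.
top = best(scores)

# A hard-coded rule (say, "never recommend items over $100") removes
# the top candidate; the objective the system was tuned for can only
# stay flat or drop.
constrained = best(scores, allowed={"silk blouse", "basic tee"})

print(scores[top], scores[constrained])  # 48.0 44.0
```

The rule may still be worth it if it encodes a value the objective misses, but the toy makes the trade visible: the constraint taxes the measured objective without touching whatever underlying misalignment motivated it.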

We might call this the narrow alignment problem. Today's algorithms don't threaten to destroy the world the way a misaligned AGI might (see Scott Alexander's review of Stuart Russell's book Human Compatible), but they can certainly make mischief: selling us things we don't need, or addicting us to media that makes us emotionally worse off.

What to watch for

In the coming decade, this will play out across many spheres.

Ultimately I believe recommendation algorithms are the solution to what David Perell has dubbed the Paradox of Abundance. With so much information available, we have the tools at our fingertips to create an era of wisdom and enlightenment. But most people are drowning in the onslaught. Algorithms have proven they can tame the seas and command people’s focus. We need them to use that power for good.