Rich Sutton Plays Minesweeper

Month Two

Turns out, a New Year’s resolution to blog every week is too ambitious. Maybe I can revise that to once a month? Here goes a simple February blog post, as I come back into the CompCath sandbox to see what I can put together.

I’m still on a bit of an unhealthy Minesweeper kick, so I channeled that energy this morning into a vibe-coded autosolver here. It is still very much a work in progress, and I couldn’t have done it without my loyal Cursor agent.

The Bitter Lesson

One of my primary motivations for coding my own version of Minesweeper is that I want to make a foray into reinforcement learning in the coming weeks. Back in the ACME Junior Core, we briefly experimented with RL in a Q-Learning lab, but I’d be lying if I said I fully internalized its beauty and simplicity.

Recently, I stumbled upon Rich Sutton’s Bitter Lesson, and it resonated with my intellect on a profound level. While I might go so far as to make this required reading for anyone living in the post-ChatGPT world, at the very least I hope the AI-curious will take the time to give it a read. The TLDR: Rich is one of the pioneers of RL. The bitter lesson is that, in the long run, general methods that scale with additional compute beat methods built on additional human thought. Instead of trying to simplify complex phenomena, to truly innovate in this day and age we need to just let the wonders of computational search and learning scale. That release (letting go of the relentless and futile search for understanding by the human mind and just embracing the inconceivable complexity) embodies the spirit of Computational Catharsis better than any other idea I’ve seen. Thanks, Rich.

Maybe soon you will see my humble attempt at a reinforcement learning agent conquering the simple world of Minesweeper, live in your browser. Can it beat the best human solvers I could find? I’m skeptical.

Miscellanea

  • Scientifically accurate Financial Times article using quantum physics as a metaphor for current geopolitical relations, written by the former president of Armenia who happens to be a physicist
  • Dario Amodei gives his two cents on AI alignment, condemning doomerism while still encouraging restraint