brandur.org

Tweeting for 10,000 Years: An Experiment in Autonomous Software

Deep inside a mountain in Texas, a clock is being built. Unlike other clocks, this one is designed to outlast every other invention of humankind, carefully engineered to maximize longevity on a scale of time that’s incompatible with our most fundamental intuitions.

The counterweight for its drive mechanism is housed in a hollowed out shaft that’s 500 feet high and 12 feet in diameter. It’s the size of a small car and weighs an unbelievable 10,000 pounds. The clock’s periodic chimes are controlled by 20 huge gears stacked on top of one another – each of which is 8 feet in diameter. It keeps time through a 6-foot pendulum assembly terminating with football-sized titanium weights that swing as unhurriedly as one might imagine from such a leviathan, taking a full ten seconds to move all the way back and forth. Components are machined to within tolerances of a fraction of an inch, rather than thousandths of an inch common in similar devices, so that they’ll keep working as time takes its inevitable toll through expansion and rust.

The design of the orrery to be used in the 10,000 year clock. It shows the relative position of six human-eye visible planets in our solar system.

If all goes well, the clock will keep time for 10,000 years. It’s called the “Clock of the Long Now” and is a project of the Long Now Foundation, which aims to foster a culture that values long-term planning and responsibility, and to counteract what seems to be an accelerating trend towards the ever-shortening attention span that we see in society today. Their scale is one of centuries and millennia, and they aim to construct frameworks that will be functional for 10,000 years and beyond. As a reminder of this charter, the Long Now represents years in five digits instead of four – under their calendaring system, it’s the year 02018.

Software may not be as well suited as a finely engineered clock to operate on these sorts of geological scales, but that doesn’t mean we can’t try to put some of the 10,000 year clock’s design principles to work. As seen by the short functional lifetime of most software, and its tendency to continually complexify and bloat, our industry is one that’s reliably short-sighted when it comes to building products that will last.

Software does have some advantages for longevity compared to a mechanical apparatus. Especially in the age of the cloud, a well-designed program isn’t dependent on any single host. It can be moved around as the hardware below it succumbs to the physical realities of entropy, and rely on its underlying platform to stay stable thanks to the efforts of human maintainers.

I wanted to write a little experiment inspired by the 10,000 year clock to see how long I could make a simple program last without my intervention. It’s called Perpetual, and it has the simple task of posting a total of ten pre-configured tweets to my timeline on something close to an exponential scale; the last being very optimistically scheduled to fire 10,000 years from now. The first of them went out just a few minutes after this article was published.

Each tweet, or “interval”, is prefixed with a magic string and number like LHI001 (LHI stands for “long heartbeat interval”) so that the scheduled tweets are recognizable, and so that the program can easily find the last one that it published. Here’s the intended timeline:

Interval #   Tweet prefix   Scheduled time
0            LHI000         Today
1            LHI001         1 day (from now)
2            LHI002         1 week
3            LHI003         1 month
4            LHI004         1 year
5            LHI005         5 years
6            LHI006         10 years
7            LHI007         100 years
8            LHI008         1,000 years
9            LHI009         10,000 years
The scheduled publication time for each tweet/interval.
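As a rough sketch, the schedule in the table could be laid out relative to a start time like this. The scheduledInterval type, the buildSchedule function, and the exampleStart date are all hypothetical names and values chosen for illustration, not taken from Perpetual itself.

```go
package main

import (
	"fmt"
	"time"
)

// scheduledInterval pairs a tweet prefix with the time it becomes due.
// The type and its field names are assumptions made for this sketch.
type scheduledInterval struct {
	Prefix string
	Target time.Time
}

// exampleStart is an arbitrary stand-in for the publication time.
var exampleStart = time.Date(2018, time.January, 1, 0, 0, 0, 0, time.UTC)

// buildSchedule lays out the ten intervals from the table relative to start.
func buildSchedule(start time.Time) []scheduledInterval {
	// Offsets as {years, months, days}, in the same order as the table.
	offsets := [][3]int{
		{0, 0, 0},     // today
		{0, 0, 1},     // 1 day
		{0, 0, 7},     // 1 week
		{0, 1, 0},     // 1 month
		{1, 0, 0},     // 1 year
		{5, 0, 0},     // 5 years
		{10, 0, 0},    // 10 years
		{100, 0, 0},   // 100 years
		{1000, 0, 0},  // 1,000 years
		{10000, 0, 0}, // 10,000 years
	}

	schedule := make([]scheduledInterval, 0, len(offsets))
	for i, o := range offsets {
		schedule = append(schedule, scheduledInterval{
			Prefix: fmt.Sprintf("LHI%03d", i),
			Target: start.AddDate(o[0], o[1], o[2]),
		})
	}
	return schedule
}

func main() {
	for _, in := range buildSchedule(exampleStart) {
		fmt.Println(in.Prefix, in.Target.Format("2006-01-02"))
	}
}
```

Using calendar offsets through AddDate rather than fixed durations keeps "1 month" and "1 year" aligned with the calendar instead of approximating them as a number of hours.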

And here’s the code that checks for old intervals and decides whether a new one should be posted (somewhat simplified for brevity):

func Update(api TwitterAPI, intervals []*Interval, now time.Time) (int, error) {
    var (
        id int
        ok bool
    )

    // Scan the timeline for the last interval tweet that was published.
    it := api.ListTweets()

    for it.Next() {
        lastTweet := it.Value()

        id, ok = extractIntervalID(lastTweet.Message)
        if ok {
            break
        }
    }

    if it.Err() != nil {
        return -1, it.Err()
    }

    var nextIntervalID int
    if ok {
        // Pick the next interval in the series
        nextIntervalID = id + 1
    } else {
        // If ok is false, we never extracted an interval ID, which
        // means that this program has never posted before. Pick the
        // first interval ID in the series.
        nextIntervalID = 0
    }

    // Every interval in the series has already been posted.
    if nextIntervalID >= len(intervals) {
        return -1, nil
    }

    interval := intervals[nextIntervalID]

    // Nothing to do if the next interval isn't due yet.
    if interval.Target.After(now) {
        fmt.Printf("Interval not ready, target: %v\n", interval.Target)
        return -1, nil
    }

    _, err := api.PostTweet(
        formatInterval(nextIntervalID, interval.Message))
    if err != nil {
        return -1, err
    }

    return nextIntervalID, nil
}
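The two helpers that deal with the LHI prefix aren’t shown above. Here’s a minimal sketch of what they might look like; the function names come from the code above, but their bodies, the intervalPrefix constant, and the ": " separator between the prefix and the message are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// intervalPrefix is the magic string that marks a scheduled tweet.
const intervalPrefix = "LHI"

// formatInterval prepends the prefix and a zero-padded ID to the message.
// The ": " separator is an assumption of this sketch.
func formatInterval(id int, message string) string {
	return fmt.Sprintf("%s%03d: %s", intervalPrefix, id, message)
}

// extractIntervalID returns the interval ID encoded at the start of a tweet,
// or false if the tweet doesn't begin with the prefix and three digits.
func extractIntervalID(message string) (int, bool) {
	if !strings.HasPrefix(message, intervalPrefix) {
		return 0, false
	}

	rest := message[len(intervalPrefix):]
	if len(rest) < 3 {
		return 0, false
	}

	// Accept plain digits only, so that a sign like "+12" is rejected.
	for _, c := range rest[:3] {
		if c < '0' || c > '9' {
			return 0, false
		}
	}

	id, err := strconv.Atoi(rest[:3])
	if err != nil {
		return 0, false
	}
	return id, true
}

func main() {
	tweet := formatInterval(1, "hello")
	id, ok := extractIntervalID(tweet)
	fmt.Println(tweet, id, ok)
}
```

Ordinary tweets on the same timeline fall through with ok set to false, which is what lets the loop in Update skip past them.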

It’s a cute idea, but as you may have already guessed, my program won’t be tweeting for 10,000 years. It’ll be lucky if it makes it to 10 years, and lucky beyond all reason if it makes it to 100 (more on this in Existential threats below). Humans tend to have a hard time imagining increasing orders of magnitude, a fact that’s demonstrated by the well-documented cognitive bias of scope insensitivity. We can all do the basic arithmetic that tells us there are 1,000 ten-year segments in 10,000 years, but it’s difficult to appreciate how much more time that really is. After some size, all numbers, whether they’re a thousand, ten thousand, a million, or ten million, are just really big.

Consider that the oldest pyramid, the Pyramid of Djoser at Saqqara, isn’t quite 5,000 years old, and that’s ancient. When a young Cleopatra, who lived contemporaneously with some of history’s other most famous figures like Julius Caesar, Mark Antony, and Augustus, looked up at the huge stone monuments that were her country’s legacy, they’d been constructed further back in history for her (she was born in 69 BC) than she is back in history for us in 2018. There are a few human artifacts from as far back as 10,000 years ago, but they mostly amount to nothing more than fragments of pots.

But just because the program is unlikely to succeed on its 10,000 year mission doesn’t mean that we can’t try to improve its chances.

We have many artifacts from ancient humanity, but 10,000 years predates almost all of them.

The program’s goal for longevity is extremely ambitious, so it’s engineered with a number of features that aim to protect it against the decaying forces of time and make it as minimally prone to failure as possible:

  • It runs on a serverless architecture to insulate it against failures in underlying infrastructure. If a single server were to die, it would just be run somewhere else. Its platform will also get regular updates for security and stability.

  • That platform is AWS Lambda, a service provided by a big company (Amazon) that’s more likely than others to be long-lived. It also has a reliable history of not retiring products, and making relatively few breaking changes.

  • It has no persistent state of its own, and instead relies entirely on state returned from Twitter’s API. Databases are especially prone to aging and operational problems, and not including one improves the program’s chances.

  • In the spirit of production minimalism, there are very few moving parts: just the program itself, Twitter’s API, and the underlying serverless platform.

  • I’m using Go. As described in Go on Lambda, its 1.x series has a remarkable history of longevity and near perfect backwards compatibility. Even if Go 2 were to be released, I expect that there’s a good chance that my program would work with it.

  • Relatedly, Go is a statically typed, compiled language, which means that the code I wrote is more likely to actually work than if it’d been written in an interpreted, dynamically typed language where many problems only appear at runtime. I’ve also included a comprehensive test suite.

  • The program compiles down to a self-contained binary and won’t be as susceptible to breakage by a change in its underlying bootstrap or dependencies (compared to say Ruby, where an eventual upgrade to Bundler could mean that your program no longer starts).
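To illustrate the stateless design from the list above, here’s a small simulation in which the timeline itself is the only record of progress. Everything in it (fakeTimeline, run, simulate, and the example dates and message text) is hypothetical and stands in for the real Twitter API; it’s a sketch of the idea rather than Perpetual’s actual code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// fakeTimeline is a hypothetical stand-in for Twitter. The list of tweets
// (newest first) is the only state that exists anywhere.
type fakeTimeline struct {
	tweets []string
}

func (f *fakeTimeline) post(msg string) {
	f.tweets = append([]string{msg}, f.tweets...)
}

// lastIntervalID returns the ID of the newest tweet whose "LHI" prefix is
// followed by a parseable three-character number.
func (f *fakeTimeline) lastIntervalID() (int, bool) {
	for _, t := range f.tweets {
		if len(t) < 6 || !strings.HasPrefix(t, "LHI") {
			continue
		}
		if id, err := strconv.Atoi(t[3:6]); err == nil && id >= 0 {
			return id, true
		}
	}
	return 0, false
}

// run is one invocation of the program. It keeps nothing between calls and
// reports whether it posted.
func run(f *fakeTimeline, targets []time.Time, now time.Time) bool {
	next := 0
	if id, ok := f.lastIntervalID(); ok {
		next = id + 1
	}
	if next >= len(targets) || targets[next].After(now) {
		return false
	}
	f.post(fmt.Sprintf("LHI%03d example message", next))
	return true
}

// simulate invokes run four times, including one immediate rerun, and
// returns each posting decision along with the resulting timeline.
func simulate() ([]bool, []string) {
	start := time.Date(2018, time.January, 1, 0, 0, 0, 0, time.UTC) // arbitrary
	day := 24 * time.Hour
	targets := []time.Time{start, start.Add(day), start.Add(7 * day)}
	timeline := &fakeTimeline{}

	var posted []bool
	for _, now := range []time.Time{
		start,
		start, // an immediate rerun must not double-post
		start.Add(day),
		start.Add(day + time.Hour),
	} {
		posted = append(posted, run(timeline, targets, now))
	}
	return posted, timeline.tweets
}

func main() {
	posted, tweets := simulate()
	fmt.Println(posted)
	fmt.Println(tweets)
}
```

Because each run re-derives its position from the timeline, an invocation that crashes, gets retried, or moves to a different host can’t lose its place. The trade-off is that the program depends on being able to read the timeline as well as write to it.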

Over this kind of timeline, the program faces many existential threats. One of them will knock it offline eventually, with the only question being: which one?

  • Maybe the most common of all failures is an application bug. I’ve tried to protect against this pitfall through testing, but I could’ve easily overlooked a subtle edge case.

  • Changes in Twitter’s API could spell the end. This would take the form of a backwards-incompatible change like a new required parameter, change in the structure of responses, or adjustment to how applications authenticate.

  • Relatedly, changes in Twitter’s product are also dangerous. They could move to a new pricing model, remodel the product’s core design, or fold as a company.

  • Risks on AWS are similar. There’s a minimal API that Go programs on Lambda use to communicate with the service, and that could change. The Lambda product could be retired. I’ve set up the program to be able to run on the free tier alone, but the terms of that tier could change, or the account it’s running under could otherwise become delinquent.

  • If left running long enough, the binary I’ve uploaded to Lambda might become incompatible with the underlying virtual and hardware infrastructure through changes in machine code or low-level operating system APIs. It would need to be recompiled with a newer version of Go to work again.

I’d personally bet that it will be changes in Twitter’s API that will take the program down in the end. Their API has been stable for some time, but has accumulated its share of rough edges over the years. It stands to reason that Twitter will eventually undertake a project to revitalize it, and chances are that will spell the end of the current API after some deprecation period that’s likely to span a handful of years at most.

A core set of guiding principles was devised to help design the 10,000 year clock:

  • Longevity: The clock should be accurate even after 10,000 years, and must not contain valuable parts (such as jewels, expensive metals, or special alloys) that might be looted.

  • Maintainability: Future generations should be able to keep the clock working, if necessary, with nothing more advanced than Bronze Age tools and materials.

  • Transparency: The clock should be understandable without stopping or disassembling it; no functionality should be opaque.

  • Evolvability: It should be possible to improve the clock over time.

  • Scalability: It should be possible to build working models of the clock from table-top to monumental size using the same design.

The Long Now describe the principles above as “generally good for designing anything to last a long time,” and they are, even when it comes to software. It doesn’t take much creativity to rethink them as a set of values that could help guide our industry. I’d phrase them like this:

  • Longevity: Software should be written as robustly as possible to maximize longevity. Consider edge cases, test comprehensively, and use statically typed languages. Avoid dependencies that are complex or brittle.

  • Maintainability: Use frameworks that will make software easily maintainable by developers who come after you. Development should only require a minimal toolchain, and one that’s demonstrated a good history of stability and support.

  • Transparency: Write code simply and elegantly. Use abstractions, but don’t abstract so heavily as to obfuscate. It should be obvious how code works not only to you, but to any others who might read it in the future.

  • Evolvability: It should be possible to improve software over time. A good compiler and test suite should let future developers who aren’t deeply familiar with the existing code make those improvements safely.

  • Scalability: To ensure that production software will work properly, write an extensive test suite and deploy the prototype in high-fidelity pre-production environments before taking it live.

Software tends to stay in operation longer than we think it will when we first write it, and the wearing effects of entropy within it and its ecosystem often take their toll more quickly and more destructively than we could imagine. You don’t need to be thinking on a scale of 10,000 years to make applying these principles a good idea.

Post-mortem analysis

Updated April 14, 2023: entropy won. The official time in operation of this experiment was 4 years, 8 months. Not half bad, but a little short of the stated goal.

I got this email from Twitter today:

This is a notice that your app - 10000-years - has been suspended from accessing the Twitter API.

Please visit developer.twitter.com to sign up to our new Free, Basic or Enterprise access tiers.

Free access to Twitter’s API is being disabled for the vast majority of applications. A free account is available for write-only operations, but the program above needs read access to make sure it doesn’t double-post. And updating the project in any way would contradict the spirit of the experiment anyway.

It turns out that writing software that can stand the test of time isn’t easy.
