One concern about new developments in artificial intelligence is that they
could threaten humanity's existence.

I am starting to wonder if I am not just an unknowing simulation of a Nick
Bostrom fanboy. But, once again, he has the perfect metaphor for that concern:
what if a system built by humans could run amok and destroy us simply by being
untethered and too efficient at a menial task?

This metaphor asks us to understand that risks from artificial intelligence are
probably far away from what we picture in science fiction. They do not require
consciousness or volition to become real.

A Paperclip Maximizer

Imagine a system designed to produce paperclips. It is not very critical for
humanity, nor completely useless, so we might build it. We might want to
optimize it. One day, maybe, we would have an artificial intelligence running
this paperclip production. And for some reason, we would not be able to stop it.
We would have to let it build as many paperclips as it can. It would compete
with us for the resources to build them, and after exhausting the easily
available ones, it would probably use humans as a source of raw material, and,
in the end, the rest of the universe.

Let’s admit it: it is far-fetched. But isn’t that the point of a thought
experiment?

Humanity annihilating itself

It is not the first time humanity has scared itself with its own power. To the
dismay of some of his colleagues, Fermi took bets on the annihilation of the
Earth during the Trinity test. For now, we are still alive and well.

Nuclear power generates fear because it is a good source of weaponry. The
interesting part about the Paperclip Maximizer, however, is that it highlights
how problems can occur without anyone intending to cause them.

Let’s hypothesize about our times for a moment

How far are we from such a scenario considering generative models?

What would you think of a worm trying to maximize Bitcoin production?

Risk assessment usually considers two dimensions around a scenario:

  1. Likelihood
  2. Impact
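
As a toy illustration only (my own numbers and scale, not a formal risk
methodology), a scenario score could simply combine the two dimensions:

    # Toy scoring along the two dimensions; values are made-up, on a 0-1 scale.
    likelihood = 0.2   # subjective probability that such a worm shows up
    impact = 0.9       # subjective severity if it does
    risk = likelihood * impact
    print(f"risk score: {risk:.2f}")   # prints: risk score: 0.18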

Likelihood

Bitcoin enabled something that did not exist in the era of the first
high-profile computer worms: direct profit from available computing power.
Hence, it is not just a cool kid's hacking trick; it can become a very
profitable endeavor. That gives us a motivation.

Computer security used to be a secondary topic. Nowadays, it is a prime concern
for everyone in the industry. It is still far from perfect, and some branches
are actually lagging far behind. I am looking down at you, useless IoT
manufacturers! But even for the top performers, keeping systems up to date is a
nightmare, and zero-days are common. Could generative systems, like an evil
ChatGPT, take advantage of that?

Impact

Mankind at war with robots usually involves robots … and so far, we do not have
that many of those. The path for a software issue to become a physical-world
threat is not that straightforward. I do not expect HAL to come knocking on our
doors anytime soon. That moderates the impact quite a bit.

However, think about what could go wrong if only software were to go wrong:

  • Payment systems, meaning basically a minimum of 80% of developed countries’
    economic activity.
  • Utility networks.
  • Healthcare systems.
  • Transportation systems.

That still gives us quite a scary mix.

Architecture

A computer worm has two aspects:

  1. Propagation, from one system to another.
  2. Payload, what to run once on an infected system. Here, the payload would be
    a crypto-mining operation.

The utility function of our system here is how much it can mine. That is how it
would judge itself a good or a bad performer.
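
To make the two aspects and the utility function concrete, here is a purely
conceptual Python sketch of my own: the class name, the method names, and the
made-up conversion factor are all assumptions, and every operation is an
abstract stand-in rather than anything functional.

    # Conceptual sketch only: abstract stand-ins for the two aspects and the utility.
    from dataclasses import dataclass, field

    @dataclass
    class WormInstance:
        mined_total: float = 0.0                      # hypothetical amount mined so far
        infected_hosts: list = field(default_factory=list)

        def payload(self, hours: float, hashrate: float) -> None:
            # Payload aspect: stands in for a crypto-mining operation.
            self.mined_total += hours * hashrate * 1e-9   # made-up conversion factor

        def propagate(self, reachable_hosts: list) -> None:
            # Propagation aspect: stands in for spreading to other systems.
            self.infected_hosts.extend(reachable_hosts)

        def utility(self) -> float:
            # Utility function: how much the instance has mined.
            return self.mined_total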

How could propagation be optimized and adapted to our security-aware world?
Simply hand that problem to an evolutionary algorithm, along the lines sketched
after the list below!

  1. If the system does not have AI learning capacity (space and computing power):
    go to 4.
  2. Crawl for an updated information security dataset. (subroutine 1)
  3. Train on that. (subroutine 2)
  4. Generate a new set of exploits, maybe based on the top 10 threats
    identified in step 2. (subroutine 3)
  5. Scan the local and remote networks.
  6. Use said exploits to propagate to X systems.
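
A minimal control-flow sketch of those six steps, assuming a hypothetical worm
object whose methods are all left as abstract placeholders (none of them is
specified or implemented here):

    # Abstract control flow for the six steps; every call is a hypothetical placeholder.
    def propagation_cycle(worm, target_count):
        if worm.has_learning_capacity():              # step 1: enough space and compute?
            dataset = worm.crawl_security_dataset()   # step 2: subroutine 1
            model = worm.train(dataset)               # step 3: subroutine 2
            exploits = worm.generate_exploits(model)  # step 4: subroutine 3, e.g. top-10 threats
        else:
            exploits = worm.generate_exploits(None)   # step 4 directly, without fresh training
        hosts = worm.scan_networks()                  # step 5: local and remote networks
        worm.spread(exploits, hosts[:target_count])   # step 6: propagate to X systems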

Do you want to improve that? Add an evolutionary loop on top; a rough sketch
follows the list.

  1. Enable worm instances to communicate with one another.
  2. If a worm is performing in the top tier of its peers: reproduce more.
  3. If not: mutate (i.e. regenerate subroutines using our generative model). The
    mutation can use all kinds of existing tools to check itself, static analysis,
    type checking, … in order to recover quickly from erroneous code generation.
  4. If you do not become a top-tier performer within a limited number of mutation
    cycles, replace yourself with a better-performing peer program.
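
And here is how that improvement loop could look, again as a hedged sketch: the
top-tier threshold, the cycle limit, and the method names (reproduce_more,
mutate_subroutines, passes_static_checks, replace_with) are all hypothetical.

    # Sketch of the selection / mutation / replacement loop between peer instances.
    def evolution_step(worm, peers, top_fraction=0.2, max_mutation_cycles=5):
        scores = sorted(p.utility() for p in peers)
        threshold = scores[int(len(scores) * (1 - top_fraction))]   # top-tier cut-off

        if worm.utility() >= threshold:
            worm.reproduce_more()                     # top performers spread more copies
            return

        for _ in range(max_mutation_cycles):
            candidate = worm.mutate_subroutines()     # regenerate code with the generative model
            if candidate.passes_static_checks():      # static analysis, type checking, ...
                worm = candidate                      # keep the mutation only if it is sane
            if worm.utility() >= threshold:
                return                                # the mutation paid off

        best = max(peers, key=lambda p: p.utility())
        worm.replace_with(best)                       # still lagging: copy a better peer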

Conclusion

That leaves us with a few questions that may keep you awake at night. How fast
would such an imaginary program spread? Could humans react without their
countermeasures becoming input for a worm that fights back? Could we afford to
unplug every infected system from the internet?

If the generative models from half a year ago were not able to do that, what
about the current ones? And the future ones?
