
We are looking at discrete systems, where each next step is chosen purely stochastically

What is one trying to learn, etc.?

Case 1: there is some action that one is trying to replicate in some other kind of system

https://www.wolframscience.com/nks/p833--intelligence-in-the-universe/

[ ~ genetic algorithms ]

Case 2: we have certain specific examples we want to reproduce, and the behavior on everything else should be “sensible”

Questions

Even if we’re learning something simple, we may end up with a complicated way to achieve it. [Either because it has to be that way, or because we are seeing frozen history]
“Irreducibility of changing our computational basis”

What can we learn?

When we are learning from examples, can we characterize generalization?

[Evolutionary biology] If we look at the population of evolvers, when do we get speciation? I.e. when are there “fundamentally different ideas” being used?

Given a certain set of x_i -> y_i examples, how many can successfully be learned? What does superposition look like?
[A rule array of height t certainly can’t learn more than t arbitrary (x_i, y_i) pairs, and probably a lot fewer]

Objectives

Where does complexity in biology come from? [Irreducible mismatch between computational bases; or frozen history; or intrinsic complexity generation ([probably] not evolvable)]

Can we see speciation?

Why/when does machine learning work?

How can we characterize generalization/attractor basins etc.?

How “compact” is the learned array/network? Do different functions get combined into the same neurons by some kind of superposition? [What is the analog of this in biology?]

Representations

Can we make a multisystem colored by loss values? I.e. color in branchial space. When the blobs of colors have gaps that’s speciation.

Without crossover we have a tree of generated outputs
[neural nets often have an analog of crossover in adding weights (which may implicitly assume a certain sparsity)]

At every step in the evolution, we can show the histogram of loss changes in (a sample of) the possible next configurations. [This is our analog of gradient descent] <Effectively we are just sampling a discrete set of possible directions, and picking the most downhill> <Calculus lets us get the answer without doing sampling>
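The sampling idea above can be sketched in Python. This is only an illustration under stated assumptions, not the method used in these notes: the rule set `(4, 146)`, the Hamming-distance `loss`, the cyclic boundary, the toy `examples`, and every function name are hypothetical choices made for the sketch. It samples single-cell changes to a rule array, records the loss change for each, builds the histogram, and picks the most downhill sampled change.

```python
import random
from collections import Counter

def ca_step(cells, rules):
    """One row of a rule array: cell i is updated with elementary CA rule rules[i] (cyclic)."""
    n = len(cells)
    return [(rules[i] >> (4 * cells[(i - 1) % n] + 2 * cells[i] + cells[(i + 1) % n])) & 1
            for i in range(n)]

def final_state(init, rule_array):
    """Run the input through every row of the rule array and return the last row of cells."""
    state = list(init)
    for row in rule_array:
        state = ca_step(state, row)
    return state

def loss(rule_array, examples):
    """Assumed loss: total Hamming distance between outputs and targets."""
    return sum(
        sum(a != b for a, b in zip(final_state(x, rule_array), y))
        for x, y in examples
    )

def sample_loss_changes(rule_array, examples, rule_set, n_samples, rng):
    """Try n_samples random single-cell rule changes; return (delta, row, col, new_rule) tuples."""
    base = loss(rule_array, examples)
    samples = []
    for _ in range(n_samples):
        r = rng.randrange(len(rule_array))
        c = rng.randrange(len(rule_array[0]))
        old = rule_array[r][c]
        new = rng.choice([rule for rule in rule_set if rule != old])
        rule_array[r][c] = new                      # apply the candidate change
        samples.append((loss(rule_array, examples) - base, r, c, new))
        rule_array[r][c] = old                      # undo it again
    return samples

rng = random.Random(0)
rule_set = (4, 146)                                 # hypothetical small rule set
width, height = 7, 5
rule_array = [[rng.choice(rule_set) for _ in range(width)] for _ in range(height)]
examples = [                                        # hypothetical toy examples
    ([0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0]),
    ([0, 1, 1, 0, 1, 1, 0], [0, 0, 0, 0, 0, 0, 0]),
]

samples = sample_loss_changes(rule_array, examples, rule_set, 40, rng)
histogram = Counter(delta for delta, _, _, _ in samples)   # loss change -> count
best_delta, r, c, new_rule = min(samples)                  # most downhill sampled change
```

`histogram` is the discrete stand-in for a gradient: it shows how many sampled directions go downhill, stay level, or go uphill. Setting `rule_array[r][c] = new_rule` takes the chosen step.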

[[ Ordinary ML: always keep a single representative; our version: can keep a stack of representatives (cf biology) ]]
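A minimal sketch of the "stack of representatives" idea, assuming mutation without crossover and a simple keep-the-best selection step. The names `evolve_population`, `flip_one`, `hamming_loss`, the bit-tuple genome, and the toy target are all hypothetical, chosen only to keep the example small. Setting `keep=1` reduces it to the single-representative case.

```python
import random

def evolve_population(initial, loss, mutate, keep=8, children=4, steps=50, rng=None):
    """Keep up to `keep` representatives; each step, mutate every one and retain the best."""
    rng = rng or random.Random(0)
    population = [initial]
    for _ in range(steps):
        candidates = list(population)               # parents survive, so the best never gets worse
        for parent in population:
            for _ in range(children):
                candidates.append(mutate(parent, rng))
        candidates.sort(key=loss)
        population = candidates[:keep]              # duplicates are allowed in this sketch
    return population

def flip_one(genome, rng):
    """Toy mutation: flip a single randomly chosen bit of a tuple genome."""
    i = rng.randrange(len(genome))
    return genome[:i] + (1 - genome[i],) + genome[i + 1:]

target = (1, 0, 1, 1, 0, 0, 1, 0)                   # hypothetical toy target

def hamming_loss(genome):
    """Toy loss: number of positions that differ from the target."""
    return sum(a != b for a, b in zip(genome, target))

start = (0,) * len(target)
final = evolve_population(start, hamming_loss, flip_one,
                          keep=4, children=3, steps=20, rng=random.Random(1))
```

Because parents are carried over into the candidate pool, the best loss in the stack is non-increasing from step to step. Whether several distinct lineages survive depends on `keep` and on how ties are broken.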

Raw material

Global rulial evolution:
homogeneous CA rules
string replacement rules

[Crossover might or might not be used]

Local rulial evolution:
small set of CA rules
small set of logic functions
[probability of one rule vs another]
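As an illustration of the local case, here is a sketch of a rule array in Python, assuming elementary (3-neighbor, 2-color) CA rules, cyclic boundaries, and one rule choice per cell per step. The function names, the example rule set `(4, 146)`, and the weights are hypothetical; the `weights` parameter stands in for the probability of one rule vs another.

```python
import random

def ca_step(cells, rules):
    """Update every cell i with its own elementary CA rule number rules[i] (cyclic boundary)."""
    n = len(cells)
    out = []
    for i in range(n):
        left, mid, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        neighborhood = 4 * left + 2 * mid + right   # value 0..7
        out.append((rules[i] >> neighborhood) & 1)  # look up that bit of the rule number
    return out

def run_rule_array(init, rule_array):
    """Apply the rows of the rule array in order; return all states including the input."""
    state = list(init)
    history = [state]
    for row in rule_array:
        state = ca_step(state, row)
        history.append(state)
    return history

def random_rule_array(height, width, rule_set, weights=None, rng=None):
    """Fill a height x width array with rules drawn from rule_set, optionally with weights."""
    rng = rng or random.Random()
    return [rng.choices(rule_set, weights=weights, k=width) for _ in range(height)]

# Hypothetical usage: two rules, the first chosen 80% of the time.
array = random_rule_array(3, 5, (4, 146), weights=[0.8, 0.2], rng=random.Random(0))
history = run_rule_array([0, 0, 1, 0, 0], array)
```

A homogeneous CA is the special case where every entry of the array holds the same rule.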

Tasks

Live for a certain time, then die out [[ could happen just by having a “blocker line” ]]

Autoencoder [ e.g. for blocks of a certain size with translation ] [ initially: a single block ]

Distributed consensus [ density > 0.5 : all black vs. all white ]

MNIST

Count the blobs. (Is there just a single region of black in the input?)

Compact the 1s [i.e. sorting]
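Several of the tasks above have simple target functions on 1D arrays of cells, which can be sketched as follows. The conventions are assumptions made for illustration: 1 means black, a density of exactly 0.5 counts as white, blobs are counted without wrapping around the ends, and compaction moves the 1s to the left. All function names are hypothetical.

```python
def consensus_target(cells):
    """Distributed consensus: all 1s if density > 0.5, otherwise all 0s."""
    majority = 1 if 2 * sum(cells) > len(cells) else 0
    return [majority] * len(cells)

def count_blobs(cells):
    """Count maximal runs of consecutive 1s (no wraparound)."""
    count = 0
    previous = 0
    for cell in cells:
        if cell == 1 and previous == 0:             # a new run of 1s starts here
            count += 1
        previous = cell
    return count

def is_single_blob(cells):
    """True when the input contains exactly one region of 1s."""
    return count_blobs(cells) == 1

def compact_ones(cells):
    """Compact the 1s: move them all to the left, i.e. sort in descending order."""
    ones = sum(cells)
    return [1] * ones + [0] * (len(cells) - ones)
```

For example, `consensus_target([1, 1, 0, 1, 0])` returns `[1, 1, 1, 1, 1]`, `count_blobs([0, 1, 1, 0, 1])` returns `2`, and `compact_ones([0, 1, 0, 1, 1])` returns `[1, 1, 1, 0, 0]`.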

Biology / machine learning analogies

Generalization in ML vs biology:
In ML you are given certain examples where you know the outcome
In biology you have particular individuals where you know if they survive/prosper
Generalization in biology: “it worked for that individual; now it should work for the whole species”
(a particular finch could be more successful on a particular tree because it has a crooked beak)

Overfitting : lookup table : birds that work well only on particular trees