Machine Learning - 1980s style
Being an old gimmer means I mostly remember the 1980s. The colourful clothing (I mainly wore black), the big hair (mine was small and awful), the systemic racism (I am brown) and the casual violence (I was soft as shit).
It was also the age of the "home computer". I had a VIC-20 (3.5k of RAM!), followed by a Commodore 64 (64k), but most people had BBC Micros - so-called because they were designed for a BBC TV show, after which the government pushed schools throughout the country to use them for teaching.
Quick history lesson
BBCs were designed and manufactured by a company called Acorn, who later produced the Archimedes, which ran an amazing operating system called RISC OS. The OS was named after the new-fangled, super-fast and powerful RISC chip that Acorn themselves designed and built especially for this next-generation machine. This was unusual, as previous home computers used American-made processors like the MOS Technology 6502.
The chip itself was the Acorn RISC Machine and, following a partnership deal, the spin-off company that took it on became Advanced RISC Machines - ARM for short. You probably own at least one or two of their chips today, forty years on.
BASIC programming
Those home computers ran their own custom OSes and booted straight into a BASIC (the programming language) development environment. When I say "development environment", I mean you had a command prompt, but the shell was a BASIC REPL - like launching directly into irb.
BBC BASIC had a number of in-built commands that made it easy to write colourful characters to the screen (if you've ever seen Teletext, that was the BBC's mode 7 display), whereas on the C64, to get it to display anything fancy, you had to write directly into the video RAM - using the fantastically named POKE command. In turn that meant that to do anything beyond the basics (pun not intended but very fitting), you had to understand how the machine itself worked.
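From memory - so treat the exact numbers as roughly right rather than gospel - putting a single character on the C64 screen meant POKEing both screen memory and colour memory:

POKE 53280,0 : REM border colour register - 0 is black
POKE 1024,1 : REM first byte of screen memory - screen code 1 is "A"
POKE 55296,1 : REM matching byte of colour memory - 1 is white

Three magic numbers just to put one letter in the corner of the screen - which is exactly why you ended up learning the memory map.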
In fact the POKE command, writing directly to memory and knowing how the OS worked are all vital to this story.
Machine Learning
In the 1960s Artificial Intelligence researchers released a program called ELIZA that blew people away. It was an AI psychotherapist - people would type out their issues and ELIZA would respond. It used basic pattern matching (essentially regex'ing the input string) to pick out the important words and then reformulate them into the next question.
I hate my dog
Why do you hate your dog?
He bit the postman
Tell me more about your postman
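The heart of the trick is only a handful of lines. This isn't the magazine listing (that's long gone) - just a minimal sketch of the idea in BBC-flavoured BASIC, with a single hard-coded pattern:

10 INPUT ">" I$
20 P = INSTR(I$, "I HATE ")
30 IF P > 0 THEN PRINT "WHY DO YOU HATE "; MID$(I$, P + 7); "?" ELSE PRINT "TELL ME MORE"
40 GOTO 10

The real thing had a table of patterns and canned responses rather than one IF, but that's the whole of the "intelligence".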
My friend and I decided to type in an ELIZA listing from a magazine. The printed word was a powerful distribution method for software in those days. I think we were using his Acorn Electron (a cut-down BBC Micro that had less RAM - maybe 20k instead of 32k - and was therefore cheaper).
DATA statements
Home computers didn't have any built-in long-term storage. You would buy a magazine, type the code in yourself, then plug in a cassette deck and tell the computer to SAVE the current program to tape. Later you could LOAD your code back from the tape - if you had spooled the cassette to the correct place. Depending on the size of your code, this could take a while - some games took 20 minutes to LOAD.
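On a BBC or Electron the whole ritual was a couple of keywords typed at the prompt (the C64 equivalents were similar, just slower):

SAVE "ELIZA" : REM press RECORD and PLAY on the cassette deck, then wait
LOAD "ELIZA" : REM rewind to the right spot first, press PLAY, wait even longer
RUN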
So "just reading a file from disk" was out of the question.
Instead BASIC had the DATA statement - a longhand way of bulk-loading data into memory. Rather than reading a file from your non-existent disk, you hard-coded the data into your program.
You added DATA statements to the end of your program code and, whenever the program executed a READ, it would jump ahead, pull the next DATA values into variables and then return execution to the line after the READ. This ELIZA listing contained a series of DATA statements with the words that the fake psychotherapist would recognise.
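Something like this (a made-up fragment, not the real listing) - at startup the program READs its vocabulary out of the DATA lines at the bottom:

10 DIM W$(2)
20 FOR I% = 0 TO 2
30 READ W$(I%)
40 NEXT I%
50 PRINT "I KNOW ABOUT "; W$(0); ", "; W$(1); " AND "; W$(2)
1000 DATA HATE, DOG, POSTMAN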
After playing with the app for a while we got bored and thought it would be good if it recognised more words - especially words that you had previously used in that conversation. But in BASIC, when you defined an array its size was fixed. And we didn't have the RAM to make copies as and when required.
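If memory serves, BBC BASIC would simply refuse a second DIM on the same array - the size you asked for first was the size you were stuck with:

10 DIM W$(63) : REM sixty-four slots, fixed for the life of the program
20 DIM W$(127) : REM fails with a "Bad DIM" error - there is no resizing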
But we did have the POKE command.
Learning about the Machine
We realised that when you typed on the keyboard, the machine didn't respond immediately but instead buffered your key presses so it could react when it had time (an event loop). And the keyboard buffer itself was just a 32-byte area of RAM that we could POKE to.
So when we wanted the program to learn a new word, we POKEd the characters for a DATA statement into the keyboard buffer, followed by the RUN command.
And then terminated the program.
On termination, the OS read the characters from the buffer and inserted them at the shell prompt. This meant our DATA statement got appended to the current program. Then the OS read the RUN command from the buffer, inserted that at the shell prompt and the program started from scratch. On startup, the program would read its list of known words from the DATA statements - now including this extra line we had POKEd into existence.
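We POKEd the characters straight into the buffer's memory (the exact addresses have long since fallen out of my head), but the documented route on a BBC or Electron is OSBYTE 138, which pushes one character into the keyboard buffer. Here's a rough sketch of the idea using that call instead of our raw POKEs - the new word and line number are just examples:

100 REM queue up a new DATA line, plus RUN, as if someone had typed them
110 T$ = "1000 DATA POSTMAN" + CHR$(13) + "RUN" + CHR$(13)
120 FOR I% = 1 TO LEN(T$)
130 A% = 138 : X% = 0 : Y% = ASC(MID$(T$, I%, 1)) : CALL &FFF4 : REM OSBYTE 138 - insert Y into buffer X (0 = keyboard)
140 NEXT I%
150 END : REM on exit the OS "types" the buffered characters at the prompt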
And in that way we built our first ever dynamic machine learning application whose knowledge grew the more you used it.
Until it ran out of RAM.
Or you used a word that didn't fit into the 32-byte keyboard buffer.