Abstract
It is hardly possible to pick up a computer journal today without seeing references to the Alvey project in the UK, the Esprit project run by the EEC, the Japanese “Fifth generation” initiative, the proliferation of knowledge-based computer companies around Stanford in California and MIT in Massachusetts, and the large amounts of money that are being poured into the development of expert systems. The references abound: as I write this article the September 1984 issue of “International Management” (1) arrives on my desk with a cover depicting a digitized human brain image on a computer screen and a caption saying “Artificial intelligence: the race to make it work for managers”. I like the use of the word “race” here; it seems to imply that this is our last chance now that the prospect of human intelligence in managers has declined asymptotically to zero.
Among the savants cited in the International Management article is Sir Clive Sinclair, no less, who has become notorious with the British Computer Society's Expert Systems Specialist Group: notorious for arriving unexpectedly at meetings and giving the rest of the audience something to think about. Sir Clive apparently said in the interview: “Once 'machines of silicon' surpass us, they will be capable of their own design. In a real sense, they will be reproductive, and silicon will have ended carbon's long monopoly. And ours too, I suppose, for we will no longer be able to deem ourselves the finest intelligence in the known universe.”
Maybe expert systems will dominate us after all — it's just that there'll be a ten-month delay between quoting your Visa number and becoming an inferior intelligence.
All this is quite fascinating since the phenomenon is very recent: before 1983 the phrase “expert systems” was largely unheard of and certainly not in itself viewed as the potential salvation of the Western world (there are as yet few hints of Eastern bloc development apart from in Hungary, where the University of Budapest, along with those of Marseilles, Edinburgh and London's Imperial College, claims to have invented PROLOG). However, by now there has been “Horizon” television coverage of the genre and there seems little doubt that expert systems are here to stay. But before I take up the theme of my title, perhaps I might be allowed to state my position on one or two related matters. First of all: “Expert systems are not the same thing as artificial intelligence”
We can see the beginnings of a bandwagon effect here in which technical namedroppers, bored of stating that “UNIX is the coming thing” or “The future of computing lies in networking” are catching up with this latest trend: “Life is just another form of domain specialism”. Expert systems are in fact a very narrowly defined class of computer program with certain in-built inference-drawing capabilities. Artificial intelligence, on the other hand, is a much wider field involving the generic search for synthetic rational life. Much of the most interesting work in artificial intelligence does not involve digital computers at all but concerns the possibility of producing artificial models of brain process, using special fabrication techniques to emulate neurons, axons and synapses. Which brings me to my second definition: “Artificial intelligence will not be implemented on digital computers”
It turns out that the use of a digital clocked device, irrespective of speed, imposes a fatal limitation on the sophistication of a genuine search for artificial intelligence, and parallel processing makes no difference. The reason is that in the brain it is the dynamic connections between at least 10 to the power 10 asynchronous independent processing elements that provide the kind of complexity that is required. The processors in the brain are called neurons and the interconnections are called axons, and there is substantial evidence to indicate that to a first approximation, almost any neuron can grow an axon to connect to almost any other neuron.
Now the number of connections between 'N' elements increases in proportion to the square of 'N'. If you have three neurons you can connect A to B, A to C, B to C, and the same in the other direction, which is valid since it corresponds to a negative signal (inhibition vs. excitation). Consequently we have 6 possibilities for the network topology: i.e. 3x2. For 4 elements we have 12 options: 4x3. If we had a mere 100 neurons the result of 100x99 is still only about 10,000. However, in the brain we are considering about 10,000,000,000 neurons. So the number of possible connections is 100,000,000,000,000,000,000 — a number of literally astronomic proportions. We haven't yet introduced the complexity of neurons which connect to more than one other neuron (the majority do), nor signal processing within the neurons themselves (estimated as a seventh order differential equation), nor chemical effects at the synapses, which are retransmission points along the axons. When you drink too much alcohol, it is primarily the synaptic effects which alter your responses. I am unconvinced that there yet exists an artificially intelligent program capable of intoxication.
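The arithmetic above is easy to check. The following is a minimal Python sketch, not anything from the sources cited; the function name is hypothetical, and it simply counts one-way links as N x (N-1), which is the counting rule used in the paragraph above.

```python
def possible_connections(n: int) -> int:
    """Count one-way links among n elements: each can reach the n - 1 others."""
    return n * (n - 1)

# The cases worked through in the text: 3, 4 and 100 neurons, then 10 to the power 10.
for n in (3, 4, 100, 10**10):
    print(n, possible_connections(n))
```

For 10 to the power 10 neurons the exact product falls just short of 10 to the power 20, the round figure quoted above.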
If you now consider the implications of asynchronous (but not parallel) operations taking place between arbitrarily connectable elements on this scale, you begin to see that even a billion 64-bit microprocessors would make no difference at all. The emphasis in the brain is on analogue circuits in which feedback and frequency of signal are the determining factor and where there is no single location at which processing can be said to be taking place; the whole of the brain is in this sense intelligent.
When I asked the Alvey representative at a recent meeting of the Expert Systems Specialist Group (yes, Sir Clive was there too) why there was so little emphasis on neurophysiology in the Alvey program (I was uncharacteristically diplomatic as in fact there is none at all) his response was that the program concerned the “hard” sciences and engineering: the bridge to biology was too narrow for engineers to cross. I believe he was genuinely rueful. However, if your appetite is in any way whetted for “real” artificial intelligence, you had better go out and buy two rather extraordinary books. One of them is, of course, Hofstadter's “Gödel, Escher, Bach: An Eternal Golden Braid” (2). The other is less well-known: William T. Powers's “Behaviour: The Control of Perception” (3).
In Hofstadter's book the notion of recursion is explored in many delightful ways. For those who have missed it, here is a sampler. Our heroes, Achilles and the Tortoise, have found a lamp. They rub it and a Genie appears. The Genie will grant three wishes. This prospect is greeted with glee by Achilles, who has read the 'Arabian Nights' and knows the form. He makes his first wish:
Achilles: “I wish that I had 100 wishes instead of just three.”
Genie: “I am sorry, Achilles, but I don't grant meta-wishes.”
Achilles: “I wish you'd tell me what a 'meta-wish' is.”
Genie: “But that is a meta-meta-wish, Achilles, and I don't grant them either.”
But after some pleading, the Genie agrees to try (meta-wishes are his favorite wishes too) and from inside his cloak he produces a lamp. He rubs it and out pops a MetaGenie:
Genie: “I wish for permission for temporary suspension of all type-restrictions on wishes, for the duration of one typeless wish.”
MetaGenie: “Half a moment please (produces a lamp, rubs it, etc.).”
This continues for a short infinite period until we get to the top-level MetaGenie, also known as GOD, who grants permission for the typeless wish:
Achilles: “I wish my wish would not be granted!”
(There follows a SYSTEM CRASH: or as Hofstadter puts it, “the land of dead hiccoughs and extinguished lightbulbs”).
Earlier, when quizzed on what the word GOD means, the Genie explains that it's an acronym, G.O.D., standing for “GOD OVER DJINN” (Djinn being a generic type of intermediate Eastern Deity). Does this help? Ah yes, because we can expand it:
GOD = GOD OVER DJINN
= (GOD OVER DJINN) OVER DJINN
= ((GOD OVER DJINN) OVER DJINN) OVER DJINN …………
If this is starting to look familiar, it's because it is. A deep knowledge of the relation of a system to its metasystem and of the closely connected notion of recursion is an essential pre-requisite for serious work in artificial intelligence.
In fact I suspect that: “Artificial intelligence = recursion + superficial incompetence”
What about the other book I referred to: “Behaviour: The Control of Perception” (or as they write it in the USA, “Behavior: The Control of Perception”)? In my estimation this book ranks with Norbert Wiener's “Cybernetics” (4) and Stafford Beer's “Brain of the Firm” (5): together they form the twentieth century triumvirate in this field. Powers starts off with some deceptively simple experiments and ends up with a necessary but not sufficient model of human intelligence based on first principles and an appeal to reason. The argument is convincing and is closer than any other I have read to the true core of artificial intelligence. Significantly, I don't recall in the book a single serious mention of digital computer programming.
Powers explains that the basic circuit elements in organic brains are not digital at all. They are analogue, but they use frequency not voltage as the yardstick. If 10 neural impulses per second flow in an axon, that may cause (at the lowest level) a certain amount of contraction to be applied to a single muscle fiber. 20 impulses per second might double the contraction force. However, the key point is the perception of all this. Feedback from local sensors is compared against a reference level for the desired state, and the output signal is adjusted accordingly. This compensates for unexpected resistance to the muscle's work — and if you think about it, you've never lifted a weight by applying linear pressure in your life. Literally millions of low-level frequency sensitive feedback circuits create what Powers calls Level 1 — the interface to the outside world at the sensory/motor level.
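The Level 1 loop described above (compare what is sensed against a reference level, then adjust the output) can be illustrated with a small Python sketch. This is an illustration only, not Powers's model: the signals are plain numbers rather than impulse frequencies, and the integrating update, the gain of 0.5 and all names are assumptions made for the example.

```python
def control_loop(reference, disturbance, gain=0.5, steps=50):
    """Drive the perceived value towards the reference despite a disturbance.

    Assumed toy environment: perceived = output + disturbance.
    """
    output = 0.0
    perceived = output + disturbance
    for _ in range(steps):
        error = reference - perceived      # compare perception with the reference level
        output += gain * error             # adjust the output signal accordingly
        perceived = output + disturbance   # the sensor reports the new state
    return output, perceived

# A reference of 10 with an unexpected resistance of -3 acting on the result.
output, perceived = control_loop(reference=10.0, disturbance=-3.0)
print(output, perceived)
```

The loop never needs to know the size of the disturbance; it only watches its own perception, which is the point of the paragraph above.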
Then we start to encounter hierarchy. Where do the reference levels for the desired states come from? Well, these are the outputs from Level 2, but this time they correspond to a higher-level notion: like that of vertical vs. horizontal movement. By positing a hierarchy of feedback circuits whose outputs become the reference levels (i.e. control signals) for lower level circuits, Powers will lead you to agree that when you are walking at 2 miles per hour, there is an identifiable, traceable neural frequency in your brain which doubles in frequency when you walk at 4 mph. And if you find that hard to swallow, by the time you reach higher levels of hierarchy we are talking about frequencies which correspond to degrees of emotion. Of hate. Of love. To sleep, perchance to dream… “Do Androids Dream of Electric Sheep?” (6).
I suspect I have strayed somewhat from the title of this article. But I do think it's important to set the context. If you want some fun with computers, start playing with expert systems. If you want to learn something useful about artificial intelligence, start reading some books. The two are not the same and I personally doubt that insights from one much help in the assimilation of the other.
Perhaps we should focus on the main theme. This gives me a small problem since the arbitrary reader of this article exists in four configurations: you know about APL (or you don't) and you know about expert systems (or you don't). Here I am going to pitch the discussion slightly in favor of someone who knows a bit about APL but nothing about expert systems, although I hope there will be some comfort for the others here and there.
Let's recap on APL's strengths in this area. We have a language which supports recursion with no limits apart from workspace size. Local variables and arguments to recursive functions are properly masked and pushed to a stack on each recursive invocation. Logical operations in APL are generally implemented at data word bit level and are very comprehensive and very fast. Arrays of arbitrary dimension are supported and most primitive operations apply element-wise without looping. APL is a threaded interpretive language in which individual building blocks are defined and can be invoked in a free-standing manner (like LISP). If LISP suffers from parentheses, APL suffers from silly specialist symbols, but fortunately this last aspect is starting to disappear in the more enlightened implementations. APL is very good indeed at numeric computation of high complexity in both the scientific and the commercial domains, and in addition it is fast and elegant at string handling.
Standard APL disallows non-orthogonal arrays and those of mixed character and numeric type. This is undoubtedly a disadvantage for expert systems work but is overcome in APL2. However, I haven't used APL2 seriously yet, like most other APLers I suspect, and shall confine my remarks to traditional (or as my American friends would put it, “vanilla”) APL. So the question becomes: “What has PROLOG got that APL hasn't?”
The answer, surprisingly to APL aficionados, is: quite a lot. However, perhaps surprisingly to PROLOG programmers, it turns out that one can code all the standard PROLOG backtracking and similar logic capability into APL user-defined functions (i.e. into a subroutine library). You then end up with an APL workspace oriented towards the development of expert systems, but with the great advantage of carrying with it all the “hooks” and existing facilities which have been developed in APL over the last 20 years.
Evidently PROLOG users don't mind that they have no proper filing, no error control, no serious peripheral control, no multi-process synchronization, no inter-user communication, no user or file security, no computational intrinsics beyond add, subtract, divide and multiply, no interpreter available on IBM mainframes, no links to COBOL and other languages, no screen control, no graphics capability, no links to devices like “mice” … It makes it fun to develop expert systems in PROLOG, but awfully hard to sell them. So far I keep seeing demonstrations of PROLOG expert systems telling me how to water my houseplants, but the florist down the road isn't selling the package yet.
Let's look at the framework of an expert system in APL terms. This path has been trodden before (7) but I'd like to look at the problem in a slightly different way. To make it more practical, let me tell you about an expert system I have been writing and playing with. It's called GENESIS 4. This sounds like a reasonable name for a package: GENESIS Release 4.0; but actually this is to do with the Bible (8). Not that I am particularly religious by nature; it's just that the Bible predates “Behaviour: The Control of Perception” as a functional specification of humanity by about 5000 years, so it's of interest to all artificial intelligence buffs.
All expert systems have three chunks.
The SITUATION MODEL is a whole load of information about (in this case) Adam's descendants, their marriages, their progeny and the main events in their lives.
The KNOWLEDGE BASE is a set of rules which indicate the laws of genealogy. For example, “A is offspring of B if B begat A” and so on.
The SYSTEM MANAGER is a set of programs that allows you to explore the situation model, ask it questions, and ask it to draw conclusions: for example, “Is Z descended from A?”. Clever system managers sometimes try to propose rules by looking for patterns in large amounts of data in a situation model.
That's all.
What I will do now is re-write parts of Genesis Chapter 4 as if I were entering lines into an APL-based expert system, trying to keep it as much like PROLOG as possible. The numbers in the margins are the verses.
1. “And Adam knew Eve his wife; and she conceived, and bare Cain, and said 'I have gotten a man from the Lord'.”
ADAM IS-MARRIED-TO EVE
CAIN IS-OFFSPRING-OF ADAM
CAIN IS-OFFSPRING-OF EVE
CAIN IS-SEX MALE
2. “And she again bare his brother Abel.”
ABEL IS-OFFSPRING-OF EVE
ABEL IS-SEX MALE
8. “Cain rose up against Abel his brother, and slew him.”
CAIN KILLS ABEL
16. “Cain went out from the presence of the Lord, and dwelt in the land of Nod, on the east of Eden.”
CAIN INHABITS NOD
NOD IS-EAST-OF EDEN
17. “And Cain knew his wife, and she conceived, and bare Enoch: and he builded a city, and called the name of the city, after the name of his son, Enoch. And unto Enoch was born Irad: and Irad begat Mehujael: and Mehujael begat Methusael: and Methusael begat Lamech.”
CAIN IS-MARRIED-TO MRS.X
ENOCH IS-OFFSPRING-OF CAIN
ENOCH IS-OFFSPRING-OF MRS.X
CAIN BUILDS-CITY-CALLED ENOCH
IRAD IS-OFFSPRING-OF ENOCH
MEHUJAEL IS-OFFSPRING-OF IRAD
METHUSAEL IS-OFFSPRING-OF MEHUJAEL
LAMECH IS-OFFSPRING-OF METHUSAEL
There's a lot more of this sort of thing. I know that the Bible is now available in compressed text form on floppy discs as “THE WORD Processor” (9) but I'll wager that the logical inferences remain unencoded.
What can we do with APL so far? Well, we need to encode this information into APL data structures. There are many ways of doing this, of course. An obvious approach is to write a little parser which takes statements like LAMECH IS-OFFSPRING-OF METHUSAEL and divides them into their different text strings (3 in this case). We start building a text matrix with a unique instance of each name. When a statement comes in, we look up the row number of each string or we add the new string to the end of the matrix. So the sequence:
MEHUJAEL IS-OFFSPRING-OF IRAD
METHUSAEL IS-OFFSPRING-OF MEHUJAEL
LAMECH IS-OFFSPRING-OF METHUSAEL
would be represented by these two matrices:
MEHUJAEL           1  2  3
IS-OFFSPRING-OF    4  2  1
IRAD               5  2  4
METHUSAEL
LAMECH
Actually it's slightly trickier than this as we shall need a flag bit to indicate which entry is a relationship and which is a noun, and they aren't always binary relationships, and the arguments aren't always on each side. But the basic idea's the same. A simple parsing function in APL is about 5 lines of code, and if we call it ADD it really does look like PROLOG:
ADD 'LAMECH IS-OFFSPRING-OF METHUSAEL'
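For readers without an APL system to hand, the bookkeeping just described can be sketched in Python. This is not the XPL function itself (which is not listed in this article): the names `names`, `facts`, `index_of` and `add` are hypothetical, and the flag bit for relationships versus nouns is left out.

```python
names = []   # one entry per unique string, standing in for the text matrix
facts = []   # one row of row numbers per statement, standing in for the integer matrix

def index_of(word):
    """Return the 1-based row of word in names, appending it first if it is new."""
    if word not in names:
        names.append(word)
    return names.index(word) + 1

def add(statement):
    """Split a statement into its strings and store their row numbers."""
    facts.append(tuple(index_of(word) for word in statement.split()))

add('MEHUJAEL IS-OFFSPRING-OF IRAD')
add('METHUSAEL IS-OFFSPRING-OF MEHUJAEL')
add('LAMECH IS-OFFSPRING-OF METHUSAEL')
print(names)
print(facts)
```

Run against the three statements above, this reproduces the two matrices shown earlier.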
So far, so trivial, but we've done something epoch-making: we've defined our first function in XPL, which is the name I have chosen for this set of functions, written in APL but using the ASCII character set throughout. Now we want to interrogate the database:
“Is Lamech the offspring of Methusael?”
DOES 'LAMECH IS-OFFSPRING-OF METHUSAEL'
TRUE
“Is Irad the offspring of Mehujael?”
DOES 'IRAD IS-OFFSPRING-OF MEHUJAEL'
FALSE
Funny syntax, isn't it? I'm trying to stick to PROLOG though, and this is it. Now is it mindbogglingly hard to code this in XPL? Not really: function DOES looks to be at least one line long: it has to find a particular row of integers in the matrix. Of course things get tougher in PROLOG quite quickly:
“Who is the offspring of Methusael?”
WHICH 'X:X IS-OFFSPRING-OF METHUSAEL'
LAMECH
“Who is the offspring of Lamech?”
WHICH 'X:X IS-OFFSPRING-OF LAMECH'
NO MATCH
This time not only do we have to find a particular row of integers in the numeric matrix, we also have to work out who X refers to and find him in the list of names. Fiendish stuff. Still, I think we can just about cope. Let's move onto the knowledge base. Here's some useful jargon. You and I are “knowledge engineers” in the context of our translation of the Bible into XPL format. Priests are “domain specialists”: Heaven and Hell is their domain. To this I would add that people who create expert systems languages, as we are doing at this moment, are presumably “metadomain specialists” (I believe this is an original coinage).
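The two lookups described above, DOES and WHICH, can be sketched in Python along the following lines. All names are hypothetical, the use of 0 for the unknown and -1 for an unrecognised word are conventions invented for this sketch, and only plain facts are handled (no rules yet).

```python
names, facts = [], []

def add(statement):
    """Store a statement as a row of 1-based indices into names."""
    row = []
    for word in statement.split():
        if word not in names:
            names.append(word)
        row.append(names.index(word) + 1)
    facts.append(tuple(row))

def encode(word):
    """Row number of a known word, or -1 (which matches nothing) if unknown."""
    return names.index(word) + 1 if word in names else -1

def does(statement):
    """True if this exact row of integers is present in the fact matrix."""
    return tuple(encode(word) for word in statement.split()) in facts

def which(query):
    """Answer 'X:pattern' by finding rows that match with X left free."""
    marker, pattern = query.split(':', 1)
    wanted = [0 if word == marker else encode(word) for word in pattern.split()]
    answers = []
    for fact in facts:
        if len(fact) == len(wanted) and all(w in (0, f) for w, f in zip(wanted, fact)):
            # Translate the matched row number back into a name.
            answers += [names[f - 1] for w, f in zip(wanted, fact) if w == 0]
    return answers or ['NO MATCH']

add('MEHUJAEL IS-OFFSPRING-OF IRAD')
add('METHUSAEL IS-OFFSPRING-OF MEHUJAEL')
add('LAMECH IS-OFFSPRING-OF METHUSAEL')
print(does('LAMECH IS-OFFSPRING-OF METHUSAEL'))
print(which('X:X IS-OFFSPRING-OF METHUSAEL'))
```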
It gets boring telling XPL everything twice. For example:
ADD 'LAMECH IS-OFFSPRING-OF METHUSAEL'
ADD 'METHUSAEL IS-PARENT-OF LAMECH'
How about defining a rule instead?
ADD 'X IS-PARENT-OF Y IF Y IS-OFFSPRING-OF X'
and while we're about it:
ADD 'X IS-OFFSPRING-OF Y IF Y IS-PARENT-OF X'
Not too hard to handle either is:
ADD 'X IS-HUSBAND-OF Y IF X IS-MARRIED-TO Y AND X IS-SEX MALE'
Now try:
ADD 'X IS-GRANDCHILD-OF Y IF X IS-OFFSPRING-OF Z AND Z IS-OFFSPRING-OF Y'
We've introduced the concept of “place markers”, which are analogous to the local names of the arguments of functions, except that the functions are relations (data) rather than programs. There are many ways to incorporate these into XPL and your own ideas will undoubtedly be better than mine: I use negative numbers in the integer matrix to denote place markers:
MEHUJAEL            1  2  3
IS-OFFSPRING-OF     4  2  1
IRAD                5  2  4
METHUSAEL          -1  6 -2
LAMECH             -1  2 -3
IS-GRANDCHILD-OF   -3  2 -2
The functions DOES and WHICH now become rather more interesting. Suppose we ask the question:
DOES 'ENOCH IS-GRANDCHILD-OF ADAM'
DOES now has to go scouting around to see if this is a bit of known data. If not, it goes to see if it has a rule-based definition for the grandchild relation. If it does, it has to find the first match for Enoch based on the offspring relation, and then pause while it tests whether this match in turn relates to Adam. If not, it can resume looking for further matches for Enoch. It goes without saying that an XPL function employing recursion is the elegant (possibly the only) way to implement this process. It isn't very difficult to do.
Here's a more interesting rule:
ADD 'X IS-DESCENDANT-OF Y IF X IS-CHILD-OF Y'
ADD 'X IS-DESCENDANT-OF Y IF X IS-CHILD-OF Z AND Z IS-DESCENDANT-OF Y'
Recursion groupies will see the classic syntax: the first line denotes the limiting condition, the second line forms the recursive relation. The extensive use of recursive tests based on goal-seeking requests from the user is termed “backtracking” for obvious reasons. The standard mechanism of the state indicator in APL handles such requests beautifully.
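The goal-seeking process described above can be sketched as a small recursive solver. The version below is in Python rather than APL, keeps strings instead of row numbers for readability, and treats single-letter words as place markers; all of these choices, and every function name, are assumptions of the sketch rather than a description of XPL. Python's generators stand in for the state indicator: each suspended generator is a point the search can back up to.

```python
from itertools import count

facts, rules, fresh = [], [], count(1)

def clause(text):
    """Split a clause into terms; single letters become place markers like '?X'."""
    return tuple('?' + w if len(w) == 1 and w.isalpha() else w for w in text.split())

def add(statement):
    """Store either a plain fact or a rule of the form 'head IF a AND b'."""
    if ' IF ' in statement:
        head, body = statement.split(' IF ')
        rules.append((clause(head), [clause(c) for c in body.split(' AND ')]))
    else:
        facts.append(clause(statement))

def walk(term, env):
    """Follow place-marker bindings until a name or an unbound marker is reached."""
    while term.startswith('?') and term in env:
        term = env[term]
    return term

def unify(a, b, env):
    """Return env extended so that clauses a and b agree, or None if they cannot."""
    if len(a) != len(b):
        return None
    for x, y in zip(a, b):
        x, y = walk(x, env), walk(y, env)
        if x == y:
            continue
        if x.startswith('?'):
            env = {**env, x: y}
        elif y.startswith('?'):
            env = {**env, y: x}
        else:
            return None
    return env

def solve(goals, env):
    """Yield every binding that satisfies all goals, trying facts and then rules."""
    if not goals:
        yield env
        return
    first, rest = goals[0], goals[1:]
    for fact in facts:
        bound = unify(first, fact, env)
        if bound is not None:
            yield from solve(rest, bound)
    for head, body in rules:
        tag = '_%d' % next(fresh)   # fresh marker names for each use of a rule
        rename = lambda c: tuple(t + tag if t.startswith('?') else t for t in c)
        bound = unify(first, rename(head), env)
        if bound is not None:
            yield from solve([rename(c) for c in body] + rest, bound)

def does(statement):
    return next(solve([clause(statement)], {}), None) is not None

add('CAIN IS-CHILD-OF ADAM')
add('ENOCH IS-CHILD-OF CAIN')
add('IRAD IS-CHILD-OF ENOCH')
add('X IS-DESCENDANT-OF Y IF X IS-CHILD-OF Y')
add('X IS-DESCENDANT-OF Y IF X IS-CHILD-OF Z AND Z IS-DESCENDANT-OF Y')
print(does('IRAD IS-DESCENDANT-OF ADAM'))
```

A sketch this simple has no loop detection, so a pair of rules that define each other (such as the IS-PARENT-OF and IS-OFFSPRING-OF pair earlier) would send it round in circles on a failing query.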
I hope that those with some APL experience can see that it is not inordinately tricky to implement these aspects of PROLOG in user-defined APL functions. After a while the imagination starts running riot:
ADD 'X PROBABLY IS-CHILD-OF Y IF X IS-CHILD-OF Z AND Y IS-MARRIED-TO Z'
I suppose there's always the milkman. Adverbs like DEFINITELY, PROBABLY, POSSIBLY, CONCEIVABLY (oops) and INCONCEIVABLY can conveniently carry probability values like 1.0, 0.75, 0.5, 0.25 and 0; if you want to change the climate of expert opinion you can tinker with the values. And don't forget NOT:
ADD 'X IS-SEX FEMALE IF X NOT IS-SEX MALE'
Well, we don't all live in San Francisco.
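The adverb weights mentioned above lend themselves to a simple lookup table. The Python sketch below uses the five values quoted in the text; the function name, and the choice to treat a clause with no adverb as certain (1.0), are assumptions made for illustration.

```python
# Probability values as quoted in the text; tinker with them to change expert opinion.
CERTAINTY = {
    'DEFINITELY': 1.0,
    'PROBABLY': 0.75,
    'POSSIBLY': 0.5,
    'CONCEIVABLY': 0.25,
    'INCONCEIVABLY': 0.0,
}

def split_adverb(clause):
    """Separate an adverb from a clause, returning (clause without adverb, weight)."""
    weight = 1.0   # assumption: no adverb means the statement is taken as certain
    kept = []
    for word in clause.split():
        if word in CERTAINTY:
            weight = CERTAINTY[word]
        else:
            kept.append(word)
    return ' '.join(kept), weight

print(split_adverb('X PROBABLY IS-CHILD-OF Y'))
```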
What about expert systems themselves? If you recall the three elements (the situation model, the knowledge base and the system manager), then it turns out in languages like PROLOG that the preceding constructs are used on an intermixed basis to represent data in the situation model and rules in the knowledge base. In XPL one feels it may be cleaner to keep them separate, but that's a matter for personal taste.
More importantly, most PROLOG users don't seem to have caught on to the fact that much of the data in the situation model may well be a transform of, for example, data about patients in a medical file whose records have been painstakingly composed over many years. File access to other systems is imperative; most of the facts are already there.
Furthermore, although PROLOG is good at relations, it's very bad at numerical tests. An expert system for shop floor control needs to cope with quantified tests based on quasi-real-time input from process sensors: for example:
ADD 'TURBINE SWITCH-SETTING OFF IF RPM ABOVE 5000'
is as far as I know impossible to handle in PROLOG if RPM is intended to be directly linked to the real world, whereas in APL one simply makes RPM a variable shared with an auxiliary processor which handles direct data capture. For non-APLers this means: variable RPM in the APL workspace is dynamically coupled via programmer-specifiable update protocol with an arbitrary machine code segment, which is itself either handling process input/output or is talking to some other process that is. The point is that a robust and very well established framework for this kind of dynamic linking already exists in APL.
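The idea of a rule whose condition reads a live value can be mimicked outside APL. In the Python sketch below a callable stands in for the shared variable; the two canned readings, the BELOW keyword and all function names are hypothetical, and no real data capture is involved.

```python
import operator

# ABOVE comes from the rule in the text; BELOW is an assumed companion keyword.
COMPARISONS = {'ABOVE': operator.gt, 'BELOW': operator.lt}

def make_rule(text, sensors):
    """Parse 'conclusion IF sensor comparison number' into a function to poll."""
    conclusion, condition = text.split(' IF ')
    sensor, comparison, threshold = condition.split()
    test = COMPARISONS[comparison]

    def fire():
        # Read the sensor afresh on every call, as a shared variable would be.
        if test(sensors[sensor](), float(threshold)):
            return tuple(conclusion.split())
        return None

    return fire

readings = iter([4200.0, 5300.0])            # stand-in for successive sensor values
sensors = {'RPM': lambda: next(readings)}
rule = make_rule('TURBINE SWITCH-SETTING OFF IF RPM ABOVE 5000', sensors)
first = rule()    # reading of 4200: the rule stays silent
second = rule()   # reading of 5300: the rule concludes the turbine should be off
print(first, second)
```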
I hope I have managed to whet two kinds of appetite. If you know APL already, you may be starting to think about building your own XPL functions (mine are for private consumption and you've had numerous hints already) and writing your own expert system. If you know about expert systems and are toiling with inadequate tools, you might just think about looking at APL instead. An unlikely marriage? I don't think so. No more than Cain's, anyway; the identity of Mrs X is the theological riddle of the millennium. Good luck with such experiments: may thy domain specialties be brought forth and multiply upon the interface of the digital computer.