Could a Machine Fix Errors in its own Rationality?
This question was one I chose to tackle during my University degree, as part of a Philosophy of Cognitive Science module. The assessed component involved devising a suitable question to use as the subject of an essay. Rather than trying to answer the question, students were required to propose six possible theories which might apply.
compton, 7 November 05
Updated 26 November 07
1. Rationality is a Set of Rules, Fixed at the Time of Manufacture
Let us suppose that when our robot, which we shall call Plato, is manufactured, a set of rules is hard-wired into its memory. Faced with any situation, these rules designate certain elements as meaningful and others as irrelevant. The rules make predictions about on-going and future events, and enable Plato to determine the most successful course of action at any time.
Imagine that Plato is trapped in a room and wishes to escape. Its sensors report four rectangular areas on the walls which match its definition of a door, and its rationality rules determine that to escape the room it must go through a door. It picks a rectangle and proceeds through it, finding itself in the new situation of being in free fall towards the ground 30 feet below. A human observer would know that it has just jumped through a window, but Plato's rationality is concerned with the new problem of landing successfully, and can give no thought to the success or otherwise of its prior decisions. Faced with the same situation again, it would be just as likely to make a similar gaffe (assuming, of course, that it survived the first time).
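By way of illustration, a fixed rule set of this kind might look something like the sketch below. Everything here is my own invention for the sake of the example (the Rectangle class, the size thresholds, the function names); the point is only that a hard-wired definition of 'door' has no way of noticing that it also matches windows.

```python
from dataclasses import dataclass

@dataclass
class Rectangle:
    """A rectangular region reported by Plato's sensors (hypothetical)."""
    width_m: float
    height_m: float
    bottom_edge_above_floor_m: float

def is_door(region: Rectangle) -> bool:
    """Plato's hard-wired 'door' rule: anything roughly door-sized counts.

    The rule is fixed at manufacture and cannot be revised from experience.
    """
    return 0.6 <= region.width_m <= 1.2 and 1.5 <= region.height_m <= 2.5

def choose_exit(regions: list[Rectangle]) -> Rectangle | None:
    """Pick the first region the fixed rules classify as a door."""
    for region in regions:
        if is_door(region):
            return region
    return None

# A real door and a large window: both satisfy the hard-wired rule,
# because the rule never considers what lies on the other side.
door = Rectangle(width_m=0.9, height_m=2.0, bottom_edge_above_floor_m=0.0)
window = Rectangle(width_m=1.0, height_m=1.8, bottom_edge_above_floor_m=0.9)

print(is_door(door), is_door(window))  # True True -- the gaffe is baked in
```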
Plato is unable to perceive its own flawed reasoning. The only possibility for correcting it lies with an external observer, and the hope of fitting some kind of new module with a better description of a door.
2. An Essential Part of Rationality is an Understanding of the Fallibility of any Rational System
The rationality is the part of an intelligent system which, when given a new situation, identifies familiar parts of that situation by analogy and metaphor with previous experience and facts known about the world. The rationality must indicate which of the various possible actions are most likely to lead to success, where the nature of success depends on the current goals and the type of situation faced.
Once the situation has unfolded, any system with true intelligence must gauge the success or otherwise of its actions. To be of any use, part of this ‘post-mortem’ process must involve appraising the rules and analogies which the rationality produced. The rules might be modified based on their perceived success, or new rules created and added to the existing set. Thus, the next time a similar situation is confronted, previous experience feeds into the process of forming the new judgement.
Rationality can thus be thought of as an iterative, self-refining software process, which means that over time it should become more and more reliable as a means of understanding and interacting with the world. Flaws may never be completely removed, but the rationality tends ever closer to a “flawless ideal” with continued use.
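A minimal sketch of this post-mortem loop might look as follows. The weighting scheme and rule names are assumptions of mine, not part of any real architecture; the point is the cycle of act, appraise, revise.

```python
import random

# Each rule maps a situation to an action and carries a confidence weight
# that the post-mortem step revises (a hypothetical scheme, for illustration).
rules = {
    "go_through_nearest_rectangle": 0.5,
    "probe_rectangle_before_entering": 0.5,
}

def choose_action(rules: dict[str, float]) -> str:
    """Pick the rule currently judged most likely to lead to success."""
    return max(rules, key=rules.get)

def post_mortem(rules: dict[str, float], action: str, succeeded: bool) -> None:
    """Appraise the rule that produced the action and nudge its weight."""
    delta = 0.1 if succeeded else -0.1
    rules[action] = min(1.0, max(0.0, rules[action] + delta))

# Iterate: act, observe the outcome, revise the rules, repeat.
for episode in range(20):
    action = choose_action(rules)
    # Stand-in for the world: probing first usually works, blind exits often fail.
    succeeded = random.random() < (0.9 if action == "probe_rectangle_before_entering" else 0.3)
    post_mortem(rules, action, succeeded)

print(rules)  # over time the more reliable rule tends to accumulate weight
```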
A drawback of this solution is that it raises the new problem of determining the degree of success of past actions, and thus the changes that should be applied to the rationality. This process is itself one of rational criticism, and so it follows that continual improvement cannot be guaranteed. In fact, any self-correcting software system such as this runs the risk of introducing a far more serious and possibly irreparable bug into its program than the one it was trying to fix.
This issue could be mitigated, if not removed, by ‘marking’ new rules as untested, or perhaps by including some rules which are ‘hard-wired’ and may not be changed, such as a requirement that any action must lead to a situation in which the robot still exists. If such hard-wired rules are not included, then it seems one cost would be the risk of the robot’s rationality diverging from the ideal: it ‘goes mad’.
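One way of picturing these safeguards is given below. The Rule structure and its fields are hypothetical, chosen only to show how ‘untested’ markers and immutable hard-wired rules might sit alongside the self-revising ones.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    """A rationality rule with the safeguards suggested above (hypothetical design)."""
    name: str
    weight: float
    tested: bool = False       # newly created rules start as 'untested'
    hard_wired: bool = False   # hard-wired rules may never be modified

def revise(rule: Rule, delta: float) -> Rule:
    """Apply a post-mortem adjustment, refusing to touch hard-wired rules."""
    if rule.hard_wired:
        return rule  # e.g. 'any action must leave the robot still existing'
    rule.weight = min(1.0, max(0.0, rule.weight + delta))
    rule.tested = True
    return rule

survival = Rule("action_must_preserve_robot", weight=1.0, hard_wired=True)
new_idea = Rule("treat_high_rectangles_as_windows", weight=0.5)  # untested

revise(survival, -0.5)  # ignored: the safeguard cannot be weakened
revise(new_idea, +0.1)  # allowed, and the rule is now marked as tested
print(survival.weight, new_idea.weight, new_idea.tested)  # 1.0 0.6 True
```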
As the rationality determines not only how the system should react to a situation, but also which elements of the situation should be focussed on, a faulty rationality is likely to ignore key elements of some situations.
We could enhance this self-refining software model in an attempt to avoid such blind spots, by having Plato focus on random elements of a situation in addition to those indicated by its rationality.
Clearly, however, this is unlikely to be a very effective learning strategy, given the huge number of elements that make up real-world situations.
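The difficulty is easy to see in a toy sketch (again, every name and number here is illustrative rather than drawn from any real system): the randomly sampled elements are a vanishing fraction of what a real situation contains, so the crucial one is almost never among them.

```python
import random

def attend(elements: list[str], selected_by_rules: set[str], n_random: int = 2) -> set[str]:
    """Focus on the elements the rules pick out, plus a few chosen at random.

    The random picks are a crude guard against blind spots: elements the
    faulty rationality would otherwise never look at.
    """
    focus = set(selected_by_rules)
    remainder = [e for e in elements if e not in focus]
    focus.update(random.sample(remainder, min(n_random, len(remainder))))
    return focus

# A toy situation: the rules attend to the rectangles, but the drop beyond the
# window is one element among a great many, so random sampling rarely finds it.
situation = [f"element_{i}" for i in range(1000)] + ["thirty_foot_drop_outside"]
print(attend(situation, selected_by_rules={"element_3", "element_7"}))
```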