On Computer Transcription of Manual Morse

Author:

Charles R. Blair [1]

Affiliation:

1. Department of Defense, Washington, D.C.

Abstract

A radio telegrapher can, by operating a key, turn a transmitter on or off for any desired period. International radio-telegraph (Morse) code is predicated on controlling these two parameters: key position and signal duration. The messages to be sent are represented by a sequence of five elementary symbols: the dot, dash, intra-character space, inter-character space, and inter-word space. In terms of an arbitrary time unit determined by the sender,[1] a dot results from closing the key for one time unit and a dash for three. An intra-character space results from opening the key for one unit, an inter-character space for three, and an inter-word space for seven time units (see table 1). All letters, numerals, punctuation marks, and several brevity codes are represented by sequences of dots, dashes, and intra-character spaces between successive occurrences of inter-character or inter-word spaces. The sequences used in international radio-telegraph are shown in table 2. The letter "A", for example, is represented by an inter-character space (shared with the preceding character) followed by a dot, an intra-character space, a dash, and an inter-character space (shared with the succeeding character). Thus any message that can be transmitted as a sequence of characters can be broken down into a sequence of the five primary elements for transmission by radio-telegraph.

If manual Morse senders transmitted accurately, it would be easy to construct equipment to copy them. Indeed, just such devices have been constructed for signals sent by machine. Most human operators, however, are unable to control the duration of elements with sufficient precision to be copied by these devices. An idea of the variability of durations formed by a manual Morse operator can be gained from figure 1. Although all of these patterns are a single operator's sending of the same symbol (a question mark), a great deal of variation is apparent in their formation. Despite these variations, the basic structure of the symbol, that is, the · · — — · · pattern, is immediately evident to both the eye and the ear. This remains true, although the variation is greater, when symbols sent by different operators are compared. To make a machine transcribe Morse as accurately as a human being, we must find some means of conserving the basic information conveyed by the form of the pattern while eliminating the effects of the nonsignificant variations.

Many attempts have been made to construct a machine that will automatically transcribe hand-sent Morse into printed copy.[2] In the past, each new proposal has been tested by constructing an operating model of the device. The results of these efforts have been uniformly disappointing, with a resultant waste of time, talent, and equipment. At the outset of our research in this field, it seemed clear that we could learn a valuable lesson from our predecessors. First, we needed a flexible means of implementing translation techniques, to ensure that no great loss would accrue from the changes that would undoubtedly be necessary as we came to understand a technique's shortcomings. Second, previously constructed devices have suffered from faults that, owing to the nature of the equipment, could not definitely be assigned to either engineering or logic. We determined, therefore, to solve the problems of logic independently of those of engineering. These considerations led to the decision to simulate manual Morse transcription devices on a general-purpose digital computer.[3]
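The ideal timing relationships described above can be summarized in a few lines of code. The following is a minimal sketch (in Python, not part of the original paper; the table and function names are illustrative) that generates the ideal sequence of key states and unit durations for a short text, assuming the standard 1-3-1-3-7 unit proportions.

```python
# Ideal Morse timing model: dot = 1 unit closed, dash = 3 units closed,
# intra-character space = 1 unit open, inter-character space = 3 units open,
# inter-word space = 7 units open.

MORSE = {"A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".",
         "N": "-.", "O": "---", "S": "...", "T": "-"}  # abbreviated table

def ideal_keying(text):
    """Return a list of (key_closed, units) pairs for perfectly timed sending."""
    seq = []
    for wi, word in enumerate(text.upper().split()):
        if wi:
            seq.append((False, 7))              # inter-word space
        for ci, ch in enumerate(word):
            if ci:
                seq.append((False, 3))          # inter-character space
            for ei, el in enumerate(MORSE[ch]):
                if ei:
                    seq.append((False, 1))      # intra-character space
                seq.append((True, 1 if el == "." else 3))
    return seq

print(ideal_keying("AT ON"))
```

A manual sender approximates this sequence only loosely, which is the central difficulty addressed in the remainder of the paper.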
A really good manual Morse operator can transmit about 35 words per minute. Thirty-five words per minute corresponds to approximately 28 short elements per second (a short element is a dot or an intra-character space). At this rate, therefore, the shortest duration is about 36 milliseconds. A digital computer which requires 100 microseconds to perform its simple operations (add, etc.) is considered moderately slow; yet a computer which operates at this rate can perform 360 instructions during the shortest element at the highest speed one normally expects to encounter in manual Morse. Most Morse transmissions are substantially slower than 35 w.p.m., and many computers have substantially faster operation times than 100 microseconds. It seemed reasonable, therefore, to expect a computer to have the speed required to transcribe manual Morse.

In order to use a digital computer for transcribing Morse code, we had to devise a method of converting the keying information into a numerical form representing the two variables of key position and duration. To minimize programming complications, we decided to use an external device (a demodulator) to convert the audible signal into a facsimile of the original key operations. For laboratory-type signals (no noise, fading, etc.), there are demodulators able to perform this conversion accurately. The state of the key can be specified by a single binary digit; the duration can be designated by the number of time units[4] that the key has remained in a given position.[5] The values of these two parameters, arranged in time sequence, convey the information contained in a radio-telegraph signal.[6]

There are several ways of converting the original signal into numerical form. First we shall consider the method which requires a minimum of special-purpose equipment. Assuming the machine has a conditional jump contingent upon the position of an external switch (a facility available in nearly all digital computers), we need only attach the contacts of the telegraph key or demodulator relay in parallel with the contacts of this external switch. The program is written so that the elapsed time between successive conditional jumps, regardless of the path taken, is constant. The amount of time that the key remains in a given position is determined by "sampling" (performing the conditional jump) at these periodic intervals and counting the number of samples between successive changes of position.[7] For example, prior to the beginning of the message, the key is open. The operator closes the key to send the first element of the first character, and the next conditional jump detects that the key has been closed; this jump causes the program to add one to a previously cleared counter and, after a predetermined period, sample the key again. If the key is still closed, the program again adds one to the counter and proceeds as before. Several milliseconds later the operator opens the key to indicate the end of the first element of the first character of the message; the next conditional jump detects the change. At this time the accumulated count is the number of time units that the key remained closed; the program stores this value in the first position of the keying-time sequence, clears the counter, and resumes sampling to determine the duration of the succeeding key-open. The key-open duration is measured in precisely the same manner and stored as the next item in the list.
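The sampling-and-counting scheme can be sketched as follows (an illustrative Python rendering, not the paper's program; the conditional-jump hardware is replaced here by a pre-recorded iterable of equally spaced key samples).

```python
def durations_from_samples(samples):
    """Collapse equally spaced key samples (True = closed, False = open) into a
    list of (key_closed, count) pairs, i.e. the keying-time sequence."""
    seq = []
    prev, count = None, 0
    for key_closed in samples:          # one entry per sampling interval
        if key_closed == prev:
            count += 1                  # key unchanged: extend the current duration
        else:
            if prev is not None:
                seq.append((prev, count))
            prev, count = key_closed, 1 # key changed state: start a new count
    if prev is not None:
        seq.append((prev, count))
    return seq

# e.g. a dot, an intra-character space, then a dash (ideal 1:1:3 proportions)
print(durations_from_samples([True] * 3 + [False] * 3 + [True] * 9))
# -> [(True, 3), (False, 3), (True, 9)]
```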
The process continues until a very long key-open indicates the end of the transmission. Clearly, this procedure converts the keying information into a sequence of numbers indicating the successive states of the key and the duration of each state.[8] The necessity for periodic examination of the key becomes quite burdensome, however. In order to perform all the functions involved in transcribing a signal into printed copy, the program must have many alternative paths. If the machine is to sample the key periodically, every path that can be traced through these alternatives must require the same amount of time, so that the interval between successive samplings is constant. Although this can be and has been done, it places a very severe burden on the programmer. Recall that the computer must sample the key periodically only because it must maintain durations in terms of a fixed time interval. Programming is made far simpler by supplying an independent clock which is advanced periodically and which can be examined by the program to determine element durations. Of course, it is still necessary for the program to examine the key frequently. However, it is considerably less difficult to write a program that samples the key about every 10 milliseconds than one that must sample it exactly every 10 milliseconds. The addition of a program interrupt on a change of key state would simplify programming further.

The distinguishing, and crucial, characteristic of manual Morse transcription schemes is the method of determining which element is to be assigned to each duration. The sequence of durations is, in effect, a numerical representation of a set of approximations to the five basic elements. Assigning these approximate patterns (durations) to the "ideal" patterns (elements) which they represent is a one-dimensional pattern-recognition problem. Virtually every attempt to transcribe Morse has used a different procedure for assigning durations to elements. Even a cursory discussion of each of these would lengthen this paper considerably; we shall limit ourselves, therefore, to the class of discrimination techniques which includes our final proposal. This class of assignment processes assumes that durations associated with the same element tend to be approximately equal and that durations of different elements differ by an amount greater than the variation between durations representing the same element.

Figure 2 illustrates the characteristic distributions of Morse elements produced by a typical operator. The abscissa, calibrated in milliseconds, gives the range of durations observed in the transmission of a message. The ordinate represents the number of times that a given duration was observed. The values above the center line represent key-closed durations; those below are key-open durations. From this arrangement we see that, although the operator has not associated a unique time interval with each of the elements, he has managed to group the durations associated with each element into a cluster. In the key-closed range, for example, there is a cluster of values on the left representing dots and, on the right, another cluster representing dashes. In the key-open range there is a cluster, approximately under the dot distribution, representing intra-character spaces.
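The duration distributions of figure 2 can be tabulated directly from the keying-time sequence. The sketch below (illustrative only, not from the paper) separates the key-closed and key-open duration counts, which is the raw material for the discrimination procedure that follows.

```python
from collections import Counter

def duration_histograms(keying_sequence):
    """Tabulate key-closed and key-open duration counts (cf. figure 2)."""
    closed = Counter(d for state, d in keying_sequence if state)
    opened = Counter(d for state, d in keying_sequence if not state)
    return closed, opened

closed, opened = duration_histograms([(True, 3), (False, 3), (True, 9), (False, 21)])
print(closed)   # Counter({3: 1, 9: 1})
print(opened)   # Counter({3: 1, 21: 1})
```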
Under the dash distribution, a second cluster represents inter-character spaces; and farther to the right, a very poorly defined cluster represents inter-word spaces. To assign each duration to the correct one of the five basic elements, we must devise a method of determining which observations belong to each cluster. It is evident from the typical distribution in figure 2 that the separation between dots and dashes can be made with relative ease by taking a value lying between the two distributions as the dividing line. This dividing line will be called the discrimination value. Any key-closed element whose duration is less than the discrimination value will be called a dot; any other will be called a dash. Examination of the distributions furnished by a number of operators indicates that this example is typical; that is, there is almost always a clear-cut separation between the two distributions of key-closed elements.[9] The situation is quite different for the key-open distributions. The dividing line between intra- and inter-character spaces is difficult to ascertain, and the line between inter-character and inter-word spaces is even more so. Clearly, an excellent discrimination technique is needed to assign key-open durations to the correct elements.

The sample distributions shown are taken from a single operator over a fairly short span of homogeneous text. The parameters of the distributions can change quite markedly with the passage of time (from fatigue, for example), when the message form changes (say, from English to random text), or when another operator takes over the circuit. These considerations eliminate the possibility of using fixed thresholds to separate the elements. One might consider allowing a person to control variable thresholds on the basis of the distributions and the printed text (and our experience indicates that this approach is possible), but it has the considerable disadvantage of requiring continuing supervision.[10] We have concentrated on simulating a far more useful device, one which can transcribe Morse without assistance. It was therefore necessary to devise automatic means of determining the separation between the duration distributions.

The basic discrimination technique is an iterative process for calculating a dividing line such that the "distance" in lower-distribution standard deviations from the mean of the lower distribution is equal to the "distance" in upper-distribution standard deviations from the mean of the upper distribution. In practice, it is convenient to perform the equivalent process of finding the discrimination value for which the averaged "distance" (which we call the "goodness of separation") is maximized. Initial dividing lines are based on the expected behavior of a Morse transmission. For example, in random text, about 54 percent of the key-closures are dots (46 percent dashes), and 68 percent of the key-opens are intra-character spaces (32 percent inter-character). If the text is sent in five-letter groups, 20 percent of the inter-character spaces are also inter-word spaces. Similar calculations made from other types of text give values which do not differ greatly from these. (English, for example, has 61 percent dots and intra-character spaces and about 20 percent inter-word spaces.) Since these values are used only as a first approximation, the effect of these small variations is negligible.
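One way to turn these a priori percentages into initial dividing lines is to take the corresponding percentiles of the observed durations as first approximations. The sketch below is an illustrative reading of that idea in Python, not the paper's exact procedure; the function names are assumptions.

```python
def percentile(values, fraction):
    """Smallest observed value at or above the given fraction of the sorted data."""
    ordered = sorted(values)
    index = min(int(fraction * len(ordered)), len(ordered) - 1)
    return ordered[index]

def initial_dividing_lines(closed_durations, open_durations):
    """First approximations from the random-text statistics quoted in the text:
    about 54% of key-closures are dots and about 68% of key-opens are
    intra-character spaces."""
    dot_dash = percentile(closed_durations, 0.54)
    intra_inter = percentile(open_durations, 0.68)
    return dot_dash, intra_inter
```

These first approximations are then refined by the iterative goodness-of-separation procedure described next.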
To separate two distributions, the computer uses the initial dividing line to calculate the mean, x̄, and the standard deviation, s, of each of the two distributions; it then computes and stores the "goodness-of-separation" statistic[11]

k = (p·x̄_u − q·x̄_l) / (p·s_u + q·s_l),

where the subscripts u and l refer to the distributions above and below the dividing line. The dividing line is moved down one unit and k is recalculated; if the new k is greater than or equal to the previous k, this step is repeated. When a k is found that is less than the previous k, the computer repeats the process while moving the dividing line upward, until the value of k is again reduced. It then uses the dividing line which gave the previous k (i.e., the maximum value of k) as the line separating the distributions. The behavior of k for a typical pair of distributions is shown in figure 3.

Note that this procedure assumes there are exactly two distributions; the key-open measurements, however, come from four. The effects of two of these distributions are minimized by using the a priori estimates to eliminate the corresponding observations. Since the easiest separation (determined empirically) is intra- from inter-character spaces, this is done first, with the upper 20 percent of the observations removed. Using the separation line so determined to remove the effect of the intra-character distribution, and the 80th percentile to eliminate extremely large observations (representing the undefined but existing inter-line durations), the computer then determines the value which divides inter-character from inter-word spaces. Finally, this separation is used to eliminate the effects of the lower distributions from the attempt to separate inter-word from inter-line spaces.

For effective transcription, the computer must recognize the point at which a large change in sending speed occurs, so that it can transcribe the information prior to the shift using discrimination points based on durations occurring before the change and transcribe the succeeding signals using discrimination points formed from the new durations alone. To do this, the computer continually monitors the "goodness-of-separation" statistic. Whenever this drops below a preset value, indicating that the current samples are drawn from populations having different parameters, it stops revising the discrimination points but continues to read more information and to calculate "goodness-of-separation" values until the minimum occurs, that is, until half the observations are from one population and half from the other. The old discrimination points are then used for durations before the minimum, and the duration immediately after the minimum is treated as if it were the first element of a new message.[12] This type of discrimination process can be realized economically by means of an unconventional, special-purpose analog device.

Once the sequence of elements has been determined, its transcription into the characters it represents is accomplished by a straightforward recoding and table look-up. To recode, the machine examines the elements in order. For each dot, 1 is added to a counter; for each dash, 2 is added. At each intra-character space, the counter is multiplied by 2. When an inter-character or inter-word space occurs, the value that has been formed is used as the entry address of a table which determines the character this sequence of elements represents. The relationship between the values so formed and the Morse characters is given in table 3. Occasionally the machine produces a number which is not associated with any of the defined Morse characters.
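The hill-climbing search for the maximum of k can be sketched as follows. This is an illustrative Python rendering under stated assumptions: p and q are left as caller-supplied weights defaulting to 1, since the footnote defining them is not reproduced here, and the search is bounded by the range of the data to keep the sketch well behaved.

```python
import statistics

def goodness(durations, divide, p=1.0, q=1.0):
    """Goodness of separation k = (p*x_u - q*x_l) / (p*s_u + q*s_l) for a
    candidate dividing line; None when either side is too small to measure."""
    lower = [d for d in durations if d < divide]
    upper = [d for d in durations if d >= divide]
    if len(lower) < 2 or len(upper) < 2:
        return None
    m_l, s_l = statistics.mean(lower), statistics.pstdev(lower)
    m_u, s_u = statistics.mean(upper), statistics.pstdev(upper)
    denom = p * s_u + q * s_l
    return None if denom == 0 else (p * m_u - q * m_l) / denom

def best_divide(durations, start):
    """Hill-climb from the initial dividing line: step downward while k does not
    decrease, then upward, and keep the line giving the maximum k."""
    lo, hi = min(durations), max(durations)
    divide, k = start, goodness(durations, start)
    for step in (-1, +1):                        # first downward, then upward
        while lo < divide + step < hi:
            k_next = goodness(durations, divide + step)
            if k is not None and (k_next is None or k_next < k):
                break                            # k has been reduced: stop this pass
            divide, k = divide + step, k_next
    return divide
```

For example, the dot/dash line would be refined by `best_divide(closed_durations, dot_dash)` starting from the a priori estimate sketched earlier.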
Clearly, either the sender has made a mistake or one of the elements has been incorrectly assigned. In this case, the machine could print some special symbol as an indication that a character is unassignable; but this symbol does not convey all the information that the machine has about the garbled character. It is far more desirable to produce the best possible transcription under the circumstances, together with an indication that a special process was necessary to obtain it.[13] An incorrect assignment of an inter-word or an end-of-line space cannot cause an incorrect character assignment. An incorrect assignment of a dot or a dash, as can be seen from the keying-time distributions, is very improbable and, since it usually produces a possible character, would not normally be detected. Interpretation of an intra-character space as an inter-character space, although more probable, is equally difficult to detect. But when an inter-character space is transcribed as an intra-character space (i.e., when two characters are run together), the resulting value usually does not correspond to a possible character. Thus, if the table look-up indicates that a character has an incorrectly assigned element, it is most probable that an inter-character space has been transcribed as an intra-character space. Since experience has shown that the "intra-character" space of longest duration is the one that should have been the inter-character space, the computer is programmed to make that change. Since every character can ultimately be resolved into a series of dots and dashes (that is, e's and t's), this process must eventually yield valid characters.[14] With some extremely poor senders, as many as 10 percent of the characters may be resolved by this procedure.

When using a general-purpose digital computer to translate manual Morse, there is a continuing temptation to use the information conveyed by the context of the message to improve the quality of the transcription. Comparative studies of human and machine copies of the same signal clearly reveal that people do use this information, even to the extent of subconsciously correcting the sender's spelling errors. Despite the obvious advantages, we have not included context analysis in our programs because it would considerably complicate the special-purpose device the computer program is simulating. Fortunately, virtually all the information conveyed by context is preserved in the transcribed text, so that context analysis can be performed independently of transcription.

Now that we have considered the details involved in converting radio-telegraph signals into printed copy, let us consider the way they are combined into a single useful program. Between transmissions the computer samples the key and measures the amount of time that has elapsed since the end of the last message. As the first element starts, the computer begins to measure its duration and prints the idle-time notation. When enough durations have been received to ensure the accuracy of the statistical process, the computer calculates the discrimination points. If the "goodness of separation" is greater than a previously assigned minimum, the machine assigns elements to the first durations of the message, converts the elements into character values, looks them up in the table, corrects any inter-character mistakes, and begins printing the message.
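The recoding, table look-up, and run-together correction just described can be sketched as follows. This is an illustrative Python rendering, not the paper's program: the table covers only a few one-, two-, and three-sign characters, elements are represented as (symbol, duration) pairs, and the correction follows the longest-space rule from the text.

```python
# Elements are (symbol, duration) pairs; symbol is '.', '-', or 'i' for an
# intra-character space.  Recoding rule from the text: a dot adds 1, a dash
# adds 2, and an intra-character space multiplies the counter by 2.
TABLE = {1: "E", 2: "T", 3: "I", 4: "A", 5: "N", 6: "M",
         7: "S", 8: "U", 9: "R", 10: "W", 11: "D", 12: "K"}   # abbreviated table 3

def recode(elements):
    value = 0
    for symbol, _ in elements:
        if symbol == ".":
            value += 1
        elif symbol == "-":
            value += 2
        else:                          # intra-character space
            value *= 2
    return value

def transcribe_character(elements):
    """Table look-up with the correction described in the text: if the value is
    not a defined character, assume two characters were run together and split
    at the longest "intra-character" space."""
    value = recode(elements)
    if value in TABLE:
        return TABLE[value]
    split = max((d, i) for i, (s, d) in enumerate(elements) if s == "i")[1]
    return transcribe_character(elements[:split]) + transcribe_character(elements[split + 1:])

# Two "A"s run together because the inter-character space (duration 2 here) was
# misread as an intra-character space:
run_together = [(".", 1), ("i", 1), ("-", 3), ("i", 2), (".", 1), ("i", 1), ("-", 3)]
print(transcribe_character(run_together))   # -> "AA"
```

With the full character table of the paper, some run-together values do recode to legitimate characters and therefore escape detection, which is why the text says the resulting value "usually" does not correspond to a possible character.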
Of course, while this analysis is going on, the computer continues to sample the key periodically and to measure the durations of the new elements. Since the typewriter can print the information faster than the Morse operator can send it, it will eventually reduce the list of durations to a number too small to ensure the accuracy of the statistical process. At this point the typing is delayed until additional elements are read into the machine. The discrimination calculation is based only on those elements that have been received but not yet transcribed, so that the dividing lines follow any slow changes in sending speed. From this point on, unless something unusual happens, the computer continues to print the letters at approximately the same rate at which they arrive (after a slight delay). If the operator suddenly shifts his rate of sending, the "goodness-of-separation" value drops. The machine then waits until it finds the point at which the rate shifted and treats the sequence of elements as if it were sent as two messages. Eventually the operator will stop sending. The resulting very long key-open duration indicates that the message has ended, so the machine can use the discrimination points based on the last set of durations to print the remaining characters, even though the list has been reduced below the minimum length. When the final character of the message has been printed, the computer resumes the between-messages state.

Thus, we see that the computer is able to receive a message in any language, print it, automatically adjust to slow or rapid changes in the sending rate, indicate the points in the text where the transcription was so poor that it was necessary to resort to fractionation, and even keep track of the amount of idle time between messages. There are a number of minor "fringe" benefits to be obtained from the program which hardly need to be mentioned; for example, the computer can indicate the number of words per minute at which the message was sent or the overall goodness of assignment of the characters in the message.

Testing a manual Morse transcription procedure is complicated by the fact that one cannot obtain an exact copy of the test message, for no human sender can produce precisely the message given to him[15] and no receiver can copy a signal without making mistakes. It is therefore virtually impossible to make an absolute statement about the accuracy of a manual Morse transcription device. One can, however, compare the output of the machine with copy made by a man. In one set of tests, for example, seven operators each sent two messages which were recorded, and a "standard" text was established by replaying them for our best human receiver, at normal and reduced rates, until he was unable to improve his copy by further listening. This type of comparison is given in table 4. In these 14 messages the computer copied between 0 and 6 percent of the characters "incorrectly." On the whole, the machine copies are 96.5 percent "correct." On the other hand, this procedure gives no information about the relative merits of the human and the machine under realistic operating conditions. We decided, therefore, to include tests in which each person had one opportunity to monitor each message.[16] In these tests, two groups of five members each sent four context-free messages[17] apiece (forty in all), which were recorded on magnetic tape.
Each sender copied all the signals from his group, including his own, and his copy was compared with the machine's. Whenever the two were not in agreement, if either had copied the original text the other was charged with an error; if neither had copied it, the difference was attributed to the sender and no errors were charged. The four messages sent by each operator were then considered to be a single long message, and the percentage of characters copied incorrectly by man and machine was calculated. The percentages of incorrectly copied characters in these tests are summarized in tables 5 and 6. The error percentages of the machine are italicized and appear under the human percentages for those messages with which they were compared. Although each of the operators used in these comparisons is considered competent, the differences in their ability both to send and to receive are apparent. While the machine does not copy as well as some of the operators (K, L, and M), it does better than the others (H, I, J, N, O, P, and Q). Furthermore, with the exception of sender L,[18] there is no great disparity between even the best human copy and the machine copy of any message. The relative merits of the two are more easily seen in table 7, which classifies the 50 man-machine comparisons in these two groups according to the errors made by man and machine. Entries below and to the left of the main diagonal (boldface type) represent those messages in which the machine produced better copy than the man (24 messages, or 48 percent). Entries above and to the right represent messages copied better by the man (10 messages, or 20 percent). The remainder were copied equally well by each (16 messages, or 32 percent). These tests indicate that the machine does about as well as a man in copying hand-sent Morse code.

Publisher

Association for Computing Machinery (ACM)

Subject

Artificial Intelligence, Hardware and Architecture, Information Systems, Control and Systems Engineering, Software

