The hardware by itself was clearly not enough for the aide we were building; driver software was also needed to integrate the aide for use by the client. We decided to write that software in an authoring system called CanDo™. It allows programming in a high-level language and provides full support for hardware interrupts, multitasking, and interprocess communication. This gave us an extremely flexible environment and greatly eased our integration tasks. CanDo™ is preprogrammed with information about the hardware on the Amiga computer and has provisions for both interrupt-driven and buffered handling of external devices. It is also able to start and communicate with external processes in a multitasking environment.
We decided to make the user interface a menu system based on eight buttons. The software goal was to maximize capability without sacrificing ease of use. We wanted to make the output of the aide as normal and comfortable (as compared to human speech) as possible. We also wanted a facilitator who is not computer literate to be able to modify and program this communication aide easily.
The first step in developing the software for the aide was to break down the selection strategy for efficient use with the eight buttons (see Figure 6). In keeping with the familiar machine, Michael's Touch Talker™, we decided to offer the same forty-nine base selections, each made by pressing a two-key combination; these are referred to as Quick Keys (see Table 2). We denote each selection by its position in the button layout, which increases ease of use and gives a consistent methodology. We copied the initial contents of these Quick Keys from the setup of the Touch Talker™ system.
Figure 6: Speech Device Layout
Table 2: Quick Keys Selection Formation Based on Client Interface
First Button Goes From
Main Menu to:
|Submenu 1||Submenu 2||Submenu 3||Infinite Menu|
|Submenu 4||Submenu 5||Submenu 6||Submenu 7|
Second Button Goes From
Submenu N to Selection:
|Selection [N,1]||Selection [N,2]||Selection [N,3]||Goto Main Menu|
|Selection [N,4]||Selection [N,5]||Selection [N,6]||Selection [N,7]|
note: to convert Selection [N,M] to a number 1-49
number = (N-1)*7 + M
This means that the first button pressed takes you to a submenu and the second button finalizes a selection, which is then spoken. The upper right button was set to default to return to the main menu. The only time that the upper right button does not return the user to the main menu is when the user is already on the main menu, in which case it takes the user to the Infinite Menu, which is described below.
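The two-press selection scheme in Table 2 can be sketched as a small state machine. This is only an illustration, not the CanDo™ implementation: the position numbering (0-3 across the top row, 4-7 across the bottom row), the state names, and the function names are assumptions made for this example, as is the return to the main menu after a phrase is spoken.

```python
# Hypothetical numbering of the physical buttons:
# 0-3 run left to right along the top row, 4-7 along the bottom row.
UPPER_RIGHT = 3

# Item number (1-7) reached from each button position, following Table 2.
# The upper-right button is not an item, so it has no entry here.
ITEM_AT = {0: 1, 1: 2, 2: 3, 4: 4, 5: 5, 6: 6, 7: 7}


def quick_key_number(n, m):
    """Convert Selection [N,M] to a number 1-49 using number = (N-1)*7 + M."""
    if not (1 <= n <= 7 and 1 <= m <= 7):
        raise ValueError("N and M must both be in the range 1-7")
    return (n - 1) * 7 + m


def press(state, position):
    """Return (new_state, quick_key_number_or_None) for one button press.

    state is "main" or a submenu number 1-7. The Infinite Menu itself is
    not modelled; entering it simply yields the state "infinite".
    """
    if state == "main":
        if position == UPPER_RIGHT:
            return "infinite", None
        return ITEM_AT[position], None
    if state == "infinite":
        raise ValueError("Infinite Menu navigation is not modelled here")
    if position == UPPER_RIGHT:
        return "main", None
    # Second press on a submenu: the selection is spoken. Returning to the
    # main menu afterwards is an assumption of this sketch.
    return "main", quick_key_number(state, ITEM_AT[position])
```

For example, pressing the first bottom-row button on Submenu 3 gives Selection [3,4], which is Quick Key number 18.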
The Infinite Menu is provided for conveying communication beyond the limited 49 selections available from the Quick Keys. The Infinite Menu consists of six infinite list selections (corresponding to 6 lists of letters/words/phrases), a return to the main menu, and a key reserved for the client to edit entries (left for later implementation), as shown in Table 3.
Table 3: Infinite List Selection
|Letters||Words||MWords||Goto Main Menu|
The six choices are defined by what they build:
Each one of these choices takes the client to a different list, of arbitrary length, for further selection. Having created a new extension to the existing selection method, we needed a new methodology for selecting entries from the list and building them into the desired words, phrases, or ideas. The first step we took was to transform the list, which is alphabetically sorted, into a matrix as demonstrated in Table 4. This matrix has N columns and N rows, where N is the ceiling (the integer formed by rounding up if there is any fraction) of the square root of the number of entries. The client is placed in the "center" of the list. This is done by setting the currently selected entry to be the entry at position (length of list)/2 (lady in Table 4). The client's interface is now configured so that four of his buttons navigate in one of four directions within the matrix. Each time one of these buttons is pressed, it "moves" the cursor in the corresponding direction. The buttons are assigned as shown in Table 5.
Table 4: List To Matrix Conversion
Table 5: Button Assignment Within Infinite List Mode
|right 1 word||up 1 row of words||add current to output||return main menu|
|left 1 word||down 1 row of words||say current output||add output to list|
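The list-to-matrix conversion and the four navigation buttons can be sketched as follows. This is a minimal illustration under stated assumptions, not the aide's actual code: the function names are hypothetical, the list is stored row by row, "right" at the end of a row continues onto the next row, and a move that would leave the list is ignored (the text does not say how the edges are handled).

```python
import math


def matrix_side(entry_count):
    """N = ceiling of the square root of the number of entries."""
    side = math.isqrt(entry_count)
    return side if side * side == entry_count else side + 1


def start_index(entry_count):
    """The client starts at the "center" of the list: (length of list)/2."""
    return entry_count // 2


def move(index, button, entry_count):
    """Move the cursor one word left/right or one row up/down.

    The sorted list is treated as rows of matrix_side(entry_count) words.
    Moves that would fall outside the list leave the cursor where it is
    (an assumption of this sketch).
    """
    side = matrix_side(entry_count)
    step = {"right": 1, "left": -1, "up": -side, "down": side}[button]
    new_index = index + step
    return new_index if 0 <= new_index < entry_count else index
```

With a 200-entry list the matrix is 15 by 15, the client starts at entry 100, and one press of "up" moves the cursor to entry 85.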
After navigating to a desired entry, the client presses the "add current" button. This allows him to add items one at a time in a building fashion. When he has finished adding selections to the statement he wishes to be said, he presses the "say" button and the communication aide says whatever statement the client has built. If he wishes to use this statement again, he can then press the "add output" button and it will be stored in either the Mwords or the Mphrases list.
The next part of the software we considered was the facilitator interface. We began with the main menu, shown in Figure 7. We made this menu fully configurable: the names of the submenus, the number of seconds that a spoken phrase will appear on the LCD screen (0 seconds causes the LCD not to display the spoken phrases), and whether or not a beep sounds after a button is pressed are all programmable by the facilitator on this menu.
Figure 7: Main Menu Screen
We made the facilitator interface so that activating the button portion of the menu takes the facilitator to the applicable submenu (e.g., pressing the left mouse button within the box above Michael in the MainMenu takes the facilitator to the Michael submenu, Figure 8).
Figure 8: Michael Submenu Screen
The submenus corresponding to Quick Keys selections allow the response phrases to be edited, via a special editing screen (see Figure 9), when the appropriate button areas are selected.
Figure 9: Editing Screen From Quick Keys Selection
This enables the facilitator to bring up the editing screen for any of the seven button combinations under a submenu simply by selecting the button area that corresponds to it. To edit the Hello message from the Michael submenu (Figure 8), the facilitator would press the mouse button in the box above Hello and the editing screen would appear as shown in Figure 9. The editing screen has two windowed areas, one for what will be said and one for what will be displayed on the LCD. This is particularly handy when one is saving disk space by using a single digitized word for multiple words that have the same pronunciation (e.g., two, to, and too) or when spelling phonetically. Additional boxes provide for non-digitized speech, saving changes, and other convenient functions.
When the client or facilitator selects the Infinite Menu from the main menu they are taken to the screen shown in Figure 10.
Figure 10: Infinite List Selection Screen
They then select the list they wish to work with: Letters, Number, Word, Phrases, Mword, or Mphrase. This puts them on the Infinite Options screen (Figure 11) with the previously selected list loaded.
Figure 11: Infinite List Mode Screen
To modify a list entry, the facilitator selects that entry with the mouse and is taken to the list entry editing screen (Figure 12). The editing function here is nearly identical to that of the Quick Keys, except that it has a delete entry button. Note that it is also possible to put anything into any list (e.g., a phrase into the letters list), in keeping with maximum flexibility. Also on this screen are eight boxes corresponding to the eight buttons, which allow the facilitator to simulate client navigation and use. Figure 12 shows an example in which the facilitator has built the statement "Michael is a good boy".
Figure 12: Editor Screen From Infinite List Mode
We designed a system that has forty-nine Quick Keys phrases and six Infinite Lists to select from, mapped onto our eight arcade buttons. From the main menu, a button selection takes the client to a submenu or to the Infinite Menu. On a submenu, a button selection results in a Quick Keys phrase being selected and said. In the case of the Infinite Menu, the second button selects the list the client will be working with. These two-key mappings are shown in Figure 13.
Figure 13: Client Interface Mapping
The numbered circles correspond to the Quick Keys selections (1-49), and L1-L6 correspond to the six Infinite Lists. Every entry is fully configurable by a facilitator. The system is simple to use; among non-disabled test subjects (from an informal study of 10 people at various demonstrations), training time was under 5 minutes to learn the client interface and under 15 minutes to learn the facilitator interface. We consider this sufficient evidence that the system is easy to use (a total training time of 20 minutes cannot be achieved on a system that is not easy to use). It is fast for the client to use, requiring only 2 button selections to activate one of the Quick Keys phrases. The delay between selection and vocalization of the selection is no greater than a few seconds. We maintained capability through the Infinite Lists, which require a reasonably small number of button selections. This facility is capable of saying anything the client wishes.
The LCD continuously displays feedback information. It has two display modes, one showing the current button definitions and one displaying the phrase that was just spoken. In the Quick Keys section this means breaking the display into 10 areas, eight corresponding to the buttons and two informing the client of the submenu on which they are currently working. In the Infinite Lists, this means dividing the screen into 7 sections (4 on top and 3 on bottom) which show:
When the aide is displaying the current statement it displays the text assigned to the display box of the selected entry. This produces a display layout looking like the following:
Table 6: LCD Display General Layout
Quick Keys Display
|Item 1||Item 2||item 3||: MainMenu||| Current|
|Item 4||Item 5||Item 6||Item 7||| menu|
Infinite List Mode Display
|Phrase right||Phrase up||current phrase||cur list|
|phrase left||phrase down||currently built-up super-phrase|
Displayed Speech Window
|The last thing said by the speech device|
Examples taken from the testing of the communications aide include:
Table 7: LCD Examples
|Want||: Remember||: Mess||: MainMenu||| Michael|
|Bathroom||: Warm||: Walk||: Walk||||
Infinite Words List
|birth||: a||: boy||
|cold||: happy||Michael is a good boy|
|I have to go to the bathroom.|
The speed of conversation varies greatly, depending on how it is calculated. The factors involved are:
A 1 second delay per button push, to allow the LCD to display the new information, is used for the calculations. This means that Quick Keys selection is much faster than Infinite Lists selection because it requires fewer button presses. Quick Keys require 2 button presses, so a 2 second delay is used for button presses. Infinite Lists require 3 button presses plus as many as square root (number of entries) + 1 (the "add" button) per selection. Based on a target list length of 200-225 entries, the aide requires a maximum of 15 button presses per selection and an average of 8. This is an efficient number of button presses, but we feel that further research should be able to cut this number in half.
The disk access delay is between 0 seconds (words in memory) and 15 seconds (time to load the speech synthesizer), with an average delay of .4 seconds per unique word loaded. The words are spoken at full speed with no additional delay and average .5 seconds in length. Any selection can be as many words as desired, but the average phrase length was 7 words.
Putting it all together, we get 50 words per minute from the Quick Keys, 25 words per minute for the first selection from the Infinite Lists (phrase selections), and 31 words per minute for each subsequent utterance from the list. The speed of Infinite Lists word selection (based on 7 words) is 4.3 words per minute, and letter selection (based on 5 letter words) is 4.5 words per minute. This represents the capabilities of the system; it does not count the time it takes the client to press the buttons (if greater than 1 second). We feel that these results are good, but that there is still room for improvement. By rearranging the selection method and adding a word fill algorithm, we should be able to improve letter selection to about 20 words per minute.
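As a rough check of the Quick Keys figure, the quoted timings can be combined as in the sketch below. It assumes that every word of the phrase has to be loaded from disk at the average delay; the function and parameter names are hypothetical, and the Infinite Lists figures are not reproduced because the text does not fully specify how they were derived.

```python
def words_per_minute(button_presses, words,
                     seconds_per_press=1.0,
                     load_seconds_per_word=0.4,
                     speech_seconds_per_word=0.5):
    """Estimate speaking rate from the per-press, per-load and per-word times.

    Assumes every word is a unique word loaded from disk (an assumption of
    this sketch, not a statement about how the original figures were made).
    """
    total_seconds = (button_presses * seconds_per_press
                     + words * load_seconds_per_word
                     + words * speech_seconds_per_word)
    return words * 60.0 / total_seconds


# Quick Keys: 2 button presses and an average phrase of 7 words.
quick_keys_rate = words_per_minute(button_presses=2, words=7)
```

With these inputs the total is 2 + 2.8 + 3.5 = 8.3 seconds for 7 words, or roughly 50 words per minute, in line with the Quick Keys rate quoted above.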
File space was a very big consideration, as the whole system had to fit onto a single 880k floppy disk; however, by adding $200 to the system cost we can increase the storage of the aide by a factor of 100. With this in mind, we needed some supplementary programs to conserve space. The first was a lossy sound compression program (the output is similar to, but not exactly the same as, the original input). Many techniques were tried, but the two that worked were decreasing the resolution and transforming the sound into a series of exponential differences; both gave a two-to-one compression factor.
The resolution of the sound is the number of bits that make up each sound sample (music CDs use 16 bits per sample). The words were initially digitized with 8 bits of resolution and then reduced to 4 bits by discarding the least significant 4 bits. This produced an acceptable result in which the words were clearly recognizable, but there was some noise that did not occur in the original 8 bit sample. This noise took the form of static and was much more pronounced on poorer amplifiers.
The second method is based on the fact that the samples, when put together, make a waveform (see Appendix A). This means that the current value tends to be a "small" offset from the previous value. This method takes the difference between the current sample and the previous one and applies the log2 function to the result. The result is that small differences are portrayed very accurately and large differences are not. At low sampling rates (e.g., the 8,000 samples per second we were using), exponential differences performs slightly better than decreasing resolution, and it produces nearly perfect results at higher sampling rates (such as the 44,100 samples per second that CDs use). Because of its performance, exponential differences was the method used in the aide delivered to the client.
We found other programs within the public domain and shareware fields that were useful. These included Playsound, a flexible program that plays a list of digitized sounds based on a "command line" of Playsound sample1, sample2, ..., sampleN; Cache-Disk, a program that speeds up repetitive drive access by setting aside space in memory to keep track of previous disk accesses; Nuke, a program that transparently compresses and decompresses files very quickly; and finally Turbo Imploder, another compression program that decompresses in a fast, transparent fashion. We would like to thank the authors of those programs for permission to use them, all free of charge, and would like to congratulate them on the overall quality of their programs.
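The two compression approaches can be illustrated with the sketch below. It is not the program delivered with the aide: the 4-bit code table of signed power-of-two steps, the nearest-step selection rule, the starting value of 128, and all function names are assumptions chosen to show the idea of storing each difference on a logarithmic scale.

```python
# Hypothetical 4-bit code table. Codes 0-7 are the steps 0, 1, 2, 4, ..., 64
# and codes 8-15 are the same magnitudes negated (code 8 is a second zero).
STEPS = [0, 1, 2, 4, 8, 16, 32, 64]
DELTAS = STEPS + [-s for s in STEPS]


def truncate_to_4_bits(samples):
    """Resolution reduction: drop the low 4 bits of unsigned 8-bit samples."""
    return [s >> 4 for s in samples]


def expand_from_4_bits(codes):
    """Restore 4-bit samples to the 8-bit range (the low bits stay zero)."""
    return [c << 4 for c in codes]


def encode_exponential(samples, start=128):
    """Encode each sample as the code whose step best matches the difference
    between the sample and the value the decoder will have reconstructed."""
    codes = []
    predicted = start
    for sample in samples:
        diff = sample - predicted
        code = min(range(16), key=lambda k: abs(diff - DELTAS[k]))
        codes.append(code)
        # Track the decoder's running value so errors do not accumulate.
        predicted = max(0, min(255, predicted + DELTAS[code]))
    return codes


def decode_exponential(codes, start=128):
    """Rebuild the waveform by accumulating the coded steps."""
    out = []
    value = start
    for code in codes:
        value = max(0, min(255, value + DELTAS[code]))
        out.append(value)
    return out
```

Both functions store 4 bits per 8-bit sample, which matches the two-to-one factor mentioned above. In this sketch small differences (0, 1, 2, 4, ...) are reproduced exactly while large jumps are only approximated, which is the behavior the text describes for exponential differences.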