Category Archives: Usability

Trying out Dragon Dictation for the iPhone

I’ve been trying out the Dragon Dictation iPhone app. It’s still not what I really want, which is system-level speech control of a mobile device that would give me the option to use speech for anything. But it’s a step in the right direction of making the iPhone more hands-free.

Here’s how Dragon Dictation for the iPhone works: open the app, hit one button, speak up to 30 seconds of dictation, then hit another button to say you’re done. Your dictation shows up on the screen a few seconds later. Behind the scenes the audio file you’ve dictated is sent to a server, put through a speech-recognition engine, and the results sent back to your screen. Now you can add to your text by dictating again, or hit an actions button that gives you three choices: send what you’ve written to your e-mail app, send it to your text app, or copy it to the clipboard so you can paste it someplace else.

The recognition is usually fairly accurate in quiet environments. Not surprisingly, you get a lot of errors in noisy environments. In fairness to the app, the built-in microphone on a mobile device is not optimal for speech recognition. It does pretty well given that constraint.

Here’s a practical suggestion that should be easy to implement: add a decibel meter so people can see exactly how much background noise there is at any given time. This would make people more aware of background noise so they could set their expectations accordingly.
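To give a sense of how little is involved, here is a rough sketch of the arithmetic behind such a meter. It is not anything from the app. It assumes the microphone delivers 16-bit PCM samples and reports the level in dB relative to full scale (dBFS) rather than true sound pressure level, which would need a calibrated microphone. The function names and the “quiet” and “noisy” thresholds are made up for illustration.

```python
import math
from typing import Sequence

# Assumption: samples are signed 16-bit PCM, so the largest magnitude is 32768.
FULL_SCALE = 32768.0


def level_dbfs(samples: Sequence[int]) -> float:
    """Return the RMS level of the samples in dB relative to full scale.

    0 dBFS is the loudest possible signal; quieter signals are negative.
    Silence (or an empty buffer) is reported as negative infinity.
    """
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    rms = math.sqrt(mean_square)
    return 20.0 * math.log10(rms / FULL_SCALE)


def describe_noise(dbfs: float,
                   quiet_below: float = -50.0,
                   noisy_above: float = -30.0) -> str:
    """Map a level to a rough label. The thresholds are hypothetical."""
    if dbfs < quiet_below:
        return "quiet"
    if dbfs > noisy_above:
        return "noisy"
    return "moderate"
```

A meter built this way would call level_dbfs on each short buffer of microphone input while the app is idle and show the label, or the number itself, next to the record button.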

The interface for correcting errors is reasonable. Tap on a word and you can delete it or, sometimes, choose from a list of alternates. Tap the keyboard button and you can use the regular system keyboard to clean things up.

I have two interface suggestions:

1. You can’t use the regular system copy and paste without going into the keyboard mode. You should be able to. I suspect this is fairly easy to fix.

2. There is no speech facility for correcting errors. I think there’s a practical fix here as well.

First, some background. Full dictation on a mobile device is tricky. Full dictation speech engines take a lot of horsepower. Dragon Dictation sidesteps the problem by sending the dictation over the network to a server running a speech engine. The trade-off is it’s difficult to give the user close control of the text — you must dictate in batches and wait briefly to see the results. This makes it more difficult to offer ways to correct using speech. But I think there is a good solution already in use on another platform.

Although it’s difficult to implement most speech commands given the server setup, the “Resume With” command that’s part of the Dragon NaturallySpeaking desktop speech application is a different animal. This command lets you start over at any point in the phrase you last dictated by picking up the last couple of words that will remain the same and dictating the rest over again.

This would make Dragon Dictation much more useful for people who are trying to be as hands-free as possible. It would also lower the frustration of misrecognitions and subtly teach people to dictate better.
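The “Resume With” behavior described above is simple enough to sketch in a few lines. This is only my guess at the logic, not how Dragon NaturallySpeaking implements it. It assumes the new utterance starts with the words that stay the same, uses the first two of them as the anchor, and matches the most recent place they appear in the existing text. The function names and the anchor_len parameter are hypothetical.

```python
import re


def _normalize(word: str) -> str:
    """Lowercase a word and drop punctuation so 'Friday.' matches 'friday'."""
    return re.sub(r"[^\w']", "", word).lower()


def resume_with(previous: str, redictated: str, anchor_len: int = 2) -> str:
    """Replace the tail of `previous` with `redictated`.

    The first `anchor_len` words of `redictated` are the anchor: the words
    that stay the same. The text is cut at the last place the anchor occurs
    in `previous`, and the new dictation is spliced in from there.
    If the anchor is not found, `previous` is returned unchanged.
    """
    old_words = previous.split()
    new_words = redictated.split()
    anchor = [_normalize(w) for w in new_words[:anchor_len]]
    if not anchor:
        return previous
    n = len(anchor)
    # Search from the end so the most recent occurrence wins.
    for start in range(len(old_words) - n, -1, -1):
        window = [_normalize(w) for w in old_words[start:start + n]]
        if window == anchor:
            return " ".join(old_words[:start] + new_words)
    return previous
```

For example, if the screen shows “Please send the report to Jon by Friday” and you dictate “report to John by Monday”, the result is “Please send the report to John by Monday”. Only the new utterance needs to go to the server, so the splice itself could happen on the phone.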

It’s nice to see progress on mobile speech. I’m looking forward to more.

Tip: Scrolling by speech

I’ve gotten several questions lately about scrolling by speech, which is key to comfortable hands-free operation. Utter Command gives you several ways to scroll by speech. The best way depends on the situation.

To quickly look something over, use the speech command that allows you to see successive screens with a pause between changes. For example, “3 Screen Down Wait” moves down a screen, then after a default wait of two seconds moves down another screen, then two seconds later moves down a third screen. If you want a longer wait, add a specific number of seconds, e.g. “3 Screen Down Wait 5” (UC Lesson 7.23). 
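For the curious, the structure of that command can be sketched as a count, a fixed phrase, and an optional number of seconds. This is not Utter Command’s implementation, just an illustration of the behavior described above. It handles only the “Screen Down Wait” wording quoted in this tip, and the names ScrollCommand, parse_scroll_command, and run_scroll_command are invented for the example.

```python
import re
import time
from typing import Callable, NamedTuple

DEFAULT_WAIT_SECONDS = 2.0  # the default pause described in the tip above


class ScrollCommand(NamedTuple):
    count: int            # how many screens to move down
    wait_seconds: float   # pause between screens


# Matches "<count> Screen Down Wait" with an optional "<seconds>" at the end.
_PATTERN = re.compile(r"^(\d+) Screen Down Wait(?: (\d+))?$", re.IGNORECASE)


def parse_scroll_command(utterance: str) -> ScrollCommand:
    """Turn a recognized phrase into a count and a wait time."""
    match = _PATTERN.match(utterance.strip())
    if match is None:
        raise ValueError(f"not a scroll-and-wait command: {utterance!r}")
    count, wait = match.groups()
    seconds = float(wait) if wait else DEFAULT_WAIT_SECONDS
    return ScrollCommand(int(count), seconds)


def run_scroll_command(cmd: ScrollCommand,
                       page_down: Callable[[], None],
                       sleep: Callable[[float], None] = time.sleep) -> None:
    """Move down one screen at a time, pausing between moves.

    `page_down` is whatever actually scrolls the window (a stand-in here).
    There is no pause after the final screen.
    """
    for i in range(cmd.count):
        page_down()
        if i < cmd.count - 1:
            sleep(cmd.wait_seconds)
```

Passing the scrolling and sleeping functions in as arguments keeps the sketch testable without a real window or real delays.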

To directly control the scroll bar by speech, place the mouse pointer on the scroll bar using a command like “99 by 10”, then use the vertical drag command to move the scroll bar to a given point. For example, “Drag By 50” moves the scroll bar to the middle. If you then want to go three quarters of the way down, say “Drag By 75”. You can also control the scroll bar incrementally, for instance, “Drag 3 Down” (UC Lesson 4.2, 4.5).
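If it helps to picture what those numbers mean, here is a small sketch of the arithmetic. It rests on my reading of the examples above, namely that the numbers are percentages of the screen: “99 by 10” is 99 percent of the way across and 10 percent of the way down, and “Drag By 50” drags straight down or up to the 50 percent mark. The function names and the 1920 by 1080 screen size are assumptions for illustration, not part of Utter Command.

```python
from typing import Tuple


def percent_to_pixel(percent: int, extent: int) -> int:
    """Convert a 0-100 percentage to a pixel index along one screen axis.

    `extent` is the screen width or height in pixels. 0 maps to the first
    pixel and 100 to the last one.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return (extent - 1) * percent // 100


def place_pointer(percent_across: int, percent_down: int,
                  screen_size: Tuple[int, int]) -> Tuple[int, int]:
    """Pixel position for a command like '99 by 10'."""
    width, height = screen_size
    return (percent_to_pixel(percent_across, width),
            percent_to_pixel(percent_down, height))


def vertical_drag_target(pointer: Tuple[int, int], percent_down: int,
                         screen_size: Tuple[int, int]) -> Tuple[int, int]:
    """End point for a command like 'Drag By 50': same x, new y."""
    x, _ = pointer
    _, height = screen_size
    return (x, percent_to_pixel(percent_down, height))


SCREEN = (1920, 1080)  # assumed screen size for the example
start = place_pointer(99, 10, SCREEN)            # pointer on the scroll bar
middle = vertical_drag_target(start, 50, SCREEN)  # "Drag By 50"
```

Because the drag keeps the same horizontal position, the pointer stays on the scroll bar from one drag command to the next.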

In some programs, including some versions of Word, the cursor moves to the page you scrolled to when you use an arrow command like “5 Down”. And in some programs, like Firefox, you can say a link number to move the cursor. In these cases you can leave the mouse arrow parked on the scroll bar, edit the text, then say another drag command to move the scroll bar without having to move the mouse to the scroll bar again. In some programs, including WordPad, you have to move the cursor to the new page by clicking. In this case, keep the right ruler open on your screen so you can easily click back to the scroll bar when you’re ready to scroll again.

– If you use this method a lot, try naming a mouse click to move the arrow to the scroll bar at the home position (UC Lesson 10.24).

– You can also use this method to control horizontal scroll bars: use the “Drag 1-100 By” command.

– If you’re a ZoomText user, you can use this method even when the scroll bar is not showing on the screen.

Tell me what you think about scrolling by speech – reply here or let me know at info@ this website address.

Highlighting and hot water

Have you ever used a faucet that had a hot water knob on the right side instead of the left?

Even if it’s well labeled, chances are you’ll turn the wrong handle a good percentage of the time. This is because controlling the faucet is something you usually do without thinking, and your habit is to reach for the left handle when you want hot, not the right.

Consistency allows for habit, which saves time. Do a consistent navigation task a few times and after that you don’t have to think about it. It’s become habit, which means you can use more of your brain to think. The system backfires, however, when you unconsciously expect consistency, use habit, and are caught by surprise.

I often talk about the importance of consistent keyboard shortcuts across programs, because I use keyboard shortcut navigation more than mouse/toolbar navigation.

But consistency is just as important in toolbars.

The default order for many common groups of items is consistent across programs. For instance, Bold, Italic and Underline are commonly shown in that order. Left Justify, Center and Right Justify are commonly shown in that order. Style, Font and Size are commonly shown in that order. There’s a glaring problem, however, when it comes to the highlight and text color icons.

Microsoft Office toolbars put the highlight button on the left and the text color icon on the right, while Google Docs and OpenOffice defaults put the highlight button on the right and the text color icon on the left.

The inconsistency makes it impossible to form a habit that’s useful across programs. If you get used to one way you’ll inevitably pick the wrong button when you’re in the program you’re not used to. If you regularly use a mix of inconsistent programs you’re likely to get things wrong fairly often.

In a world where people use multiple programs, inconsistent default order in groups of icons puts a larger-than-necessary cognitive load on folks. Worse, it makes habit a liability rather than an advantage.

It would be good for people if we had a standard order for related icons like Highlight and Text Color just as we have a standard order for faucet controls. The exact order matters much, much less than consistency across programs. Software is complicated enough already — we need to give people all the easy breaks we can.