
iPhone 4S: speech advances but there’s more to do

By Kimberly Patch

Apple’s iPhone 4S has taken a couple of nice big steps toward adding practical speech to smart phones. There are still some big gaps, mind you. I’ll get to those as well.

Speech on the keyboard

The long-awaited speech button is now part of the keyboard. Everywhere there’s a keyboard you can dictate rather than type. This is far better than having to use an app to dictate, then cut and paste into applications. This is one of the big steps. This will make life much easier for people who have trouble using the keyboard. And I suspect a large contingent of others will find themselves dictating into the iPhone a good amount of the time, increasingly reserving the keyboard for situations where they don’t want to be overheard.

The key question about speech on the keyboard is how it works beyond the letter keys and straight dictation.
For instance, after you dictate
“Great! I’ll meet you at the usual place (pool cue at the ready) at 6:30.”
how easy is it to change what you said to something like this?
“Excellent :-) I’ll meet you at the usual place (pool cue at the ready) at 7:00.”
And then how easy is it to go back to the original if you change your mind again?

Speech assistant

After we all use the speech assistant for a couple of days or weeks it’ll become readily apparent where Siri lies on the very-useful-to-very-annoying continuum.

The key parameters are
– how much time Siri saves you
– how a particular type of Siri audio feedback hits you the 10th time you’ve heard it
– how physically and cognitively easy it is to switch between the assistant and whatever you have to do with your hands on the phone.

One thing that has the potential to tame the annoyance factor is giving users some control over the feedback.

I think the tricky thing about computer-human feedback is it’s inherently different from human-human feedback. One difference is the computer has no feelings and we know that. Good computer-human feedback isn’t necessarily the same as good human-human feedback.

The big gap

There’s still a big speech gap on the iPhone. Speech is still just a partial interface.

Picture sitting in an office with a desktop computer and a human assistant. Type anything you want using the letter keys on your keyboard or ask the assistant to do things for you. You could get a fair amount of work done this way, but there’d still be situations where you’d want to control your computer directly using keyboard shortcuts, arrow keys or the mouse. Partial interfaces have a high annoyance factor.

You may still use a mix of speech, keyboard and gesture. But if you’re able to choose the method of input based on what you want to do rather than what happens to be available, true efficiencies will emerge.

Ultimately, I want to be able to completely control my phone by speech. And I suspect if we figure out how to do that, then make it available for everyone, the general mix of input will become more efficient.

I’d like to see the computer industry tap folks who have to use speech recognition as testers. I think this would push speech input into practical use more quickly and cut out some of the annoyance-factor growing pains.

What do you think? Let me know at Kim@ this domain name.

Getting Gmail working well with speech commands

By Kimberly Patch

If you haven’t used speech commands to control a computer, it might not be obvious that single character commands, for instance “y” to archive a message in Gmail, can present a challenge.

Single-character commands seem like a great idea, especially for Web programs, because your Web browser already takes up some common keyboard shortcuts. Gmail has a lot of single-character commands, and once you get to know them you can fly along using the keyboard. In general I’m all for more keyboard shortcuts because it’s easy to enable them using speech.

Command conundrum

Single-character commands that can’t be changed, however, can get speech users in a lot of trouble. Say a command or make a noise that’s misheard as text in a program that doesn’t use single-character shortcuts and either nothing happens or you get some stray text you can easily undo. Do the same thing in a single-character-command program and you can cause many actions to happen at once.

A stray “Kelly” in your Gmail inbox, for instance, will move the cursor up one message (single-character command “k”) and archive it (single-character command “y”). “Bruno” causes even more damage.
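Here’s a rough sketch of the hazard in Python. It’s an illustration only, not Gmail’s code: the table holds just the two shortcuts described above, the function name is made up, and treating upper- and lowercase letters alike is a simplification.

```python
# Illustrative only: a toy model of a program with single-character shortcuts.
# The table holds just the two Gmail shortcuts described in the text.
SINGLE_CHAR_SHORTCUTS = {
    "k": "move up one message",
    "y": "archive message",
}


def actions_for_stray_text(text):
    """Return the actions triggered if `text` arrives as keystrokes."""
    actions = []
    # Lowercasing is a simplification; real programs may treat case differently.
    for char in text.lower():
        action = SINGLE_CHAR_SHORTCUTS.get(char)
        if action is not None:
            actions.append(action)
    return actions


print(actions_for_stray_text("Kelly"))
```

One misheard word yields two actions, and the longer the shortcut table, the more letters in any stray word will do something.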

Turn off the keyboard shortcuts, though, and the program becomes fairly inaccessible for speech users. We need the shortcuts, and we can combine multiple keystrokes into single utterances to make things even better. It’s having little control over them that presents a problem.

Speech-safe single character shortcuts

Google Labs has a nifty extension that presents a simple fix. It lets you change the characters you use for keyboard shortcuts, including using two characters rather than one. Add a plus sign (+) to the beginning of every shortcut and they all become speech-safe.

Here are step-by-step instructions.
– go to your Gmail account, click the settings gear icon at the top right of the screen
– click “Labs”
– search for the “Custom Keyboard Shortcuts” extension and click to download. This will add a “Keyboard Shortcuts” tab to your Gmail settings
– now, click the settings gear icon at the top right of the screen
– click Keyboard Shortcuts
– add “+” to the beginning of every command

If you’re using Utter Command 2.0 you’re now all set. Say “Plus” and any one- or two-character command. Say, for instance “Plus j” or “Plus Juliet” to move down one item. You can also say a command multiple times in a single utterance. Say “Plus j Repeat 5” to move down five items, for instance. And you can combine two commands: “Plus j Plus y” moves down one item, then archives that item. Say “Question Mark” to call up the keyboard shortcuts list.
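To make the idea concrete, here’s a minimal Python sketch of how an utterance like these could expand into keystrokes. This is my own illustration, not how Utter Command is implemented: the function name and the parsing rules are assumptions, and the spoken-letter table holds only the one spoken form mentioned above.

```python
# Hypothetical sketch: turn a "Plus ..." utterance into the keystrokes it
# would send once every shortcut has been given a "+" prefix.
SPOKEN_LETTERS = {"juliet": "j"}  # only the spoken form mentioned in the text


def utterance_to_keystrokes(utterance):
    """Expand e.g. 'Plus j Repeat 5' into a list of two-character shortcuts."""
    words = utterance.split()
    keystrokes = []
    i = 0
    while i < len(words):
        word = words[i].lower()
        if word == "plus" and i + 1 < len(words):
            key = words[i + 1].lower()
            keystrokes.append("+" + SPOKEN_LETTERS.get(key, key))
            i += 2
        elif word == "repeat" and i + 1 < len(words) and keystrokes:
            total = int(words[i + 1])
            # "Repeat 5" means five presses in all, so add four more.
            keystrokes.extend([keystrokes[-1]] * (total - 1))
            i += 2
        else:
            raise ValueError(f"unrecognized word: {words[i]!r}")
    return keystrokes


print(utterance_to_keystrokes("Plus j Plus y"))
```

Because every shortcut now starts with “+”, a stray word that contains no plus sign sends nothing that matches a shortcut.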

Raising the bar

The Google Labs add-on enables Gmail for speech users, but there are many other programs out there that use single-character shortcuts, including other Google programs, and other Web-based programs like Twitter. Message for Google: How about one facility that would let us control keyboard shortcuts across Google programs?

It would also improve things if we could have a larger number of characters available for a given character shortcut, the ability to also control control-key shortcuts, the ability to save and share different sets, and the ability to apply at least some shortcuts across applications.

Important Note: If you were a beta tester or received the Utter Command 2.0 pre-release, you might not have the “Plus” set of commands. If this is the case, send e-mail to “Info” at this web address, and we’ll make sure you have the release version. The release version shows 15 new sets of commands on the “New commands for 2.0” list you can open from the Taskbar icon menu.


Good signs around Google accessibility

By Kimberly Patch

It looks like Google is stepping up its accessibility effort and resources.

– Google accessibility page:

– Google Accessibility Twitter account:
@googleaccess Google Accessibility

– Accessibility Google Group

Here’s a tweet about accessibility in Google+:!/googleaccess/status/86442474523992065
“We considered accessibility of Google+ from day 1. Find something we missed? Press Send Feedback link & let us know.”

I do think there’s a lot missing.

For starters, Google+ is quite short on keyboard shortcuts (the Google Manager addon addresses this in part). It’s also short on basic keyboard navigation — in a perfect world, the down/up arrows and enter key should allow you to navigate anything that looks like a list or a menu.

Asking for feedback like this is a very good sign, however. One thing I’ve used the Send Feedback link to point out is once you get past a dozen circles or so it’s important to have a list view unless you’re willing and able to do a lot of unnecessary scrolling.

Here’s a recent Google blog post about accessibility in Docs, Sites, and Calendar that talks about additional keyboard shortcuts:

Some Google applications are gaining more keyboard shortcuts. You still can’t use down/up arrows on everything that looks like a list or a menu, however.

The bottom line is there are some channels open and some good intentions. This is great. Now let’s hold them to it, and keep the keyboard shortcuts coming.

My #1 request as a speech user is the ability to adjust, organize and share keyboard shortcuts across apps. An adjust-your-shortcuts facility that works across apps would not only be good for many different types of users, it would address a special problem of speech users and the type of keyboard shortcuts that web apps tend to use. More on that issue next.

Making filling out forms fast and easy

By Kimberly Patch

Here’s a simple way to make filling out forms in Firefox easier.

If you find yourself frequently putting the same old information, such as name and address, in a Web form, this will save you a lot of time, and it’s probably worth the time to set up even if you fill out forms just a few times a year (speech instructions are for Dragon plus Utter Command):

– Click on this link to download the Autofill Forms extension:
– In Firefox say “Under Tango Alpha” to Click Tools/Add-ons
– “Shift Tab”, and if necessary “1-10 Down” to Navigate to Autofill Forms
– “2 Tab · Enter” to Click “Options”
– “2 Tab” to Navigate to the first field
– Fill in all applicable fields
– “Enter” to save your information

Now anytime you find yourself in a form field say “Under Juliet” and applicable fields will automatically fill in.

That was the quick easy setup. If you want to change the keyboard shortcut or set several different profiles, take a look at the options. There’s a lot you can do with this add-on.

Feel free to +Kim Patch if you want me to add you to my Utter Command Circle on Google+


Making Google+ easier to use

By Kimberly Patch

Here are a couple of Firefox add-ons that make Google+ easier to use.

– Google+Manager
This adds the keyboard shortcuts Google should have included in the first place, a translate button for every post, a drop-down menu for common functions, and a tiny URL generator.

– i rec Plus 1 and Like
This adds a “+f” icon in Firefox next to the homepage button (near the top right corner). Click the button to get Google +1 and Facebook Like buttons for any webpage. You can use the Utter Command naming-a-mouse-click utility to click the button, then click the icon to share to either service using a single speech command (details in UC Lesson 10.24).

Spell Everywhere

I’ve been getting a lot of questions lately about the Dragon NaturallySpeaking “Spell XYZ” command. This command lets you say, for instance “Spell s a”. People are complaining that it sometimes doesn’t work. They’re right.

This command doesn’t work everywhere. It only works in text boxes. This is an unfortunate oversight in the Dragon user interface.

Logically, any speech command should work in all contexts where it could be useful. It’s unnecessarily difficult to make the user remember different commands to carry out the same operations in different contexts. Something as basic as pressing a letter key should work anywhere you might want to use a letter, including menus.

This is what people are complaining about. Those who are complaining have gotten adept enough at speech that something basic like pressing letter keys becomes second nature. They have a habit of saying “Spell” and then a letter, number or symbol name whenever they have to hit separate keys. The definition of habit is you don’t have to think about it. And this is where they get in trouble — the habit kicks in everywhere, including when you are in a drop-down menu that doesn’t respond to full words.

If you’d like to use the “Spell XYZ” command everywhere rather than having to stop and think about where you can and can’t use it, complain to Nuance, the company that makes Dragon. There are a couple of ways to do this, and details are posted on the Redstart wiki:

Thunderbird tabs and consistency

Thunderbird now has tabs for open messages, which is very convenient. You can have three messages open and see where they are from the tabs — this is similar to tabbed browsing in programs like Firefox and Internet Explorer. And you can move among tabs using the same commands you use to move among tabs in your browser: “Tab Back”, “Tab Forward” and “1-20 Tab Back/Forward”.

Unfortunately, however, the keyboard shortcut to close a message tab is different from the standard close document/tab command used in most programs including Firefox, even though Thunderbird is developed by the same organization as Firefox. The usual command “Control Function 4” logically mirrors the common “Alternate Function 4” that’s used to close a window.

If the standard keyboard shortcut were enabled like it is in programs like Microsoft Word and Firefox, you could say the shortcut or “Document Close” to close a document or tab. And if you wanted to close more than one you could say “Document Close Times 3”, for instance.

If you dig through the keyboard shortcuts for Thunderbird, you’ll find that there is a nonstandard keyboard shortcut to close a message tab: “Control w”. So you can train yourself to say “Control w” to close a message when you’re in Thunderbird. Keep in mind you can also say “Control w Times 3” to close three open messages. But it would be far better to not have to think about which program you are in when closing a tab or document. Feel free to complain to Thunderbird about this oversight at the Thunderbird support forum.
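Here’s a small Python sketch of what consistency would buy: one spoken command that resolves to the right keystroke whichever program is in front. It’s a hypothetical illustration, not a description of how Utter Command or Dragon works, and the function and table names are made up.

```python
# Hypothetical sketch: one spoken command, different keystrokes per program.
STANDARD_CLOSE = "Control Function 4"
CLOSE_OVERRIDES = {"Thunderbird": "Control w"}  # nonstandard, per the text


def document_close(program, times=1):
    """Return the keystrokes for 'Document Close Times N' in `program`."""
    if times < 1:
        raise ValueError("times must be at least 1")
    shortcut = CLOSE_OVERRIDES.get(program, STANDARD_CLOSE)
    return [shortcut] * times


print(document_close("Firefox"))
print(document_close("Thunderbird", times=3))
```

Every entry in the overrides table is one more exception somebody has to discover and maintain, which is the cost of a nonstandard shortcut.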

Here’s another Thunderbird tip: If you want to move a message rather than just closing it try “Move Recent”, “1-10 Down Enter”.
There’s more Thunderbird strategy on the Redstart Wiki:

Tip: Easier scrolling with mouse-speech combination

If you use a mouse to scroll, have you noticed how much fine motor control you use to keep the arrow on the scroll bar as you move the page? You’re doing a fair bit of work to do this. It’s akin to staying on a balance beam.

If you can move your mouse, you can use an Utter Command touch/speech combination that’ll show you just how hard you have to work to use just the mouse to control the scroll bar.

Next time you use the mouse to scroll, place the mouse arrow on the scroll bar, then say “Touch Hold”. This command holds down the left mouse button. Now you can scroll by simply moving the mouse up and down. There’s no need to click, and there’s no need to keep inside the narrow confines of the width of the scroll bar. This command is especially effective when you’re reading and can leave the left mouse button down between moves. It’s also especially effective when you’re skimming quickly through a document — you can concentrate more on what you’re reading because there’s no need to take your eyes off it to make sure the mouse is on the scroll bar. When you’re done using this command make sure to release the mouse button: “Touch Release”.

You can use the same method in a drawing program to draw without having to have the pen touch the tablet.

There are more details on the “Touch Hold/Release” command in UC Lesson 4.5.

Keep in mind that the Touch Hold/Release method is one of several ways to control the scroll bar using Utter Command — if the combination is comfortable for you it’s a good one. If you need to be completely hands-free, see UC Lesson 1.8, which details all the ways you can use speech to navigate documents, and UC Lesson 9.5, which details Web navigation.

Happy navigating.

Discover, Adjust, Organize and Share

By Kimberly Patch

Keyboard shortcuts have a lot of potential. They’re fast.

For example, cutting and pasting by

– Hitting “Control x”
– Moving the cursor to the paste location
– Then hitting “Control v”

is speedier than

– Moving the mouse to the “Edit” menu
– Clicking “Edit”
– Clicking “Cut”
– Moving the cursor to the paste location
– Moving back up to click “Edit”
– Then clicking “Paste”.

Add this up over many tasks and you have a big difference in productivity.

So why don’t we see more people using keyboard shortcuts?

Ask someone who uses the mouse for just about everything and you’re likely to get a compelling answer — it’s easier. And it is — it’s cognitively easier to choose a menu item than to remember a shortcut.

Given a choice, people generally do what’s easier. On a couple of different occasions I’ve heard people say that, all else being equal, they’d hire a blind programmer over a sighted one because the blind programmer is faster. The blind programmer must use keyboard shortcuts.

This is a common theme: we have something potentially better, but human behavior stands in the way of adoption.

In the case of keyboard shortcuts there’s a little more to the story, however.

As a software community we haven’t implemented keyboard shortcuts well.

Many folks know keyboard shortcuts for a few very common actions like cut, paste and bold, but it’s more difficult to come up with keyboard shortcuts for actions like adding a link or a hanging indent because they are used less often and are less likely to be the same across programs.

So the user is often stuck with different shortcuts for the same tasks in different programs, requiring him to memorize and keep track of multiple sets of controls. This is cognitively difficult for everyone, and more so for some disabled populations and the elderly.

This type of implementation is akin to asking someone to speak different languages depending on who they are speaking to. Depending on how motivated and talented they are, some folks may be able to do it, but not many. And if there’s an easier way, even those capable of doing it either way will often choose easier even if it’s less efficient.

So we aren’t letting keyboard shortcuts live up to their potential.

There’s a second keyboard shortcuts issue that’s getting worse as Web apps become more prevalent: clashing shortcuts. If you hit “Control f” in a Google document, do you get the Google Find facility or the browser Find facility? Go ahead and try it out. It’s messy.

This is already an issue in the assistive technology community, where people who require alternate input or output must use software that runs all the time in conjunction with everything else. For example, a speech engine must be on all the time listening for commands, and screen magnifier software must be running all the time to enlarge whatever you’re working in.

So there are two problems: keyboard shortcuts aren’t living up to their potential to increase efficiency, and, especially on the Web, keyboard shortcuts are increasingly likely to clash.
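The clash problem is easy to sketch in Python. The data here is hypothetical: only the “Control f” overlap comes from the example above, and the other entries and all the names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical data: which running programs claim which shortcuts.
# Only the "Control f" overlap comes from the text; the rest is made up.
ACTIVE_SHORTCUTS = {
    "browser": {"Control f": "find in page", "Control t": "new tab"},
    "web app": {"Control f": "find in document", "Control b": "bold"},
}


def find_clashes(active):
    """Map each shortcut claimed by more than one program to its claimants."""
    claimants = defaultdict(list)
    for program, shortcuts in active.items():
        for shortcut in shortcuts:
            claimants[shortcut].append(program)
    return {s: p for s, p in claimants.items() if len(p) > 1}


print(find_clashes(ACTIVE_SHORTCUTS))
```

The check itself is trivial. What’s missing today is a place where all the shortcut tables can be seen together.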

I think there’s a good answer to both problems: a cross-program facility to easily discover, adjust, organize and share shortcuts.

– We need to easily discover shortcuts in order to see them all at once so we can see patterns across programs and conflicts in programs/apps that may be opened at once.

– We need to easily adjust shortcuts so we can choose common shortcuts and avoid clashes.

– We need to organize so we can remember what we did. We need to easily arrange commands and add headings so we can find commands quickly and over time build a good mental map of commands. Lack of ability to organize is the Achilles’ heel of many macro facilities. It’s like asking people to play cards without being able to rearrange the cards in their hand. It’s possible, but unless there’s a reason for it, it makes things unnecessarily difficult.

– We need to share the adjustments because it makes us much more efficient as a community. My friend Dan, for instance, is very logical. He uses many of the same programs I do, and we both use speech input. So if there were a facility to discover, adjust, organize and share keyboard shortcuts, I’d look to see if Dan had posted his changes, and I would adjust to my needs from there.

The organizing and sharing parts are the most important, because they allow for crowdsourcing.
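As a sketch of what organizing and sharing might look like, here’s a little Python that saves a shortcut set grouped under headings, then starts from someone else’s shared set and layers personal adjustments on top. No such facility exists as far as I know. The file format, the function names and the sample shortcuts are all assumptions made for illustration.

```python
import json


# Hypothetical format: shortcuts grouped under user-chosen headings.
def export_shortcuts(shortcuts_by_heading):
    """Serialize an organized shortcut set so it can be shared."""
    return json.dumps(shortcuts_by_heading, indent=2, sort_keys=True)


def adopt_shared(shared_json, my_adjustments):
    """Start from someone else's shared set, then apply my own changes."""
    merged = json.loads(shared_json)
    for heading, shortcuts in my_adjustments.items():
        merged.setdefault(heading, {}).update(shortcuts)
    return merged


# Made-up sample data standing in for a friend's posted set.
shared = export_shortcuts({"Editing": {"add link": "Control k"}})
mine = adopt_shared(shared, {"Mail": {"archive": "+y"}})
print(mine)
```

A plain text format like this one would also make it easy to compare two people’s sets side by side.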

Over the past few decades the computer interface ecosystem has shifted from single, unrelated programs to separate programs that share information, to programs so integrated that users may not know when they are going from one to another. This has increased ease-of-use and efficiency but at the same time complicated program control.

At the same time programs have grown more sophisticated. There’s a lot of wasted potential in untapped features.

If we give users the tools to discover, adjust, organize and share, I bet we’ll see an increase in speed and efficiency and an uptick in people discovering nifty new program features.

Suggestion for Dragon: Easier Correction

In the last couple of months I’ve had a couple of occasions to suggest to the folks at Nuance, the company that makes the Dragon NaturallySpeaking speech engine, that their “Resume With” command is under-advertised. The command is very useful, but I keep meeting people who don’t know about it.

“Resume With” lets you change text on the fly. For instance, if you say “The black cat jumped over the brown dog”, then — once you see it on the screen — change your mind about the last bit and say “Resume With over the moon”, the phrase will change to “The black cat jumped over the moon.”

This is a particularly useful command for doing something people do a lot — change text as they dictate.

Now I have a suggestion that I think would make the command both better and more often used. Split “Resume With” into two commands: “Try Again” and “Change To”. The two commands would have the same result as “Resume With”, but “Try Again” would tell the computer that the recognition engine got it wrong the first time and you are correcting the error. “Change To” would tell the computer that you are simply changing text.
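Here’s a minimal Python sketch of the split. It is not Dragon’s actual algorithm: I’m assuming the edit anchors on the last occurrence of the new phrase’s first word, punctuation is ignored, and the function names and the corrections list are made up for illustration.

```python
# Hypothetical sketch of the proposed split; not Dragon's actual algorithm.
def resume_with(text, new_phrase):
    """Replace the text from the last occurrence of the new phrase's first
    word onward. Returns (new_text, replaced_tail)."""
    words, new_words = text.split(), new_phrase.split()
    if not new_words:
        raise ValueError("new phrase is empty")
    anchor = new_words[0].lower()
    for i in reversed(range(len(words))):
        if words[i].lower() == anchor:
            replaced = " ".join(words[i:])
            return " ".join(words[:i] + new_words), replaced
    raise ValueError(f"{new_words[0]!r} not found in dictated text")


def change_to(text, new_phrase):
    """The speaker changed their mind: edit the text, record nothing."""
    new_text, _ = resume_with(text, new_phrase)
    return new_text


def try_again(text, new_phrase, corrections):
    """The recognizer misheard: edit the text and log the correction."""
    new_text, replaced = resume_with(text, new_phrase)
    corrections.append((replaced, new_phrase))
    return new_text


print(change_to("The black cat jumped over the brown dog", "over the moon"))
```

Both commands make the same edit on screen. The only difference is whether the engine gets a misheard/intended pair it could learn from.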

This would be a less painful way to correct text than the traditional correction box. Users are tempted to change text rather than correct it because it’s easier. This would make it equally easy to correct and change using what is arguably the fastest and easiest way to make a change.

Easy correcting is important because NaturallySpeaking learns from correcting and because it’s annoying when the computer gets things wrong. Correcting improves recognition. Minimizing the interruption reduces frustration and lets users concentrate on their work rather than spending time telling Dragon how to do its job. From my observations, many users are tempted to change text rather than correct it when the computer gets something wrong simply because it’s easier.

It would be great to have these commands both in Dragon NaturallySpeaking on the desktop and in Dragon Dictation, the iPhone application. This would enable truly hands-free dictation in Dragon Dictation.