Archive for the ‘Patch on Speech’ Category

iPhone 4S: speech advances but there’s more to do

Thursday, October 6th, 2011

By Kimberly Patch

Apple’s iPhone 4S has taken a couple of nice big steps toward adding practical speech to smart phones. There are still some big gaps, mind you. I’ll get to those as well.

Speech on the keyboard

The long-awaited speech button is now part of the keyboard. Everywhere there’s a keyboard you can dictate rather than type. This is far better than having to use an app to dictate, then cut and paste into applications. This is one of the big steps. This will make life much easier for people who have trouble using the keyboard. And I suspect a large contingent of others will find themselves dictating into the iPhone a good amount of the time, increasingly reserving the keyboard for situations where they don’t want to be overheard.

The key question about speech on the keyboard is how it works beyond the letter keys and straight dictation.
For instance, after you dictate
“Great! I’ll meet you at the usual place (pool cue at the ready) at 6:30.”
how easy is it to change what you said to something like this?
“Excellent :-) I’ll meet you at the usual place (pool cue at the ready) at 7:00.”
And then how easy is it to go back to the original if you change your mind again?

Speech assistant

After we all use the speech assistant for a couple of days or weeks it’ll become readily apparent where Siri lies on the very-useful-to-very-annoying continuum.

The key parameters are
- how much time Siri saves you
- how a particular type of Siri audio feedback hits you the 10th time you’ve heard it
- how physically and cognitively easy it is to switch between the assistant and whatever you have to do with your hands on the phone.

One thing that has the potential to tame the annoyance factor is giving users some control over the feedback.

I think the tricky thing about computer-human feedback is it’s inherently different from human-human feedback. One difference is the computer has no feelings and we know that. Good computer-human feedback isn’t necessarily the same as good human-human feedback.

The big gap

There’s still a big speech gap on the iPhone. Speech is still just a partial interface.

Picture sitting in an office with a desktop computer and a human assistant. Type anything you want using the letter keys on your keyboard or ask the assistant to do things for you. You could get a fair amount of work done this way, but there’d still be situations where you’d want to control your computer directly using keyboard shortcuts, arrow keys or the mouse. Partial interfaces have a high annoyance factor.

Even if you use a mix of speech, keyboard and gesture, if you’re able to choose the method of input based on what you want to do rather than what happens to be available, true efficiencies will emerge.

Ultimately, I want to be able to completely control my phone by speech. And I suspect if we figure out how to do that, then make it available for everyone, the general mix of input will become more efficient.

I’d like to see the computer industry tap folks who have to use speech recognition as testers. I think this would push speech input into practical use more quickly and cut out some of the annoyance-factor growing pains.

What do you think? Let me know at Kim@ this domain name.

Getting Gmail working well with speech commands

Wednesday, September 21st, 2011

By Kimberly Patch

If you haven’t used speech commands to control a computer, it might not be obvious that single character commands, for instance “y” to archive a message in Gmail, can present a challenge.

Single-character commands seem like a great idea, especially for Web programs, because your Web browser already takes up some common keyboard shortcuts. Gmail has a lot of single-character commands, and once you get to know them you can fly along using the keyboard. In general I’m all for more keyboard shortcuts because it’s easy to enable them using speech.

Command conundrum

Single-character commands that can’t be changed, however, can get speech users in a lot of trouble. Say a command or make a noise that’s misheard as text in a program that doesn’t use single-character shortcuts and either nothing happens or you get some stray text you can easily undo. Do the same thing in a single-character-command program and you can cause many actions to happen at once.

A stray “Kelly” in your Gmail inbox, for instance, will move the cursor up one message (single-character command “k”) and archive it (single-character command “y”). “Bruno” causes even more damage.
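
To see why, here is a minimal Python sketch of a program that treats every incoming character as a possible command. It is an illustration only, not Gmail's actual key handling: the `SHORTCUTS` table holds just the two commands named above, the function name `actions_for` is made up for this example, and lowercasing the input is a simplification.

```python
# Hypothetical model of single-character shortcuts; not Gmail's real code.
# Only the two commands mentioned in the post are included.
SHORTCUTS = {
    "k": "move up one message",
    "y": "archive",
}


def actions_for(misheard_text):
    """Return the actions fired when misheard text arrives as keystrokes."""
    triggered = []
    # Lowercasing is a simplification: case is ignored in this sketch.
    for char in misheard_text.lower():
        action = SHORTCUTS.get(char)
        if action is not None:
            triggered.append(action)
    return triggered


print(actions_for("Kelly"))  # ['move up one message', 'archive']
```

The letters "e" and "l" do nothing in this sketch only because the table is so small. The more single-character commands a program binds, the more actions one stray word can set off.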

Turn off the keyboard shortcuts, though, and the program becomes fairly inaccessible for speech users. We need the shortcuts, and we can combine multiple keystrokes into single utterances to make things even better. It’s having little control over them that presents a problem.

Speech-safe single character shortcuts

Google Labs has a nifty extension that presents a simple fix. It lets you change the characters you use for keyboard shortcuts, including using two characters rather than one. Add a plus sign (+) to the beginning of every shortcut and they all become speech-safe.

Here are step-by-step instructions.
- go to your Gmail account, click the settings gear icon at the top right of the screen
- click “Labs”
- search for the “Custom Keyboard Shortcuts” extension and click to download. This will add a “Keyboard Shortcuts” tab to your Gmail settings
- now, click the settings gear icon at the top right of the screen
- click Keyboard Shortcuts
- add “+” to the beginning of every command

If you’re using Utter Command 2.0 you’re now all set. Say “Plus” and any one- or two-character command. Say, for instance, “Plus j” or “Plus Juliet” to move down one item. You can also say a command multiple times in a single utterance. Say “Plus j Repeat 5” to move down five items, for instance. And you can combine two commands: “Plus j Plus y” moves down one item, then archives that item. Say “Question Mark” to call up the keyboard shortcuts list.
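
Here is a rough Python sketch of why the plus-sign prefix makes shortcuts speech-safe. It is a toy model, not Gmail's or Utter Command's real behavior. It assumes that saying “Plus j” types the two characters `+j`, and the names `PLUS_SHORTCUTS` and `prefixed_actions` are invented for the example.

```python
# Hypothetical model of "+"-prefixed shortcuts; not Gmail's real code.
PREFIX = "+"
PLUS_SHORTCUTS = {
    "j": "move down one item",
    "k": "move up one item",
    "y": "archive",
}


def prefixed_actions(keystrokes):
    """Return actions for keystrokes, firing only on PREFIX + shortcut key."""
    triggered = []
    i = 0
    while i < len(keystrokes):
        is_prefix = keystrokes[i] == PREFIX
        has_next = i + 1 < len(keystrokes)
        if is_prefix and has_next and keystrokes[i + 1] in PLUS_SHORTCUTS:
            triggered.append(PLUS_SHORTCUTS[keystrokes[i + 1]])
            i += 2  # consume the prefix and the shortcut character
        else:
            i += 1  # stray character: nothing happens
    return triggered


print(prefixed_actions("Kelly"))   # [] because the stray word has no prefix
print(prefixed_actions("+j+y"))    # ['move down one item', 'archive']
print(prefixed_actions("+j" * 5))  # five 'move down one item' actions
```

A misheard word would now have to contain a plus sign right before a shortcut letter to do any damage, which is far less likely than a word that merely contains a “k” or a “y”.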

Raising the bar

The Google Labs add-on enables Gmail for speech users, but there are many other programs out there that use single-character shortcuts, including other Google programs, and other Web-based programs like Twitter. Message for Google: How about one facility that would let us control keyboard shortcuts across Google programs?

It would also improve things if we could have a larger number of characters available for a given character shortcut, the ability to change Control-key shortcuts as well, the ability to save and share different sets, and the ability to apply at least some shortcuts across applications.

Important Note: If you were a beta tester or received the Utter Command 2.0 pre-release, you might not have the “Plus” set of commands. If this is the case, send e-mail to “Info” at this web address, and we’ll make sure you have the release version. The release version shows 15 new sets of commands on the “New commands for 2.0” list you can open from the Taskbar icon menu.


Good signs around Google accessibility

Monday, September 19th, 2011

By Kimberly Patch

It looks like Google is stepping up its accessibility effort and resources.

- Google accessibility page:
http://www.google.com/accessibility/

- Google Accessibility Twitter account:
@googleaccess Google Accessibility

- Accessibility Google Group
http://groups.google.com/group/accessible

Here’s a tweet about accessibility in Google+:
http://twitter.com/#!/googleaccess/status/86442474523992065
“We considered accessibility of Google+ from day 1. Find something we missed? Press Send Feedback link & let us know.”

I do think there’s a lot missing.

For starters, Google+ is quite short on keyboard shortcuts (the Google Manager addon addresses this in part). It’s also short on basic keyboard navigation — in a perfect world, the down/up arrows and enter key should allow you to navigate anything that looks like a list or a menu.

Asking for feedback like this is a very good sign, however. One thing I’ve used the Send Feedback link to point out is that once you get past a dozen circles or so it’s important to have a list view unless you’re willing and able to do a lot of unnecessary scrolling.

Here’s a recent Google blog post about accessibility in Docs, Sites, and Calendar that talks about additional keyboard shortcuts:
http://googleblog.blogspot.com/2011/09/enhanced-accessibility-in-docs-sites.html

Some Google applications are gaining more keyboard shortcuts. You still can’t use down/up arrows on everything that looks like a list or a menu, however.

The bottom line is there are some channels open and some good intentions. This is great. Now let’s hold them to it, and keep the keyboard shortcuts coming.

My #1 request as a speech user is the ability to adjust, organize and share keyboard shortcuts across apps. An adjust-your-shortcuts facility that works across apps would not only be good for many different types of users, it would address a special problem of speech users and the type of keyboard shortcuts that web apps tend to use. More on that issue next.

Change People’s Lives

Monday, September 19th, 2011

By Kimberly Patch

If you’re anywhere near Boston this week, make sure to check out the Change People’s Lives Conference and Expo this Friday, September 23 at the Hynes Convention Center. The event is hosted by the Commonwealth of Massachusetts and Governor Deval Patrick will be giving the keynote address. Event collaborators include The Institute for Human Centered Design, Massachusetts Rehabilitation Commission, Work without Limits, and Easter Seals.

Dragon NaturallySpeaking maker Nuance is exhibiting, and Peter Mahoney, Senior Vice President & General Manager of the Dragon Business Unit, is scheduled to give a talk. Information about Utter Command will also be available at the Nuance booth.

The expo is free. It costs $75 to attend the conference sessions. Registration details are here: www.ChangePeoplesLives.org

Making filling out forms fast and easy

Friday, September 2nd, 2011

By Kimberly Patch

Here’s a simple way to make filling out forms in Firefox easier.

If you find yourself frequently putting the same old information (name, address etc.) in a Web form, this will save you a lot of time. It’s probably worth the time to set up even if you fill out forms just a few times a year. Speech instructions are for Dragon plus Utter Command:

- Click on this link to download the Autofill Forms extension:
https://addons.mozilla.org/en-US/firefox/addon/autofill-forms/
- In Firefox say “Under Tango Alpha” to Click Tools/Add-ons
- “Shift Tab”, and if necessary “1-10 Down” to Navigate to Autofill Forms
- “2 Tab · Enter” to Click “Options”
- “2 Tab” to Navigate to the first field
- Fill in all applicable fields
- “Enter” to save your information

Now anytime you find yourself in a form field say “Under Juliet” and applicable fields will automatically fill in.

That was the quick easy setup. If you want to change the keyboard shortcut or set several different profiles, take a look at the options. There’s a lot you can do with this add-on.

Feel free to +Kim Patch if you want me to add you to my Utter Command Circle on Google+


Using speech recognition for passwords

Friday, August 12th, 2011

By Kimberly Patch

I get a lot of questions about how to use speech recognition software for passwords.

Speech is inherently different from the keyboard because people can tell what your password is when you say it out loud. And when the password is unpronounceable you end up spelling it, which is both nonsecure and tedious.

I see a lot of people using the not-so-great solution of mapping a cryptic password to something pronounceable using the Dragon vocabulary manager or the Utter Command Enter List facility (UC Enter List lets you combine words with the Enter key). Neither method is very secure, because the mapping is in a utility that someone can simply look at.

The easiest good solution is to use what’s already there — check “Remember Password” on your browser and when you type your username the password will fill in automatically. Set a master password in your browser to protect the list of passwords (in Firefox click tools/options/security and check “Use a Master Password”).

Once you have Remember Password set up, put your username(s) in the Utter Command Enter List (say “Add Enter” to open the Enter List) and you’ll be able to say your username plus “Enter” in a single phrase. With “Remember Password” checked the password will fill in automatically and you’ll be able to log on using a single speech command.

Another good solution is a password manager like Roboform, which manages all your passwords (there’s a free version). All you need to do is enter a master password when you turn on your computer. Roboform also automatically fills in forms for you. It takes some setup, but in the end it makes things easier.

Dragon 11.5 free upgrade finally here

Wednesday, July 27th, 2011

By Kimberly Patch

If you already use Dragon NaturallySpeaking 11, you’re entitled to a free upgrade to 11.5.

You must access the upgrade from within Dragon 11. Click on “Help\Check for Updates”, select “Dragon NaturallySpeaking 11.5” and follow the instructions to download the upgrade. Once it’s downloaded, click on the upgrade to install.

There’s more information here:  nuance.custhelp.com/app/answers/detail/a_id/6213

We strongly recommend upgrading. The upgrade fixes several bugs and gives you the ability to use your iPhone as a wireless microphone over a wifi network.

Utter Command is fully compatible with Dragon 11.5.

Making Google+ easier to use

Wednesday, July 27th, 2011

By Kimberly Patch

Here are a couple of Firefox add-ons that make Google+ easier to use.

- Google+Manager
https://addons.mozilla.org/en-US/firefox/addon/google-manager/
This adds the keyboard shortcuts Google should have included in the first place, a translate button for every post, a drop-down menu for common functions, and a tiny URL generator.

- i rec Plus 1 and Like
https://addons.mozilla.org/en-US/firefox/addon/i-rec-plus-1-and-like/
This adds a “+f” icon in Firefox next to the homepage button (near the top right corner). Click the button to get Google +1 and Facebook Like buttons for any webpage. You can use the Utter Command naming-a-mouse-click utility to click the button, then click the icon to share to either service using a single speech command (details in UC Lesson 10.24).

Quick Hotmail control by speech

Tuesday, July 26th, 2011

By Kimberly Patch

I got a question today about controlling Hotmail by speech. Here’s a short answer.

The good news about web programs is more of them now have keyboard shortcuts, and more of the shortcuts are standard. This makes it easier to use speech control without customization.

Do a web search for “Hotmail keyboard shortcuts” and you’ll find several lists. Here’s one from about.com.

Hotmail has a pretty good set of shortcuts, including some de facto standards. Utter Command e-mail commands like “New Message”, “This Reply”, “Reply All” and “This Save” work in Hotmail because Hotmail uses the common shortcuts for these functions (see UC Lesson 8.3).

For the less standard functions you can speak keyboard. Here are a few that are particularly useful:
“Letter fi” or “Foxtrot India” goes to the Inbox folder
“Letter fs” or “Foxtrot Sierra” goes to the Sent folder
“Control Dot” goes to the next message
“Control Comma” goes to the previous message
“Control Enter” sends

Tip: Make sure to say “Shift” before “Control” if you use any of the shift control commands.
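
For the curious, the idea behind the letter-word commands above (“Foxtrot India” for “fi”) can be sketched in a few lines of Python. This is only an illustration of the mapping, not how Utter Command or Dragon works internally. The function name `spoken_letters_to_keys` is made up, and the word list uses the common “Alpha” and “Juliet” spellings.

```python
# Hypothetical sketch: turn spoken letter words into the keys they stand for.
NATO_WORDS = [
    "Alpha", "Bravo", "Charlie", "Delta", "Echo", "Foxtrot", "Golf",
    "Hotel", "India", "Juliet", "Kilo", "Lima", "Mike", "November",
    "Oscar", "Papa", "Quebec", "Romeo", "Sierra", "Tango", "Uniform",
    "Victor", "Whiskey", "X-ray", "Yankee", "Zulu",
]
# Each word stands for its own first letter.
NATO_TO_LETTER = {word.lower(): word[0].lower() for word in NATO_WORDS}


def spoken_letters_to_keys(phrase):
    """Convert a phrase such as 'Foxtrot India' into the keys 'fi'."""
    keys = []
    for word in phrase.split():
        letter = NATO_TO_LETTER.get(word.lower())
        if letter is None:
            raise ValueError(f"not a letter word: {word!r}")
        keys.append(letter)
    return "".join(keys)


print(spoken_letters_to_keys("Foxtrot India"))   # fi
print(spoken_letters_to_keys("Foxtrot Sierra"))  # fs
```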

Probably the best way to control drop-down menus that you use frequently in web programs is to use the naming-a-mouse-click ability (see UC Lesson 10.24). You can say two mouse clicks in a row to control a drop-down using a single speech command.

Heat and computers

Monday, July 25th, 2011

Those of us who use speech recognition are giving our computers a pretty good workout — the speech engine takes a lot of compute power. As long as you have a fairly powerful computer and do a couple minutes of maintenance every few weeks you’ll be all set.

Unless, like last week, it’s very hot, and you try to use your computer in a room that’s not air-conditioned.

Computers naturally get warmer when you use them. On a cool day the heat dissipates pretty well by itself. Last week was a different story, however. When a computer gets too hot, the computer fan kicks on to cool it down. If it’s still too hot the chip will automatically slow down. This all presents problems for the speech user. First, the excess noise of the fan can make it harder for the microphone to process your speech, and can make the signal that ultimately passes to the computer chip less clean so the computer has to work harder to decipher it. These can both increase the lag between you saying a command and the computer recognizing it. And if the chip slows down, processing slows down further.

The moral of the story is if you find yourself trying to use speech on a hot day and you think the computer is slowing down, it probably is. Turn it off for a little while and it will do better when you turn it back on. Find air-conditioning or wait till the air temperature is cooler and it will do even better. And make it a habit to turn off your computer when you’re not using it so it can cool down completely.

FYI: here’s advice on the ideal setup for speech-recognition and a two-step maintenance program for speech-recognition.