Heads up – if you liked Patch on Speech, please take a look at my current blog – Patch on Tech. You’ll find my thoughts on speech input there along with many other things tech related. You can also find me @PatchonTech.
By Kimberly Patch
Ten years ago the Boston Voice Users Group (BVUG) constructed a top 10 Christmas list of features and fixes we wanted for Dragon NaturallySpeaking Version 7. We solicited ideas by email and had a meeting where people brought more ideas and several dozen active Dragon users voted to rank the top 10. A voice users group in New York City got wind of the project and came up with their own version. Each group sent its top 10 list to Dragon maker Nuance, which at the time was still called ScanSoft. We also sent a supplemental list of 45 suggestions, also ranked by importance.
To our great disappointment, we heard nothing back.
I thought it might be interesting to look at those lists 10 years later, and come up with a new, 2013 list aimed at Dragon 12.5. There is no longer a Boston Voice Users Group.
The first two items of this year’s Christmas list are the #1 and #4 suggestions from the decade-old list. They are at the top again this year because I believe that if they’d been implemented 10 years ago we would be in a very different place with speech now. Better late than never.
#1: An email suggestion box that is separate from tech support (no personal response necessary). This would enable people to send in suggestions without being charged.
#2: A ScanSoft [now Nuance] employee whose job description includes using NatSpeak *all the time*
#3: This is a bit of a cheat because it’s more than one suggestion. It’s a small bundle of suggestions from the decade-old list and supplemental list that are all on the same topic. This group of features would’ve helped what was at the time an active group of users who wrote custom commands to improve Dragon and helped other users by doing so. I believe that if these suggestions had been implemented a decade ago Dragon would have a much more thriving user community today. Again, better late than never.
- A link that allows you to open a macro file by clicking on a command in the command history dialog box
- Commands that make the command browser usable hands-free
- An easier way to disable built-in commands or at least change their names
- A way to turn off a single installed macro or a set of installed macros
- A way to assign a set of macros to multiple programs
#4 and #5: A pair of suggestions from a decade ago that address user frustration:
- Better recognition logic or an option that will cut down on misrecognitions that are ungrammatical (“he walk”)
- A strong correction option in the correction box to learn after 1 correction as if you had corrected 10 times
#6: A fix for the problem of a current window losing focus when there is no reason for it to have lost focus (this must be corrected by clicking the mouse in the window, which only sometimes works, or switching to another program and back). A related problem here is Dragon not realizing it’s in a dictation field. Since this has been so difficult to fix, let me suggest a more modest proposal – a practical workaround. Let the user tell Dragon to act like it’s dictating into a text box.
#7: A fix for the problem in Microsoft Word of periodic loss of connection to the text, which disables the Select and Say commands.
In 2009, shortly after Dragon NaturallySpeaking 10 came out I wrote a blog entry suggesting 10 improvements for Dragon 10. The last three items on this year’s list – #8, #9, and #10 – are the top three items from the 2009 blog entry:
- I’d like a default user option that would let me start the program hands-free.
- I’d like the ability to check audio settings hands-free.
- I’d also like the ability to save and switch Check Audio settings, which is useful if you travel a lot. I do an audio check whenever I land someplace new, but there’s no reason I should have to do another audio check rather than go back to a saved setting once I’m back in the office.
Giving credit where credit is due, I will say that #4 on the 2009 list was fulfilled. We now have separate controls for buttons and menus. I can say whatever’s on a button – like “yes” or “no”, and at the same time set Dragon to require longer names for menu items, so I can say “File Menu” rather than just File because menu items are often active when I’m writing text. Thanks for that.
We still have a ways to go, however. Here’s hoping for a good year in 2014.
By Kimberly Patch
[2-27-13 Update: We’ve gotten word that the issue with the Dragon update service has been fixed. It’s safe to turn on automatic updates if you wish.
In addition, there is a service pack available for Dragon 12. We strongly recommend downloading and installing this update.
A version of Utter Command that is compatible with this update is scheduled for release next week.]
Dragon NaturallySpeaking maker Nuance is having technical issues with its check-for-updates service.
The bottom line: don’t let Dragon automatically check for updates until this is fixed. The software checks periodically unless the “Check for Product Updates at Startup” feature is turned off. This feature is turned on by default.
Trouble is, if your software checks for updates and runs into this issue, Dragon will then not open, making it difficult to turn off the “Check for Product Updates at Startup” feature.
To protect yourself from this potential problem, turn off the “Check for Product Updates at Startup” feature: go to Dragon Options\Tools\Administrative Settings\Miscellaneous and uncheck “Check for Product Updates at Startup”.
If you’ve already run into this problem and Dragon won’t open, there’s a more elaborate fix posted in the Dragon forum: http://nuance.custhelp.com/app/answers/detail/a_id/15105
It’s fairly obvious from the trouble that Nuance is getting ready to release an update. Once Nuance solves the update issues, you’ll want to download the update. The update is fully compatible with Utter Command.
Check back here periodically – we’ll let you know when you can turn the update service back on.
By Kimberly Patch
Every so often someone asks about calling up a folder directly without having to open Windows Explorer or a program like Word.
Until now my answer has been that you can combine opening the program and the folder by saying, for instance, “Word Open Budget Folder” or “Windows New Budget Folder”.
But there’s a better way. Here’s a neat trick discovered by a clever UC user:
If you want to call up a folder in any context, and not just in a Windows Explorer window, create a shortcut to that folder, then add that shortcut to UC List as a file (not a folder) by issuing the command “Add File”.
You can also use the shortcut trick to create UC File links to files on network drives.
Don’t hesitate to let me know if you’ve come up with a clever trick that takes you beyond the Utter Command documentation.
Let me know at Kim at this website address or on Google+ +KimPatch (feel free to + me if you want to be in my Accessibility or Speech Recognition circles).
By Kimberly Patch
Nuance has released a free iPhone Recorder application you can use with the “Transcribe Recording” feature of Dragon NaturallySpeaking for the desktop.
Dragon Recorder is a relatively simple recorder with a fairly clean interface that lets you record WAV files and transfer them to your computer via wifi. Once the files are on your computer, you can process them through Dragon’s Transcribe Recording feature, which is designed to transcribe the voice of a person who has trained a profile on Dragon NaturallySpeaking. It does pretty well with a relatively quiet recording of just that person’s voice.
Dragon Recorder gives you some useful, basic abilities:
- You can pause, then continue recording.
- You can play back the recording on the iPhone, and you can move the pause/play button to jump to different portions of the recording.
- You can continue recording at the end of any previous recording. This is a little tricky: drag the play button all the way to the right and it will turn into a record button.
- You designate the first portion of the name of your file in settings. The second portion of the name is an automatic date and time stamp.
I can think of a couple of additions I’m hoping to see in updates:
- The ability to bookmark recordings on-the-fly during recording and playback. I’m picturing several types of bookmarks you can use like hash tags. Bookmarks should also show up in the transcription.
- Although this is designed to be transcribed automatically, it would also be useful to have slider bars for controlling the speed and pitch of recording on playback so you have a good way to manually transcribe as well.
What do you think? Let me know at Kim at this website address or look me up on Google+. Feel free to + me if you want to be in my Accessibility, Utter Command or Redstart Reports circles.
By Kimberly Patch
Apple’s iPhone 4S has taken a couple of nice big steps toward adding practical speech to smart phones. There are still some big gaps, mind you. I’ll get to those as well.
Speech on the keyboard
The long-awaited speech button is now part of the keyboard. Everywhere there’s a keyboard you can dictate rather than type. This is far better than having to use an app to dictate, then cut and paste into applications. This is one of the big steps. This will make life much easier for people who have trouble using the keyboard. And I suspect a large contingent of others will find themselves dictating into the iPhone a good amount of the time, increasingly reserving the keyboard for situations where they don’t want to be overheard.
The key question about speech on the keyboard is how it works beyond the letter keys and straight dictation.
For instance, after you type
“Great! I’ll meet you at the usual place (pool cue at the ready) at 6:30.”
how easy is it to change what you said to something like this?
“Excellent :-) I’ll meet you at the usual place (pool cue at the ready) at 7:00.”
And then how easy is it to go back to the original if you change your mind again?
After we all use the speech assistant for a couple of days or weeks it’ll become readily apparent where Siri lies on the very-useful-to-very-annoying continuum.
The key parameters are
– how much time Siri saves you
– how a particular type of Siri audio feedback hits you the 10th time you’ve heard it
– how physically and cognitively easy it is to switch between the assistant and whatever you have to do with your hands on the phone.
One thing that has the potential to tame the annoyance factor is giving users some control over the feedback.
I think the tricky thing about computer-human feedback is it’s inherently different from human-human feedback. One difference is the computer has no feelings and we know that. Good computer-human feedback isn’t necessarily the same as good human-human feedback.
The big gap
There’s still a big speech gap on the iPhone. Speech is still just a partial interface.
Picture sitting in an office with a desktop computer and a human assistant. Type anything you want using the letter keys on your keyboard or ask the assistant to do things for you. You could get a fair amount of work done this way, but there’d still be situations where you’d want to control your computer directly using keyboard shortcuts, arrow keys or the mouse. Partial interfaces have a high annoyance factor.
True efficiencies emerge when you can choose the method of input, whether speech, keyboard or gesture, based on what you want to do rather than what happens to be available.
Ultimately, I want to be able to completely control my phone by speech. And I suspect if we figure out how to do that, then make it available for everyone, the general mix of input will become more efficient.
I’d like to see the computer industry tap folks who have to use speech recognition as testers. I think this would push speech input into practical use more quickly and cut out some of the annoyance-factor growing pains.
What do you think? Let me know at Kim@ this domain name.
By Kimberly Patch
If you haven’t used speech commands to control a computer, it might not be obvious that single-character commands, for instance “y” to archive a message in Gmail, can present a challenge.
Single-character commands seem like a great idea, especially for Web programs, because your Web browser already takes up some common keyboard shortcuts. Gmail has a lot of single-character commands, and once you get to know them you can fly along using the keyboard. In general I’m all for more keyboard shortcuts because it’s easy to enable them using speech.
Single-character commands that can’t be changed, however, can get speech users in a lot of trouble. Say a command or make a noise that’s misheard as text in a program that doesn’t use single-character shortcuts and either nothing happens or you get some stray text you can easily undo. Do the same thing in a single-character-command program and you can cause many actions to happen at once.
A stray “Kelly” in your Gmail inbox, for instance, will move the cursor up one message (single-character command “k”) and archive it (single-character command “y”). “Bruno” causes even more damage.
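To make the failure mode concrete, here is a minimal Python sketch of how a page with single-character shortcuts might treat stray text. It is a toy model, not Gmail’s code: the `SHORTCUTS` table and the `apply_keystrokes` function are hypothetical names, only the “k” and “y” bindings described above are included, and upper- and lowercase letters are assumed to behave the same.

```python
# Toy model of single-character shortcuts (hypothetical; not Gmail's actual code).
# Only the two bindings mentioned in the text are modeled.
SHORTCUTS = {
    "k": "move up one message",
    "y": "archive message",
}


def apply_keystrokes(text):
    """Return the actions triggered if `text` arrives as raw keystrokes."""
    actions = []
    # Assumption: case is ignored, so "K" and "k" trigger the same shortcut.
    for char in text.lower():
        action = SHORTCUTS.get(char)
        if action is not None:
            actions.append(action)
    return actions


# A misrecognized "Kelly" fires one command for each letter that is a shortcut.
stray_actions = apply_keystrokes("Kelly")
print(stray_actions)
```

In this model the letters “e” and “l” do nothing, but “k” and “y” each fire a command, so a single misheard word changes the state of the inbox twice.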
Turn off the keyboard shortcuts, though, and the program becomes fairly inaccessible for speech users. We need the shortcuts, and we can combine multiple keystrokes into single utterances to make things even better. It’s having little control over them that presents a problem.
Speech-safe single-character shortcuts
Google Labs has a nifty extension that presents a simple fix. It lets you change the characters you use for keyboard shortcuts, including using two characters rather than one. Add a plus sign (+) to the beginning of every shortcut and they all become speech-safe.
Here are step-by-step instructions.
– go to your Gmail account, click the settings gear icon at the top right of the screen
– click “Labs”
– search for the “Custom Keyboard Shortcuts” extension and click to download. This will add a “Keyboard Shortcuts” tab to your Gmail settings
– now, click the settings gear icon at the top right of the screen
– click Keyboard Shortcuts
– add “+” to the beginning of every command
If you’re using Utter Command 2.0, you’re now all set. Say “Plus” and any one- or two-character command. Say, for instance, “Plus j” or “Plus Juliet” to move down one item. You can also say a command multiple times in a single utterance. Say “Plus j Repeat 5” to move down five items, for instance. And you can combine two commands: “Plus j Plus y” moves down one item, then archives that item. (Say “Question Mark” to call up the keyboard shortcuts list.)
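Here is a rough Python sketch of why the prefix helps and how utterances like these might expand into keystrokes. Everything in it is an assumption for illustration: `PREFIXED_SHORTCUTS`, `triggered` and `expand_utterance` are hypothetical names, only the “j” and “y” bindings named above are modeled, only one-character commands are handled, and none of this reflects how Utter Command or Gmail is actually implemented.

```python
# Hypothetical model: every shortcut now needs a "+" prefix, e.g. "+j".
PREFIXED_SHORTCUTS = {
    "+j": "move down one item",
    "+y": "archive item",
}


def triggered(keys):
    """Return the actions fired by a keystroke string when shortcuts need '+'."""
    actions = []
    i = 0
    while i < len(keys):
        pair = keys[i:i + 2]
        if pair in PREFIXED_SHORTCUTS:
            actions.append(PREFIXED_SHORTCUTS[pair])
            i += 2
        else:
            i += 1  # characters without the prefix do nothing
    return actions


def expand_utterance(utterance):
    """Expand e.g. 'Plus j Repeat 5' into the keystroke string '+j+j+j+j+j'."""
    words = utterance.split()
    keys = []
    i = 0
    while i < len(words):
        word = words[i]
        has_next = i + 1 < len(words)
        if word == "Plus" and has_next:
            # "Plus j" and "Plus Juliet" both map to "+j" (first letter only).
            keys.append("+" + words[i + 1][0].lower())
            i += 2
        elif word == "Repeat" and has_next and keys and words[i + 1].isdigit():
            # "Repeat 5" means five presses in total, so add four more copies.
            keys.extend([keys[-1]] * (int(words[i + 1]) - 1))
            i += 2
        else:
            i += 1
    return "".join(keys)


print(triggered("kelly"))                              # stray text
print(triggered(expand_utterance("Plus j Plus y")))    # deliberate command
```

In this model stray text such as “kelly” triggers nothing, because no letter is preceded by the plus sign, while a deliberate “Plus j Plus y” still moves down and archives.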
Raising the bar
The Google Labs add-on enables Gmail for speech users, but there are many other programs out there that use single-character shortcuts, including other Google programs, and other Web-based programs like Twitter. Message for Google: How about one facility that would let us control keyboard shortcuts across Google programs?
It would also improve things if we could have a larger number of characters available for a given character shortcut, the ability to also customize Control-key shortcuts, the ability to save and share different sets, and the ability to apply at least some shortcuts across applications.
Important Note: If you were a beta tester or received the Utter Command 2.0 pre-release, you might not have the “Plus” set of commands. If this is the case, send e-mail to “Info” at this web address, and we’ll make sure you have the release version. The release version shows 15 new sets of commands on the “New commands for 2.0” list you can open from the Taskbar icon menu.
Tips, tricks, productivity, accessibility, usability and all things speech recognition.
By Kimberly Patch
It looks like Google is stepping up its accessibility effort and resources.
– Google accessibility page:
– Google Accessibility Twitter account: @googleaccess
– Accessibility Google Group
Here’s a tweet about accessibility in Google+:
“We considered accessibility of Google+ from day 1. Find something we missed? Press Send Feedback link & let us know.”
I do think there’s a lot missing.
For starters, Google+ is quite short on keyboard shortcuts (the Google Manager addon addresses this in part). It’s also short on basic keyboard navigation — in a perfect world, the down/up arrows and enter key should allow you to navigate anything that looks like a list or a menu.
Asking for feedback like this is a very good sign, however. One thing I’ve used the Send Feedback link to point out is that once you get past a dozen circles or so it’s important to have a list view, unless you’re willing and able to do a lot of unnecessary scrolling.
Here’s a recent Google blog post about accessibility in Docs, Sites, and Calendar that talks about additional keyboard shortcuts:
Some Google applications are gaining more keyboard shortcuts. You still can’t use down/up arrows on everything that looks like a list or a menu, however.
The bottom line is there are some channels open and some good intentions. This is great. Now let’s hold them to it, and keep the keyboard shortcuts coming.
My #1 request as a speech user is the ability to adjust, organize and share keyboard shortcuts across apps. An adjust-your-shortcuts facility that works across apps would not only be good for many different types of users, it would address a special problem of speech users and the type of keyboard shortcuts that web apps tend to use. More on that issue next.
By Kimberly Patch
If you’re anywhere near Boston this week, make sure to check out the Change People’s Lives Conference and Expo this Friday, September 23 at the Hynes Convention Center. The event is hosted by the Commonwealth of Massachusetts and Governor Deval Patrick will be giving the keynote address. Event collaborators include The Institute for Human Centered Design, Massachusetts Rehabilitation Commission, Work without Limits, and Easter Seals.
Dragon NaturallySpeaking maker Nuance is exhibiting, and Peter Mahoney, Senior Vice President & General Manager of the Dragon Business Unit, is scheduled to give a talk. Information about Utter Command will also be available at the Nuance booth.
The expo is free. It costs $75 to attend the conference sessions. Registration details are here: www.ChangePeoplesLives.org
By Kimberly Patch
Here’s a simple way to make filling out forms in Firefox easier.
If you find yourself frequently putting the same old information (name, address, etc.) into a Web form, this will save you a lot of time, and it’s probably worth the time to set up even if you fill out forms just a few times a year (speech instructions are for Dragon plus Utter Command):
– Click on this link to download the Autofill Forms extension:
– In Firefox say “Under Tango Alpha” to Click Tools/Add-ons
– “Shift Tab”, and if necessary “1-10 Down” to Navigate to Autofill Forms
– “2 Tab · Enter” to Click “Options”
– “2 Tab” to Navigate to the first field
– Fill in all applicable fields
– “Enter” to save your information
Now, anytime you find yourself in a form field, say “Under Juliet” and applicable fields will automatically fill in.
That was the quick easy setup. If you want to change the keyboard shortcut or set several different profiles, take a look at the options. There’s a lot you can do with this add-on.
Feel free to +Kim Patch if you want me to add you to my Utter Command Circle on Google+.