Monthly Archives: March 2009

Ten things I'd like to see

In December 2003, the Boston Voice Users Group (BVUG) and its New York City counterpart (NYPC) did top 10 lists of what they would like to see in speech recognition engines. At the time both Dragon NaturallySpeaking and IBM’s ViaVoice were available.

Here’s my version for Dragon NaturallySpeaking 10. This list is also posted on the UC Exchange Wiki so I can keep track of whether and when the items are implemented.

1. I’d like a default user option that would let me start the program hands-free.

2. I’d like the ability to check audio settings hands-free.

3. I’d also like the ability to save and switch Check Audio settings, which is useful if you travel a lot. I do an audio check whenever I land someplace new, but there’s no reason I should have to do another audio check rather than go back to a saved one once I’m back in the office. I have a couple more minor suggestions for the Check Audio dialog box. First, it’s important enough to deserve its own menu item rather than only being buried in the Accuracy menu. Second, there’s an interface gotcha. Once you’ve finished checking the microphone, the focus is still on the go button. If you’re not thinking and click without moving the focus, you find yourself checking the microphone again instead of going on to the accuracy check, which at best makes the process longer and at worst is confusing.

4. I’d like separate controls for buttons and menus. I’d like to be able to say whatever’s on the button, such as “yes” or “no”. But at the same time I want a longer command for menu items, e.g. “File Menu” rather than just “File”, because menu options are often active when I’m writing text.

5. The Dragon NaturallySpeaking engine should understand that when I say “Cap” what I’m looking for is a written word, not a number or symbol. “Cap Sixty” should return “Sixty”, not “60”. And “Cap Ampersand” should return “Ampersand” not “&”.

6. In the Spell Correction dialog box, I’d like a way to tell NatSpeak to type a whole word. I’d like to say the word “Word” to indicate that the rest of the phrase is going to be a word, just like I can say “Spell” to indicate that the rest of the phrase is going to be spelled.

7. The old Dragon Dictate, where you said each word separately, was better for people who have some types of disabilities. Putting a “speak words separately” mode in NaturallySpeaking would help a lot of people.

8. I’d like the option to be able to train the NatSpeak speech engine by repeating audio read to me through headphones rather than reading from text. This would also make training easier for younger kids.

9. I’d like a simple way to duplicate a user. Right now you can do this, but it’s a multistep and confusing process. To make a copy of the current user you have to back up, then restore. A separate menu item for duplicating would take the confusion out of the process.

10. Bring back the Dragon logo :-). The Dragon was much cooler than the green spiky blob.

What do you think of my top 10 list for NaturallySpeaking? What’s yours? Reply here or let me know at info@ this website address.

New Videos: Commandline and quick Perl


We have a couple of new demo videos up.

Utter Command: Commandline by Speech shows how you can use the UC List Enter facility to speed up the commandline interface.

Utter Command: Writing a Perl Script by Speech shows how you can use UC’s combined keyboard commands to speed up writing code. Note that for this demo we don’t use any custom coding commands, just standard commands that work the same in any program.

You may recognize this Perl script from a YouTube video of a Microsoft speech demonstration. The big difference between the videos is that with UC I had fewer commands to say and therefore fewer potential points of failure. There were a couple of other differences as well. I’m using the ideal speech setup: the NaturallySpeaking Pro speech engine running on XP with a Sennheiser ME3 microphone and a buddy USB pod. I also wasn’t in front of an audience. I suspect the computer hardware is similar. My laptop is a two-year-old Intel Core Duo 2.16 with 2GB of memory.

Useful free software on UC Exchange


People often ask me for advice on software, and there are a lot of free programs I regularly recommend. I put up a UsefulFreeSoftware page on UC Exchange as a general reply. I also included links to help pages and forums for the software. Let me know if there’s something you think I should add to the list. Reply here or let me know at info @ this website address.

Dealing with the Office 2007 ribbon


I’ve been getting a lot of questions lately about Microsoft Office 2007 versus Microsoft Office 2003.

My stock answer is that I prefer the 2003 drop-down menus to the 2007 ribbon. It’s funny: at the same time as Office made the switch from drop-down menus to the more Web-like ribbon, the Web application Google Documents made the opposite move, changing from a tab-based interface to drop-down menus. Out of the box, 2007 is less efficient: it takes up more screen space and requires more steps than 2003.

Having said that, the 2007 interface is also very configurable. You can put any drop-down menu or menu item on the Quick Access Toolbar that runs across the very top of the screen. And you can hide the ribbon. If you take the time to put the items you use most on the Quick Access Toolbar, you can make Office 2007 much more accessible.

For details on setting things up and using Microsoft Office 2007 with Utter Command, see UCExchange: UCandOffice2007.

What’s your opinion on 2007 versus 2003? Reply here or let me know at info@ this website address.

UC Exchange


The UC Exchange Wiki is up! Check it out (say “UC Exchange”). Over the coming months you’ll see pages on specific applications with advice on how to apply UC to those programs, including step-by-step tours. 

Research Watch: What you see changes what you hear


Who says looks don’t matter?

It looks like what you see changes what you hear. Researchers from Haskins Laboratories and MIT have found that different facial expressions alter the sounds we hear.

This shows that the somatosensory system — the mix of senses and brain filtering that determines how you perceive your body — is involved when you process speech.

This doesn’t have a whole lot to do with speech commands except to show that it’s easy to underestimate the complexity, and subtlety, of our perception of spoken language.

Resources:

Somatosensory function in speech perception
www.pnas.org/cgi/doi/10.1073/pnas.0810063106