Sunday, September 7, 2008

Technology requirements for human-computer vocal interactions



From a trivial point of view, the technology required increases with the complexity of the task, but at some point the battle becomes perhaps more conceptual than mechanical. Computing power is increasing at a steady rate, and we have applications like 3D computer games that are ready and waiting to consume that power. Some problems, however, are not computationally hard; rather, they are limited by our understanding of the physics, chemistry, physiology and so on involved in the problem. For some years, handwriting recognition made no progress at all. Then, around 1990, a different approach started to appear, based more on a mathematical analysis of the overall shapes than on any kind of line-following or simple image-matching algorithm. Once that leap had been taken, the quality of handwriting recognition improved very quickly, so we might expect to see the same effect with speech recognition and the AI required for the vocal conversations we want to have with computer systems. In the technology vs. time plot in Figure 2, I'm suggesting that the timeline for when we might get good results is very wide, but the technology span is flatter. In other words, we will have the computing power well before we know how to really solve the problem.
