Computers respond to mood and gestures

Tom Shelley reports on some of the latest non-keyboard ways of interacting with computers and machines

Researchers and developers are finding ways to interact with computers through hand gestures observed by video cameras, by snapping one's fingers, or by enabling machines to assess human states of mind. These are far from being mere 'blue sky' ideas: the finger-snapping interface cost less than £100 to build, and mood assessment requires only analysis of facial video images. Hand gestures are being used to work with sheets of electronic 'paper' on a novel desktop interface and can be used to control commercially available lamps. Finger snapping looks promising as an alternative to infrared remote controls, and mood assessment is a target of car makers.

Using a wired-up glove to communicate with a computer seems to have come and gone, along with the idea of wearing eye-phones to immerse oneself in cyberspace and generate disorientation as well as a headache. The latest ideas involve using only what we are all born with.

One such is being developed by Philip Tuddenham, a research student in the Rainbow Group at the University of Cambridge Computer Laboratory. It is an enhancement of the 'Escritoire', conceived by Mark Ashdown and Peter Robinson, which uses two projectors to produce a composite image on a white desk measuring 914mm x 1220mm. One projector creates a large, low-definition periphery, while the second creates a smaller, high-resolution 'fovea' where documents can be read and drawings worked on in detail. The original concept equipped the user with two pens: a digitiser pen for the dominant hand and an ultrasonic pen for the other. In the new development, a webcam picks up the hand and where the fingers are pointing, allowing the user to mark and turn pages without mouse clicking. In the system described to us, users are required to wear black gloves so that the system can find the user's hands robustly despite the bright light from the projectors (a simple sketch of this kind of glove tracking appears below).

Philip Worthington, a recent graduate of the Royal College of Art, has, on the other hand, devised a system to interpret hand gestures that is unaffected by skin colour, using a camera to pick up hand gestures in front of a white, back-lit panel. In his demonstration system, if open fingers are detected, he projects a version of the hand image enhanced with teeth, along with suitable sound effects. While the goal of his system is essentially fun, it is not hard to see how various finger openings and pointings could be linked to graphical commands, rather than used to produce growling dogs and roaring lions.

Rather more commercial, and already on the market, is a range of lamps from London-based Mathmos that respond to hand movements. Moving one's hand from one side to the other above one of the lamps turns it on or off, while moving the hand down dims it and moving the hand away brightens it. Looking inside reveals LED devices, and the designers say that the key technology is the use of infrared motion sensors. The 'Airswitch 1' is a small, blown-glass light resembling a chemistry flask, which sells for £38. The company also now offers a floor-standing 'Airswitch AZ' and its latest 'Airswitch TC'. The technology is patent pending. The company says: "We are investigating a number of other potential applications for the Airswitch, but as you can understand, we cannot provide further information at this moment."
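Going back to the Escritoire for a moment, the appeal of the black gloves is that, under two bright projectors, the glove is by far the darkest thing the webcam sees. The sketch below is not Philip Tuddenham's code; it is a minimal illustration of that idea, with the threshold value, blob-size check and fingertip heuristic (the glove pixel farthest from the glove's centroid) all chosen purely for the example.

```python
import numpy as np

def find_fingertip(grey_frame, dark_threshold=40, min_pixels=50):
    """Return (row, col) of an estimated fingertip, or None if no glove is seen."""
    glove = grey_frame < dark_threshold           # mask of dark (glove) pixels
    if glove.sum() < min_pixels:                  # too few dark pixels to be a hand
        return None
    rows, cols = np.nonzero(glove)
    r0, c0 = rows.mean(), cols.mean()             # centroid of the glove blob
    # Fingertip heuristic: the glove pixel farthest from the blob's centroid.
    i = int(np.argmax((rows - r0) ** 2 + (cols - c0) ** 2))
    return int(rows[i]), int(cols[i])

# Synthetic 480x640 'webcam frame': a bright desk with a dark streak as the glove.
frame = np.full((480, 640), 220, dtype=np.uint8)
for t in range(200):
    frame[240 + t // 4, 100 + t] = 10             # dark, roughly diagonal 'finger'
print(find_fingertip(frame))                      # one end of the streak
```

A real system would also have to track the blob between frames and map image coordinates onto the projected desk surface, but the underlying principle is the same.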
Also available for immediate commercial exploitation, although not yet a product, is Dr James Scott's audio location technology, which detects clicks from fingers or a clicker and triggers what he describes as "appropriate reactions" according to the 3D location of the click. As a demonstration, he can click his fingers to turn a music player off when somebody comes into his office, and on again when they go out. His prototype system has six microphones and six sound cards, all obtained for a total cost of around £100 from www.dabs.com. Location accuracy is better than 28cm in 3D, improving to 10cm in 2D and for repeated clicks. A 1ms timing error is equivalent to around 30cm of spatial error, but the system gets around the problems caused by computational delays with some cunning software written in Java and about 200 lines of 'C' code (a rough sketch of the location calculation appears later in this article).

The Intel Research Cambridge team has also developed an interface that is triggered by a camera phone being pointed at circular bar codes on the other side of a shop window.

But most extraordinary of all is a system, also being researched at Cambridge, which picks up people's moods and states of mind from their facial expressions and head gestures. The work is being led by Dr Neil Dodgson, another member of the Rainbow Group. As he explains, he is trying to automate what we all do when we meet people, normally without being aware of it. The inability of computers to transmit this information, along with pheromones, body language and other non-verbal cues, is one of the reasons why face-to-face meetings are never likely to be completely replaced by electronic methods of collaboration. It is also because computer systems are unable to detect unspoken wishes and intent that devices such as the Microsoft paperclip and audio navigation systems are often so annoying.

The system the researchers have been studying captures video images, marks spots on various identified features on the face, and tracks these to identify head and lip movements. It then uses a dynamic Bayesian network to assess mood. Written in 'C', it can tell whether a person is disagreeing, agreeing, unsure, thinking, interested or concentrating, and does so correctly in up to 89 per cent of instances (for concentrating). This is better than was achieved by a panel of 18 humans watching the same recorded video clips, who apparently scored only 64.5 per cent. Run on a 3.6GHz Pentium PC, computation time was about two seconds, similar to the time a human needs to reach a decision.

We can think of applications that might require the detection of other states such as lying, cheating and intending to kill somebody, but these are notoriously more difficult to detect, especially when professional criminals are involved. In any case, the goal was to devise a means by which a car navigation system could judge that the user was lost and annoyed, and modify its response accordingly. One of the challenges now, apparently, is to reduce the computational load so that it can run on a typical car-type system. In future, it is intended to extend the system's capabilities to include voice inflexion and body language. If a system in a car or an aeroplane were able to detect that a driver or pilot was falling ill, before they were aware of it themselves, it could be a life saver.
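How such a system might fuse a stream of detected gestures over time can be illustrated with a toy dynamic Bayesian network. The sketch below is not the Cambridge code: the gesture set, the transition and observation probabilities, and the filtering scheme (a simple hidden Markov model forward filter, written here in Python rather than 'C') are all assumptions made purely for illustration.

```python
import numpy as np

states = ["agreeing", "disagreeing", "unsure", "thinking", "interested", "concentrating"]
gestures = ["nod", "shake", "lip_pull", "head_tilt", "still"]

# P(next state | current state): mental states tend to persist from frame to frame.
T = np.full((6, 6), 0.04) + np.eye(6) * 0.8
T /= T.sum(axis=1, keepdims=True)

# P(observed gesture | state); rows = states, columns = gestures (assumed values).
O = np.array([
    [0.70, 0.02, 0.08, 0.05, 0.15],   # agreeing: mostly nods
    [0.02, 0.70, 0.08, 0.05, 0.15],   # disagreeing: mostly shakes
    [0.05, 0.15, 0.30, 0.30, 0.20],   # unsure
    [0.05, 0.05, 0.10, 0.40, 0.40],   # thinking
    [0.30, 0.05, 0.15, 0.20, 0.30],   # interested
    [0.02, 0.02, 0.06, 0.10, 0.80],   # concentrating: mostly still
])

def infer(observed_gestures):
    """Run a forward filter over a sequence of detected head/lip gestures."""
    belief = np.full(6, 1 / 6)                    # start with no opinion
    for g in observed_gestures:
        belief = (T.T @ belief) * O[:, gestures.index(g)]
        belief /= belief.sum()                    # renormalise at every step
    return dict(zip(states, belief.round(3)))

print(infer(["head_tilt", "still", "still", "nod", "nod"]))
```

Each new observation nudges the belief towards the states that best explain the gesture sequence seen so far, which is the essence of assessing a state of mind from a couple of seconds of video rather than from a single frame.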
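Returning to Dr Scott's finger-click demonstration, the sketch below shows in outline how a click's 3D position could be recovered from arrival-time differences at six microphones. It is not his implementation (which is written in Java and 'C'), and the microphone layout, timing jitter and least-squares solver are assumptions for illustration only; it does, though, make the quoted figure concrete, since sound covers roughly 34cm in a millisecond.

```python
import numpy as np
from scipy.optimize import least_squares

SPEED_OF_SOUND = 343.0  # m/s, so a 1ms timing error is roughly 0.34m of range error

# Assumed microphone positions (metres) spread around a desk - illustrative only.
mics = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 1, 1]], float)

def simulate_arrival_times(click_pos, jitter_s=0.0002):
    """Times at which each microphone hears the click, with some timing noise."""
    dists = np.linalg.norm(mics - click_pos, axis=1)
    return dists / SPEED_OF_SOUND + np.random.normal(0, jitter_s, len(mics))

def locate(times):
    """Least-squares fit of the click position from time differences relative to mic 0."""
    measured_tdoa = times - times[0]
    def residuals(p):
        dists = np.linalg.norm(mics - p, axis=1)
        predicted_tdoa = (dists - dists[0]) / SPEED_OF_SOUND
        return predicted_tdoa - measured_tdoa
    return least_squares(residuals, x0=np.array([0.5, 0.5, 0.5])).x

true_pos = np.array([0.3, 0.7, 0.4])
print("estimated click position:", locate(simulate_arrival_times(true_pos)))
```

Running the example with different jitter values shows why accuracy improves for repeated clicks: averaging several estimates smooths out the individual timing errors.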
The Cambridge computing developments were revealed at one of the university's 'Horizon' series of seminars.

Eureka says: If a system in a car or in an aeroplane were able to detect that a driver or pilot was falling ill, before they were aware themselves, it could be a life saver.

Pointers

* It has been shown possible to devise electronic systems that can pick up human hand gestures in order to improve user-system interaction in document handling, or to control lights
* It is also possible to control systems by locating finger snaps in 3D space
* The latest idea is to assess mood from facial expressions, which, it seems, a computer can do more reliably than humans