Since the advent of computers, innovators have looked to them as a new medium for communication. Over the last 40 years, the world has witnessed many remarkable technologies, from TCP/IP, the bread and butter of most networks, to streaming video and VoIP. With such changes in backend technology, one would expect corresponding changes in front-end systems. However, on closer inspection, it is clear that front-end technologies for multi-user communication have not changed much in the last decade beyond basic implementations of backend multimedia capabilities.
Our research team aims to investigate the issues with the status quo in multi-user communications and to implement front-end changes that correspond to new capabilities on the backend. A key backend we've chosen to focus on is VoIP (Voice over IP), which has revolutionized user-to-user communication by allowing real-time voice conversation. On the front end, however, most companies have resigned themselves to simply re-implementing a phone: dialing a contact, voicemail, and call forwarding. While these front-end designs made sense in the physical world, where market-instituted limitations constrained what users were willing to pay for, software can implement new solutions at little to no fixed cost to the end user. As such, we have chosen to focus our development work on existing backend technologies such as XML and NAT, and to build a better software front-end on top of them.
For the fall semester, we will focus on making VoIP more intuitive in multi-user environments by adding spatial information. The status quo is to show a list of the users on a call, perhaps with an on-screen indication of who is talking, and to present the audio in a single mono channel. If two people have similar voices, especially after compression on the network, it is extremely hard to tell them apart when they speak at the same time. However, as previous research has shown, human beings recognize audio spatially with their two ears, allowing them to differentiate between similar sounds using relative positional cues. This happens naturally in meetings: even if two people in a room have similar voices, the listener processes them as separate audio sources, distinguished by their spatial positions. By implementing this in software, and presenting spatial information through graphical and auditory means, we hope to bring similar benefits to the virtual multi-user environment.
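As a minimal sketch of how such spatial cues might be added in software, the example below pans each speaker's mono voice stream to a position in the stereo field using the standard constant-power pan law, so perceived loudness stays roughly even as a speaker's virtual position changes. The function names and the sample-list representation are illustrative assumptions, not part of our system design; a real implementation would operate on the decoded VoIP audio buffers.

```python
import math

def constant_power_pan(pan):
    """Map a pan position in [-1.0 (full left), 1.0 (full right)]
    to a (left_gain, right_gain) pair.

    Uses the constant-power pan law: gains are cos/sin of an angle in
    [0, pi/2], so left_gain**2 + right_gain**2 == 1 at every position
    and perceived loudness stays roughly constant across the field.
    """
    theta = (pan + 1.0) * math.pi / 4.0  # maps [-1, 1] onto [0, pi/2]
    return math.cos(theta), math.sin(theta)

def spatialize(mono_samples, pan):
    """Turn one speaker's mono sample block into a (left, right) stereo
    pair, placing the voice at the given pan position.

    `mono_samples` is assumed to be a list of float samples in [-1, 1],
    e.g. one decoded frame of a VoIP stream (illustrative only).
    """
    left_gain, right_gain = constant_power_pan(pan)
    left = [s * left_gain for s in mono_samples]
    right = [s * right_gain for s in mono_samples]
    return left, right
```

Each participant on a call would be assigned a distinct pan position (mirroring their icon's placement on screen), and the per-speaker stereo pairs would then be mixed together, letting listeners separate similar voices by position just as they would around a meeting table.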