Video conferencing has entered the mainstream in a big way during the pandemic work-at-home reality. This is part of a three-post series on how I think video conferencing can be improved from the point of view of (1) the participant, (2) the technology, and (3) social cues during a video conference.
Video conferencing technology has improved greatly over the years, primarily due to improvements in hardware, networking, and coding algorithms. However, there are still major problems that inhibit a good user experience. A few issues include latency and bandwidth, audio and video processing, and user interface.
Latency is the time delay of the audio/video from one participant in a video conference to another. Natural face-to-face conversation has no perceivable latency. With greater latency, however, people begin to talk over each other and “normal” conversation is altered.
Transmission of video and audio in a video conference goes through a number of software and hardware steps. On modern laptops and mobile devices the camera, microphones, coding software, decoding software, display, and speakers are, for the most part, highly capable and reliable. Their contribution to latency is consistent, even if it is not always small.
On the other hand, the networks that the data traverse are often a bottleneck in terms of both latency and bandwidth. These networks are inherently heterogeneous, depending on each participant’s Internet Service Provider (ISP) and the instantaneous path that each part (packet) of the data takes. Problems and interruptions in the data flow through a network contribute to inconsistent latency, glitches such as “freezing,” lower quality, and outright drop-off.
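To make "inconsistent latency" a little more concrete, here is a minimal Python sketch. The delay values are made up for illustration, not measurements, and the function name `latency_summary` is my own. It compares two streams of packets with the same average delay, using the average change in delay between consecutive packets as a simple stand-in for jitter (real systems use more refined estimators).

```python
from statistics import mean


def latency_summary(delays_ms):
    """Return (mean delay, mean absolute change between consecutive packets)."""
    changes = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    jitter = mean(changes) if changes else 0.0
    return mean(delays_ms), jitter


# Hypothetical one-way packet delays in milliseconds (illustrative only).
steady = [50, 50, 50, 50, 50]
bursty = [20, 80, 20, 80, 50]

print("steady:", latency_summary(steady))
print("bursty:", latency_summary(bursty))
```

Both streams average the same delay, but only the second one varies from packet to packet, and that variation is what tends to show up as freezing and glitches.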
There are great efforts being made to improve networking bandwidth and shorten the latency. However, I wonder if the different vendors in a video conferencing system contribute to the problem. The manufacturer of the devices, the video conferencing software, and the ISPs all have different technical goals. Is there a way to rethink video conferencing that would make these systems work even better together? For example, ISPs usually offer home customers network bandwidth biased for downloading. This makes sense for streaming and web browsing, but becomes a bottleneck for video conferencing. Is there a way to have a “smart” reallocation of bandwidth when a user is engaged in a video conference?
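As a thought experiment, a "smart" reallocation could be as simple as serving the video call first when the upload link is crowded. The sketch below is purely illustrative: the 10 Mbps upload capacity, the traffic names, and the demand figures are assumptions of mine, and strict priority ordering is just one possible policy, not a description of how any ISP or router actually behaves.

```python
def allocate_uplink(capacity_mbps, demands_mbps, priority):
    """Grant each demand in priority order until the capacity runs out."""
    remaining = capacity_mbps
    grants = {}
    for name in priority:
        grant = min(demands_mbps[name], remaining)
        grants[name] = grant
        remaining -= grant
    return grants


# Hypothetical upload demands in Mbps on a 10 Mbps upload link.
demands = {"video_call": 3.0, "cloud_backup": 8.0}

# Backup served first: the call is squeezed below what it asked for.
print(allocate_uplink(10.0, demands, ["cloud_backup", "video_call"]))

# Call served first: the backup simply runs a little slower.
print(allocate_uplink(10.0, demands, ["video_call", "cloud_backup"]))
```

In this toy case the backup barely notices the difference, while the call goes from constrained to fully served.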
A number of video conferencing platforms perform video and audio processing to improve quality. Google Meet, for example, uses noise cancellation to reduce distracting noises (I’m looking at you, Fido) and speaker-to-microphone feedback (echoing). This works, for the most part, until the latency is too great. (It is best to use headphones and eliminate many of these problems at the source.) Video processing includes brightness and contrast enhancement. Zoom has added an “automatic green screen” technology that superimposes a virtual background behind the participant. (These effects are getting better, but the best quality is still achieved by having the proper lighting and background.)
Finally, I would like to mention a problem with the user interfaces of different video conferencing systems. To me, the problem is not that one interface is better than another. If you are like me, you often participate in video conferences (and webinars) with others outside your organization. Chances are good that you are using a video conferencing system different from the one with which you are familiar. I find myself using one of many systems on any given day. Finding the mute button, the share screen button, or the chat button in real time becomes a challenge. Would it be possible to have a universal (possibly optional) interface (or API for a common third-party interface) for all video conferencing systems? I understand this flies in the face of many companies’ desire to have their beautiful user interface as a differentiator from their competitors.
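To show roughly what I mean by a common API, here is a small Python sketch. Everything in it is hypothetical: `ConferenceControls`, `FakeVendorAdapter`, and the method names are my own inventions, not any vendor's real interface. The idea is that each vendor would supply an adapter, and a third-party interface would talk only to the shared set of controls.

```python
from abc import ABC, abstractmethod


class ConferenceControls(ABC):
    """Hypothetical common controls every conferencing system would expose."""

    @abstractmethod
    def set_muted(self, muted: bool) -> None: ...

    @abstractmethod
    def share_screen(self) -> None: ...

    @abstractmethod
    def send_chat(self, text: str) -> None: ...


class FakeVendorAdapter(ConferenceControls):
    """Stand-in for a vendor adapter; it only records what was requested."""

    def __init__(self):
        self.log = []

    def set_muted(self, muted: bool) -> None:
        self.log.append(("mute", muted))

    def share_screen(self) -> None:
        self.log.append(("share_screen",))

    def send_chat(self, text: str) -> None:
        self.log.append(("chat", text))


def join_quietly(controls: ConferenceControls) -> None:
    """Third-party UI code that works with any adapter."""
    controls.set_muted(True)
    controls.send_chat("Hello, joining muted.")


adapter = FakeVendorAdapter()
join_quietly(adapter)
print(adapter.log)
```

The third-party code never needs to know which system is underneath, so the mute button could sit in the same place every time.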
For all its ubiquity, video conferencing is still a young technology. In the next few years I predict we will see great technical improvements.