Computer vision technology is increasingly used in areas such as automatic surveillance systems, self-driving cars, facial recognition, healthcare, and social distancing tools. Users need accurate and reliable visual information to fully benefit from video analytics applications, but the quality of video data is often degraded by environmental factors such as rain, nighttime conditions, or crowds (where many people overlap one another in a scene). Using computer vision and deep learning, a team of researchers led by Yale-NUS College Associate Professor of Science (Computer Science) Robby Tan, who is also from the Faculty of Engineering at the National University of Singapore (NUS), has developed new approaches that address low-level vision problems in videos caused by rain and nighttime conditions, and that improve the accuracy of 3D human pose estimation in videos.
The research was presented at the 2021 Computer Vision and Pattern Recognition (CVPR) Conference.
Combating visibility issues during rain and at night
Nighttime images are affected by low light and man-made light effects such as glare, glow, and spotlights, while rain images are affected by rain streaks or rain accumulation (the rain veiling effect).
“Many computer vision systems, such as automatic surveillance and self-driving cars, rely on clear visibility of input videos to function well. For example, self-driving cars cannot perform robustly in heavy rain, and automatic CCTV surveillance systems often fail at night, especially if the scenes are dark or there is strong glare or spotlights,” explained Assoc Prof Tan.
In two separate studies, Assoc Prof Tan and his team introduced deep learning algorithms to improve the quality of nighttime videos and rain videos, respectively. In the first study, they boosted brightness while simultaneously suppressing noise and light effects (glare, glow, and spotlights) to produce clear nighttime images. The technique is new and addresses the challenge of producing clear nighttime images and videos when the presence of glare cannot be ignored. By comparison, existing state-of-the-art methods fail to deal with glare.
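The tension described above, brightening dark regions without also amplifying glare, can be illustrated with a hand-written tone curve. This is only a minimal sketch and not the team's deep learning method: the function name `brighten_dark_regions` and the parameters `gamma` and `knee` are hypothetical choices made for illustration.

```python
def brighten_dark_regions(value, gamma=0.5, knee=200):
    """Brighten one 0-255 intensity; values at or above `knee` pass through unchanged.

    `gamma` and `knee` are illustrative assumptions, not values from the research.
    """
    if not 0 <= value <= 255:
        raise ValueError("intensity must be in [0, 255]")
    if value >= knee:
        # Already-bright pixels (possible glare) are not boosted any further.
        return float(value)
    # A gamma below 1 lifts dark values more strongly than mid-range values,
    # and the curve meets the pass-through branch exactly at `knee`.
    return knee * (value / knee) ** gamma


dark_pixel = brighten_dark_regions(50)    # a dark pixel is lifted
glare_pixel = brighten_dark_regions(230)  # a glare pixel is left as it was
```

A fixed curve like this cannot tell glare from a legitimately bright object, which is one reason a learned approach is attractive for this problem.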
In tropical countries like Singapore, where heavy rain is common, the rain veiling effect can significantly degrade the visibility of videos. In the second study, the researchers introduced a method that uses frame alignment, which allows them to obtain better visual information without being affected by the rain streaks that appear randomly in different frames and degrade the quality of the images. They then used a moving camera to employ depth estimation in order to remove the rain veiling effect caused by accumulated rain droplets. Unlike existing methods, which focus on removing rain streaks only, the new method can remove both rain streaks and the rain veiling effect.
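Why alignment helps can be seen in a small sketch: once frames are aligned, a streak that hits a given pixel in only one frame is outvoted by the other frames. The example below uses a classical per-pixel temporal median rather than the team's deep learning method, and it assumes the frames have already been aligned. The names `temporal_median` and `add_streak` are hypothetical.

```python
from statistics import median


def temporal_median(aligned_frames):
    """Per-pixel median over already-aligned grayscale frames (lists of rows)."""
    height = len(aligned_frames[0])
    width = len(aligned_frames[0][0])
    return [
        [median(frame[y][x] for frame in aligned_frames) for x in range(width)]
        for y in range(height)
    ]


def add_streak(frame, row, col, streak_value=255):
    """Return a copy of `frame` with one bright 'rain streak' pixel."""
    copy = [r[:] for r in frame]
    copy[row][col] = streak_value
    return copy


# Toy 2x2 scene; each streak lands on a different pixel in a different frame.
background = [[10, 20], [30, 40]]
frames = [
    add_streak(background, 0, 0),
    add_streak(background, 0, 1),
    add_streak(background, 1, 1),
    [r[:] for r in background],
    [r[:] for r in background],
]
restored = temporal_median(frames)
```

Because each pixel is corrupted in at most one of the five frames, the median recovers the background. The sketch does nothing about the veiling effect, which the article says the researchers handle separately through depth estimation.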
3D Human Pose Estimation: Combating Inaccuracy Caused by Overlapping Multiple Humans in Videos
At the CVPR conference, Assoc Prof Tan also presented his team’s research on 3D human pose estimation, which can be used in fields such as video surveillance, video games and sports broadcasting.
In recent years, 3D multi-person pose estimation from monocular video (video taken from a single camera) has increasingly become an area of interest for researchers and developers. Compared with setups that use multiple cameras to record from different locations, monocular videos offer more flexibility, as they can be taken with just one regular camera, even that of a cell phone.
However, the accuracy of human detection suffers under high activity, that is, when multiple individuals share the same scene, especially when they are closely interacting or appear to overlap in the monocular video.
In this third study, the researchers estimate human poses in 3D from a video by combining two existing methods, namely a top-down approach and a bottom-up approach. By combining the two approaches, the new method can produce more reliable pose estimates in multi-person settings and handle the distance between individuals (or variations in scale) more robustly.
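The article does not say how the two approaches are combined. Purely as an illustration of the general idea, the sketch below fuses two per-joint 3D estimates with a confidence-weighted average. The fusion rule, the function names `fuse_joint` and `fuse_pose`, and the data layout are all assumptions, not the researchers' method.

```python
def fuse_joint(top_down, bottom_up):
    """Fuse two estimates of one joint, each given as ((x, y, z), confidence)."""
    (p_td, c_td), (p_bu, c_bu) = top_down, bottom_up
    total = c_td + c_bu
    if total == 0:
        raise ValueError("at least one estimate needs non-zero confidence")
    # Confidence-weighted average, computed coordinate by coordinate.
    return tuple((c_td * a + c_bu * b) / total for a, b in zip(p_td, p_bu))


def fuse_pose(top_down_pose, bottom_up_pose):
    """Fuse two poses given as dicts mapping joint name to ((x, y, z), confidence)."""
    return {
        joint: fuse_joint(top_down_pose[joint], bottom_up_pose[joint])
        for joint in top_down_pose
    }


# Hypothetical estimates: the wrist is occluded in the top-down crop (low confidence).
top_down_pose = {
    "left_wrist": ((0.0, 1.0, 2.0), 0.2),
    "head": ((0.0, 0.0, 4.0), 0.5),
}
bottom_up_pose = {
    "left_wrist": ((1.0, 1.0, 2.0), 0.8),
    "head": ((0.0, 2.0, 4.0), 0.5),
}
fused = fuse_pose(top_down_pose, bottom_up_pose)
```

In this toy case the fused wrist position leans towards the more confident bottom-up estimate, while the head, where both estimates are equally confident, lands midway between them.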
Researchers involved in the three studies include members of Assoc Prof Tan’s team at the Department of Electrical and Computer Engineering at NUS, where he holds a joint appointment, and his collaborators from the City University of Hong Kong, ETH Zurich and the Tencent Game AI Research Center. His lab focuses on computer vision and deep learning research, particularly in the areas of low-level vision, human pose and motion analysis, and deep learning applications in healthcare.
“In the next step of our 3D human pose estimation research, which is supported by the National Research Foundation, we will examine how to protect private information in videos. For the visibility enhancement methods, we strive to contribute to advancements in the field of computer vision, as they are essential for many applications that can affect our daily lives, such as enabling self-driving cars to perform better in adverse weather conditions,” said Assoc Prof Tan.