NYC Gazette

Saturday, December 21, 2024

Katz School researchers develop AI framework to improve video streaming quality

Rabbi Dr. Ari Berman, President and Rosh Yeshiva | Yeshiva University

Hang Yu, a student in the Katz School's M.S. in Artificial Intelligence program, will present his research at the 2025 IEEE Conference in January. The study, co-authored with Dr. David Li, focuses on improving video quality assessment (VQA) using deep learning techniques.

The research introduces a dual-path deep learning framework for assessing video quality under varying network conditions, with the goal of supporting consistently high-quality streaming. Traditional VQA methods often fall short when confronted with complex real-world distortions. The Katz School approach uses deep learning to detect subtle distortions that simpler metrics miss, so that automated scores better reflect the actual viewing experience.

Video Quality Assessment presents several challenges, including balancing sharp detail with broader motion context. Models focusing too much on fine details may miss larger motion contexts, while those emphasizing motion might overlook critical details in fast-moving videos.

The researchers employ the SlowFast model architecture for video analysis. This dual-speed design pairs a "slow" pathway, which samples frames at a low rate to capture fine spatial detail, with a "fast" pathway, which samples at a higher rate to follow overall motion and temporal flow. The combination ensures both fine details and large-scale motion context are considered.
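
The article does not include code, but the dual-rate idea behind SlowFast can be sketched in a few lines of PyTorch. The snippet below is an illustration only and not the researchers' implementation; the class names, channel sizes, and the sampling stride alpha are placeholder assumptions.

```python
import torch
import torch.nn as nn

class DualRateSampler(nn.Module):
    """Splits a clip into a sparsely sampled 'slow' stream and a densely
    sampled 'fast' stream, mirroring the SlowFast idea."""
    def __init__(self, alpha: int = 4):
        super().__init__()
        self.alpha = alpha  # temporal stride for the slow pathway

    def forward(self, clip: torch.Tensor):
        # clip shape: (batch, channels, frames, height, width)
        slow = clip[:, :, ::self.alpha]  # low frame rate: fine spatial detail
        fast = clip                      # full frame rate: motion and flow
        return slow, fast

class TinyDualPathVQA(nn.Module):
    """Toy two-pathway regressor that fuses slow and fast features into a
    single quality score in [0, 1]."""
    def __init__(self, alpha: int = 4):
        super().__init__()
        self.sampler = DualRateSampler(alpha)
        self.slow_net = nn.Sequential(nn.Conv3d(3, 16, 3, padding=1), nn.ReLU(),
                                      nn.AdaptiveAvgPool3d(1))
        self.fast_net = nn.Sequential(nn.Conv3d(3, 8, 3, padding=1), nn.ReLU(),
                                      nn.AdaptiveAvgPool3d(1))
        self.head = nn.Sequential(nn.Linear(16 + 8, 1), nn.Sigmoid())

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        slow, fast = self.sampler(clip)
        features = torch.cat([self.slow_net(slow).flatten(1),
                              self.fast_net(fast).flatten(1)], dim=1)
        return self.head(features).squeeze(1)

# Two random 16-frame, 64x64 clips -> two predicted quality scores.
scores = TinyDualPathVQA()(torch.rand(2, 3, 16, 64, 64))
```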

Additional tools developed by the team include PatchEmbed3D for understanding spatial and temporal dynamics, WindowAttention3D for maintaining local details, Semantic Transformation and Global Position Indexing for consistency, and Cross Attention and Patch Merging to enhance communication between pathways.
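
Of these components, cross attention is the simplest to picture: tokens from one pathway attend to tokens from the other so that each stream can borrow context from its counterpart. The sketch below is a generic cross-attention block, not the team's PatchEmbed3D or WindowAttention3D code, and the token counts and embedding dimension are assumptions.

```python
import torch
import torch.nn as nn

class PathwayCrossAttention(nn.Module):
    """Lets slow-pathway tokens attend to fast-pathway tokens so fine spatial
    detail can be informed by motion context (applied both ways in practice)."""
    def __init__(self, dim: int = 96, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, slow_tokens: torch.Tensor, fast_tokens: torch.Tensor) -> torch.Tensor:
        # slow_tokens: (batch, n_slow, dim); fast_tokens: (batch, n_fast, dim)
        fused, _ = self.attn(query=slow_tokens, key=fast_tokens, value=fast_tokens)
        return self.norm(slow_tokens + fused)  # residual keeps the original detail

slow = torch.rand(2, 49, 96)    # e.g. 7x7 spatial patches from the slow pathway
fast = torch.rand(2, 196, 96)   # denser tokens from the fast pathway
fused = PathwayCrossAttention()(slow, fast)  # shape (2, 49, 96)
```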

To train their system, the team used PLCC Loss and Rank Loss functions along with cosine annealing for efficient learning. Dr. David Li noted that testing on public datasets showed the model outperforms existing methods both numerically and subjectively.
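
The article does not reproduce the team's exact formulations, but PLCC loss, rank loss, and cosine annealing all have standard textbook forms, sketched below for readers who want specifics; the epsilon term, margin, and schedule length are arbitrary illustrative choices.

```python
import torch

def plcc_loss(pred: torch.Tensor, mos: torch.Tensor) -> torch.Tensor:
    """1 minus the Pearson linear correlation between predicted and labeled scores."""
    p = pred - pred.mean()
    m = mos - mos.mean()
    return 1.0 - (p * m).sum() / (p.norm() * m.norm() + 1e-8)

def rank_loss(pred: torch.Tensor, mos: torch.Tensor, margin: float = 0.0) -> torch.Tensor:
    """Pairwise hinge loss that penalizes predictions ordered differently from the labels."""
    pred_diff = pred.unsqueeze(0) - pred.unsqueeze(1)
    label_sign = torch.sign(mos.unsqueeze(0) - mos.unsqueeze(1))
    return torch.relu(margin - label_sign * pred_diff).mean()

# Cosine annealing is built into PyTorch and gradually decays the learning rate.
model = torch.nn.Linear(8, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50)

pred, mos = torch.rand(16), torch.rand(16)
total = plcc_loss(pred, mos) + rank_loss(pred, mos)
```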

The two-stage training process was crucial to the model's success: it first learned broad quality patterns, then was fine-tuned to recognize intricate details. Future work could explore more sophisticated training strategies or target specific challenges such as compression artifacts.
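
As a rough picture of what such a schedule can look like, the sketch below runs the same training loop twice, first at a higher learning rate to capture broad patterns and then at a lower one to refine details. The stage lengths, learning rates, and the stand-in mean-squared-error objective are assumptions, since the training code itself is not published in this article.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QualityModel(nn.Module):
    """Stand-in model: a feature extractor plus a quality-regression head."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
        self.head = nn.Linear(64, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(x)).squeeze(-1)

def run_stage(model: nn.Module, batches, lr: float, epochs: int) -> None:
    """One training stage with its own optimizer and cosine-annealed learning rate."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    for _ in range(epochs):
        for features, mos in batches:
            loss = F.mse_loss(model(features), mos)  # stand-in for the PLCC/rank objectives
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        scheduler.step()

model = QualityModel()
batches = [(torch.rand(8, 32), torch.rand(8)) for _ in range(4)]  # dummy data
run_stage(model, batches, lr=1e-3, epochs=5)   # stage 1: broad quality patterns
run_stage(model, batches, lr=1e-4, epochs=5)   # stage 2: fine-tune for intricate details
```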

“This research bridges the gap between technology and human experience,” said Hang Yu, highlighting its potential impact across fields such as gaming and virtual reality where video quality is vital.
