Artificial intelligence system learns concepts shared across video, audio, and text | MIT News | Massachusetts Institute of Technology

MIT researchers developed a machine-learning technique that learns to represent data in a way that captures concepts shared between visual and audio modalities. Their model can identify where a certain action is taking place in a video and label it.
— Read on news.mit.edu/2022/ai-video-audio-text-connections-0504
