OpenAI released GPT-5 today, with support for seamless, integrated processing of text, images, audio, and video.
OpenAI officially released GPT-5 today, marking the start of a fully multimodal era for large language models. Beyond text generation, GPT-5 can handle image recognition, audio transcription, and video content understanding at the same time, enabling cross-modal reasoning. On multiple benchmarks, GPT-5 outperformed all previous models in visual question answering, speech synthesis, and long-video summarization. OpenAI CEO Sam Altman stated that GPT-5 will first be available to developers via API, with consumer products rolling out in the coming weeks. The release has sparked widespread discussion about the expanding range of AI applications.