OpenAI’s new GPT-4o lets people interact using voice or video in the same model
GPT-4o: Voice and Video Interaction, All in One Model
1
OpenAI just debuted GPT-4o, a new kind of AI model that you can communicate with in real time via live voice conversation, video streams from your phone, and text. The model is rolling out over the next few weeks and will be free for all users through both the GPT app and the web interface, according to the company.
OpenAI CTO Mira Murati led the live demonstration of the new release one day before Google is expected to unveil its own AI advancements at its flagship I/O conference on Tuesday, May 14.
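debut v. to appear or be launched publicly for the first time
demonstration n. a showing of how something works
flagship n. the leading or most important product or event (originally, the lead ship of a fleet)
Translation: OpenAI has just released GPT-4o, a new kind of AI model that users can talk to in real time through live voice conversation, video streamed from a phone, and text. According to the company, the model will roll out gradually over the coming weeks and will be free to all users through the GPT app and the web interface. OpenAI's chief technology officer, Mira Murati, gave the live demonstration one day before Tuesday, May 14, when Google is expected to present its own AI advances at its flagship I/O conference.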
GPT-4 offered similar capabilities, giving users multiple ways to interact with OpenAI’s AI offerings. But it siloed them in separate models, leading to longer response times and presumably higher computing costs. GPT-4o has now merged those capabilities into a single model, which Murati called an “omnimodel.” That means faster responses and smoother transitions between tasks, she said.
“We’re looking at the future of interaction between ourselves and the machines,” Murati said of the demo. “We think that GPT-4o is really shifting that paradigm into the future of collaboration, where this interaction becomes much more natural.”
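silo v. to keep things isolated in separate compartments (n. a storage tower or underground store)
paradigm n. a model or pattern; an established way of doing something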
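Translation: GPT-4 offered similar features, giving users several ways to interact with OpenAI's AI products. But it kept those features isolated in different models, which led to longer response times and probably higher computing costs. GPT-4o now combines them in one model, which Murati called an "omnimodel." She said this means faster responses and smoother transitions between tasks. "We're looking at the future of interaction between ourselves and the machines," Murati said of the demo. "We think that GPT-4o is really shifting that paradigm into the future of collaboration, where this interaction becomes much more natural."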
Like previous generations of GPT, GPT-4o will store records of users’ interactions with it, meaning the model “has a sense of continuity across all your conversations,” according to Murati. Other new highlights include live translation, the ability to search through your conversations with the model, and the power to look up information in real time.
As is the nature of a live demo, there were hiccups and glitches. GPT-4o’s voice might jump in awkwardly during the conversation. It appeared to comment on one of the presenters’ outfits even though it wasn’t asked to. But it recovered well when the demonstrators told the model it had erred. It seems to be able to respond quickly and helpfully across several mediums that other models have not yet merged as effectively.
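hiccups and glitches: minor problems or malfunctions
outfit n. a set of clothes worn together (also: a team or organization)
demonstrator n. a person who shows how something works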
Prior to GPT-4o, you could use Voice Mode to talk to ChatGPT with latencies of 2.8 seconds (GPT-3.5) and 5.4 seconds (GPT-4) on average. To achieve this, Voice Mode is a pipeline of three separate models: one simple model transcribes audio to text, GPT-3.5 or GPT-4 takes in text and outputs text, and a third simple model converts that text back to audio. This process means that the main source of intelligence, GPT-4, loses a lot of information—it can’t directly observe tone, multiple speakers, or background noises, and it can’t output laughter, singing, or express emotion.
latency n. the delay between an input and the response to it
tone n. the manner of speaking that conveys attitude or emotion
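Translation: Before GPT-4o, you could use Voice Mode to talk to ChatGPT with an average delay of 2.8 seconds (GPT-3.5) or 5.4 seconds (GPT-4). Voice Mode does this with a pipeline of three separate models: a simple model transcribes audio to text, GPT-3.5 or GPT-4 takes in text and outputs text, and a third simple model converts that text back to audio. As a result, GPT-4, the main source of intelligence, loses a lot of information: it cannot directly observe tone, multiple speakers, or background noise, and it cannot output laughter, singing, or emotional expression.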
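The three-stage Voice Mode pipeline described above can be sketched in a few lines of Python. This is a toy illustration only, not OpenAI's implementation: `AudioClip`, `transcribe`, `generate_reply`, `synthesize`, and `voice_mode` are hypothetical names, and each stage is a stub. It shows only why a text-only middle stage cannot see tone, speaker count, or background noise.

```python
from dataclasses import dataclass


@dataclass
class AudioClip:
    """Toy stand-in for recorded speech (hypothetical structure)."""
    words: str
    tone: str = "neutral"
    speakers: int = 1
    background: str = "quiet"


def transcribe(clip: AudioClip) -> str:
    """Stage 1: speech-to-text. Only the words survive this step."""
    return clip.words


def generate_reply(prompt: str) -> str:
    """Stage 2: text-in, text-out language model, stubbed as an echo."""
    return f"You said: {prompt}"


def synthesize(text: str) -> AudioClip:
    """Stage 3: text-to-speech. The output gets default tone and background."""
    return AudioClip(words=text)


def voice_mode(clip: AudioClip) -> AudioClip:
    """Chain the three stages in the order the paragraph describes."""
    return synthesize(generate_reply(transcribe(clip)))


# The anxious tone, second speaker, and thunder never reach stage 2.
question = AudioClip(words="Is it raining?", tone="anxious",
                     speakers=2, background="thunder")
answer = voice_mode(question)
print(answer)
```

Because `transcribe` returns a plain string, everything in the clip other than its words is discarded before the language model runs. That is the information loss the passage attributes to the pipeline design.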
END
Useful Sentence Patterns for Writing
Because GPT-4o is our first model combining all of these modalities, we are still just scratching the surface of exploring what the model can do and its limitations.
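scratching the surface of exploring ...: having only just begun to explore ...
Translation: Because GPT-4o is our first model to combine all of these modalities, we have only begun to explore what the model can do and where its limits lie.
Translation Practice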
GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction—it accepts as input any combination of text, audio, image, and video and generates any combination of text, audio, and image outputs.
     
Permalink: https://shikelang.cc/post/1261.html