OpenAI has announced a new update to the Voice Mode in ChatGPT, available on both the web and mobile app. According to Al-Bawaba Tech, the update allows direct voice interaction within the ongoing conversation, enabling users to see a live transcription of the dialogue and accompanying visual elements—such as maps or images—without switching to a separate interface.
Key Features of the Update
- Integrated Voice Interaction: Users can start a voice conversation by tapping the "sound waves" icon next to the text input field. Unlike the previous standalone interface, voice mode is now fully integrated into the chat, making it easy to switch between speaking and typing.
- Visual Enhancements: As shown in OpenAI's demo video, the model can display the transcribed text of the conversation, maps of locations the user asks about, and additional images alongside the voice response.
- Option to Use the Classic Interface: Users can revert to the old voice mode interface by enabling "Separate Mode" in the Voice Mode settings.
Gradual Rollout and Improved User Experience
- The integrated voice mode is being rolled out gradually on web and mobile platforms.
- The update improves usability by reducing the friction between different conversation modes, addressing user feedback about the previous separation of voice and text interfaces.
- This update aligns with ChatGPT's multimodal capabilities, allowing users to issue voice commands accompanied by images or videos while receiving visually integrated responses.
Background and Enhancements
- Voice Mode Launch: Originally introduced in September 2024.
- June 2025 Update: Added enhanced voice expressiveness and real-time translation.
- Voice Mode remains a core feature of ChatGPT, now offering a seamless blend of voice, text, and visual content.
This integration makes ChatGPT more interactive and visually informative, enhancing the overall conversational experience for users who rely on both voice and visual cues.