ChatGPT has been busy getting new designations. If you’ve been scrolling on 𝕏 over the last week, then you’ve seen the ChatGPT-4o announcement and probably thought of Joaquin Phoenix’s virtual girlfriend on Her.
Beyond the references to flicks, the latest update, released in May 2024, should benefit developers in terms of performance. These improvements are very specific. Some devs are already claiming that 4o’s code is of better quality (some dispute this). Others point out that its “contextual window” is larger. (Our take: it’s too soon to know if these allegations are accurate or one-off situations). In any case, many coders and designers have already been showcasing their implementations on social media.
Let’s quickly talk about what’s new with the updated version (just so everyone is on the same page), and then we can dig into some of the most useful ways developers work with ChatGPT-4 and ChatGPT-4o.
ChatGPT Plugins For Developers →
So… what’s new about GPT-4o?
The all-new ChatGPT-4o isn't just about text any more—it's a multimodal machine that also works with voice. This adds up to a new modal capability on top of 2023’s version 4, which already worked with images, sometimes in a lacklustre fashion. Now that the chatbot can speak up, there are plenty of possible scenarios that will help just about any white-collar trade. We're talking about accessibility tools for the visually impaired, for example, or having a live translator from your phone.
To put it clear, the two main new features of ChatGPT-4o:
- Multimodal capabilities with voice: ChatGPT-4 “understood” (rather poorly) images and could produce images. The new version can also process voice.
- Conversational AI: This chatbot understands voice and also speaks up. One of the most talked-about is the ability to engage in real-time, voice-based conversations. It’s a Siri but on steroids.
Before we continue, let's also take a moment to look back at the improvements made by the previous launch of ChatGPT-4 in March 2023:
- Increased word limit (25,000 vs. 3,000)
- Image reading capabilities
- 82% less likely to respond to requests for disallowed content (This is an official metric; parties that are not OpenAI might dispute how helpful this gigantic data point actually is).
- More concise, less wordy answers (which led to many users calling it a lazier chatbot)
- Cleaner formatting of answers (again, according to OpenAI)
- Possibility of defining the model’s tone, style, and behaviour
This is just the start. We'll be looking at new ways to use this tool to speed up your coding tasks.
Is the product called “ChatGPT-4o”?
No, the product we usually use with the URL is still called ChatGPT. What’s changing is the engine running behind it, with its newest version called GPT-4o. But many authoritative outlets are calling this new release “ChatGPT-4o” to make it easier for readers to understand we’re talking about a new product with the same UX. We’re also using the terms loosely because it’s easier to follow that way.
Is GPT-5 coming soon?
There is no official release date for version 5, and there is no reason to think that GPT-5 is around the corner. The hype for it is so big that when OpenAI announced they were showing off GPT-4o, they had to first make it clear that they weren’t dropping version 5.
What GPT-5 will do is only speculation. Some are too quick to call it something close to AGI. But researchers are sceptical and are telling GPT-5 will do the following:
- It will hallucinate, just as in its previous versions—this will keep happening because hallucinations are embedded into an LLM’s architecture
- It will be quicker than each previous version
Some researchers believe that version 5 is so far away that OpenAI was forced to launch 4o. These experts think the company reached a period of diminishing returns and opted to work on something unusual—voice—instead. This is all guesswork, since OpenAI is as closed-off and secluded as companies come.
Ways developers can use ChatGPT-4 and GPT-4o
As soon as GPT-4o dropped, the dev community started sharing some use cases that, some claim, couldn’t be done with GPT-4. Some of these use cases are just for kicks, while others are bonafide time-savers.
We’re collecting an assortment of tasks that devs shared they could do with either ChatGPT-4o or GPT-4. If you can’t do some of this with version 4, it means you should try your luck with the 4o release instead.
1. Code with a voice assistant
Too obvious to point out, yet too practical to skip trying it. GPT-4o has a voice synthesis feature with which you can interact with. The main use case here involves using all its multimodal capacities, so it works as a teacher. For example, if you have a hard time cracking a bug on your PyCharm IDE, you can show it to GPT-4o (image recognition) and allow it to talk with you until it finds out what’s wrong. It’s the closest to being in an office and having a colleague sit next to you and check your code that we’ve ever had with assistants.
Now, will this method be reliable or even useful, or did the live demo stretch it a bit? We can’t tell right now. The best idea is to try it. If it doesn’t work in the first three or four interactions, keep coding like you have and let OpenAI roll an update before trying this again.
2. Generate templates to start with
A software engineer, an artist, or a philosopher usually finds facing a blank page at the beginning of a project nerve-wracking. Luckily, ChatGPT-4 can help you hit the ground running by generating foundational code templates. Need a basic React component structure? Simply provide a prompt, and ChatGPT-4 will provide a customisable boilerplate template.
3. Write GitHub readmes
A well-written GitHub readme acts as the introduction for your project. ChatGPT-4 can help you create cool readmes that provide users with a clear understanding of your project's goals, installation instructions, and usage examples. Outline the details you want to point out, and ChatGPT-4 will translate them into a simple readme.
In any case, continually try to edit the AI version to avoid generic expressions like “cutting-edge technology” or “in the ever-evolving landscape” that imply ChatGPT's intervention. (It’s not a sin to use ChatGPT, but if you’re using those empty phrases everywhere, you’ll weaken your position of authority because it means you’re using it wrong, and it’s difficult to come back from that.)
4. Infrastructure as code templates
Managing infrastructure can be a time-consuming and highly error-prone process. ChatGPT-4 offers a solution by helping with the creation of Infrastructure as Code (IaC) templates. Describe the desired infrastructure layout, including virtual private clouds (VPCs), subnets, and security groups and then ChatGPT-4 will generate code that defines these resources within tools like CloudFormation.
5. Write shell scripts
ChatGPT-4 can understand your desired functionality and generate shell script code for you. Here's an example:
Imagine you need a script to automate file backups every week. Traditionally, you'd need to research the appropriate commands, write the script logic, and test it thoroughly. With ChatGPT-4, you'd simply describe your preferred functionality – “create a script that backs up all files in a directory every Monday.” ChatGPT-4 would then generate the script for you, likely including:
- Commands to navigate to the target directory.
- File archiving commands (like tar or cp).
- Scheduling using tools like cron to run the script on Mondays.
6. Creating mini-games
This one is already a year old, but it’s still worth checking out. ChatGPT-4 is capable of creating mini-games like snake and pong with JavaScript and HTML5 in one prompt (see below). Now, you’re probably thinking, how does this help you in your everyday life? Well, a lot of developers create mini-games for their portfolios to demonstrate their JS coding skills. You can obviously take these more simple game frameworks and customise. The 4o version allegedly now can do this same thing based on info from a screenshot.
Another Twitter user asked ChatGPT-4 to write HTML that would change what video a user sees based on the time of day. Very cool for web developers and a great detail to add to your portfolio if you want it to stand out.
7. 3D Designs
ChatGPT-4 can be integrated into Unity Editor, which is used for 3D modelling and game design. Prompts can be turned into 3D images. All in all, it’s going to help you model faster and become a lot more productive.
8. Debugging your code
If you’re an experienced developer, it’s probably going to be the other way around: You're debugging the AI code. But if you’re starting out or learning a new programming language, ChatGPT-4 is a great mentor. Firstly, you can share more code with the ChatGPT-4, and query the bot for troubleshooting advice.
9. Finding security vulnerabilities
Although we certainly should, we don’t always have security front and centre when coding or building applications. Okay, you’ve incorporated 2FA, but let’s be real, that’s not really going to do much if a hacker takes aim at your SME. Why not test your code with ChatGPT-4? It can help point out vulnerabilities and do some Quality Assurance. The example below is an Ethereum contract that had been hacked, and ChatGPT-4 points out the exact flaws.
10. Making helpful extensions
People have been going crazy over the last year making extensions that harness the power of ChatGPT-4 — but you can also use ChatGPT-4 to create your own custom extensions. Whatever tedious tasks you find yourself repeating could be automated or sped up with a productivity extension. For example, one user created an extension that saved his most common links so that he could simply right-click and paste the link. The example below is a pirate summariser, which summarises on-page text into pirate speak. It took this guy, who has no coding experience, only a few hours!
11. Transactional data
People were kind of criticising this post back then. Mostly because there are tools that already existed which could parse transactional data like the example below. It’s also quite simple. But we wanted to include this use case anyway because we think the idea could be very useful at a more complex level, or it might be more cost-effective than other apps. Regardless, the example here is someone putting a credit card transaction in and asking for merchant information in JSON format.
12. Copilot Microsoft Excel
Copilot for Microsoft Excel is awesome for anyone dealing with data and creating reports. It can break down data and provide useful insights. You can ask questions about your data and quickly create reports and data visualisations in a fraction of the time. It speeds up your workflow if you’re an experienced Excel user and provides a whole new level of access to those inexperienced users.
13. Sketches into website
As we mentioned before, ChatGPT-4 is a multimodal model, so since last year you can upload images and ask ChatGPT-4 prompts based on your image. In the demo, they turned a hand-drawn sketch into a functional website. The video below is from the demo, it’s quite simple.
14. UI design
UI developers, UI designers, and web developers — you can create UI designs with ChatGPT-4. The tool being demoed is called Galileo AI. You describe the page that you want to be created with a simple text description, and Galileo will generate a design in an instant (with pictures and everything!). All the designs generated are editable in Figma.
15. Learning a language
Whether you’re a native learning English or an ex-pat learning a local language, you kind of have to be at a two-language minimum in Europe. Plus, being multilingual gives you a huge advantage when it comes to job search. Duolingo is probably one of the biggest language learning apps, and last year they incorporated ChatGPT-4 into their application, which has taken language tutoring to a whole new level. Since the integration, the app can explain things to you based on your personal inputs and role play (i.e. speaking with my parents or going on a date).
16. Customer service
You’ll probably see a lot of chatbots incorporated with ChatGPT-4 in the future. Intercom was the first one. Their customer service bot, Fin, has been built entirely on ChatGPT-4. If you want to build a chatbot for your website, it’s totally possible with this technology. Or you can incorporate Intercom into your product. GPT-4o's new voice assistant will likely make your next customer service phone call sound too pristine to be true.
A quick word on Google…
Google's AI department hasn't been sitting back. While Google initially introduced its AI chatbot as Bard in February 2023, it was later unified with Duet AI under the Gemini brand in February 2024. Their chatbot now also supports voice and has a context window of “2 million tokens.” You can now upload your documents directly.
Just like ChatGPT and every LLM out there, Gemini’s new version still hallucinates a great deal. If you ask Gemini about Gemini’s new features, it tells you that it has increased its context window a staggering 2 million times (yes, not to 2 million, but by 2,000,000x). If you read that twice, you’ll notice it would take more than 7 trillion dollars—what competitor Sam Altman asked for—to get such an increase.