Finally! ChatGPT Can Create Text-Heavy Graphics That Actually Work

If you've tried using AI to create graphics with text before, you know the frustration of garbled letters, bizarre fonts, and completely illegible results. That's all changed now. Here's how to make it work for your creator business.

Jacob Anderson

Apr 3, 2025 • 14 min read

There's this moment when playing around with new tech that I absolutely love. It's when you try something that previously never worked, fully expecting it to fail again - but then it works flawlessly. That little surge of dopamine mixed with morning coffee is just like crack. Very moreish.

I had that exact sensation when testing ChatGPT's latest image generation update. Over the past few years, as LLMs like GPT-4o and Claude have surged in capability, tools like Midjourney have led the way in image generation, and it's been really impressive to watch. But there's been one thorn in their side... text. You would ask for an infographic, a cover design, or a simple social quote, and get back something with text that looked like it was written by a toddler with a crayon. In the dark. While riding a unicycle.

A poorly made AI generated image showing a circus, with nearly illegible text — This was state of the art this time last year

But that's finally changed. In March 2025, OpenAI rolled out a significant update to ChatGPT with GPT-4o image generation capabilities, and the most impressive improvement is that it can now create images with perfectly legible text. Not just a word or two - entire paragraphs, multiple sections, labels, titles, and more. The text is not just readable; it's properly formatted, consistently styled, and correctly placed.

A victorian style circus poster generated by ChatGPT — From ChatGPT today. Notice how "sweet treats" cuts off there? I'll explain that shortly.

To test this properly, I decided to create something genuinely useful: a cheat sheet for our PATH podcast planning framework. I wanted to see if ChatGPT could transform this educational content into a visually appealing, text-heavy graphic that actually looked professional.

The results? Honestly, I was gobsmacked. Not only did it work, but it created something that looks like it came straight out of Canva—professional, clean, and with perfectly rendered text.

The Technical Breakthrough

This isn't a small improvement. Previous AI image generators (including older versions of DALL-E and Midjourney) struggled to produce more than a few words without introducing bizarre errors. You'd often get:

Doubled letters
Nonsensical pseudo-text that looked like words but wasn't
Characters that merged into each other
Words that trailed off into illegible smudges

The GPT-4o update has essentially solved this problem. According to OpenAI, they've developed a system that "excels at generating coherent and legible text within images." Based on my testing, that's not marketing hype - it's genuinely accurate.

What's truly impressive is the range of text-heavy visuals it can handle:

Infographics with multiple sections and labels
Cheat sheets with bulleted lists and headings
Social graphics with titles and descriptions
Podcast cover art with perfectly rendered titles

The breakthrough seems to involve a separate layer in the image generation process that specifically handles text rendering. While OpenAI hasn't confirmed this technical detail, the difference in performance suggests a significant architectural change rather than just an incremental improvement.

My Step-by-Step Process: Creating the PATH Framework Cheat Sheet

PATH framework cheatsheet for planning your podcast, covering your Purpose, Audience, Topic, Hallmark

Let me walk you through exactly how I got ChatGPT to create a professional cheat sheet, so you can replicate the process for your own content.

Step 1: Don't Let ChatGPT Write Your Copy

This was my first big discovery. When you ask ChatGPT to create an image, it behaves as if a different model is doing the work than the one you're chatting with. It seems to be optimized for visual creation rather than understanding complex instructions about content.

My first attempt was a total miss. I fed in our entire "How to Start a Podcast" article and asked it to create a cheat sheet for the PATH framework. The text in the generated image was completely mangled: some words were smudged, others had doubled-up letters, and some text was just plain unreadable.

📘

Key lesson: You need to supply the exact text you want to appear in your image. Don't expect ChatGPT to extract it from a large piece of content or generate appropriate text on its own.

Step 2: Pre-process Your Content

Since I needed to give ChatGPT the exact text, I first used ChatGPT in text mode to help me distill our PATH framework into concise, cheat-sheet-friendly content.

I asked it to take our detailed framework description and transform it into four short sections—one for each letter of PATH (Purpose, Audience, Topic, Hallmark)—with bullet points and a "Task" section for each.

This gave me perfectly formatted Markdown text that I could then use in my image prompt. Using Markdown was particularly effective, as it helped ChatGPT understand the text hierarchy and formatting.
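
If you keep your copy in a script or spreadsheet rather than a chat window, the same Markdown structure can be generated programmatically. The snippet below is a minimal sketch, not the process described above: the `Section` class and `to_markdown` function are hypothetical names made up for illustration, and the sample copy is borrowed from the Purpose box of the PATH cheat sheet.

```python
from dataclasses import dataclass


@dataclass
class Section:
    """One cheat-sheet box: a heading, an intro line, bullets, and a task."""
    heading: str
    intro: str
    bullets: list[tuple[str, str]]  # (label, description) pairs
    action_step: str


def to_markdown(title: str, subtitle: str, sections: list[Section]) -> str:
    """Render the copy as Markdown so the text hierarchy is explicit."""
    lines = [f"## **{title}**", f"### *{subtitle}*"]
    for section in sections:
        # A horizontal rule separates each box from the previous block.
        lines += ["", "---", "", f"### **{section.heading}**", "", section.intro, ""]
        lines += [f"- **{label}** - {text}" for label, text in section.bullets]
        lines += ["", f"✅ **Action Step**: {section.action_step}"]
    return "\n".join(lines)


purpose = Section(
    heading="P - Purpose: Why Are You Creating This Show?",
    intro='Your "why" sets direction and fuels motivation when challenges arise:',
    bullets=[
        ("Marketing", "Build personal or business brand authority"),
        ("Community", "Connect with like-minded enthusiasts"),
        ("Education", "Share valuable knowledge and insights"),
    ],
    action_step="Define your core purpose in one sentence",
)

markdown_copy = to_markdown(
    "PATH: A Podcast Planning Framework",
    "Your Blueprint for Podcasting Success",
    [purpose],
)
print(markdown_copy)
```

Keeping each box in the same shape (heading, intro, three bullets, one action step) also makes it easier to keep the amount of text per box roughly even.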

Step 3: Provide Visual References

Another breakthrough moment came when I realized ChatGPT can use images as style references. I provided two key references:

A screenshot of a well-designed cheat sheet from a LinkedIn creator (for layout inspiration)
A screenshot of the hero section from our Alitu.com blog (for color palette and branding)

By giving ChatGPT these visual references alongside specific instructions on how to use them, it was able to create something that matched both the layout style of the reference cheat sheet and our brand's visual identity.

This is a massive advantage over other AI image generators—the ability to upload reference images and have the model understand and apply their styles is surprisingly effective.

Step 4: Create a Structured Prompt

After several iterations, I found that a well-structured prompt with clear sections works best. Here's the basic template I used:

Create an image: [Brief description of what you want]

Use these references:
- For layout, use the cheat sheet image I've attached
- For color palette and style, use the screenshot of our website

Layout rules:
- Make it a 4-box layout with P, A, T, H in separate boxes
- Put P in top left, A in top right, T in bottom left, H in bottom right
- Include a checkmark icon before each "Task" section
- Use a clean, legible font for all text

Here's the exact copy to use:

[Title]
PATH Framework: Podcast Planning Made Simple

[Box 1 - Purpose]
Define why you're creating your podcast
• What business goal does it serve?
• How will you measure success?
Task: Write down your primary and secondary podcast goals

[Box 2 - Audience]
...etc.

This structured approach gave ChatGPT everything it needed to create a clean, professional result.
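
For anyone who reuses this template often, here is a rough sketch of how the sections could be assembled in Python. It only builds the prompt text to paste into ChatGPT (it does not call any API), and `build_image_prompt` and its parameters are hypothetical names chosen for this example.

```python
def build_image_prompt(description, references, layout_rules, copy_text):
    """Assemble a sectioned prompt: brief, references, layout rules, exact copy."""
    parts = [f"Create an image: {description}", "", "Use these references:"]
    parts += [f"- {ref}" for ref in references]
    parts += ["", "Layout rules:"]
    parts += [f"- {rule}" for rule in layout_rules]
    # The copy goes last and is passed through untouched apart from trimming.
    parts += ["", "Here's the exact copy to use:", "", copy_text.strip()]
    return "\n".join(parts)


prompt = build_image_prompt(
    description="A cheat sheet for the PATH podcast planning framework",
    references=[
        "For layout, use the cheat sheet image I've attached",
        "For color palette and style, use the screenshot of our website",
    ],
    layout_rules=[
        "Make it a 4-box layout with P, A, T, H in separate boxes",
        "Use a clean, legible font for all text",
    ],
    copy_text="""
[Title]
PATH Framework: Podcast Planning Made Simple

[Box 1 - Purpose]
Define why you're creating your podcast
""",
)
print(prompt)
```

The reference images still have to be attached by hand in the chat; the script only keeps the wording and section order consistent between attempts.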

Step 5: Refine Through Conversation

One of the most powerful aspects of ChatGPT's image generation is that you can refine your output through natural conversation. If something isn't quite right, you can simply tell it what to change.

For example, on my first successful attempt, the boxes were arranged in a strange order (P-H-T-A instead of P-A-T-H). I simply told ChatGPT, "Please rearrange the boxes so they're in order: P and A on top, T and H on bottom." And it recreated the image with the correct layout.

This conversational approach to refinement is significantly easier than having to rewrite entire prompts or start over, as is often necessary with other AI image generators.

The Exact Prompt Template That Worked

For those who want to see exactly what worked, here's the full prompt I used to create the final version of our PATH framework cheat sheet:

Create image Create a cheatsheet for the PATH podcast planning model, for our brand The Podcast Host. 

Use the attached cheatsheet for layout inspiration. 
Use the attached screenshot of our blog landing page for design inspiration. 

<rules>

Arrange like this:
* Top left: Purpose
* Top right: Audience
* Bottom left: Topic
* Bottom right: Hallmark

Design:
* Do not include brand name or logo
* Make checkmark background yellow
* Make background blue with the same waves seen on the website
* Montserrat typeface

</rules>

Here's the copy: 

## **PATH: A Podcast Planning Framework**
### *Your Blueprint for Podcasting Success*

---

### **P — Purpose: Why Are You Creating This Show?**

Your "why" sets direction and fuels motivation when challenges arise:

- **Marketing** – Build personal or business brand authority
- **Community** – Connect with like-minded enthusiasts
- **Education** – Share valuable knowledge and insights

✅ **Action Step**: Define your core purpose in one sentence

---

### **A — Audience: Who Will Benefit Most?**

Knowing your ideal listener shapes every decision you make:

- **Demographics** – Age, location, profession of your listener
- **Challenges** – Problems they're actively trying to solve
- **Interests** – Topics that naturally capture their attention

✅ **Action Step**: Create a one-paragraph listener persona

---

### **T — Topic: What's Your Unique Conversation?**

Craft content that resonates with both you and your audience:

- **Core Focus** – The central theme everything revolves around
- **Expertise** – Knowledge or experience you bring to the table
- **Voice** – Your authentic perspective and communication style

✅ **Action Step**: Write your show's premise in 1-2 sentences

---

### **H — Hallmark: What Makes Your Show Memorable?**

Your podcast's distinctive elements that drive growth:

- **Niche** – A specific focus that serves an underserved need
- **Format** – A distinctive structure listeners can anticipate
- **Outcome** – The transformation listeners experience

✅ **Action Step**: Identify your podcast's primary differentiator

Best Practices for Text-Heavy Graphics

Through my experimentation, I've figured out a few best practices that consistently produce better results:

1. Use Markdown for Text Formatting

ChatGPT handles Markdown extremely well in image generation prompts. It correctly interprets:

Bold text for emphasis
Italic text for secondary emphasis
Bullet points (•) for lists
Headings for different levels of text

When I provided text with Markdown formatting, ChatGPT carried that hierarchy into the final image. Without Markdown, it sometimes struggled to determine what should be headings, what should be body text, etc.

2. Be Specific About Layout

The more precise you are about layout, the better the results. Instead of saying "create an infographic," say "create an infographic with a title at the top, three columns below, and a call-to-action at the bottom."

This level of specificity helps the model structure the information correctly and ensures nothing important gets misplaced or cut off.

3. Avoid Overly Complex Instructions About Typography

I found that ChatGPT doesn't handle very specific typography instructions well. When I tried to specify exact font sizes or typefaces, it introduced more visual errors and "smudges" in the text.

Instead, use simpler descriptive terms like:

"Large, bold heading"
"Smaller body text"
"Clean, modern font"
"High contrast text for readability"

This general guidance gives the model flexibility while still guiding it toward your desired outcome.

4. Pay Attention to Text Length

Balance is crucial when working with multiple sections. Keep text amounts relatively consistent across similar elements, or explicitly state how to handle different text lengths.

In my cheat sheet example, I made sure each of the four PATH sections had roughly the same amount of text. When I tried versions with drastically different text lengths, the layout became awkward.
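
One way to catch lopsided sections before generating anything is a quick word count. The sketch below is an illustration only: `unbalanced_sections` is a hypothetical helper, and the 50% tolerance around the median is an arbitrary assumption, not a threshold from my testing or from OpenAI.

```python
from statistics import median


def unbalanced_sections(sections, tolerance=0.5):
    """Return names of sections whose word count is far from the median.

    `tolerance` is a fraction of the median word count (an assumed default).
    """
    counts = {name: len(text.split()) for name, text in sections.items()}
    typical = median(counts.values())
    return [
        name
        for name, count in counts.items()
        if abs(count - typical) > tolerance * typical
    ]


balanced = {
    "Purpose": "Define your core purpose in one sentence",
    "Audience": "Create a one-paragraph listener persona",
    "Topic": "Write your show's premise in 1-2 sentences",
    "Hallmark": "Identify your podcast's primary differentiator",
}

# Same copy, but with one box carrying far more text than the others.
lopsided = dict(
    balanced,
    Hallmark=balanced["Hallmark"]
    + " and explain it in much more detail than any other box on the sheet",
)

print(unbalanced_sections(balanced))   # nothing flagged
print(unbalanced_sections(lopsided))   # flags the oversized Hallmark box
```

Word count is only a proxy for how much space text takes up in the image, so treat a flagged section as a prompt to trim or rebalance rather than a hard rule.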

5. Use Multi-Turn Refinement

Don't expect perfection on the first try. The real power comes from the ability to refine your image through conversation:

"Make the title larger for better readability"
"Increase the spacing between sections"
"Change the background color to something lighter"
"Make sure all text is fully visible and not cut off"

These iterative refinements often produce dramatically better results than starting over with a new prompt.

More Examples & Tests

After the success with the cheat sheet, I tried several other experiments to test the limits of this new capability.

Podcast Cover Art

I was curious how well it would handle creating podcast artwork, so I tried to create a Spanish-language podcast cover inspired by Matthew's A Scottish Podcast artwork.

I provided the existing podcast cover as a reference and asked for a Spanish-themed version. The result was impressive—it got the layout right and created something that looked genuinely professional. The text was perfectly rendered, though it did struggle to match the exact font style from the original (a slightly horror-movie inspired font).

When I asked it to change "Spanish Niche" to simply "Spanish," it made the change perfectly. This kind of simple editing would have been impossible with previous AI image generators.

💡

Check out Colin's full video on how to create podcast artwork with ChatGPT. It covers reference prompts like the ones above, as well as working with photos and a full style guide.

Style Transfer Experiments

A street mural featuring Colin and Matthew, titled "The Podcast World"

I also experimented with combining different reference images to create mashups. For example, I tried to create a Beavis and Butthead-style mural featuring Colin and Matthew (from our team).

This experiment perfectly showcases the power of ChatGPT's conversational approach to image generation. My initial prompt was simple:

Create image: Recreate this photograph in the style of the beavis and butthead graffiti. Make the text say 'the podcast world'.

The first attempt showed Colin and Matthew in a cartoon style with the text, but it wasn't quite what I was looking for. So I followed up with: "Nice! Can you make it look like it's actual graffiti in a back alley?"

The second version was much better—it placed the cartoon in a realistic urban setting with a weathered wall, door, and authentic street feel. But the cartoon style still wasn't quite right, so I made one more request: "The setting is perfect! Love the weathering, the door and the street wall. But can you make them look more like the gnarly caricature style of beavis and butthead? Without losing their unique features."

The third attempt nailed it. The environment was perfect, the text was clear, and the cartoon style was much closer to what I wanted—all without having to start over with a new prompt. This kind of iterative refinement through natural conversation is something that sets ChatGPT's image generator apart from other tools.

The process involved in creating the mural. Two references: a photograph, and the mural to take inspiration from. Three revisions to reach the final result.

Interestingly, I found that for well-known styles, you don't even need to provide a reference image. I simply asked for an image "in the style of Studio Ghibli," and it created something surprisingly on-target. This works because these styles are well-represented in the training data.

ChatGPT vs. Other Tools

How does ChatGPT's new image generator compare to other options? Here's what I've found through testing:

ChatGPT vs. Midjourney

Midjourney has long been considered the gold standard for AI image quality, particularly for artistic styles. However, it has a critical weakness for creators: text handling.

In side-by-side tests:

Text rendering: ChatGPT produces clear, accurate text for entire paragraphs. Midjourney still struggles with more than a few words.
Ease of use: ChatGPT uses a conversational interface where you can describe what you want in plain language. Midjourney requires learning specific Discord commands and parameters.
Editing capabilities: ChatGPT allows conversational refinement—tell it what to change, and it updates the image. Midjourney typically requires generating new variations or starting over.
Consistency: ChatGPT maintains style and elements across multiple images in a conversation. Midjourney treats each generation independently.

For purely artistic images with minimal text, Midjourney still produces stunning results. But for anything text-heavy—infographics, cheat sheets, labeled diagrams—ChatGPT is now dramatically superior.

ChatGPT vs. Traditional Design Tools

This raises an obvious question: is this the end of Canva and similar design tools?

Not quite. Traditional design tools still offer advantages:

Pixel-perfect control over every element
Vast libraries of templates and assets
Consistent output every time
No generation limits or wait times

However, ChatGPT offers something that traditional tools don't: the ability to go from concept to finished visual in a single conversation. There's no need to learn interfaces, place elements, or adjust spacing manually.

For quick, one-off graphics—especially when you're not a designer—ChatGPT's approach is remarkably efficient. I can see it becoming the go-to tool for creators who need to produce visual content regularly but don't have design expertise.

Current Limitations & Workarounds

While this is a massive breakthrough, it's not perfect. Here are the key limitations I've encountered and how to work around them:

1. Text Styling Limitations

ChatGPT doesn't yet reliably follow specific instructions about font sizes or typefaces. When I tried to specify exact font styles, it actually introduced more "smudges" in the text, almost as if it got confused about what to do.

💡

Workaround: Describe text in general terms (e.g., "use clean, modern typography" or "use bold text for headings") rather than naming specific fonts or sizes.

2. Text Overflow Issues

One persistent challenge was fitting all the text into the design. If you provide too much text for the layout, ChatGPT will try to squeeze it in—sometimes resulting in text getting cut off at the bottom or becoming too small to read.

OpenAI acknowledges this limitation, noting that GPT-4o "can occasionally crop longer images, like posters, too tightly, especially near the bottom."

💡

Workaround: Keep your text concise, and explicitly mention "include padding" or "ensure all text fits without being cut off" in your prompt. If you see overflow in the result, ask ChatGPT to regenerate with "more space between elements" or "larger margins."

3. Font Consistency

While text is now readable, you might notice slight inconsistencies in how letters are rendered. The same "font" might look slightly different across sections or elements.

💡

Workaround: Generate everything in one chat session with a consistent style description. For multi-page content like carousels, describe the style in detail for the first image, then ask ChatGPT to "use the same style as before" for subsequent images.
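
As a rough sketch of that carousel approach, the prompts for a series could be drafted up front so the style description appears once and every later slide refers back to it. `carousel_prompts` is a hypothetical helper written for this example, and the slide copy and style text are placeholders.

```python
def carousel_prompts(style, slides):
    """Build one prompt per slide, to be sent in order in the same chat.

    Only the first prompt carries the full style description; the rest
    ask for the same style as before.
    """
    prompts = []
    for index, copy_text in enumerate(slides):
        if index == 0:
            style_line = f"Style: {style}"
        else:
            style_line = "Use the same style as before."
        prompts.append(
            f"Create an image: slide {index + 1} of {len(slides)}.\n"
            f"{style_line}\n"
            "Here's the exact copy to use:\n"
            f"{copy_text}"
        )
    return prompts


prompts = carousel_prompts(
    style="Blue background, yellow accents, clean modern typography",
    slides=["Slide one copy", "Slide two copy", "Slide three copy"],
)
for prompt in prompts:
    print(prompt, end="\n\n")
```

This only works if the prompts go into a single conversation in order, since "the same style as before" depends on the earlier image being in the chat history.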

4. Generation Limits and Speed

You can do about 20-30 image generations per hour before ChatGPT asks you to slow down. The longest wait I encountered was about 10 minutes before I could generate more images.

💡

Workaround: Plan your creative sessions accordingly, and do your text preparation before jumping into image generation. If you need to create many similar graphics, batch your requests efficiently.
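
If you are batching a lot of graphics, a simple local counter can tell you when you are likely to hit the slowdown. This is a minimal sketch under stated assumptions: `GenerationPacer` is a hypothetical class, the default of 20 per hour comes from my rough observation above rather than any published limit, and it only tracks what you record yourself.

```python
import time
from collections import deque


class GenerationPacer:
    """Sliding-window counter: at most `limit` generations per `window` seconds."""

    def __init__(self, limit=20, window=3600.0, clock=time.monotonic):
        self.limit = limit          # assumed budget, not an official figure
        self.window = window
        self.clock = clock          # injectable so the pacer can be tested
        self._stamps = deque()

    def seconds_until_next(self):
        """Return 0.0 if a generation fits the budget now, else the wait in seconds."""
        now = self.clock()
        # Forget generations that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            return 0.0
        return self.window - (now - self._stamps[0])

    def record(self):
        """Call this each time you request an image."""
        self._stamps.append(self.clock())


pacer = GenerationPacer()
if pacer.seconds_until_next() == 0.0:
    pacer.record()  # go ahead and send the next image prompt
```

The actual limit may differ by plan and change over time, so adjust `limit` and `window` to whatever you observe in practice.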

Practical Applications for Creators

So what can you actually use this for in your creator business? Here are some practical applications:

1. Social Media Content

Create branded graphics for various platforms that include actual text, with no more adding text in a separate app! This works especially well for:

Quote graphics from your podcast or videos
Tip lists and how-to content
Announcement graphics with dates and details
Stat visualizations with proper labeling

2. Educational Content

If you create educational content, you can now make:

Framework visualizations (like our PATH cheat sheet)
Process diagrams with labeled steps
Concept explainers with multiple text sections
Decision trees and flowcharts with proper labels

3. Podcast Assets

For podcasters, this opens up new possibilities:

Create podcast cover art with perfectly rendered show titles
Generate episode-specific graphics with guest names and topics
Design audiogram templates with text overlays
Create promotional graphics for special episodes or series

4. Marketing Materials

For your broader marketing efforts:

Design simple landing page graphics with headlines and CTAs
Create email header images with campaign messaging
Generate LinkedIn carousel graphics for multi-part stories
Design announcement graphics for product launches or events

5. Branded Content

While I wouldn't rely on this for your primary logo, you can create:

Secondary branding elements with text
Headers for different content sections on your website
Email newsletter templates with text areas
Simple product mockups with descriptive text

Conclusion

ChatGPT's new text rendering capabilities represent a genuine breakthrough for creators. After years of frustration with AI image generators that couldn't handle text, we finally have a tool that can produce professional-quality, text-heavy graphics without requiring design skills.

The key takeaways:

Provide the exact text you want to appear in the image
Use reference images for style and layout inspiration
Structure your prompts clearly but avoid overly specific typography instructions
Be prepared to refine through conversation
Check your final output carefully for any errors or overflow issues

Is this the end of professional design? Absolutely not. But it does democratize basic visual content creation in a way that wasn't possible before. For solo creators and small teams without design resources, this is a game-changer that will allow you to produce more visual content, more quickly, without sacrificing quality.

I encourage you to experiment with this yourself. Start with something simple, like a quote graphic or a framework visualization. Use my prompt template as a starting point, but don't be afraid to iterate and see what works for your specific needs.

Jacob is Head of Growth at Alitu, the podcast maker app that helps busy creators produce high-quality podcasts without the technical hassle. You can listen to more conversations about creator tools on The Creator Craft podcast.
