Friday, September 13, 2024

How to create images and visuals with generative AI

There’s one moment in the process of creating a blog post or news article that every small publisher dreads:

“What do I use for my featured image?”

Agencies and media companies have creative directors, photographers and artists at their beck and call to create this image for them. But what about the rest of us?

Some of us will head over to Google Images despite our better judgment. Others will go to a free site like Pexels or Unsplash. Some will go to sites like Adobe Stock, iStock or Shutterstock to pay for an image.

Hopefully, everyone reading this knows why it’s not a great idea to steal images off the web. Unless you’re using a public domain image, the images you download are owned by somebody.

If you plan on growing your business or brand, you don’t want your site filled with unlicensed images that may come back to haunt you one day.

As for stock photos, everyone who’s used a stock photo site has experienced the frustration of searching through page after page of search results and never finding the right one. So many stock photos are repetitive, generic or trite that they’ve literally become a joke.

And if you happen to find a decent stock photo, chances are it’s been used over and over again.

For example, this photo of a diverse group of co-workers on Pexels has been downloaded over 75,000 times and appears in Google Images on 175 sites. Which, ironically, is the opposite of “diversity.”

AI image generators

Remember I said big companies have creative directors, photographers and artists at their beck and call? With AI image generators, you can now have all these, too.

Right now, two types of sites are becoming widely used to generate images from text.

The first type is sites that focus only on images. The most popular is Midjourney. The next most popular are sites powered by the open-source Stable Diffusion model, such as Stability AI’s own DreamStudio.

Creatives and designers tend to favor these platforms because of their exclusive focus on AI art; they are at the cutting edge of image quality and allow many customization and fine-tuning options for artists.

For this article, I’m going to focus on AI chatbots, which are a bit more accessible to marketers and non-artists.

As of this writing, Anthropic’s Claude doesn’t support text-to-image and Google Gemini is too inconsistent for my tastes. (Most of the prompts I test there result in an error message or an image that doesn’t match what I asked for.)

On the other hand, OpenAI’s ChatGPT (with image generation powered by DALL-E) and X’s Grok (with image generation powered by FLUX.1) are getting jaw-droppingly good. 

As of this writing, ChatGPT Plus costs $20 a month. It includes DALL-E image generation and access to the ChatGPT chatbot.

ChatGPT is what I had in mind when I wrote my article back in April predicting that people would use Google less once they got used to using AI chatbots. Since then I’d say 80% of the searches I used to do on Google I now do on ChatGPT.

Grok comes as part of the Premium Tier of the social media platform X and costs $8 a month. For that price you get access to FLUX.1 image generation, as well as Grok’s chatbot and premium features on X.

As for which you should choose, I would suggest both.

Right now, I see ChatGPT still ahead of Grok as far as its usefulness as a chatbot, while Grok is arguably superior at generating art.

As you’ll see in a second, $28 a month is a pittance compared to the value you get from image generation alone, not to mention all the other ways AI chatbots can increase your productivity.

Generative AI as your personal creative director, photographer and artist

For those of you who have never used an AI chatbot to do text-to-image generation before, I’ll give a quick rundown of how it works.

Let’s say that you’re writing a blog post or an article on how to buy a mattress and you get to that point of having to choose a featured image.

Instead of hunting all over for an image, you just type this into your chatbot.

  • “Draw me a box mattress in a store.”

Here are the results I get:

ChatGPT

chatgpt-box-mattress-in-store

Grok

grok-mattress-store

You can see that Grok understood what I meant, while ChatGPT thought I was talking about a “mattress in a box.” Score one for Grok.

While it’s a nice photorealistic image, it’s really nothing that you can’t find on any stock photo site. And let’s face it – it’s just as boring, repetitive and unoriginal as most “stock photos of mattresses.” 

Let’s change that.

Getting a little more detailed in your prompt

Let’s say that in your article you referenced the story of The Princess and the Pea. And it dawned on you that a nice visual might be a princess sleeping on a stack of mattresses. 

Type this prompt into your chatbot:

  • “Generate an image of a princess sleeping on top of a stack of mattresses.”

Here’s what ChatGPT gave me:

chatgpt-princess-and-pea

And what Grok gave me:

grok-final-princess

You can start to see the difference in how ChatGPT and Grok approach “art.”

ChatGPT tends to favor illustrations, while Grok seems to favor photorealism. But of course, you can “ask” either to try to draw in whatever style you like.

I should say that I didn’t get these images right away from either AI. In fact, the first images I got from both didn’t match what I wanted at all. But I “talked” to the chatbot just as I would to a creative director.

Here was my “conversation” with Grok to get to this final image:

  • “Draw me a picture of a stack of mattresses with a princess sleeping on top.”
Grok princess 1
  • “Those don’t look like mattresses, they look more like blankets. Can you draw me the kind of box mattress you’d find in a store?”
Grok princess 2
  • “I need them stacked up with a princess sleeping on top.”
Grok princess 3
  • “More mattresses!”
Grok princess 4
  • “More mattresses!”
Grok princess 5
  • “No no, draw me at least 10 mattresses stacked on top of each other with a princess sleeping on top.”
Grok princess 6
  • “This is good, but make the mattresses all have different patterns.”
Grok princess 7

It took a while, but I finally got one I was happy with.

Notice that all I had to do was have a “conversation” with Grok, just like I would with a creative director. And unlike a real creative director, Grok didn’t want to throw me out a window after the seventh round of changes.

Now search on any stock photo site for “princess and the pea” or “stacked mattresses”; chances are you won’t find anything nearly as good as you see here.

The girl you see sleeping on top of the mattress? She doesn’t exist. No model release is needed because there is no real human in that photo. 

As you can imagine, this changes everything. Instead of spending thousands of dollars for a photo shoot or $200 for a stock photo subscription, I just spent $8 and about 2 minutes of my time. 

How in the world does AI generation work? 

Imagine that you wanted to learn to draw a picture of a golden retriever. The first step would be to learn basic art techniques, like drawing basic shapes, adding texture and detail and adding shading and depth.

You’ll need to study a lot of pictures of golden retrievers to understand their structure, form and movement. And you’ll need a lot of practice and iteration before your drawing starts to look like the real thing.

That’s essentially the same way that AI models work, except in the AI world this process goes by names like “Generative Adversarial Networks” and “Diffusion Models.”

The difference is that while you probably only have a few hours a week to learn and practice, AI models can “learn and practice” instantly and continuously.

Plus, they have access to billions upon billions of images to train them, including public domain images, Creative Commons images and image data licensed to them by stock photo companies.
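
To make the “diffusion” idea a bit more concrete, here is a toy Python sketch of the noising half of the process: a clean image is blended with more and more random noise, and a real model is trained to undo that blending step by step. Everything in it is an illustrative assumption on my part (the four-value “image,” the linear schedule, the function names). It is not how ChatGPT, Grok or any specific product is implemented, and the learning part is left out entirely.

```python
import math
import random


def add_noise(pixels, keep, rng):
    """Blend each pixel with Gaussian noise.

    `keep` is the fraction of the original signal that survives:
    1.0 returns the image untouched, 0.0 returns pure noise.
    """
    return [
        math.sqrt(keep) * p + math.sqrt(1.0 - keep) * rng.gauss(0.0, 1.0)
        for p in pixels
    ]


def noising_schedule(steps):
    """Hypothetical linear schedule: the signal fraction falls to 0 over `steps`."""
    return [1.0 - (t + 1) / steps for t in range(steps)]


rng = random.Random(0)            # fixed seed so runs are repeatable
image = [1.0, -1.0, 1.0, -1.0]    # stand-in for real pixel values

for keep in noising_schedule(4):
    noisy = add_noise(image, keep, rng)
    print(f"signal kept: {keep:.2f} -> {[round(v, 2) for v in noisy]}")
```

By the last step nothing of the original pattern is left. Generating an image is roughly this process run in reverse: the model starts from noise and removes a little of it at a time, guided by your prompt.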

Dig deeper: Visual optimization must-haves for AI-powered search

Getting ideas from AI

Let’s get back to that hypothetical blog post I was writing.

While images of a mattress in a store or even a cute picture of a princess sleeping on a stack of mattresses may get people’s attention, will it get them to click and scroll to read your article?

That is the whole point of the featured image.

In addition to generating an image for you, you can use AI to help you come up with ideas in the first place.

Let’s try this. Instead of telling the AI what to generate for us, let’s ask for advice.

ChatGPT creative brief

Again, I’m just “talking” to the AI as I would a human. In this case, ChatGPT gave pretty good advice.

But if you don’t like the advice you’re given, remember that you can engage in a dialogue with your AI providing details and clarification along the way, similar to what I did above. 

In this case, I asked ChatGPT to generate the image using its answer as my prompt. I did the same to Grok. Here’s what they came up with.

ChatGPT

ChatGPT brief

Grok

Grok brief

Now ask yourself, as a consumer, which image would you be more likely to click on: one of these two images or a stock photo of a mattress?

If you’re not sure, here’s something else you can do with AI. Come up with different hypotheses for images that would achieve your goals and A/B test them.

With generative AI, testing different images becomes as simple as testing copy to optimize your conversion rate.
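
If you want to check whether one image really out-clicks another, a minimal Python sketch of the comparison might look like this. It uses a standard two-proportion z-test on click-through rates. The function names and the click and view counts are made up for illustration, so swap in numbers from your own analytics.

```python
import math


def two_proportion_z(clicks_a, views_a, clicks_b, views_b):
    """z-score for the difference between two click-through rates (pooled)."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    return (rate_b - rate_a) / std_err


def two_sided_p_value(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Made-up example: variant A is a stock photo, variant B is an AI image.
z = two_proportion_z(clicks_a=40, views_a=2000, clicks_b=65, views_b=2000)
p = two_sided_p_value(z)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value (0.05 is a common cutoff) suggests the gap between the two images is unlikely to be chance. With only a few hundred views per image, expect the test to be inconclusive more often than not.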

I’ll preface this section by saying that this is my personal perspective and opinion and not legal advice. For any legal questions, please consult a lawyer, preferably one well-versed in copyright and intellectual property law. 

Three main categories of law arise most often in the use of images and photography on websites:

  • Copyright law.
  • Privacy / right of publicity law.
  • Trademark law. 

Copyright law protects the creator of an original work. Many people erroneously believe that you need to register a copyright for it to be valid. 

The truth is you automatically own the copyright for anything you create, even if it’s just scribbling on the back of a napkin. 

For someone else to legally use anything you create, you need to give them permission. In the art and photography world, that’s usually done through a license. 

Every image you use on your website that you don’t own and that’s created by a human, other than public domain images, should have a license. 

Even when the photo is free to use, it’s covered by a license such as Creative Commons or a license from a free site like Pexels or Unsplash.

Here’s where things get interesting.

Because AI is not human, copyright laws (as of now) don’t apply to AI-generated work.

That means whatever original work you create using AI, you can use without fear of getting sued for copyright infringement. That also means that anyone can come to your site and steal your AI-generated content.

As AI-generated content becomes more ubiquitous, expect laws to be passed quickly to address these kinds of issues.

Trademark law

Even if there are no issues with copyright for original work that AI produces, AI “artists” are still subject to the same laws and rules that human artists need to follow.

For example, what’s wrong with this image?

I went a little over the top in generating this to make a point about some potential risks of generative AI. 

The Apple logo, the modern-day version of Mickey Mouse and the Empire State Building are all trademarked. Yet, Grok was able to generate this image for me with remarkable fidelity.

While most people understand that logos and cartoon characters are trademarked, many don’t realize that building and product design may also fall under trademark protection.

In the stock photo world, major stock companies like Adobe Stock and Shutterstock review every photo in their libraries and mark it “for editorial use only” if it contains a recognizable trademark. This is due to the indemnity that they provide as part of their license.

On the other hand, free sites like Pexels and Unsplash allow images such as this photo of Cinderella’s Castle in Walt Disney World. The castle is very much trademarked, yet the photo has been downloaded over 23,000 times and viewed over 9 million times.

Pexels does make clear in their license that commercial use is prohibited, but a simple reverse image search reveals that hundreds of websites don’t adhere to these terms. While Disney’s lawyers could sue all of them, they choose not to – at least for now.

A parallel situation is emerging in the world of generative AI. Google and OpenAI go out of their way to prevent users from generating images that contain trademarks. 

Grok and Stable Diffusion allow it, putting the responsibility for compliance entirely on the user. It’s all but certain this will be deliberated in the courts. Stay tuned.

Incidentally, I’m able to use this image here because the use is protected under a principle called nominative fair use.

Put simply, this article is providing reporting, commentary and education. The use of images in this article is considered editorial use, solely for the purpose of providing context, and I made sure my use of the trademarks does not suggest endorsement or affiliation by Apple, Disney or the owners of the Empire State Building.

Privacy and right of publicity laws

Just as AI can generate photorealistic images of products, buildings and characters, it can also generate very realistic images of people.

AI-generated images of humans can also be problematic, mainly if the image resembles a real person (whether intentionally or not).

Many jurisdictions already have laws regulating “deepfakes,” but as with copyright and trademark law, expect privacy and right of publicity laws to evolve as use of generative AI grows.

Are stock photo companies dead?

Not necessarily. Here’s why.

Most people think that when you pay for a photo or an illustration on a stock photo site, you’re paying for an image you download. 

That’s not the case. What you’re actually paying for is a license to use that image legally. And with most of the larger stock photo companies, you’re also buying protection.

As I mentioned above, with free sites you need to figure out whether your use of a photo you downloaded from their site is legal or not. If someone decides to sue you, you’re on your own. 

On the other hand, larger stock photo companies usually provide indemnification as part of your license, meaning that if you get sued for using images you purchase from them, they’ll cover at least some of your legal expenses.

The “Big Three” stock photo providers, Getty Images, Shutterstock and Adobe Stock, are all providing generative AI capabilities, and all of them do appear to extend indemnification for the use of those images (see the terms from Getty, Shutterstock and Adobe). 

As long as generative AI companies pass on the responsibility of compliance to their users, stock photo sites aren’t going away. But they will evolve. 

If nothing else, you can think of purchasing from a stock agency as buying an insurance policy, especially as laws concerning AI images continue to evolve. Most risk-averse big companies will likely continue to go through the stock photo companies.

For the rest of us, as long as we take the time to educate ourselves about intellectual property law and use common sense in our use of generative AI, we’ll probably be fine.

Is photography dead? 

Finally, I want to address the photographers and artists in the room. 

The invention of photography put many artists out of work. The invention of digital photography disrupted traditional photography. 

The advent of stock images meant your photos sold for a few dollars instead of a few hundred. The advent of free stock libraries meant your photos sold for a few pennies instead of a few dollars.

This is just another evolution.

Yes, some photographers and artists may find themselves displaced. But believe it or not, the future is brighter than ever for anyone with skill and talent who can adapt to this new world.

How?

  • Double down on being human: Focus on your unique ability to tell stories, not just create images. Move beyond simply taking photos or drawing pictures.
  • Embrace AI as a tool: See AI like Midjourney or Stable Diffusion as allies that can elevate your creativity, rather than threats.
  • Develop a unique style: Differentiate yourself from AI by creating art that’s deeply personal and custom. Generic content won’t stand out against AI-generated work.
  • Be a storyteller: Don’t let your art “speak for itself.” Explain the process and purpose behind your work, especially when showcasing it on platforms like Behance or Dribbble.
  • Stay one step ahead of AI: As you experiment with AI you’ll quickly find that it is still horrible at most detailed, long-tail queries. In some cases this is simply because it hasn’t had enough training data yet; in others no amount of training will help. Focus on details, specific concepts and brand work that AI struggles with. Anticipate trends and adapt quickly.
  • Quantify the value of your work: You can bet that marketers will rush to embrace generative AI as a “solution” while never understanding the problem they’re supposed to be solving. Artists need to be able to articulate the deeper value and impact of their work, reminding marketers that art isn’t just about aesthetics.
  • Innovate continually: While AI might replicate your innovations eventually, your advantage is in constant creativity and pushing boundaries.

Generative AI is here to stay

Some will resist it, and others will become too heavily reliant on it. Both will be left behind.

On the other hand, those who embrace it as a tool but don’t lose their humanity in the process will succeed and thrive in this new world.

Dig deeper: Advanced image SEO: A secret manual



from Search Engine Land https://ift.tt/cUlzor0
via IFTTT
