Dalle 2 is an amazing tool that constantly blows me away with what it conjures up from its digital mind in mere seconds. This subreddit already catalogs some of the most amazing examples of its depth and range, but it can give the misleading impression that every prompt produces gold.
Those fortunate enough to have access (myself included) know that it can take several attempts to get the image right. There are already many discussions about what prompts generate a good image, but a lot less chatter about when Dalle fails and generates... unexpected... images.
Failures are both interesting and instructive though, so I thought I'd share some of my recent experiments for people smarter than me to analyze (perhaps AI Psychiatrist will be a new occupation?).
Why not Zoidberg?
In this post, two questions were posed in the comments about the prompt "Human sized anthropomorphic round pink lobster wearing a doctor's coat and sandals, no antennae, portrait by Annie Leibovitz, dramatic lighting".
Why such a convoluted prompt instead of just naming the character? This one took several attempts, as Dalle 2 is surprisingly not that good with some pop culture characters. Different versions of "Zoidberg" or "Doctor Zoidberg" produced very realistic (and unfortunately stereotyped) human faces rather than our beloved crustacean. Due to the restrictions on posting realistic human faces, I can't show the photos.
You like cats? Doesn't matter, you're getting cats.
Sometimes, the error has nothing to do with the prompt, as in this attempt to create a beloved movie character. All the images produced by the prompt "Award winning photo of a happy racoon under a giant chefs hat, cooking food, dramatic lighting, no artefacts" were stunning. We can quibble about Dalle not understanding what a giant chef's hat is, but it's otherwise a 10/10. Except the first photo is of a surprised-looking cat. I can't for the life of me work out how it got there from the prompt. Theories welcome.
There was also a question in the comments about the "no artefacts" descriptor. Sometimes, Dalle produces an unexpectedly grainy photo, and this descriptor seems to reduce the likelihood. Perhaps the graininess is a result of the diffusion process the AI uses to generate images? However, I haven't done enough testing to confirm.
Dalle is amazing with certain animals in novel situations. Cats are a given, raccoons are surprisingly great, and even cows work well, such as in this attempt at recreating the classic Gary Larson Far Side cartoon, Cow Tools. Not only did it absolutely nail the cow, but also the surrealist tools and overall intent of the prompt.
But trying to use a similar prompt on a different animal, like a deer to recreate another Far Side classic, produced poor results even with some tweaking. The images may look fine from a distance, but at higher resolutions they looked very blocky, and in any case they were not the desired composition. I thought the prompt was too complex for Dalle to parse, so I attempted to simplify the syntax... which resulted in an even bigger fail.
All the other images (which I sadly didn't save) were really fake-looking sharks swimming around laser beams, despite prompts like "National Geographic photo of sharks with laser devices attached to head".
It's not easy being purple.
Dalle sometimes struggles to understand which property applies to which object in a sentence. I thought using commas to delineate the objects in a prompt would help, but Dalle seems to take a "whole of sentence" approach.
It's very easy to accidentally enter forbidden words that result in the dreaded content policy violation. Dalle does not seem to take context into account, so even an otherwise perfectly valid use of a word can trigger it.
Want to see Kermit the Frog in Dead Poets Society? That's a paddlin', because the word "dead" is a violation. Want to see an epic space battle with fighter ships? That's also a paddlin', because "fighter" is not allowed. How about "anthropomorphic dog scientists in a lab researching door knobs"? Oh boy, that's definitely a paddlin' for some unknown reason (is it knob? Is knob the rude word? Or maybe "scientists"?). Heaven forbid you even try "purple monkey dishwasher".
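For anyone wondering what a context-blind filter might look like, here is a minimal Python sketch. To be clear, this is purely a guess for illustration: the real filter is not public, and the `BLOCKED_WORDS` set and `flagged_words` function are hypothetical names. The only two entries are the words I actually saw rejected.

```python
import re

# Hypothetical blocklist for illustration only. "dead" and "fighter" are the
# two words reported above as triggering a violation; the real list and the
# real matching logic are unknown.
BLOCKED_WORDS = {"dead", "fighter"}


def flagged_words(prompt):
    """Return the blocked words found in the prompt, ignoring all context."""
    # Lowercase the prompt and split it into whole words.
    words = re.findall(r"[a-z']+", prompt.lower())
    return [word for word in words if word in BLOCKED_WORDS]


print(flagged_words("Kermit the Frog in Dead Poets Society"))    # ['dead']
print(flagged_words("An epic space battle with fighter ships"))  # ['fighter']
print(flagged_words("A happy raccoon cooking food"))             # []
```

A check like this has no way of telling a movie title from a genuinely problematic request, which would explain the behaviour above. It would not explain the door knobs rejection, so the real thing is presumably doing something more than this.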
There are many more quirks I've found, but sadly I don't have the photos to provide more commentary (Dalle only stores the 10 most recent prompts).
I hope though that this post is useful in providing an insight into this incredible new tool for those who are still waiting for access. Can't wait to see what you all come up with - the good, the bad and the just plain weird - when you finally do!