YouTube pushed a review of Caira to my feed and, well, I have thoughts.
Alright, if you watched the whole review, skip past this to the next header. If not, here’s the TL;DR: Caira is a camera sensor and lens mount attached to a phone that feeds the images it captures through an AI prompt engine. From what the reviewer demonstrated, it is a well-trained model that can produce incredibly realistic results, such as changing the color of things, altering backgrounds, completely swapping out entire parts of the image, duplicating what’s in the scene, etc. Its capabilities are genuinely impressive.
Before I go any further, I want to be clear that I’m not particularly against AI as a tool. In fact, she demonstrates exactly how I utilize AI in my own photography (removing distractions, fixing blemishes, wrinkles in backdrops/clothes/products, etc.). Again, I’m not against AI as a tool.
But is this still “photography”?
In my opinion, no. Caira can take photos but it stops being photography when the user changes/replaces the subject, lighting, and environment of the original image captured.
Here’s why: Photography is (according to Wikipedia) “the art, application, and practice of creating images by recording light, either electronically by means of an image sensor, or chemically by means of a light-sensitive material such as photographic film.” It is about capturing a moment from life and in most cases, romanticizing/embellishing/accenting through stylistic edits of colors, cropping for composition, and removal of distracting elements to tell a story or illustrate a vibe.
Prompting a system to fully replace subjects, backgrounds, lighting, and potentially more puts the final product squarely in the camp of a collage. I’d even make the argument that the end result should be specifically called a “prompt collage” because the user isn’t even responsible for finding the assets that were generated by the AI.
So what are you on about?
Humans created art to tell our stories; stories that are born from unique, subjective experiences, struggles, practice, and especially flaws. AI simply can’t do that. It is a separate viewpoint, so using it is the equivalent of telling a skilled intern to voice your opinion and paint your feelings based on styles/words created by countless uncredited people.
Caira, along with Sora and any other generative AI, is diluting what were meant to be unique representations of how other humans (in this case, photographers) view the world. “Summarizing” our feelings with a prompt and changing photos to tell stories that quite literally didn’t or won’t happen will, in my humble opinion, ultimately paint a false world built on false realities that is, or soon will be, spiraling out of control.
Doom and gloom aside, this is neat tech. It’s wildly impressive what it can do with a simple prompt and snapshots, but it is actively contributing to an internet that’s rapidly becoming literally unbelievable, which will ultimately have some pretty messed up consequences.
