Taking your creative content back from AI
For years, the internet operated under a tacit agreement: creators made content, search engines indexed it, and in return, creators got traffic, derivative revenue (clicks turned into purchases), and recognition.
It wasn’t perfect, but that’s how it worked.
If you published good work, people could find it. The web was messy, chaotic, and largely open—but the exchange was clear.
With the release of ChatGPT in 2022 and the start of the generative AI era, that agreement broke.
The companies behind large AI models (ChatGPT, Gemini, Claude, Grok, and others) scrape the internet’s content to train those models, not to refer traffic or give credit. Creators, whose hard work is what’s at stake here, were cut out of the direct loop entirely, except as training material for these models.
This opened a whole new era of copyright gray areas and infringement. (See The New York Times’ lawsuit against OpenAI over the use of the Times’ copyrighted content as training material for ChatGPT. OpenAI denies the infringement and has cited user privacy in its response.)
This usurping of creative content to train frontier models without fair compensation or attribution sucks for everyone except the AI companies. The knock-on effect is that arts and culture organizations, and the artists, writers, and creators who fuel the creative community, have little choice but to limit what content they put on the open web.
When the incentives for creating original content evaporate, so does the cultural relevance of the internet itself.
An early internet architecture decision lets website owners place a small file (called robots.txt) on their sites to tell search engines and the “bots” that crawl content what is OK to index and what is not. You can think of the robots.txt file as a code-level “get off my lawn” sign for content.
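For example, a site that wants to keep AI training crawlers out while still welcoming ordinary search engines can publish rules like the sketch below. The user-agent tokens shown are the publicly documented crawler names for OpenAI (GPTBot), Anthropic (ClaudeBot), Google’s AI training (Google-Extended), and Common Crawl (CCBot); the list changes over time, so treat this as an illustration rather than a complete roster.

    # Ask AI training crawlers to stay out
    User-agent: GPTBot
    Disallow: /

    User-agent: ClaudeBot
    Disallow: /

    User-agent: Google-Extended
    Disallow: /

    User-agent: CCBot
    Disallow: /

    # Everyone else (ordinary search indexing, for example) remains welcome
    User-agent: *
    Allow: /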
The problem is that adhering to robots.txt is voluntary. Google and the internet community have iterated on the convention through the years, but it remained an honor system that crawlers were expected to respect. Frontier AI companies completely ignored it. They stepped right over that “no trespassing” sign and crawled the content anyway, relying on AI’s mash-up effect to obscure verbatim copyright infringement. By doing this, they harvested the world’s information largely for free and built super apps worth billions, even trillions, of dollars.
I’m an AI fan, but this is crap. That’s not innovation. That’s extraction.
Time to shift back toward creators
On July 1, 2025, Cloudflare, a company that handles traffic for about 1 in 5 websites globally, flipped the default.
With a new dashboard interface, releasing first in beta to a group of publishers and then in general circulation to Cloudflare subscribers, AI crawlers are now blocked unless website operators explicitly opt in to allowing them.
In short, if this thing lives up to the hype… content creators can now choose which AI engines get to use their content.
For the first time, the balance tilts slightly back toward the creator. Or that’s the notion.
This isn’t a silver bullet, but it’s a substantial step toward checking the AI piracy of original content. And, I think, a meaningful shift in the architecture of the web.
Cloudflare is calling it “Content Independence Day.” Good for marketing, but the real signal here is that companies like Cloudflare are starting to address the problem at scale.
Original content can’t just be extracted forever without fair compensation. And even more important: creators should have a SAY in whether their work is used at all.
Cloudflare also announced that it is piloting a new marketplace where content owners can set terms (pricing, permissions, usage) before AI companies ingest their work. The goal? A move from mere exposure to actual compensation for the creators whose work AI benefits from.
Why this matters for Arts & Culture
Arts and culture organizations are obviously at the nexus of creativity and the public sphere. Rightly, much of a cultural organization’s focus is on producing amazing LIVE experiences for visitors and audiences. But many of you have embraced digital audiences as well. You should have control over the creative content you publish and know when and how it’s being used to fuel AI.
Even more directly, the artists, musicians, writers, and thinkers who fuel our sector are perhaps the MOST impacted by this.
This shift potentially opens a door and points to a fairer future for content creators.
It suggests we can design a web that values originality again.
It hints at future models where AI companies license access instead of taking it.
It gives cultural institutions a reason to start asking harder questions about how their content is used—and how it should be valued.
So.
If your organization uses Cloudflare today, talk with your IT staff about this and start thinking about how you might decide which AI companies you wish to allow in. If you do not use Cloudflare, watch this space anyway; other providers will follow suit.
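In the meantime, a quick way to see what your site currently tells crawlers is to look at its robots.txt. Here is a minimal sketch using only Python’s standard library; it fetches the file and reports whether the well-known AI crawler names appear in it. The domain is a placeholder (swap in your own), and note that this only checks for mentions, it does not parse the actual rules.

    import urllib.request

    # Publicly documented AI crawler tokens (illustrative, not exhaustive)
    AI_BOTS = ["GPTBot", "ClaudeBot", "Google-Extended", "CCBot"]

    def check_robots(domain):
        """Fetch https://<domain>/robots.txt and report which AI bots it mentions."""
        url = "https://" + domain + "/robots.txt"
        with urllib.request.urlopen(url) as resp:
            text = resp.read().decode("utf-8", errors="replace")
        for bot in AI_BOTS:
            print(bot + ":", "mentioned" if bot in text else "not mentioned")

    check_robots("example.org")  # placeholder domain; use your own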
I’d love to hear what you encounter.
kristin@matters.work