Big Ideas in Publishing: Jonathan Woahn on AI Licensing Infrastructure
Why publishers need dedicated infrastructure to participate in the AI economy on their terms.
This is the second post in Big Ideas in Publishing, exploring concepts that could reshape research communication.
In July, I spoke with Tim Vines about AI subscriptions and how publishers might create AI-optimized content as a new revenue stream. This time, I’m interviewing Jonathan Woahn, whose startup, Cashmere, has built infrastructure to give publishers control over how their content works with AI systems. As Jonathan puts it, they provide tools to manage entitlement, security, access, and reporting, along with turn-key Retrieval-Augmented Generation infrastructure from ingestion to Model Context Protocol.
What led you to start Cashmere?
Back in 2022, I was building an AI tool for corporate training based on leadership books. It was fun until it wasn’t. The AI couldn’t stay consistent. Even a program built around The 7 Habits of Highly Effective People would come back with four habits one time and ten the next.
That’s when I realised there was no proper infrastructure for publishers to control how their content gets used in AI systems. This broke down into two core issues. First, long-form content doesn’t work great with AI—there’s too much context. There needed to be a better way to help the AI only get access to the content it needs, when it needs it. That led me down the path to inventing Omnipub, which is the foundational way we model data at Cashmere and how we present it to AI.
The second issue was from the publisher’s perspective. They didn’t have any tools to control their data with AI. Licensing was long and complicated, the terms were not standardized, and the use cases varied. For example, using content to train an AI model is very different from having that AI quote or reference the content in its responses to users - but publishers lacked clear frameworks to handle these different scenarios. So we realized premium content publishers needed rails for their content with AI. Without them, they were going to lose their position.
The obvious question is who should build this infrastructure. Why couldn’t big tech or publishers build this themselves?
If big tech builds it, why would publishers trust it? They’ve already used premium content at scale without permission. Amazon eroded many publishers’ control over distribution and strong-armed pricing across print, audio and digital. AI is a new distribution channel. If big tech builds it, it feels like yet another moment when publishers lose control. This is publishers’ moment to gain it back by controlling how their content is used with AI. So if not big tech, then the publishers?
Individual publishers may build their own solutions, and if they have the resources, it could work. But it’s going to be a full-time effort, and not a cheap one. Building with AI comes at a price and requires skillsets that are more in demand now than ever. So now publishers are competing with big tech for talent. Let’s just say publishers aren’t known for building bleeding-edge technology.
How does Cashmere solve this problem?
We’re the infrastructure layer between publishers and AI companies. Publishers set the rules - what content can be used, how, and at what price. AI companies integrate once and get legitimate access to all that licensed content. We handle the authentication, tracking, and reporting automatically.
We don’t necessarily handle the financial settlement part - we can, but we also allow our partners to handle that on their own. The key is that publishers keep control while AI companies get the content they need without negotiating hundreds of separate deals. In short, we make premium content AI-ready and make sure everyone in the chain gets credit and compensation.
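To make the idea of publisher-set rules concrete, here is a minimal Python sketch. It is an illustration only, not Cashmere's data model or API: the LicenseRule fields, the collection names, and the prices are all hypothetical. It assumes a rule names a content collection, a permitted use (training or inference), and a price, and that any use without a matching rule is denied.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class LicenseRule:
    """A hypothetical publisher rule: what content, which use, at what price."""
    collection: str             # which content the rule covers (assumed naming)
    use: str                    # "training" or "inference"
    price_per_1k_tokens: float  # assumed pricing unit


def find_rule(rules: Iterable[LicenseRule], collection: str, use: str) -> Optional[LicenseRule]:
    """Return the rule permitting this use, or None (deny by default)."""
    for rule in rules:
        if rule.collection == collection and rule.use == use:
            return rule
    return None


# Example: a publisher allows inference use of its journals but sets no training rule.
rules = [LicenseRule("journals", "inference", 0.50)]
allowed = find_rule(rules, "journals", "inference")  # matches the single rule
denied = find_rule(rules, "journals", "training")    # None, so the use is refused
```

The deny-by-default choice reflects the point made above: the publisher, not the AI company, decides which uses are permitted.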
What’s making this urgent for publishers?
AI is replacing the need for users to access publisher websites or purchase content. If the AI has been trained on it, then the users have access to it for free. Publishers are already feeling it. Some academic houses we’ve spoken with report double-digit textbook declines as students shift to AI-based learning. This is the brief window before legal precedents are set around AI content licensing. Publishers can either help shape how this works or watch it happen to them.
Is there anything publishers can do?
Napster allowed users to get access to basically any music they wanted for free. The music industry had to evolve. I call it a Great Reallocation. The demand didn’t disappear, it shifted - from full albums to tracks, from media stores to iTunes, then streaming. Once music moved into legitimate channels, the industry recovered. Today it’s bigger than ever, driven by streaming.
This same thing will happen for content. The market is already shifting from Google-based discovery to access via chatbot conversations. But we need more granular licensing mechanisms. The legal environment is crystallising - using purchased content to train AI models might be fine, but when AI quotes or references that content in live responses, that will likely require specific licensing. Publishers need control over how their content gets used in each case - and that’s exactly the infrastructure Cashmere provides.
You’re already working with Wiley and Perplexity. How does that work in practice?
The partnerships prove this infrastructure works. With Wiley, we support two models. The first is Bring Your Own License: if your university (or anyone who has a license) already pays for Wiley content, that access travels with you into Perplexity, like adding Spotify on your iPhone—you already have the Spotify account, and now you’re provisioning new channels to access the content. The second is Metered Token Consumption. Every time Perplexity uses Wiley content to answer a query, the usage is metered in tokens, and Wiley is paid proportionally.
Both models show that legitimate participation is possible and profitable.
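The two models can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of how Cashmere, Wiley, or Perplexity implement them: the UsageEvent shape, the function names, and the flat rate per 1,000 tokens are all hypothetical, and "paid proportionally" is read here as payment that scales linearly with metered tokens.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class UsageEvent:
    """One use of licensed content to answer a query (hypothetical shape)."""
    publisher: str
    tokens: int


def has_byol_access(user_licenses: Set[str], publisher: str) -> bool:
    """Bring Your Own License: access follows a license the user already holds."""
    return publisher in user_licenses


def metered_payouts(events: Iterable[UsageEvent], rate_per_1k_tokens: float) -> Dict[str, float]:
    """Metered Token Consumption: total tokens per publisher, priced at an assumed flat rate."""
    totals: Dict[str, int] = defaultdict(int)
    for event in events:
        totals[event.publisher] += event.tokens
    return {pub: tokens / 1000 * rate_per_1k_tokens for pub, tokens in totals.items()}


# Example with made-up numbers: two publishers, three metered events.
events = [
    UsageEvent("PublisherA", 1500),
    UsageEvent("PublisherB", 1000),
    UsageEvent("PublisherA", 500),
]
payouts = metered_payouts(events, rate_per_1k_tokens=2.0)
```

In this sketch a publisher whose content supplies twice as many tokens receives twice the payment, which is the simplest reading of proportional compensation.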
If you were advising a publisher today, what would you tell them?
Start with strategy, not technology. Decide where you want to play (training, inference, or both) and how much visibility you need. Then make sure your content can move with full control through those channels. The key mindset shift is to see AI as a new distribution channel, not a disruption.
Cashmere’s partnerships prove these licensing models work. The Wiley deal shows publishers can extend existing subscriptions into AI systems while keeping control over how their content gets used. I asked Jonathan where he sees this heading.
What will an AI-native publisher look like five years from now?
I think in 5 years, AI agents are going to act heavily on our behalf. They’re going to know how to find the content we want and need. They’re going to be our tutors. They’re going to be able to dig into any topic at the level of depth of our understanding and take us to the very edge of human knowledge.
And in all of this, the agents are going to respect creators’ control of their content. They’re going to automatically transact on our behalf in the background. As a user, you’ll have a license for the content you care most about (e.g., a Times subscription, academic journals, specialized industry textbooks, trade novels) that you either purchase directly as a patron or your agent discovers and procures on your behalf (with authorization).
Your agent will use that data to tailor the content to your needs. It will provide sources for everything, so you can know it’s legitimate. You may choose to click on sources because you’re curious, or perhaps you won’t. For the publisher or creator, it won’t matter. They get the attribution, their content has been legitimately procured, and the AI is acting fully within the bounds the consumer and publisher have authorized.
The biggest challenge to solve in all of this will be discovery. That will be one of the beauties of having an autonomous agent working on your behalf with the context of your interests and activities: it will be able to continually monitor the market, tirelessly scouring for information that is relevant to you, and will surface it when it knows you want or need it.
You buy a print book, and it automatically unlocks AI access. Libraries let you explore their collections conversationally. Publishers know exactly where their content travels, readers own their licences, and AI platforms become just another interface.
To see how Cashmere is building this infrastructure, visit cashmere.io.
