PubTech Radar Scan: Issue 25 (NEC Annual Publishing Conference 2024)
Today's newsletter summarizes my thoughts and reflections from the excellent NEC Publishing Conference, ‘AI and the Future of Publishing: Predictions and Possibilities’ (London, 5 November 2024).
🔴 The meeting began with a keynote from Ann Michael on how AIP is creating a ‘culture of curiosity’. A good talk to watch if you’re interested in culture change. New to me was https://mathison.ai/
🔴 Sven Fund’s talk, ‘Peer Review and Research Integrity’, had the best soundbites of the day.
The talk’s subtitle was ‘Harnessing the Full Potential of Digitization’, and Sven presented a vision of a more standardized and technology-enabled future for peer review. It reminded me of a talk about streamlining workflows given by Todd Toler many years ago, which used a robotic car production line as an analogy. I think this vision is still some way off. Whilst most industries have automated or removed people from their processes, publishing has tended to outsource rather than automate. Outsourcing has meant that many publishers haven’t done the hard work of standardization needed to enable automation. Even processes that have been largely automated, like content conversion, still need a fair amount of human input to tidy things up. Many ‘impossible to automatically process’ manuscripts are still entering the system. The few attempts I am aware of to standardize inputs (manuscripts), or to mandate the use of forms or specific editing tools for submission, have failed. (I would be interested to know if anyone is working on this now.) Standardized inputs would make the whole publishing process, including peer review, more efficient, but would probably be unpopular with many authors.
The Q&A session touched on what I think is probably the biggest challenge for publishers wishing to automate or use AI within peer review: how do you build the business case? Given that peer review is outsourced to researchers and largely unpaid, it’s difficult to build the case for additional investment other than in the area of research integrity. Rethinking workflows to make the best use of automation would be a very bold move.
🔴 In Amy Jones and Daniel Molesworth’s talk about ‘The Perfect AI Strategy’, I sensed the tension between technologists wanting to explore a new technology to see what it can do and the business’s need for employees and projects to deliver value. Daniel asked, “Are rubbish AI experiments a stepping stone that has value or an expensive distraction?”
🔴 ‘New Adventures in AI’ by David Smith ran through various experiments, such as SAFI (Secure AI for the IET), a secure LLM chatbot hosted by the IET, and problems encountered, such as ChatGPT mangling content from the Wiring Regulations (content that the LLMs shouldn’t have access to). The IET’s work on the Responsible Handover of AI Framework and Guidance was also touched on. I was surprised to hear that new IET journal launches were delayed because everyone was too busy dealing with research integrity issues.
🔴 ‘Language Choices and AI: a Reflection of Human Ethic’ by Paul Gee was the most thought-provoking talk of the day. Should we treat AI like human employees rather than tools? My visceral reaction was no, but my more thoughtful response is that we probably should. It would probably be better if Human Resources (HR) trained an AI in a company’s ethics, standards, and culture. Would the HR team do a better job than technologists or governance groups of working out how to deal with problems arising from the AI’s actions? More on this topic from Andrew Irving.
🔴 Christian Kohl’s talk ‘Building Digital Infrastructure for the Next Era of Open Science at PLOS’ did what it said on the tin. Key principles for PLOS’s new platform include: Scalability • Dogma • Innovate to differentiate • Security, privacy, and accessibility baked in • Open by default • Instrumented and automated • Avoid Conway’s Law • “Strangler Fig” approach • Loose Coupling • Emergent architecture
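For readers who haven’t met the “Strangler Fig” approach, here’s a minimal sketch of the idea: requests for migrated functionality are routed to the new platform while the legacy system keeps serving everything else, until it can be retired. The route names and services below are purely illustrative, not PLOS’s actual implementation.

```python
# Minimal sketch of the "Strangler Fig" pattern: route migrated paths to the
# new platform and leave everything else on the legacy system until it withers.
# All route and service names here are hypothetical.

MIGRATED = {"/submissions", "/reviews"}  # paths already moved to the new platform

def route(path: str) -> str:
    """Return the backend that should serve this request path."""
    prefix = "/" + path.lstrip("/").split("/", 1)[0]
    backend = "new-platform" if prefix in MIGRATED else "legacy-monolith"
    return f"{backend}{path}"

print(route("/submissions/123"))    # -> new-platform/submissions/123
print(route("/articles/pone.001"))  # -> legacy-monolith/articles/pone.001
```

The appeal of the pattern is that migration happens one route at a time, so there is never a risky big-bang cutover.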
🔴 I drifted a bit during the post-lunch panel discussion on ‘The Researcher Perspective on AI Data Governance’. I worry about the unintended consequences of the push for open data. A while back I went to a talk at the IET about big data, and one of the examples shown was how public datasets could be joined together to work out which of your neighbours was running a cannabis farm. As it becomes easier to combine and reuse data, I prefer the ‘as open as possible but as closed as necessary’ approach rather than simply ‘open’.
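To make the risk concrete, here’s a toy sketch with entirely made-up data: neither dataset is sensitive on its own, but a simple join on a shared key starts to single people out.

```python
import pandas as pd

# Entirely fabricated toy data: neither table is revealing by itself,
# but joining on a shared key (the address) combines the signals.
energy = pd.DataFrame({
    "address": ["1 High St", "2 High St", "3 High St"],
    "kwh_per_day": [8.5, 94.0, 10.2],  # one anomalously heavy user
})
planning = pd.DataFrame({
    "address": ["2 High St"],
    "application": ["outbuilding with additional ventilation"],
})

# An ordinary inner join is all it takes.
print(energy.merge(planning, on="address"))
```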
🔴 Like many in the room, I had a wry smile when Andrew Sales mentioned an anecdote about automated subject indexing being ‘just around the corner’ for the past 40 [?] years. Of course, plenty of content is automatically classified, especially for advertising and marketing purposes, but despite there being plenty of good technical solutions, we’re still not quite there. Content quality assurance is one area where AI is already being used extensively in publishing, and it’s one of the few areas where it’s fairly easy to create a compelling business case. There was agreement within the room that high-quality automated alt-text tagging for images really is just around the corner.
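As a reminder of why the problem has stayed ‘just around the corner’, here’s a deliberately naive rule-based indexer; the subjects and indicator terms are invented for illustration, and the point is how easily keyword matching trips over ambiguous terminology.

```python
# Deliberately naive rule-based subject indexing. Subjects and indicator
# terms are invented for illustration; real systems need far more than
# keyword matching to cope with synonymy, ambiguity, and new vocabulary.

SUBJECT_TERMS = {
    "machine learning": {"neural network", "classifier", "training data"},
    "power engineering": {"transformer", "substation", "grid"},
}

def index_subjects(abstract: str) -> list[str]:
    """Assign every subject whose indicator terms appear in the text."""
    text = abstract.lower()
    return sorted(s for s, terms in SUBJECT_TERMS.items()
                  if any(t in text for t in terms))

# "transformer" is ambiguous, so this ML abstract is mis-tagged as power engineering too.
print(index_subjects("We fine-tune a transformer classifier on new training data."))
# -> ['machine learning', 'power engineering']
```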
🔴 ‘The Hype vs. Reality: Understanding AI Adoption Today’ by Jing Hu was a good challenge. Did it take 2 years for 40% of the market to adopt GenAI? Or is it more accurate to say that only 10% of early adopters in the US have adopted GenAI?
The final talk of the day was by Fiona Romeo on ‘Knowledge is Human: How Wikimedians are engaging with AI’.
💭 Three closing thoughts:
Like many of the speakers, I think there is huge value in experimentation and learning, especially with GenAI. GenAI isn’t like tools that a publisher might buy for a specific purpose. You need to play around with it to understand what it can and can’t do and how you might be able to use it. Lovely to see Jay Neill writing about his experiences.
What impact are GenAI and AI-generated search summaries having on search engine traffic to journals? Traffic for some publishers seems to be falling. It’s too early to know the cause: is it an algorithm change, or are GenAI tools and Google’s summaries reducing traffic to publishers?
Is AI-assisted quality assurance of peer review, potentially, more interesting than AI peer review?
📚 See also:
How AI is being adopted in scholarly publishing by Michael Upshall for a much shorter set of reflections 😊