Hello,

As you’ve probably noticed, AI is still around, being pushed and integrated into almost everything. I find it interesting to talk with friends and colleagues about whether and how they use AI. There are a few instances where it seems genuinely useful, but more often than not it still seems like a waste of time, designed by people who don’t actually understand the work that people do.

I’ve written before about AI, here in 2023 and again in 2024. I’ve been thinking for a few months about writing on this topic again. Perhaps this will turn into an annual AI reflection, as the technology progresses and as society and publishing adapt.

Indexers are no exception when it comes to talking about AI. I think it has been on the agenda at just about every indexing conference in the last year or so, in Canada, the UK, and the US. Which is good. We need to be aware of what is happening and what our options and responses are. Elizabeth Bartmess, in the US, and Tanya Izzard, in the UK, have both been particularly active, and perhaps the most visible, in researching and testing AI’s indexing capabilities. My thanks to them for all of their work. By AI, Elizabeth and Tanya specifically mean large language models (LLMs) such as ChatGPT and Claude, which is what most people think of when they hear the word AI. For a quick reference of recent articles and resources, here’s a little round-up:
Bottom line: current LLMs do not write good indexes.

AI-Powered Indexing Software

Indexing software utilizing AI is also now on the market. I expected this would happen, and I have mixed feelings about it finally happening. There are currently two programs available that I am aware of.

IndexStudio appeared first, earlier this summer. It is created by Peter Tanham and claims to create a full draft while also providing tools to edit the resulting index. I tried its free trial, which did not allow me to use the editing tools, but if the sample index it allowed me to see is any indication, it does not do a thorough job, showing many of the same problems found elsewhere by Elizabeth and Tanya. A few other indexers have tested IndexStudio more thoroughly and have also found it to be significantly lacking.

I also don’t trust the claims that Peter Tanham makes about IndexStudio. The testimonials on the website are obviously fake, as a Google search fails to find any of the quoted indexers, authors, and scholars. The website also clearly sees Cindex as a direct competitor. It is hard not to get the impression that IndexStudio is intended to disrupt the indexing profession, as in replace professional indexers. Yet Peter left a comment on my blog claiming that IndexStudio is actually created for self-published authors and is not intended to replace professionals. I agree that hiring a professional indexer is often expensive for many authors, which is a problem. But offering a poor-quality alternative does not strike me as a good solution, and disingenuous marketing and messaging makes me suspicious.

AI-indexing, created by Ben Vagle, launched a couple of weeks ago. Ben also wrote an article explaining the program. This seems designed to be more of an assistant to professional indexers, rather than claiming to create an entire index by itself. The instructions explicitly state that the output will need further editing. The output can also be exported to Cindex, which is promising, as Cindex does make editing a lot easier. I have thought that if AI were ever used in indexing, it would work best in an assisting role, so this program might be a step in the right direction. That said, I have been too busy this month to try it out, so I can’t comment yet on whether AI-indexing lives up to its claims.

AI and the Nature of Indexing Work

Much of the discussion around AI and indexing has focused on whether or not LLMs can actually write an index. Which is a very important question. I have no interest in using a tool that fails to deliver what I need. What I have not seen so much, beyond a fear that AI may replace indexers, is a discussion of how AI may change the nature of indexing.

I write indexes in part because I enjoy the creative and intellectual challenge of analyzing a book and piecing together an index that will serve readers. To write an index is to solve problems, trying to find the perfect balance between the contents of the book, the needs of the audience, and the space available on the page. It is difficult work, and rewarding because it is difficult. Is AI going to take away that sense of challenge and creativity?

A common fallacy about indexing I’ve noticed in non-indexers is the belief that writing an index is simply a matter of identifying and picking up terms. Before AI, this manifested as creating a list of keywords and using the search function to search PDF proofs for all relevant hits.
When I first learned to index, in-house, my supervising editor sometimes used this method, seeing it as a way to get started on the index, by creating the word list, before the proofs were ready. But an index is more than just a list of key terms. Yes, there is an element of searching for terms, and some books lend themselves to keyword searches more than others. But an index also requires paying attention to the implicit content, which is often also indexable, as well as to how the book is organized and structured. An indexer considers the audience and shapes the index accordingly. An indexer also shapes the index to fit the pages available, which can lead to drastically different indexes depending on how limited the space is. All of these decisions, beyond identifying keywords, are uniquely suited to humans. It is a question of priorities, trade-offs, and knowing how to manipulate the index in response to the particular context of each book and audience.

My concern with AI tools is 1) that the tool makers are operating under that fallacy, focusing on keyword search, and not making room or providing tools for these other aspects of indexing, and 2) that AI tools, by allowing the user to be more hands-off, will enable users, whether authors, publishers, or even indexers, to fall into that fallacy and to assume that whatever the AI produces is good enough. How can we keep indexing human-centered, both in the creation of indexes and in their usability?

My other fear with AI tools is that instead of being in control, problem-solving and making decisions, I will become an assistant to the AI. By letting the AI create the first draft, I am ceding that initial decision-making about what is relevant and how the index should be structured. Instead of my approach guiding the way, I’m editing what the AI wants to do. I wonder what is lost in that.

For the index, I fear a loss of value and usefulness. I don’t think an algorithm can properly discern what an audience needs or wants, or how to prioritize and juggle competing demands if space is tight. Not that I or any other human indexer will always get it right either, but I like to think that a human, coming at this from a human perspective, will usually do a better job. If I am simply editing what the AI decides, I am concerned that whatever errors are present, even subtle errors, will be reinforced rather than corrected.

For myself, I fear a loss of creativity and meaning. For better or worse, and as difficult as it can be, I thrive on problem-solving and bringing order to chaos. This is part of what makes me human. If I surrender these aspects of my work to AI, I don’t think I’d want to index anymore, to be honest. I’d find some other line of work that allows me to be creative and to problem-solve. Maybe it is egotistical of me to say that I need to be in control, but I feel like something of my humanity is lost if I give responsibility for the index to the AI and my role is simply to clean up what the AI produces.

Considering all of this, I’ve also been asking myself: is there a difference between using an AI tool and hiring a subcontractor? What is the difference, really, if in both cases I am delegating a portion of the index while retaining the final say? I have to admit I am struggling with the answer. I want to believe that a human subcontractor is inherently better, with their human perspective, which gives them the ability to look beyond a keyword search and the capacity to learn and incorporate feedback.
Trust is also an important factor, in that I can build trust with a human subcontractor as we work together and they learn what I am looking for, whereas I have very little trust in an AI’s ability to learn and deliver. Corrections to an index are often very specific, not index-wide, unlike the blanket changes I assume an algorithm would try to impose. Perhaps it depends on what I am asking the subcontractor or AI tool to do, as sometimes all I want is a simple list of all names or scripture references, for example, and other times I do want the subcontractor to take on a larger role in shaping the index.

At this point, the question of AI tools versus subcontractors is still theoretical. I don’t trust AI tools enough, and I value what humans bring to the job. But if AI-powered indexing tools progress, this is a question that will need to be answered, both by indexers who use subcontractors and by all indexers deciding if and how to adjust the way they work.

This question of subcontractors also ties back to my concern about who is assisting whom when using AI. Subcontracting does involve giving up some control over the work, but I feel like the lines are still clearer with subcontractors. I can be clear about what I ask the subcontractor to do, about how I review the subcontractor’s work, and about what I retain for myself. With AI, the lines seem blurred. AI is so fast, and gives at least the illusion of power and accuracy, that it can be easy to trust the output, to the point where it is no longer really my index anymore. For an AI tool to be effective, and for me to feel good about signing off on the index, it clearly needs to be a tool under my control, with me understanding the text, the audience, and any other constraints, and making decisions accordingly.

Ethics and Legalities

The last consideration, which is also important to keep in mind, is the ethical and legal dimension of using AI tools for client work. At least one of my publisher clients has explicitly forbidden all AI tools while working on their books. I think that policy is subject to change, as tools evolve and as privacy issues and concerns may be resolved or allayed, but for the time being I think AI in publishing remains a fraught subject and I want to respect my clients’ policies. If I ever find an AI-powered tool that I want to use, I would need to discuss it with the client before using it on their books.

What do you think about AI and indexing? Or about AI in general? Have you found any instances in which an AI tool is actually useful? Feel free to respond and let me know. I am curious.

As I’ve tried to explain above, I have questions and concerns about AI, not only about whether AI tools can produce quality work, but also about how it may change the nature of indexing work and how it may affect what makes us human. The claims of AI are very different compared to software such as Cindex. Cindex is powerful in its own right, but it still requires me to manually create all of the index entries and decide how the entries are structured together. Cindex is clearly a tool under my control. AI has the potential to blur that distinction. What is the effect on me if my role transforms into feeding the machine? I am concerned about the unintended consequences, both for indexers and for the indexes we write, and wish I had more answers. Time will tell, I suppose.

Yours in indexing,

Stephen
2x award-winning book indexer and the author of Book Indexing: A Step-by-Step Guide. I teach you how to write excellent indexes, along with reflections on succeeding as a freelance indexer.