A major focus of enterprise work these days is automating human tasks for greater efficiency. Computer giant IBM asked in a recent study whether generative artificial intelligence (AI), such as large language models (LLMs), could be a stepping stone for automation.
Known as “SNAP,” IBM’s proposed software framework trains an LLM to predict the next step in a business process given all events that have occurred before. These predictions, in turn, can serve as suggestions for what action a business might take.
Also: Can ChatGPT predict the future? Training AI to figure out what happens next
"SNAP can improve next activity prediction performance for different BPM [business process management] datasets," write Elon Oved and colleagues at IBM Research in a new paper, "SNAP: Semantic Stories for Next Activity Prediction," published this week on the arXiv pre-print server.
IBM's work is just one example of the tendency to use LLMs to predict the next event or action in a series. Scholars continue to work with what are called time series data — data that measure the same variable over time to identify trends. The IBM work does not use time series data, but it focuses on the related concept of sequential events and their possible outcomes.
Also: AI is surpassing our best weather forecasting technology, thanks to DeepMind
SNAP is an acronym for “Semantic Stories for Next Activity Prediction”. Next-activity prediction (the NAP part of SNAP) is an existing, decades-old field of systems research. NAP typically uses older forms of AI to predict what will happen after all actions up to that point have been input, usually from business logs, a practice known as “process mining”.
The semantic story component of SNAP is the part that IBM adds to the framework. The idea is to use the richness of the language in programs like GPT-3 to go beyond the activities of traditional AI programs. Language models can capture the finer details of a business process and turn them into a coherent “story” in natural language.
Older AI programs can't handle all the data about business processes, Oved and team write: they "use only activity sequences as input to build a classification model," and rarely consider the additional numerical and categorical features available for prediction.
Also: Why Nvidia is teaching robots to turn pens and how generative AI is helping
An LLM, in contrast, can pick out many more details and mold them into a story. An example is a loan application, which involves several steps. The LLM can be fed various items from the database about the loan, such as "Amount = $20,000" and "Request Start Date = August 20, 2023".

These data items can be automatically converted by the LLM into a natural-language description, such as:
“The amount of the loan requested was $20,000, and it was requested by the customer. The action “Application Registration” occurred on the 6th, which occurred 12 days after the case was initiated. […]”
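The kind of attribute-to-sentence conversion described above can be sketched in a few lines. This is a hypothetical stand-in for what SNAP asks an LLM to generate — plain string formatting rather than a model call, with illustrative field names:

```python
# Hypothetical sketch: rendering one process-log event as a story
# sentence. Field names and wording are illustrative, not the paper's.

def event_to_sentence(event: dict) -> str:
    """Render one loan-application log event as a story sentence."""
    return (
        f"The amount of the loan requested was ${event['amount']:,}, "
        f'and the action "{event["activity"]}" occurred '
        f"{event['days_since_start']} days after the case was initiated."
    )

event = {
    "amount": 20_000,
    "activity": "Application Registration",
    "days_since_start": 12,
}
print(event_to_sentence(event))
```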
The SNAP system involves three steps. First, a template for a story is created. Then, that template is used to create a complete description. And finally, the stories are used to train the LLM to predict the next event that will occur in the story.
In the first step, properties — such as loan amounts — are given in the language model prompt, along with an example of how to turn them into a template, a scaffold for a story. The language model is then asked to do the same for a new set of attributes, and it outputs a new template.
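A minimal sketch of this first step, assuming the template is obtained by few-shot prompting: show the model one worked example (attribute list to story template with placeholders) and ask it to repeat the trick for a new attribute set. The prompt wording and attribute names here are guesses, not the paper's actual prompt:

```python
# Build a one-shot prompt asking an LLM for a story template.
# Example attributes and prompt phrasing are illustrative assumptions.

EXAMPLE_ATTRIBUTES = ["Amount", "Request Start Date"]
EXAMPLE_TEMPLATE = (
    "The amount of the loan requested was {Amount}, "
    "and it was requested on {Request Start Date}."
)

def build_template_prompt(new_attributes: list) -> str:
    """Assemble the prompt that would be sent to the language model."""
    return (
        "Turn a list of process attributes into a story template, "
        "using one {placeholder} per attribute.\n\n"
        f"Attributes: {', '.join(EXAMPLE_ATTRIBUTES)}\n"
        f"Template: {EXAMPLE_TEMPLATE}\n\n"
        f"Attributes: {', '.join(new_attributes)}\n"
        "Template:"
    )

print(build_template_prompt(["Incident Severity", "Assigned Team"]))
```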
In the second step, that new template is fed back into the language model, which fills it in to produce a finished story in natural language.
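Once a template exists, filling it in is mechanical. In this sketch, plain substitution stands in for the LLM call the paper uses; the placeholder names are illustrative:

```python
# Fill a story template's {Attribute Name} placeholders with logged
# values. A simple replace loop is used (str.format cannot handle
# field names containing spaces).

def fill_template(template: str, values: dict) -> str:
    """Replace each {Attribute Name} placeholder with its logged value."""
    for key, val in values.items():
        template = template.replace("{" + key + "}", str(val))
    return template

template = (
    "The amount of the loan requested was {Amount}, "
    "and it was requested on {Request Start Date}."
)
story = fill_template(template, {
    "Amount": "$20,000",
    "Request Start Date": "August 20, 2023",
})
print(story)
# → The amount of the loan requested was $20,000, and it was requested
#   on August 20, 2023.
```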
The final step is to fine-tune an LLM on many such stories to predict what happens next. The activity that actually followed each story serves as the "ground truth" training label.
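The training examples might be laid out as follows, assuming each story narrates events 1 through k of a process trace and the activity of event k+1 is the ground-truth label. The trace and story wording below are invented for illustration:

```python
# Pair each story prefix with the next activity that actually followed
# it, yielding (story, label) fine-tuning examples. Trace is invented.

def make_training_pairs(trace: list) -> list:
    """Build (story, next-activity) pairs from one process trace."""
    pairs = []
    for k in range(1, len(trace)):
        story = " Then ".join(f'"{a}" occurred.' for a in trace[:k])
        pairs.append((story, trace[k]))
    return pairs

trace = ["Application Registration", "Credit Check",
         "Offer Sent", "Offer Accepted"]
for story, next_activity in make_training_pairs(trace):
    print(next_activity)
```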
Also: Generative AI does not find its own errors. Do we need a better prompt?
In their study, Oved and team tested whether SNAP outperforms older AI programs at next-activity prediction. They used four publicly available data sets, including car-maker Volvo's actual database of IT incidents, a database of environmental permitting process records, and a collection of fictional human resources cases.
The authors use three different "language foundation models" (LFMs): OpenAI's GPT-3, Google's BERT, and Microsoft's DeBERTa. They say all three "provide superior results compared to established benchmarks".
Importantly, although GPT-3 is more robust than the other two models, its advantage in the tests is relatively modest. The authors conclude that "even relatively small open source LFMs such as BERT have solid SNAP results compared to larger models."
The authors also found that complete sentences in language models appear to be important for performance.
“Does Meaningful Story Structure Matter?” They ask before concluding: “Designing coherent and grammatically correct semantic stories from business process logs is a key step in the SNAP algorithm.”
Also: Five ways to use AI responsibly
They compare GPT-3 and other model stories with a different approach where they combine the same information into a single, long text string. They find the former approach, which uses complete, grammatical sentences, has much higher accuracy than a mere string of features.
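This ablation contrasts two serializations of the same event features: a flat string of features versus a grammatical sentence. Both variants are sketched below with illustrative field names:

```python
# Two ways to serialize the same event features. The paper found the
# sentence form yields much higher accuracy than the flat string.

features = {
    "Amount": "$20,000",
    "Activity": "Application Registration",
}

# Flat "key = value" concatenation, the weaker baseline representation.
flat = ", ".join(f"{k} = {v}" for k, v in features.items())

# Coherent, grammatical sentence, as in SNAP's semantic stories.
sentence = (
    f"The amount of the loan requested was {features['Amount']}, "
    f'and the action "{features["Activity"]}" occurred.'
)

print(flat)
print(sentence)
```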
The authors conclude that generative AI is useful for helping to mine all the data about processes that traditional AI cannot capture: "It is particularly useful where the categorical feature space is large, such as user utterances and other free-text features."
On the other hand, SNAP’s advantages are reduced when it uses data sets that do not contain much semantic information — in other words, written descriptions.
“A central finding of this work is that the performance of SNAP increases with the amount of semantic information in the dataset,” they write.
Importantly for the SNAP approach, the authors suggest that it is possible that data sets will be increasingly enhanced by new technologies, such as robotic process automation, “where user and system utterances often contain rich semantic information that can be used to improve the accuracy of predictions.”