
NYU’s LLMTime program finds the next likely event in a sequence of events, as represented by strings of numeric digits. New York University
Today’s generative artificial intelligence programs, tools like ChatGPT, are going to produce many more kinds of output than just text, as ZDNET has explored in some depth.
One of the most important of these “modalities,” as they are known, is time series data: data that measure the same variable at different points in time in order to identify trends. Data in a time series format can be important for tasks such as tracking a patient’s medical history over time through the entries a physician makes in a chart. What is called time series forecasting means taking historical data and predicting what happens next; for example: “Will this patient recover?”
Traditional approaches to time series data involve software designed specifically for that type of data. Now, however, generative AI is gaining the ability to handle time series data in the same way it handles essay questions, image generation, software coding, and the various other tasks at which ChatGPT and similar programs have excelled.
In new research published this month by Nate Gruver of New York University and colleagues at NYU and Carnegie Mellon, OpenAI’s GPT-3 program is used to predict the next event in a time series, much as it predicts the next word in a sentence.
“Because language models are designed to represent complex probability distributions over sequences, they are theoretically suitable for time series modeling,” Gruver and team write in their paper, “Large Language Models Are Zero-Shot Time Series Forecasters,” posted on the arXiv pre-print server. “Time series data generally take the same form as language modeling data, as a collection of sequences.”
The program they developed, LLMTime, is “extremely simple,” Gruver and team write, and “capable of surpassing or matching purpose-built time series methods” on a variety of problems in a zero-shot fashion, which means that LLMTime can be used without any fine-tuning on the downstream data used by other models.
The key to creating LLMTime was for Gruver and team to rethink “tokenization,” the way a large language model represents the data it is working on.
Programs like GPT-3 take in words and characters in a specific way, breaking them up into chunks that can be ingested one at a time. Time series data is represented as a sequence of numbers, such as “123”; a time series is the pattern in which such digit sequences occur.
Given this, GPT-3’s tokenization is problematic because it often splits those strings into awkward groupings. “For example, the number 42235630 is tokenized [422, 35, 630] by the GPT-3 tokenizer, and changing even a single number can result in a completely different tokenization,” relate Gruver and team.
To avoid these awkward groupings, Gruver and team developed code to insert white space around each digit of a digit sequence, so that each digit is encoded separately.
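The digit-spacing step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors’ code: the function names `space_digits` and `encode_series`, the fixed two-decimal precision, the dropped decimal point, and the comma separator between time steps are all assumptions made for the example.

```python
def space_digits(number_text):
    """Put a space between every character so each digit stands alone."""
    return " ".join(number_text)


def encode_series(values, decimals=2):
    """Turn a list of numbers into one string of space-separated digits.

    Assumptions for this sketch: values are rounded to a fixed number of
    decimals, the decimal point is dropped, and time steps are joined
    with " , ".
    """
    encoded = []
    for value in values:
        # Fix the precision, then remove the decimal point so only digits remain.
        digits = f"{abs(value):.{decimals}f}".replace(".", "")
        sign = "-" if value < 0 else ""
        encoded.append(sign + space_digits(digits))
    return " , ".join(encoded)


print(space_digits("42235630"))      # 4 2 2 3 5 6 3 0
print(encode_series([0.123, 1.23]))  # 0 1 2 , 1 2 3
```

With every digit separated, a tokenizer no longer has the chance to merge neighboring digits into groupings such as “422” or “630,” which is the point of the step.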
They then put GPT-3 to work predicting the next sequence of digits in real-world examples of time series.
Any time series is a sequence of things that happen one after the other, like, “The dog jumped off the sofa and ran to the door,” where one event is followed by another. One example of a real data set that people want to make predictions about is ATM withdrawals, forecast on the basis of historical withdrawals. A bank would be very interested in predicting such things.
Predicting ATM withdrawals is, in fact, one of the challenges in a real time series competition, the Artificial Neural Network & Computational Intelligence Forecasting Competition, run by Lancaster University in the UK. That set of data is simply strings and strings of numbers, in the form:
T1: 1996-03-18 00-00-00 : 13.4070294784581, 14.7250566893424, etc.
The first part is the date and time stamp for “T1,” which represents the first moment in time, and what follows are the amounts (written with periods, not commas, as the decimal separator, unlike in European notation). The challenge for a neural net is to predict, given thousands or even millions of such items, what will happen the moment after the last instance in the series: how much customers will withdraw tomorrow.
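To make the record format concrete, here is a hedged Python sketch that reads one such line. It assumes the layout is exactly what the sample shows (a label, a dash-separated time stamp, then comma-separated amounts, with colons between the three parts); the function name `parse_record` is made up for the illustration, and the real competition files may differ.

```python
from datetime import datetime


def parse_record(line):
    """Split one record into its label, start time, and list of amounts."""
    # The time stamp uses dashes, so the first two colons delimit the fields.
    label, stamp, amounts = line.split(":", 2)
    start = datetime.strptime(stamp.strip(), "%Y-%m-%d %H-%M-%S")
    values = [float(item) for item in amounts.split(",") if item.strip()]
    return label.strip(), start, values


sample = "T1: 1996-03-18 00-00-00 : 13.4070294784581, 14.7250566893424"
label, start, values = parse_record(sample)

# Forecasting setup: everything but the last amount is history,
# and the last amount is the value a model would be asked to predict.
history, target = values[:-1], values[-1]
print(label, start.date(), len(history), target)
```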
The authors report that LLMTime is not only able to produce plausible completions of real and synthetic time series, it also achieves higher likelihoods in zero-shot evaluation than dedicated time series models, approaches that have been decades in the making.
The LLMTime program finds where a number sits in a distribution, a distinct pattern of recurrence of numbers, to conclude whether a sequence represents one of the common patterns such as “exponential” or “Gaussian.” New York University
However, one limitation of large language models, Gruver and team point out, is that they can only take in so much data at a time, known as the “context window.” To handle larger and larger time series, programs will need to expand that context window to many more tokens. That is an effort being pursued by groups such as the Hyena team at Stanford University and Canada’s MILA institute for AI, as well as Microsoft, among others.
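One practical consequence of a fixed context window is that a long history has to be cut down before it is sent to the model. The Python sketch below shows one simple way to do that, keeping only the most recent observations. The function name `fit_to_context` and the cost model (one token per digit plus one separator token per value) are assumptions for illustration; real token counts depend on the tokenizer.

```python
def fit_to_context(values, max_tokens, digits_per_value=4):
    """Keep the most recent values that fit an assumed token budget."""
    # Assumed cost: one token per digit plus one separator token per value.
    cost_per_value = digits_per_value + 1
    keep = max_tokens // cost_per_value
    if keep <= 0:
        return []
    return values[-keep:]


history = list(range(10))
print(fit_to_context(history, max_tokens=15))  # [7, 8, 9]
```

Dropping the oldest points is only one option; it trades away long-range patterns in exchange for fitting the window.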
The obvious question is why a large language model should be good at predicting numbers. As the authors note, for any sequence of numbers such as ATM withdrawals, “there are arbitrarily many generation rules that correspond to the input.” Translation: There are so many reasons why those particular strings of numbers might appear that it is hard to guess what the underlying rule is.
The answer is that GPT-3 and its ilk find the rules that are the simplest among all possible rules. “LLMs can predict effectively because they prefer completions derived from simple rules, taking on a form of Occam’s razor,” write Gruver and team, referring to the principle of parsimony.
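The point about many rules fitting the same numbers is easy to see with a toy case. In the Python sketch below, two different rules both reproduce the observed sequence 1, 2, 4 yet disagree on what comes next. The two rules and their names are chosen purely for illustration and do not come from the paper, and the code says nothing about how GPT-3 actually picks a completion.

```python
def doubling(n):
    """Rule A: each term is twice the previous one (2 to the power n)."""
    return 2 ** n


def regions(n):
    """Rule B: n * (n + 1) / 2 + 1, a different rule with the same start."""
    return n * (n + 1) // 2 + 1


observed = [1, 2, 4]

for rule in (doubling, regions):
    # Check that the rule reproduces every observed value, then extrapolate.
    fits = [rule(n) for n in range(len(observed))] == observed
    print(rule.__name__, fits, rule(len(observed)))
# doubling True 8
# regions True 7
```

Both rules explain the data equally well, so the data alone cannot settle the next value; a preference for the simpler rule is what breaks the tie.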
Sometimes the GPT-4 program goes astray when it tries to figure out what a time series pattern is, showing that it doesn’t really “understand” time series in the traditional sense. New York University
This does not mean that GPT-3 really understands what is happening. In a second experiment, Gruver and team gave GPT-4 (the more powerful successor to GPT-3) a new data set that they had generated using a specific mathematical function. They asked GPT-4 to identify the mathematical function that generated the time series, in order to test, as Gruver and team write, whether GPT-4 “can explain the meaning of a given time series in text.”
They found that GPT-4 was able to guess the mathematical function better than random chance, but it produced some explanations that were off the mark. “The model sometimes makes incorrect inferences about the behavior of the data it sees or the expected behavior of the candidate functions.” In other words, even when a program such as GPT-4 does well at predicting the next thing in a time series, its explanations can end up being “hallucinations,” the tendency to give wrong answers.
Gruver and team are excited about how time series fits into a multi-modal future for generative AI. “Framing time series forecasting as natural language generation can be seen as another step towards integrating more capabilities into a single large and powerful model, where understanding can be shared across many tasks and modalities,” they write in their concluding section.
The code for LLMTime is posted on GitHub.