Are LLMs capable of non-verbal reasoning?
Large language models have found great success so far by using their transformer architecture to effectively predict the next words (i.e., language tokens) needed to respond to queries. When it comes to complex reasoning tasks that require abstract logic, though, some researchers have found that interpreting everything through this kind of "language space" can become a real limitation, even for modern "reasoning" models.
Now, researchers are trying to work around these problems by crafting models that can work out potential logical solutions completely in "latent space," the hidden computational layer just before the transformer generates language. While this approach doesn't cause a sea change in an LLM's reasoning capabilities, it does show distinct improvements in accuracy on certain types of logical problems and points to some interesting directions for new research.
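In rough code terms, the idea looks something like the sketch below. This is a minimal illustration assuming a generic Hugging Face-style causal language model; the model, prompt, and number of "thought" steps are placeholders, not the researchers' actual setup. The key move is that the transformer's final hidden state, a vector rather than a word, gets fed straight back in as the next input.

```python
# A rough sketch of reasoning in latent space, assuming a Hugging Face-style
# causal LM. The model name, prompt, and step count are illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for any transformer language model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "If Alice is older than Bob and Bob is older than Carol, who is oldest?"
inputs_embeds = model.get_input_embeddings()(
    tokenizer(prompt, return_tensors="pt").input_ids
)

# Several "thought" steps that never touch language: the last hidden state
# (a vector, not a word) becomes the input embedding for the next position.
for _ in range(4):
    with torch.no_grad():
        out = model(inputs_embeds=inputs_embeds, output_hidden_states=True)
    latent_thought = out.hidden_states[-1][:, -1:, :]      # final position's state
    inputs_embeds = torch.cat([inputs_embeds, latent_thought], dim=1)

# Only at the very end is anything translated back into a word token.
final_logits = model(inputs_embeds=inputs_embeds).logits[:, -1, :]
print(tokenizer.decode(final_logits.argmax(dim=-1)))
```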
Wait, what space?
Modern reasoning models like OpenAI's o1 tend to work by generating a "chain of thought." Each step of the logical process in these models is expressed as a sequence of natural language word tokens that are fed back through the model.
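To make that feedback loop concrete, here is a minimal sketch of token-by-token generation, again assuming a generic Hugging Face causal language model. The model and prompt are stand-ins, and real reasoning models layer far more training and sampling machinery on top, but the basic loop is the same: every intermediate "thought" has to be squeezed into a word before it can influence the next step.

```python
# A minimal sketch of the "language space" loop: decode one token at a time
# and feed the growing text back through the model. Model and prompt are
# placeholders; real reasoning models are trained and sampled more carefully.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for any causal language model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = ("Q: If Alice is older than Bob and Bob is older than Carol, "
          "who is oldest?\nLet's think step by step.\n")
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Each iteration commits to ONE word token and appends it to the context,
# so every intermediate reasoning step must pass through language.
for _ in range(50):
    with torch.no_grad():
        logits = model(input_ids).logits
    next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # greedy pick
    input_ids = torch.cat([input_ids, next_token], dim=-1)      # feed it back

print(tokenizer.decode(input_ids[0]))
```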