The large language models popularized by chatbots are being taught to alternate reasoning with calls to external tools, such as Wikipedia, to boost their accuracy. The strategy could improve ...
Researchers at Google Cloud and UCLA have proposed a new reinforcement learning framework that significantly improves the ability of language models to learn very challenging multi-step reasoning ...