Yann LeCun Argues for Model Predictive Control Over Reinforcement Learning

Renowned AI researcher Yann LeCun, Chief AI Scientist at Meta and a Turing Award laureate, has once again voiced his preference for Model Predictive Control (MPC) over Reinforcement Learning (RL) in a recent post. LeCun has long been known for his critical views on RL, a popular machine learning technique.

LeCun argues that RL requires an excessive amount of trial and error to learn new tasks, a process he considers inefficient compared with MPC. In his view, MPC embodies the power of planning: given a good world model and a task objective, it can solve new tasks without any task-specific learning. He does not dismiss RL entirely, but suggests it should be a last resort in AI development.
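To make the planning idea concrete, here is a minimal sketch of the simplest such planner, random-shooting MPC. The `world_model` and `task_cost` callables are hypothetical stand-ins for a learned dynamics model and a task objective; nothing here is quoted from LeCun's post.

```python
import numpy as np

def plan_with_world_model(world_model, task_cost, state,
                          horizon=10, n_candidates=1000, action_dim=2):
    """Random-shooting MPC: sample action sequences, roll each one out
    through the world model, and return the first action of the cheapest."""
    # Candidate action sequences, sampled uniformly in [-1, 1].
    candidates = np.random.uniform(-1.0, 1.0,
                                   size=(n_candidates, horizon, action_dim))
    costs = np.zeros(n_candidates)
    for i, actions in enumerate(candidates):
        s = state
        for a in actions:
            s = world_model(s, a)        # predicted next state
            costs[i] += task_cost(s, a)  # accumulated task objective
    best = candidates[np.argmin(costs)]
    # Receding horizon: execute only the first action, then re-plan.
    return best[0]
```

Note that no policy is trained here: swapping in a different `task_cost` re-targets the same world model to a new task, which is precisely the property LeCun highlights.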

The AI pioneer’s stance on RL echoes his previous statements. In a speech half a year ago, he advocated abandoning RL, a sentiment he later qualified in an interview: he did not mean complete abandonment, but rather minimizing its use. The correct approach to training systems, he emphasized, is first to teach them a good representation of the world, and world models, primarily through observation (and possibly a little interaction).

MPC, a control system technology that uses mathematical models for real-time optimization over a finite horizon, has been in use since the 1960s and 70s, finding applications in chemical engineering, refining, advanced manufacturing, robotics, and aerospace. The integration of machine learning (ML) into MPC, known as ML-MPC, has shown potential for significant improvements in control performance and efficiency.
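For reference, the finite-horizon optimization that MPC solves at each step has the standard textbook form below (supplied here for clarity, not quoted from the article):

```latex
\min_{u_0,\dots,u_{H-1}} \; \sum_{k=0}^{H-1} \ell(x_k, u_k)
\qquad \text{s.t.} \quad x_{k+1} = f(x_k, u_k), \quad x_0 = x_{\text{now}}
```

Here \(f\) is the system model, \(\ell\) the stage cost, and \(H\) the horizon; only the first action \(u_0\) is applied before the problem is re-solved at the next time step. ML-MPC typically replaces the hand-derived \(f\) (and sometimes \(\ell\)) with learned components.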

LeCun’s research on world models incorporates MPC concepts, further emphasizing his preference. His views have sparked discussions within the AI community. Some argue that MPC works well when problems can be accurately modeled and have predictable dynamics, suggesting there might be untapped potential in signal processing and control fields for computer scientists. However, others point out the challenge of constructing precise MPC models and the difficulty in obtaining a good world model, a prerequisite in LeCun’s perspective.

Despite the debate, some proponents note that RL and MPC might not be mutually exclusive. Instead, they could complement each other, with studies combining the two techniques achieving promising results. Each method has its strengths and weaknesses, and their suitability depends on the specific problem at hand.
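As one illustration of how the two can complement each other (an illustrative pattern, not a method described in the article), a short-horizon planner can use an RL-learned value function as a terminal cost, so that planning handles near-term precision while RL supplies long-term judgment. Extending the earlier sketch, with `value_fn` a hypothetical learned state-value estimator:

```python
import numpy as np

def hybrid_plan(world_model, task_cost, value_fn, state,
                horizon=5, n_candidates=512, action_dim=2):
    """Short-horizon MPC whose rollout cost is corrected by an
    RL-learned value estimate of the final predicted state."""
    candidates = np.random.uniform(-1.0, 1.0,
                                   size=(n_candidates, horizon, action_dim))
    costs = np.zeros(n_candidates)
    for i, actions in enumerate(candidates):
        s = state
        for a in actions:
            s = world_model(s, a)        # model rollout
            costs[i] += task_cost(s, a)  # near-term cost from planning
        costs[i] -= value_fn(s)          # long-term return estimate from RL
    return candidates[np.argmin(costs)][0]
```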

RL, for instance, is a machine learning approach that learns through trial and error and excels when system dynamics are complex or unknown. It has been applied to autonomous systems, robotics, and other control problems, adapting dynamically to optimize system behavior. However, RL often requires extensive interaction with the environment and can be computationally intensive.
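For concreteness, the trial-and-error loop at the heart of the simplest RL algorithm, tabular Q-learning, looks like the sketch below. The environment interface (`reset`, `step`, `sample_action`) is a hypothetical minimal one, not any specific library's API.

```python
import numpy as np

def q_learning(env, n_states, n_actions,
               episodes=5000, alpha=0.1, gamma=0.99, eps=0.1):
    """Tabular Q-learning: improve a value table purely from sampled transitions."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # Epsilon-greedy exploration: the trial-and-error part.
            a = env.sample_action() if np.random.rand() < eps else int(np.argmax(Q[s]))
            s_next, r, done = env.step(a)
            # Bootstrap toward reward plus discounted best next-state value.
            Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) * (not done) - Q[s, a])
            s = s_next
    return Q
```

The many environment steps this loop consumes before the table converges are exactly the sample-inefficiency LeCun objects to.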

MPC, on the other hand, leverages a predictive model to optimize control actions proactively and sample-efficiently, but it depends on the accuracy of that model and can struggle under uncertainty or strongly non-linear dynamics.

The ongoing dialogue within the AI community highlights the importance of considering various approaches in solving complex problems. As research continues to push the boundaries of AI, the interplay between RL, MPC, and potentially other methodologies will likely shape the future of AI development, striving for more efficient and human-like learning mechanisms.

Source: https://www.jiqizhixin.com/articles/2024-08-26-15
