A team from the University of Science and Technology of China (USTC) and Huawei Noah’s Ark Laboratory has won the Best Student Paper award at the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD2024). The paper, titled Dataset Regeneration for Sequential Recommendation, proposes a data-centric paradigm for improving how sequential recommendation (SR) systems capture users’ evolving preferences.

Background and Significance

Sequential recommendation systems are a vital component of modern recommendation systems, as they aim to capture the dynamic nature of user preferences. Over the years, researchers have been working to enhance the capabilities of these systems, typically following a model-centric paradigm that focuses on developing effective models based on fixed datasets. However, this approach often overlooks potential quality issues and defects within the data itself.

To address these challenges, researchers have increasingly turned to a data-centric paradigm, which keeps the model fixed and focuses on producing higher-quality datasets. Dataset regeneration, the core idea of the USTC and Huawei Noah’s Ark Laboratory paper, follows this paradigm.

The Research Team

The research was led by Professor Chen Enhong of the National Key Laboratory of Cognitive Intelligence at USTC, an IEEE Fellow who works on data mining and machine learning. Professor Chen’s team has published numerous papers in top-tier journals and conferences, and their work has been cited over 20,000 times on Google Scholar. The team collaborated with Huawei Noah’s Ark Laboratory, Huawei’s lab for basic research in artificial intelligence.

The Paper and Its Contributions

The paper was presented at KDD2024, held in Barcelona, Spain, from August 25 to 29. It introduces a data-centric framework called Dataset Regeneration for Sequential Recommendation (DR4SR). The key idea is to learn a new dataset that explicitly encodes item transition patterns and to use it as training data in place of the original sequences.

The team observes that training a sequential recommender can be viewed as two stages: extracting item transition patterns from the original dataset, and learning user preferences from those patterns. A model trained directly on the original sequences has to learn both mappings implicitly and at the same time, which makes the task hard. The team therefore asks whether a dataset can be built that represents item transition patterns explicitly, so that the two stages are cleanly separated and the target model only has to learn the second one.
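To make the notion of an item transition pattern concrete, here is a minimal Python sketch of the first stage. It is not the paper's extraction procedure: it simply counts contiguous sub-sequences and keeps the frequent ones. The function name `extract_patterns`, the window lengths, the support threshold, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def extract_patterns(sequences, min_len=2, max_len=3, min_support=2):
    """Return contiguous item sub-sequences that occur in at least
    `min_support` different user sequences (hypothetical settings)."""
    counts = Counter()
    for seq in sequences:
        seen = set()  # count each pattern at most once per sequence
        for n in range(min_len, max_len + 1):
            for i in range(len(seq) - n + 1):
                seen.add(tuple(seq[i:i + n]))
        counts.update(seen)
    return {p: c for p, c in counts.items() if c >= min_support}


# Toy interaction histories, invented for this example.
sequences = [
    ["phone", "case", "charger"],
    ["phone", "case", "earbuds"],
    ["laptop", "mouse"],
]
patterns = extract_patterns(sequences)
print(patterns)
```

In this toy data only the transition from "phone" to "case" is shared by two users, so it is the only pattern that survives the threshold.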

Methodology

To make regeneration possible, the team proposes a pre-training task in which a regenerator learns to produce patterns from sequences. Because a single sequence can contain several patterns, they add a diversity-enhanced regenerator that models this one-to-many relationship between sequences and patterns. At inference time, a hybrid strategy balances exploration and exploitation when generating the new dataset.
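One plausible way to read the exploration and exploitation trade-off is sketched below in Python. This is an assumption about how such a strategy could work, not the authors' implementation. At each decoding step, with probability `gamma` the generator may pick any item in the catalogue (exploration); otherwise it may only pick items from the source sequence (exploitation). The names `hybrid_decode`, `gamma`, and `toy_score`, and the greedy choice of the best-scoring candidate, are hypothetical.

```python
import random


def hybrid_decode(source, all_items, score, length, gamma, rng):
    """Greedily build a pattern of up to `length` items.

    With probability `gamma` a step may choose from all items (exploration);
    otherwise it is restricted to items of `source` (exploitation).
    """
    pattern = []
    for _ in range(length):
        explore = rng.random() < gamma
        pool = all_items if explore else source
        candidates = [item for item in pool if item not in pattern]
        if not candidates:
            break
        pattern.append(max(candidates, key=lambda item: score(pattern, item)))
    return pattern


# Toy scores standing in for a trained regenerator's output (invented values).
popularity = {"a": 1.0, "b": 2.0, "c": 3.0, "x": 9.0, "y": 8.0}


def toy_score(prefix, item):
    return popularity[item]


source = ["a", "b", "c"]
all_items = list(popularity)
exploit_only = hybrid_decode(source, all_items, toy_score, 2, 0.0, random.Random(0))
explore_only = hybrid_decode(source, all_items, toy_score, 2, 1.0, random.Random(0))
print(exploit_only, explore_only)
```

Setting `gamma` to 0 keeps every generated item inside the source sequence, while setting it to 1 lets the generator introduce items the user never interacted with. Values in between mix the two behaviours.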

The team also notes that this regeneration process is model-agnostic, so the resulting dataset may not be the best fit for a particular target model. To address this, they introduce DR4SR+, a model-aware variant that tailors the regenerated dataset to the target model. DR4SR+ formulates this as a bi-level optimization problem, solved with implicit differentiation, that assigns a personalized score to each pattern in the regenerated dataset.
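The general idea of bi-level optimization with implicit differentiation can be illustrated on a toy problem. The Python sketch below is not the DR4SR+ model; it assumes a one-parameter "model" `w` trained on weighted samples (the inner problem) and a held-out target that defines the outer loss. All names and numbers are hypothetical. The gradient of the outer loss with respect to each sample score is obtained from the implicit function theorem rather than by unrolling the inner training.

```python
def inner_solution(scores, xs, lam):
    """Closed-form minimiser of sum_i s_i * (w - x_i)**2 + lam * w**2."""
    return sum(s * x for s, x in zip(scores, xs)) / (sum(scores) + lam)


def outer_loss(scores, xs, target, lam):
    """Held-out loss of the model trained on the score-weighted data."""
    return (inner_solution(scores, xs, lam) - target) ** 2


def hypergradient(scores, xs, target, lam):
    """Gradient of the outer loss w.r.t. each score, via implicit differentiation."""
    w = inner_solution(scores, xs, lam)
    hessian = 2.0 * (sum(scores) + lam)      # second derivative of inner loss in w
    grads = []
    for x in xs:
        mixed = 2.0 * (w - x)                # mixed derivative in w and this score
        dw_ds = -mixed / hessian             # implicit function theorem
        grads.append(2.0 * (w - target) * dw_ds)
    return grads


xs = [1.0, 2.0, 10.0]       # the third sample is an outlier (invented data)
scores = [1.0, 1.0, 1.0]    # initial per-sample scores
grads = hypergradient(scores, xs, target=1.5, lam=0.1)
print(grads)
```

The gradient for the outlier is positive and the other two are negative, so a gradient step on the scores lowers the outlier's weight and raises the others. In this toy setting, that is the sense in which per-sample scores adapt a dataset to a target model.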

Implications and Future Directions

The research represents a significant step forward in the field of sequential recommendation systems. By shifting the focus from model-centric to data-centric approaches, the team has opened up new avenues for improving the quality and adaptability of recommendation systems.

The paper has drawn interest from the academic community, and its insights and methods may inform future research and development in recommender systems and machine learning.

For those interested in delving deeper into the research, the paper is available at https://arxiv.org/abs/2405.17795, and the code for the project can be found at https://github.com/USTC-StarTeam/DR4SR.

The KDD2024 Best Student Paper Award is a testament to the innovative spirit and academic excellence of the USTC and Huawei Noah’s Ark Laboratory team, setting the stage for further advancements in the field of AI.

