Grab Accelerates Data Discovery with AI-Powered Hubble
Singapore, November 16, 2024 – Ride-hailing giant Grab has significantly streamlined its data discovery process using a combination of large language models (LLMs), specifically GPT-4, and tools such as Hubble and Glean, integrated with Slack. The approach tackles the challenge of locating valuable data within a sprawling data landscape of over 200,000 tables.
The company’s internal data discovery tool, Hubble, built on the DataHub platform, faced the common problem of inefficient data searches. A full 18% of searches were abandoned before users even reviewed the results, highlighting the urgent need for improvement. Data consumers often relied on tribal knowledge, leading to discovery processes that could stretch for days. As Shreyas Parbat, Grab’s Chief Product Manager, explained, “Our vision was clear: automate the entire process with LLM-powered products, removing the human element from data discovery. Our goal was to reduce data discovery time from days to seconds, making it accessible to everyone.”
Initial improvements focused on enhancing Hubble’s user interface and underlying search functionality. Irrelevant and obsolete tables were hidden or removed from the search index, relevance ranking was improved, authentication was streamlined, and relevant tags were added. These UI/UX enhancements alone resulted in a 12% increase in search click-through rates.
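The search clean-up described above can be illustrated with a minimal sketch. This is not Hubble's actual implementation: the `TableEntry` fields, the `monthly_queries` usage signal, and the scoring weights are all assumptions made for illustration. It only shows the general idea of dropping obsolete tables from results and ranking the rest by name and tag matches.

```python
from dataclasses import dataclass, field


@dataclass
class TableEntry:
    name: str
    deprecated: bool = False            # hypothetical flag for obsolete tables
    tags: set[str] = field(default_factory=set)
    monthly_queries: int = 0            # hypothetical usage signal for tie-breaking


def search(index: list[TableEntry], query: str) -> list[str]:
    """Return names of non-deprecated tables matching the query, best match first."""
    terms = query.lower().split()

    def score(entry: TableEntry) -> int:
        # A match in the table name counts double a match in the tags (arbitrary weights).
        name_hits = sum(t in entry.name.lower() for t in terms)
        tag_hits = sum(t in entry.tags for t in terms)
        return 2 * name_hits + tag_hits

    # Hide obsolete tables and anything that does not match at all.
    candidates = [e for e in index if not e.deprecated and score(e) > 0]
    # Rank by match score, then by how often the table is actually used.
    candidates.sort(key=lambda e: (score(e), e.monthly_queries), reverse=True)
    return [e.name for e in candidates]


index = [
    TableEntry("trips_daily", tags={"trips"}, monthly_queries=900),
    TableEntry("trips_daily_old", deprecated=True, tags={"trips"}),
    TableEntry("driver_payouts", tags={"finance", "trips"}, monthly_queries=300),
]
results = search(index, "trips")
print(results)
```

In this toy index the deprecated table never appears in the results, which mirrors the effect of hiding obsolete tables from the search index.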
However, the most significant leap forward came from GPT-4. The team integrated GPT-4 into Hubble to automatically generate documentation for tables based on their schemas and sample data. Data producers can accept the AI-generated descriptions as comprehensive table documentation or customize them. The impact was dramatic: documentation coverage jumped from 20% to 90%, with 95% of users finding the generated documentation valuable. This significantly reduced the reliance on outdated or incomplete documentation, accelerating the data discovery process.
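As a rough illustration of generating documentation from a schema and sample data, the sketch below assembles the text of a prompt that could be sent to an LLM. The prompt wording, the `Column` type, the table name, and the five-row sample cap are assumptions, not details from Grab; the call to the model itself is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Column:
    name: str
    dtype: str


def build_doc_prompt(table_name: str, columns: list[Column], sample_rows: list[dict]) -> str:
    """Assemble a prompt asking an LLM to describe a table (hypothetical format)."""
    schema_lines = [f"- {c.name} ({c.dtype})" for c in columns]
    sample_lines = [
        ", ".join(f"{c.name}={row.get(c.name)!r}" for c in columns)
        for row in sample_rows[:5]  # cap the sample to keep the prompt small
    ]
    return "\n".join(
        [
            f"Write a concise description of the table `{table_name}`",
            "and a one-line description of each column.",
            "Schema:",
            *schema_lines,
            "Sample rows:",
            *sample_lines,
        ]
    )


prompt = build_doc_prompt(
    "rides.completed_trips",  # hypothetical table name
    [Column("trip_id", "string"), Column("fare_sgd", "double")],
    [{"trip_id": "t-001", "fare_sgd": 12.5}],
)
print(prompt)
```

The returned string would be passed to the model, and the generated description would then go to the data producer for review or editing, as the article describes.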
To further enhance accessibility, Grab developed a Slack bot using Glean and integrated it with Hubble. This allows data consumers to access and explore data lake table documentation directly within their familiar Slack workspace, minimizing context switching. The integration of Glean and the Slack bot opens up data access, enabling a wider range of users to quickly find the information they need.
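The core of such a bot, looking up a table mentioned in a chat message and replying with its documentation, can be sketched as follows. This does not use the Slack or Glean APIs: `TABLE_DOCS` is a hypothetical in-memory stand-in for the documentation store, and the `schema.table` naming pattern is an assumption.

```python
import re

# Hypothetical in-memory stand-in for documentation served by the data catalog.
TABLE_DOCS = {
    "rides.completed_trips": "One row per completed trip, including fare and timestamps.",
}

# Assumes tables are referred to as lowercase `schema.table` names.
TABLE_NAME = re.compile(r"\b[a-z_]+\.[a-z_]+\b")


def reply_to(message: str, docs: dict[str, str] = TABLE_DOCS) -> str:
    """Answer a chat message with the documentation of the first known table it mentions."""
    for name in TABLE_NAME.findall(message.lower()):
        if name in docs:
            return f"{name}: {docs[name]}"
    return "No documented table found in that message."


print(reply_to("What is in rides.completed_trips?"))
```

In a real deployment the lookup would query the catalog rather than a dictionary, and the reply would be posted back to the channel where the question was asked.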
Grab’s approach is a compelling example of how LLMs can improve data management in large organizations. By combining AI-powered automation with user-centric design, Grab has dramatically improved data discovery, reducing search times from days to seconds and helping its data consumers make fuller use of its vast data resources. The initiative not only boosts productivity but also fosters a more data-driven culture within the company.
References:
- Grab Engineering Blog