AI-Powered Cloud Storage Solution That Changes Data Management for Enterprises

Manjul Sahay

Generative Artificial Intelligence (AI) has progressed rapidly over the last few months. At the Google Cloud Next conference, the company showcased the integration of Generative AI into various enterprise use cases. One notable use case was enabling enterprises to manage cloud storage at scale using Generative AI. Manjul Sahay, a product manager at Google, described the problem of managing storage at scale for cost and security and showed how Generative AI can help solve it. Sahay's career spans Google, Amazon Web Services, and Nutanix, where he has focused on building products for the evolving data management field. His work has contributed to products that have influenced the industry.

AI-Driven Storage Insights

"Storage Insights with Gemini," is a new cloud product that uses Generative AI to give enterprises visibility into their storage estates. This product enables customers to ask questions about their storage environment in natural language by leveraging Google's Gemini Large Language Model, aiming to simplify data management. Gemini 1.5 Pro offers the world's longest context window, with support for up to 1 million tokens. In a security context, it can dramatically simplify the technical and labor-intensive process of reverse engineering malware. For example, it was able to process the entire decompiled code of the malware file for WannaCry in a single pass, taking 34 seconds to deliver its analysis.

"Large enterprises often face the challenge of managing millions or billions of files across their storage infrastructure," explains Sahay. "Storage Insights with Gemini aims to help these organizations better understand their storage environment, providing insights that can inform decision-making and optimize resource allocation."

Cloud Storage is used by thousands of customers, including Uber, Toyota, and Snapchat, and it is a key component of enterprise cloud applications. Many enterprises now operate at a scale of billions of files, where traditional storage management breaks down. The traditional approach requires extensive manual effort to export metadata, build data pipelines, and develop automation, which is both time-consuming and complex.

Sahay led a team that built this Generative AI product from the ground up: developing requirements, working through the architecture, making key technical choices, and eventually launching the capability. Sahay explained that the new product uses Storage Insights datasets and BigQuery to collect object metadata and produce daily object metadata snapshots. The data in these daily snapshots is then analyzed by a Large Language Model to answer natural language queries.
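To make the snapshot idea concrete, here is a minimal sketch of the kind of aggregate question a daily metadata snapshot can answer. It is illustrative only: an in-memory SQLite table stands in for BigQuery, and the table name `object_metadata`, its columns, and the sample rows are assumptions rather than the product's actual schema.

```python
import sqlite3

# Hypothetical snapshot rows (not the real Storage Insights schema):
# (snapshot_date, bucket, name, location, size_bytes)
ROWS = [
    ("2024-05-01", "logs", "a.log", "us-east1", 100),
    ("2024-05-01", "logs", "b.log", "us-east1", 300),
    ("2024-05-01", "media", "c.mp4", "europe-west1", 600),
    ("2024-04-30", "logs", "a.log", "us-east1", 100),
]

def bytes_by_location(rows, snapshot_date):
    """Return (location, object_count, total_bytes) for one daily snapshot."""
    conn = sqlite3.connect(":memory:")  # stand-in for a BigQuery dataset
    conn.execute(
        "CREATE TABLE object_metadata ("
        "snapshot_date TEXT, bucket TEXT, name TEXT, "
        "location TEXT, size_bytes INTEGER)"
    )
    conn.executemany("INSERT INTO object_metadata VALUES (?, ?, ?, ?, ?)", rows)
    cursor = conn.execute(
        "SELECT location, COUNT(*), SUM(size_bytes) "
        "FROM object_metadata WHERE snapshot_date = ? "
        "GROUP BY location ORDER BY SUM(size_bytes) DESC",
        (snapshot_date,),
    )
    result = cursor.fetchall()
    conn.close()
    return result

print(bytes_by_location(ROWS, "2024-05-01"))
```

A question such as "how is my storage distributed across regions?" would map to a query of this shape, run against the most recent snapshot rather than against live objects.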

Sahay also spoke about key technical challenges and innovations in building this product. One of the biggest problems for Generative AI products is trust and accuracy: users worry that a response from an AI model may be incorrect and therefore do not fully trust such systems. Sahay worked with his team to build two major safeguards: pre-curated prompts and the display of SQL queries for user verification. The team conducted extensive research to create a set of pre-curated prompts covering common queries related to usage, savings, security, and data discovery. These prompts are engineered for accurate responses and are displayed with a label indicating high accuracy. Each AI response is also accompanied by the corresponding SQL query, so users can inspect the query and verify its correctness themselves.
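The two safeguards can be sketched roughly as follows. This is not the product's implementation: the prompt texts, the SQL strings, the label wording, and the `generate_sql` callback (a placeholder for whatever model call produces SQL) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Answer:
    sql: str       # the query that produced the answer, always shown to the user
    curated: bool  # True when the question matched a pre-curated prompt

# Hypothetical pre-curated prompts mapped to hand-verified SQL.
CURATED_PROMPTS = {
    "how many objects are in each bucket?":
        "SELECT bucket, COUNT(*) FROM object_metadata GROUP BY bucket",
    "which buckets use the most storage?":
        "SELECT bucket, SUM(size_bytes) AS total FROM object_metadata "
        "GROUP BY bucket ORDER BY total DESC",
}

def normalize(question: str) -> str:
    """Lowercase and collapse whitespace so near-identical prompts match."""
    return " ".join(question.lower().split())

def answer(question: str, generate_sql: Callable[[str], str]) -> Answer:
    """Prefer vetted SQL; fall back to model-generated SQL otherwise."""
    sql = CURATED_PROMPTS.get(normalize(question))
    if sql is not None:
        return Answer(sql=sql, curated=True)
    return Answer(sql=generate_sql(question), curated=False)

def render(ans: Answer) -> str:
    """Attach an accuracy label and expose the SQL for self-verification."""
    label = "High accuracy (pre-curated)" if ans.curated else "AI-generated: verify before use"
    return f"[{label}]\nSQL used: {ans.sql}"
```

Showing the SQL alongside every answer means a user who doubts a result can read or rerun the query instead of taking the model's word for it.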

Ensuring that AI systems respond only with permissible data is another major problem. Sahay and the team addressed it by using the logged-in user's role and permissions to limit the data available to the AI model, which enforces data access governance.
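One way to picture this kind of governance is to filter the metadata by the caller's permissions before anything reaches the model. The sketch below assumes a simple role-to-bucket mapping; the role names, bucket names, and helper functions are hypothetical, and a real system would rely on the cloud provider's access controls rather than an in-process dictionary.

```python
# Hypothetical mapping from role to the buckets that role may read.
ROLE_BUCKETS = {
    "finance-analyst": {"billing-exports"},
    "storage-admin": {"billing-exports", "app-logs", "media"},
}

def rows_visible_to(role, rows):
    """Keep only metadata rows for buckets the role is allowed to see."""
    allowed = ROLE_BUCKETS.get(role, set())  # unknown roles see nothing
    return [row for row in rows if row["bucket"] in allowed]

def build_model_context(role, rows):
    """Build the text handed to the model from permitted rows only."""
    visible = rows_visible_to(role, rows)
    return "\n".join(
        f'{row["bucket"]}/{row["name"]} {row["size_bytes"]}' for row in visible
    )

SAMPLE_ROWS = [
    {"bucket": "billing-exports", "name": "2024-04.csv", "size_bytes": 2048},
    {"bucket": "app-logs", "name": "server.log", "size_bytes": 4096},
]

print(build_model_context("finance-analyst", SAMPLE_ROWS))
```

Because the filter runs before the prompt is built, the model cannot reveal rows the user is not entitled to see, however the question is phrased.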

Sahay recently presented this product to technology thought leaders at Cloud Field Day. He demonstrated how users can quickly identify storage distribution across regions, check for public access vulnerabilities, and manage costs by locating and addressing orphaned or unnecessary data. With AI-generated insights, enterprises can better understand the security and compliance posture of their stored data and avoid expensive problems. Enterprises also need these insights to understand and optimize cloud storage costs, which are increasingly a focus area for enterprise cloud customers. AI-generated insights can quickly analyze costs, security, and compliance across massive storage estates, and they can surface findings that storage administrators may not have considered before.

In response to an analyst's question, Sahay also hinted at the possibility of expanding these capabilities to include more complex operations and other storage services in the future. He emphasized the importance of AI in accelerating analysis and providing deeper insights while cautioning against fully automated actions without human oversight.

New AI Product, New Industry Standards

Thousands of large and small customers use Cloud Storage. For these customers, this product can make a significant difference by helping them avoid security incidents, stay in compliance, and manage storage costs. Storage security incidents such as the Capital One cloud storage incident have had major financial and reputational implications for customers and cloud providers. If AI can help reduce such incidents, that will be a major win for all organizations.

In a Google Cloud Next presentation in Las Vegas with hundreds of attendees, Sahay shared that Storage Insights is already processing trillions of object records every month. He also described how a leading social media customer analyzed billions of objects within 24–48 hours, at a fraction of the cost of the alternatives. Another major customer, Recursion Pharma, a drug discovery company, shared how it is using these storage management features to manage more than 50 petabytes of data across 3 billion files.

This new cloud product is also a first step in a new direction: using AI to automate and simplify cloud infrastructure management. Google is the first cloud provider to launch a Generative AI-powered storage management product. We believe other large cloud players will soon follow the direction set by Google Cloud and offer similar features in their cloud storage offerings. AI-generated management insights are likely to become the norm in the cloud industry.

ⓒ 2024 TECHTIMES.com All rights reserved. Do not reproduce without permission.