Asset sprawl, siloed data and CloudQuery’s search for unified cloud governance

Gaining visibility — and, ultimately, insights — into enterprise cloud assets is growing ever more challenging.

Cloud estates are sprawling and fragmented, and inventory capabilities in existing tools can be narrow and unintuitive, separating elements like cost and security data into disconnected platforms with limited flexibility.

Cloud governance company CloudQuery is positioning itself to address this problem by centralizing cloud assets, security metadata and cost data in one place, and making them accessible through built-in SQL queries and reports. The company is taking a developer-first approach to cloud governance, pulling data from 60-plus sources, including AWS, GCP, Azure, Okta and Wiz, into a single, queryable data warehouse.

The company is now announcing a $15 million funding round led by Partech to further scale its approach to cloud visibility.

“The biggest challenge with existing tools is that they’re siloed (one for security, one for cost, one for asset inventory), making it hard to get a unified view across domains,” CloudQuery founder Yevgeny Pats told VentureBeat. “Even simple questions like ‘What EBS volume is attached to an EC2 that is turned off?’ are hard to answer without stitching together multiple tools.”
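To make that question concrete, here is a minimal sketch of how it might look as a single SQL join once volume and instance data sit side by side. The table and column names (`ec2_instances`, `ebs_volumes`, `attached_instance_id`) are simplified assumptions for illustration, not CloudQuery's actual schema, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical, simplified tables: real inventory schemas have more columns
# and different names. SQLite stands in for the data warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ec2_instances (instance_id TEXT PRIMARY KEY, state TEXT);
    CREATE TABLE ebs_volumes (volume_id TEXT PRIMARY KEY, attached_instance_id TEXT);
    INSERT INTO ec2_instances VALUES ('i-running', 'running'), ('i-stopped', 'stopped');
    INSERT INTO ebs_volumes VALUES
        ('vol-a', 'i-running'), ('vol-b', 'i-stopped'), ('vol-c', NULL);
""")


def volumes_on_stopped_instances(conn):
    """Return (volume_id, instance_id) pairs for volumes attached to stopped instances."""
    rows = conn.execute("""
        SELECT v.volume_id, i.instance_id
        FROM ebs_volumes AS v
        JOIN ec2_instances AS i ON i.instance_id = v.attached_instance_id
        WHERE i.state = 'stopped'
        ORDER BY v.volume_id
    """)
    return rows.fetchall()


print(volumes_on_stopped_instances(conn))  # [('vol-b', 'i-stopped')]
```

The unattached volume (`vol-c`) drops out of the inner join, so only volumes tied to a stopped instance are returned.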

CloudQuery under the hood

CloudQuery relies on two key technologies under the hood: ClickHouse, an open-source database used for data warehousing, and Apache Arrow, a framework for developing data analytics applications.

Its high-performance plugin architecture, built in Go, connects directly to the APIs of AWS, Azure, Google Cloud Platform (GCP) and many other platforms, pulling in configuration, security and cost metadata. The platform continuously syncs data from dozens of cloud providers and services into a normalized, centralized asset inventory.

“We place a strong emphasis on data accuracy and freshness, syncing at high frequency to ensure teams are working with the most reliable, up-to-date information,” said Pats.

That data, he explained, is structured relationally to power CloudQuery’s SQL engine and built-in reports, so that teams can have full flexibility without relying on black-box tools.

The company also “selectively” uses large language models (LLMs) for natural language querying, SQL generation and recommendations, “but always on top of a foundation of accurate, transparent data,” said Pats. He pointed out that because AI understands SQL well, tools like Claude and OpenAI’s models can create customized reports and analysis from plain-English requests.

Taking a developer-first approach is critical, said Pats, because developers are ultimately the ones building, operating and securing today’s cloud infrastructure. Still, many cloud visibility tools were built for top-down governance, not for the people actually in the trenches.

“When you put developers first, with accessible data, flexible APIs and native language like SQL, you empower them to move faster, catch issues earlier and build more securely,” he said.

Customers are finding ways to use CloudQuery beyond asset inventory. “Many start with visibility, then quickly grow into use cases like compliance monitoring, security posture management, cost optimization, all from the same core platform,” said Pats.

How Hexagon built a serverless data lake for all its cloud stores

One enterprise already seeing results is Hexagon. The software company’s cloud center of excellence (CCoE) team set out to build a fully serverless data lake that could collect data from all of its cloud accounts and store it in one place.

They also wanted the ability to query this data using SQL and visualize it with tools they were familiar with (such as AWS QuickSight), and explore the history of their cloud configuration over time.

The team built a serverless data pipeline using CloudQuery to collect data from all accounts and store it in S3. AWS Glue then ingests the data into Glue DB in a format that Amazon Athena can query, and the query results are visualized in QuickSight.

“Having a fully serverless solution was an important requirement,” Hexagon cloud governance and FinOps expert Peter Figueiredo and CloudQuery director of engineering Herman Schaaf wrote in a blog post. “This decision brought lots of benefits since there is no need for time-consuming updates and virtually zero maintenance.”

They did have to overcome some challenges, particularly with Amazon S3 support plugins. The CCoE team was one of the first to try out CloudQuery features in the S3 destination and offered insights leading to new features. These include:

Parquet support: The CloudQuery file destination initially only supported CSV and JSON data formats. Errors in JSON interpretations led CloudQuery to add Parquet support.
Data partitioning: A CloudQuery file destination plugin now allows partitioning on initial write (which previously wasn’t available, resulting in extra unnecessary steps).
Resource view for Athena: CloudQuery initially only offered a resources view for AWS compatible with Postgres. But Athena didn’t support this, so CloudQuery added a function that can retrieve a list of all tables to build or update a resources view.
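The resource-view item above can be sketched roughly as follows: list the synced tables, then generate one view that unions the columns they share. This is an illustrative assumption about the general technique, not CloudQuery's actual function. The helper names, the `aws_` prefix filter and the shared `account_id` and `arn` columns are hypothetical, and SQLite stands in for Athena (which would allow `CREATE OR REPLACE VIEW` instead of drop-and-create).

```python
import sqlite3


def list_tables(conn, prefix):
    """Return the names of all tables in the catalog that start with `prefix`."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows if name.startswith(prefix)]


def build_resources_view_sql(tables, view_name="resources"):
    """Build a CREATE VIEW statement that unions shared columns from every table.

    Table names are interpolated directly, so they must come from the catalog,
    never from user input.
    """
    if not tables:
        raise ValueError("at least one table is required")
    selects = [
        f"SELECT '{t}' AS source_table, account_id, arn FROM \"{t}\""
        for t in tables
    ]
    return f'CREATE VIEW "{view_name}" AS\n' + "\nUNION ALL\n".join(selects)


def refresh_resources_view(conn, prefix="aws_"):
    """Drop and rebuild the view so newly synced tables are included."""
    conn.execute('DROP VIEW IF EXISTS "resources"')
    conn.execute(build_resources_view_sql(list_tables(conn, prefix)))


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE aws_s3_buckets (account_id TEXT, arn TEXT);
    CREATE TABLE aws_ec2_instances (account_id TEXT, arn TEXT);
    INSERT INTO aws_s3_buckets VALUES ('111', 'arn:aws:s3:::logs');
    INSERT INTO aws_ec2_instances VALUES ('222', 'arn:aws:ec2:us-east-1:222:instance/i-1');
""")
refresh_resources_view(conn)
rows = conn.execute(
    "SELECT source_table, arn FROM resources ORDER BY source_table"
).fetchall()
print(rows)
```

Rerunning `refresh_resources_view` after a new table is synced rebuilds the view, which is the "build or update" behavior the item describes.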

Figueiredo’s team used CloudQuery to replace AWS’s VPC IP address manager (IPAM) — which he called expensive and limited in that it does not cover other cloud providers.

Ultimately, his team runs CloudQuery in “data lake” mode using “ultra cheap infrastructure” including AWS S3, ECS, Glue, Athena and Lambda, Figueiredo told VentureBeat. This keeps costs low and allows Hexagon to merge all its IP addresses across different cloud providers.

“We can quickly query any IP across the board and find who the owners are,” said Figueiredo. “We are now able to collect all we need at a very low cost with near zero maintenance. This is the holy grail for our team.”
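As a rough illustration of that kind of lookup, the sketch below searches a merged list of address ranges from several providers for the ones containing a given IP. The record layout, CIDR ranges and team names are invented for the example and say nothing about how Hexagon actually stores or queries its data.

```python
import ipaddress

# Hypothetical merged inventory of address ranges from several providers.
ALLOCATIONS = [
    {"cloud": "aws", "cidr": "10.0.0.0/16", "owner": "platform-team"},
    {"cloud": "azure", "cidr": "10.1.0.0/16", "owner": "data-team"},
    {"cloud": "gcp", "cidr": "10.1.4.0/24", "owner": "ml-team"},
]


def find_owners(ip, allocations):
    """Return allocations containing `ip`, most specific (longest prefix) first."""
    addr = ipaddress.ip_address(ip)
    matches = [a for a in allocations if addr in ipaddress.ip_network(a["cidr"])]
    return sorted(
        matches,
        key=lambda a: ipaddress.ip_network(a["cidr"]).prefixlen,
        reverse=True,
    )


print([m["owner"] for m in find_owners("10.1.4.7", ALLOCATIONS)])
# ['ml-team', 'data-team']
```

Returning every match rather than only the first also surfaces overlapping ranges across providers.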
