Hi HN! I’m Peter, the co-founder of OpenMeter (http://openmeter.io). We are building an open-source project to help engineers meter and attribute AI and compute usage for billing and analytics use cases. Our GitHub is at https://github.com/openmeterio/openmeter, and there’s a demo video here: https://www.loom.com/share/bc1cfa1b7ed94e65bd3a82f9f0334d04.

Why? Companies are increasingly adopting usage-based pricing models, requiring accurate metering. In addition, many SaaS products are expected to offer AI capabilities. To effectively cover costs and stay profitable, these companies must meter AI usage and attribute it to their customers.

When I worked at Stripe, my job was to price and attribute database usage to product teams. You can think of it as internal usage-based pricing to keep teams accountable and the business within its margins. This was when I realized how challenging it is to extract usage data from various cloud infrastructure components (execution time, bytes stored, query complexity, backup size, etc.), meter it accurately, and handle failure scenarios like backfills and meter resets. I was frustrated that no standard exists to meter cloud infrastructure, and we had to do this on our own.

Usage metering requires accurately processing large volumes of events in real time to power billing use cases and modern data-intensive applications. Imagine you want to meter and bill workload execution at per-second granularity, or meter the number of API calls you make to a third party and act instantly on events like a user hitting a billing threshold. The real-time aspect requires instant aggregations and queries; scalability means being able to ingest and process millions of usage events per second; it must be accurate—billing requires precise metering; and it must be fault tolerant, with built-in idempotency, event backfills, and meter resets.

This is challenging to build out, and the obvious approaches don’t work well: writing to a database for each usage event is expensive; monitoring systems are cheaper but inaccurate and lack idempotency (distributed systems use at-least-once delivery); batch processing in the data warehouse has unacceptable latency.
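To make the idempotency point concrete, here is a minimal Go sketch (the event shape and field names are illustrative, not OpenMeter's actual schema) showing why a stable, producer-chosen event ID matters: with at-least-once delivery the same event can arrive twice, and a meter that simply sums raw events would overbill.

```go
package main

import "fmt"

// usageEvent is a hypothetical event shape: a stable ID plus a metered value.
type usageEvent struct {
	ID    string  // idempotency key chosen by the producer
	Value float64 // e.g. seconds of execution, API calls, tokens
}

// dedupe sums values while counting each event ID at most once,
// so at-least-once redelivery cannot inflate the meter.
func dedupe(events []usageEvent) float64 {
	seen := make(map[string]bool)
	var total float64
	for _, e := range events {
		if seen[e.ID] {
			continue // duplicate redelivery: already counted
		}
		seen[e.ID] = true
		total += e.Value
	}
	return total
}

func main() {
	events := []usageEvent{
		{ID: "evt-1", Value: 10},
		{ID: "evt-2", Value: 5},
		{ID: "evt-1", Value: 10}, // redelivered duplicate
	}
	fmt.Println(dedupe(events)) // 15, not 25
}
```

A naive counter (or a monitoring system without idempotency) would report 25 here; a billing-grade meter must report 15.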

Companies also need to extract usage data from cloud infrastructure (Kubernetes, AWS, etc.), vendors (OpenAI, Twilio, etc.), and hardware components to attribute metered usage to their customers. Collecting usage in many cases requires writing custom code like measuring execution duration, listening to lifecycle events, scraping APIs periodically, parsing log streams, and attributing usage of shared and multi-tenant resources.

OpenMeter leverages stream processing to update meters in real time while processing large volumes of events simultaneously. The core is written in Go and uses the CloudEvents format to describe usage, Kafka to ingest events, and ksqlDB to dedupe and aggregate meters. We are also working on a Postgres sink for long-term storage. Check out our GitHub to learn more: https://github.com/openmeterio/openmeter

Other companies in the usage-based billing space are focused on payments and basically want to be Stripe replacements. With OpenMeter, we’re focusing instead on the engineering challenge of collecting usage data from cloud infrastructure and balancing tradeoffs between cost, scale, accuracy, and staleness. We’re not trying to be a payment platform—rather, we want to empower engineers to provide fresh and accurate usage data to Product, Sales, and Finance, helping them with billing, analytics, and revenue use cases.

We’re building OpenMeter as an open-source project (Apache 2.0), with the goal of making it the standard to collect and share usage across many solutions and providers. In the future, we’ll offer a hosted / cloud version of OpenMeter with high availability guarantees and easy integrations to payment, CRM, and analytics solutions.

What usage metering issues or experiences do you have? We would love to hear your feedback on OpenMeter and to learn which sources you need to extract usage from and how the metered data is leveraged. Looking forward to your comments!