Curator's Take
This benchmarking study gives the quantum computing community much-needed transparency about the real-world performance and cost of running quantum chemistry simulations on IBM's hardware. By systematically testing hydrogen molecule calculations across different IBM processors, shot counts, and optimization strategies, the researchers have created a reference dataset that will help both newcomers and experienced practitioners make informed decisions about resource allocation and expected accuracy. The finding that circuit simplification through tapered mappings delivers the most consistent improvements, while resilience level 1 error mitigation improves accuracy only at a substantial cost premium, offers practical guidance for optimizing the accuracy-to-cost ratio in near-term quantum applications. This type of rigorous, hardware-validated benchmarking is essential for the field's maturation, helping bridge the gap between theoretical quantum advantage and practical implementation realities.
— Mark Eatherly
Summary
We present a hardware-validated reference dataset for variational ground-state energy calculations of the hydrogen molecule H\(_2\) on several IBM Quantum processors available in 2026. Using a standardized workflow, we benchmark the impact of shot count, backend choice, optimization strategy, and runtime variability on the achievable energy accuracy relative to exact diagonalization. The resulting dataset and analysis provide a transparent baseline for assessing the current capabilities and limitations of IBM Quantum hardware for quantum-chemistry applications. They are also meant to lower the barrier to entry for new users by giving a comprehensive overview of the available choices and their effects, together with the runtime and costs to expect. Across the configurations studied here, circuit simplification through tapered mappings provides the most consistent accuracy gains, resilience level 1 improves accuracy at a substantial cost premium, and session-based execution yields no systematic accuracy advantage over single-job execution despite markedly higher billed time.
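To make "energy accuracy relative to exact diagonalization" concrete, here is a minimal Python sketch of how a shot-limited energy estimate can be compared against an exact reference. It is not the workflow used in the study. It assumes a toy one-qubit Hamiltonian H = c0*I + cz*Z + cx*X with placeholder coefficients (not H\(_2\) values from the dataset), a single-parameter Ry ansatz, and statistical shot noise only. Gate errors, readout errors, and error mitigation on real hardware are not modelled, and all function names are hypothetical.

```python
import math
import random

# Placeholder coefficients for a one-qubit Hamiltonian H = c0*I + cz*Z + cx*X.
# These are illustrative numbers, NOT the H2 coefficients from the study.
C0, CZ, CX = -0.3, 0.5, 0.3


def exact_ground_energy(c0, cz, cx):
    """Lowest eigenvalue of c0*I + cz*Z + cx*X (closed form for a 2x2 matrix)."""
    return c0 - math.hypot(cz, cx)


def optimal_angle(cz, cx):
    """Angle theta of Ry(theta)|0> that minimizes cz*cos(theta) + cx*sin(theta)."""
    return math.atan2(-cx, -cz)


def sampled_expectation(mean, shots, rng):
    """Estimate a Pauli expectation value from `shots` simulated +/-1 outcomes."""
    p_plus = (1.0 + mean) / 2.0
    plus = sum(1 for _ in range(shots) if rng.random() < p_plus)
    return (2 * plus - shots) / shots


def estimated_energy(theta, shots, rng, c0=C0, cz=CZ, cx=CX):
    """Shot-noise-only energy estimate; Z and X are sampled in separate runs."""
    # For the state Ry(theta)|0>, <Z> = cos(theta) and <X> = sin(theta).
    z_hat = sampled_expectation(math.cos(theta), shots, rng)
    x_hat = sampled_expectation(math.sin(theta), shots, rng)
    return c0 + cz * z_hat + cx * x_hat


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
    e_exact = exact_ground_energy(C0, CZ, CX)
    theta = optimal_angle(CZ, CX)
    for shots in (100, 1_000, 10_000):
        err = abs(estimated_energy(theta, shots, rng) - e_exact)
        print(f"shots={shots:>6}  |E_est - E_exact| = {err:.5f}")
```

In this idealized model the statistical error shrinks roughly as one over the square root of the shot count, so any additional gap seen on hardware would come from noise sources that the sketch leaves out.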