algorithms machine_learning sensing research

One Coordinate at a Time: Convergence Guarantees for Rotosolve in Variational Quantum Algorithms

Curator's Take

This research closes a theoretical gap that has lingered over variational quantum algorithms since their inception, proving that the widely used Rotosolve optimization method actually converges rather than merely working well in practice. The result matters because Rotosolve has become a go-to algorithm for training parameterized quantum circuits in applications from quantum machine learning to quantum sensing, yet until now it lacked the convergence guarantees that classical optimization methods enjoy. Beyond proving convergence, the authors derive explicit convergence rates and highlight Rotosolve's advantages of being hyperparameter-free while implicitly exploiting higher-order derivative information, making it especially valuable for quantum applications where gradient estimation is expensive. This theoretical foundation should give researchers greater confidence in applying Rotosolve to real quantum computing problems and may inspire new coordinate-descent approaches tailored to quantum optimization landscapes.

— Mark Eatherly

Summary

In this paper, we resolve an open question in the field of optimization algorithms for training parametrized quantum circuits: Does the popular Rotosolve algorithm converge? Until now, interpolation-based coordinate descent methods such as Rotosolve have mostly been treated as heuristics, lacking any formal convergence guarantees. We rigorously analyze Rotosolve, and show that it converges to $\varepsilon$-stationary points if the optimization landscape is non-convex and smooth, and to $\varepsilon$-suboptimal points if the objective function additionally obeys the Polyak-Łojasiewicz (PL) condition. Further, we derive explicit worst-case rates of convergence in the finite quantum measurement regime. These rates are contrasted against those from a similar coordinate-based method: Randomized Coordinate Descent (RCD). Although in the worst case their rates are, prima facie, equivalent, we present arguments for a more nuanced comparison between the two. We highlight that Rotosolve is hyperparameter-free, and implicitly uses first and second derivatives in its updates. Finally, we supplement our theoretical findings with numerical experiments from quantum machine learning, comparing the performance of Rotosolve against RCD, Stochastic Gradient Descent, Simultaneous Perturbation Stochastic Approximation, and Randomized Stochastic Gradient-Free methods.
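To make the algorithm concrete, below is a minimal sketch of one Rotosolve sweep as it is commonly described in the literature, not the authors' own implementation: when every gate is generated by an operator with eigenvalues $\pm 1$, the cost restricted to a single parameter is a sinusoid $A\sin(\theta + B) + C$, so three circuit evaluations determine it exactly and the coordinate can be set to its closed-form minimizer without any learning rate. The function names (`rotosolve_step`, `toy_cost`) and the toy landscape are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def rotosolve_step(cost, theta):
    """One full Rotosolve sweep: minimize the cost in each coordinate in turn.

    Assumes the cost is a sinusoid A*sin(x + B) + C in every single parameter,
    which holds for circuits built from gates exp(-i * x * G / 2) whose
    generator G has eigenvalues +-1. `cost` stands in for a (possibly noisy)
    expectation-value estimator.
    """
    theta = np.array(theta, dtype=float)
    for d in range(len(theta)):
        phi = theta[d]
        shifted = theta.copy()
        # Three evaluations fully determine the sinusoid in coordinate d.
        f_0 = cost(theta)
        shifted[d] = phi + np.pi / 2
        f_plus = cost(shifted)
        shifted[d] = phi - np.pi / 2
        f_minus = cost(shifted)
        # Closed-form argmin of the reconstructed sinusoid: no learning rate,
        # i.e. the update is hyperparameter-free.
        theta[d] = phi - np.pi / 2 - np.arctan2(
            2.0 * f_0 - f_plus - f_minus, f_plus - f_minus
        )
        # Wrap the parameter back into (-pi, pi].
        theta[d] = (theta[d] + np.pi) % (2 * np.pi) - np.pi
    return theta


if __name__ == "__main__":
    # Toy landscape with the required per-coordinate sinusoidal form.
    def toy_cost(t):
        return np.sin(t[0] + 0.3) + 0.5 * np.sin(t[1] - 1.1) + 2.0

    params = np.array([0.2, -0.4])
    for sweep in range(3):
        params = rotosolve_step(toy_cost, params)
        print(sweep, toy_cost(params))
```

Each sweep costs three circuit evaluations per coordinate, and because every coordinate jumps directly to the minimizer of the reconstructed sinusoid, the update implicitly uses curvature information that a plain gradient step would have to estimate separately.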