Abstract We consider minimizing finite-sum objective functions via Hessian-averaging based subsampled Newton methods. These methods allow for gradient inexactness and have fixed per-iteration Hessian approximation costs. The recent work (Na et al. 2023) demonstrated that Hessian averaging can be utilized to achieve fast O (k{k}) O log k k local superlinear convergence for strongly convex functions in high probability, while maintaining fixed per-iteration Hessian costs. These methods, however, require gradient exactness and strong convexity, which poses challenges for their practical implementation. To address this concern we consider Hessian-averaged methods that allow gradient inexactness via norm condition based adaptive-sampling strategies. Furthermore, to better control the error in the subsampled Hessian approximations, we utilize Hessian averaging with deterministic cyclic sampling techniques instead of random sampling, which leads to fast local superlinear convergence. We develop a comprehensive convergence theory, including global linear and sublinear convergence rates for strongly convex and nonconvex functions, respectively. Additionally, we establish an improved local superlinear convergence rate of O (1k) O 1 k. Our analysis introduces novel techniques that differ from previous probabilistic approaches. We investigate the performance of these methods on logistic regression problems, demonstrating significant improvements in convergence over similar Hessian-averaging methods that utilize stochastic sampling.
Building similarity graph...
Analyzing shared references across papers
Loading...
Thomas O’Leary-Roseberry
Raghu Bollapragada
Mathematical Programming
The University of Texas at Austin
The Ohio State University
Building similarity graph...
Analyzing shared references across papers
Loading...
O’Leary-Roseberry et al. (Tue,) studied this question.
www.synapsesocial.com/papers/69d895206c1944d70ce06110 — DOI: https://doi.org/10.1007/s10107-026-02354-0