Fix Initial condition on Lanczos #391

Merged (4 commits) on Jun 6, 2024

Changelog.md (9 changes: 8 additions & 1 deletion)
@@ -5,7 +5,14 @@ All notable Changes to the Julia package `Manopt.jl` will be documented in this
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.4.63] unreleased

## [0.4.64] June 4, 2024

### Changed

* Fixed a bug where Lanczos produced `NaN`s when started exactly at a minimizer, since the iteration divides by the gradient norm.

## [0.4.63] May 11, 2024

### Added

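As an aside, the failure mode described in the changelog entry is easy to reproduce outside the solver; the following plain-Julia snippet (not part of the diff) shows how normalizing an exactly zero gradient yields `NaN`:

```julia
using LinearAlgebra

X = [0.0, 0.0]    # an exactly zero gradient, as at a minimizer
q = X / norm(X)   # 0.0 / 0.0 elementwise
all(isnan, q)     # true: the "normalized" direction is all NaN
```
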
src/solvers/Lanczos.jl (4 changes: 4 additions & 0 deletions)
@@ -255,6 +255,10 @@ function (c::StopWhenFirstOrderProgress)(
    dmp::AbstractManoptProblem{<:TangentSpace}, ls::LanczosState, i::Int
)
    if (i == 0)
        if norm(ls.X) == zero(eltype(ls.X))
            c.reason = "The norm of the gradient is zero."
            return true
        end
        c.reason = ""
        return false
    end
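To see what the added guard accomplishes, here is a minimal sketch with a hypothetical `ToyLanczosState` (only the gradient field `X` is modeled; this is not the Manopt.jl implementation): without the early return, the first Lanczos vector `X / norm(X)` would consist entirely of `NaN`s.

```julia
using LinearAlgebra

# Hypothetical stand-in for the solver state; X holds the current gradient.
struct ToyLanczosState
    X::Vector{Float64}
end

# Mirrors the guard above: at the initial check (i == 0), a gradient with
# norm exactly zero means we are already at a first-order critical point,
# so the criterion fires before the zero vector is ever normalized.
function first_order_progress(ls::ToyLanczosState, i::Int)
    i == 0 && norm(ls.X) == zero(eltype(ls.X)) && return true
    return false
end

first_order_progress(ToyLanczosState([0.0, 0.0]), 0)  # true: stop immediately
```
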
test/solvers/test_adaptive_regularization_with_cubics.jl (34 changes: 24 additions & 10 deletions)
@@ -1,14 +1,16 @@
using Manifolds, ManifoldsBase, Manopt, Test, Random
using LinearAlgebra: I, tr, Symmetric
using LinearAlgebra: I, tr, Symmetric, diagm, eigvals, eigvecs

include("../utils/example_tasks.jl")

@testset "Adaptive Regularization with Cubics" begin
Random.seed!(42)
n = 8
k = 3
A = Symmetric(randn(n, n))
A = Symmetric(diagm(0 => 1.0:8.0, 1 => ones(7), -1 => ones(7)))
M = Grassmann(n, k)
f_min = -0.5 * sum(eigvals(A)[(n - k + 1):n])
p_min = eigvecs(A)[:, (n - k + 1):n]

f(M, p) = -0.5 * tr(p' * A * p)
grad_f(M, p) = -A * p + p * (p' * A * p)
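
A note on the test construction above: replacing the random `A` with a fixed tridiagonal matrix makes the optimum deterministic. For `f(M, p) = -0.5 * tr(p' * A * p)` on the Grassmann manifold, the minimum is attained on the span of the `k` leading eigenvectors, which the following standalone check (plain `LinearAlgebra`, mirroring the definitions above) confirms:

```julia
using LinearAlgebra

n, k = 8, 3
A = Symmetric(diagm(0 => 1.0:8.0, 1 => ones(7), -1 => ones(7)))

p_min = eigvecs(A)[:, (n - k + 1):n]           # k leading eigenvectors (eigvals are ascending)
f_min = -0.5 * sum(eigvals(A)[(n - k + 1):n])  # the minimal cost value

# The eigenvector block attains the optimal value ...
@assert isapprox(-0.5 * tr(p_min' * A * p_min), f_min; atol=1e-12)
# ... and the projected gradient -A*p + p*(p'*A*p) vanishes there.
@assert norm(-A * p_min + p_min * (p_min' * A * p_min)) < 1e-10
```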
@@ -133,23 +135,24 @@ include("../utils/example_tasks.jl")
p1 = adaptive_regularization_with_cubics(
M, f, grad_f, Hess_f, p0; θ=0.5, σ=100.0, retraction_method=PolarRetraction()
)
# Second run with random p0
@test abs(f(M, p1) - f_min) < 1e-14
@test isapprox(M, p_min, p1)
Random.seed!(42)
p2 = adaptive_regularization_with_cubics(
M, f, grad_f, Hess_f; θ=0.5, σ=100.0, retraction_method=PolarRetraction()
)
@test isapprox(M, p1, p2)
@test isapprox(M, p_min, p2)
# Third with approximate Hessian
p3 = adaptive_regularization_with_cubics(
M, f, grad_f, p0; θ=0.5, σ=100.0, retraction_method=PolarRetraction()
)
@test isapprox(M, p1, p3)
@test isapprox(M, p_min, p3)
# Fourth with approximate Hessian and random point
Random.seed!(36)
p4 = adaptive_regularization_with_cubics(
M, f, grad_f; θ=0.5, σ=100.0, retraction_method=PolarRetraction()
)
@test isapprox(M, p1, p4)
@test isapprox(M, p_min, p4)
# with a large η1 to trigger the bad model case once
p5 = adaptive_regularization_with_cubics(
M,
@@ -161,20 +164,20 @@
η1=0.89,
retraction_method=PolarRetraction(),
)
@test isapprox(M, p1, p5)
@test isapprox(M, p_min, p5)

# in place
q1 = copy(M, p0)
adaptive_regularization_with_cubics!(
M, f, grad_f, Hess_f, q1; θ=0.5, σ=100.0, retraction_method=PolarRetraction()
)
@test isapprox(M, p1, q1)
@test isapprox(M, p_min, q1)
# in place with approx Hess
q2 = copy(M, p0)
adaptive_regularization_with_cubics!(
M, f, grad_f, q2; θ=0.5, σ=100.0, retraction_method=PolarRetraction()
)
@test isapprox(M, p1, q2)
@test isapprox(M, p_min, q2)

# test both in-place and allocating variants of `grad_g`
X0 = grad_f(M, p0)
@@ -204,7 +207,10 @@ include("../utils/example_tasks.jl")
return_objective=true,
return_state=true,
)
@test isapprox(M, p1, q3)
@test isapprox(M, p_min, q3)

# test that we do not get NaN if we start at the minimizer
r1 = adaptive_regularization_with_cubics(M, f, grad_f, Hess_f, p_min)
end

@testset "A short solver run on the circle" begin
@@ -220,4 +226,12 @@ include("../utils/example_tasks.jl")
)
@test fc(Mc, p0) > fc(Mc, p2)
end

@testset "Start at a point with _exactly_ gradient zero" begin
p0 = zeros(2)
M = Euclidean(2)
f2(M, p) = 0
grad_f2(M, p) = [0.0, 0.0]
@test adaptive_regularization_with_cubics(M, f2, grad_f2, p0) == p0
end
end