
Adaptive Gradient Descent Method for e.g., BCG #549

Merged · 64 commits · Mar 4, 2025

Commits (changes from all commits)
433a3ff
adaptive gradient descent method for various things
pokutta Feb 17, 2025
7558164
added to runtests.jl as well
pokutta Feb 17, 2025
9ad4a78
minor / forgotten update
pokutta Feb 17, 2025
0e9aaca
added second example
pokutta Feb 18, 2025
2841507
added proximal version including tests
pokutta Feb 18, 2025
54ecf34
cleaned up files
pokutta Feb 18, 2025
b4f07ca
minor
pokutta Feb 18, 2025
e0e3f58
Tighten threshold for lazy FW vertex. (#550)
dhendryc Feb 18, 2025
5e37019
dummy commit to force CI
pokutta Feb 18, 2025
0039e45
added more projections and tests for those
pokutta Feb 18, 2025
9fb11b2
check code
pokutta Feb 19, 2025
0e3cc78
updated example!
pokutta Feb 19, 2025
f91f635
generalized with all prox from ProximalOperators
matbesancon Feb 19, 2025
b85379a
specific methods
matbesancon Feb 19, 2025
dfa08fb
replace nameof
matbesancon Feb 19, 2025
18813b7
adapt one test
matbesancon Feb 19, 2025
8f82386
adapt one test
matbesancon Feb 19, 2025
5cfff5f
merge master
matbesancon Feb 20, 2025
0729c3d
addapt API
matbesancon Feb 21, 2025
d3e079e
typo
matbesancon Feb 21, 2025
60ed116
PO dep in test
matbesancon Feb 21, 2025
b225a45
correct double packagwe
matbesancon Feb 21, 2025
3188df9
naming
matbesancon Feb 21, 2025
47b8ac0
rename prox kw
matbesancon Feb 21, 2025
b9386ed
rename prox kw
matbesancon Feb 21, 2025
159cef8
max iteration keyword
matbesancon Feb 21, 2025
3327cf1
format
matbesancon Feb 21, 2025
c58f2bd
format examples
matbesancon Feb 21, 2025
6dcb5b3
Merge branch 'master' of github.com:ZIB-IOL/FrankWolfe.jl into AdaGD
matbesancon Feb 21, 2025
ca889ad
fixed the test issues
pokutta Feb 22, 2025
ed951e0
fix tests
matbesancon Feb 22, 2025
3b17f0b
simplify to linf ball
matbesancon Feb 22, 2025
40f36f0
add docs page
matbesancon Feb 23, 2025
ccfd157
remove excessive comment
matbesancon Feb 23, 2025
116067b
fix type
matbesancon Feb 23, 2025
55d627b
relax test and more idiomatic gradient call
matbesancon Feb 23, 2025
9586a74
stable RNG
matbesancon Feb 24, 2025
c4d93ac
loosen tol slightly
matbesancon Feb 24, 2025
05372bc
loosen tol slightly
matbesancon Feb 24, 2025
f8bb6db
loosen gradient, remove verbose
matbesancon Feb 24, 2025
9802981
format
matbesancon Feb 24, 2025
9666d07
loosen iter count
matbesancon Feb 25, 2025
423be27
fix conflict
matbesancon Feb 25, 2025
9cf53fb
tols
matbesancon Feb 25, 2025
97be149
fix seed call
matbesancon Feb 25, 2025
1c4e70e
add tolerance on inequalities
matbesancon Feb 25, 2025
7fa25eb
loosen bound SDP
matbesancon Feb 25, 2025
04e10f9
restrict for exact lmo, minor adjustments
matbesancon Feb 25, 2025
8378b2e
rename functions in gradient file to avoid overwritting
matbesancon Feb 25, 2025
54ca33c
prints for action test
matbesancon Feb 25, 2025
9a0b837
typo
matbesancon Feb 26, 2025
65c1571
fix dicg max step in unit simplex
matbesancon Feb 26, 2025
66c617a
prints for action test
matbesancon Feb 26, 2025
9136b92
loosen tol
matbesancon Feb 26, 2025
955250f
adapt to RNG
matbesancon Feb 26, 2025
0fc0635
adapt to RNG
matbesancon Feb 26, 2025
c10af73
Read and clean
sebastiendesignolle Mar 1, 2025
d6c9335
Make @testset string uniform to length 64
sebastiendesignolle Mar 1, 2025
70433ea
Avoid isnothing for consistency
sebastiendesignolle Mar 1, 2025
8d0cb02
tolerance
matbesancon Mar 4, 2025
5ab4f75
Revert "Make @testset string uniform to length 64"
matbesancon Mar 4, 2025
04637d0
cleanup spacing in test
matbesancon Mar 4, 2025
1c009ca
Merge branch 'master' of github.com:ZIB-IOL/FrankWolfe.jl into AdaGD
matbesancon Mar 4, 2025
6e3145c
action version
matbesancon Mar 4, 2025
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -28,7 +28,7 @@ jobs:
with:
version: ${{ matrix.version }}
arch: ${{ matrix.arch }}
- uses: actions/cache@v1
- uses: actions/cache@v4
env:
cache-name: cache-artifacts
with:
2 changes: 2 additions & 0 deletions .gitignore
@@ -28,3 +28,5 @@ cc/*
docs/src/contributing.md
docs/src/index.md
.DS_Store
examples/heavy_ball_tests/data/*
src/heavyball_dirty.jl
15 changes: 13 additions & 2 deletions Project.toml
@@ -11,11 +11,18 @@ LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
MathOptInterface = "b8f27783-ece8-5eb3-8dc8-9495eed66fee"
Printf = "de0858da-6303-5e67-8744-51eddeeeb8d7"
ProgressMeter = "92933f4c-e287-5a05-a399-4b506db050ca"
ProximalCore = "dc4f5ac2-75d1-4f31-931e-60435d74994b"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
Setfield = "efcf1570-3423-57d1-acb7-fd33fddbac46"
SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
TimerOutputs = "a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f"

[weakdeps]
ProximalOperators = "a725b495-10eb-56fe-b38b-717eba820537"

[extensions]
FrankWolfeProxExt = "ProximalOperators"

[compat]
Arpack = "0.5"
DoubleFloats = "1.1"
@@ -27,7 +34,10 @@ MultivariatePolynomials = "0.5"
PlotThemes = "3"
Plots = "1.10"
ProgressMeter = "1.4"
ProximalCore = "0.1"
ProximalOperators = "0.16"
Setfield = "1"
StableRNGs = "1"
TimerOutputs = "0.5"
ZipFile = "0.9.4"
julia = "1"
@@ -53,11 +63,12 @@ MultivariatePolynomials = "102ac46a-7ee4-5c85-9060-abc95bfdeaa3"
PlotThemes = "ccf2f8ad-2431-5c83-bf29-c5338b663b6a"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
Polyhedra = "67491407-f73d-577b-9b50-8179a7c68029"
ProximalOperators = "a725b495-10eb-56fe-b38b-717eba820537"
ReverseDiff = "37e2e3b7-166d-5795-8a7a-e32c996b4267"
StableRNGs = "860ef19b-820b-49d6-a774-d7a799459cd3"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
Tullio = "bc48ee85-29a4-5162-ae0b-a64e1601d4bc"
ZipFile = "a5390f91-8eb1-5f08-bee0-b1d1ffed6cea"
StableRNGs = "860ef19b-820b-49d6-a774-d7a799459cd3"

[targets]
test = ["CSV", "Combinatorics", "DataFrames", "Distributions", "DoubleFloats", "DynamicPolynomials", "FiniteDifferences", "ForwardDiff", "GLPK", "HiGHS", "JSON", "JuMP", "LaTeXStrings", "MAT", "MultivariatePolynomials", "Plots", "PlotThemes", "Polyhedra", "ReverseDiff", "ZipFile", "Test", "Tullio", "Clp", "Hypatia", "StableRNGs"]
test = ["CSV", "Combinatorics", "DataFrames", "Distributions", "DoubleFloats", "DynamicPolynomials", "FiniteDifferences", "ForwardDiff", "GLPK", "HiGHS", "JSON", "JuMP", "LaTeXStrings", "MAT", "MultivariatePolynomials", "Plots", "PlotThemes", "Polyhedra", "ProximalOperators", "ReverseDiff", "StableRNGs", "ZipFile", "Test", "Tullio", "Clp", "Hypatia"]
9 changes: 9 additions & 0 deletions docs/src/reference/5_gradient_descent.md
@@ -0,0 +1,9 @@
# Adaptive Proximal Gradient Descent Methods

This package implements several variants of adaptive proximal gradient descent methods.
Their primary use is internal to FrankWolfe.jl, specifically for the Blended Conditional Gradients algorithm, but they can also be used as standalone algorithms.

```@autodocs
Modules = [FrankWolfe]
Pages = ["gradient_descent.jl"]
```
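
As background for the new docs page: these methods combine a gradient step on the smooth part `f` with a proximal step on a (possibly nonsmooth) part `g`, choosing the step size adaptively from observed gradients rather than from a known Lipschitz constant. A sketch of the generic update and of one well-known adaptive rule in the style of Malitsky and Mishchenko, shown for orientation only and not necessarily the exact rule implemented in this PR:

```math
x_{k+1} = \operatorname{prox}_{\lambda_k g}\!\left(x_k - \lambda_k \nabla f(x_k)\right),
\qquad
\lambda_k = \min\left\{ \sqrt{1+\theta_{k-1}}\,\lambda_{k-1},\;
\frac{\lVert x_k - x_{k-1}\rVert}{2\,\lVert \nabla f(x_k) - \nabla f(x_{k-1})\rVert} \right\},
\quad \theta_k = \frac{\lambda_k}{\lambda_{k-1}}.
```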
83 changes: 83 additions & 0 deletions examples/ada_gradient.jl
@@ -0,0 +1,83 @@
using FrankWolfe
using LinearAlgebra
using Random

max_iter = Int(1e5)
print_iter = max_iter // 100
epsilon = 1e-10

"""
Simple quadratic function f(x) = 1/2 * x'Qx + b'x
"""
function quadratic_oracle(x, Q, b)
return 0.5 * dot(x, Q, x) + dot(b, x)
end

"""
Gradient of quadratic function ∇f(x) = Qx + b
"""
function quadratic_gradient!(storage, x, Q, b)
mul!(storage, Q, x)
storage .+= b
return storage
end

# Set random seed for reproducibility.
Random.seed!(42)

# Problem dimension
n = 10000

# Generate positive definite Q matrix and random b vector
Q = rand(n, n)
Q = Q' * Q + I # Make Q positive definite
b = rand(n)

# Create objective function and gradient
f(x) = quadratic_oracle(x, Q, b)
grad!(storage, x) = quadratic_gradient!(storage, x, Q, b)

# Initial point
x0 = 10 * rand(n)

println("Testing Adaptive Gradient Descent (variant 1)")
println("============================================")

x1, f1, hist1 = FrankWolfe.adaptive_gradient_descent(
f,
grad!,
x0;
step0=0.1,
max_iteration=max_iter,
print_iter=print_iter,
epsilon=epsilon,
verbose=true,
)

println("\nFinal objective value: $(f1)")
println("Final gradient norm: $(norm(grad!(similar(x0), x1)))")

println("\nTesting Adaptive Gradient Descent (variant 2)")
println("============================================")

x2, f2, hist2 = FrankWolfe.adaptive_gradient_descent2(
f,
grad!,
x0;
step0=0.1,
max_iteration=max_iter,
print_iter=print_iter,
epsilon=epsilon,
verbose=true,
)

println("\nFinal objective value: $(f2)")
println("Final gradient norm: $(norm(grad!(similar(x0), x2)))")

# Compare the two methods
println("\nComparison")
println("==========")
println("Method 1 final objective: $(f1)")
println("Method 2 final objective: $(f2)")
println("Objective difference: $(abs(f1 - f2))")
println("Solution difference norm: $(norm(x1 - x2))")
95 changes: 95 additions & 0 deletions examples/ada_gradient_conditioned.jl
@@ -0,0 +1,95 @@
using FrankWolfe
using LinearAlgebra
using Random

max_iter = Int(1e5)
print_iter = max_iter // 20
epsilon = 1e-10

n = 1000
s = 42
Random.seed!(s)

# Create test problem with controlled condition number
const condition_number = 10000.0 # Much better than random conditioning
const matrix = begin
# Create orthogonal matrix
Q = qr(randn(n, n)).Q
# Create diagonal matrix with controlled condition number
λ_max = 1.0
λ_min = λ_max / condition_number
Λ = Diagonal(range(λ_min, λ_max, length=n))
# Final matrix with controlled conditioning
Q * sqrt(Λ)
end
const hessian = transpose(matrix) * matrix
const linear = rand(n)

f(x) = dot(linear, x) + 0.5 * transpose(x) * hessian * x

function grad!(storage, x)
return storage .= linear + hessian * x
end

const L = eigmax(hessian)

# Compute optimal solution using direct solve for testing
const x_opt = -hessian \ linear
const f_opt = f(x_opt)

println("\nTesting adaptive gradient descent algorithms...\n")
println("Test instance statistics:")
println("------------------------")
println("Dimension n: $n")
println("Lipschitz constant L: $L")
println("Optimal objective value f*: $f_opt")
println("Optimal solution norm: $(norm(x_opt))")
println("Problem condition number: $(eigmax(hessian)/eigmin(hessian))")
println()

########## SOLVING

# Initial point
x0 = 10 * rand(n)

println("Testing Adaptive Gradient Descent (variant 1)")
println("============================================")

x1, f1, hist1 = FrankWolfe.adaptive_gradient_descent(
f,
grad!,
x0;
step0=0.1,
max_iteration=max_iter,
print_iter=print_iter,
epsilon=epsilon,
verbose=true,
)

println("\nFinal objective value: $(f1)")
println("Final gradient norm: $(norm(grad!(similar(x0), x1)))")

println("\nTesting Adaptive Gradient Descent (variant 2)")
println("============================================")

x2, f2, hist2 = FrankWolfe.adaptive_gradient_descent2(
f,
grad!,
x0;
step0=0.1,
max_iteration=max_iter,
print_iter=print_iter,
epsilon=epsilon,
verbose=true,
)

println("\nFinal objective value: $(f2)")
println("Final gradient norm: $(norm(grad!(similar(x0), x2)))")

# Compare the two methods
println("\nComparison")
println("==========")
println("Method 1 final objective: $(f1)")
println("Method 2 final objective: $(f2)")
println("Objective difference: $(abs(f1 - f2))")
println("Solution difference norm: $(norm(x1 - x2))")
6 changes: 6 additions & 0 deletions ext/FrankWolfeProxExt.jl
@@ -0,0 +1,6 @@
module FrankWolfeProxExt

using FrankWolfe
using ProximalOperators

end
3 changes: 3 additions & 0 deletions src/FrankWolfe.jl
@@ -9,6 +9,7 @@ using SparseArrays: spzeros, SparseVector
import SparseArrays
import Random
using Setfield: @set
import ProximalCore

import MathOptInterface
const MOI = MathOptInterface
@@ -49,6 +50,8 @@ include("dicg.jl")
include("tracking.jl")
include("callback.jl")

include("gradient_descent.jl")

# collecting most common data types etc and precompile
# min version req set to 1.5 to prevent stalling of julia 1
@static if VERSION >= v"1.5"
2 changes: 1 addition & 1 deletion src/afw.jl
@@ -204,7 +204,7 @@ function away_frank_wolfe(

# compute current iterate from active set
x = get_active_set_iterate(active_set)
if isnothing(momentum)
if momentum === nothing
grad!(gradient, x)
else
grad!(gtemp, x)
2 changes: 1 addition & 1 deletion src/blended_cg.jl
@@ -493,7 +493,7 @@ function minimize_over_convex_hull!(
tolerance,
)
#Early exit if we have detected that the strong-Wolfe gap is below the desired tolerance while building the reduced problem.
if isnothing(M)
if M === nothing
return 0
end
T = eltype(M)