Step function-constrained optimization
\begin{equation} \min_{\mathbf{x}\in\mathbb{R}^{K}} ~~ f(\mathbf{x}),~~~~ \mbox{s.t.}~~ \parallel\mathbf{G}(\mathbf{x})\parallel_0^+\leq s,~\mathbf{x}\in \Omega \tag{SFCO} \end{equation}
where $\mathbf{G}(\mathbf{x})\in\mathbb{R}^{M \times N}$ with each the ($i,j$)th entry being $G_{ij}(\mathbf{x})$, functions $f:\mathbb{R}^{K}\rightarrow \mathbb{R}$ and $G_{ij}:\mathbb{R}^{K}\rightarrow \mathbb{R}$ are (preferably twice) continuously differentiable, $s\ll N$ is an integer, and $\Omega\subseteq\mathbb{R}^{K}$ is a closed and convex set. For a matrix $\mathbf{Z}\in\mathbb{R}^{M \times N}$, measure $\|\mathbf{Z}\|_0^+$ counts the number of its columns that have positive entries, namely, \begin{equation}\|\mathbf{Z}\|_0^+= \mathrm{step}\Big(\max_{i=1,\ldots,M} Z_{i1}\Big)+\cdots+\mathrm{step}\Big(\max_{i=1,\ldots,M} Z_{iN}\Big)\nonumber\end{equation} Here, $\mathrm{step}(t)$ is the step function (or 0/1 loss function) defined by $\mathrm{step}(t)=1$ if $t>0$ and $\mathrm{step}(t)=0$ otherwise. When $M=1$, matrix $\mathbf{Z}$ reduces to a vector $\mathbf{z}\in\mathbb{R}^{N}$, let $\mathbf{z}_+=$ $(\max\{0,z_1\}$ $\ldots$ $\max\{0,z_N\})^\top$ and $\parallel\mathbf{z}\parallel_0$ denote its L0 norm that counts the number of its nonzero entries. As a result, \begin{equation*}\|\mathbf{z}\|_0^+= \mathrm{step}(z_1)+\cdots+\mathrm{step}(z_N)=\|\mathbf{z}_+\|_0\end{equation*} In the present package, $\Omega$ can be one of the following sets:
- A sphare: $\lbrace\mathbf{x}: \parallel\mathbf{x}\parallel\leq r\rbrace$, where $r>0$:
- A halfspace: $\lbrace\mathbf{x}: \mathbf{a}^T\mathbf{x}\leq b\rbrace$, where $\mathbf{a}\in\mathbb{R}^{K}$ and $b\in\mathbb{R}$
- A hyperplane: $\lbrace\mathbf{x}: \mathbf{A} \mathbf{x}= \mathbf{b}\rbrace$, where $\mathbf{A}\in\mathbb{R}^{S\times K}$ and $ \mathbf{b}\in\mathbb{R}^{S}$
- A box: $\lbrace\mathbf{x}: l\leq x_i \leq u, i=1,\ldots,K\rbrace$, where $l \leq u$ can be $-\infty$ and $+\infty$. Hence, when $l=-\infty$ and $u=+\infty$, set $\Omega=\mathbb{R}^{K}$, which means no such constraint; when $l=0$ and $u=+\infty$, set $\Omega$ is a non-negative cone
Package - SFCOpack-Matlab or SFCOpack-Python provides 1 solver from the following paper:
SNSCO - S Zhou, L Pan, N Xiu, and G Li, A 0/1 constrained optimization solving sample average approximation for chance constrained programming, MOR, 2024.
Note that $\texttt{SNSCO}$ is a second-order method, requiring the objective, gradient, and Hessian of functions $f(\mathbf{x})$ and $\mathbf{G}(\mathbf{x})$. Based on Matlab syntax (similar to Python syntax), below is an example of how to define these functions for the solver to solve a recovery problem. The problem has $f(\mathbf{x})=0.5\parallel\mathbf{B}\mathbf{x}-\mathbf{d}\parallel^2$ and $\mathbf{G}_{ij}(\mathbf{x})= \langle\mathbf{A}_{:(j-1)M+i}, \mathbf{x}\rangle-C_{ij}$. The two MATLAB code snippets below respectively define the function values, gradients, and Hessians of $f$ and $\mathbf{G}$. For example, in the inputs of the function handle $\texttt{FuncfRecovery}$, $\texttt{x}$ is the variable, while($\texttt{B}$, $\texttt{d}$, $\texttt{BtB}$)are the data involved in the function $f(\mathbf{x})$. When calling the function $\texttt{FuncfRecovery}$, users need to define these data.
function [objef, gradf, hessf] = FuncfRecovery(x,B,d,BtB)
% This code provides information for an objective function
% f(x) = 0.5*||Bx-d||^2
% x is the variable
% (B,d,BtB) are data and need to be input
% B\in R^{m by n}, d\in R^{m by 1}, and BtB = B'*B
Bxd = B*x-d;
objef = norm(Bxd)^2/2; % objective
gradf = (Bxd'*B)'; % gradient
hessf = BtB; % Hessian
clear B d BtB
end
function [G,gradG,gradGW, hessGW] = FuncGRecovery(x,W,Ind,A,C,K,M,N)
% This code provides information for function G(x): R^K -> R^{M x N}
% (x,W,Ind) are variables
% (A,C,K,M,N) are data and parameters, and need to be input
% For each i=1,...,M and j=1,...,N:
% G_{ij}(x) = <A_{:,(j-1)*M+i}, x>^2 - C_{ij}
% where A_{:,(j-1)*M+i} is the ((j-1)*M+i)th column of A
% A \in R^{K by M*N} and C \in R^{M by N}
G0 = x'*A;
G = reshape(G0,M,N);
G = G.^2-C; % function
if nargout > 1
if isempty(Ind)
gradG = [];
gradGW = zeros(K,1); % gradient
hessGW = zeros(K,1); % Hessian
else
AInd = A(:,Ind);
gradG = AInd.*G0(Ind);
WInd = W(Ind);
gradGW = gradG*reshape(WInd,length(Ind),1); % gradient
hessGW = AInd*diag(WInd)*AInd.'; % Hessian
end
end
clear A C K M N
end
After defining functions $f$ and $\mathbf{G}$, users need to select constraint set $\Omega$. Currently, $\Omega$ can be chosen from {'$\texttt{Ball}$', '$\texttt{Box}$', '$\texttt{Halfspace}$', '$\texttt{Hyperplane}$'}. Each set involves corresponding parameters. For example, a box constraint requires specifying lower bound $l$ and upper bound $u$: when $l=-\infty$ and $u=+\infty$, the box constraint reduces to an unconstrained case; when $=0$ and $u=+\infty$, the box constraint becomes a nonnegativity constraint. For other types of $\Omega$, the relevant parameters are described in the model introduction above. Once $\Omega$ is chosen, $\texttt{SNSCO}$ can be invoked to solve the problem. The following Matlab code demonstrates the process of solving the recovery problem.
% demon recovery problems
clc; close all; clear all; addpath(genpath(pwd));
K = 10;
M = 10;
N = 100;
alpha = 0.05;
s = ceil(alpha*N);
test = 2; % Omega = {x|norm(x) <= r} if test = 1
% Omega = [lb,ub]^n if test = 2
% Omega = {x|a'*x <= b} if test = 3
% Omega = {x|Ax = b} if test = 4
sets = {'Ball','Box','Halfspace','Hyperplane'};
switch sets{test}
case 'Ball'
input1 = 2;
input2 = [];
xopt = randn(K,1);
xopt = input1/norm(xopt)*xopt;
case 'Box'
input1 = -2; % can be -inf
input2 = 2; % can be inf
xopt = max(-10,input1)+(min(10,input2)-max(-10,input1))*rand(K,1);
case 'Halfspace'
xopt = rand(K,1);
input1 = randn(K,1);
input2 = sum(input1.*xopt)+rand(1);
case 'Hyperplane'
xopt = randn(K,1);
input1 = randn(ceil(0.5*K),K);
input2 = input1*xopt;
end
% Generate data and define f and G
B = randn(ceil(0.25*K),K)/sqrt(K);
d = B*xopt;
BtB = B'*B;
xi = randn(K,M,N);
T = randperm(N,s);
Mat = rand(M,N);
D = (Mat>=0.5) .* rand(M,N);
D(:,T) = (Mat(:,T)<1/3).*rand(M,nnz(T))-(Mat(:,T)>=2/3).*rand(M,nnz(T));
A = reshape(xi,K,M*N);
C = (squeeze(sum(xi .* xopt, 1))).^2 + D;
Funcf = @(x)FuncfRecovery(x,B,d,BtB); % f(x) = 0.5||Bx-d||^2
FuncG = @(x,W,J)FuncGRecovery(x,W,J,A,C,K,M,N); % G(x)_ij = <A_ij,x>^2-C_ij
% set parameters and call the solver
if alpha > 0.01
pars.tau0 = 0.05+0.05*(test>4);
else
pars.tau0 = 0.01;
pars.thd = 1e-1*(test==4)+1e-2*(test~=4);
end
out = SNSCO(K,M,N,s,Funcf,FuncG,sets{test},input1,input2,pars);
fprintf(' Relative error: %7.3e \n', norm(out.x-xopt)/norm(xopt));
The inputs and outputs of the Matlab version of $\texttt{SNSCO}$ are detailed below, with analogous specifications for the Python version. Inputs ($\texttt{K}$, $\texttt{M}$, $\texttt{N}$, $\texttt{s}$, $\texttt{Funcf}$, $\texttt{FuncG}$, $\texttt{FeaSet}$, $\texttt{input1}$, $\texttt{input2}$) are required. The parameters in $\texttt{pars}$ are optional, but setting certain ones may improve the solver's performance and the solution quality. It should be noted that $\texttt{FeaSet}$ can only be chosen from {'$\texttt{Ball}$', '$\texttt{Box}$', '$\texttt{Halfspace}$', '$\texttt{Hyperplane}$'}. For each set, the solver requires two inputs, $\texttt{input1}$ and $\texttt{input2}$. If an input is not needed, it can be set to empty $\texttt{[ ]}$. For example, when $\texttt{FeaSet}$ = '$\texttt{Ball}$', one may set $\texttt{input1}$ = 2 and $\texttt{input2}$ = $\texttt{[ ]}$ to represent a ball constraint with radius 2. When $\texttt{FeaSet}$ = '$\texttt{Box}$', one may set $\texttt{input1}$ = 0 and $\texttt{input2}$ = $\texttt{Inf}$ to represent a nonnegativity constraint.
function out = SNSCO(K,M,N,s,Funcf,FuncG,FeaSet,input1,input2,pars)
% This solver solves 0/1 constrained optimization in the following form:
%
% min_{x\in\R^K} f(x), s.t. \| G(x) \|^+_0<=s, x\in Omega
%
% where
% f(x) : \R^K --> \R
% G(x) : \R^K --> \R^{M-by-N}
% s << N
% \|Z\|^+_0 counts the number of columns with positive maximal values
% Omega is a closed and convex set
% -------------------------------------------------------------------------
% Inputs:
% K : Dimnesion of variable x (REQUIRED)
% M : Row number of G(x) (REQUIRED)
% N : Column number of G(x) (REQUIRED)
% s : An integer in [1,N), typical choice ceil(0.01*N) (REQUIRED)
% Funcf : Function handle of f(x) (REQUIRED)
% FuncG : Function handle of G(x) (REQUIRED)
% FeaSet: Feasible set for x, must be one of: (REQUIRED)
% 'Box' [lb,ub]^K
% 'Ball' {x|norm(x) <= r}
% 'Halfspace' {x|a'*x <= b}
% 'Hyperplane' {x|Ax = b}
% Default: R^K
% input1: A parameter related to FeasSet (REQUIRED)
% input2: A parameter related to FeasSet (REQUIRED)
% pars : All parameters are OPTIONAL
% pars.x0 -- Initial point (default:ones(K,1))
% pars.tau0 -- A vector of a number of \tau0 (default 1)
% e.g.,pars.tau0=logspace(log10(.5),log10(1.75),20)
% pars.tol -- Tolerance of halting condition (default 1e-6*M*N)
% pars.maxit -- Maximum number of iterations (default 2000)
% pars.disp -- Show results or not at each step (default 1)
% -------------------------------------------------------------------------
% Outputs:
% out.x: Solution x
% out.obj: Objective function value f(x)
% out.G: Function value of G(x)
% out.time: CPU time
% out.iter: Number of iterations
% out.error: Error
% out.Error: Error of every iteration
% -------------------------------------------------------------------------
% Send your comments and suggestions to <<< slzhou2021@163.com >>>
% WARNING: Accuracy may not be guaranteed!!!!!
% -------------------------------------------------------------------------