https://math.aalto.fi/opetus/MatOhjelmistot/2019spring/Heikki/Lecture4/L4.html

Parallel computing with Matlab including GPUs

And some other means to improve performance

Science-IT, Aalto SCIP Matlab course series, 16.4.2019, Heikki Apiola and Juha Kuortti

Afternotes 18.4.

Parallel Computing Toolbox User's Guide [login MathWorks (?)]
Optimization Toolbox™ User's Guide [login MathWorks (?)]
Global Optimization Toolbox User's Guide [login MathWorks (?)]
Optimization Toolbox, help
Choose a Parallel Computing Solution

Lecture slides

Some more

Examples: Optimization


The rest: More or less organized notes ...

Notebook on miscellaneous material and links

  • https://se.mathworks.com/help/finance/improving-performance-of-monte-carlo-simulation-with-parallel-computing.html
  • https://se.mathworks.com/help/parallel-computing/parfor.html
  • https://se.mathworks.com/help/gads/multistart.html
  • https://se.mathworks.com/help/gads/example-finding-global-or-multiple-local-minima.html Multistart, sawtooth
  • https://se.mathworks.com/help/matlab/math/example-curve-fitting-via-optimization.html
  • Deploy Parallel Optimization

    If you deploy code that calls an optimization solver, and want the solver to use parallel computing, ensure that you explicitly create a parallel pool in your code. Otherwise, the deployed code can fail to run in parallel, and so run only in serial, because MATLAB Compiler™'s dependency analysis can fail to make parallel functionality available. For example, call parpool explicitly, in addition to setting the solver's UseParallel option to true.

    Testing Parallel Optimization

    To test whether a problem runs correctly in parallel:

    1. Run your problem without parallel computation to ensure that it runs correctly in serial. Make sure this is successful (gives correct results) before going to the next test.
    2. Set UseParallel to true, and ensure that there is no parallel pool, using delete(gcp). Uncheck "Automatically create a parallel pool" in Home > Parallel > Parallel Preferences so MATLAB does not create a parallel pool. Your problem then runs parfor serially, with loop iterations in reverse order from a for loop. Make sure this is successful (gives correct results) before going to the next test.
    3. Set UseParallel to true, and create a parallel pool using parpool. Unless you have a multicore processor or a network set up, you won't see any speedup. This testing simply verifies the correctness of the computations.

    Remember to call your solver using an options argument to test or use parallel functionality. (A sketch of this recipe follows the link list below.)
  • https://se.mathworks.com/help/gads/how-solvers-compute-in-parallel.html

    How Solvers Compute in Parallel

    Parallel Processing Types in Global Optimization Toolbox
  • https://se.mathworks.com/help/parallel-computing/parallel-profiler-and-code-improvement.html
  • https://se.mathworks.com/help/parallel-computing/choosing-a-parallel-computing-solution.html Good general outline, TABLE of possibilities
  • https://se.mathworks.com/help/parallel-computing/run-a-batch-job.html GOOD, illustrative batch-job guide.
  • https://se.mathworks.com/help/parallel-computing/parfor.html
  • https://math.aalto.fi/opetus/MatOhjelmistot/2016_2_syksySCI/Lectures/spmd_examplesLIVE.pdf
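
    A minimal sketch of the parallel-testing recipe above, as promised (myobjfun and the start point are placeholders, not from the source):

       % Step 1: plain serial run, no parallel options
       x0 = [0 0];                               % placeholder start point
       options = optimoptions('fmincon','UseParallel',false);
       x1 = fmincon(@myobjfun,x0,[],[],[],[],[],[],[],options);

       % Step 2: UseParallel true but no pool -> parfor runs serially
       delete(gcp('nocreate'))                   % make sure no pool exists
       options.UseParallel = true;
       x2 = fmincon(@myobjfun,x0,[],[],[],[],[],[],[],options);

       % Step 3: UseParallel true with an explicit pool
       parpool;                                  % create a pool (default size)
       x3 = fmincon(@myobjfun,x0,[],[],[],[],[],[],[],options);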

    Contents

  • Parallel computing basics
  • Optimization

    Parallel computing basics

    Optimization

    Examples

    All Global Optimization Toolbox solvers assume that the objective has one input x, where x has as many elements as the number of variables in the problem. The objective function computes the scalar value of the objective and returns it in its single output argument.

    General

    Optimization Toolbox => Tutorials => Solver based:
    https://se.mathworks.com/help/releases/R2019a/pdf_doc/optim/optim_tb.pdf USER's GUIDE
     
    1. Choose an optimization solver.
    2. Create an objective function, typically the function you want to minimize.
    3. Create constraints, if any.
    4. Set options, or use the default options.
    5. Call the appropriate solver.
    
    For a basic nonlinear optimization example, see Solve a Constrained Nonlinear Problem.
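
    As a minimal sketch of this workflow (the quadratic objective, bounds, and start point are illustrative choices, not from the guide):

       % 1. Solver: fmincon.  2. Objective: a simple quadratic.
       fun = @(x) (x(1)-1)^2 + (x(2)+2)^2;
       % 3. Constraints: bounds only.
       lb = [0 -5];  ub = [5 5];
       % 4. Options: default algorithm, brief display.
       options = optimoptions('fmincon','Display','final');
       % 5. Call the solver.
       x0 = [2 0];
       [x,fval] = fmincon(fun,x0,[],[],[],[],lb,ub,[],options)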

    fmincon syntax

    See >> help fmincon, >> doc fmincon
      x = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlincon,options) 
    
      fun: function to be minimized "objective function"
      x0: starting point: vector or n-column matrix
      A*x <= b:  Linear inequality constraint, A matrix, b vector
      Aeq*x = beq: Linear equality constraint Aeq matrix, beq vector
      lb,ub : Lower bound, Upper bound: scalars, vectors or matrices
      nonlincon: Nonlinear constraints: function
      >>options=optimoptions('fmincon') % Creates struct with defaults...
                                        % to be modified according to needs.
    
    Use empty brackets [] for missing arguments.

    The objective function fun for all optimization routines is of the form

    (1) fun=@(x) f(x(1),x(2), ..., x(n)).

    Thus fun is a function of one vector argument. The input x can also be an $m\times n$ matrix. Therefore the definition can also be given in the vectorized form:

    (2) fun=@(x) f(x(:,1),x(:,2),...,x(:,n)).

    Both forms are equally good for the optimization routine, but the latter can also be used for plotting, tabulation, and experimentation with several starting points, etc.
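
    For instance (a small sketch with an arbitrary two-variable function, not from the source):

       f = @(x,y) x.^2 + y.^2;          % vectorized in both variables
       fun1 = @(x) f(x(1),x(2));        % form (1): x is one point (a vector)
       fun2 = @(x) f(x(:,1),x(:,2));    % form (2): rows of x are points
       fun1([1 2])                      % one value: 5
       fun2([1 2; 3 4; 0 0])            % column of values: [5; 25; 0]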

    Note:
    In the two-variable case one could also do as follows:

       f=@(x,y)g(x,y)
            % g(x,y) is a vectorized expression with x's and y's.
       objfun=@(x) f(x(1),x(2))
            % Convert f into a function of one vector argument.

    Thus f can be used with meshgrid and surf, contour, etc., and objfun is suitable for fmincon and others. (A brief sketch follows.)
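
    A brief sketch of that dual use (g here is an arbitrary illustrative expression):

       g = @(x,y) x.*exp(-x.^2-y.^2);    % some vectorized expression
       f = @(x,y) g(x,y);
       objfun = @(x) f(x(1),x(2));
       fcontour(f,[-2 2 -2 2])           % plotting uses the two-argument form
       xmin = fminsearch(objfun,[-1 0])  % optimization uses the vector form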

    Example: Rosenbrock's function on a circle

    Very well explained in the User's Guide

    Rosenbrock's banana function: $$r(x,y)=100(y - x^2)^2 + (1-x)^2$$ Using form (2) above, we could define:

    rosenbrock=@(x)100*(x(:,2) - x(:,1).^2).^2 + (1 - x(:,1)).^2;

    For optimization routines (here fmincon) it would work equally well to write, following form (1):

    rosenbrock1=@(x)100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;

    Constraints: $c(x) \leq 0$ or $ceq(x)=0$.

    This example includes only an inequality constraint, so you must pass an empty array [] as the equality constraint function ceq.

    function [c,ceq] = unitdisk(x)
    c = x(1)^2 + x(2)^2 - 1;
    ceq = [];
    
    Note: Rosenbrock's function is a standard test function in optimization. It has a unique minimum value of 0 attained at the point [1,1]. Finding the minimum is a challenge for some algorithms because the function has a shallow minimum inside a deeply curved valley. The solution for this problem is not at the point [1,1] because that point does not satisfy the constraint.

    Run the optimization

    1. Optimization app (not for us)
    2. Command line (yes)

    Minimize Rosenbrock's Function at the Command Line

    1. Create options that choose iterative display and the interior-point algorithm
           options = optimoptions(@fmincon,...
          'Display','iter','Algorithm','interior-point');
          
    2. Run the fmincon solver with the options structure, reporting both the location x of the minimizer and the value fval attained by the objective function.
          [x,fval] = fmincon(rosenbrock,[0 0],...
          [],[],[],[],[],[],@unitdisk,options)
          
      The six sets of empty brackets represent optional constraints that are not being used in this example. See the fmincon function reference pages for the syntax.

      MATLAB outputs a table of iterations and the results of the optimization.
      Local minimum found that satisfies the constraints.
      Optimization completed because the objective function is non-decreasing in feasible directions, to within the selected value of the function tolerance, and constraints are satisfied to within the selected value of the constraint tolerance.

      x =
          0.7864    0.6177
      fval =
          0.0457
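
      As a quick sanity check (our addition, not in the guide), the minimizer should lie on the unit circle, where the inequality constraint is active:

       norm(x)        % ~1.0000: x(1)^2 + x(2)^2 = 1 at the solution
       rosenbrock(x)  % ~0.0457, agrees with fval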
      

    Tutorial for the Optimization Toolbox

    Optimization Toolbox => Tutorials => Solver based

    This example shows how to use two nonlinear optimization solvers and how to set options. The nonlinear solvers that we use in this example are fminunc and fmincon. (Much the same as above with Rosenbrock, but there's more, too.)

    All the principles outlined in this example apply to the other nonlinear solvers, such as fgoalattain, fminimax, lsqnonlin, lsqcurvefit, and fsolve.

    Unconstrained Optimization Example

    Minimize: $$ x e^{-(x^2 + y^2)} + (x^2+y^2)/20 $$
       f = @(x,y) x.*exp(-x.^2-y.^2)+(x.^2+y.^2)/20;
       fsurf(f,[-2,2],'ShowContours','on')
      
    Function definition for optimization:
     
       fun = @(x) f(x(1),x(2));
       x0 = [-.5; 0];
       options = optimoptions('fminunc','Algorithm','quasi-newton');
       options.Display = 'iter';
       [x, fval, exitflag, output] = fminunc(fun,x0,options);
      
    >> syms x y
    >> f(x,y) 
    ans = 
    x*exp(- x^2 - y^2) + x^2/20 + y^2/20 
    >> gradient(f(x,y),[x,y]) 
    ans = 
     x/10 + exp(- x^2 - y^2) - 2*x^2*exp(- x^2 - y^2)
                        y/10 - 2*x*y*exp(- x^2 - y^2)
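
    One could go a step further and hand this gradient to fminunc (our sketch, not part of the tutorial; the deal-based wrapper fun2 returns the value and the gradient as two outputs):

       fun2 = @(x) deal( ...
           x(1)*exp(-x(1)^2-x(2)^2) + (x(1)^2+x(2)^2)/20, ...
           [x(1)/10 + exp(-x(1)^2-x(2)^2) - 2*x(1)^2*exp(-x(1)^2-x(2)^2);
            x(2)/10 - 2*x(1)*x(2)*exp(-x(1)^2-x(2)^2)] );
       options = optimoptions('fminunc','Algorithm','trust-region', ...
           'SpecifyObjectiveGradient',true);
       [x,fval] = fminunc(fun2,[-0.5;0],options)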
      

    Manual: Writing Scalar Objective Functions

    A good recap and explanation: read it, adopt the approach, and try it on $$f(x,y,z)=3(x-y)^4 + \frac{4(x+z)^2}{1+x^2+y^2+z^2}+\cosh(x-1)+\tanh(y + z).$$
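
    A direct translation of this function into the one-vector-argument form (1) (our sketch):

       fun = @(x) 3*(x(1)-x(2))^4 ...
           + 4*(x(1)+x(3))^2/(1 + x(1)^2 + x(2)^2 + x(3)^2) ...
           + cosh(x(1)-1) + tanh(x(2)+x(3));
       fun([1 2 3])   % evaluate at a sample point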

    Minimizing an Expensive Optimization Problem Using Parallel Computing Toolbox

    The example simulates expensive objective and constraint functions by pausing:
        function f = expensive_objfun(x)
        %EXPENSIVE_OBJFUN An expensive objective function used in
        % optimparfor example.
        % Simulate an expensive function by pausing
        pause(0.1)
        % Evaluate objective function
        f = exp(x(1)) * (4*x(3)^2 + 2*x(4)^2 + 4*x(1)*x(2) + 2*x(2) + 1);
      
    function [c,ceq] = expensive_confun(x)
    %EXPENSIVE_CONFUN An expensive constraint function used in optimparfor example.
    % Simulate an expensive function by pausing
    pause(0.1);
    % Evaluate constraints
    c = [1.5 + x(1)*x(2)*x(3) - x(1) - x(2) - x(4); 
         -x(1)*x(2) + x(4) - 10];
    % No nonlinear equality constraints:
    ceq = [];
      

    Minimizing Using fmincon

      startPoint = [-1 1 1 -1];
      options = optimoptions('fmincon','Display','iter','Algorithm','interior-point');
      startTime = tic;
      xsol = fmincon(@expensive_objfun,startPoint,[],[],[],[],[],[],...
          @expensive_confun,options);
      time_fmincon_sequential = toc(startTime);
      fprintf('Serial FMINCON optimization takes %g seconds.\n',...
          time_fmincon_sequential);

    Serial FMINCON optimization takes 17.0722 seconds.

    Minimizing Using Genetic Algorithm

    Since ga usually takes many more function evaluations than fmincon, we remove the expensive constraint from this problem and perform unconstrained optimization instead. Pass empty matrices [] for constraints. In addition, we limit the maximum number of generations to 15 for ga so that ga can terminate in a reasonable amount of time. We are interested in measuring the time taken by ga so that we can compare it to the parallel ga evaluation. Note that running ga requires Global Optimization Toolbox.
    rng default % for reproducibility
    try
        gaAvailable = false;
        nvar = 4;
        gaoptions = optimoptions('ga','MaxGenerations',15,'Display','iter');
        startTime = tic;
        gasol = ga(@expensive_objfun,nvar,[],[],[],[],[],[],[],gaoptions);
        time_ga_sequential = toc(startTime);
        fprintf('Serial GA optimization takes %g seconds.\n',time_ga_sequential);
        gaAvailable = true;
    catch ME
        warning(message('optimdemos:optimparfor:gaNotFound'));
    end
      
    Optimization terminated: maximum number of generations exceeded.
    Serial GA optimization takes 80.2351 seconds.

    Setting Parallel Computing Toolbox

    To minimize our expensive optimization problem using the parallel fmincon function, we need to explicitly indicate that our objective and constraint functions can be evaluated in parallel and that we want fmincon to use its parallel functionality wherever possible. Currently, finite differencing can be done in parallel. We are interested in measuring the time taken by fmincon so that we can compare it to the serial fmincon run.
      options = optimoptions(options,'UseParallel',true);
      startTime = tic;
      xsol = fmincon(@expensive_objfun,startPoint,[],[],[],[],[],[],...
      @expensive_confun,options);
      time_fmincon_parallel = toc(startTime);
      fprintf('Parallel FMINCON optimization takes %g seconds.\n',...
      time_fmincon_parallel);
      
    Parallel FMINCON optimization takes 8.11945 seconds. HOW MANY WORKERS?
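
    To answer the question above (our addition): the achievable speedup is bounded by the pool size, which you can query from the current pool:

       p = gcp;                                   % current (or new) parallel pool
       fprintf('Pool has %d workers.\n', p.NumWorkers)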

    Minimizing Using Parallel Genetic Algorithm

    To minimize our expensive optimization problem using the ga function, we need to explicitly indicate that our objective function can be evaluated in parallel and that we want ga to use its parallel functionality wherever possible. To use the parallel ga we also require that the 'Vectorized' option be set to the default (i.e., 'off'). We are again interested in measuring the time taken by ga so that we can compare it to the serial ga run. Though this run may be different from the serial one because ga uses a random number generator, the number of expensive function evaluations is the same in both runs. Note that running ga requires Global Optimization Toolbox.
    rng default % to get the same evaluations as the previous run
    if gaAvailable
        gaoptions = optimoptions(gaoptions,'UseParallel',true);
        startTime = tic;
        gasol = ga(@expensive_objfun,nvar,[],[],[],[],[],[],[],gaoptions);
        time_ga_parallel = toc(startTime);
        fprintf('Parallel GA optimization takes %g seconds.\n',time_ga_parallel);
    end
    Optimization terminated: maximum number of generations exceeded.
    Parallel GA optimization takes 15.6984 seconds.
        

    Compare Serial and Parallel Time

      X = [time_fmincon_sequential time_fmincon_parallel];
      Y = [time_ga_sequential time_ga_parallel];
      t = [0 1];
      plot(t,X,'r--',t,Y,'k-')
      ylabel('Time in seconds')
      legend('fmincon','ga')
      ax = gca;
      ax.XTick = [0 1];
      ax.XTickLabel = {'Serial' 'Parallel'};
      axis([0 1 0 ceil(max([X Y]))])
      title('Serial Vs. Parallel Times')
       
    Utilizing parallel function evaluation via parfor improved the efficiency of both fmincon and ga. The improvement is typically better for expensive objective and constraint functions.
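
    One can also print the raw speedup factors (our addition; the timings quoted above give roughly 2.1x for fmincon and 5.1x for ga):

       fprintf('fmincon speedup: %.2fx\n', time_fmincon_sequential/time_fmincon_parallel);
       if gaAvailable
           fprintf('ga speedup: %.2fx\n', time_ga_sequential/time_ga_parallel);
       end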

    Optimization Using Symbolic Derivatives

    Let's start with the article by Alan Weiss, MathWorks.

    Compare timings: 1) sequential, 2) seq. with symbolic, 3) parallel (seq/symb)

    Most Optimization Toolbox solvers run faster and more accurately when your objective and constraint function files include derivative calculations. Some solvers also benefit from second derivatives, or Hessians. While calculating a derivative is straightforward, it is also quite tedious and error-prone. Calculating second derivatives is even more tedious and fraught with opportunities for error. How can you get your solver to run faster and more accurately without the pain of computing derivatives manually?

    This article demonstrates how to ease the calculation and use of gradients using Symbolic Math Toolbox. The techniques described here are applicable to almost any optimization problem where the objective or constraint functions can be defined analytically.

    Running a Symbolically Defined Optimization

    Lecture4/OptimUsingDeriv.m

    Minimize: $$x + y + \cosh(x-1.1y) + \sinh(z/4)$$ over the region defined by the implicit equation $$z^2 = \sin(z-x^2y^2), -1\leq x \leq 1,-1\leq y \leq 1,-1\leq z \leq 1.$$ First write the objective and constraint functions symbolically:

     
     confun=matlabFunction([],w,'vars',{t},'outputs',{'c','ceq'})
      
    ... Let's leave this here for a while...
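
    A hedged completion of the fragment above (our reconstruction of what t and w must be for the constraint $z^2 = \sin(z-x^2y^2)$; not verbatim from the article):

       syms x y z real
       t = [x;y;z];                      % the symbolic variable vector
       w = z^2 - sin(z - x^2*y^2);       % implicit equation as ceq(t) = 0
       confun = matlabFunction([],w,'vars',{t},'outputs',{'c','ceq'});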

    First Example: Unconstrained Minimization with Hessian

       syms x1 x2 real
       x = [x1;x2]; % column vector of symbolic variables
       f = log(1 + 3*(x2 - (x1^3 - x1))^2 + (x1 - 4/3)^2) % Symbolic expression
     
       @(in1)deal(log((in1(1,:)-4.0./3.0).^2+(in1(1,:)+in1(2,:)-...
       in1(1,:).^3).^2.*3.0+1.0), ...)
       % (matlabFunction output; the gradient part is truncated in the original)
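
    The handle above is what matlabFunction produces when given the objective together with its symbolic gradient. A minimal sketch of that generation step and its use (our reconstruction; the start point is illustrative):

       gradf = jacobian(f,x).';                      % column gradient of f
       fgrad = matlabFunction(f,gradf,'vars',{x});   % handle returning [f, grad]
       options = optimoptions('fminunc','Algorithm','trust-region', ...
           'SpecifyObjectiveGradient',true);
       [xmin,fval] = fminunc(fgrad,[-1;2],options)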
    
    

    help fmincon

     >> help fmincon
     fmincon finds a constrained minimum of a function of several variables.
        fmincon attempts to solve problems of the form:
         min F(X)  subject to:  A*X  <= B, Aeq*X  = Beq (linear constraints)
          X                     C(X) <= 0, Ceq(X) = 0   (nonlinear constraints)
                                LB <= X <= UB           (bounds)
           ...