Monday 3 November 2014

Finding eigenvalues of sparse matrices in C++

Just a quick note about the headaches I've been having when trying to find the eigenvalues of sparse matrices in my C++ simulation programs. Essentially, I just want to solve a problem of the type: \[\mathsf{H}\mathbf{x} = E\mathbf{x}\], where \(\mathsf{H}\) is a square matrix, \(\mathbf{x}\) is an eigenvector of the matrix and \(E\) is the corresponding eigenvalue. I'm handling quite large matrices (say, \(5000\times 5000\) elements) so a general eigenvalue solver takes an extremely long time to solve. In fact, the matrices I'm dealing with are tridiagonal (i.e., only the diagonal and the bands either side of it contain non-zero elements) and this, in theory at least, makes it possible to solve the problems much more rapidly. However, my matrices are not symmetrical, and this causes some difficulties! I've found three possible library approaches for solving this, none of which are ideal:

LAPACK

LAPACK provides (slow) solvers for general eigenvalue problems, and very fast solvers for symmetric tridiagonal systems. Unfortunately, there doesn't appear to be a solver for non-symmetric tridiagonal systems. LAPACK is a FORTRAN library, but since version 3.4, there are some convenient C/C++ bindings that can be used.

ARPACK

ARPACK seems to be the best/fastest sparse-matrix solver around, but using the raw API looks extremely painful (you need to manually iterate the solver). I did find a C++ wrapper (ARPACK++), but development seems to have been abandoned, the API is still quite arcane, and documentation is pretty sparse. There's an excellent Python wrapper as part of SciPy... now, if someone could do this in C++, I'd be very happy!

Armadillo

The Armadillo library has, in fact, started wrapping ARPACK functionality for C++, and the API is very user-friendly. The only problem is that the sparse-matrix algorithm converges first on the eigenvalues with largest magnitude... I want the smallest ones! It is possible to request the smallest magnitude solutions, but the algorithm fails to converge within Armadillo's hard-coded iteration limit for many cases. ARPACK itself provides some very handy ways around this problem, which haven't yet been fully implemented in the Armadillo wrapper. These are:
  • Increase the tolerance on precision of the solution. This has been implemented in Armadillo, but it's not an ideal solution since the eigenvalues are found to a lower precision.
  • Increase the maximum number of iterations. This essentially gives ARPACK more time to find the solution. It's not yet available in Armadillo, but it's a bit of a brute-force approach, because obviously it can take a very long time. I don't know the details of the algorithm, but I'd be concerned about the stability... potentially, I guess it could be possible to get stuck in an oscillation or to hit a divergent condition. In this case, increasing the maximum number of iterations wouldn't help!
  • Use a shift-and-invert algorithm. This approach preconditions the eigenvalue problem such that the eigenvalues are transformed \[E\to\frac{1}{E-\sigma}\], where \(\sigma\) is a "pivot" point near to the desired solutions. In my case, \(\sigma = 0\) is a decent choice, because I'm actually looking for the energies of electrons close to the bottom of an energy band. This approach transforms the problem so that the algorithm works in its "preferred" mode of finding the largest eigenvalues first.
I exchanged a couple of emails with the Armadillo developers a few months ago, with a view to getting the latter functionality into a new version. They were very helpful, and hopefully some day I will get round to implementing the feature in their wrapper and sending them a patch. Until then, I'm stuck with either using very slow general eigenvalue solvers, or very complex APIs!

Friday 10 October 2014

Figure references in MS Word and LaTeX ("Figure" vs. "Fig.")

Reason #98856 why I hate Microsoft Word: It's incredibly difficult to caption and cross-reference figures in a flexible way. Many journals have style rules that require you to refer to figures as "Figure 1" at the start of a sentence, or "Fig. 1" in the middle of a sentence. In LaTeX, this is super easy:
  1. Label the figure using
    \caption{\label{example} (a) blah (b) whatever.}
    
    Label the figure using
    \caption{\label{example} (a) blah (b) whatever.}
    

  2. In the document, type:
    This is shown in Fig. \ref{example}"

  3. Save and compile.
How to do it in MS Word:
  1. Insert a caption under Figure 1:
    1. References -> Captions -> Insert Caption
    2. Enter " (a) blah (b) whatever". Don't forget the space at the start.
    3. Select "Figure" from the Label: drop-down box
    4. Press "OK"
  2. Insert the reference in the text:
    1. Type "This is shown in Fig. "
    2. References -> Captions -> Cross-reference
    3. Select "Figure" in the Reference type drop-down list
    4. Select "Only label and number" from the "Insert reference to:" drop-down list
    5. Select the Figure you want to reference from the "For which caption" list.
    6. Click "Insert"
    7. Click "Close"
    8. Click on the badly-formatted "Figure 1" text that has appeared in the document (in the middle, not at the start or end)
    9. Press Shift+F9 to display the field format
    10. Replace the text that says:
      { REF _Ref361669033 \h }
      (or whatever the reference code happens to be) with:
      { REF _Ref361669033 \h \# 0 }
      Make sure that all the spaces are in precisely the right place, otherwise it won't work
    11. Hide the field format by pressing Shift+F9 again
    12. Press F9 to refresh the field
  3. Save the file and punch the wall until your fist bleeds.