Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default sparse linear solver is slow for large systems #45

Open
simulkade opened this issue Aug 10, 2024 · 3 comments
Open

Default sparse linear solver is slow for large systems #45

simulkade opened this issue Aug 10, 2024 · 3 comments

Comments

@simulkade
Copy link
Member

In a discussion with my friend @skieninger about the speed of sparse linear solvers of Matlab versus Python for solving a large system of equation (around 1e6) , we compared the backslash solver of Matlab with scipy's SuperLU, we observed a huge difference (with Matlab much faster), which is due to the fact that suitesparse and its poly algorithm (umfpack) is not installed by default in scipy. A package is available here but is a pain to install on Windows and even on WSL (or I had difficulties getting it to work easily with all the recent changes in numpy and scipy). When it works though, it makes a huge difference. For me, Pypardiso also did not work either with system and memory errors. I also noticed that switching between the different existing scipy solvers can also make a significant difference.
The problem is very well documented here.

To address this problem, I suggest the following list of tasks/objectives:

  • A notebook to introduce some of the solvers for large problems
  • Describing a workflow for iterative solvers and preconditioners
  • Possibly adding more sparse solvers that are more efficient than the default scipy solver (provided as optional packages)

Here's the script I am using to test the solvers:

import pyfvtool as pf
import numpy as np
import time
start_time = time.time()

# Solving a 1D diffusion equation with a fixed concentration 
# at the left boundary and a closed boundary on the right side


# Calculation parameters
Nx = 20 # number of finite volume cells
Lx = 1.0 # [m] length of the domain 
c_left = 1.0 # left boundary concentration
c_init = 0.0 # initial concentration
D_val = 1e-5 # diffusion coefficient (gas phase)
t_simulation = 120.0 # [s] simulation time
dt = 60.0 # [s] time step

# Define mesh
mesh = pf.Grid3D(Nx, Nx, Nx, Lx, Lx, Lx)

# Create a cell variable with initial concentration
# By default, 'no flux' boundary conditions are applied
c = pf.CellVariable(mesh, c_init)

# Switch the left boundary to Dirichlet: fixed concentration
c.BCs.left.a[:] = 0.0
c.BCs.left.b[:] = 1.0
c.BCs.left.c[:] = c_left
c.apply_BCs()

# Assign diffusivity to cells
D_cell = pf.CellVariable(mesh, D_val)
D_face = pf.geometricMean(D_cell) # average value of diffusivity at the interfaces between cells

Mt, RHSt = pf.transientTerm(c, dt, 1.0)
Mbc, RHSbc = pf.boundaryConditionsTerm(c.BCs)
Md = pf.diffusionTerm(D_face)
M = Mt - Md + Mbc
RHS = RHSt + RHSbc
c_new = pf.solveMatrixPDE(mesh, M, RHS)
# Print the execution time
print("--- %s seconds ---" % (time.time() - start_time))
@mhvwerts
Copy link
Member

mhvwerts commented Aug 11, 2024

It will be great to have a notebook of 'recipes' for getting external sparse solvers to work with PyFVTool.

For our present calculations (2D cylindrical, 375000 cells), Pypardiso works well (Win11, miniconda, specific PyFVTool environment) and decreases total calculation time by a factor up to 10x (laptops, Intel i7, 32 Gb RAM). I think we are lucky.

Concerning MATLAB: you get what you pay for! I guess that MATLAB has been partnering for long with developers of sparse solvers to get these to work in a hassle-free manner and optimize the performance. This is probably true for other commercial software. Sparse array support in scientific Python still needs further work.

Some ideas:

  • It may be insightful to keep a (long-term) log book of calculation times for specific calculations. Clearly specify the configuration (OS version, Python distribution, Python requirements.txt), so that these may be reproduced. I am quite happy with Anaconda/miniconda/Conda-Forge and environments.
  • You may find inspiration with projects that have the same sparse solver problem: SfePy (UMFPACK, PETSc) and of course FiPy. Back in the Python 2.7 days, FiPy used pysparse on Linux (not Windows). It was really good, fast and easy-to-use, but not available in Python 3. I remember that our calculations became much slower, even with PETSc, in Python 3.
  • Starting from a certain size of problem, it stops being convenient to run calculations on a laptop (heat and fan noise...), and it is better to set up the calculations on a fixed (Linux) workstation. The workstation may even be old and slower, as you can leave it running all through the night! Store the results to HDF5 files for post-processing. We have used Jupyter Lab (in server mode) to connect to the workstation with a web browser via the local network (in addition to standard ssh and tmux tools).
  • Specifying the whole workflow for larger calculations (batch processing, data storage, post-processing) will indeed be helpful, in particular to students getting started.
  • UMFPACK via scikit-umfpack seems the way to go. Encouraging news concerning installation from conda-forge.
  • I have the impression that the biggest chance of getting this kind of thing working easily, is to use a dedicated Linux installation, perhaps on a separate workstation.

@simulkade
Copy link
Member Author

I have the impression that the biggest chance of getting this kind of thing working easily, is to use a dedicated Linux installation, perhaps on a separate workstation.

I will give it a shot later, when my son does not play don't starve together on our Linux machine.

UMFPACK via scikit-umfpack seems the way to go. Encouraging news concerning scikit-umfpack/scikit-umfpack#85.

Totally agree. I wish it was installed by default in scipy. It seems that the problem is a license incompatibility.

@mhvwerts
Copy link
Member

For now, Linux seems indeed to be the easiest OS for working with UMFPACK: scikit-umfpack/scikit-umfpack#85 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants