
MUMPS not using more than one core #17

Open
ianwilliamson opened this issue May 14, 2018 · 5 comments
Comments

@ianwilliamson

During my testing I haven't been able to get MUMPS to use more than a single core. Following the instructions in the MUMPS documentation, I set the environment variable OMP_NUM_THREADS=12 (prior to starting Julia). On this same system, exporting OMP_NUM_THREADS works for telling Pardiso to use more cores. What else could I try to get MUMPS to use parallelism in factorization and solving?
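For context, the variable has to be exported in the shell before Julia is launched so the BLAS library sees it at load time; a minimal sketch (the thread count and script name are illustrative):

```shell
# Export before starting Julia so OpenBLAS reads it at load time.
export OMP_NUM_THREADS=12
echo "$OMP_NUM_THREADS"
# julia MUMPS_test.jl   # then launch Julia from the same shell
```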

@Pbellive
Contributor

Pbellive commented Jun 8, 2018

What platform are you on? This might be OS or BLAS library dependent. MUMPS runs multithreaded for me with OpenBLAS on Linux when I set OMP_NUM_THREADS before starting Julia. To be precise, I tested this just now on:

julia> versioninfo()
Julia Version 0.6.4-pre.0
Commit 423d4dfcec* (2018-05-30 17:34 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i7-4800MQ CPU @ 2.70GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell MAX_THREADS=16)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, haswell)

If you're on OS X and using Apple's BLAS, you need to set VECLIB_MAXIMUM_THREADS rather than OMP_NUM_THREADS.
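For example, on macOS with Apple's Accelerate/vecLib BLAS (the thread count is illustrative):

```shell
# Apple's BLAS ignores OMP_NUM_THREADS; use VECLIB_MAXIMUM_THREADS instead.
export VECLIB_MAXIMUM_THREADS=4
echo "$VECLIB_MAXIMUM_THREADS"
```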

@ianwilliamson
Author

Also on Linux:

Julia Version 0.6.2
Commit d386e40c17 (2017-12-13 18:08 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: AMD Opteron(tm) Processor 6386 SE
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Piledriver)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, bdver1)

and I can successfully get Pardiso to use multiple cores on this system via the export OMP_NUM_THREADS approach.

Oddly (I'm not sure if this is related), I have the following test script:

using MUMPS
N = Int64(1e4)
d = 0.00001
A = sprandn(N, N, d) + 1im*sprandn(N, N, d)
b = rand(N) + 1im*rand(N)
A′ = @time factorMUMPS(A, 0)
x  = @time applyMUMPS(A′, b)

which I was able to run once, but after aborting it mid-factorization, re-running it now only outputs the error:

ERROR: LoadError: MUMPS: error --> -3 <--. Please refer to Ch. 7 of MUMPS User's guide!
Stacktrace:
 [1] checkMUMPSerror(::Array{Int64,1}) at /home/---/.julia/v0.6/MUMPS/src/MUMPSfuncs.jl:69
 [2] factorMUMPS(::SparseMatrixCSC{Complex{Float64},Int64}, ::Int64, ::Int64) at /home/---/.julia/v0.6/MUMPS/src/MUMPSfuncs.jl:33
 [3] factorMUMPS(::SparseMatrixCSC{Complex{Float64},Int64}, ::Int64) at /home/---/.julia/v0.6/MUMPS/src/MUMPSfuncs.jl:24
 [4] include_from_node1(::String) at ./loading.jl:576
 [5] include(::String) at ./sysimg.jl:14
 [6] process_options(::Base.JLOptions) at ./client.jl:305
 [7] _start() at ./client.jl:371
while loading /home/---/MUMPS_test.jl, in expression starting on line 237

At the same time, running Pkg.test("MUMPS") completes successfully... now I don't know what's going on...

@Pbellive
Contributor

Strange. I'm sorry, I have no good idea why OpenBLAS isn't picking up on your OMP_NUM_THREADS setting. In any event, neither the wrapper code nor the MUMPS library itself does anything to control multithreading. All that's going on is that MUMPS does much of its work through BLAS calls, and those calls can run multithreaded. So you set the number of threads for BLAS to use, and MUMPS will just use what's there. I would double-check that you built this package against the correct BLAS library and that you can run that library multithreaded in general. Unfortunately that's all I can think of, sorry.
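One way to check whether the BLAS build itself runs multithreaded, independent of MUMPS; this is a sketch assuming a modern Julia with the LinearAlgebra stdlib (on 0.6 the same functions live under `Base.BLAS`):

```julia
using LinearAlgebra

# Time a large matrix multiply with 1 thread and then with 4.
# If the wall time barely changes, BLAS is effectively running
# single-threaded (wrong library linked, or the thread setting
# isn't being honored).
A = randn(2000, 2000)
B = randn(2000, 2000)
BLAS.set_num_threads(1)
t1 = @elapsed A * B
BLAS.set_num_threads(4)
t4 = @elapsed A * B
println("1 thread: $(round(t1, digits=3))s, 4 threads: $(round(t4, digits=3))s")
```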

As for your other error, that seems to be a case of MUMPS throwing the wrong error message. My guess is that your random matrix was rank deficient. In that case MUMPS should throw a -6 or -10 error code, but for some reason it's throwing -3. I checked that trying to factor a rank-deficient matrix can give the -3 error code, e.g.:

julia> D = sprandn(20,20,0.1);

julia> Dinv = @time factorMUMPS(D,0);
ERROR: MUMPS: error --> -3 <--. Please refer to Ch. 7 of MUMPS User's guide!
Stacktrace:
 [1] checkMUMPSerror(::Array{Int64,1}) at /home/patrick/.julia/v0.6/MUMPS/src/MUMPSfuncs.jl:69
 [2] factorMUMPS(::SparseMatrixCSC{Float64,Int64}, ::Int64, ::Int64) at /home/patrick/.julia/v0.6/MUMPS/src/MUMPSfuncs.jl:50
 [3] factorMUMPS(::SparseMatrixCSC{Float64,Int64}, ::Int64) at /home/patrick/.julia/v0.6/MUMPS/src/MUMPSfuncs.jl:41

julia> rank(full(D))
16

julia> D = D + speye(20);

julia> Dinv = @time factorMUMPS(D,0);
  0.005187 seconds (149 allocations: 8.881 KiB)

MUMPS.jl just returns the error code thrown by MUMPS itself so this is an upstream bug.

@ianwilliamson
Author

Thanks for the pointer on that misleading error message. I guess I'll take a look at the build.

@ianwilliamson
Author

The only thing I had to modify out of the box was in src/compiler_options.in:

LIBBLAS = -lblas #/usr/lib/libblas/libblas.a

to get it to compile on Ubuntu. This seems to be the standard way to hook into BLAS, and I'm not sure what other options I would need to give it.
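If the package should link OpenBLAS rather than the reference BLAS, the same variable can point at it; a hypothetical example (the -L path depends on the system):

```make
# Link against OpenBLAS; adjust -L to wherever libopenblas lives.
LIBBLAS = -L/usr/lib/x86_64-linux-gnu -lopenblas
```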
