Initial support for PETSc MATIS matrix format #3885
Conversation
Super! Just today we have a branch that does CI tests via Spack for dependencies; using that in our CI setup we could test it :). It might be better to add this to the existing mixed Poisson demo. We're trying to avoid too many demos that solve very similar problems in favour of demos that show different ways to solve the same problem (in the same demo). The mixed Poisson demo could demonstrate a range of different solvers.
```cpp
// For each dof, count the processes that share it: the number of
// destination ranks plus one for the owning rank itself.
auto adjlist = bc.get().function_space()->dofmap()->index_map->index_to_dest_ranks();
const auto off = adjlist.offsets();
std::vector<std::int32_t> number_of_sharing_ranks(off.size());
for (size_t i = 0; i < off.size() - 1; ++i)
  number_of_sharing_ranks[i] = off[i + 1] - off[i] + 1;
```
Ideally, this setup code should be moved to `IndexMap`. I was thinking of implementing it lazily, with a `mutable` class member, but I see you don't use `mutable` anywhere in DOLFINx. Would that be something you would accept?
Hi Stefano,
I know I suggested this function on Discourse. However, as I now see the use case, what I think we should do is:
- create an `la::Vector` on the index map
- fill the vector with 1 (including ghosts)
- scatter reverse (add)
- scatter forward

That would give you an array of the number of sharing processes per dof; see the sketch below.
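A minimal sketch of that recipe, assuming the `set`, `scatter_rev` and `scatter_fwd` members of `la::Vector` behave as in recent DOLFINx releases (the function name `sharing_count` is made up for illustration):

```cpp
#include <functional>
#include <memory>

#include <dolfinx/common/IndexMap.h>
#include <dolfinx/la/Vector.h>

// Count how many processes share each dof: set every owned and ghost entry
// to 1, accumulate the ghost contributions onto the owners (reverse scatter
// with addition), then send the totals back to the ghosts (forward scatter).
dolfinx::la::Vector<double>
sharing_count(std::shared_ptr<const dolfinx::common::IndexMap> map, int bs)
{
  dolfinx::la::Vector<double> counts(map, bs);
  counts.set(1.0);                         // 1 on owned and ghost entries
  counts.scatter_rev(std::plus<double>()); // owners accumulate the sharers
  counts.scatter_fwd();                    // ghosts receive the totals
  return counts;
}
```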
This could probably be a vector we compute once inside the DirichletBC class. What do you think @garth-wells ?
@jorgensd Can you paste some code here to do it? I will wait for confirmation from @garth-wells and move the setup of the counting vector into the `IndexMap` or `DirichletBC` class, your call.
I’m without a computer until Sunday. @IgorBaratta could you do me a solid and make a simple example for Stefano?:)
We need a way to divide the insertion of 1 on the diagonal by the number of processes that share the dof, as MATIS cannot use insert and has to add the values. I.e. if a DirichletBC dof is shared between N processes, we assign 1/N on each process.
I suggest we use a vector, stored in the DirichletBC, to keep track of this, as illustrated with the snippet above.
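For illustration only, a sketch of the scaled insertion, where `dofs`, `count` (number of sharing processes per dof) and the generic `set_fn` callback are all assumed inputs:

```cpp
#include <cstdint>
#include <span>

// Instead of inserting 1 on the diagonal, add diagonal/N on each of the N
// sharing processes; after MATIS accumulates the contributions, the global
// diagonal entry ends up with the intended value.
template <typename T, typename SetFn>
void add_scaled_diagonal(SetFn set_fn, std::span<const std::int32_t> dofs,
                         std::span<const T> count, T diagonal = 1.0)
{
  for (std::int32_t dof : dofs)
  {
    T value = diagonal / count[dof];
    set_fn(std::span(&dof, 1), std::span(&dof, 1), std::span(&value, 1));
  }
}
```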
@jorgensd summarized it properly. If you want to keep the same workflow for assembled and unassembled matrices, the counting vector is the only way to realize it. I think such information should belong to `IndexMap` instead of `DirichletBC`, but I'm fine either way.
@stefanozampini @garth-wells I've made an attempt at this at: https://github.com/FEniCS/dolfinx/tree/dokken/suggested-matis
I'll have a go at installing PETSc main locally and running this, to see if I can get the demo to run :)
A likely good approach is consistent handling of the LHS and RHS, i.e. using accumulate rather than set for the bc on the RHS. We did this in legacy DOLFIN. I'd need to look closely at the code for how this might work with the latest design.
I think it should be quite straightforward with the approach suggested in my branch above, as it lets the user choose the insertion mode of the diagonal (which can trivially be extended to `set_bc`), since the `DirichletBC` itself holds the information about the count.
That test uses a different Pmat and has extra code to set up the preconditioner. Wouldn't it be too messy to add another preconditioner in the same script?
See for instance the Stokes demo, which illustrates how we have solved it for 4 different configurations.
Put the different solvers in different functions; common data, e.g. the mesh, can be passed in. A reason we want 'more' in demos is to structure demos with more functions. We see over and over that users start from a simple, flat 20-line demo script and turn it into a monster, multi-thousand-line flat script. With more functions in demos we're educating people on how to structure a program :).
```diff
 template <dolfinx::scalar T, std::floating_point U>
 void set_diagonal(
     auto set_fn, const FunctionSpace<U>& V,
     const std::vector<std::reference_wrapper<const DirichletBC<T, U>>>& bcs,
-    T diagonal = 1.0)
+    T diagonal = 1.0, bool unassembled = false)
```
This looks problematic: quite a few types can be cast to `T` and `bool`, so it is very easy for a user to make an error. Probably better to make `unassembled` an enum. Easier to read too.
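A sketch of what that could look like, mirroring the signature in the diff above; `MatrixKind` and its values are made up for illustration:

```cpp
// Hypothetical replacement for the trailing bool: the call site then reads
// set_diagonal(set_fn, V, bcs, 1.0, MatrixKind::unassembled) rather than
// set_diagonal(set_fn, V, bcs, 1.0, true).
enum class MatrixKind
{
  assembled,   // e.g. MATAIJ: the diagonal value can be inserted directly
  unassembled  // e.g. MATIS: values are added, scaled by the sharing count
};

template <dolfinx::scalar T, std::floating_point U>
void set_diagonal(
    auto set_fn, const FunctionSpace<U>& V,
    const std::vector<std::reference_wrapper<const DirichletBC<T, U>>>& bcs,
    T diagonal = 1.0, MatrixKind kind = MatrixKind::assembled);
```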
Having an enum for a single function seems overkill to me.
If there's a way to extract back the `Mat` from `set_fn`, we could test if it is a `MATIS` inside the function, and we won't need the extra optional argument. @garth-wells what do you think?
In the branch I made, I use an insertion mode with `add` or `insert` to distinguish between the two, which resonates well with the `scatter_fwd`/`scatter_reverse` modes we have in the Python interface for vectors.
I looked at your branch. I don't think it is correct for `MATAIJ`. `set_diagonal` uses (correctly) `set_fn = dolfinx::la::petsc::Matrix::set_fn(A, INSERT_VALUES)`, and the values won't be accumulated into `MATAIJ`. I think it is best to use an optional boolean that clearly indicates we want to insert into an unassembled (`MATIS`) matrix. All these problems appear because the interface for `set_diagonal` only takes `set_fn` and not the matrix directly.
Alternatively, use the `ADD_VALUES` formalism everywhere in matrix assembly (after all, this is FEM :-)). The optimization you get by not communicating the values to be inserted into the diagonal is minimal, @garth-wells?
> I looked at your branch. I don't think it is correct for `MATAIJ`. `set_diagonal` uses (correctly) `set_fn = dolfinx::la::petsc::Matrix::set_fn(A, INSERT_VALUES)`, and the values won't be accumulated into `MATAIJ`. I think it is best to use an optional boolean that clearly indicates we want to insert into an unassembled (`MATIS`) matrix. All these problems appear because the interface for `set_diagonal` only takes `set_fn` and not the matrix directly.
Right, but if we modify the input to the `set_fn` to then use `ADD_VALUES`, it would resolve the issue, right?
i.e.

```cpp
PetscBool flg;
PetscObjectTypeCompare((PetscObject)A, MATIS, &flg);
dolfinx::la::VectorInsertMode mode
    = flg ? dolfinx::la::VectorInsertMode::add
          : dolfinx::la::VectorInsertMode::insert;
InsertMode petsc_mode = flg ? ADD_VALUES : INSERT_VALUES;
dolfinx::fem::set_diagonal(
    dolfinx::la::petsc::Matrix::set_fn(A, petsc_mode), V, _bcs,
    diagonal, mode);
```
Thinking more about it, I kind of agree that we should only have one approach, preferably the `add` mode, as assuming that two independent input arguments are set correctly by the user is likely to lead to a lot of confusion. Having a single approach, accumulating data from all processes as `1/num_shared_procs`, seems sensible.
The reason for DOLFINx not interfacing directly with the PETSc matrix is that it allows us to have a single code path for multiple matrix types, like our own built-in matrices and potentially Trilinos, without having to change the `set_diagonal` and `dolfinx::fem::assemble_*` functions.
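For instance (a rough sketch; `A` is a PETSc `Mat`, `A_csr` a `dolfinx::la::MatrixCSR`, and `V`, `bcs`, `diagonal` are assumed to exist; I'm assuming `MatrixCSR::mat_add_values()` is the accessor that returns the matching insertion callback):

```cpp
// The same backend-agnostic set_diagonal, driven by two different backends.

// PETSc backend: the callback wraps MatSetValuesLocal on A.
dolfinx::fem::set_diagonal(
    dolfinx::la::petsc::Matrix::set_fn(A, ADD_VALUES), V, bcs, diagonal);

// Built-in backend: the callback comes from DOLFINx's own CSR matrix.
dolfinx::fem::set_diagonal(A_csr.mat_add_values(), V, bcs, diagonal);
```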
I'm fine with any solution you think is reasonable. Note that if you switch back to add values, you won't need to flush assembly anymore in the FEM callbacks. And you can still use the optimization of adding only to diagonal entries in AIJ format (since the value on the diagonal before the call is 0.0, `add` or `insert` is the same).
> Right, but if we modify the input to the set_fn to then use ADD_VALUES it would resolve the issue, right?
> i.e.

If you want equivalent behaviour, it should be the opposite way, i.e. `petsc_mode = !flg ? ADD_VALUES : INSERT_VALUES`. This is because MATIS inserts directly into the local matrix. Matrix-vector products in MATIS are done via R^T A R, where R restricts from global to local, A does the local multiplications, and R^T sums up the local contributions.
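In formula form (just restating the above, with $R$ the global-to-local restriction and $A_{\mathrm{loc}}$ the block-diagonal collection of local subdomain matrices):

```math
y = R^{\top} A_{\mathrm{loc}} \, R \, x
```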
Right, I got that the wrong way around.
Done
@stefanozampini I get a segfault when running the demo. I installed the release PETSc branch and your branch using Spack.
If you run sequentially, BDDC is a direct solver because there are no subdomains.
The changes have been merged in.
Even running with the PETSc main branch (using Spack) and your branch of DOLFINx, I get:

```
File "/home/dokken/Documents/src/dolfinx/python/demo/demo_mixed-poisson.py", line 418, in <module>
    problem.solve()
File "/home/dokken/Documents/src/spack/var/spack/environments/test_matis/.spack-env/view/lib/python3.12/site-packages/dolfinx/fem/petsc.py", line 987, in solve
    self.solver.solve(self.b, self.x)
File "petsc4py/PETSc/KSP.pyx", line 1782, in petsc4py.PETSc.KSP.solve
petsc4py.PETSc.Error: error code 81
[3] KSPSolve() at /tmp/dokken/spack-stage/spack-stage-petsc-main-bygphg6hok5rdbzsbmlhkt3p6alpuhyp/spack-src/src/ksp/ksp/interface/itfunc.c:1091
[3] KSPSolve_Private() at /tmp/dokken/spack-stage/spack-stage-petsc-main-bygphg6hok5rdbzsbmlhkt3p6alpuhyp/spack-src/src/ksp/ksp/interface/itfunc.c:840
[3] KSPSetUp() at /tmp/dokken/spack-stage/spack-stage-petsc-main-bygphg6hok5rdbzsbmlhkt3p6alpuhyp/spack-src/src/ksp/ksp/interface/itfunc.c:429
[3] PCSetUp() at /tmp/dokken/spack-stage/spack-stage-petsc-main-bygphg6hok5rdbzsbmlhkt3p6alpuhyp/spack-src/src/ksp/pc/interface/precon.c:1120
[3] PCSetUp_BDDC() at /tmp/dokken/spack-stage/spack-stage-petsc-main-bygphg6hok5rdbzsbmlhkt3p6alpuhyp/spack-src/src/ksp/pc/impls/bddc/bddc.c:1672
[3] PCBDDCSetUpSolvers() at /tmp/dokken/spack-stage/spack-stage-petsc-main-bygphg6hok5rdbzsbmlhkt3p6alpuhyp/spack-src/src/ksp/pc/impls/bddc/bddcprivate.c:3852
[3] PCBDDCSetUpCorrection() at /tmp/dokken/spack-stage/spack-stage-petsc-main-bygphg6hok5rdbzsbmlhkt3p6alpuhyp/spack-src/src/ksp/pc/impls/bddc/bddcprivate.c:4560
[3] MatCholeskyFactor() at /tmp/dokken/spack-stage/spack-stage-petsc-main-bygphg6hok5rdbzsbmlhkt3p6alpuhyp/spack-src/src/mat/interface/matrix.c:3372
[3] MatCholeskyFactor_SeqDense() at /tmp/dokken/spack-stage/spack-stage-petsc-main-bygphg6hok5rdbzsbmlhkt3p6alpuhyp/spack-src/src/mat/impls/dense/seq/dense.c:882
[3] Zero pivot in Cholesky factorization: https://petsc.org/release/faq/#zeropivot
[3] Bad factorization: zero pivot in row 0
```

This happens when using `coarse_solver = "lu"`, `coarse_solver_type = "superlu_dist"`, `local_solver_type = "mumps"`; with `coarse_solver = "lu"`, `coarse_solver_type = "mumps"`, `local_solver_type = "mumps"` it is fine.
I think in that case one of the processes has zero entries, and superlu_dist fails.
This happens even if I increase the mesh to 500x500 and use only 2 processes, where it is unlikely that one of the coarse levels should end up with zero entries. Is it because we do not have a (1, 1) diagonal block in the system?
The issue you posted was with a local problem. Run with the option
```
Solving mixed poisson problem with monolithic solver.
Error benign extension 0.00000000000000e+00
Error scaling extension 0.00000000000000e+00
Error scaling restriction 0.00000000000000e+00
--------------------------------------------------
Subdomain 0000 infinity error for Dirichlet solve (demo_mixed_poisson_pc_bddc_dirichlet_) = inf
Subdomain 0001 infinity error for Dirichlet solve (demo_mixed_poisson_pc_bddc_dirichlet_) = inf
--------------------------------------------------
Local BDDC graph for subdomain 0000 (seq 0)
Number of vertices 625767
Number of local subdomains 1
Custom minimal size 1
Topological two dim? TRUE (set TRUE)
Total number of connected components 1
cc 0 (size 514, fid 0, neighs: 0 1)
Local BDDC graph for subdomain 0001 (seq 0)
Number of vertices 625747
Number of local subdomains 1
Custom minimal size 1
Topological two dim? TRUE (set TRUE)
Total number of connected components 1
cc 0 (size 514, fid 0, neighs: 0 1)
--------------------------------------------------------------
Subdomain 0000 got 00 local candidate vertices (1)
Subdomain 0000 got 01 local candidate edges (1)
Subdomain 0000 got 00 local candidate faces (1)
--------------------------------------------------------------
Subdomain 0001 got 00 local candidate vertices (1)
Subdomain 0001 got 01 local candidate edges (1)
Subdomain 0001 got 00 local candidate faces (1)
--------------------------------------------------
Subdomain 0000 local dimensions
local_size = 625767, dirichlet_size = 625253, boundary_size = 514
r_size = 625767, v_size = 0, constraints = 1, local_primal_size = 1
Subdomain 0001 local dimensions
local_size = 625747, dirichlet_size = 625233, boundary_size = 514
r_size = 625747, v_size = 0, constraints = 1, local_primal_size = 1
--------------------------------------------------
Subdomain 0000 infinity error for Neumann solve (demo_mixed_poisson_pc_bddc_neumann_) = inf
Subdomain 0001 infinity error for Neumann solve (demo_mixed_poisson_pc_bddc_neumann_) = inf
```
That is odd; I don't know what you are running. This is what I get running the demo in my branch with 2 processes.
@stefanozampini If I instead use
This is a problem in superlu_dist; I have added the failing test to our CI and hopefully they will fix it in superlu_dist: https://gitlab.com/petsc/petsc/-/merge_requests/8681
This allows creating BDDC preconditioners.
For now, there's a demo on mixed Poisson. To run it, you need the fixes for `getLocalSubMatrix` here: https://gitlab.com/petsc/petsc/-/merge_requests/8668.
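For context, the solver side of what such a demo exercises boils down to something like the following sketch, written against the plain PETSc C API from C++ (error handling follows current PETSc conventions; the assembly of `A` as a `MATIS` matrix is assumed to happen elsewhere, e.g. via the DOLFINx assembly added in this PR):

```cpp
#include <petsc.h>

// Sketch: solve A x = b with a BDDC preconditioner. BDDC requires the
// operator to be a MATIS matrix, i.e. held as unassembled local
// (subdomain) matrices.
PetscErrorCode solve_with_bddc(Mat A, Vec b, Vec x)
{
  KSP ksp;
  PC pc;

  PetscFunctionBeginUser;
  PetscCall(KSPCreate(PetscObjectComm((PetscObject)A), &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCBDDC));
  PetscCall(KSPSetFromOptions(ksp)); // allow -pc_bddc_* tuning at runtime
  PetscCall(KSPSolve(ksp, b, x));
  PetscCall(KSPDestroy(&ksp));
  PetscFunctionReturn(PETSC_SUCCESS);
}
```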