Remove allocations for xl, xu, nlconstr0 in C interface #135

nbelakovski · 2024-01-07T22:40:49Z

As I was working on the Python bindings, I ran into some issues related to memory management. These changes should help with that, particularly the change to the allocation of nlconstr0, since that was the one that appeared to be giving issues with Python. In general I'd prefer to do as little memory allocation within the C interface as possible, since it's challenging enough to have memory management in 1 language, let alone 4 when we start interacting with Python->C++->C->Fortran.

What enables the removal of these allocations is that I've discovered that if a variable in Fortran is marked as a pointer and set to null(), then the Fortran function present returns false on that variable. This means we can use NULL from within C to indicate that a variable is not provided. This approach has a small inconvenience in that the user will have to handle allocations for optional variables.

For example, with f0, the user would have to do the following:

double f0 = obj_fun(x0);
problem->f0 = &f0;

Likewise for things like rhobeg, rhoend, the user would have to do something like the following:

double rhobeg = 1.0;
options->rhobeg = &rhobeg;
double rhoend = 1.0e-6;
options->rhoend = &rhoend;

This is in contrast to the current way of doing things:

options->rhobeg = 1.0;
options->rhoend = 1.0e-6;

As annoying as the previous option is, it might be preferable since it means we can stop duplicating initialization logic within C. That logic is currently implemented in prima_init_options and prima_check_problem, but it doesn't replicate the full logic for rhobeg/rhoend. I think that ideally C should be a 'dumb' pass-through and should let Fortran handle as much as possible.

We can use this PR to add this functionality to the rest of the variables, but I figured I would start with removing the memory allocations before going too much further.

nbelakovski · 2024-01-07T22:51:07Z

c/prima.c

-  if ((info == 0) && use_constr)
-  {
-    // reuse or (re)allocate nlconstr
-    if (result->_m_nlcon != problem->m_nlcon)


This particular if statement relied on the fact that result->_m_nlcon would be 0 after the call to prima_init_result. I moved it into prima_init_result and relied on the value of problem->m_nlcon to allocate it.

zaikunzhang · 2024-01-08T00:56:21Z

Hi @nbelakovski , two quick questions:

Will this necessitate any change to the Fortran code?
For claims like the following, we need a reference to the standard.

I've discovered that if a variable in Fortran is marked as a pointer and set to null(), then the Fortran function present returns false on that variable. This means we can use NULL from within C to indicate that a variable is not provided

Thank you.

zaikunzhang · 2024-01-08T01:02:53Z

It is the spirit of PRIMA to avoid requiring the user to do things like

double rhobeg = 1.0;
options->rhobeg = &rhobeg;
double rhoend = 1.0e-6;
options->rhoend = &rhoend;

This is unnatural and unfriendly. If this is inevitable, then we should add another layer to enable users to do options->rhobeg=1.0.

BTW, excuse me for my ignorance about C, could you remind me why it is an arrow instead of a dot?

N.B.: PRIMA tries to provide the most natural and friendly interface to users and handles everything else under the hood. Otherwise, PRIMA would not exist in the first place.

Thank you.

nbelakovski · 2024-01-08T03:21:23Z

1. Will this necessitate any change to the Fortran code?

Fortunately no.

2. For claims like the following, we need a reference to the standard.

From the 2003 standard:

Also section 13.7.88 indicates that using => null() on a pointer results in a disassociated pointer.

Link: https://wg5-fortran.org/N1601-N1650/N1601.pdf

BTW, excuse me for my ignorance about C, could you remind me why it is an arrow instead of a dot?

The arrow is used when the left side is a pointer. An example:

prima_problem_t problem;
problem.f0 = 5;
prima_problem_t *problem_ptr = &problem; // problem_ptr is of type pointer and points to problem
printf("%g\n", problem_ptr->f0); // prints 5

This is unnatural and unfriendly. If this is inevitable, then we should add another layer to enable users to do options->rhobeg=1.0.
N.B.: PRIMA tries to provide the most natural and friendly interface to users and handles everything else under the hood. Otherwise, PRIMA would not exist in the first place.

OK I have an idea, the options struct could look like this (using rhobeg as an example):

typedef struct {
  double rhobeg;
  double* _rhobeg; // internal - do not use
} prima_options_t;

prima_init_options will initialize rhobeg to NAN and _rhobeg to NULL. Then the user can overwrite rhobeg and in prima_minimize we do something like if (isnan(options.rhobeg) == false) {options._rhobeg = &options.rhobeg;}. If the user hasn't overwritten rhobeg then _rhobeg will be NULL and we can set its corresponding Fortran pointer to null(). What do you think?

zaikunzhang · 2024-01-08T04:00:54Z

Thank you for the explanation.

This is unnatural and unfriendly. If this is inevitable, then we should add another layer to enable users to do options->rhobeg=1.0.
N.B.: PRIMA tries to provide the most natural and friendly interface to users and handles everything else under the hood. Otherwise, PRIMA would not exist in the first place.

OK I have an idea, the options struct could look like this (using rhobeg as an example):
typedef struct {
  double rhobeg;
  double* _rhobeg; // internal - do not use
} prima_options_t;
prima_init_options will initialize rhobeg to NAN and _rhobeg to NULL. Then the user can overwrite rhobeg and in prima_minimize we do something like if (isnan(options.rhobeg) == false) {options._rhobeg = &options.rhobeg;}. If the user hasn't overwritten rhobeg then _rhobeg will be NULL and we can set its corresponding Fortran pointer to null(). What do you think?

Good. What about rhobeg_loc instead of _rhobeg?

Thanks.

zaikunzhang · 2024-01-08T04:59:36Z

I guess the following code will not compile or work as expected if RP is not C_DOUBLE.

Always keep in mind that RP is not necessarily C_DOUBLE or the default real type, and IK is not necessarily C_INT or the default integer type. Explicit conversions are always needed when we exchange data between C and Fortran (and probably between other languages). See #129.

if (C_ASSOCIATED(xl)) then
    call C_F_POINTER(xl, xl_loc, shape=[n])
else
    xl_loc => null()
end if
if (C_ASSOCIATED(xu)) then
    call C_F_POINTER(xu, xu_loc, shape=[n])
else
    xu_loc => null()
end if

BTW, what does [n] mean? Will n work?

Thanks.

nbelakovski · 2024-01-08T07:02:45Z

I was able to make a test program work with a different value of RP by going through an intermediate step:

https://godbolt.org/z/aKdWjzYo7

Basically it works like this:

! Compulsory arguments
type(C_PTR), value :: array_of_double_ptr
integer(C_INT), value :: num_elements

! Local variables
real(C_DOUBLE), pointer :: array_of_double_ptr_loc_intermediate(:)
real(RP), allocatable :: array_of_double_ptr_loc(:)

! Convert
if (C_ASSOCIATED(array_of_double_ptr)) then
  call C_F_POINTER(array_of_double_ptr, array_of_double_ptr_loc_intermediate, shape=[num_elements])
  allocate(array_of_double_ptr_loc(num_elements))
  array_of_double_ptr_loc = real(array_of_double_ptr_loc_intermediate, kind(array_of_double_ptr_loc))
end if

There's another way to do it with pointers if necessary, but it requires a second intermediate variable. Fortunately a variable marked allocatable but not allocated will return false when present is called, just like with unassociated pointers, so we can still use this to basically use NULL to indicate than an optional variable is not provided. Should I go ahead and implement this in the PR? If so, should I go ahead and implement it for all optional variables, or just the 4 I started with?

BTW, what does [n] mean? Will n work?

From the documentation:

SHAPE | (Optional) Rank-one array of type INTEGER with INTENT(IN). It shall be present if and only if fptr is an array. The size must be equal to the rank of fptr.

Basically the brackets turn it into an array. I tried without the brackets and it complained about the variable not having the correct rank.

Good. What about rhobeg_loc instead of _rhobeg?

Usually a leading underscore is understood to mean that something is private. Plus I don't like the _loc suffix. Apparently it's short for local? That's not obvious at all, and I don't think it makes sense in this context anyway since the two are next to each other.

zaikunzhang · 2024-01-08T07:21:58Z

! Compulsory arguments
type(C_PTR), value :: array_of_double_ptr
integer(C_INT), value :: num_elements

! Local variables
real(C_DOUBLE), pointer :: array_of_double_ptr_loc_intermediate(:)
real(RP), allocatable :: array_of_double_ptr_loc(:)

! Convert
if (C_ASSOCIATED(array_of_double_ptr)) then
  call C_F_POINTER(array_of_double_ptr, array_of_double_ptr_loc_intermediate, shape=[num_elements])
  allocate(array_of_double_ptr_loc(num_elements))
  array_of_double_ptr_loc = real(array_of_double_ptr_loc_intermediate, kind(array_of_double_ptr_loc))
end if

This matches my imagination. Let us use safealloc instead of allocate.

Usually a leading underscore is understood to mean that something is private.

I checked this on Internet. Understood. Let us follow this convention.

Plus I don't like the _loc suffix.

I will be happy to change if there is a better option.

Thank you.

nbelakovski · 2024-01-08T07:37:55Z

This matches my imagination. Let us use safealloc instead of allocate.

Of course, I was only using allocate for the examples.

Should this PR make the change for all the optional variables, or just the 4 I started with? We can also make an issue to fix the rest of them in a separate PR.

zaikunzhang · 2024-01-08T07:52:55Z

What are the other variables?

I guess the motivation for these four variably is that they are vectors, so allocation is needed somewhere. For scalar variables, what is the motivation?

I have no objection, but I hope to understand the situation better. If the modification leads to a better (more friendly) interface or better (cleaner or more consistent) code, then of course we should do it.

It is fine to handle the other variables on a separate PR, but I hope the delay would not be long, and I hope the C API will stabilize soon.

Thank you.

nbelakovski · 2024-01-08T08:09:36Z

The other variables currently exposed are

rhobeg
rhoend
maxfun
iprint
ftarget
npt

There are of course other optional variables in the various algorithms that are not currently exposed, like honour_x0, eta, gamma, and others. For variables like maxfun and npt, we currently have logic within the C interface to initialize those, but this logic is duplicated as compared to what's in Fortran. It would be nice if C didn't duplicate any initialization code.

The motivation really isn't related to whether or not the variables are vectors or scalars. The vector quantities were initially motivated by wanting to reduce unnecessary memory allocations, but overall both types would benefit from the ability to mark them as null in order to indicate to Fortran that they are "not present."

zaikunzhang · 2024-01-08T09:59:33Z

For variables like maxfun and npt, we currently have logic within the C interface to initialize those, but this logic is duplicated as compared to what's in Fortran. It would be nice if C didn't duplicate any initialization code.

The motivation really isn't related to whether or not the variables are vectors or scalars. The vector quantities were initially motivated by wanting to reduce unnecessary memory allocations, but overall both types would benefit from the ability to mark them as null in order to indicate to Fortran that they are "not present."

I don't necessarily disagree. We can do it.

However, duplicating the initialization in C is not necessarily a bad thing: if a pure C implementation will be done in the future, then it will need an interface that "duplicates (implements) the initialization".

Indeed, the MATLAB interface of PRIMA and the Python interface of PDFO both implement the initialization. Indeed, they do much more, including problem cleaning, pre-processing, and refining, all of which are done very carefully. Obviously, these things are much easier to do in MATLAB/Python than in Fortran (or C). In comparison, the initialization in Fortran is relatively "primitive". I hope the Python interface for PRIMA will be similar to the MATLAB interface for PRIMA and the Python interface for PDFO.

zaikunzhang · 2024-01-08T16:56:43Z

I was able to make a test program work with a different value of RP by going through an intermediate step:

How to test similar things systematically? We should include such tests in our CI. See #129.

nbelakovski · 2024-01-09T00:46:44Z

If the initialization only involved default values, then there wouldn't be very much to discuss - it would be trivial to implement in any language and not too difficult to copy the dozen or so optional variables that we have. However, I've noticed that some variables have more complicated initialization routine, like rhobeg and rhoend, for example.

	present(rhobeg) and present(rhoend)	present(rhobeg) and not present(rhoend)	not present(rhobeg) and present(rhoend)	not present(rhobeg) and not present(rhoend)
rhobeg	User-specified	User-specified	max(10*rhoend, RHOBEG_DFT)	RHOBEG_DFT
rhoend	User-specified	max(EPS, min(0.1*rhobeg, RHOEND_DFT))	User-specified	RHOEND_DFT

eta1 and eta2 have similar logic where they are initialized together. Right now in the C interface we have no way of detecting that rhoend was provided but rhobeg was not. I don't see this applying to other variables, so I think the logical course would be to implement the _rhobeg structure for these 4 variables (rhobeg, rhoend, eta1, eta2), but the others don't need this structure since they just take a default value. Since we don't currently expose eta1, eta2, I'll just implement it for rhoend and rhobeg.

nbelakovski · 2024-01-09T01:54:26Z

It seems nvfortran does not like my latest approach. I am investigating.

zaikunzhang · 2024-01-09T04:53:06Z

If the initialization only involved default values, then there wouldn't be very much to discuss - it would be trivial to implement in any language and not too difficult to copy the dozen or so optional variables that we have. However, I've noticed that some variables have more complicated initialization routine, like rhobeg and rhoend, for example.

present(rhobeg) and present(rhoend) present(rhobeg) and not present(rhoend) not present(rhobeg) and present(rhoend) not present(rhobeg) and not present(rhoend)
rhobeg User-specified User-specified max(10rhoend, RHOBEG_DFT) RHOBEG_DFT
rhoend User-specified max(EPS, min(0.1rhobeg, RHOEND_DFT)) User-specified RHOEND_DFT

Excuse me, @nbelakovski , it is a bit unclear to me what is the issue here.

Do you mean you will implement the same initialization in the C interface, given that we have agreed on your proposed way of indicating the absence of rhobeg (and rhoend)?

If the answer to the above question is yes, then wouldn't it contradict the following idea? According to this idea, shouldn't we just read the values of rhobeg and rhoend dumbly if they are present, and then pass whatever received to the Fortran code?

I think that ideally C should be a 'dumb' pass-through and should let Fortran handle as much as possible.

Thanks.

nbelakovski · 2024-01-09T05:08:57Z

Do you mean you will implement the same initialization in the C interface, given that we have agreed on your proposed way of indicating the absence of rhobeg (and rhoend)?

No, I was pointing out that the C interface as currently designed (i.e in the main branch, not in this PR) is broken in the sense that it can only pass values for rhobeg and rhoend and cannot indicate if they are absent. So if the user supplies rhobeg and does not supply rhoend, instead of setting rhoend to 0.1*rhobeg, it will just use the default value for rhoend as currently specified in the C interface.

This seems like a problem to me, and given my statement that I think C should be a 'dumb' pass-through, I think the way to solve this is to use that approach with rhobeg and _rhobeg to indicate to Fortran whether the user has provided a value for that variable or not. Then Fortran can do the proper initialization. Does that make sense?

I've investigated the nvfortran issue. The issue seems to come from the -Mchkptr flag passed to nvfortran. It complains when bobyqa is called from bobyqa_c that rhobeg_loc is null. This seems like a bug with nvfortran, since we're not dereferencing the pointer at this stage (and in fact in Fortran it's legal to pass unitialized or unallocated pointers to intrinsic procedures like present). I don't see a way around this in code. Do you think it would be appropriate to either remove the -Mchkptr flag from cmake.yml, or to modify the CMake files so that the flag is used when compiling Fortran examples but not the C interface?

zaikunzhang · 2024-01-09T05:15:26Z

if the user supplies rhobeg and does not supply rhoend, instead of setting rhoend to 0.1*rhobeg, it will just use the default value for rhoend as currently specified in the C interface.

You are right. This setting is wrong.

This seems like a problem to me, and given my statement that I think C should be a 'dumb' pass-through, I think the way to solve this is to use that approach with rhobeg and _rhobeg to indicate to Fortran whether the user has provided a value for that variable or not. Then Fortran can do the proper initialization. Does that make sense?

Yes. Agreed.

zaikunzhang · 2024-01-09T05:18:55Z

Thank you @nbelakovski for the investigation.

This seems like a bug with nvfortran, since we're not dereferencing the pointer at this stage (and in fact in Fortran it's legal to pass unitialized or unallocated pointers to intrinsic procedures like present).

I see. It will not surprise me if we have caught another compiler bug. It would be better if we can submit a minimal example to nvfortran and get a confirmation that it is their bug.

Do you think it would be appropriate to either remove the -Mchkptr flag from cmake.yml, or to modify the CMake files so that the flag is used when compiling Fortran examples but not the C interface?

The second approach sounds better unless it is much more complicated to implement.

nbelakovski · 2024-01-09T05:50:23Z

It looks like someone has run into this before: https://forums.developer.nvidia.com/t/bug-in-nvfortran-with-mchkptr-for-unallocated-optional-arguments/223220

They claim it's correct behavior, but they understand that this is a valid use case and offered to make a second flag to allow this sort of behavior while -Mchkptr is used. It doesn't seem like there was any follow up. I'll make a topic to follow up.

In the meantime, I dug around, and it seems like it might technically be possible to remove a flag from within CMake for a specific target, but there's no standard-conforming way to do it, and I think it would be best to just remove the chkptr flag from CMake. If we need it, we can make a separate github action that only builds and tests the Fortran code and not the C interface.

Edit: nvidia forum topic to revive the discussion around Mchkptr suboption https://forums.developer.nvidia.com/t/mchkptr-doesnt-allow-passing-null-optional-values/278293

nbelakovski · 2024-01-09T05:59:17Z

OK this PR is good to go @zaikunzhang

zaikunzhang · 2024-01-09T07:08:57Z

Thank you @nbelakovski . I have made some comments. Can you see them (I am not sure)?

Do not do a force push before checking all the comments, or they will disappear.

nbelakovski · 2024-01-09T07:14:42Z

No, I didn't see them, I'm sorry I didn't know force-pushing removed comments. Could you please re-comment?

c/bobyqa_c.f90

c/prima.c

zaikunzhang · 2024-01-09T07:17:07Z

What about now? @nbelakovski

zaikunzhang · 2024-01-24T01:26:08Z

On my side, this is the only thing left before merging.

For this I'd apply the same arguments I applied towards making sure we initialize npt and maxfun within the C code. The reason to use NaN to indicate that they're absent is so that the Fortran code can take the default path, instead of passing NaN through the whole chain of prima_minimize -> algorithm_c -> algorithm -> preproc before it gets acted upon.

But this substantially complicates the code. In addition, the code about f0 is wrong - check the code about f0 and nlconstr0 in cobyla.f90 (I am on cellphone so I cannot make a link easily). Thanks.

nbelakovski · 2024-01-24T05:12:03Z

The code around f0 is correct. f0_loc is allocatable, so if f0 is NaN, then f0_loc will not be allocated and therefore it will be interpreted as not present.

Let me summarize the desired state of affairs so that we have a clear picture of the whole thing:

integers are set to 0 by default, and if they are not overwritten by user C will initialize them correctly.
doubles are set to NAN by default, and C will do nothing with them, and neither will algorithm_c (it will pass NAN or an overwritten value to algorithm without looking at it)
arrays are set to NULL by default, and C will do nothing with them, but algorithm_c will interpret NULL in such a way that algorithm will believe that no value has been provided for the array, or it will interpret an overwritten value in such a way as to pass that data to algorithm.

To be clear what I wrote above is the desired state of affairs as I understand it from your comments. Please correct me if I'm wrong. I think the idea of passing NAN through into the algorithm introduces too much coupling between the C and Fortran code don't you think? By doing so the interface between the C and Fortran code no longer stops at algorithm_c, as it did before, but now reaches all the way into preproc. I would think that there should be freedom to modify internal implementation details of Fortran without worrying about whether or not it will break C, don't you think? The obvious exceptions are modifications to the interface of algorithm or modification to the callback, but both of those should produce compile errors, and it's quite obvious that changes to them will break other parts of the code, since they are major interfaces. But the details of preproc are not an interface, and I have concerns about future changes to preproc leading to issues that only surface at runtime instead of compile-time, and then they might not be caught by tests.

I guess one could make the argument that NaN is part of the interface, i.e. rhobeg could either be not specified or it could be specified as NaN with the same effect, but I notice that despite code changes to support this, none of the documentation is updated to reflect this. Shouldn't it be? Or would it make more sense to just rely on present instead of modifying the code and documentation to support NaN in addition to "not present"?

On a separate but related note I've noticed that if I remove the nan checks for rhobeg/rholoc then preproc will generate warnings about their values. I think this line is faulty - you're checking for NaN after you've overwritten NaN. I think that line should be inside the else statement just above it (assuming no changes in direction based on the above arguments)?

zaikunzhang · 2024-01-24T06:21:19Z

I am still on cellphone so will be brief.

You last paragraph is right. I will correct it.

I am afraid the first paragraph is wrong: it is possible that f0 is really evaluated to NaN. f0 is different from rhobeg etc. It is not a number freely set by the user. It may be produced by f. I Hope you read my comments on f0 and nlconstr0 again in cobyla.f90.

zaikunzhang · 2024-01-24T06:56:11Z

I guess one could make the argument that NaN is part of the interface, i.e. rhobeg could either be not specified or it could be specified as NaN with the same effect, but I notice that despite code changes to support this, none of the documentation is updated to reflect this. Shouldn't it be? Or would it make more sense to just rely on present instead of modifying the code and documentation to support NaN in addition to "not present"?

See https://github.com/libprima/prima/blob/main/fortran/common/preproc.f90#L17-L30

nbelakovski · 2024-01-24T07:12:58Z

I am afraid the first paragraph is wrong: it is possible that f0 is really evaluated to NaN.

I see what you mean, NaN can be an actually valid value. But I looked at your comment in coblya.f90 and I think this is a good example of why it's not a good idea to accept either "not present" or nan as equivalent. From the user's perspective, if a user provides f0 and it happens to be NaN, but does not provide nlconstr0, then their objective function will be run again needlessly, and the rationale will not be communicated to them. Or if it is communicated to them via a message, it would require them to make a code change to either not pre-compute f0 or to additionally compute nlconstr0. This doesn't seem very user friendly.

I read through your comment in preproc.f90#L17-L30. Why are we putting this in Fortran? We can handle these issues within C (and when I say C I'm including the algorithm_c files, even though they are written in Fortran, because they form part of the C interface). I was thinking that if we handle it in C then related code can be kept together. Putting this code in preproc.f90 makes it hard to discover. The comment you linked is not next to any code that handles NaN, and I did not see it until you personally pointed me to it. Don't you think it makes more sense to handle this within C?

As for f0 I think we need to make it a pointer within C and set it to null, and we need the user to allocate it. Since NaN is a valid value, we have no choice, we can't even use the _f0 method suggested earlier.

zaikunzhang · 2024-01-24T07:33:26Z

if a user provides f0 and it happens to be NaN, but does not provide nlconstr0.

This is not allowed. The two are either both present or both absent --- don't ask why, or this is the answer:

in practice, it happens very often that the simulator provides both f(x) and c(x) together; it is impossible to evaluate only one of them.
Do you recall what was the motivation for enabling the "user" to provide [f0, c(x0)]? In fact, we do not enable the "user" to do it. The real reason is to enable the Python/MATLAB/Julia/R interfaces to do it. Because they need to evaluate c(x0) to no the size of the constraint. This is not some facility prepared for the end user.

In summary, if what you said happens, then the Python/MATLAB/Julia/R interfaces are coded wrongly.

If some smart user discovers that he/she can provide f0, c(x0), and provides only one of them, the Fortran code will just brutally tell him/her that it is not allowed.

zaikunzhang · 2024-01-24T07:35:08Z

it's not a good idea to accept either "not present" or nan as equivalent.

This is what I believed. I compromised and changed the Fortran code because of our discussion.

zaikunzhang · 2024-01-24T07:39:57Z

As for f0 I think we need to make it a pointer within C and set it to null, and we need the user to allocate it. Since NaN is a valid value, we have no choice, we can't even use the _f0 method suggested earlier.

Since f0 and nlconstr0 must be both present or absent, we can tell exactly what is happening by checking whether nlconstr0 is present.

Note that the only solver that accepts [f0, nlconstr0] is cobyla. If they are provided to any other solver, a warning should be raised and they will be ignored. Or should we return with an error code? What do you think?

zaikunzhang · 2024-01-24T07:45:15Z

Let me summarize the desired state of affairs so that we have a clear picture of the whole thing:

integers are set to 0 by default, and if they are not overwritten by user C will initialize them correctly.

doubles are set to NAN by default, and C will do nothing with them, and neither will algorithm_c (it will pass NAN or an overwritten value to algorithm without looking at it)

arrays are set to NULL by default, and C will do nothing with them, but algorithm_c will interpret NULL in such a way that algorithm will believe that no value has been provided for the array, or it will interpret an overwritten value in such a way as to pass that data to algorithm.

To be clear what I wrote above is the desired state of affairs as I understand it from your comments. Please correct me if I'm wrong. I think the idea of passing NAN through into the algorithm introduces too much coupling between the C and Fortran code don't you think? By doing so the interface between the C and Fortran code no longer stops at algorithm_c, as it did before, but now reaches all the way into preproc. I would think that there should be freedom to modify internal implementation details of Fortran without worrying about whether or not it will break C, don't you think? The obvious exceptions are modifications to the interface of algorithm or modification to the callback, but both of those should produce compile errors, and it's quite obvious that changes to them will break other parts of the code, since they are major interfaces. But the details of preproc are not an interface, and I have concerns about future changes to preproc leading to issues that only surface at runtime instead of compile-time, and then they might not be caught by tests.

I agree. Then please correct the code regarding f0 and nlconstr0 while keeping the other parts (rhobeg etc) unchanged. I will merge it, correct the Fortran code regarding is_nan(rhobeg) (your last paragraph), and modify the comment about NaN and 0. Is there anything else?

…as "this argument takes the default value"; warnings will be raised when such values are received; see #135 (comment)

zaikunzhang · 2024-01-24T08:24:48Z

See

db3c939
1cc6a69

zaikunzhang · 2024-01-24T15:02:14Z

Hi, @nbelakovski ,

I recall now that

npt and maxfun are correlated
the minimal value of maxfun is solver dependent

See
https://github.com/libprima/prima/blob/main/fortran/common/preproc.f90#L136-L159
https://github.com/libprima/prima/blob/main/fortran/common/preproc.f90#L172-L180

Therefore, I suggest handling maxfun = 0 and npt = 0 in the same way as rhobeg = NaN, i.e., passing a null pointer to the Fortran backend. Otherwise, the initialization will be complicated. In addition, we are currently using two different ways to handle the same situation for integers and reals. This consistency is undesirable.

BTW, only NEWUOA, BOBYQA, and LINCOA accept npt; a warning should be raised if npt is provided to other solvers (e.g., "COBYLA receives npt, which will be ignored".) Note that this situation differs from NEWUOA receiving xl,u, because npt is an optional algorithmic parameter and xl,u is part of the problem.

As I was working on the Python bindings, I ran into some issues related to memory management. These changes should help with that, particularly the change to the allocation of nlconstr0, since that was the one that appeared to be giving issues with Python. In general I'd prefer to do as little memory allocation within the C interface as possible, since it's challenging enough to have memory management in 1 language, let alone 4 when we start interacting with Python->C++->C->Fortran. What enables the removal of these allocations is that I've discovered that if a variable in Fortran is marked as a pointer and set to null(), then the Fortran function `present` returns false on that variable. This means we can use NULL from within C to indicate that a variable is not provided. This approach has a small inconvenience in that the user will have to handle allocations for optional variables. For example, with f0, the user would have to do the following: double f0 = obj_fun(x0); problem->f0 = &f0; Likewise for things like rhobeg, rhoend, the user would have to do something like the following: double rhobeg = 1.0; options->rhobeg = &rhobeg; double rhoend = 1.0e-6; options->rhoend = &rhoend; This is in constrast to the current way of doing things: options->rhobeg = 1.0; options->rhoend = 1.0e-6; As annoying as the previous option is, it might be preferable since it means we can stop duplicating initialization logic within C. That logic is currently implemented in `prima_init_options` and `prima_check_problem`, but it doesn't replicate the full logic for rhobeg/rhoend. I think that ideally C should be a 'dumb' pass-through and should let Fortran handle as much as possible. We can use this PR to add this functionality to the rest of the variables, but I figured I would start with removing the memory allocations before going too much further.

An intermediate variable was introduced so that the interface will still work if RP=4. The method of providing f0 was changed to be a little bit more user friendly, and rhobeg and rhoend adopted this approach. Additionally, the "unset" value of maxfun and npt was changed to 0 based on discussion in the PR.

c/prima.c

nbelakovski · 2024-01-25T06:01:49Z

@zaikunzhang I've pushed the changes related to maxfun and npt, should be good to go.

zaikunzhang · 2024-01-25T06:33:45Z

@zaikunzhang I've pushed the changes related to maxfun and npt, should be good to go.

I agree with the logic of the code now.

However, does the following code assume that the status of an uninitialized allocatable is unallocated? Does the standard specify this?

I guess the status is "undefined", and compilers will complain about it.

real(RP), allocatable :: rhobeg_loc
...
if (.not. is_nan(rhobeg)) then
    rhobeg_loc = real(rhobeg, kind(rhobeg_loc))
end if

nbelakovski · 2024-01-25T06:50:41Z

However, does the following code assume that the status of an uninitialized allocatable is unallocated? Does the standard specify this?

It seems there are some Windows failures, I'm investigating.

Edit: I just want to add that I think the above makes sense. If they were not initialized to unallocated, and were undefined then checking if they are allocated before allocating anything to them wouldn't make sense, but it seems like a logical check. Imagine I passed that variable to a function and I want to know if that function allocated that variable. I couldn't rely on allocated if it might be undefined.

maxfun and npt were configured so that the default value of 0 is interpeted as not present. The notes on the default values in C were moved from the README to the implementation for better discoverability.

nbelakovski · 2024-01-25T06:53:30Z

It looks like the Windows failures are because Chocolatey failed to download wget. Here's a run from a couple pushes ago where everything worked: https://github.com/libprima/prima/actions/runs/7650480280. The only changes since that were in comments, so there's no code issue. I'm running again to see if there's a chocolatey problem we need to work around.

Edit: Looks like it was just a one off issue with chocolatey, things seem fine now.

zaikunzhang · 2024-01-25T07:11:08Z

However, does the following code assume that the status of an uninitialized allocatable is unallocated? Does the standard specify this?

It seems there are some Windows failures, I'm investigating.
Edit: I just want to add that I think the above makes sense. If they were not initialized to unallocated, and were undefined then checking if they are allocated before allocating anything to them wouldn't make sense, but it seems like a logical check. Imagine I passed that variable to a function and I want to know if that function allocated that variable. I couldn't rely on allocated if it might be undefined.

I agree with what you said. However, the part of the standard you referred to is talking about derived-types. It is not for a generic local variable.

Fortunately, I found the following in Section 9.7.1.3 "Allocation of allocatable variables" on page 149 of WD 1539-1 J3/23-007r1 (Draft Fortran 2023):

An unsaved allocatable local variable of a procedure has a status of unallocated at the beginning of each invocation
of the procedure.

That said, I think everything is fine.

nbelakovski requested a review from zaikunzhang as a code owner January 7, 2024 22:40

nbelakovski commented Jan 7, 2024

View reviewed changes

zaikunzhang requested a review from jschueller January 8, 2024 05:02

nbelakovski force-pushed the refactor_c_interface branch from 78760c3 to 2dd80c5 Compare January 9, 2024 01:44

nbelakovski force-pushed the refactor_c_interface branch from 2dd80c5 to 9d124a8 Compare January 9, 2024 05:53

zaikunzhang reviewed Jan 9, 2024

View reviewed changes

c/bobyqa_c.f90 Show resolved Hide resolved

c/bobyqa_c.f90 Outdated Show resolved Hide resolved

c/bobyqa_c.f90 Show resolved Hide resolved

c/bobyqa_c.f90 Outdated Show resolved Hide resolved

c/prima.c Outdated Show resolved Hide resolved

c/prima.c Outdated Show resolved Hide resolved

nbelakovski force-pushed the refactor_c_interface branch from 9d124a8 to 995c330 Compare January 11, 2024 05:42

zaikunzhang added a commit that referenced this pull request Jan 24, 2024

240124.161948.HKT fortran: remove the code that interprets NaN and 0 …

db3c939

…as "this argument takes the default value"; warnings will be raised when such values are received; see #135 (comment)

nbelakovski added 4 commits January 24, 2024 21:46

Removed _m_nlcon since it is not used, privately or publicly

0d47be2

Rename the C readme so that it gets processed as markdown

37bb669

nbelakovski force-pushed the refactor_c_interface branch from 9d445ed to 07ad335 Compare January 25, 2024 05:48

github-advanced-security bot found potential problems Jan 25, 2024

View reviewed changes

c/prima.c Fixed Show fixed Hide fixed

nbelakovski force-pushed the refactor_c_interface branch 2 times, most recently from f4c42ed to e9165e4 Compare January 25, 2024 06:01

PR feedback

7b3a9d2

maxfun and npt were configured so that the default value of 0 is interpeted as not present. The notes on the default values in C were moved from the README to the implementation for better discoverability.

nbelakovski force-pushed the refactor_c_interface branch from e9165e4 to 7b3a9d2 Compare January 25, 2024 06:51

zaikunzhang merged commit 5ba7c90 into libprima:main Jan 25, 2024
35 checks passed

nbelakovski deleted the refactor_c_interface branch January 25, 2024 07:18

zaikunzhang mentioned this pull request Jan 25, 2024

Apply the modifications of https://github.com/libprima/prima/pull/135 to Aeq, beq, Aineq, and bineq? #146

Open

Remove allocations for xl, xu, nlconstr0 in C interface #135

Remove allocations for xl, xu, nlconstr0 in C interface #135

Conversation

nbelakovski commented Jan 7, 2024

nbelakovski Jan 7, 2024

Choose a reason for hiding this comment

zaikunzhang commented Jan 8, 2024

zaikunzhang commented Jan 8, 2024 • edited Loading

nbelakovski commented Jan 8, 2024

zaikunzhang commented Jan 8, 2024

zaikunzhang commented Jan 8, 2024 • edited Loading

nbelakovski commented Jan 8, 2024 • edited Loading

zaikunzhang commented Jan 8, 2024 • edited Loading

nbelakovski commented Jan 8, 2024

zaikunzhang commented Jan 8, 2024 • edited Loading

nbelakovski commented Jan 8, 2024 • edited Loading

zaikunzhang commented Jan 8, 2024 • edited Loading

zaikunzhang commented Jan 8, 2024

nbelakovski commented Jan 9, 2024

nbelakovski commented Jan 9, 2024

zaikunzhang commented Jan 9, 2024 • edited Loading

nbelakovski commented Jan 9, 2024

zaikunzhang commented Jan 9, 2024

zaikunzhang commented Jan 9, 2024 • edited Loading

nbelakovski commented Jan 9, 2024 • edited Loading

nbelakovski commented Jan 9, 2024

zaikunzhang commented Jan 9, 2024

nbelakovski commented Jan 9, 2024

zaikunzhang commented Jan 9, 2024

zaikunzhang commented Jan 24, 2024

nbelakovski commented Jan 24, 2024 • edited Loading

zaikunzhang commented Jan 24, 2024

zaikunzhang commented Jan 24, 2024

nbelakovski commented Jan 24, 2024

zaikunzhang commented Jan 24, 2024 • edited Loading

zaikunzhang commented Jan 24, 2024 • edited Loading

zaikunzhang commented Jan 24, 2024 • edited Loading

zaikunzhang commented Jan 24, 2024 • edited Loading

zaikunzhang commented Jan 24, 2024

zaikunzhang commented Jan 24, 2024 • edited Loading

nbelakovski commented Jan 25, 2024

zaikunzhang commented Jan 25, 2024

nbelakovski commented Jan 25, 2024 • edited Loading

nbelakovski commented Jan 25, 2024 • edited Loading

zaikunzhang commented Jan 25, 2024

zaikunzhang commented Jan 8, 2024 •

edited

Loading

zaikunzhang commented Jan 8, 2024 •

edited

Loading

nbelakovski commented Jan 8, 2024 •

edited

Loading

zaikunzhang commented Jan 8, 2024 •

edited

Loading

zaikunzhang commented Jan 8, 2024 •

edited

Loading

nbelakovski commented Jan 8, 2024 •

edited

Loading

zaikunzhang commented Jan 8, 2024 •

edited

Loading

zaikunzhang commented Jan 9, 2024 •

edited

Loading

zaikunzhang commented Jan 9, 2024 •

edited

Loading

nbelakovski commented Jan 9, 2024 •

edited

Loading

nbelakovski commented Jan 24, 2024 •

edited

Loading

zaikunzhang commented Jan 24, 2024 •

edited

Loading

zaikunzhang commented Jan 24, 2024 •

edited

Loading

zaikunzhang commented Jan 24, 2024 •

edited

Loading

zaikunzhang commented Jan 24, 2024 •

edited

Loading

zaikunzhang commented Jan 24, 2024 •

edited

Loading

nbelakovski commented Jan 25, 2024 •

edited

Loading

nbelakovski commented Jan 25, 2024 •

edited

Loading