Better handling of huge tableau #18
I'd probably place it in `/etc/wenoof/` or within a `share` directory. When applications are installed from a distro's package manager, such files end up in `/usr/share/PACKAGE-NAME/`. It would not be appropriate to place them in a directory called `include`, as this is for header files for libraries.
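A minimal sketch of the lookup order this suggests: an environment-variable override first, then the packaged share location, then a project-local directory for in-source usage. The WENOOF_DATA_DIR variable and the helper name are hypothetical, not an existing WenOOF API.

module wenoof_data_dir
  ! Hypothetical helper illustrating the search order discussed above.
  implicit none
contains
  function find_tableau(filename) result(path)
    character(*), intent(in)  :: filename
    character(:), allocatable :: path
    character(len=1024) :: envdir
    integer :: stat
    logical :: exists
    ! 1. explicit user override via environment variable (assumed name)
    call get_environment_variable('WENOOF_DATA_DIR', envdir, status=stat)
    if (stat == 0) then
      path = trim(envdir)//'/'//filename
      inquire(file=path, exist=exists)
      if (exists) return
    end if
    ! 2. location used by distro package managers
    path = '/usr/share/wenoof/'//filename
    inquire(file=path, exist=exists)
    if (exists) return
    ! 3. in-source fallback for users who build without installing
    path = 'include/'//filename
    inquire(file=path, exist=exists)
    if (.not. exists) path = ''
  end function find_tableau
end module wenoof_data_dir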
|
I agree with @cmacmackin about where one would place such tables. However, for performance reasons, I would recommend exploring using the tables to generate, at compile or configure/CMake time, source files that have the coefficients hard-coded. I know that I am violating the "premature optimization is the root of all evil" principle, but it is very likely that your flux stencil coefficients and/or your smoothness coefficients are going to sit in the inner-most kernel and be fairly computationally expensive, so I would recommend doing some experiments to check whether reading these coefficients from a file, rather than making them compile-time constants, has an adverse performance impact. |
Chris, thank you very much: the right dirs are what I was searching for. Zaak, I do not understand the performance issue: the coefficients should be loaded once, during the integrator creation, not during the actual usage of the integrators. Moreover, I do not understand how to generate the tables without hard-coding them in some way, whether in Fortran, configure/make, Python, etc. To me, having the tables in JSON is very handy. Can you elaborate a bit more? Thank you very much guys! Cheers |
Yes, it is read from disk once, but then it is placed in a variable which is subject to the tyranny of the memory hierarchy (moved around between RAM, L3, L2, L1 and registers). The CPU may not have any guarantee that the value hasn't changed, so it may end up fetching it from further away than it needs to. Compile-time constants can be embedded in the instructions themselves, if I understand correctly---which I may not, I am a poor Fortran guy---which means that they may not take up registers needed by other data, and may be fetched along with the instruction. As I said, my understanding here is pretty limited, but I do know that I've heard people who know more about the hardware layer than I do discussing the merits of compile-time constants.
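A tiny illustration of the distinction (names and values are placeholders): the first array is a compile-time constant that the compiler is free to fold into the generated instructions; the second is filled at run time, so every use is a genuine memory access.

module constants_vs_variables
  implicit none
  ! compile-time constant: the compiler may fold these values
  ! directly into the instruction stream of any code that uses them
  real, parameter :: c_const(3) = [0.1, 0.6, 0.3]
  ! run-time variable: filled from a file after startup, so every
  ! use is a real memory load
  real :: c_runtime(3)
end module constants_vs_variables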
Yes, my idea is simple: use generated source code. If you don't wish to write coefficients in hardcoded tables (either because you have a formula that can generate them, or due to readability issues, etc.) then you have another program write the Fortran source code for you before compiling the main library/program. You could put the tables of coefficients into a JSON file and then have a Python, Fortran, or other program that reads the JSON file and writes a Fortran module with the same tables, but as compile-time constants, like:

module coefficients
  implicit none
  ! don't remember what the dimensions should be or the coefficients
  ! and am too lazy to look it up right now
  real, parameter :: ISk(4,4) = reshape( [3.0/12.0, 4.5/12.0 ...
  ...
end module

The timings of this implementation could be compared to the version of the code that directly reads the coefficients from the JSON file into memory, without the effort/complication of creating generated sources. CMake has capabilities to handle generated sources. It would be a bit more complicated, perhaps, to roll your own, but you can do it with a makefile or by another means. I hope I have been clearer.
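A sketch of such a generator, assuming json-fortran's json_file interface; the coefficients.json file name, the 'ISk' key, and the flat-array layout are assumptions for illustration, not the real WenOOF tables.

program generate_coefficients
  ! Sketch only: reads a flat coefficient array from a hypothetical
  ! coefficients.json and writes a Fortran module with the same data
  ! as a compile-time constant. Assumes json-fortran is available.
  use json_module
  use iso_fortran_env, only : real64
  implicit none
  type(json_file) :: json
  real(real64), allocatable :: isk(:)
  logical :: found
  integer :: u, i

  call json%initialize()
  call json%load_file(filename='coefficients.json') ! hypothetical file
  call json%get('ISk', isk, found)                  ! hypothetical key
  if (.not. found) stop 'ISk not found'

  ! write the generated module with the table as a parameter
  open(newunit=u, file='coefficients.f90', status='replace')
  write(u, '(a)') 'module coefficients'
  write(u, '(a)') '  implicit none'
  write(u, '(a,i0,a)') '  real, parameter :: ISk(', size(isk), ') = [ &'
  do i = 1, size(isk)
    if (i < size(isk)) then
      write(u, '(4x,es23.15,a)') isk(i), ', &'
    else
      write(u, '(4x,es23.15,a)') isk(i), ' ]'
    end if
  end do
  write(u, '(a)') 'end module coefficients'
  close(u)
  call json%destroy()
end program generate_coefficients
|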
Dear @zbeekman, thank you for your idea: I agree with you that using
parameters could be better for performance reasons...
|
You won't know for sure until you can compare the techniques... but I just thought it was worth mentioning, since it is likely that the smoothness computation is an expensive, inner-most kernel.
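A minimal timing harness for such a comparison; the smoothness_indicators routine and the repetition count are placeholders for the real kernel under test.

program compare_timings
  ! Sketch of the experiment suggested above: time the inner kernel
  ! once with coefficients as parameters and once with coefficients
  ! read from JSON, then compare the elapsed times.
  use iso_fortran_env, only : real64, int64
  implicit none
  integer, parameter :: n = 10**7   ! placeholder repetition count
  integer(int64) :: t0, t1, rate
  real(real64) :: elapsed
  integer :: i

  call system_clock(t0, rate)
  do i = 1, n
    ! call smoothness_indicators(...)  ! placeholder kernel under test
  end do
  call system_clock(t1)
  elapsed = real(t1 - t0, real64) / real(rate, real64)
  print '(a,es12.5,a)', 'elapsed: ', elapsed, ' s'
end program compare_timings
|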
Zaak, thank you for your insight.
Oh, sorry, I had not realized that you were referring to parameters, my bad. Sure, parameters are always handled better (I hope) than other memory, but in this specific case I did not consider them because of some practical issues (see below).

Your generated coefficients module above is an option, but it has its own cons. Currently, we have 8 different sets of polynomial coefficients and linear (optimal) coefficients and 3 different WENO variants (JS, JS-Z, JS-M), resulting in 24 different integrators: from its birth, it was clear to me that, to preserve easy maintenance/improvements and to allow a lot of different schemes, I need a flexible OOP pattern. The strategy pattern is very attractive in this scenario and allocatable variables are ubiquitous here. This was the main reason why I never considered parameters. I have full trust in your experience, thus if you think it is worth trying, I do too. Your coefficient modules should become something like:

module wenoof_coefficients
  implicit none
  private
  public :: beta_S2, beta_S3, ..., beta_S8
  public :: gamma_S2, gamma_S3, ..., gamma_S8
  real(R_P), parameter, target :: beta_S2(...:...,...:...) = reshape([....], ...)
  real(R_P), parameter, target :: beta_S3(...:...,...:...) = reshape([....], ...)
  ....
  real(R_P), parameter, target :: gamma_S8(...:...,...:...) = reshape([....], ...)
endmodule wenoof_coefficients

I used the target specification for the following reason: when a user instantiates an interpolator (s)he must select the accuracy, namely the stencil number/dimension. Thus, when performing the interpolation, there are essentially 2 possible logics:

1. for each interpolate call we must check the number S (by means of an if-elseif or select-case construct) and then access the right beta_S# and gamma_S#;
2. avoid the check by building the interpolator with the proper set of coefficients inside it.

Currently, we adopt the second approach: by a strategy pattern, the interpolator is constructed with the proper set of coefficients, which are stored into allocatable members of the interpolator. Now, if we want to have parameter coefficients while avoiding the S-check for each interpolate call, we have a few options:

1. provide a set of concrete interpolators with a hard-coded reference to the proper parameter-coefficients set;
2. make the generic interpolator coefficients a pointer to the proper parameter-coefficients set.

Namely:

! concrete approach
type :: interpolator_S2
contains
  procedure :: interpolate ! here beta_S2 and gamma_S2 are directly accessed
endtype interpolator_S2

type :: interpolator_S3
contains
  procedure :: interpolate ! here beta_S3 and gamma_S3 are directly accessed
endtype interpolator_S3
! and so on...

! pointer approach
type :: interpolator
  real(R_P), pointer :: beta(:,:)
  real(R_P), pointer :: gamma(:,:)
contains
  procedure :: init        ! here beta and gamma are associated to the correct beta_S#, gamma_S#
  procedure :: interpolate ! here the local beta, gamma members are accessed
endtype interpolator

Is it possible to associate a pointer to a parameter? If so, is the memory handling still good?

In the end, I am really in doubt about which approach is better and, overall, whether the performance will increase. As a matter of fact, while the coefficients are surely constants, the smoothness indicators are not and must be stored in dynamic memory: the tyranny of the memory hierarchy cannot be completely avoided. My fear is mostly about code simplicity, conciseness and clearness: Damian (@rouson) taught me how important it is to be KISS, and handling coefficients by parameters looks very complex...
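(A side note on the pointer question above: standard Fortran does not allow combining the PARAMETER and TARGET attributes, so the pointer approach would need module variables with TARGET instead of named constants. A minimal sketch, with placeholder names and values rather than the real WenOOF tables:)

module wenoof_coefficients_sketch
  ! Sketch only: PARAMETER conflicts with TARGET in standard Fortran,
  ! so the tables become module variables; the association is done
  ! inside the module, as the interpolator's init would do once.
  use iso_fortran_env, only : real64
  implicit none
  real(real64), target :: beta_S2(2,2) = &
    reshape([real(real64) :: 1, 2, 3, 4], [2,2]) ! placeholder values
contains
  subroutine associate_beta(beta)
    real(real64), pointer, intent(out) :: beta(:,:)
    beta => beta_S2  ! one-time association, no per-call S-check
  end subroutine associate_beta
end module wenoof_coefficients_sketch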
I'll try to verify the performance difference with 1 case if I find the time. Zaak, thank you again, your help is priceless. Cheers |
I'm not the Fortran expert here, but if we use a JSON file to read the coefficients, is it possible to store them into parameters? In practice, we could adopt a "mixed" approach between our requirement of an interpolator with the proper set of coefficients inside it and the use of parameters, which could be very useful in terms of performance... I don't know if this approach is viable...
|
@giacombum
Nope: if you read them at run-time in the library, they cannot be parameters, because parameters are compile-time constants. If you want JSON-formatted coefficients you must go with Zaak's suggestion: a pre-processor that reads the JSON before you compile WenOOF. |
Currently, all tableau of coefficients are hard-coded in the sources. This has at least 2 cons:

1. it is really error-prone, with very bad visualization due to the 132-character line limit;
2. it is not flexible: adding/modifying a tableau requires touching the sources.

I think it is much better to read the tableau from a separate file at run-time. I like to encode them in JSON by means of json-fortran. However, the big question is: where should the default tableau files be placed?

@zbeekman (@rouson @cmacmackin @jacobwilliams and all having system knowledge) Maybe I have already asked an opinion about this, but I do not remember your answer.

Do you know if there is some standard (or almost standard) place where unix-like libraries search for their auxiliary include files read at run-time?

In the case a user does not want to perform a full installation, but wants to use the sources (as often happens in our Fortran ecosystem), where should we search for such files? Maybe we need an include directory in the root project...
Note
Some coefficients are well-defined integer fractions; it could be very useful to add FortranParser as a third-party library for parsing tableau with such coefficient definitions.
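For the simple case of plain p/q fractions, a parser library may not even be needed; a minimal sketch in plain Fortran, where the '13/12' token is just an example value:

program parse_fraction
  ! Minimal sketch: convert an integer-fraction string such as '13/12'
  ! into a real coefficient without any third-party parser.
  use iso_fortran_env, only : real64
  implicit none
  character(len=*), parameter :: token = '13/12' ! example coefficient
  integer :: p, q, slash
  real(real64) :: value

  slash = index(token, '/')
  read(token(:slash-1), *) p   ! numerator
  read(token(slash+1:), *) q   ! denominator
  value = real(p, real64) / real(q, real64)
  print '(a,f0.12)', token//' = ', value
end program parse_fraction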