-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
About clang-tidy and code standard #907
Comments
Dear @YanzhaoW , in fact I wanted to stay out of this discussion, but as you tagged me in your comment, you will have to bear with me ;-) Before I go into more details, I just want to give my personal opinion on the PR reviewing process. In my opinion, we don't only have computer-programming experts and detector experts, but we also have people thinking and speaking in the computer-programming language and people thinking and speaking in the language of physics. If a programming expert doesn't know/understand physics processes described by a given PR, he should not make any demands nor more or less nice comments on the coding of a given PR. If he wants to comment, then firstly he should contact a developer who made this PR, and ask him/her about processes and physics behind the PR. Once he understood this, he can then make comments/suggestions how the coding could be improved. If the programming expert doesn't have time (which is understandable, as we all are overloaded with work) or interest to learn about the physics behind the given PR, then he should also not comment on the programming of the given PR. You can take the best computer programming engineer in the world and give him the R3BRoot to “properly” code it, and you will have at the end a code which you will have to throw in the garbage can, as it will not work. You have to know and understand physics first and then program the code. What we have at the moment, unfortunately, is a divide between programming experts and detectors/physics experts. And as long as we have this divide, we will not have a solution for present problems. I remember when Michael and Dima started to work on the Runge-Kutta tracker in the R3BRoot, Dima being programming expert and Michael being physics experts. They spend hours and days working together, side by side, Michael explaining physics behind the tracking process and Dima explaining how these processes can be coded in the best ways. And this is what should be our goal. Now, let me come to some of your points about code standards. 1.Software is different as hardware – I don't agree with your “For the hardware you could say "It works fine now and there won't be any problem if we ship it now" “. Ask anyone working on e.g. electronics for different detectors, each and every of them will tell you that in the most cases hardware which functioned without any problem in the electronic lab, once brought in the cave didn't work anymore. So, hardware is not that easy, and anybody spending time in cave could tell you that. I agree with you that our ultimate goal should be to have a code which is easy to change/adjust, but we have to be realistic and not expect/demand to have something like this now. As I mentioned some time ago, all our detectors and electronic readouts are in a developing phase, in each experiment we go with new detectors, new electronics, new cabling... People working on experimental setup are also working on the analysis. We are in the stage where we need a code which is working now, with a present setup, and we don't have time (at least people working on detectors don't have time) to think about future of the code. Is it ideal situation? No, of course it is not. In the ideal world we would do as you write in this point, but unfortunately, we are living in a realistic world. For me, the best solution would be to have experiment-relevant repositories as each experiment has different demands, detectors, electronics, cables... Don't demand that code should analyze different configurations of a given detector in different experiment, as this will only demand time to be spent on debugging the code or making code that worked for previous configuration not working any more for that case, as we have experienced it many times. Once we are in the HEC, when detectors are ready, when we know how these detectors are read out, then we should work on the version of the code that fulfill your suggestions in the point 1. Then, we will have time that programming experts and detectors experts sit together and work together on the code. This time, unfortunately, we don't have now. Above comment is also valid for your point 3. 2.Your second point was readability. While I fully agree that code has to be readable, I don't fully agree with you how this readability should be performed. What we have to keep in mind, is that we are trying do describe different physics process using computer tools. Many of the processes we still don't fully understand, or we understand them but don't have computer power to described them in their whole beauty... Thus, we have to use approximations, empirical descriptions, analytical solutions, set limits on the validity of a given description, make assumptions, make corrections... For a code to be readable all these things have to be commented. Without comments, however good you program the code, this code will not be readable, and this can cause problems. Let me give you an example – Ralf (who is not in our collaborations any more) started to work on the LOS readout. When he left, I overtook it for him. At that time, it was not possible to treat multi-hit events. Hans and me tried to look at e.g. correlations between VFTX and TAMEX times, but it was not a full solution. And, as we didn't have a way to treat multi-hit events properly, we clearly stated that in the code via writing a comment. But then, at one of the code's “clean-up” some of the programming experts who though code should be self-explainable and comments are not needed, removed that comment. Consequence was that when another person started to look into LOS, he declared that the whole code was bad and wrong, as it doesn't treat the multi-hit event properly. If the comment has not been removed, the new user would know what he is dealing with, and his comment would be different. Quite honestly, I was not happy at all. In my opinion, every time we describe some process, make some assumption, use some limitations... each of these things has to be clearly stated in the code for people who use it to know why it is like it is. Will it make code longer? Yes, it will. But, it will help others understanding physics assumptions and limitations behind the code. Now, concerning solutions. Yes, analysis of many detectors doesn't fulfill the holy grail of the computing programming, but calling them bad is not nice toward the people who made those analysis, spent time and efforts to have a working code. They did their best to have a working analysis. They didn't say, I don't care about this detector, thus I will write bad stuff. In my opinion, first step in finding solution is having respect and understanding for people how are not experts in programming but are still developping the code. As you said, we need stable and reproducible code, but this code must firstly and foremost properly describe physics processes. Proper description of physics is more important than beauty of the code programming (you know, inner values above outside beauty ;-)) As I already said, I don't see it realistically possible that at the present stage we can have a code fulfilling all the ideas of programming beauty. Don't get me wrong, I don't say that we should ignore the rules of good programming. This is not my point. My point is that at the present stage we don't have neither time nor man power to do it. At the present stage we are in a surviving mode, trying to keep up with constant changes happening in the cave. Let me exaggerate, at the present stage, we are swimming in the ocean, doing our best to reach the shore in any way possible. Once we reach the shore (i.e. being in HEC, having detectors ready and fully functional) we can then start to build nice and strong cabin. But, before we reach the shore we cannot work on that cabin. I think that the idea (I think somebody brought it) to make a template directory for analysis of a “ghost” detector is good. Once we have time to work on the beatification of the code, such templates will be helpful to make better-coded analysis. I never learned C nor C++ but with use of already existing codes in the R3BRoot and google, I managed to do quite a lot of work in the R3BRoot (although, nothing of my work would satisfy your ideas of proper programming). And many of the codes I used as a basis were at that time considered as good programming, but now you don't consider them like that anymore. So, I would gladly, with a help of modern templates, clilng-tidy checking and google, one day (but not now) rewrite the analysis I am doing now. And I am sure many of us non-experts would not have anything against doing it one day, too, especially if we would have support of the experts, if we would do it in close collaboration with experts, and not only getting rules “thrown” at us. I think any of us working on detectors and analysis have proven that they are professionals eager to learn and develop new things. It is just that at the present moment our priotities what that profesionalism concerns are probably different than yours. |
Oh sweet summer child. Hardware has bugs just like software, only that they tend to be a lot harder to locate and fix. Just because your mainboard or phone (both produced in the millions) does not have any visible design bugs, that does not mean that the ASICs, PCBs and FPGA firmwares we use in the cave behave as well. As @akelic remarked, the experience of something which worked in the lab not working in the cave is all to common. And that is before we go to analogue electronics and noise reduction, which of course is also an issue for digital signal transmission. Without wading into the details too much, some more remarks (itemization unrelated to the above):
A lot of our framework is to me obviously unnecessarily painful. This makes me sad because it means that when dealing with R3BRoot/FairRoot, we have to waste a lot of time and lines on stuff which could be easy but somehow is not. Finally, I notice that @akelic has closed this issue. I am not sure I agree on this, but then again there were certainly decisions of the R3BRoot maintainers I disagreed more strongly with. I still find myself confused about who is responsible for which parts of this github site and hereby reiterate my request to make the github permissions visible to all. In any case, I think all the usual people participating in discussions are already @'ed in one of these three walls of text :-) Good night, |
@klenze Sorry, it was not my plan to close at all to close the issue! Obviously, I pressed a wrong button (this happens, when you try to type and your dog push on your hand as he wants some attention ;-)). Sorry again! |
@klenze What do you mean with:
Issues, PRs, comments, code... are open for all; @akelic closed it in a distraction and the release manager reopened. If you have any complain or comment on the github permissions, please be concrete. |
Yes, but this requires a good candidate and money, let's say 40k euros per year. Unfortunately, the collaboration rejected this idea some time ago. Personally, I see this option as very complicated because we don't have the money to hire new postdocs to help with the analysis, so hiring other people like electronic engineers, software developers, ... seems impossible.
Yes I agree, the theory is very clear and we know it well, but do you know how to do this when we don't have the manpower?
Please, do the corresponding PR with the improvements. Best |
Thanks everyone for the comments. There are some I agree and there are others I don't. But let's leave most of disagreements to PRs in the future. Here I want to mention the long-term solution suggested by @klenze:
This would be an optimal solution but I don't think this's realistic. If we have resource to pay someone for his expertise, it's more likely that our collaboration would put the money on DAQ or detector setup (some on-site jobs) because we should first guarantee we have the reliable data coming out of detectors. What I suggest is that we should put more responsibility of the software development on those PhD students working remotely. I asked how many PhD students work on-site in GSI and I heard "very few". So there must be a lot of students working remotely and manpower shouldn't be an issue in this case. Then someone would say they can't because they don't have expertise of programming in C++. Well, I would say they should have. I can't say the same for other people. But when I studied my master in University of Bonn, they provide experiment-oriented physics students a course specific for C++, which teaches the concepts such as heap, stack, class, polymorphism, friend and up to unit tests. It doesn't mean much but at least universities do expect master physics students to have certain level of programming skill when they finish the program. If my case is not representative, then the required skills could be learned along the way. I don't think it's very rare for a PhD student to learn lots of new things. Besides, you get your expertise from practise. It's wrong to think "I don't have this expertise so I can't write code for the simulation". You start writing a code. Then you think what could be a better way and you learn how to improve it. After that, you begin to think how to improve it further and further. In the end, you are a programming expert. Most of experts don't sit at a lecture room and listen how to program for one semester. They get their expertise from practice. And contributing to R3BRoot is a VERY good practise from my personal experience. |
@hapol, @akelic: ok, I apologize, I should have considered the possibility that the closing was in error.
I do not think that the incentive structures for PhD students is conductive to getting them to contribute good C++. Basically, the modal PhD student cares and should care about (a) doing their analysis, (b) writing their thesis, (c) keeping their supervisor happy by doing what they ask them to do, be it analysis software development, DAQ, slow control, extra teaching, setting up detectors, beam time shifts or feeding the department cat. (From what I hear from my fellow cave dwellers, we are not plagued by endless hoards of grad students eager to set up detectors, so I conclude that mostly it is cats.)
I think it is safe to say that if you take away @YanzhaoW and me, both of whom do R3BRoot code reviews as a kind of weird hobby, there is little in the way of useful feedback to be gained by doing a PR. If you already did some local modifications for your analysis, I concede that in the process of creating a PR, you will learn more about git and the github workflow. I will also say that reading our code base, like basically all ROOT5 style code, is not conductive and indeed actively harmful to learning modern C++. In fact, for the job of a junior dev with a modern C++ project I would strongly prefer someone who did their PhD using python (and thus must learn C++ from the scratch) to someone who during their PhD learned "C++" using ROOT5 style code (and probably considers c-style casts "weird stuff you have to do to stop the compiler/interpreter from complaining", likes to use raw pointers everywhere, knows about goto and preprocessor macros but does not know about templates, prefers C arrays and generally is willing to accept any amount of misery by a badly designed framework and class hierarchy -- all of which would have to be unlearned). That is not a value judgement on the ROOT5-taught programmer. The days of ROOT5 are not so long ago, and back then, the ROOT interpreter would have punished every attempt to use modernish features (such as templates). |
For R3BRoot, the responsability to merge PR corresponds to the release manager, according to the scheme decided within the analysis WG (with people from DAQ groups and others) few years ago. Dmytro Kresan did the work several years till he could not work anymore for R3B. Then, Jose Luis volunteer for the work, while all this time I was authorised to accept PRs as coordinator, but im only doing it in vacacional periods or if the release manager asks for some help. For the parameter containers, both could accept PRs. Additional help reviewing and accepting PRs is always welcome from anybody interesting in the work, provided we organize and follow common rules. There was no candidates till now, to my knowledge. |
I had some suggestions and discussion with Yanzhao during R3BWeek. I think it would make sense for Yanzhao to assess what can be quickly fixed with a sed command. Or even a bash script with a list of directories to make these changes to. This could fix ownership issues, <> instead of "", override destructors vs. virtual, etc. I think this would, in general, be a helpful tool to have for future changes. I think coding implementation should be done in baby steps. I would rather you assume I don't know something in regards to coding. I, like Aleksandra, learned coding via examples and google - especially the root forums and documentation. I, personally, liked the way Philipp approached the dynamic casts. He taught this concept in a few different meetings and reiterate this to us. He then went through and replaced every casting to dynamic (where appropriate). This was extremely helpful. It was nice to see the examples in the code. If I had to remember this dynamic something Philipp mentioned at some meeting some time ago, I would not make the effort to find it and implement it. If there are already multiple instances in the code with this implementation, I have a reference, which encourages me to use it. Lastly, I think there needs to be some more discussion of what is necessary vs what is a suggestion. I don't think it's important to flag a printf statement vs a cout statement. I think it's more important to point out 10 levels of nesting or if array indexing is not correct. |
Hi @ajedele
I don't think this is possible. Simply using a regex and string replace modifies the existing code base in an uncontrollable way and could cause a lot of unknown issues. And it's better not to change the working code of your detectors without knowing how they work (as is suggested by Aleksandra). My philosophy is still "if the old code base works well, it's better not to touch it". If the old code base has a problem, we should fix it little by little without crashing the whole program here and there. |
With PR #906 and PR #908, clang-tidy no longer performs static analysis on the upcoming PRs related to detector folders (except one warning about dynamic casting), unless the code submitter turns it on (currently only NeuLAND related folders). However, PRs related to all shared code in folder So I consider this issue is solved and closed. If further discussion is needed, we could reopen it later. |
Few months ago, my PR #825 was merged to the dev branch, which requires all newly added or edited code to comply with clang-tidy checking. Some people, especially those who are working on site with detectors, are very "dissatisfied" with such requirements as many of them have been already overburdened with DAQ or hardware setups and do not have time or energy to fix all the warnings. Many discussions have already been done in weekly meetings or here in Github Issues, but without any constructive solutions to make both sides of people satisfied.
During the collaboration meeting this week, I have discussed with those people, especially @ajedele, about what could be a better solution to solve this conflict. I fully understand their arguments that for some detectors such as ROLU, there are only one or two people who are available to do both the software side, such as simulation, calibration algorithm and the hardware side, such as DAQ and detector. I agree that it's "asking too much" for them to also comply with the code standard suggested by clang-tidy or clang-format.
But before I suggest a possible solution to our code maintainers (@hapol and @jose-luis-rs), I would like to point out (?again) why the code standard in the software matters and what kind of role clang tidy plays:
auto
when it needs to and the restriction on the cognitive level of a function are very useful to help mitigating such problems.void*
,goto
and other blasphemy for C++, the code design is clear and easy to understand. Functionalities are divided into many small functions with good names to indicate their purposes.So now the solution:
Since R3BRoot code base have been nicely separated according to different detectors, a bad code quality for one detector doesn't have much impact for the people working on other detectors and any potential risks of crashing can be contained. I would suggest to our code maintainers that we could disable some or all clang-tidy checking for all the folders of the detector if there isn't anyone who can dedicate on its software development. The common files, such as
R3BUcesbReader.cxx
inr3bbase
, should keep following the default standard because it's very unlikely that the people working in the Cave needs to change something in the reader base class.If we adopt this policy, it's also very likely that, out of convenience, many detector groups decide to turn all clang-tidy checking off in their folders. I don't have any problem if, let's say, Califa people decide to just disable the magic number warning because they think it's fun to have some magic numbers in their code base. But I really hope the people who think this static analyzer checking is just an inconvenience and want to disable all checking, with all the arguments above, to rethink whether it's really an "inconvenience" in the long term.
An example of disabling all checking for ROLU has been made in PR #906. But as I've suggested above, this is just a short term solution. The long term solution, in my opinion, would be that some data analysis people (maybe first or second year PHD students), who work remotely and can't be on site in the Cave very regularly, start to learn some programming and take the responsibility on the software development of some detectors (simulation and calibration algorithm). I mean data analysis requires programming anyway. Would it be reasonable to learn how to do it in a little bit "professional" way?
As always, any feedbacks or comments are welcome. @klenze @akelic @inkdot7
The text was updated successfully, but these errors were encountered: