WeeklyTelcon_20160426
Geoff Paulsen edited this page Apr 26, 2016 · 8 revisions
- Dialup Info: (Do not post to public mailing list or public wiki)
- Jeff Squyres
- Todd Kordenbrock
- Sylvain Jeaugey
- Ralph
- Nysal
- Nathan Hjelm
- Joshua Ladd
- Howard
- Geoff Paulsen
- Milestones: https://github.com/open-mpi/ompi-release/milestones/v1.10.3
- PR 1097 - for 1.10 may be moot.
- PSM2 issue, short version: the PSM2 API uses a fixed UUID, so all jobs across the cluster use the same UUID (bad).
- Jeff will check the 1.10.3 library versions. Ralph already updated them for 1.10.3, but Jeff will double-check.
- 1.10 hangs if it doesn't get enough slots. Ralph will look at it.
- Wiki: https://github.com/open-mpi/ompi/wiki/Releasev20
- Blocker Issues: https://github.com/open-mpi/ompi/issues?utf8=%E2%9C%93&q=is%3Aopen+milestone%3Av2.0.0+label%3Ablocker
- Memhooks went in yesterday (last blocker)
- Jeff will make a 2.0 RC tomorrow, April 27.
- Doing the library version checking is easy (slam them all to 2.0).
- Some questions remain as to which memhook function we're using with which protocol.
- If leave-pinned=0 && leave-pipelined=0, the memhook component will disable itself.
- Nathan will also add this change to the mpool rewrite on master.
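The disable rule noted above can be written as a one-line predicate. This is only an illustrative sketch in Python, not the Open MPI C code: the function name `memhook_should_enable` is hypothetical, and the two arguments simply mirror the `leave-pinned` and `leave-pipelined` settings as written in these notes.

```python
def memhook_should_enable(leave_pinned: int, leave_pipelined: int) -> bool:
    """Return False only when both settings are 0, per the rule in the notes.

    Hypothetical helper for illustration; the real check lives in the
    memhook component itself.
    """
    return not (leave_pinned == 0 and leave_pipelined == 0)
```

Under this reading, enabling either setting is enough to keep the memhook component active.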
- Milestones: https://github.com/open-mpi/ompi-release/milestones/v2.0.0
- Review Master MTT testing (https://mtt.open-mpi.org/)
- Widespread mpool / rcache failures on usNIC last night.
- Ralph is seeing a bunch of attribute failures on 1.10.
- Jeff is passing in BTL parameters that limit him to a shared memory component, but the job is going across nodes. So the attribute tests report failures, because some of the processes can't communicate.
- Whatever happened to "better, faster"?
- Ralph needed an interface to submit results. That's there today. They can send a JSON structured packet, and it will submit correctly.
- The other side (web side) still has a few values that would need to change.
- Submitter side is there.
- Reporter side would need some work.
- Josh had a student write something here in JavaScript. The long-term plan would be to port that work to a framework.
- Perl client uses REST storage.
- Should Jeff change his client to use new interface?
- Josh would recommend waiting until Ralph gets his changes in.
- Right now we don't yet do cross-product expansion. This would be good for Howard's intern to look at.
- This is the week for Howard / Josh / Ralph to sync up on this work.
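For whoever picks up the cross-product expansion work, the idea can be sketched as follows: every multi-valued option is combined with every value of the other options to yield one concrete test configuration per combination. This is a minimal Python sketch, not the MTT client code; `expand_cross_product` and the option names in the usage note are hypothetical.

```python
from itertools import product


def expand_cross_product(params):
    """Expand {option: [values, ...]} into one dict per value combination.

    Hypothetical helper for illustration only. Option order follows the
    insertion order of the input dict.
    """
    names = list(params)
    value_lists = [params[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]
```

For example, `expand_cross_product({"btl": ["tcp", "vader"], "np": [2, 4]})` yields four configurations, one for each (btl, np) pair.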
- Jeff would like to be able to filter out certain failure signatures to see other bugs. This is probably not even in the new reporter.
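The filtering Jeff describes could look roughly like this: drop every result whose output matches a known failure signature, so the remaining failures stand out. This is a sketch under assumptions, not a reporter feature; the function name, the `output` field, and the record layout are all hypothetical.

```python
import re


def filter_known_failures(results, known_signatures):
    """Return the results whose output matches none of the known signatures.

    Hypothetical helper: `results` is a list of dicts with an "output"
    string, and `known_signatures` is a list of regular expressions.
    """
    patterns = [re.compile(sig) for sig in known_signatures]
    return [
        r for r in results
        if not any(p.search(r["output"]) for p in patterns)
    ]
```

A signature list could, for instance, hide a known attribute failure while leaving unrelated failures visible.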
- On Master, you don't need to do ENABLE_THREAD_MULTIPLE anymore.
- Mellanox
- Sandia
- Intel
- Mellanox, Sandia, Intel
- LANL, Houston, IBM
- Cisco, ORNL, UTK, NVIDIA