[mpiwg-sessions] will be on a plane today - and some observations

Ralph H Castain rhc at open-mpi.org
Mon Aug 20 10:24:33 CDT 2018


Passing a port on the command line for accept/connect was never implemented, as I don’t think anyone really cared. Given how OMPI uses PMIx for that operation, it shouldn’t be all that difficult to do.

As noted in the referenced issue, there was a problem last year with cross-mpirun connections. Not sure when I’ll have time to look at it.

Canceling the meeting today is fine with me - I got pulled away and didn’t get the PMIx Groups implementation done (sigh).


> On Aug 20, 2018, at 8:10 AM, HOLMES Daniel <d.holmes at epcc.ed.ac.uk> wrote:
> 
> Hi Howard,
> 
> Thanks for the update. Sounds promising.
> 
> I'm trying to fix the test3.zip example from:
> https://github.com/open-mpi/ompi/issues/3458#issuecomment-322951227
> 
> If successful, this would extend the testing opportunities for the sandbox code to situations that involve more than one mpirun. The issue is definitely some sort of deadlock in PMIx but I’ve not figured it out completely yet.
> 
> I’m cancelling the meeting today, unless anyone objects in the next 50 minutes.
> 
> Cheers,
> Dan.
> Dr Daniel Holmes PhD
> Applications Consultant in HPC Research
> d.holmes at epcc.ed.ac.uk
> Phone: +44 (0) 131 651 3465
> Mobile: +44 (0) 7940 524 088
> Address: Room 3415, JCMB, The King’s Buildings, Edinburgh, EH9 3FD
> The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
> 
> On 20 Aug 2018, at 15:55, Pritchard Jr., Howard <howardp at lanl.gov> wrote:
>> 
>> Hi Folks,
>> 
>> I’ll be on a plane at 11 AM MDT today so will not be able to call in.
>> 
>> I tried running the tests Dan had added/modified and observed
>> what he did: one can’t allow more than one outstanding
>> accept/connect at a time, or Open MPI’s ORTE gets confused.
>> I reduced this to a simpler test which hangs with only 3 ranks
>> and am narrowing down what the issue is.
>> 
>> I’ll be opening a PR with changes to chapter 8 of the standard and
>> replacement for MPI_Get_Set_Names later this week.
>> 
>> Howard
>> 
>> -- 
>> Howard Pritchard
>> B Schedule
>> HPC-ENV
>> Office 9, 2nd floor Research Park
>> TA-03, Building 4200, Room 203
>> Los Alamos National Laboratory
>> 
>> _______________________________________________
>> mpiwg-sessions mailing list
>> mpiwg-sessions at lists.mpi-forum.org
>> https://lists.mpi-forum.org/mailman/listinfo/mpiwg-sessions
