[Mpi-forum] MPI - write to same binary file. Each process shows huge difference in time intervals to complete
Catherine Jenifer Rajam Rajendran
catrajen at iu.edu
Tue Jul 10 13:25:34 CDT 2018
Thank you so much for the update!
I already posted on Stack Overflow - no help yet! I will check with our
Cray machine support staff and I will shoot an email to MPICH too! Really,
thanks for the response!
On Tue, Jul 10, 2018 at 2:21 PM, Jeff Hammond <jeff.science at gmail.com> wrote:
> Hi Catherine,
> It sounds like this is an implementation issue. This email list is for
> discussion of the MPI standard itself. While many of us have a great deal
> of implementation expertise, you are likely to get the best response from
> the user list associated with the implementation you are using:
> MPICH <discuss at mpich.org>
> Open-MPI <users at lists.open-mpi.org>
> MVAPICH2 <mvapich-discuss at cse.ohio-state.edu>
> If you are using Cray MPI, you'll need to contact Cray support, perhaps
> via the staff that support your Cray machine locally. For Intel MPI, start
> with https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology/.
> I don't know about SGI or NEC support, unfortunately. You may also have
> good luck with StackOverflow - there are quite a few MPI experts there.
> I'll note that most of the implementations of MPI I/O are based on ROMIO,
> which is part of MPICH, so you might want to start with the MPICH user list.
> On Tue, Jul 10, 2018 at 11:01 AM, Catherine Jenifer Rajam Rajendran <
> catrajen at iu.edu> wrote:
>> Hi All,
>> I am trying to write to the same binary file from several MPI processes. I
>> set the offset for each process at the beginning according to its rank. Then
>> the following code snippet in C runs. Every MPI process executes, computes
>> its value, and writes it to exactly the offset that was set.
>> The problem I am facing is that, out of 32 processes, one process finishes
>> in 2 hours while the rest keep running for more than 24 hours. The values
>> are computed as expected, but it takes far too much time. It looks like a
>> deadlock situation, with each process waiting for some resource. But I am
>> not sharing or communicating anything between the processes. I am just
>> using MPI_File_write_at to write at a specific location in the binary file.
>> I need to mention that each process computes a huge amount of data, so
>> storing it temporarily seemed inappropriate. I want to write the output to
>> a single file because the number of processes increases depending on the
>> input data. The computations are evenly distributed across all processes.
>> So why do the processes take such different amounts of time to finish?
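>> if (my_rank == step % num_cpus) {
>>     Calc = Calculation();
>>     buf = (double)Calc;
>>     /* Pass the address of buf. The last argument was cut off in the
>>        archived message; MPI_STATUS_IGNORE is assumed here. */
>>     MPI_File_write_at(outFile, OUT_ofst, &buf, 1, MPI_DOUBLE,
>>                       MPI_STATUS_IGNORE);
>>     Calc = 0.0;
>>     /* Advance to this rank's next slot: one double per rank per round. */
>>     OUT_ofst += num_cpus * sizeof(double);
>> }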
>> I am new to MPI and I guess people must have had similar issues while
>> executing in MPI. Can anyone help me out please! I can provide more details
>> if needed.
>> mpi-forum mailing list
>> mpi-forum at lists.mpi-forum.org
> Jeff Hammond
> jeff.science at gmail.com
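The write pattern described in the original question (steps handed out round-robin by `my_rank == step % num_cpus`, one double written per step, and the offset advanced by `num_cpus * sizeof(double)`) can be sketched as plain offset arithmetic, with no MPI calls. This is only an illustration under stated assumptions: the helper names `owner_of_step` and `offset_of_write` are hypothetical and do not come from the thread, and the layout assumes step `s` is stored at byte `s * sizeof(double)`, which the question implies but does not state outright.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the rank that owns a given step under the
   round-robin rule `my_rank == step % num_cpus` from the question. */
static int owner_of_step(int64_t step, int num_cpus)
{
    assert(num_cpus > 0 && step >= 0);
    return (int)(step % num_cpus);
}

/* Hypothetical helper: byte offset of the k-th value (k = 0, 1, ...)
   written by `my_rank`, assuming one double per step and that step s
   lives at byte s * sizeof(double) in the shared file. */
static int64_t offset_of_write(int my_rank, int num_cpus, int64_t k)
{
    assert(num_cpus > 0 && my_rank >= 0 && my_rank < num_cpus && k >= 0);
    int64_t step = (int64_t)my_rank + k * (int64_t)num_cpus;
    return step * (int64_t)sizeof(double);
}
```

With 32 ranks, for example, consecutive writes by the same rank are `32 * sizeof(double)` bytes apart, so every rank issues many small, widely strided writes into the same file. Whether that pattern explains the slowdown reported here depends on the MPI implementation and the file system, which is why the implementation's user list is the right place to ask.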