[Mpi-forum] MPI - write to same binary file. Each process shows huge difference in time intervals to complete
Jeff Hammond
jeff.science at gmail.com
Tue Jul 10 13:21:42 CDT 2018
Hi Catherine,
It sounds like this is an implementation issue. This email list is for
discussion of the MPI standard itself. While many of us have a great deal
of implementation expertise, you are likely to get the best response from
the user list associated with the implementation you are using:
MPICH <discuss at mpich.org>
Open-MPI <users at lists.open-mpi.org>
MVAPICH2 <mvapich-discuss at cse.ohio-state.edu>
If you are using Cray MPI, you'll need to contact Cray support, perhaps via
the staff that support your Cray machine locally. For Intel MPI, start
with
https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology/.
I don't know about SGI or NEC support, unfortunately. You may also have
good luck with StackOverflow - there are quite a few MPI experts there.
I'll note that most of the implementations of MPI I/O are based on ROMIO,
which is part of MPICH, so you might want to start with the MPICH user list.
Best,
Jeff
On Tue, Jul 10, 2018 at 11:01 AM, Catherine Jenifer Rajam Rajendran <
catrajen at iu.edu> wrote:
> Hi All,
>
> I am trying to write to the same binary file using MPI. I set the offset
> for each process at the beginning according to its rank. Then the following
> code snippet in C runs. Every MPI process executes, computes its value, and
> writes it to the exact offset that was set.
>
> The problem I am facing is that, out of 32 processes, one process
> finishes in 2 hours while the rest keep running for more than 24
> hours. They compute the values as expected, but it takes so
> much time. It seems like a deadlock situation, with each process waiting
> for some resource. But I am not sharing or communicating between the
> processes. I am just using MPI_File_write_at to write at a specific
> location in the binary file.
>
> I need to mention that each process computes a huge amount of data, so
> storing it temporarily seemed inappropriate. I want to write the output to
> a single file because the number of processes is increased depending on the
> input data. The computations are evenly distributed across all processes.
> So why does each process take a different amount of time to finish its job?
>
> for (i = 1; i <= limit; i++)
> {
>     for (j = i + 1; j <= limit; j++)
>     {
>         /* round-robin: each step is handled by exactly one rank */
>         if (my_rank == step % num_cpus)
>         {
>             Calc = Calculation();
>             buf[0] = (double)Calc;
>             MPI_File_write_at(outFile, OUT_ofst, buf, 1, MPI_DOUBLE,
>                               &status);
>             Calc = 0.0;
>             /* advance to this rank's next slot in the file */
>             OUT_ofst += num_cpus * sizeof(double);
>             count++;
>         }
>         step++;
>     }
> }
>
> I am new to MPI and I guess people must have had similar issues while
> executing in MPI. Can anyone help me out please! I can provide more details
> if needed.
>
> Thanks,
> Catherine
>
>
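The offset scheme described in the quoted message can be sketched in plain C, without MPI, to check that the ranks never write to overlapping locations. This is only an illustration under stated assumptions: `step` is assumed to start at 0, each rank's initial offset is assumed to be `rank * sizeof(double)`, and the helper names `owner_of_step`, `byte_offset`, and `total_steps` are hypothetical and do not appear in the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rank that handles a given step under round-robin distribution,
 * mirroring the test `my_rank == step % num_cpus` in the quoted loop. */
static int owner_of_step(int64_t step, int num_cpus)
{
    assert(num_cpus > 0 && step >= 0);
    return (int)(step % num_cpus);
}

/* Byte offset of the k-th value (k = 0, 1, 2, ...) written by `rank`.
 * Assumption: the file is a flat array of doubles indexed by step, and
 * the rank starts at offset rank * sizeof(double). */
static int64_t byte_offset(int rank, int64_t k, int num_cpus)
{
    assert(num_cpus > 0 && rank >= 0 && rank < num_cpus && k >= 0);
    int64_t step = (int64_t)rank + k * (int64_t)num_cpus;
    return step * (int64_t)sizeof(double);
}

/* Number of (i, j) pairs with 1 <= i < j <= limit, i.e. the total
 * number of steps the nested loop performs. */
static int64_t total_steps(int64_t limit)
{
    assert(limit >= 1);
    return limit * (limit - 1) / 2;
}
```

For example, with 32 ranks, rank 3 would write its first value at `byte_offset(3, 0, 32)` and its next one `32 * sizeof(double)` bytes later. Under these assumptions every step maps to a distinct offset, so the writes themselves do not overlap.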
--
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/