[Mpi-forum] MPI user survey

Mon Nov 16 09:10:54 CST 2009

On Nov 16, 2009, at 5:15 AM, Richard Treumann wrote:

> I did not see this until Monday morning but here are some late  
> comments.
>

Thanks for all the time and effort, Dick!  Making a good/useful survey  
is a darn hard process...  (I think we've easily consumed at least  
20-30 person-hours on this already!)

> How do you plan to give a reasonable range of stake holders a chance  
> to understand the questions and provide their answers?
>

Are you saying that the questions are difficult to understand?  I'm  
not sure what your indirect assertion is here...?

> ISVs or "closed source product" providers should be queried. They  
> need to deal with their customers who are delivered an MPI 3
>
> Buyers of closed source products are stake holders in this issue. If  
> I am using an MPI based product in my business, how do I feel about  
> needing to buy and validate new software when my MPI provider says  
> his new release breaks my production software?
>
> Providers of MPI implementations may want to think about the  
> implications of an influential customer saying "Ship MPI 3 if you  
> want to but to keep my business you must also keep shipping and  
> supporting the MPI 2.2 that my apps still use."
>

Distilling down your text, are you asking how we plan to get as many  
respondents from as many different categories as possible?

If so, the plan is loosely defined as follows: advertise it heavily at  
the BOF this week.  Further, advertise it as heavily as possible via  
public venues (e.g., open source mailing lists, etc.).  Finally,  
advertise it as heavily as possible with individual contacts (e.g.,  
MPI implementor organizations who have contacts with ISVs and other  
closed source application developers, customers, etc.).

> Typical number of tasks in an MPI job may not be very informative.  
> The question should (also?) try to get at:
>
> 1) What is the largest job you tend to run today and consider important  
> (vs something run but not critical to the mission / business)
> 2) What number of tasks do you anticipate wanting to run (and  
> consider important) within the next few years?
>
> I do not see anything clearly addressing the question of "How many  
> cores, CPUs, PEs (whatever term) would you hope/expect to be able to  
> apply to your largest problems?"
>

The rationale for these questions was mainly demographic data  
gathering that can be used to help qualify/quantify further answers  
(e.g., "among respondents who said they run apps at 2049+ MPI procs,  
only 3% use one-sided communications").

Do you have suggestions for better wording of those questions, or  
replacement questions?

> Much of the hybrid discussion is about solving problems with so many  
> PEs that MPI cannot scale to enough tasks. It seems likely to me  
> that someone may want to use a million processing elements and that  
> MPI will never scale well to a million tasks.
>
> I assume the question with the check boxes is asking about "error  
> handlers". It refers to "error handles"
> http://mpi-forum.questionpro.com/akira/TakeSurvey;jsessionid=dcaTO_iOlMbIqcQeW98ts
>
> I do not know what this question means.
>

I'm not quite sure how to parse the above (the link doesn't take me to  
a unique question):

1. Are you referring to the "How much are each of the following sets  
of MPI functionality used in your MPI applications?" question?
2. Are you saying that the phrase should be "Error handlers / error  
checking" (instead of "Error handles")?
3. Are you saying that it is unclear what this specific item means?

> MPI one-sided communication performance is more important to me than  
> supporting a rich remote memory access (RMA) feature set.
>
> Having been involved in several email discussions and knowing what  
> is being pressed, I can guess the desired answer. We cannot  
> legitimately make decisions based on such an ambiguous question:  
> What expectations of typical MPI communication are included in  
> "rich"? Is using any communicator except MPI_COMM_WORLD part of  
> "rich"? Is getting a non-MPI_SUCCESS return code for an error part  
> of rich?
>

Keith and I went round and round on this particular question in an  
off-list email thread.  His final suggestion, which I think is a good  
one, is to reword the question thusly:

"MPI one-sided communication performance (e.g., message rate and  
latency) is more important to me than supporting a rich remote memory  
access (RMA) feature set (e.g., communicators, datatypes)."

What do you think?

> "MPI application control of fault tolerance" -- sounds too easy. Who  
> would say no? I would be happy to have MPI_Tolerate_faults( yes | no)
>

We agonized over this one as well.  The current text is an attempt to  
imply that MPI won't magically handle faults for you, but rather give  
control of faults to the application.  The implication is that the  
application will have to *do* something rather than have MPI just  
handle faults "better" (to which, I agree, everyone will just say, "yes,  
gimme that!").

But I can definitely see how a quick read of that would lead to the  
interpretation you describe.

Can you suggest better wording?

> It would be helpful to be able to read the survey questions before  
> starting to answer. I personally dislike answering a survey question  
> without knowing what all the questions are. I must try to pick the  
> best answer even though the question does not really fit my concern  
> or I must ignore my concern on the assumption a better targeted  
> question must be on the survey. (I cannot count the number of times  
> I have started to answer a telephone survey only to say in the  
> middle "We're done" because I have deduced the survey questions will  
> not really help capture my views.)
>

Josh and I talked about this quite a bit while putting the questions  
in the survey web thingy.  We balanced your concerns against splitting  
up the survey into seemingly-smaller chunks so that we don't  
intimidate respondents into abandoning the survey in the middle.  In  
the end, we struck a compromise:

- split the survey up into multiple pages so that it looks/feels  
shorter (vs. one, massively-long page)
- don't use data validation rules
- provide navigation links in the survey (forward and back)

The latter two points allow you to click click click through the entire  
survey, reading everything without entering any answers.  Then, when  
you've read everything, you can navigate back (or just restart) and  
actually take the survey.

That was the best compromise we could think of.  Do you think we  
should have one big, massively-long page with all the questions?

-- 
Jeff Squyres
jsquyres at cisco.com