Persistent Collective Operations in MPI-3 for free!

dandelion
source: thewishwall.org

We discussed persistent collectives at the MPI Forum last week. It was a great meeting and the discussions were very insightful. I really like persistent collectives and believe that MPI implementors should support them!

In that context, I wanted to note that implementors can do this easily and elegantly in MPI-3 without any changes to the standard. We used this technique already in 2012 in the paper “Optimization Principles for Collective Neighborhood Communications”. But let me recap the idea here.

The key ingredients are communicators (MPI’s name for immutable process groups) and Info objects. Info objects are a mechanism for users to tell the library how they will use MPI. They are very similar to pragmas in C/C++. Some info strings are defined by the standard itself, but MPI libraries may add arbitrary strings of their own.

So one way to specify a persistent collective is to duplicate the communicator under a new name, e.g., my_persistent_comm. On this communicator, the user can set an info object that makes specific operations persistent, e.g., mympi_bcast_is_persistent. The MPI library is encouraged to choose a prefix specific to itself (in this case “mympi”).

The library can now set a flag on the communicator and check it at every broadcast call to see whether the call is persistent. By passing this info object, the user guarantees that the function arguments passed to the specific call (e.g., bcast) on this communicator will always be the same. Thus, the MPI library can specialize the call to the arguments (i.e., implement all optimizations possible for persistence) once it has seen the first invocation of MPI_Ibcast().
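To make the library-side logic concrete, here is a minimal toy model in Python. It is not MPI code and not how any real MPI library is implemented: `ToyComm`, `dup_with_info`, `ibcast`, and the "schedule" tuple are all hypothetical names invented for this sketch. Only the info key `mympi_bcast_is_persistent` comes from the text. In real MPI-3 code the info object would presumably be attached with `MPI_Comm_dup_with_info` or `MPI_Comm_set_info`.

```python
class ToyComm:
    """Hypothetical stand-in for an MPI communicator (not a real MPI binding)."""

    PERSIST_KEY = "mympi_bcast_is_persistent"  # info key named in the text

    def __init__(self, info=None):
        self.info = dict(info or {})
        # The "flag on the communicator" that every broadcast call checks.
        self.persistent_bcast = self.info.get(self.PERSIST_KEY) == "true"
        self._cached_schedule = None
        self.schedules_built = 0  # counts how often the expensive setup ran

    def dup_with_info(self, info):
        """Duplicate the communicator and attach new info hints."""
        return ToyComm(info)

    def _build_schedule(self, count, root):
        # Placeholder for the argument-specific optimization work.
        self.schedules_built += 1
        return ("bcast-schedule", count, root)

    def ibcast(self, count, root):
        # The user promised identical arguments, so reuse the first schedule.
        if self.persistent_bcast and self._cached_schedule is not None:
            return self._cached_schedule
        schedule = self._build_schedule(count, root)
        if self.persistent_bcast:
            self._cached_schedule = schedule
        return schedule


world = ToyComm()
my_persistent_comm = world.dup_with_info({ToyComm.PERSIST_KEY: "true"})
for _ in range(3):
    world.ibcast(count=1024, root=0)
    my_persistent_comm.ibcast(count=1024, root=0)
print(world.schedules_built, my_persistent_comm.schedules_built)  # prints: 3 1
```

The point of the sketch is only that the expensive setup runs once on the communicator carrying the hint, while the plain communicator repeats it on every call.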

This interface is very flexible. One could even imagine various levels of persistence as defined in our 2012 paper: (1) persistent topology (this is implicit in normal and neighborhood collectives), (2) persistent message sizes, and (3) persistent buffer (sizes and addresses). We describe in the paper optimizations for each level. These levels should be considered in any MPI specification effort.

I agree that having some official support for persistence in the standard would be great, but these levels and info arguments should at least be discussed as an alternative. It seems like big parts of the MPI Forum are not aware of this idea (this is part of why I am writing this post ;-) ).

Furthermore, I am mildly concerned about feature-inflation in MPI. Adding more and more features that are not optimized because they are not used, because they have not been optimized, because they were not used … may not be the best strategy. Today’s MPIs are not great at asynchronous progression of nonblocking collectives, and the performance of neighborhood collectives and MPI-3 RMA is mostly unconvincing. Maybe the community needs some time to optimize and use those features. At the 25 years of MPI symposium, it became clear that big parts of the community share a similar concern.

Keep the great discussions up!

SPCL Activities at SC16

After the stress of SC16 is finally over, let me summarize SPCL’s activities at the conference.

In a nutshell, we participated in two tutorials and two panels and co-organized the H2RC workshop; I gave three invited talks, and my students and collaborators presented our four papers in the SC papers program. Not to mention the dozens of meetings :-) . Some chronological impressions are below:

1) Tutorial “Insightful Automatic Performance Modeling” with A. Calotoiu, F. Wolf, M. Schulz


2) Panel at Sixth Workshop on Irregular Applications: Architectures and Algorithms (IA^3)

I was part of a panel discussion on irregular vs. regular structures for graph computations.


The opening


Discussions :-)



Audience

3) Tutorial “Advanced MPI” with B. Gropp, R. Thakur, P. Balaji

I co-presented the long-running, successful tutorial on advanced MPI.


The section on collectives and topologies

4) Second International Workshop on Heterogeneous Computing with Reconfigurable Logic (H2RC) with Michaela Blott, Jason Bakos, Michael Lysaght

We organized the FPGA workshop for the second time. It was a big success; people were standing in the back of the room. We even convinced database folks (here, my colleague Gustavo Alonso) to attend SC for the first time!


Gustavo’s opening


Full house

5) Invited talk at LLVM-HPC workshop organized by Hal Finkel

I gave a talk about Polly-ACC (Tobias Grosser’s work) at the workshop, quite interesting feedback!


Nice audience


Great feedback

6) Panel at LLVM-HPC workshop

Later, we had a nice panel about what to improve in LLVM to deal with new languages and/or accelerators.

7) SIGHPC annual members’ meeting

As an elected member at large, I attended the annual members’ meeting at SC16.

8) Collaborator Jens Domke from Dresden presented our first paper “Scheduling-Aware Routing for Supercomputers”


Huge room, nicely filled.

9) Booth Talk at Tokyo Institute of Technology booth

Was an interesting experience :-) . At first, I talked to two people; towards the end, there was a crowd. Even though most people missed the beginning, I got very nice questions.

10) Collaborator Bill Tang presented our paper “Extreme Scale Plasma Turbulence Simulations on Top Supercomputers Worldwide”

11) SPCL student Tobias Gysi presented our paper “dCUDA: Hardware Supported Overlap of Computation and Communication”

12) Collaborator Maxime Martinasso presented our paper “A PCIe Congestion-Aware Performance Model for Densely Populated Accelerator Servers”

But as usual, it’s always the informal, sometimes even secret, meetings that make up the SC experience. The two SPCL students Greg and Tobias did a great job learning and representing SPCL while I was running around between meetings. I am so glad I didn’t have to present any papers this year (i.e., that I could rely on my collaborators and students :-) ). Yet, it’s a bit worrying that my level of busyness (measured by the number of parallel meetings and overbooked calendar slots) is getting worse each year. Oh well :-) .

Keynote at HPC China and Public lecture at ETH on Scientific Performance Engineering in HPC

In the last two weeks I gave two presentations on scientific performance engineering, a theme that describes best what we do at my lab (SPCL) at ETH. The first lecture was a keynote at HPC China, the largest conference on High-Performance Computing in Asia (and probably the second largest world-wide). I have to say that this was definitely the best conference that I attended this year due to several reasons :-) .


Here is an impression from the impressive conference.

Shortly after that, I presented a similar talk at my home university ETH Zurich as the last step in a long process ;-) . It was great as well — the room was packed (capacity ~250) and people who came late even complained that there were not enough seats — well, their fault, there were some in the front :-) .

Here are some impressions from this important talk:


My department head Prof. Emo Welzl introducing the talk with some personal connections and overlapping interests


Some were even paying attention!


One of the larger lecture rooms in ETH’s main building

In case you missed it, I gave a longer version of the same talk at Cluster 2016 in Taipei (more content for free!).

SPCL barbecue version 3 (beach edition)

The next iteration in our celebration of SPCL successes since January was completed successfully! This time (based on popular demand) with a beach component where students could swim, fight, and be bitten by interesting aquatic creatures.

We celebrated our successes at HPDC, ICS, HOTI, and SC16!


Even with some action — boats speeding by rather closely ;-) .

Later, we moved to a barbecue place a bit up a hill to get some real meat :-) .


We first had to conquer the place, but eventually we succeeded (maybe send somebody earlier next time to occupy it and start a fire).


We had 7kg of Swiss cow this time!


And a much more professional fireplace!


Of course, some studying was also involved in the woods — wouldn’t be SPCL otherwise.


Including the weirdest (e.g., “hanging”) competitions.


We were around 20 people and consumed (listed here for the next planning iteration):

  • 6 x 1.5l water – 1l left at the end
  • 18×0.33l beer, 6×0.5l beer – all gone (much already at the beach)
  • 7l wine
  • 2l vodka
  • 7kg cow meat (4.5kg steaks, 2kg cevapcici, 0.5kg sausage)
  • 2 large Turkish-style breads (too quickly gone)
  • 1 quiche (too quickly gone)
  • 12 American cookies + 16 scones (both home-made)
  • 3/4 large watermelon
  • 0.5kg dates
  • 2kg grapes, 3kg peaches, 5 cucumbers
  • 0.5kg grill pepper, 1kg mushrooms

What are the real differences between RDMA, InfiniBand, RMA, and PGAS?

I often get the question how the concepts of Remote Direct Memory Access (RDMA), InfiniBand, Remote Memory Access (RMA), and Partitioned Global Address Space (PGAS) relate to each other. In fact, I see a lot of confusion in papers of some communities which discovered these concepts recently. So let me present my personal understanding here; of course open for discussions! So let’s start in reverse order :-) .

PGAS is a concept relating to programming large distributed memory machines with a shared memory abstraction that distinguishes between local (cheap) and remote (expensive) memory accesses. PGAS is usually used in the context of PGAS languages such as Co-Array Fortran (CAF) or Unified Parallel C (UPC) where language extensions (typically distributed arrays) allow the user to specify local and remote accesses. In most PGAS languages, remote data can be used like local data, for example, one can assign a remote value to a local stack variable (which may reside in a register); the compiler will generate the needed code to implement the assignment. A PGAS language can be compiled seamlessly to target a global load/store system.

RMA is very similar to PGAS in that it is a shared memory abstraction that distinguishes between local and remote memory accesses. RMA is often used in the context of the Message Passing Interface standard (even though it does not deal with passing messages ;-) ). So why not just call it PGAS? Well, there are some subtle differences to PGAS: MPI RMA is a library interface for moving data between local and remote memories. For example, it cannot move data into registers directly and may be subject to additional overheads on a global load/store machine. It is designed to be a slim and portable layer on top of lower-level data-movement APIs such as OFED, uGNI, or DMAPP. One main strength is that it integrates well with the remainder of MPI. In the MPI context, RMA is also known as one-sided communication.

So where does RDMA now come in? Well, confusingly, it is equally close to both PGAS and its Hamming-distance-one name sibling RMA. RDMA is a mechanism to directly access data in remote memories across an interconnection network. It is, as such, very similar to machine-local DMA (Direct Memory Access), so the D is very significant! It means that memory is accessed without involving the CPU or Operating System (OS) at the destination node, just like DMA. It is as such different from global load/store machines where CPUs perform direct accesses. Similarly to DMA, the OS controls protection and setup in the control path but then removes itself from the fast data path. RDMA always comes with OS bypass (at the data plane) and thus is currently the fastest and lowest-overhead mechanism to communicate data across a network. RDMA is more powerful than RMA/PGAS/one-sided: many RDMA networks such as InfiniBand provide a two-sided message passing interface as well and accelerate transmissions with RDMA techniques (direct data transfer from source to remote destination buffer). So RDMA and RMA/PGAS do not include each other!

What does this now mean for programmers and end-users? Both RMA and PGAS are programming interfaces for end-users and offer several higher-level constructs such as remote read, write, accumulates, or locks. RDMA is often used to implement these mechanisms and usually offers a slimmer interface such as remote read, write, or atomics. RDMA is usually processed in hardware and RMA/PGAS usually try to use RDMA as efficiently as possible to implement their functions. RDMA programming interfaces are often not designed to be used by end-users directly and are thus often less documented.

InfiniBand is just a specific network architecture offering RDMA. It wasn’t the first architecture offering RDMA and will probably not be the last one. Many others exist such as Cray’s RDMA implementation in Gemini or Aries endpoints. You may now wonder what RoCE (RDMA over Converged Ethernet) is. It’s simply an RDMA implementation over (lossless data center) Ethernet which is somewhat competing with InfiniBand as a wire-protocol while using the same verbs interface as API.

More precise definitions can be found in Remote Memory Access Programming in MPI-3 and Fault Tolerance for Remote Memory Access Programming Models. I discussed some ideas for future Active RDMA systems in Active RDMA – new tricks for an old dog.

How many measurements do you need to report a performance number?

The following figure from the paper “Scientific Benchmarking of Parallel Computing Systems” shows the completion times for multiple identical runs of a tuned version of the high-performance Linpack (HPL) on the same system. It illustrates how important correct measurements are. Here, one may report 77.4 Tflop/s but when repeating the benchmark see as little as 61.2 Tflop/s. It suggests that one should use sound statistics when reporting any performance result.

Computer science is often about measuring computer systems. Be it time, energy, or performance, all these metrics are often non-deterministic in real computer systems and a single measurement may or may not provide a reliable result. So if you are not sloppy when measuring your system, you will measure several executions and report an aggregate measure such as the arithmetic or geometric average or the median. Well, but now the question is: “how many is several”? And this is where it gets less clear.

Typically, “several” is defined very informally, so if the measurement is cheap (such as a network latency measurement), it can be 1,000 or even 1,000,000. If it’s expensive (such as full-scale supercomputer runs), we’re very quickly back to a single measurement. But does it make sense to define the number of measurements based on the execution cost? Of course not — it should depend on the variability of the data! Who would have thought that …?

Unfortunately, most benchmarkers do not take the data variability into account at all in practice. Why not? Isn’t it somewhat clear that one needs to? Yes, it is, but it’s also hard! But actually, it’s not that hard if one knows some basic statistics. The simplest way is to check if one has enough measurements for a given variability in the result. But how to assess the variability? Well, one needs to look at some samples. Ah, a catch-22? I need samples to know how many samples I need? Yes, that is true. In fact, the more samples I have, the higher my confidence in the variability and the correctness of my reported number.

A simple technique to assess the confidence of my measurement (we are simplifying this somewhat here) is to compute the confidence interval. Confidence Intervals (CIs) are a tool to provide a range of values that includes the true mean with a given probability p depending on the estimation procedure. So if the measurement is 1 second and the 95% CI is the range [0.9;1.1] then there is a 95% probability that the true mean is within that interval. There are two basic types of CIs: (1) confidence intervals around the mean assuming a normal distribution and (2) nonparametric confidence intervals around the median without assumptions on the distribution. The former is the simplest to compute: [mean-t(n-1,p/2)*s/sqrt(n); mean+t(n-1,p/2)*s/sqrt(n)] where mean is the arithmetic mean, s is the sample standard deviation, n is the number of samples, and t(x,p) is the corresponding critical value of Student’s t distribution with x degrees of freedom. So it’s easy to see that the interval quickly gets tighter when the number of samples grows. But which computing system generates measurements following a normal distribution, which would mean that it’s equally likely to become faster as slower? Well, my computers certainly become slower more often than faster, leading to a right-skewed distribution.

So how do we get to confidence intervals of non-normally distributed measurements? Well, first of all, if the data is not normally distributed, the average makes little sense as it will be skewed as well. So one usually reports the median (the n/2-th element in the sorted set of all n measurements) as the most likely value to be observed in practice. But how to get to our confidence interval? Since we cannot assume any distribution of the values, we work on the sorted set of measurements and call the ith value in the set the rank-i value. Now we identify rank floor((n-z(p/2)*sqrt(n))/2) to ceil(1+(n+z(p/2)*sqrt(n))/2) as the conservative CI, where z(p/2) is the corresponding critical value of the standard normal distribution. This CI is commonly asymmetric as well.
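The rank formula above can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not code from the paper: the function names are made up here, z(p/2) is interpreted as the two-sided critical value of the standard normal distribution (about 1.96 for a 95% CI), ranks are 1-based, and ranks that fall outside 1..n for very small n are simply clamped, which means the interval no longer has the nominal confidence in that case.

```python
import math
from statistics import NormalDist, median


def median_ci_ranks(n, confidence=0.95):
    """Return the 1-based (lower, upper) ranks of the nonparametric CI."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # e.g. ~1.96 for 95%
    half = z * math.sqrt(n)
    lower = math.floor((n - half) / 2.0)
    upper = math.ceil(1.0 + (n + half) / 2.0)
    # Assumption of this sketch: clamp to valid ranks for very small n.
    return max(lower, 1), min(upper, n)


def median_ci(samples, confidence=0.95):
    """Return (ci_low, median, ci_high) for a list of measurements."""
    xs = sorted(samples)
    lower, upper = median_ci_ranks(len(xs), confidence)
    return xs[lower - 1], median(xs), xs[upper - 1]


# 100 measurements: the 95% CI spans the values at ranks 40 and 61.
print(median_ci_ranks(100))
```

Note that the interval is read directly off the sorted measurements, so it is asymmetric whenever the data are skewed.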

So ok, we can now compute this CI as a statistical measure of certainty of our reported median. Median what? Don’t we like averages? Well, again, averages are not too useful for non-normally distributed data *unless* you care about only an accumulation of many measurements, i.e., you only want to know how expensive 1,000 iterations are and you do not care about every single one. Well, if this is the case, just measure the 1,000. If you’re well-versed in statistics, you will now recognize the connection to the Central Limit Theorem :-) .

But now again, how many measurements do we actually need?? To answer this, we’d first need to define a needed level of certainty, for example 95%. Then, we define an accepted error in our reporting around the median, for example 1%. In other words, we would like to have enough measurements to be 95% sure that the real median is within 1% of our reported value. Hey, so we’re now back to a single reported value just together with a certainty! So how do we achieve this? Well, for normally distributed data in the case (1), one could compute the number of needed measurements. But that doesn’t work with real computers, so let’s skip this here. In the nonparametric case, no explicit formula is known to us, so we would need to recompute the confidence interval after each measurement (or a set of measurements) and we could stop measuring once the 95% CI is within the 1% interval around the median.
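The stopping rule described above could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the function names, the batch size of 10, the sample budget, and the choice to keep measuring while the CI ranks fall outside 1..n. It also assumes positive measurements such as run times, so that "within 1% of the median" is well defined.

```python
import math
import random
from statistics import NormalDist, median


def median_ci_ranks(n, confidence):
    """1-based ranks of the nonparametric CI around the median (unclamped)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half = z * math.sqrt(n)
    return math.floor((n - half) / 2.0), math.ceil(1.0 + (n + half) / 2.0)


def measure_until_tight(measure, confidence=0.95, rel_err=0.01,
                        batch=10, max_samples=10000):
    """Call measure() in batches until the CI lies within rel_err of the median.

    Returns (median, ci_low, ci_high, n_samples, converged).
    """
    samples = []
    med = lo = hi = None
    while len(samples) < max_samples:
        samples.extend(measure() for _ in range(batch))
        xs = sorted(samples)
        n = len(xs)
        lo_rank, hi_rank = median_ci_ranks(n, confidence)
        if lo_rank < 1 or hi_rank > n:
            continue  # too few samples for a valid CI at this confidence
        med, lo, hi = median(xs), xs[lo_rank - 1], xs[hi_rank - 1]
        if lo >= med * (1.0 - rel_err) and hi <= med * (1.0 + rel_err):
            return med, lo, hi, n, True
    # Budget exhausted: report the (wide) CI and let the reader judge.
    return med, lo, hi, len(samples), False


# Usage with a synthetic right-skewed "run time" (hypothetical workload).
rng = random.Random(1)
result = measure_until_tight(lambda: 1.0 + rng.expovariate(10.0))
print(result)
```

Checking after each batch rather than after each single measurement keeps the overhead of re-sorting low; the final return path corresponds to running out of benchmarking budget.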

Wow, so now we know how to *really* measure and report performance! In fact, in practice, we often need fewer than 1,000 measurements to reach a tight interval with high confidence. So if they’re cheap, we can as well do them and check afterwards if the statistics make sense. But what if we are running out of benchmarking budget before we reach the required accuracy? For example, each measurement takes a day and we only have four days, but after four days the CI is still wider than we’d like it to be. Well, bad luck! In that case, we can only report the wide CI and leave it up to the reader/observer to conclude if our measurements make sense in their context.

I wish you happy (and correct) measuring! Torsten Hoefler

This blog post summarizes a part of the paper “Scientific Benchmarking of Parallel Computing Systems” which appeared at IEEE/ACM Supercomputing 2015. The full paper provides more insight and references around this topic and also the equation for the number of measurements assuming a normal distribution. The paper also establishes more rules for sound performance analyses that I may blog on later. Spread the word and cite the paper if you find these rules useful :-) .

United Airlines versus Switzerland

Since my time in the US, I have been a long-standing United Airlines (actually Continental) customer. I enjoyed the service and especially the modern computer system. The transparency of United operations towards their customers is just years ahead of their competition. I simply enjoy viewing the seat map, upgrade standby list, boarding pass etc. in my app or online. The even nicer feature is to see where the next plane comes from and if it is still on schedule. Yes, I’m a geek but kudos United!

Well, now I live in Switzerland (ZRH as main airport) and sometimes catch myself considering switching. My last flight was again one of these moments … let me elaborate.

I’m usually traveling on tight schedules (because I unfortunately travel too much). This means, my plane usually starts boarding when my train arrives in the station. The time to get to the gate is very deterministic. So far, I have done this for intra-European flights, and it works great! After United sent me this email about the online check-in and app, I thought I’ll try it for overseas as well – what can go wrong!?

Well, so I checked in using the app, which worked great as expected. Then, I made my first mistake: I asked at the check-in to reconfirm if that would work (I was on a very tight time schedule and needed to get breakfast in the lounge, so I could not afford to go back and forth). Ok, they told me that I need to repeat (at least parts of) the check-in procedure. This took forever, as usual, which is why I used the app check-in in the first place. So far, overall, a time loss (needed to do both, app check-in and counter check-in …). Uff! Also, the personnel seemed unfriendly as usual, they’re not United reps after all and it seems they don’t care much about service or the perception of the airline in general.

Ok, at the end, I proceeded to ask if United managed to buy into the Panorama lounge so I could get breakfast. This requires a bit of explanation: the international terminal E in ZRH has no SWISS lounge. United only cooperates with SWISS afaik, which operates a lounge in A. So all Star Alliance member airlines BUT United buy into the Panorama lounge which seems to be somewhat independent. There are immigration and a train between A and E, so no way to quickly change terminals. So I asked if United managed to buy into the E lounge, the person at check-in reconfirmed with the apparent lead and both agreed that I could, as a Star Alliance Gold member with an Eco ticket, use the E lounge now (this was after I explicitly told them that it did not use to be this way). Ok, great, so I minimize the risk of missing the plane which was boarding at that time.

When showing up at the E lounge, I was told that I was not allowed in. They even tried calling check-in or a United representative – nada, apparently all at the gate. So they called the gate and got the answer “the plane is already boarding, so send the customer down”. Well, I wonder if they have breakfast for me at the gate. Total fail United, total fail! The lounge desk personnel was not further willing to continue this discussion, so I went to the gate and asked about the situation.

First, somebody from swissport (the company that United hires for managing check-in etc.): I said I wanted to complain about their process. Apparently swissport doesn’t even have a mechanism to complain and they sent me to complain to United. Seriously, they do NOT have such a channel. Well, this makes it just too obvious how much they care about the customer. Ok, finally, there was a United representative at check-in (first time I saw somebody from United on the ground in ZRH actually). So I explained my situation and got a “sorry, they must have made a mistake”. It was, like often, with this latent, one could call it arrogant, “well, maybe it’s your fault and you may not have understood what they really said/meant” sub-tone. Gaaaaaaaa! At this point, I was ready to go up any wall, and nearly lost my temper. I repeated that I clearly understood what they said (even what they discussed in German) and that it was a very simple question after all. Fail #2: don’t treat customers like idiots (at least check before if they are).

Ok, accepting this mess (the boarding process was coming to an end), I explained that my major problem with this is that I didn’t have breakfast. Immediately, they pointed me to the airport shop so I can go and buy a “Gipfeli”. Wow, you screw up and then you ask me to fix it myself: great customer care! I asked explicitly if they wouldn’t have some snack in the plane – I got a clear no (which we all know is false). Ok, so I bought my breakfast before running into the plane … wow. I mean, it’s really not about the $20 breakfast, but the total unwillingness to (1) accept that something went wrong on their side and (2) try to fix it, even a little bit!

And then, I get into the plane and it’s like a different planet. As usual, everybody is friendly and helpful. The purser (who is now called “International Service Manager”) asked for feedback and I told him the story. He suggested sending the story (especially since the situation in ZRH is latently suboptimal) to United. I also got to talk to the United sales manager for Switzerland for quite a while. We had very interesting discussions and I even learned something. Well, I’m glad I left the problems on the ground but I’m not really looking forward to checking in to the next flight ;-) .

One more somewhat ridiculous issue in the context of United vs. Switzerland is that, sometimes, one has to use SWISS as a carrier (based on either route or time). United sells code-share tickets but really nothing seems to work between the two airlines. For example, there is no way that either SWISS or United can reserve a seat for such a flight. Yes, not even SWISS can! Sorry, what!? My secretary gave up after hours on the phone … so I’m not even looking forward to the return flight the day after tomorrow (probably 9 hours in a middle seat …). Gaaa! This seems to be the most effective way to get rid of frequent fliers. Thank you.

SPCL activities at SC15

I am just back from SC15, definitely the most stressful week of my year. It was much worse than usual: I slept an average of 4.5 hours last week because I had a full schedule each day and had to prepare overnight. Fortunately, I have my device measuring my sleep so I could understand why I felt so miserable :-) .

But it was absolutely amazing! I really love SC, the community, the excitement, the science at the conference. As usual, I learned a lot and SPCL communicated a lot. This year, I brought two students with me: Maciej and Tobias. Here is what we did at SC15:

  1. Sunday morning: International Workshop on Heterogeneous High-performance Reconfigurable Computing
    I co-organized this workshop together with a great team! My special thanks go to Michaela and Jason! The workshop was wildly successful. The room was packed for the two keynotes by Doug Burger and Jason Cong. We could start an interesting discussion about the role of reconfigurable logic in HPC.

  2. Sunday afternoon: Tutorial on Insightful Automatic Performance Modeling

    Together with Alex Calotoiu (main presenter), Felix Wolf, and Martin Schulz. The tutorial discussed our previous work in automatic performance modeling and was well attended (~30)! I’d like to change some things but we’ll see if I can be convincing enough for my co-presenters.

  3. Monday: Full-day tutorial on Advanced MPI Programming

    Was as usual very well attended (~50) and a lot of fun to teach! I had to sneak out in the morning to speak at the panel “Research Panel: A Best Practices Guide to (HPC) Research” which was also a lot of fun (especially with Bart Miller).

    If you couldn’t make it then I’d suggest the book on the same topic (has very similar, actually slightly more, content).

  4. Tobias prepared his poster for the ACM student research competition
    He even made it into the finals and presented his work to the jury!

  5. SIGHPC Annual Meeting

    As an elected officer, I attended the SIGHPC BoF at SC15. Lots of exciting news, especially Intel’s fellowship program!

  6. Graph500 BoF

    As every year, we released the Green Graph500 list. My slides.

  7. BoF Performance Reproducibility in HPC – Challenges and State-of-the-Art

    I presented my disruptive view at this BoF. Basically saying that we may want to give up and care about interpretability first! Similar in vein to my talk later in the week.

  8. Tobias presented the STELLA paper
  9. Georgios presented the diameter-two topologies paper

    A collaboration with IBM Research Zurich. Here’s the paper.

  10. Maciej received the George Michael HPC fellowship

    During the SC15 awards ceremony. Well done Mac!

  11. I presented our paper “Scientific Benchmarking of Parallel Computing Systems”

    The room was nicely filled. The talk was rather provocative but I put cuddly vegetables on the slides. Thus, must be fine ;-) . Here are slides and paper!



Finally done! I arrived home and accepted the Latsis prize today. Now ready to get a lot of sleep …

2nd SPCL Barbecue

Continuing our lab tradition that actually started in 2009 (with two people), we celebrated our scientific achievements with a party (now with 20 people). We had a lot to celebrate and even more that I cannot mention here yet (both will be announced by ETH very soon!).

We started at 4pm even though most people arrived around 5pm (partially due to some confusion about the location) and the hard core partied until 12:45am when we nearly ran out of firewood.

Some (rough) consumption statistics:
- ~10l wine
- ~27 bottles of beer
- ~2.5l various hard liquors (too much!)
- 16 beef patties (1.6kg), 8 burger buns
- home-marinated chicken (1kg)
- Bauern sausage (2kg)
- various other (Polish etc.) sausages (~1.5kg)
- 2 full-plate quiches (should have had three, were gone very fast)
- again, low consumption of non-alcoholic beverages (4l water, 2l juice)
- ~2kg vegetables (cucumber, pepper, …)
- 1kg bread
- 45 home-made American-style cookies (chocolate chip, pumpkin, raisin)
- various snacks (peanuts, chips, …)


Two preparing firewood and one watching (no comment!)


Took a while to get the fire going because of the really wet wood but then it was unstoppable!


lots of food and drinks (I don’t have a good picture of the big pile of food unfortunately)


Even special vintage wines from 1993 from Moldova.


Starting the special BBQ setup after making enough ember.


Nice chats, nice forest (Switzerland rocks)


When shopping, we couldn’t not buy the Swiss Eidgenoss beer “Ein Schluck Heimat” :-) .


The grill looked 10x more professional than last time (see some exponential growth here).


It got dark a bit early, well, it’s late fall. BUT the weather was very nice and even though it was around 10 C, it was never cold due to the fire (so we can do this pretty late/early in the year).


The fire went strong …


The Eidgenoss beer was finished first (it was actually pretty good) :-) .


The fire went very strong until the bitter end of the wood, we were nearly running out at 12:45am (nearly 8 hours after the start). We decided to leave some wood for the next people :-)

Microsoft Store – the worst shopping experience I can remember

You would think that a company like Microsoft has their online retailing somewhat under control. My first (and probably last) attempt to order something there failed miserably. Here’s the story:

I needed a new laptop for teaching, not too pricey, touchscreen, convertible. The Acer Aspire 11 seemed to fit that category. So I found a good deal on the Microsoft store for $449 through that link. It was Thursday August 13th and I needed it by August 24th. Great, shipping in 3-7 business days, that works!

I added it to the cart, created a new account, verified it, works! Then I proceeded to checkout and after entering my credit card information the whole thing crashed. I only got a blank page and nothing else. Well, ok, close the store and retry logging in. Of course some cookie got stuck and when logging in, I got only the default error message “An error has occurred, ask support”. It is Thursday night.

Ok, well, there’s this chat feature and I tried it. Thirty minutes later, the person at the other end told me that the product I just purchased is not available. Well, weird, I sent her the link and she acknowledged that she sees the “add to cart” button but the product is not available. Huh, must be a bug? At the end, she could not push the old order through (something I do on a regular basis because I travel a lot). I remark that I had (have) an order number and everything but it seemed like this is not good for anything — I’m wondering what kind of database they have. It was also confusing that she constantly asked me what I ordered and who I am (I mean the order number should have these things attached … oh well).

Fine, the conclusion was to try another browser and re-order it myself. An hour of my life gone … I tried Firefox (was Chrome before) and indeed the store worked again (no cookies). I was able to order it. Now my bank declined the order due to some fraud alert. Fine, I called the bank and pushed the order through, the bank acknowledged (via email, as usual) the full charge and Microsoft sent an order confirmation saying “it may take as long as 4-6 hours for us to process it.” Phew, done!

Ok, great … now it’s Friday and I have not gotten any shipping confirmation from Microsoft. Weird … 4-6 hours turned into 48 hours. I call the support (chat doesn’t seem to work to inquire about orders). The support line is overly complex and annoying trying to verify my account (why!? I have an order number, what role does the account play?). It takes minutes for them to send a challenge/response email to my self-made email address (as if this is any verification …). Well, I wait patiently on the line, this is my first call. So they tell me again that the product I ordered does not exist. But hey, I have an order confirmation!!!??? Then they blame the bank, I tell them to charge the bank right now again to check. They can’t do it, not sure why. Apparently, it needs to be “escalated”. They take my number and I’ll hear within 24 hours. Fine.

Well, I guess they weren’t able to call a German number, so I didn’t hear anything for 48 hours. Just nothing, no email, nothing. It nearly seems like they silently hope I forgot about the order (and the bank charge). It’s Tuesday the 18th now, getting tight. I call them again. They tell me it was escalated … well, yeah, I know this since I just gave her the case number *hmpf*. Each of these calls takes 30 minutes at least (partly due to the silly account verification even though I have an order number AND a case ID). Well, fine, no news, I need to wait for the “escalation team” which apparently cannot be reached and only operates by interrupting me. I’m a busy person and this is a silly concept, but fine, wait again.

Next day, nothing happened. I call again. AGAIN they tell me that the product I ordered does not even exist anymore. Well, I spell the link above into the phone and the other side is surprised and confused. Then, they are quick to tell me that there was also a problem with my bank but apparently they don’t see that it was resolved (must really be a great database). I gave up, now I just want to cancel the order. BUT they CANNOT cancel it. I now have to rely on their system to drop the order after a while (which it may or may not do, it’s not clear if it’ll wake up in the future and suddenly charge my card and send this laptop). This is a truly horrible shopping system. So fine, I’ll rely on their word, after all, they boast about free returns. But this system appears extremely unprofessional. Microsoft should be able to do better. THIS is not the way to do business.

I spent a total of four and a half hours on the phone and in chats, all for nothing. I’m not going to compute what my salary was … definitely more than the laptop is worth.

Then I order the same thing on Amazon, well, within minutes I have order confirmation, charge, everything is on its way. However, due to the great Microsoft delay, I had to pay $15 extra for expedited shipping. Thank you Microsoft, this is wonderful!

And the saga continues: This morning, I received an email regarding my case ID. They DID NOT GET that I cancelled this order. Well, why should they, it cannot be cancelled after all. Wow, this is getting truly crazy and very unprofessional. I cannot recommend business with the Microsoft store. Fortunately, I know many higher-up Microsoft employees, I’ll mention this next time I’m in Redmond. Sadly, this is how one creates a bad reputation. I hope this documentation helps to improve the process!

Update (15/8/20): It is getting better: I sent them a link to this description and the answer is: “However we do apologize for the inconvenience that the computer you are requesting is now out of stock and you will not get this PC at the sale price.” Wow, they’re good at making snarky apologies that don’t sound apologetic at all. There is of course no word about cancelling my order or anything (may still be “impossible”). The item is also STILL on the store webpage and I can still add it to my cart. Yesterday, I thought it couldn’t get worse but they don’t stop surprising me!

Update (15/8/22): Microsoft, please stop sending me emails. I now received three (!!) more emails, two of them with identical content (see above). I guess it’s not enough to make the snarky comment once. The whole support system now looks to me like an AI/ML algorithm gone wild. I will not reply because I fear it’ll trigger more frustration!

Update (15/8/23): This is no joke, I received another (fourth) email about this. The exact same content as two of the emails before … Microsoft is not missing any chance for snarky comments: “… you will not get this PC at the sale price.” Yes, remind me that I should feel ripped off every day now … please stop!

Update (15/8/25): It is getting funny now. I received another email. Now it is essentially empty and only contains the default text which seems to ask me to call them. But I am not going to do this … well, each call costs me 30 minutes. I also already canceled my order. Wow, this system is incredibly broken, unbelievable. I am typing this post on the other laptop already …