
Research at Microsoft 2020: addressing the present while looking to the future

Microsoft researchers pursue the big questions about what the world will be like in the future and the role technology will play. Not only do they take on the responsibility of exploring the long-term vision of their research, but they must also be ready to react to the immediate needs of the present. This year in particular, they were asked to use their roles as futurists to address pressing societal challenges.

In early 2020, as countries began responding to COVID-19 with stay-at-home orders and business operations moved from offices into homes, researchers sprang into action to identify ways their skills and projects could help while also making personal and professional adjustments of their own. In some cases, they pivoted to directly address the pandemic. A team from Microsoft Research Asia developed the COVID Insights website to promote scientific analysis and understanding of the disease, while the Socially Intelligent Meetings program expanded its work in telepresence technologies to include the Meetings During COVID-19 project. From responses provided by employee volunteers, these researchers are piecing together the effects of taking meetings almost entirely via screens.

Researchers also turned to the wider research community in pursuit of solutions that would allow people to persevere through these challenging times and prosper beyond them, collaborating with academics on topics related to pandemic preparedness and convening the New Future of Work symposium in August. A series of reports produced by Microsoft draws on a variety of information, including research from throughout the company, to study worker productivity and well-being. The insights are leading to enhancements in Microsoft productivity tools available now and in the near future, such as Together mode, virtual commutes, and meditation experiences in Teams (the latter two features roll out next year).

“When a major crisis strikes the world, science and technology research is almost always of paramount importance in response, rehabilitation, and—ultimately—creating resilience for the future. Today, the people of Microsoft Research are proving to be critical in dealing with climate calamities, such as famines and major wildfires; global health threats, such as that posed by the COVID-19 pandemic; the weakening of democratic institutions posed by misinformation and insecure voting infrastructure; wildlife extinction caused by pollution and illegal poaching; and more. While our main mission continues to be grounded in fundamental, long-term research, contributing to societal resilience is also a growing element in how we ensure a good future for all.”

Peter Lee, Corporate Vice President, Microsoft Research & Incubations

Meanwhile, research started before the pandemic feels increasingly significant. In April, two papers with implications for workplace well-being and hybrid scenarios, respectively, were presented on the conference circuit. Researchers from the productivity group developed models that leverage digital activities and other data to suggest appropriate times for workers to switch tasks and take breaks, while researchers in the United Kingdom and Canada built a two-way telepresence system to enhance collaboration among remote and local individuals. In December, Eyal Ofek shared how advances in virtual reality could be used to maximize our workspaces—wherever they may be—during a Microsoft Research webinar.

And while a lot has been said about work in these unprecedented times, research into the dynamics of epidemics themselves moved forward. In September, it was announced that Microsoft Premonition, a system leveraging robotics and genomics to track pathogens responsible for widespread disease, is being made available to additional partners.

Microsoft Premonition researchers worked to create scalable monitoring solutions for early disease detection. In a trial in Houston, Texas, they used smart traps to capture and monitor mosquitoes, and then data was analyzed in the cloud with a goal of spotting new transmission patterns. To learn more, explore the news article.

While research relevant to the pandemic has been of the highest importance this year, Microsoft researchers took their research in a broad range of directions—progress in AI, healthcare technology, and security advanced rapidly. Below is a selection of highlights that came out of Microsoft Research in 2020.

Scaling AI for better performance and real-world applications

This year saw significant breakthroughs for creating AI that is substantially more powerful, scalable, and readily integrated into Microsoft products. The AI at Scale initiative, born from a cornucopia of work in the area in 2020, combines large-scale models, AI supercomputing, and teams of researchers and product engineers working together to implement AI in a variety of Microsoft products and infrastructure.

The Microsoft Turing team’s natural language generation and natural language representation models went from being announced in February to implementation in products like Microsoft Word and Bing, with one model ultimately setting a record on the XTREME benchmark for cross-lingual transfer learning in October.

Listen to the full interview with Rangan Majumder on the Microsoft Research Podcast.

Helping to power large AI models, the DeepSpeed library with Zero Redundancy Optimizer (ZeRO) also underwent major transformations since its introduction in February. The library initially included support for training models up to 100 billion parameters in size, but by the end of the year, the library was capable of training models up to 1 trillion parameters while also introducing new methods to train models with lower resource costs. On the testing side of natural language generation and understanding, researchers released XGLUE, a benchmark dataset for gauging models’ zero-shot cross-lingual transfer capabilities across 19 different languages.
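For a concrete sense of how a developer enables ZeRO, here is a minimal, hedged sketch of a DeepSpeed training setup in Python. The model, batch sizes, and learning rate are placeholders rather than settings used in the work described above, and configuration keys can differ slightly across DeepSpeed releases.

```python
# Minimal sketch: enabling ZeRO through a DeepSpeed config (placeholder values throughout).
import torch
import deepspeed

model = torch.nn.Transformer(d_model=512, nhead=8)  # stand-in for a much larger model

ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 8,
    "fp16": {"enabled": True},                        # mixed-precision training
    "zero_optimization": {"stage": 2},                # partition optimizer state and gradients
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# deepspeed.initialize wraps the model in an engine that handles partitioning,
# mixed precision, and distributed communication behind an otherwise ordinary training loop.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)
```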

In the realm of vision and language pretraining (VLP), Microsoft researchers released OSCAR (Object-Semantics Aligned Pretraining) in May, a novel framework with state-of-the-art performance on six different vision-and-language tasks. In October, researchers collaborating with Azure Cognitive Services created VIVO (Visual Vocabulary Pretraining), a framework for novel object captioning that achieved state-of-the-art results and even surpassed human performance on the nocaps benchmark.

Technologies like Microsoft Floating Point, used in the Project Brainwave architecture, are helping to lower the cost of deep neural network (DNN) inference. Improvements like this allow Microsoft to power large AI models on the scale needed to empower Microsoft users around the world. Head over to the AI at Scale page to learn about some of the many other projects undertaken at Microsoft Research to advance large-scale AI this year.

Building AI responsibly by pursuing safety, fairness, interpretability, and accessibility

As AI techniques have made leaps and bounds, researchers are undertaking the crucial task of examining responsible practices in AI, which include accessibility, fairness, and interpretability. Methods for understanding and explaining what AI does, as well as assessing fairness at all stages of development, are big trends in this area.

“Building and fielding AI responsibly is a challenging, cross-disciplinary research area. Our progress over the last year builds on insights from previous years with an emphasis on applying our research and learnings to developing usable methods and tools that can help engineers design and develop trustworthy AI systems. We’ve made strong progress, but our journey is far from over.”

Eric Horvitz, Technical Fellow and Chief Scientific Officer

In January, researchers shared their insights into how societal bias in historical data used to train algorithmic decision-making systems could be reinforced by future data and explored two interventions that could help to correct this bias. Meanwhile, another team of researchers developed a framework and open-source library for generating explanations that individuals adversely impacted by system decisions—such as those who’ve been denied a loan or insurance—can use to work toward a positive determination. These counterfactual explanations can also be used for system evaluation, becoming a tool for practitioners. Builders of AI tech are at the center of two papers recognized at CHI 2020. Hanna Wallach, Jennifer Wortman Vaughan, and their co-authors sought to empower the group with an actionable checklist—co-designed with practitioners—for discussing and addressing fairness throughout the AI life cycle, as well as an examination of the effectiveness of available interpretability tools, based on an interview study and survey with practitioners. Wortman Vaughan explored understanding AI systems in a January webinar.

Collaboration with and inclusion of those closest to the tech was also happening on the accessibility front, where advances in computer vision and natural language processing (NLP) are allowing researchers to improve AI for alt text generation and object identification. Researchers worked with people who are blind or have low vision to understand how to improve the alt text generated by automated systems and to develop a dataset for personalized object recognition. In a March webinar, Dr. Danna Gurari and Dr. Ed Cutrell discussed developing impactful vision systems and the role dataset challenges play. Meanwhile, a partnership with people living with and affected by amyotrophic lateral sclerosis (ALS) laid the groundwork for Expressive Pixels, a platform for creating LED-display animations that simultaneously offers opportunities for creating, learning, and communicating in new ways.

Responsible AI requires that the talent, knowledge, and experiences of those developing it are as diverse as the people using it. Microsoft is committed to clearing a path into the industry for underrepresented groups, including sponsoring and participating in events like the Black in AI, Queer in AI, and Women in Machine Learning NeurIPS workshops and creating professional and academic opportunities through Microsoft Research. To have your work supported or to join the Microsoft Research team, see our Academic Programs—such as the Dissertation Grant for PhD students from underrepresented groups (proposals will start to be accepted in February)—and open research positions and internships.

Practical and theoretical advances in cryptography and security

As technology becomes more embedded in and essential to people’s lives, creating new technologies for 2020 and beyond demands deeper consideration of people’s privacy, security of the web and internet-connected devices, and safeguards on human rights. This year, researchers introduced ElectionGuard, a system that applies homomorphic encryption to both secure people’s votes—so that no one else can see how they voted—and allow voters to verify their votes are properly counted. The system was piloted in Fulton, Wisconsin, in February, where election officials tested machines running ElectionGuard; the final tally was gathered through traditional paper ballot methods. The code for ElectionGuard was made open source, and Josh Benaloh presented on the technology in a webinar in April.
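To illustrate the underlying idea, that encrypted ballots can be combined and tallied without revealing any individual vote, here is a toy Python sketch of additively homomorphic (exponential ElGamal) tallying. The parameters are deliberately tiny and insecure; a real system such as ElectionGuard uses much larger, standardized parameters and adds zero-knowledge proofs that each ciphertext encrypts a valid vote.

```python
# Toy sketch of homomorphic vote tallying (exponential ElGamal); illustrative values only.
import secrets

p = 2**127 - 1                       # toy prime modulus (far too small for real security)
g = 3                                # toy generator
x = secrets.randbelow(p - 2) + 1     # election secret key (held by trustees)
h = pow(g, x, p)                     # election public key

def encrypt(vote):
    """Encrypt a 0/1 vote as the ciphertext (g^r, g^vote * h^r) mod p."""
    r = secrets.randbelow(p - 2) + 1
    return pow(g, r, p), (pow(g, vote, p) * pow(h, r, p)) % p

def add(c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the underlying votes."""
    return (c1[0] * c2[0]) % p, (c1[1] * c2[1]) % p

def decrypt_tally(c, max_votes=1000):
    """Decrypt the aggregate and recover the (small) vote total by brute force."""
    m = (c[1] * pow(c[0], -x, p)) % p          # equals g^total mod p
    for total in range(max_votes + 1):
        if pow(g, total, p) == m:
            return total
    raise ValueError("tally exceeds max_votes")

ballots = [1, 0, 1, 1, 0]                      # individual votes stay encrypted
aggregate = encrypt(ballots[0])
for b in ballots[1:]:
    aggregate = add(aggregate, encrypt(b))
print(decrypt_tally(aggregate))                # 3, without decrypting any single ballot
```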

Cryptography was also explored from a post-quantum angle. Future quantum computers could decipher even the most secure current cryptographic techniques, so researchers in this space have begun to investigate new methods for making the cryptography of the future equal to the task of protecting people’s privacy and information in a world where quantum computers far exceed the power of supercomputers now. These methods can also keep information more secure in the present computing landscape. To learn more about the world of post-quantum cryptography, check out webinars from Craig Costello and Christian Paquin below.

Researchers and engineers released resources and technologies to uncover security vulnerabilities and identify potential attacks. In March, Patrice Godefroid made a case for developers adding fuzzing to their toolkit to detect vulnerabilities in software through automated testing, and in November, a team introduced RESTler, “the first stateful REST API fuzzing tool for automatically testing and finding security and reliability bugs in cloud/web services through their REST APIs.” Researchers also released Project Freta, a service for Linux systems that detects evidence of OS and sensor sabotage by analyzing a memory snapshot to find rootkits and other malware.

Improving healthcare through technology

The importance of healthcare technology came into especially strong focus as this year progressed. In many instances, projects already underway were particularly timely, as was the case with investigations into making online mental health interventions more effective through data analysis, the potential for personalizing those mental health interventions via subtyping, improving mental health helpline technology, and using chat apps to help facilitate patient care in hospitals.

In late August, researchers announced a method for biomedical NLP pretraining that could help scientists stay up to date with the continually increasing amount of new scientific knowledge in the field by using NLP to quickly identify and cross-reference important findings. Their model, PubMedBERT, obtained state-of-the-art results in several biomedical applications.

Dr Raj Jena using InnerEye software

As medical and mental health professionals adapt how they provide care for people in the 21st century, researchers intend to continue to create technology that complements experts’ skills.

Bringing reinforcement learning into the real world

Reinforcement learning—a framework in which ML systems learn via interactions with their environment—has long been an active area at Microsoft Research, and the drive to advance RL has increased thanks to the approach’s success in Microsoft products and services. Researchers are tackling RL both empirically and theoretically, and their enthusiasm and efforts were on full display in 17 NeurIPS-accepted papers that pursued a variety of promising avenues.

Two papers designed methods for leveraging existing logged datasets in an area of RL that uses past experience to give agents a leg up prior to deployment, while separate work brought strategic exploration to the popular gradient descent–based approaches for RL. Other work highlighted a trend in learning good representations for agents’ observations. In their respective papers, Akshay Krishnamurthy, Devon Hjelm, and their coauthors incorporated auxiliary prediction problems to discover representations that simplify downstream learning tasks. Earlier in the year, researchers deployed Transformers in several ways to develop Working Memory Graph, an RL agent capable of more efficient learning when advanced reasoning is involved, such as future planning in the game Sokoban.

Games make great arenas in which to train agents for use in gaming or in more general applications. In August, Sam Devlin and Katja Hofmann shared work done as part of a new collaboration with game developer Ninja Theory around enhancing gaming with RL agents capable of teamwork with human counterparts. Also during the summer, researchers kicked off the second iteration of MineRL, a sample-efficient RL challenge based on the platform Project Malmo, which uses Minecraft as a playground for AI experimentation.

For an overview of RL, check out Hofmann’s webinar, and to learn about one RL framework in particular, multi-armed bandits, read this introductory text on the subject. And if you can hardly wait to learn more, fret not—you can start the new year off strong with Reinforcement Learning Day 2021 in January. Until then, check out content, including videos, from last year’s event.

Optimizing AI

On the winding road of this year in technology research, it’s only fitting that we loop back to AI, which researchers sought to optimize from multiple perspectives. One perspective considered how AI has evolved alongside the game of chess. In the last few decades, AI has advanced to a point where it can spar with and succeed against the best players in the world. This led researchers to shift their focus from how to make AI better at chess to how chess-playing AI can be refined to better match human playing styles and skill levels. As a result, researchers created Maia, a Leela Chess Zero–based engine that matches human play more closely than previously achieved.

To ramp up neural architecture search (NAS) research, Archai was introduced to make work in this area more usable, reproducible, and unified. The framework allows standard NAS algorithms to be executed with a single command line, making it easy to experiment with and add new algorithms and datasets. Researchers also proposed an autoML approach, called Optimal Transport Dataset Distance, for comparing any two classification datasets, even if their labels aren’t directly comparable. If you’re interested in learning more about autoML, check out the Directions in ML Speaker Series, which kicked off in July.

The Semantic Machines research team introduced a new framework for conversational AI in which dialogues represented as dataflow graphs make AI more flexible in its ability to adapt to the natural flow of conversation. Along with this, they released the largest, most complex task-oriented dialogue dataset to help advance conversational AI research more broadly.

Computer vision moved forward on many fronts in 2020. Researchers created a visual question answering (VQA) evaluation score in their work to understand the connection between visual understanding and neuro-symbolic reasoning. The score betters prior evaluation methods by isolating reasoning from perception in VQA models with a differentiable first-order logic framework. Researchers also investigated how two concepts integral to human reasoning, locality and compositionality, can help to enhance zero-shot representation learning.

Researchers out of Microsoft Research Asia developed methods to improve visual recognition with HRNet and boost photo enhancement with two AI techniques—one that transfers high-resolution texture information to low-resolution images and another that uses variational autoencoders (VAEs) to restore old photos. Other advances in deep generative networks included Optimus, FQ-GAN, and Prevalent, while researchers also found ways to extend adversarial robustness and training, concepts closely associated with GANs, to transfer learning and causal inference.

Finally, researchers are looking at a future AI landscape that is increasingly multimodal and interactive. However, engineering AI that uses multimodal streaming data in real time is time consuming because programming infrastructures in this area are lacking. Researchers built Platform for Situated Intelligence to provide an open-source framework for experimentation, development, and research in this area.

Continuing a tradition of research with real-world impact

The above research represents a small portion of the great work that was enthusiastically pursued by dedicated researchers at Microsoft Research in 2020, and even if we were to consider all the work produced this year, it would still tell only part of a bigger research story at Microsoft.

In a new series of posts this year, we began connecting the dots to provide an overview of how the individual contributions of researchers and their collaborators are coming together to have a profound impact on customers and society at large. Our collection on responsible AI shows how researchers are upholding and advancing the Microsoft commitment to AI grounded in principles that put people first and benefit society, while our collection on reinforcement learning recounts the history of work in the field and its application to Microsoft products and services. Visit our collections archive for more.

2020 has been a year like no other, underscoring the importance of the relationship between research and resilience. As we look to the future, researchers are an important part of answering the big questions that will shape the direction of society. We have been inspired by the research community’s continued commitment to technological advancements—those in direct response to these unprecedented times and those keeping research on all fronts moving forward. We wish you and yours a safe and healthy new year.

To stay up to date on all things research at Microsoft, follow our blog and subscribe to our newsletter and the Microsoft Research Podcast. You can also follow us on Facebook, Twitter, YouTube, and Instagram.



Researchers explore using consumer cameras for contact-free physiological measurement in telehealth and beyond

Our research is enabling robust and scalable measurement of physiology. Cameras on everyday devices can be used to detect subtle changes in light reflected from the body caused by physiological processes. Machine learning algorithms are then used to process the camera images and recover the underlying pulse and respiration signals that can then be used for health and wellness tracking.

According to the CDC WONDER Online Database, heart disease is currently the leading cause of death for both men and women in the United States. However, most deaths due to cardiovascular diseases could be prevented with suitable interventions. Early detection of changes in health and well-being can have a significant impact on the success of these interventions and boost the chances of positive outcomes. Atrial fibrillation (AFib) is an example of a symptom that can indicate increased risk of heart disease, and when detected early, it can inform interventions that help to reduce risk of stroke.

Physiological sensing plays an important role in helping people track their health and detect the onset of symptoms. However, there are barriers that act as a disincentive to conducting physiological sensing, such as limited access to medical devices and the inconvenience of performing regular measurements. Making physiological sensing more accessible and less obtrusive can reduce the burden on people to perform physiological assessments of this kind and help catch early warning signs of symptoms like AFib.

Over the past decade, researchers have discovered that increasingly available webcams and cellphone cameras combined with AI algorithms can be used as effective health sensors. These methods involve measurement of very subtle changes in the appearance of the body across time, in many cases changes imperceptible to the unaided human eye, to recover physiological information. In essence, as ambient light in a room hits your body, some is absorbed and some is reflected. Physiological processes such as blood flow and breathing change the appearance of the body very subtly over time.

A smartphone camera can pick up this reflected light, and the changes in pixel intensities over time can be used to recover the underlying sources of these variations (namely a person’s pulse and respiration). Using optical models grounded in our knowledge of these physiological processes, a video of a person can be processed to determine their pulse rate, respiration, and even the concentration of oxygen in their blood.
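As a rough illustration of this principle (not the researchers’ actual method), the sketch below averages the green channel over a skin region in each frame, band-pass filters the resulting signal around plausible heart rates, and reads off the dominant frequency. The frame layout and sampling rate are assumptions.

```python
# Minimal sketch of camera-based pulse estimation from pixel-intensity changes.
import numpy as np
from scipy.signal import butter, filtfilt

def estimate_heart_rate(frames: np.ndarray, fps: float = 30.0) -> float:
    """frames: array of shape (T, H, W, 3) holding a video of a skin region."""
    # 1. Spatially average the green channel, which is most sensitive to blood volume changes.
    signal = frames[..., 1].mean(axis=(1, 2))
    signal = signal - signal.mean()

    # 2. Band-pass filter to 0.7-4 Hz (roughly 42-240 beats per minute).
    b, a = butter(3, [0.7, 4.0], btype="bandpass", fs=fps)
    filtered = filtfilt(b, a, signal)

    # 3. Take the dominant frequency of the filtered signal as the pulse rate.
    spectrum = np.abs(np.fft.rfft(filtered))
    freqs = np.fft.rfftfreq(len(filtered), d=1.0 / fps)
    return float(freqs[np.argmax(spectrum)] * 60.0)   # beats per minute
```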


Building on previous work, our team of researchers from Microsoft Research, the University of Washington, and OctoML has collaborated to create an innovative video-based, on-device optical cardiopulmonary vital sign measurement approach. The approach uses everyday camera technology (such as webcams and mobile devices) and a novel convolutional attention network, called MTTS-CAN, to make real-time cardiopulmonary measurements possible on mobile platforms with state-of-the-art accuracy. Our paper, “Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement,” has been accepted at the 34th Conference on Neural Information Processing Systems (NeurIPS 2020) and will be presented in a Spotlight talk on Monday, December 7, from 6:15 to 6:30 PM (PT).

Camera-based physiological sensing applications in telehealth

Camera-based physiological sensing has numerous fitness, well-being and clinical applications. For everyday consumers, it could make home monitoring and fitness tracking more convenient. Imagine if your treadmill or smart at-home fitness equipment could continuously track your vitals during your run without you needing to wear a device or sync the data. In clinical contexts, camera-based measurements could enable a cardiologist to more objectively analyze a patient’s heart health over a video call. Contact sensors, necessary for monitoring vitals in intensive care, can damage the skin of infants—remote sensing could provide a more comfortable solution.

Perhaps the most obvious application for camera-based physiological sensing is in telehealth. The SARS-CoV-2 (COVID-19) pandemic is transforming the face of healthcare around the world. One example of this revolution can be seen in the number of medical appointments held via teleconference, which has increased by more than an order of magnitude because of stay-at-home orders and greater burdens on healthcare systems. This shift is driven by the desire to protect healthcare workers and by restrictions on travel, but telehealth also benefits patients by saving them time and costs. The Centers for Disease Control and Prevention is recommending the “use of telehealth strategies when feasible to provide high-quality patient care and reduce the risk of COVID-19 transmission in healthcare settings.” COVID-19 has been linked to increased risk of myocarditis and other serious cardiac (heart) conditions, and experts are suggesting that particular attention should be given to cardiovascular and pulmonary protection during treatment.

In most telehealth scenarios, however, physicians lack access to objective measurements of a patient’s condition because of the inability to capture signals such as the patient’s vital signs. This concerns many patients because they worry about the quality of the diagnosis and care they can receive without objective measurements. Ubiquitous sensing could help transform how telehealth is conducted, and it could also contribute to establishing telehealth as a mainstream form of healthcare.

It can take many years for new technologies such as these to transition from research discoveries to mature applications. The fields of AI and computer vision, as a whole, are six decades old, yet it is only in the past 10 years that many applications have started to reach fruition. Research on camera-based vital sign monitoring began much more recently—within the past 15 years—so there is still a lot of effort required to help it reach maturity.

Improving accuracy, privacy, and latency for contactless vital sign sensing methods

Contact sensors (electrocardiograms, oximeters) are the current gold standard for measurement of heart and lung function, yet these devices are still not ubiquitously available, especially in low-resource settings. The development of video-based contactless sensing of vital signs presents an opportunity for highly scalable physiological monitoring. Computer vision for remote cardiopulmonary measurement is a growing field, and there is room for improvement in the existing methods.

First, the accuracy of measurements is critical to avoid false alarms or misdiagnoses. The US Food and Drug Administration (FDA) mandates that testing of a new device for cardiac monitoring should show “substantial equivalence” in accuracy with a legal predicate device (for example, a contact sensor). This standard has not yet been attained by non-contact approaches. Second, designing models that run on-device helps reduce the need for high-bandwidth internet connections, making telehealth more practical and accessible. Our method, detailed below, works to improve accuracy with a newly designed algorithm (see Figure 1) and runs on-device.

Figure 1: The trade-off between latency (the time it takes to process each frame of video) and error in heart rate estimation. An optimal method would be in the top left corner, meaning we can process video frames at a high rate and with small errors. Our proposed method, MTTS-CAN, has the lowest latency and has accuracy that is well above the baseline we used for our research. The MT-Hybrid-CAN was also developed as part of our research to support devices with bigger computational power, such as PCs.

Camera-based cardiopulmonary measurement is also a highly privacy-sensitive application. This data is personally identifiable, combining videos of a patient’s face with sensitive physiological signals. Therefore, streaming and uploading data to the cloud to perform analysis is not ideal. This motivated us to develop methods that run on device—helping keep people’s data under their control.

Finally, the ability to run at a high frame rate enables opportunistic sensing (for example, obtaining measurements each time you look at your phone) and helps capture the waveform dynamics that could be used to detect atrial fibrillation, hypertension, and heart rate variability, conditions for which high frame rates (at least 100 Hz) are required to yield precise measurements.

MTTS-CAN: Using a convolutional neural network to improve non-contact physiological sensing

To help address the gaps in the current research, we developed an algorithm for multi-parameter physiological measurement that can run on a standard mid-range mobile phone, even at high frame rates. The method uses a type of deep learning algorithm called a convolutional neural network and analyzes pixels in a video over time to extract estimates of heart and respiration rates. The algorithm extracts two representations of the face: 1) a motion representation that captures the temporal changes in pixel information and 2) an appearance representation that helps guide the network toward the spatial regions of the frame to focus on. Our specific design of this method is called a multi-task temporal shift convolutional attention network (MTTS-CAN). See Figure 2 below for details.

Figure 2: MTTS-CAN is a new neural network architecture that allows for efficient, multi-parameter physiological measurement from video. The video is analyzed to extract subtle changes in pixel intensities over time and then recover estimates of the underlying pulsatile and respiratory signals.

We introduced several features to help address the challenges of privacy, portability, and precision in contactless physiological measurement. Our end-to-end MTTS-CAN performs efficient temporal modeling and removes sources of noise without any added computational overhead by leveraging temporal shift operations rather than 3D convolutions, which are computationally onerous.

These shift operations allow the model to capture complex temporal dependencies, which are particularly important for recovering the subtle dynamics of the pulse and respiration signals. An attention module improves signal source separation by helping the model learn which regions of the video frame to apply greater importance to, and a multi-task mechanism shares the intermediate representations between pulse and respiration to jointly estimate both simultaneously.

Multi-task learning is effective for two reasons. First, heart rhythms are correlated with breathing patterns, meaning the two signals share some common properties—a principle known as respiratory sinus arrhythmia (RSA). Second, by sharing many of the preliminary processing steps, we can dramatically reduce the computation required.
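The temporal shift operation mentioned above can be illustrated with a short sketch: a fraction of feature channels is shifted one frame forward or backward in time, so ordinary per-frame 2D convolutions can mix information across adjacent frames at essentially no extra cost. This is a simplified illustration of the general technique rather than the exact MTTS-CAN implementation; tensor shapes and the shift fraction are assumptions.

```python
# Simplified temporal shift: mixes adjacent frames without 3D convolutions.
import torch

def temporal_shift(x: torch.Tensor, fold_div: int = 8) -> torch.Tensor:
    """x: features of shape (batch, time, channels, height, width)."""
    b, t, c, h, w = x.shape
    fold = c // fold_div
    out = torch.zeros_like(x)
    out[:, 1:, :fold] = x[:, :-1, :fold]                  # shift first fold of channels forward in time
    out[:, :-1, fold:2 * fold] = x[:, 1:, fold:2 * fold]  # shift next fold backward in time
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]             # leave the remaining channels untouched
    return out

# After shifting, per-frame 2D convolutions see a mixture of adjacent frames, so the
# subtle temporal dynamics of pulse and respiration can be modeled cheaply.
features = torch.randn(2, 10, 32, 36, 36)                 # (batch, frames, channels, H, W)
shifted = temporal_shift(features)
```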

By combining these three techniques, our proposed network can run on a mobile CPU and achieve state-of-the-art accuracy and inference speed. Ultimately, these features result in significant improvements for gathering real physiological signals, like heart rate and pulse (see Figure 3).

Figure 3: MTTS-CAN reduces the error in heart rate measurement and considerably improves the pulse signal-to-noise ratio compared to previous methods such as ICA, CHROM, POS, and 2D-CAN on a large benchmark dataset.

One concern with optical measurement of vital signs is whether it will perform equally well across people, including all skin types and appearances (for example, those with facial hair, wearing cosmetics, head coverings, or glasses). We have worked on characterizing these differences and helping to reduce them using personalization and data augmentation. Improving sensing technology to deliver equitable performance is a central focus of this research.

We hope that this work advances the speed at which scalable non-contact sensing can be adopted. Atrial fibrillation (AFib) is just one of the most common cardiovascular conditions that impact millions of people and could be better detected with more accurate, easily deployed non-contact health sensing systems. Our work is a step in this direction. Through our research, we are continuing to develop methods for sensing other physiological parameters, such as blood oxygen saturation and pulse transit time.

If you’re interested in learning more about our research in physiological sensing, there are a number of resources available. Our project page is a hub for publications and related content, including links to open-source code. We also recently gave a webinar on contactless camera-based health sensing that further elaborates on this work and dives deeper into how the technology works. Register now to watch the on-demand webinar/Q&A.


The future of work unbound: 2020 and the strange new mobility of space and time

Those of us who have transitioned to working from home over the course of the last year must navigate a strange new manifestation of mobility.

Far-flung colleagues appear almost magically in grid format on a screen right in front of our faces, despite their remote locations. Yet at the same time, a document, presentation, piece of content, or part of a running application already at our fingertips is awkward to share with others on the same video call.

It’s a paradoxical science fiction world where far is near, and the close-at-hand dilates impossibly beyond our reach. Perhaps these surreal distortions of time and space explain in part why “video-conferencing syndrome” feels so draining.

And even as we’re stranded within the confines of our improvised home offices, we’re somehow supposed to navigate this otherworldly place—a jumbled chaos terrain of home and work, personal and professional, private and semi-public.

Moving between these realities, sometimes moment by moment, makes us nimble in a way we’ve never experienced before: our activity is mobile even as we stay put in the same location. We work in the same physical spaces, but as we navigate these transitions, we’re not in the same human places.

While 2020 has accelerated this trend, perhaps it’s inevitable—and indeed, as we point out below, in many ways this strange new mobility has been a long time coming.

Microsoft researchers are creating technologies to help people succeed in this new way of life. We are working to develop systems that help us navigate these changes toward a new world where these transitions feel less strange—and more empowering: a place appropriate to our current task, locality, and context, where “mobility” means technology that rises to the universal human need to connect and work with others seamlessly.

To that end, Microsoft researchers have published three papers—two of which appear at this year’s ACM Symposium on User Interface Software and Technology (UIST 2020)—on new technologies that redefine how we interpret this concept of place.

The first explores SurfaceFleet, a system that decouples computing from individual devices and places. The second presents Ambrosia, a system that uses resilient distributed programming techniques to unbind running programs and their state from any particular device (CPU). The third circles back to this notion of place, showing how nuanced social cues, such as the tilt and orientation of a display (demonstrated on the adjustable Microsoft Surface Studio), can support elegant and natural transitions between different tasks and ways of using such a display.

SurfaceFleet: Mobility as transitions of user activity from one ‘place’ to another

What the Fleet system is in brief:

  • A distributed system leveraging a robust, performant declarative database foundation and building on the Ambrosia runtime
  • An exploration of novel implications for migration of user experiences across devices
  • A platform for Applets: lightweight, distributed user interface elements that unbind interactions from devices, applications, users, and time
  • A collaboration tool enabling people to work across devices, whether acting synchronously or asynchronously

The COVID-19 pandemic has brought the collision of home and work—of activity in limited physical spaces that must transition between different human places—to a critical juncture.

But two trends, both already manifest over the past decade, are influencing the future of human experiences with computing technology.

The first trend concerns hardware and systems architecture. With Moore’s Law at an end, yet networking and storage exhibiting exponential gains, the future appears to favor systems that emphasize seamless mobility of data—techniques that consume network and storage bandwidth rather than rely on a particular CPU. Accelerated by pervasive cloud services and 5G, these computational shifts show no sign of slowing.

The second trend is one of human behavior. People now interact with more devices than ever before. Modern information work increasingly relies on multi-device workflows and distributed workspaces, with connected and interdependent devices—smartphones, desktops, tablets, and perhaps even emerging new form factors. The problem is that transitioning from device to device, and more broadly from place to place, can cost us precious resources such as time or attention, leaving our activities marooned on islands of glass instead of creating a new interconnected world.

What’s needed is an ecosystem of technologies that seamlessly transitions from place to place, whether that “place” takes the form of a literal location, a different device form factor, the presence of a collaborator, or the availability of the pieces of information needed to complete a particular task at a given time. Such a “Society of Technologies” favors techniques that establish meaningful relationships between the members of this society, rather than with any particular device, to afford mobility of user activity from one place to another, in a very general sense of the word.

Through this lens, we can view the essence of mobility as the transition of user activity from one place to another. SurfaceFleet is a working system, development toolkit, and user experience that explores some implications of these challenges by decoupling computation—including its representation in the graphical user interface—from the current device.  

Yet once user interface mechanisms are decoupled from a single device, we discovered, this also has interesting knock-on implications for unbinding interaction from the current application, the current user, and the current time. The Fleet system handles transitions in place—bridging the resulting gaps—across all four of these dimensions. See the embedded video above for demonstrations of how user interfaces can “float” above the screen and transcend program state that is confined to the current device.

The Fleet system unbinds UI elements from not only the device but also the current application, user, and time. In the visible UI, Applets unbind controls from applications. Portfolios unbind tools, inputs, behaviors, and content from the current device and user. Promises unbind actions from time. Read more in the SurfaceFleet paper.

But authoring distributed programs is difficult and requires considerable expertise. How do we reimagine this notion of “device” that is so deeply baked into current development practices? This is where our journey crosses paths with a new distributed-systems technology known as Ambrosia.

Ambrosia: Programming as if failure doesn’t matter

What the Ambrosia runtime does in brief:

  • Introduces the notion of “virtual resiliency,” which allows programmers of distributed applications to program as if failure doesn’t matter
  • Facilitates recovery and replay of logged messages that include mechanisms to correctly handle non-determinism
  • Achieves highly performant remote procedure calls through database techniques such as batching, high-performance log writing, high-performance serialization concepts, and group commit strategies
  • Provides the technical foundation of the Fleet system

Programmers face complex decisions and coding tasks when coping with failure in distributed systems—especially when applications modify state that is shared across devices. Unfortunately, a lot can go wrong even in simple scenarios of passing messages between distributed services. Connections can drop. Distributed clients can crash at any moment. A remote procedure call (RPC) might even fail just as it sends a remote message, creating uncertainty about what has been sent or received that must be reconciled. All of these cases and error conditions must be anticipated, handled correctly, and implemented efficiently. This is why distributed services are so hard to program and deploy correctly.

But using Ambrosia, a developer can write the code for their client as if failure doesn’t matter. We call this virtual resiliency, similar to virtual memory, where one can author programs as if limits on physical memory don’t exist.

Virtual resilience for “Alice’s” running client, where the Ambrosia runtime intercepts all outgoing and incoming remote procedure calls and logs them to resilient storage, such as in the cloud, before Alice acts on them. Ambrosia automatically replays this log to recover from failures in a way that ensures deterministically ordered, one-time delivery of requests.

The developer simply wraps their service (“Alice” in the figure above) in the Ambrosia runtime. Ambrosia intercepts each message and logs it to resilient storage before sending it over the network via RPC. Whenever a remote service (“Bob”) responds, Ambrosia likewise logs these return messages before Alice’s code acts on their contents.

Ambrosia encapsulates the many possible failure conditions, factoring the distributed-systems complexity out of the resulting client code. If Alice goes down, Ambrosia automatically recovers by replaying the log, allowing Alice’s code to pick up where it left off. Likewise, if Bob crashes, the system can automatically recover from that, too, so long as Bob is wrapped in the Ambrosia runtime as well. And since network connection state is also logged to resilient storage in the cloud via Azure, we can also automatically self-heal disruptions such as intermittent connections or changing network addresses via a subsystem known as the Common Runtime for Applications (CRA).

Programming distributed applications in this way, as if failure doesn’t matter, is a nifty trick. But the true secret sauce of Ambrosia is that it provides this virtual resiliency with high performance. It does so by applying decades-old wisdom that has been used to build performant, reliable, and available database systems. For instance, Ambrosia makes extensive use of batching, high-performance log writing, high-performance serialization concepts, and group commit strategies. It also includes mechanisms to properly handle non-determinism by logging any such events, as well. These carefully implemented techniques allow Ambrosia to deterministically provide virtual resiliency with little or no reduction in throughput, depending on message size, as compared to popular RPC frameworks that lack resilience mechanisms.
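The log-and-replay idea at the heart of virtual resiliency can be sketched conceptually in a few lines of Python. This is an illustration only; Ambrosia itself is a .NET runtime with a far more sophisticated and performant implementation, and every name below is invented for the example.

```python
# Conceptual sketch of virtual resiliency: log every message durably, replay on recovery.
import json

class ResilientService:
    def __init__(self, log_path: str, handler):
        self.log_path = log_path     # stands in for resilient cloud storage
        self.handler = handler       # deterministic application logic
        self.state = {}

    def _append(self, record: dict) -> None:
        with open(self.log_path, "a") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()                # persist the message before acting on it

    def send(self, dest: str, method: str, args: dict) -> None:
        self._append({"dir": "out", "dest": dest, "method": method, "args": args})
        # ... hand the message to the network layer here ...

    def receive(self, method: str, args: dict) -> None:
        self._append({"dir": "in", "method": method, "args": args})
        self.handler(self.state, method, args)

    def recover(self) -> None:
        """Replay logged incoming messages in order to rebuild the in-memory state."""
        try:
            with open(self.log_path) as f:
                for line in f:
                    record = json.loads(line)
                    if record["dir"] == "in":
                        self.handler(self.state, record["method"], record["args"])
        except FileNotFoundError:
            pass                     # nothing logged yet; start fresh
```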

With a distributed system built on these abstractions, we end up with two coordinated instances of the Ambrosia runtime surrounding the running services Alice and Bob:

The running services “Alice” and “Bob” are encapsulated in coordinated instances of the Ambrosia runtime to ensure mutual resiliency of the distributed system.

This basic architecture not only encapsulates many types of distributed-system failures, but it also allows for interesting variations, such as standing up multiple active instances of a service (so-called active/active configurations) so that we can quickly failover to “Bob 2” or “Alice 2” if one of the services dies or is slow to recover.

But such a failover might not reflect a networking or system failure at all.

Perhaps it is a matter of choice—the end user’s preference.

Maybe a user of the Alice service shuts off their desktop at the end of a hectic day and picks up their tablet instead. “Desktop Alice” halts, and “Tablet Alice” resumes where they left off. Instead of a network or hardware crash, it’s simply a failover to their preferred device.

This leads us to a key insight. Circling back to the Fleet system, where we started, we can now cast transitions of user activity from one device to another as a special case of failover, one triggered by the user’s choice rather than by a crash.

But migration of program state to a new device is just one special case of mobility. If we have the right feedback and user interface mechanisms in place, we can generalize this as transitions of user activity from one place to another, in many senses of the word place. How best to do this is still an open problem. Our work explores some possibilities and hints at solutions. This suggests that cross-device and distributed systems will have a major impact on user interfaces going forward, even if the full vista of interactive systems and human experiences this makes possible has only just begun to dawn.

Next, we take this up a level by looking at a simple example of how sensing shifts in context—such as responding appropriately when a user tilts a display—can drive lightweight and natural transitions from one human activity to another.

Changes in display orientation as a nuanced transition in ‘place’

What tilt-responsive techniques for digital drawing boards do in brief:

  • Run on a Microsoft Surface Studio 2 using a C# module for sampling the sensors and implementing signal conditioning, together with a JavaScript-based client
  • Demonstrate how a variety of everyday applications can use sensed display adjustments to drive context-appropriate transitions, such as shifts between reading versus writing, displays of public versus personal information, face-to-face video versus screen sharing of documents in remote work, and other nuances of input and feedback contingent on display angle—with continuous interactive responses tailored to each use case.

During the long incubation and technical development of the distributed-systems advances discussed above, we kept circling back at odd intervals to another endeavor: we had outfitted the Microsoft Surface Studio with an extra sensor to detect its angle. The Microsoft Surface Studio is a 27” screen that supports multi-touch and pen input and can be adjusted smoothly from a vertical display to a drafting table–like 20 degrees. In this device, we saw a parallel between its use and people’s behaviors and expectations outside the digital world.

In everyday life, people naturally reposition objects, such as paper documents, to allow shared visibility, partial viewing, and even concealment. Such motions are completely natural and perhaps even subconscious. How we position an object depends on what we intend to do. For example, a doctor might hold a medical chart “close to the vest” at first, but then turn it toward their patient when ready to share particular results. Similarly, the appropriate display orientation depends on the task and situation at hand. A vertical monitor makes for easier reading but not necessarily easier writing with a stylus. Angled drafting tables in a design studio encourage sketching and freeform brainstorming, but the preference when presenting refined versions of those same ideas may be a vertical screen. Display angle is not one size fits all. We wondered, could we tap into our most natural ways of mediating information exchange by sensing the tilt of a display?

We explored system responses to the sensed tilt of an adjustable Microsoft Surface Studio display, such as during transitions from vertical to low-angled, drafting table–like postures. This transforms the current application’s user experience via continuous, interactive, sensor-driven transitions.

By adding an off-the-shelf tilt sensor to the Microsoft Surface Studio, we discovered a series of designs, techniques, and interactions that can respond appropriately to the user’s context of use, as sensed by the current display angle. In doing so, we begin to shift the burden of adapting the inputs, tools, modes, and graphical layout of applications from the user to the system. For example, one demonstration we built explores a teleconferencing scenario in which the typical talking-head video feed of “person-space” appears when the screen is vertical but transitions to a shared document that users can mark up with a digital pen when the screen is tilted down like a drafting board. As the display tilts, we fade out the camera feed to let the user avoid unbecoming video angles. This also selectively focuses the remote audience’s attention on the shared document rather than the video feed—in effect, a way of steering a remote participant’s attention analogous to angling a paper document toward a nearby collaborator.
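A small sketch suggests how an application might turn a sensed display angle into transitions like these: a mode switch with hysteresis so the interface doesn’t flicker near the threshold, plus a continuous fade for the camera feed. The threshold values and mode names are illustrative assumptions, not the behavior of the system described above.

```python
# Sketch: mapping a sensed display tilt to UI modes with hysteresis and a continuous fade.
VERTICAL_THRESHOLD = 60.0   # degrees from horizontal: above this, treat the display as vertical
DRAFTING_THRESHOLD = 40.0   # below this, treat it as a drafting-table posture

def next_mode(current_mode: str, tilt_degrees: float) -> str:
    """Return 'presenting' (vertical, video-first) or 'annotating' (tilted, document-first)."""
    if current_mode == "presenting" and tilt_degrees < DRAFTING_THRESHOLD:
        return "annotating"
    if current_mode == "annotating" and tilt_degrees > VERTICAL_THRESHOLD:
        return "presenting"
    return current_mode             # inside the hysteresis band: keep the current mode

def video_opacity(tilt_degrees: float) -> float:
    """Continuously fade the camera feed out as the display tilts toward the desk."""
    span = VERTICAL_THRESHOLD - DRAFTING_THRESHOLD
    return max(0.0, min(1.0, (tilt_degrees - DRAFTING_THRESHOLD) / span))
```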

As the demo reel above illustrates, displays can respond to tilt by transitioning between reading and writing, public versus personal, authoring versus presenting, and other nuances of input and feedback—and in ways that often can delight and entertain, as well.

Amid our above distributed-systems research, this curiosity-driven work was a completely unrelated side project. Or so we thought. After we finished writing up tilt-responsive techniques for publication, we had an epiphany. As you adjust the angle of a display, you’re:

In the same physical location.

On the same device.

Using the same screen.

Running the same application.

But the new screen orientation doesn’t afford the same tasks and activities—you’ve transitioned to a different place.

This subtly shifts your expectations of what is appropriate. And with just a tiny bit of awareness, well-designed software could provide a sort of intelligence by responding appropriately in kind. That is, the angle of a digital display is just another form of mobility. Here, the mobility is on the micro-level, moving from one screen orientation to another, as opposed to the more macro-level transitions that are the current focus of the Fleet system, such as moving from one device to another or across local and remote locations.

Closing thoughts

We have discussed how the Fleet system explores new ways to think about mobility, and we’ve shown how it builds on an exciting new distributed-systems technology known as Ambrosia. These technologies work to build and implement applications that not only go beyond the current device but also unbind other dimensions of mobility—the current user, the current application, the current time. Beyond that, as hinted at by our final example above, our research shows how sensors can bridge transitions from the current (sensed) context to another—by responding appropriately to natural human activity.

At the highest level, these advances hint at how devices can be better together—complementing one another across an ecosystem of technologies—instead of competing to add ever more complexity with each new device or service.


How a cloud-based solution is transforming care for people with cystic fibrosis

Sitting beside their son during one of his weeks-long hospital stays over the Christmas holidays a few years ago, David and Kirsty Hill had plenty of time to worry and think.

As 12-year-old George lay in an isolation room, receiving antibiotics to treat a bacterial infection related to his cystic fibrosis, a progressive genetic disease that damages the lungs and digestive system, the couple thought about what managing their younger son’s disease involved — the daily regimen of medications and nebulizers, the yearly stints in the hospital, the frequent interruptions to school and work, the dread and worry each time George developed a cough.

David and Kirsty were actively involved in cystic fibrosis charities, running half-marathons and doing 100-mile bike rides to raise funds and awareness. But could they do more? As a domain solution architect for Microsoft UK, David was using his technical skills daily to benefit customers. How, he wondered, could he channel those abilities and tap the expertise of his colleagues to use technology to improve the quality of life for George and other people with the disease?

Those musings in a lonely hospital room led to what could be a groundbreaking approach to managing cystic fibrosis — a solution called Project Breathe that seeks to give patients greater control over their health, might reduce the need for time-consuming and risky hospital visits, and could even prolong life.

The smartphone-based solution allows people with cystic fibrosis to monitor their health at home with devices that measure key indicators such as lung function, blood oxygen levels and activity. That data is then stored in the cloud and can be accessed by clinicians on a dashboard using Power BI, Microsoft’s data visualization platform, to look for trends and determine when patients are becoming unwell. By tracking their own data, patients can intervene earlier and potentially head off serious, lung-damaging infections.

The solution was developed through a consortium involving Microsoft, the U.K.-based Cystic Fibrosis Trust, the University of Cambridge, Royal Papworth Hospital in Cambridge, Microsoft Research and Magic Bullet, a social enterprise company run by Kirsty Hill whose purpose is to improve quality of life and outcomes for people with CF.

The consortium launched a research project on Project Breathe in 2019 to investigate the viability of home monitoring for cystic fibrosis patients. The project was humming along and showing promising results when the coronavirus pandemic hit, bringing the need for remote health monitoring acutely into focus.

Health authorities advised patients with cystic fibrosis, who are particularly vulnerable to respiratory infections, to isolate at home. In-person clinics were canceled across the U.K. and the Project Breathe team shifted into high gear to make its app more broadly available to people who suddenly found themselves trying to manage their cystic fibrosis at home.

“We realized we were sitting on this solution that was restricted to a 100-person research project and thousands of people could benefit from it,” Kirsty Hill says. “Suddenly there was an opportunity to have a much bigger impact.”

Cystic fibrosis, or CF, causes the body to develop thick mucus that can clog lungs and lead to infections and respiratory failure. Better screening and treatments have greatly improved life expectancy, but the disease requires time-intensive daily regimens and is often unpredictable, causing frequent disruptions in patients’ lives — including routine clinics every four to six weeks that involve a multidisciplinary team of specialists and take the better part of a day.

John Winn has cystic fibrosis and says Project Breathe “is incredibly close to my heart.” Photo by Jonathan Banks.

John Winn, a principal researcher at Microsoft Research in Cambridge and part of the Project Breathe team, understands the burden of CF as well as anyone. Winn has cystic fibrosis, and when the pandemic struck, he moved out of the house he shares with his wife and two young children near Cambridge and into a rental home a few minutes away.

He isolated alone there for four months with a supply of food so he didn’t have to come into contact with other people, eating meals with his family twice a day over video chat. Winn moved back with his family for the summer but is prepared to isolate alone again during the school year if need be.

Since Winn’s lung function is diminished by about 30% because of the disease, contracting COVID-19 would pose a serious risk for him, he says. Being able to manage his health at home and stay out of the hospital is critical.

“In the last few years we’ve seen a huge step forward in the drugs available to treat CF, but the processes around managing the disease and the practice of managing it in clinics has not really changed much in 20 years,” Winn says. “Project Breathe is about revolutionizing that.

“I’m very, very excited about it. This project is incredibly close to my heart.”

Dealing with CF was already challenging for Caroline Powell, a busy teacher who lives near Cambridge. She has frequent lung and chest problems, takes about 80 pills a day and has “always had to work hard” at her health. Each time she has a medical appointment or requires hospitalization, Powell worries about who will cover for her and about arranging lessons for her students. After her son was born almost a year and a half ago, those concerns intensified.

“I don’t want him coming to hospital with me all the time or being away from me when I’m hospitalized,” Powell says. “That’s now my biggest incentive to make everything more manageable.”

When Powell heard about Project Breathe during a routine clinic visit to Royal Papworth Hospital in late February, she was eager to try it. She hoped the approach might allow her to head off hospitalizations and avoid some of the clinics she was attending every four weeks. Having more insight into her health also appealed to her.

A woman plays with a toddler in a playground
Project Breathe is helping Caroline Powell gain better insights into her health. Photo by Jonathan Banks.

The Project Breathe kit, which is provided to study participants, includes a free smartphone app, a Fitbit to track activity and sleep, an oximeter that measures oxygen levels in blood and a spirometer that gauges lung function. That data is automatically uploaded to the app, and patients also enter self-reported data on how much they are coughing and how they’re feeling overall.

By monitoring her data collected through the app over a period of weeks, Powell realized she needed to start on a course of antibiotics to treat a lung infection. After the pandemic lockdown started in the U.K., she had her first virtual clinic with a CF specialist nurse at Royal Papworth who was able to access her data through the Project Breathe dashboard, which provides graphs and other visual information, and get a clearer picture of her condition.

“We were able to go into a lot of detail because she had all my information there and she’d read over my data,” Powell says. “Unlike a physical clinic where they just use the data from that one appointment, she was able to spot the pattern of my symptoms increasing.”

Powell hopes the Project Breathe approach can enable earlier interventions that will help keep her out of the hospital and minimize disruptions to her life.

“It’s really helpful to give me insights into my own health and spot these patterns of deteriorations before it’s too late,” she says. “So far, it’s really proving to be useful in that way.”

Janet Allen is the director of strategic innovation for the U.K.-based Cystic Fibrosis Trust, which ran an earlier study on the feasibility of home monitoring for CF patients. Led by Andres Floto, a University of Cambridge professor of respiratory biology, in collaboration with Winn, the SmartCareCF study enrolled 148 patients across seven sites, who monitored their health daily for six months.

Allen sees Project Breathe as the way of the future, an approach that empowers people with CF to manage their health care and challenges dated standards of care.

“SmartCareCF has shown the power of providing health care data to individuals who understand and know their own condition, and initial data from the Project Breathe pilot has shown that technology can be safely harnessed to disrupt health care models,” she says.

“The idea that you have to go to hospital even when stable to have your chronic condition managed, whatever that condition is, in this day and age shouldn’t be required. There is a definite need for (Project Breathe).”

After that hospital stay with his son a few years ago, David Hill returned to work in early 2017 and met for coffee with a couple of Microsoft colleagues, Giri Tharmananthar and Tom Chapman, and relayed his idea of using technology to create a remote monitoring system for people with CF.

A woman and man sitting on a backyard swing
Kirsty Hill, left, and David Hill are part of the team that created Project Breathe. Photo by Jonathan Banks.

Hill had a chance meeting with Allen at Microsoft and learned that for every 10 CF patients who attend clinics, eight typically do not need to be there and the other two need medical attention weeks earlier. His goal for creating a self-monitoring system was twofold — to help patients avoid time-consuming clinic visits if they were well and to identify declines in their health so they could be treated earlier.

“It was kind of a light-bulb moment, that if we could do something to solve both of those problems, it would improve quality of life,” says Hill, who lives in Reading, west of London. “We built the solution around solving those two problems.”

Tharmananthar was part of a small innovation team incubated at that time in Microsoft Digital that had been looking into solutions for digital health care. The vision of using technology to enable patient-driven health care beyond traditional medical settings had been around for 15 years or more, Tharmananthar says, but hadn’t made much concrete, sustainable progress. Hill’s idea seemed like a promising opportunity.

“Everything Dave wanted to do for cystic fibrosis was a tangible example of this thing we’d been talking about, which is a patient-centric platform that allows clinicians to access patient data,” he says. “There’s a concept of treatment pathways in health care, but it’s usually about the condition, and we wanted to put the patient at the center of it.”

As the project moved forward, Microsoft employees from across the company volunteered their time to help, Tharmananthar says, inspired by the personal story behind Project Breathe and the potential to make a difference.

“It really embodies that thing that Satya (Nadella, Microsoft’s CEO) talks about,” he says. “It’s not about what you do for Microsoft. It’s about the impact you can have in the world with what Microsoft can bring. It really does speak to that.”

With initial funding from Microsoft Digital, Innovate UK and the Cystic Fibrosis Trust, a small team led by Kirsty Hill, with support from Microsoft employees and input from health care professionals and CF patients at Royal Papworth Hospital, developed the Project Breathe app and a back-end solution that securely stores patient data in Azure. The app and solution, built entirely with Microsoft technology, have since been extensively developed and are operated by Magic Bullet for several health organizations in the U.K.

During the SmartCareCF study, Floto and Winn, with help from a Ph.D. student, used patient data to develop a predictive model that uses machine learning to detect signals which might be hidden in the data and can indicate when a patient is becoming unwell. The model is now being tested as part of the current Project Breathe study at Royal Papworth.
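
The article does not describe the predictive model in detail, but the general idea of flagging a decline relative to a patient’s own baseline can be sketched in a few lines of Python. This is a minimal illustration only; the window lengths, threshold, and FEV1 values below are hypothetical and are not taken from the Project Breathe model.

```python
import numpy as np

def flag_decline(fev1_percent, baseline_days=28, recent_days=7, drop_threshold=0.10):
    """Flag a possible exacerbation when recent lung function falls well below
    a patient's own rolling baseline.

    fev1_percent: daily FEV1 readings as a fraction of the predicted value.
    Returns True if the mean of the last `recent_days` readings is more than
    `drop_threshold` (relative) below the mean of the preceding baseline window.
    """
    readings = np.asarray(fev1_percent, dtype=float)
    if len(readings) < baseline_days + recent_days:
        return False  # not enough history to establish a personal baseline

    baseline = readings[-(baseline_days + recent_days):-recent_days].mean()
    recent = readings[-recent_days:].mean()
    return recent < baseline * (1.0 - drop_threshold)

# Hypothetical example: four stable weeks followed by a week-long dip.
history = [0.78] * 28 + [0.70, 0.69, 0.68, 0.67, 0.66, 0.65, 0.64]
print(flag_decline(history))  # True
```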

The study, which Floto oversees, initially enrolled 95 patients at Royal Papworth and was quickly scaled up after the pandemic hit to include two additional sites in Wales and Scotland, with around 500 patients expected to be enrolled by the end of the year. Plans for the coming year include adding a fourth site in the U.K. and working with Cystic Fibrosis Canada to implement a research study in Toronto.

“Project Breathe is about turning the previous study into a reality, in terms of actually changing clinical practice,” Winn says.

A man leans on a railing in front of a building entrance
Andres Floto is leading a study on home monitoring for cystic fibrosis patients. Photo by Jonathan Banks.

The first phase of the study aims to prove that home monitoring is safe and effective; later phases will involve testing new devices and capabilities with the solution and applying the predictive model to determine when patients are becoming sick. Early results show that the model can identify a decline in a patient’s condition an average of 11 days before antibiotics would typically be started, Floto says. And almost all patients in the study have been able to skip clinics by using the app and reviewing their data with a clinician.

“We think Project Breathe may be a great solution to realize the widespread rolling out of virtual clinics,” Floto says. “If we can intervene earlier, we should be able to protect the lungs from long-term, ongoing damage.”

For Kate Eveling, who enrolled in the study in July 2019, being able to skip clinics has not only reduced the three-hour round trips required to attend them but alleviated her worries about going into hospitals.

“It’s just a scary thing. For me, it gives me a lot of anxiety,” she says. “I definitely think (the Project Breathe approach) is the future of CF clinics. It’s made things a lot easier.”

The novel coronavirus has raised new questions about what the future standard of care for CF patients might look like — whether there will be a return to in-person clinics at some point, more of a reliance on remote clinics, or a mix of both.

“The impact of COVID-19 is that everybody’s been forced to use a completely remote model for an unknown length of time,” Kirsty Hill says. “And what became apparent immediately is that patients already enrolled in Project Breathe have a huge advantage in that doctors can have a data-informed discussion with them, whereas for everybody else, there was no data reference to discuss.”

Britain’s National Health Service (NHS) is providing funding to supply spirometers to thousands of cystic fibrosis patients throughout the U.K., giving CF patients at least one of the pieces of equipment needed for the Project Breathe solution. The team hopes to find funding to cover the costs of making the solution’s back-end available to clinics beyond the study, which the NHS currently doesn’t cover. Ultimately, the goal is to enable Project Breathe to collect patient data passively and eliminate the need for self-monitoring, but reaching that point will require additional funding.

In the meantime, as coronavirus cases are again ticking upward in England and other countries, Project Breathe participants like Sammie Read are getting insights into their health from the safety of home. For years, Read was spending two weeks in the hospital about every three months being treated with antibiotics for infections caused by CF. She takes more than 40 pills a day and follows a daily routine of nebulizers, exercise and physiotherapy.

A woman sits at a table with medical devices
By monitoring her health at home, Sammie Read has been able to avoid hospitalizations and skip clinic visits. Photo by Jonathan Banks.

About five years ago, Read became so stressed between juggling work and caring for her school-aged son that her health spiraled dangerously downward. On her husband’s urging, she quit her job.

“With CF, it’s quite unpredictable. You can have a perfectly good day and be fine and the next day it’s like bang, you can’t breathe,” says Read, who lives in a rural area near Stowmarket, England. “It’s sort of like you’re just walking on eggshells.”

A longtime patient at Royal Papworth and a participant in the SmartCareCF study, Read heard about the Project Breathe study, enrolled and began monitoring her health at home.

By tracking her data and making adjustments as needed — exercising a little more if her lung function drops, starting antibiotics at home when an infection is coming on — Read went 18 months without being hospitalized. Even before the coronavirus halted in-person clinics, she was able to skip some of her scheduled visits after remotely reviewing her data with a nurse.

These days, with her son moved out of the house and her health more stable, Read is thinking about going back to work.

“Project Breathe has made a massive impact on my life,” says Read. “It’s definitely made my life easier. You’re in control, rather than CF being in control of you.”

Top image: David Hill, left, looks on while his son George uses a spirometer to gauge his lung function. Photo by Jonathan Banks.


Wrist-worn VR controller from Microsoft Research simulates forces such as momentum and gravity

When you reach out an empty hand to pick an apple from a tree, you’re met with a variety of sensations—the firmness of the apple as you grip it, the resistance from the branch as you tug the apple free, the weight of the apple in your palm once you’ve plucked it, and the smooth, round surface under your fingertips.

In recent years, steady progress in haptic controllers from Microsoft Research has moved us toward a virtual reality (VR) experience in which those feelings will be on par with the awe-inspiring and realistic visual renderings being produced today by head-mounted displays. With previous devices such as NormalTouch, we can simulate a virtual object’s surface inclination and texture on the tip of an individual’s index finger. CLAW enables a person to feel she’s grabbed an object between her fingers to explore its compliance and elasticity, and TORC allows a new level of dexterity, parallel to real life. Using these prototypes, an individual can feel the skin of a virtual apple, squeeze the virtual fruit, and move it around in her hand. However, to facilitate a complete interaction with that apple in its virtual surroundings, we also have to take into account the dynamics of the objects in the space. Now, with Haptic PIVOT, we bring the physics of forces to VR controllers. Worn on the wrist, PIVOT is a portable device with a haptic handle that moves in and out of the hand on demand.

If Sir Isaac Newton had found the inspiration for his laws of motion and gravity in a virtual apple falling from a virtual tree, he would have needed a controller like PIVOT. By grounding PIVOT to the wrist, we’re able to render the momentum and drag of thrown and caught objects, which are governed by Newton’s laws, including the speed of an object as it reaches the hand. The robotized haptic handle deploys when needed, approaching and finally reaching the hand to create the feeling of first contact—going from a bare hand to one holding an object—mimicking our natural interaction with physical objects in a way that traditional handheld controllers can’t. We studied the performance and limits of PIVOT and co-authored “Haptic PIVOT: On-Demand Handhelds in VR” with fellow Microsoft researchers Mike Sinclair and Christian Holz, who is now with ETH Zurich, and Róbert Kovács, Alexa Fay Siu, and Sebastian Marwecki, who were interns at the time of the work. This week, we’re presenting Haptic PIVOT at the 2020 ACM Symposium on User Interface Software and Technology (UIST).

Haptic PIVOT provides on-demand control and haptic rendering of virtual objects as the hand reaches for them. PIVOT comprises a haptic handle that is deployed (left) and retracted (right) via a motorized hinge. A passive radioulnar hinge allows for natural hand tilting.

From the physical to the virtual—on demand

At the core of PIVOT’s design is its hinge mechanism and haptic handle. The haptic handle is interchangeable and can be swapped out for existing controllers. However, for our work with PIVOT, we outfitted a prototype handle with capacitive touch sensors that detect contact and release of objects; a voice coil actuator for providing vibrotactile feedback; and a trigger switch for control input. The haptic handle operates via a modified servo motor (driving the hinge) and can be summoned into individuals’ hands on demand, keeping their hands free when not in use. This capability makes PIVOT ideal for augmented reality or blended scenarios. An individual can be typing on a keyboard, using a mouse, or working with other physical objects in her environment. Whenever needed, a quick flick of the wrist can initiate PIVOT to rotate the handle into the person’s palm so she can interact with virtual objects. The handle can be retracted with another flick of the wrist. Both summoning actions are detected by an internal accelerometer.

Haptic PIVOT leaves individuals’ hands free until they need the controller, at which point a quick flick of the wrist will pivot the haptic handle into their hand. Such on-demand capability can be helpful in augmented or mixed-reality scenarios.
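
The summon and retract gestures are described above only at a high level; a minimal sketch of how a wrist flick might be detected from accelerometer samples is shown below. The sampling buffer, threshold, and axis conventions are assumptions for illustration, not the device’s actual firmware.

```python
import numpy as np

def detect_flick(accel_samples, gravity=9.81, threshold_g=2.5):
    """Detect a quick wrist flick from a short buffer of accelerometer samples.

    accel_samples: array of shape (N, 3) with acceleration in m/s^2.
    Returns True if the peak acceleration magnitude, after removing gravity,
    exceeds `threshold_g` times standard gravity.
    """
    samples = np.asarray(accel_samples, dtype=float)
    magnitudes = np.linalg.norm(samples, axis=1)
    peak = np.max(np.abs(magnitudes - gravity))
    return peak > threshold_g * gravity

# Hypothetical 50-sample buffer: mostly at rest, with one sharp spike mid-buffer.
buffer = np.tile([0.0, 0.0, 9.81], (50, 1))
buffer[25] = [20.0, 5.0, 30.0]  # brief spike during the flick
print(detect_flick(buffer))  # True
```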

The haptic handle’s motor stops running once the handle has been grabbed, as sensed by the capacitive sensors, and thanks to a passive radioulnar hinge, individuals can move their wrists freely from side to side (up to 60 degrees) and up and down while continuing to hold the handle. To prevent the haptic handle from hitting the thumb as it moves between its resting and activated positions, the motorized hinge is slanted toward the hand, as opposed to perpendicular to it, and a 190-degree range was set to prevent the handle from getting in the way when not in use.

May the forces be with you

The true power of PIVOT shines when interacting with virtual objects. Take picking the apple from the tree as an example. A combination of mechanics, electronics, firmware, and software works together from the moment the apple enters reaching range to the moment it’s resting in the palm of the individual’s hand.

Computer-vision tracking of the hand via a head-mounted display such as Microsoft HoloLens or a consumer VR tracker worn on the back of the hand allows for absolute position tracking so our control system can detect when an individual begins reaching for the target—in this case, the apple. When the apple is within a 30-centimeter radius of collision, PIVOT moves the haptic handle into a preparation position. As the individual’s hand closes in on the apple, within 10 centimeters of it, the handle moves proportionally closer and then finally lands in her palm at the same time she wraps her fingers around the virtual fruit. The handle moves as fast as the individual, providing a very realistic simulation of impact. The four capacitive touch areas along the surface of the handle register that contact with the handle has been made, and a signal is sent via a serial communication interface to the virtual hand, which closes around the apple just as the hand grasps the handle. With polling that takes less than 1 millisecond, the interface offers a latency that supports the immediacy needed to deliver haptic response times that align with user expectations.

When reaching out for a virtual object (left), PIVOT rotates its haptic handle toward the individual’s hand in proportion to the distance to the virtual object (right).
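
The control behavior described above can be sketched as a simple mapping from hand-to-target distance to hinge angle. The 30-centimeter preparation radius, 10-centimeter approach radius, and roughly 190-degree resting angle come from the description; the preparation angle and the linear ramp are assumptions, and the real controller’s firmware is more involved.

```python
def handle_angle(distance_m, resting_deg=190.0, prep_deg=60.0, palm_deg=0.0,
                 prep_radius=0.30, approach_radius=0.10):
    """Map hand-to-target distance (meters) to a target hinge angle (degrees).

    0 degrees places the handle in the palm; `resting_deg` is fully retracted.
    Beyond `prep_radius` the handle stays retracted; inside it the handle moves
    to a preparation angle; within `approach_radius` it closes proportionally
    with distance so it lands in the palm as the fingers reach the object.
    """
    if distance_m >= prep_radius:
        return resting_deg
    if distance_m >= approach_radius:
        return prep_deg
    fraction = distance_m / approach_radius  # 1.0 at the approach radius, 0.0 at contact
    return palm_deg + fraction * (prep_deg - palm_deg)

for d in (0.50, 0.20, 0.05, 0.0):
    print(f"{d:.2f} m -> {handle_angle(d):.1f} degrees")
```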

As the individual pulls the apple from the tree, she encounters the expected resistance from the branch to which it’s attached as PIVOT uses its motor to pull the haptic handle away from the hand. She experiences the resistance until the apple is detached—an action accompanied by a “thud” sensation generated by the voice coil actuator—at which point she feels the impact and weight of the apple in her palm. Instead of pulling the handle away, PIVOT presses it into the palm, creating a sense of momentum and weight. PIVOT can render such forces on the palm and fingers because it’s grounded on the wrist and not in the palm. With a simple rotation of the hand and release of the handle, the individual can drop a bad apple to the ground or a good one into a basket with others. Additionally, when worn on both wrists, PIVOT can facilitate two-handed interactions, such as picking up that basket of apples by the handles. The devices render the feeling of holding by synchronizing the haptic feedback in each device.

Wearing PIVOT on both arms enables haptic feedback for bimanual interactions. Here, the individual is stretching and compressing a basket, which is rendered as synchronized push-pull forces on both hands.

Play ball!

In making the design decision to ground PIVOT to the wrist, one of the first things we considered was baseball. From pitching to a batter to throwing a runner out at second, there’s a lot happening in the arm, and the same could be said of other sports. Introducing the wrist form factor into our offerings presented an opportunity to provide a wider range of actions without interfering with the physical environment around the player.

With PIVOT, individuals can catch and throw virtual objects. The reaction time for catching a flying virtual object is significantly shorter than for grabbing a stationary one (we can simulate the catch of a 55.9-mph throw through visuo-motor illusions!). As with simulating the grasp of a stationary object, simulating a catch requires that PIVOT and the visual input be aligned correctly to accurately render the moment the object meets the hand. For high speeds, a larger collision radius can be implemented to increase the responsiveness of the device.

As is the case with dropping an apple into a basket, throwing relies on PIVOT sensing the motion of the hand and the release of the haptic handle, which coincides with the release of the virtual object. Upon release, the handle is driven out of the palm by the motor at the physically correct angular speed, up to 0.55 milliseconds/degree. In other words, the handle can go from grasp to fully retracted (at approximately 190 degrees) in 340 milliseconds, the time it takes to blink an eye. Throwing, catching, and passing objects among people not only enables simulation for sports games, but can also extend to collaboration in the virtual workplace, where factory workers or industrial designers can feel the forces of virtual designs or products in a completely new way, even before manufacturing them.

PIVOT not only enables grasping virtual objects, but dropping, throwing, and catching them, as well.

The ultimate frontier

Touch is the ultimate frontier in rendering. Once you’ve achieved incredibly realistic visual renderings of objects in virtual and augmented reality, next you want to simulate natural interactions with these virtual objects. That’s when haptics takes center stage.

Today, VR’s visual renderings are immersive, sophisticated, and appealing—so much so that when you put on a VR headset and are transported into a virtual world in which an apple hangs from a tree branch, you can’t help but grab it. But when you reach for that apple and don’t feel its smoothness and firmness, the pullback of the branch when you try to pluck it, or its weight in the palm of your hand, the illusion is shattered. With haptic controllers like PIVOT, Microsoft researchers are working to solve the challenge.


Introducing CodeXGLUE, a benchmark dataset and open challenge for code intelligence

According to Evans Data Corporation, there were 23.9 million professional developers in 2019, and the population is expected to reach 28.7 million in 2024. With the growing population of developers, code intelligence, which aims to leverage AI to help software developers improve the productivity of the development process, is becoming increasingly important in both the software engineering and artificial intelligence communities.

When developers want to find code written by others with the same intent, code search systems can automatically retrieve semantically relevant code given natural language queries. When developers are unsure what to write next, code completion systems can help by automatically completing the following tokens given the context of the edits being made. When developers want to implement Java code with the same functionality as some existing Python code, code-to-code translation systems can help translate from one programming language (Python) to another (Java).

Code intelligence therefore plays a vital role in Microsoft’s mission to empower developers. As highlighted by Microsoft CEO Satya Nadella at Microsoft Build 2020, the role of developers is more important than ever. GitHub is increasingly the default home for source code, and Visual Studio Code is one of the most popular code editors. Microsoft offers a complete toolchain for developers, bringing together the best of GitHub, Visual Studio, and Microsoft Azure to help developers to go from idea to code and code to cloud.

Recent years have seen a surge in the application of statistical models, including neural networks, to code intelligence tasks. More recently, inspired by the great success of large pre-trained models like BERT and GPT in natural language processing (NLP), pre-trained models learned from large volumes of programming language data have emerged. These models, including IntelliCode and CodeBERT, obtain further improvements on code understanding and generation problems. However, the area of code intelligence lacks a benchmark suite that covers a wide range of tasks. We have seen that a diversified benchmark dataset is significant for the growth of an area of applied AI research, as ImageNet has been for computer vision and GLUE for NLP.

To address this, researchers from Microsoft Research Asia (Natural Language Computing Group) working together with Developer Division and Bing introduce CodeXGLUE, a benchmark dataset and open challenge for code intelligence. It includes a collection of code intelligence tasks and a platform for model evaluation and comparison. CodeXGLUE stands for General Language Understanding Evaluation benchmark for code. It includes 14 datasets for 10 diversified code intelligence tasks covering the following scenarios:

  • code-code (clone detection, defect detection, cloze test, code completion, code refinement, and code-to-code translation)
  • text-code (natural language code search, text-to-code generation)
  • code-text (code summarization)
  • text-text (documentation translation)
Figure 1: A brief summary of CodeXGLUE, including tasks, datasets, languages, sizes, baseline systems, and providers. Datasets highlighted in blue are newly introduced.

CodeXGLUE includes six existing code intelligence datasets — BigCloneBench, POJ-104, Defects4J, Bugs2Fix, CONCODE, and CodeSearchNet — but also newly introduced datasets that are highlighted in the table above. Below, we elaborate on the task definition for each task and dataset.

  1. Clone detection (BigCloneBench, POJ-104). A model is tasked with measuring the semantic similarity between codes. Two existing datasets are included. One is for binary classification between code, and the other is for retrieving semantically similar code given code as the query.
  2. Defect detection (Defects4J). A model is tasked with identifying whether a body of source code contains defects that may be used to attack software systems, such as resource leaks, use-after-free vulnerabilities, and DoS attacks. An existing dataset is included.
  3. Cloze test (CT-all, CT-max/min). A model is tasked with predicting a masked token in a piece of code, formulated as a multi-choice classification problem. The two datasets are newly created: one with candidates drawn from the (filtered) vocabulary and the other with candidates limited to “max” and “min.”
  4. Code completion (PY150, GitHub Java Corpus). A model is tasked with predicting the following tokens given a code context. Both token-level and line-level completion are covered. The token-level task is analogous to language modeling, and we include two influential datasets here. Line-level datasets are newly created to test a model’s ability to autocomplete a line.
  5. Code translation (CodeTrans). A model is tasked with translating the code in one programming language to the code in another one. A dataset between Java and C# is newly created.
  6. Code search (CodeSearchNet, AdvTest; StacQC, WebQueryTest). A model is given the task of measuring the semantic similarity between text and code. In the retrieval scenario, a new test set is created in which function names and variables are replaced to test the generalization ability of a model. In the text-code classification scenario, a test set whose natural language queries come from the Bing query log is created to test on real user queries.
  7. Code refinement (Bugs2Fix). A model is tasked with trying to automatically refine the code, which could be buggy or complex. An existing dataset is included.
  8. Text-to-code generation (CONCODE). A model is given the task of generating code from a natural language description. An existing dataset is included.
  9. Code summarization (CodeSearchNet). A model is given the task of generating natural language comments for code. Existing datasets are included.
  10. Documentation translation (Microsoft Docs). A model is given the task to translate code documentation between human languages. A dataset, focusing on low-resource multilingual translation, is newly created.

To make it easy for participants, we provide three baseline models to support these tasks, including a BERT-style pretrained model (in this case, CodeBERT), which is good at understanding problems. We also include a GPT-style pretrained model, which we call CodeGPT, to support completion and generation problems. Finally, we include an Encoder-Decoder framework that supports sequence-to-sequence generation problems.

Figure 2: Three pipelines including CodeBERT, CodeGPT, and Encoder-Decoder are provided to make it easy for participants.
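
To give a concrete, if simplified, sense of how a BERT-style baseline can score a cloze-test example such as CT-max/min, the sketch below ranks the two candidate tokens by the logit a masked-language-model head assigns them at the masked position. The checkpoint name and the assumption that each candidate is a single vocabulary token are illustrative; the released CodeXGLUE baselines are the reference implementations.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Assumed checkpoint; any masked language model trained on code could be substituted.
checkpoint = "microsoft/codebert-base-mlm"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

# A toy cloze example in the spirit of CT-max/min: choose between "max" and "min".
code = f"for x in nums: result = {tokenizer.mask_token}(result, x)"
candidates = ["max", "min"]

inputs = tokenizer(code, return_tensors="pt")
mask_positions = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
mask_index = mask_positions[0].item()

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_index]

# Score each candidate by the logit of its token id at the masked slot
# (this assumes each candidate maps to a single token in the vocabulary).
scores = {c: logits[tokenizer.convert_tokens_to_ids(c)].item() for c in candidates}
print(max(scores, key=scores.get))
```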

Looking Forward: Extending to more programming languages and downstream tasks

With CodeXGLUE, we seek to support the development of models that can be applied to various code intelligence problems, with the goal of increasing the productivity of software developers. We encourage researchers to participate in the open challenges to continue progress in code intelligence. Moving forward, we’ll extend CodeXGLUE to more programming languages and downstream tasks while continuing to push forward pre-trained models by exploring new model structures, introducing new pre-training tasks, using different types of data, and more.

This research was conducted by Alexey Svyatkovskiy, Ambrosio Blanco, Colin Clement, Dawn Drain, Daxin Jiang, Daya Guo, Duyu Tang, Junjie Huang, Lidong Zhou, Linjun Shou, Long Zhou, Michele Tufano, Ming Gong, Ming Zhou, Nan Duan, Neel Sundaresan, Shao Kun Deng, Shengyu Fu, Shuai Lu, Shujie Liu, and Shuo Ren.


Microsoft finds underwater datacenters are reliable, practical and use energy sustainably

Algae, barnacles and sea anemones

The Northern Isles underwater datacenter was manufactured by Naval Group and its subsidiary Naval Energies, experts in naval defense and marine renewable energy. Green Marine, an Orkney Island-based firm, supported Naval Group and Microsoft on the deployment, maintenance, monitoring and retrieval of the datacenter, which Microsoft’s Special Projects team operated for two years.

The Northern Isles was deployed at the European Marine Energy Centre, a test site for tidal turbines and wave energy converters. Tidal currents there travel up to 9 miles per hour at peak intensity and the sea surface roils with waves that reach more than 60 feet in stormy conditions.

The deployment and retrieval of the Northern Isles underwater datacenter required atypically calm seas and a choreographed dance of robots and winches that played out between the pontoons of a gantry barge. The procedure took a full day on each end.

The Northern Isles was gleaming white when deployed. Two years underwater provided time for a thin coat of algae and barnacles to form, and for sea anemones to grow to cantaloupe size in the sheltered nooks of its ballast-filled base.

“We were pretty impressed with how clean it was, actually,” said Spencer Fowers, a principal member of technical staff for Microsoft’s Special Projects research group. “It did not have a lot of hardened marine growth on it; it was mostly sea scum.”

Crew cleans off the Project Natick datacenter
A member of the Project Natick team power washes the Northern Isles underwater datacenter, which was retrieved from the seafloor off the Orkney Islands in Scotland. Two years underwater provided time for a thin coat of algae and barnacles to form on the steel tube, and for sea anemones to grow to cantaloupe size in the sheltered nooks of its ballast-filled triangular base. Photo by Simon Douglas.

Power wash and data collection

Once it was hauled up from the seafloor and prior to transportation off the Orkney Islands, the Green Marine team power washed the water-tight steel tube that encased the Northern Isles’ 864 servers and related cooling system infrastructure.

The researchers then inserted test tubes through a valve at the top of the vessel to collect air samples for analysis at Microsoft headquarters in Redmond, Washington.

“We left it filled with dry nitrogen, so the environment is pretty benign in there,” Fowers said.

The question, he added, is how gases that are normally released from cables and other equipment may have altered the operating environment for the computers.

The cleaned and air-sampled datacenter was loaded onto a truck and driven to Global Energy Group’s Nigg Energy Park facility in the North of Scotland. There, Naval Group unbolted the endcap and slid out the server racks as Fowers and his team performed health checks and collected components to send to Redmond for analysis.

Among the components crated up and sent to Redmond are a handful of failed servers and related cables. The researchers think this hardware will help them understand why the servers in the underwater datacenter are eight times more reliable than those on land.

“We are like, ‘Hey this looks really good,’” Fowers said. “We have to figure out what exactly gives us this benefit.”

The team hypothesizes that the atmosphere of nitrogen, which is less corrosive than oxygen, and the absence of people to bump and jostle components, are the primary reasons for the difference. If the analysis proves this correct, the team may be able to translate the findings to land datacenters.

“Our failure rate in the water is one-eighth of what we see on land,” said Ben Cutler, who leads Project Natick within Microsoft’s Special Projects research group.

“I have an economic model that says if I lose so many servers per unit of time, I’m at least at parity with land,” he added. “We are considerably better than that.”
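
The quoted comparison can be made concrete with a back-of-the-envelope calculation. The failure rates below are invented purely to illustrate what a one-eighth failure rate means over a two-year deployment; the article does not report the underlying numbers.

```python
# Illustrative arithmetic only; the land failure rate here is an assumption.
servers = 864                       # servers sealed in the Northern Isles vessel
years = 2.0                         # length of the deployment

land_annual_failure_rate = 0.04     # hypothetical land rate: 4% of servers per year
underwater_rate = land_annual_failure_rate / 8

expected_land_failures = servers * land_annual_failure_rate * years
expected_underwater_failures = servers * underwater_rate * years

print(f"Expected failures on land over {years:.0f} years: {expected_land_failures:.0f}")
print(f"Expected failures underwater over {years:.0f} years: {expected_underwater_failures:.0f}")
```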


Research Mode for HoloLens 2 to facilitate computer vision research

Lifestyle image of male wearing a Hololens 2 device

Since its launch in November 2019, Microsoft HoloLens 2 has helped enterprises in manufacturing, construction, healthcare, and retail onboard employees more quickly, complete tasks faster, and greatly reduce errors and waste. It sets the high-water mark for intelligent edge devices by leveraging a multitude of sensors and a dedicated ASIC (Application-Specific Integrated Circuit) to allow multiple real-time computer vision workloads to run continuously. In Research Mode, HoloLens 2 is also a potent computer vision research device. (Note: Research Mode is available today to Windows Insiders and soon in an upcoming release of Windows 10 for HoloLens.)

Compared to the previous edition, Research Mode for HoloLens 2 has the following main advantages:

  • In addition to sensors exposed in HoloLens 1 Research Mode, we now also provide IMU sensor access (these include an accelerometer, gyroscope, and magnetometer).
  • HoloLens 2 provides new capabilities that can be used in conjunction with Research Mode. Specifically, articulated hand tracking and eye tracking can be accessed through APIs while using Research Mode, allowing for a richer set of experiments.

With Research Mode, application code can not only access video and audio streams, but can also simultaneously leverage the results of built-in computer vision algorithms such as SLAM (simultaneous localization and mapping) to obtain the motion of the device as well as the spatial-mapping algorithms to obtain 3D meshes of the environment. These capabilities are made possible by several built-in image sensors that complement the color video camera normally accessible to applications.

HoloLens 2 has four grayscale head-tracking cameras and a depth camera to sense its environment and perform articulated hand tracking. It also has two additional infrared cameras and accompanying LEDs that are used for eye tracking and iris recognition. As shown in Figure 1, two of the grayscale cameras are configured as a stereo rig, capturing the area in front of the device so that the absolute depth of tracked visual features can be determined through triangulation. Meanwhile, the two additional grayscale cameras help provide a wider field of view to keep track of features. These synchronized global-shutter cameras are significantly more sensitive to light than the color camera and can be used to capture images at a rate of up to 30 frames per second (FPS).

Figure 1: HoloLens 2 Research Mode enables access to the grayscale cameras, depth camera, and IMU sensors on the device. This complements the color camera normally available to applications.
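
The relationship between disparity and depth for the stereo pair described above can be illustrated with the standard rectified pinhole model. The focal length and baseline below are placeholder values, not the actual HoloLens 2 calibration.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Rectified-stereo relation Z = f * B / d.

    disparity_px: horizontal pixel offset of a feature between the two cameras.
    focal_length_px: focal length expressed in pixels.
    baseline_m: distance between the two camera centers in meters.
    """
    if disparity_px <= 0:
        return float("inf")  # zero disparity corresponds to a point at infinity
    return focal_length_px * baseline_m / disparity_px

# Placeholder calibration values, not the device's real parameters.
print(depth_from_disparity(disparity_px=12.0, focal_length_px=450.0, baseline_m=0.10))  # 3.75 m
```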

The depth camera uses active infrared (IR) illumination to determine depth through phase-based time-of-flight. The camera can operate in two modes. The first mode enables high-framerate (45 FPS) near-depth sensing, commonly used for hand tracking, while the other mode is used for lower-framerate (1-5 FPS) far-depth sensing, currently used by spatial mapping. As hands only need to be supported up to 1 meter from the device, HoloLens 2 saves power by reducing the number of illuminations, which results in the depth wrapping around beyond one meter. For example, something at 1.3 meters will appear at 0.3 meters in this mode. In addition to depth, this camera also delivers actively illuminated IR images (in both modes) that can be valuable in their own right because they are illuminated from the HoloLens and reasonably unaffected by ambient light. Azure Kinect uses the same sensor package, but with slightly different depth modes.
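
The wrap-around behavior can be written out directly: if the high-framerate mode only disambiguates depth within about one meter, the reported value is effectively the true depth modulo that interval. The helper below simply illustrates the relationship described above; the interval and the unwrapping enumeration are not part of the device SDK.

```python
def wrapped_depth(true_depth_m, ambiguity_interval_m=1.0):
    """Depth reported when only the fractional part of the interval is resolved."""
    return true_depth_m % ambiguity_interval_m

def candidate_true_depths(reported_m, max_range_m=4.0, ambiguity_interval_m=1.0):
    """All true depths consistent with a wrapped reading, up to max_range_m."""
    depths = []
    d = reported_m
    while d <= max_range_m:
        depths.append(round(d, 3))
        d += ambiguity_interval_m
    return depths

print(round(wrapped_depth(1.3), 3))   # 0.3, as in the example above
print(candidate_true_depths(0.3))     # [0.3, 1.3, 2.3, 3.3]
```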

With the newest Windows Insider release of Windows 10 for HoloLens, researchers now have the option to enable Research Mode on their HoloLens devices to gain access to all of these external-facing raw image sensor streams. Research Mode for HoloLens 2 also provides researchers with access to the accelerometer, gyroscope, and magnetometer readings. To protect users’ privacy, raw eye-tracking camera images are not available through Research Mode. Researchers can access eye-gaze direction through existing APIs.

For the other sensor streams, researchers can still use the results of the built-in computer vision algorithms and can now also choose to use the raw sensor data for their own algorithms.

The sensor streams can either be processed or stored on the device or transferred wirelessly to another PC or to the cloud for more computationally demanding tasks. This opens a wide range of new computer vision applications for HoloLens 2. HoloLens 2 is particularly well suited as a platform for egocentric vision research, as it can be used to analyze the world from the perspective of a user wearing the device. For these applications, the device’s ability to visualize algorithm results in the 3D world in front of the user can be a key advantage. HoloLens sensing capabilities can also be valuable for robotics, where they can, for example, enable a robot to navigate its environment.

These new HoloLens capabilities will be demonstrated at a tutorial on August 28th, 2020, at the European Conference on Computer Vision (ECCV). An initial set of sample apps showcasing computer vision use cases is being made available on GitHub, and you can check out the Research Mode documentation for further technical details.


State-of-the-art algorithm accelerates path for quantum computers to address climate change

While there has been a focus in the quantum computing industry on growing the number of qubits in a quantum computer, the reality is there are many important factors when building an overall system to bring quantum solutions to fruition. Hardware scaling, temperature control, software optimizations, and many other considerations must be reimagined in ways that allow large-scale quantum computers to do the necessary, meaningful work to solve some of today’s and tomorrow’s biggest problems. A question emerges that is both scientific and philosophical in nature: once a quantum computer scales to handle problems that classical computers cannot, what problems should we solve on it? Quantum researchers at Microsoft are not only thinking about this question—we are producing tangible results that will shape how large-scale quantum computer applications will accomplish these tasks.

We have begun creating quantum computer applications in chemistry, and they could help to address one of the world’s biggest challenges to date: climate change. In January, Microsoft launched a bold new environmental sustainability initiative focusing on carbon, water, waste, and biodiversity, announcing one of the most ambitious carbon commitments put forward by any company: Microsoft will be carbon negative by 2030 and remove from the environment more carbon than we have emitted since our founding by 2050. Last week, we announced seven important new steps on our path to be carbon negative by 2030. Learn more on the Microsoft on the Issues blog.

Microsoft has prioritized making an impact on this global issue, and Microsoft Quantum researchers have teamed up with researchers at ETH Zurich to develop a new quantum algorithm to simulate catalytic processes. In the context of climate change, one goal will be to find an efficient catalyst for carbon fixation—a process that reduces carbon dioxide by turning it into valuable chemicals. One of our key findings is that the resource requirements to implement our algorithm on a fault-tolerant quantum computer are more than 10 times lower than those of recent state-of-the-art algorithms. These improvements significantly decrease the time it will take a quantum computer to do extremely challenging computations in this area of chemistry. In our research, we have not only improved quantum algorithms and shown how they can help effectively find new catalysts, but we have also learned more about other quantum resources that are necessary to perform these calculations at an exponentially faster rate than classical computers. These learnings include the size of quantum computers and their runtime—and more generally how to better co-design a hybrid quantum-classical computing system to handle this type of problem. Our research is detailed in a paper called “Quantum computing enhanced computational catalysis.”

Carbon fixation: An opportunity in chemistry opens the door for a new application in quantum computing

Figure 1: In the catalytic cycle studied in our paper, a Ruthenium-based catalyst reacts with carbon dioxide and hydrogen molecules to produce water and methanol, leaving the catalyst unchanged to react with another carbon dioxide molecule.

Synthetic carbon fixation is a process that has the potential to help greatly reduce carbon dioxide in the atmosphere by converting CO2 into other useful chemical compounds. Carbon fixation is not a new process. In fact, it is a very old one. Plants use a form of carbon fixation to convert carbon dioxide into energy-rich molecules such as glucose. But glucose isn’t the only possible byproduct of carbon fixation. When using different catalysts, natural or synthetic, carbon dioxide can be converted into other compounds.

Currently, synthetic catalytic processes are found through lengthy trial-and-error lab experiments. In a process that requires testing thousands of molecular combinations, computer simulations that very accurately model quantum correlations could replace complex synthesis of new candidate catalysts. Whereas computers today can have a difficult time accurately calculating properties of complex molecules, quantum computers are especially suited for this task and will give more reliable and predictive simulation results. We hope that quantum computers will complement traditional methods and, together, could reveal a process that both removes carbon dioxide from the atmosphere and provides valuable chemicals in return.

Why begin with a known catalytic reaction if the goal is to find new ones?
It’s important to first look at how these catalytic processes work in order to find ways they can be improved upon, especially exploring where quantum computers can make computational catalysis (a method for simulating catalysts that is already performed on classical computers) more effective, more accurate, and less time-consuming.

In order to better understand how quantum computer algorithms can assist in discovering new, more efficient catalysts, we decided to focus our analysis on a previously published catalytic process based on the transition metal Ruthenium that converts carbon dioxide into methanol. Like all known catalysts for producing methanol to date, it is extremely inefficient. This inefficiency offers an opportunity for finding catalytic reactions that are more scalable. Using this reaction as a foundation for testing our algorithm, we were able to gain knowledge about how to best optimize algorithms for simulating these types of reactions on a quantum computer (see Figure 1 above).

Our algorithmic advancement: Boosting computational efficiency through compression

We need to develop more efficient algorithms for quantum computers because problems that involve calculating molecular energies with high precision, such as for catalytic processes, will be resource intensive—even on quantum computers.

  • Quantum Development Kit: Are you a researcher or developer who wants to help discover new algorithms for quantum computers? Check out the Quantum Development Kit and Q#, a toolkit and high-level programming language for developing quantum algorithms, which allow you to try out a small chemistry algorithm for yourself.

Obtaining high-precision energy estimates requires simulating the molecule’s quantum state for a long period of time, which is split into multiple smaller time steps. All the interaction terms in the problem description, the so-called Hamiltonian, need to be loaded over and over again at every single time step since quantum information cannot be copied. The natural approach to reduce overall runtime is then to reduce both the information that needs to be loaded as well as the number of time steps required for the simulation. One promising approach is to use a so-called “double-factorized” representation of the Hamiltonian. In this representation, the information describing the interaction between electrons is compressed into fewer terms.
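
To make the compression idea concrete, the display below sketches the double-factorized form in notation commonly used for this technique; the exact conventions and truncation choices in the paper may differ. The two-electron coefficients are first written as a sum of low-rank terms, and each term is then diagonalized so that only its leading eigenvalues need to be loaded at each time step.

```latex
% Sketch of a double-factorized electronic-structure Hamiltonian (illustrative notation).
H = \sum_{pq} h_{pq}\, a_p^\dagger a_q
  + \frac{1}{2} \sum_{pqrs} V_{pqrs}\, a_p^\dagger a_q^\dagger a_r a_s,
\qquad
V_{pqrs} \approx \sum_{\ell=1}^{L} W^{(\ell)}_{pq} W^{(\ell)}_{rs},
\qquad
W^{(\ell)}_{pq} \approx \sum_{m=1}^{M_\ell} \lambda^{(\ell)}_{m}\,
  u^{(\ell)}_{m,p}\, u^{(\ell)}_{m,q}.
```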

In our research, we precisely achieve this runtime reduction by developing a new, efficient quantum algorithm. Our algorithm exploits the improved compression properties of the double-factorized form, and it also manages to perform the simulation with significantly larger step sizes compared to prior state of the art that exploits the unfactorized or single-factorized forms of the Hamiltonian. The extent of our improvement for molecules such as Ruthenium catalysts is driven primarily by the larger time step size, as illustrated in the table below. Moreover, aggressive compression can further reduce the number of terms at the cost of accuracy in the simulation. Importantly, our use of a so-called “qubitization” simulation algorithm allows for good control over the target accuracy. Combined, these factors reduce runtime by orders of magnitude for obtaining reliable results.

Ruthenium catalyst configuration: VIII with 130 spin-orbitals

Approach            Number of steps per unit of time evolution    Overall algorithmic speedup
Unfactorized        10,600                                         1.0x
Single-factorized   42,200                                         0.4x
Our results         570                                            18.9x

Designing quantum computers for the hunt for new catalysts

The computational design of catalysts relies on very accurate energy calculations. Quantum computers can avoid uncontrolled approximations of classical simulations. They scale much better and open opportunities to assess the energetics of chemical species with sufficient accuracy. By using the Hamiltonian parameters generated on classical computers, a quantum computer could solve the exact energies of the chemical systems and help profile a quantitatively accurate landscape of reaction pathways. Catalyst structures could then be further refined or modified through the insights generated by reaction kinetics analysis. Such a process could iterate until a desired catalyst is found.

Figure 2: Protocol of quantum computing enhanced computational catalysis workflow. The energies of all species in the catalytic reaction cycle can be evaluated through the quantum computer using the output parameters of classical computers (upper right). The kinetics analysis (bottom left) can then be performed on the whole reaction pathways and new insights on the catalyst structures can be generated. This process repeats until an ideal catalyst structure is found.

Our paper is the first to show analysis of a quantum algorithm on a specific chemical reaction along its entire reaction pathway. Instead of just a single configuration, we analyzed relevant configurations of the reactants along this pathway. In addition, we performed state-of-the-art classical calculations, and our results confirmed that these methods lack the reliability needed for truly predictive computational catalysis. Thus, one of the first roles for quantum computers will be not only to provide accurate results for novel catalysts, but also to benchmark the validity of various classical approximations and help develop better classical simulation methods.

Beyond this, we want to further optimize quantum algorithms to enable the simulation of larger numbers of electrons. Current algorithms limit the accurate quantum computation to so-called active spaces of the most correlated electrons. While that may often be sufficiently accurate, we will not know unless we can simulate larger active spaces or ideally all electrons in a molecule.

Finally, with the estimates for gate counts calculated, we were able to translate this information into potential runtime estimates for quantum computation on this problem. Depending on the assumptions made about future quantum computers, we estimate that it may take anywhere from a little over a day to several years to perform such calculations. This clearly shows the need not only for fast algorithms but also fast and scalable quantum hardware.

Our newer, faster quantum algorithm for calculating molecular energy levels is itself an exciting development and a crucial step in the computational catalysis workflow (see Figure 2 above), but it will take more than that to find an efficient catalyst. In fact, knowing more about the quantum algorithms needed to undertake improved computational catalysis opens the door to even more questions about the scale of quantum computers. What is the amount of memory we need to run these algorithms at a meaningful speed? What does this imply for the needed hybrid workflow and quantum architecture it runs on to successfully find these catalysts? Our results after testing this algorithm reveal some important discoveries going forward.

Where do quantum computers and chemistry applications go from here?

The research presented in this post is evidence that rapid advances in quantum computing are happening now—our algorithm is 10,000 times faster than the one we created just three years ago. By gaining more insight into how quantum computers can improve computational catalysis, including ways that will help to address climate change while creating other benefits, we hope to spur new ideas and developments on the road to creating some of the first applications for large-scale quantum computers of the future. The advancements in algorithms and knowledge gained from our research are a springboard for future work, including exploring additional ways algorithms can be made even more effective. Given the promise and potential that quantum computing represents for tackling the toughest challenges in chemistry, we hope to work alongside the chemistry community to better understand how quantum computers can be best utilized to further develop new chemical processes, molecules, and, eventually, materials.

We are encouraging those who are interested in exploring how chemistry can be impacted by quantum computing to explore Azure Quantum, which comprises a full set of tools, ranging from the Quantum Development Kit (QDK) and the Q# programming language for quantum to simulators and resource estimators. The QDK allows researchers to develop and test new quantum algorithms for chemistry, run small examples on a simulator, use Azure Quantum on quantum hardware, and estimate resource requirements to run simulations at scale on future quantum computers.

As part of the QDK, we developed a Q# chemistry library, with our partner Pacific Northwest National Laboratories (PNNL), that provides several fundamental data structures and tools to explore quantum algorithms for chemistry. If you are looking to get started with the QDK and Q#, check out our Microsoft Learn modules released at Microsoft Build 2020.


Microsoft Research Dissertation Grant supports students’ cutting-edge work

A compilation of headshots of the 2020 Microsoft Research Dissertation Grant recipients: Rogerio Bonatti, Kianté Brantley, Mayara Costa Figueiredo, Sami Davies, Farah Deeba, Anna Fariha, Diego Gómez-Zará, Zerina Kapetanovic, Urvashi Khandelwal, and Shruti Sannon

This year marks the fourth year of the Microsoft Research Dissertation Grant, which offers grants of up to $25,000 to support the research of students who are underrepresented in the field of computing and nearing the completion of doctoral degrees at North American universities. This was the most competitive year yet for the grant program; about 230 students submitted proposals. While we wish we could have given grants to the entire submission pool, it’s encouraging to see that so many students from underrepresented groups are pursuing advanced degrees in computing and related fields, and I wish all of those who submitted proposals success in their studies!

This year’s grant recipients, along with their respective academic institutions and dissertations, are:

  • Rogerio Bonatti, Carnegie Mellon University, “Active Vision: Autonomous Aerial Cinematography with Learned Artistic Decision-Making”
  • Kianté Brantley, University of Maryland, College Park, “Practical Techniques for Leveraging Experts for Sequential Decisions and Predictions”
  • Mayara Costa Figueiredo, University of California, Irvine, “Self-Tracking for Fertility Care: A Holistic Approach”
  • Sami Davies, University of Washington, “Complex Analysis, Hierarchies, and Matroids—Improving Algorithms via a Mathematical Perspective”
  • Farah Deeba, The University of British Columbia, “Placenta: Towards an Objective Pregnancy Screening System”
  • Anna Fariha, University of Massachusetts Amherst, “Enhancing Usability and Explainability of Data Systems”
  • Diego Gómez-Zará, Northwestern University, “Using Online Team Recommender Systems to Form Diverse Teams”
  • Zerina Kapetanovic, University of Washington, “Low-Power Communication for Environmental Sensing Systems”
  • Urvashi Khandelwal, Stanford University, “Understanding and Exploiting the Use of Linguistic Context by Neural Language Models”
  • Shruti Sannon, Cornell University, “Towards a More Inclusive Gig Economy: Examining Privacy, Security, and Safety Risks for Workers with Chronic Illnesses and/or Disabilities”

Furthering their research agendas

Our recipients plan to use the grant money to further various aspects of their research programs; in addition to supporting their tuition, they described the myriad ways in which the funds would advance their research agendas.

For instance, Shruti Sannon, who is studying the risks and opportunities a range of gig platforms pose to workers with chronic illness and/or disabilities, plans to use a portion of her grant to compensate gig workers for participating in interviews, noting it’s particularly important that her interview studies pay a fair hourly wage. “Providing compensation that is reflective of a living wage is particularly important given wider concerns with exploited labor in the gig economy,” Sannon says. She also plans to use some of the funds to pay for professional transcription of the interview recordings to facilitate subsequent qualitative analysis.

2020 Dissertation Grant: Shruti Sannon

Shruti Sannon, Department of Communication, Cornell University

Diego Gómez-Zará, who hopes his thesis work will highlight the potential of recommender systems to design and create more diverse teams, plans to use the grant funding to recruit and pay 240 research study participants and to support two undergraduate research assistants, whom he’ll mentor. He’ll also put some of the funding toward conference costs to present his research findings and open-access publication fees to more widely disseminate his research results. “This grant will help us to continue carrying out our team experiments, which require a large number of participants; continue developing new algorithms for our team recommender systems; and test empirically if team recommender systems can enable users to form more diverse teams,” he says.

2020 Dissertation Grant: Diego Gomez-Zara

Diego Gómez-Zará, Computer Science and Communication Studies Departments, Northwestern University

Farah Deeba, who is working to develop a system for better screening of placental health, plans to use her funding to purchase a handheld ultrasound scanner, which she noted will allow her to extend her research to a point-of-care application for pregnancy monitoring. She also plans to use some of her grant to purchase GPUs to speed up data analysis and for conference travel so she can share her findings.

“The placenta, despite being the single most important factor responsible for a healthy baby and a healthy mother, remains neglected in pregnancy monitoring,” Deeba says. “My research aims at changing the current clinical practice. As a woman, I feel a special connection to my research topic. I believe my research will promise health and security to every pregnant woman during this precious but vulnerable stage of life.”

2020 Dissertation Grant: Farah Deeba

Farah Deeba, Electrical and Computer Engineering Department and Robotics and Control Laboratory, The University of British Columbia

Career development and networking

In addition to the grant monies, recipients will also participate in a virtual two-day career development and networking summit this fall, where they’ll join the recipients of many other Microsoft Research fellowships and grants, as well as our research scientists, to discuss their work and receive advice on completing their degree and navigating the post-PhD job market.

Learn more by exploring the research of all 2020 Dissertation Grant recipients.