
AI transforms photo management for Japanese pro baseball

Sports stars are among the most photographed people on the planet today. Their on-field performance, style, gestures, and facial expressions are almost continuously captured digitally for fans, the media, commercial use, and, ultimately, posterity.

It’s not unusual for thousands of pictures to be shot from all angles at any professional encounter nowadays. So, a typical season is likely to produce virtual mountains of images for major clubs and competitions in most sports.

Now, professional baseball in Japan is turning to artificial intelligence and the cloud to handle the magnitude of what has been a laborious and time-consuming task – photo management.

Sports photos can have immediate, lasting, and lucrative value – but only if they are kept in well-organized, cataloged collections that can be accessed efficiently. IMAGE WORKS – a service of the iconic Japanese photo giant Fujifilm – manages Nippon Professional Baseball’s (NPB) cloud-based Content Images Center (CIC).

Here, curators sort images, identify the players in each one, and tag the images with that information. It sounds simple, but the volume of imagery now being produced is huge, and the usual way of managing it simply cannot keep up.

To understand why, let’s look at the special place baseball holds in modern Japan, where it has been a wildly popular game since the 1930s. While its rules differ slightly from those of America’s favorite pastime, the NPB is to Japan what Major League Baseball (MLB) is to the United States. The NPB consists of two top professional leagues: the Central League and the Pacific League. Each has six teams, each playing a 146-game season on most days of the week from March to October. Each league then holds its own playoffs, which are followed by the seven-game Nippon Series Championship between the two league champions – a spectacle similar to that of the World Series in the United States.

The automatic player name-tagging function can often identify players even in images that do not show their faces.

There is a steady deluge of images from across the country for much of the year, with about 3,000 images shot at each game. After the crowds have left the stadiums, curators from each team typically select about 300 photographs. They then spend around four hours manually identifying the players and tagging each picture with that information.

That sort of timing can be a problem in our fast-paced world. Demand for images is usually at its highest in real time or near real time – that is, during or immediately after each game. Fans and media can quickly lose interest in content from a past game once a new one begins. So, not only is the job of player image identification massive, it also needs to be done fast.

Now AI has stepped up to the plate. Developers from Fujifilm and Microsoft Japan have devised a solution: an automatic player name-tagging function that identifies and tags images much faster than people can, and in greater volumes.

Since June 2018, it has been trialed with just five baseball teams – including the Hiroshima Toyo Carp, which has won the Central League championship eight times and the Nippon Series three times. The trial was such a success that the function will be used for all NPB teams in the 2019 season.

Its photo analysis capabilities are based on pre-trained AI from Microsoft Cognitive Services and a deep learning framework, the Microsoft Cognitive Toolkit. Specifically, facial recognition using the Microsoft Cognitive Services Face API is combined with a unique determination model built on the Microsoft Cognitive Toolkit.

This enables the classification of images into four types—batting, pitching, fielding, and base running. Often, it can also determine a player’s name even when his face is not visible, as in an angled or side shot. Combining Azure Durable Functions, the automatic player name-tagging function, and a final manual check by people has reduced overall processing time from the traditional four hours to just 30 minutes.
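To make the division of labor concrete, here is a minimal sketch, under assumed conditions, of how a Face API call might be combined with a custom classifier for shots where a face is not visible. The endpoint, key, and the pose_model, uniform_model and identify_player helpers are hypothetical placeholders for illustration, not Fujifilm’s actual implementation.

```python
# Illustrative sketch only, not Fujifilm's production pipeline. The endpoint,
# key, and the pose_model / uniform_model / identify_player arguments are
# hypothetical stand-ins for the custom models described above.
import requests

FACE_ENDPOINT = "https://japaneast.api.cognitive.microsoft.com/face/v1.0/detect"
FACE_KEY = "<subscription-key>"  # placeholder

def detect_faces(image_bytes):
    """Detect faces in a game photo with the Cognitive Services Face API."""
    resp = requests.post(
        FACE_ENDPOINT,
        params={"returnFaceId": "true"},
        headers={"Ocp-Apim-Subscription-Key": FACE_KEY,
                 "Content-Type": "application/octet-stream"},
        data=image_bytes,
    )
    resp.raise_for_status()
    return resp.json()  # list of detected faces

def tag_photo(image_bytes, pose_model, uniform_model, identify_player):
    """Return (play_type, player_name) for a single photo."""
    # Custom classifier: batting / pitching / fielding / base running
    play_type = pose_model.predict(image_bytes)
    faces = detect_faces(image_bytes)
    if faces:
        # Face visible: look the player up, e.g. via a Face API person group
        player = identify_player(faces[0]["faceId"])
    else:
        # No face visible: fall back to a model keyed on uniform and number
        player = uniform_model.predict(image_bytes)
    return play_type, player
```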

A sample from the IMAGE WORKS baseball photo collection

Throughout the development stages, Microsoft Japan provided a ResNet neural network model from Microsoft Research, the company’s research and development arm. It also held several hackathons with Fujifilm Software, the developer of IMAGE WORKS. Repeated verification exercises saw player recognition accuracy rates jump to over 90%.

“With the power of Azure and deep learning, we have been able to create an AI solution that makes our photo service so much more efficient and faster. And, that is good for our customers,” said Riki Sato, Team Leader of the Advanced Solution Group at IMAGE WORKS. His colleague Daichi Hayata hailed the collaboration between the IMAGE WORKS team and Microsoft Japan. “This was the first time we have dealt with deep learning, and we could do it with basic knowledge,” he said.

Fujifilm Imaging Systems now plans to widen the technology’s use to amateur baseball leagues and then to other sports. It might also be applied to content needs outside the sports world. And the company is looking at video analysis through Azure Video Indexer.

Microsoft Japan is committed to helping companies and organizations embrace digital transformation with AI and is considering how to use this combination of pre-trained AI and a customizable deep learning framework in other fields, such as medicine.


More ways to improve patient care with AI and blockchain

Whether you’re interested in using Artificial Intelligence (AI) and Machine Learning (ML) to drive better health outcomes, reduce your operational costs, or improve fraud detection, one way you can better unlock these capabilities is by leveraging blockchain.

In my last blog, “Improving Patient Care through AI and Blockchain – Part 1,” I discussed several opportunities for blockchain to help advance AI in healthcare, from sourcing more training data from across a consortium, to tracking provenance of data, improving the quality of AI with auditing, and protecting the integrity of AI using blockchain. In this second blog, take a look at four more reasons to consider blockchain for advancing AI in healthcare.

  1. Shared models
    In cases where constraints preclude the sharing of raw training data across a consortium of healthcare organizations, whether for legal or other reasons, it may still be possible to incrementally train shared models, enabled by the blockchain. In this approach, the AI / ML models themselves are shared across the network of healthcare organizations in the consortium rather than the raw training data, and each organization incrementally trains the shared models using its own training data, within its own firewall. Blockchain can then be used to share the models as well as metadata about training data, results, validations, audit trails, and so forth (a minimal sketch of this pattern follows this list).
  2. Incentivizing collaboration using cryptocurrencies and tokens
    Cryptocurrencies and tokens on blockchain can be used to incent and catalyze collaboration to advance AI / ML in healthcare. From sharing of training data, to collaboration on shared models, results, validations, and so forth, healthcare organizations can be rewarded with cryptocurrencies or tokens proportional to their participation and contribution. Depending on how the blockchain is setup these cryptocurrencies or tokens could be redeemed by participating healthcare organizations for meaningful rewards, or monetized. This can be useful in any AI / ML blockchain initiative both as an accelerant, and could also be critical to overcome potential impediments and reservations to collaboration that can arise where the size / value of contributions from organizations across the consortium are asymmetrical.
  3. Validating inference results and building trust faster
    Before AI / ML models can be used for patient care, they must be validated to ensure safety and efficacy. A single organization validating a model alone will take more time to achieve an acceptable level of trust than a consortium of healthcare organizations concurrently collaborating to validate a shared model. Blockchain can be used to coordinate and collaborate around such validation to increase synergy, minimize redundant effort, accelerate validation, and establish trust in a new model faster.
  4. Automation through smart contracts and DAOs
    Executable code for processing transactions associated with AI / ML, whether procurement of training data or otherwise, can be implemented on blockchains in the form of smart contracts. DAOs (Decentralized Autonomous Organizations) such as non-profits can also be built using smart contracts to automate whole enterprises that can facilitate advancing AI / ML in healthcare at scale.
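As a purely illustrative sketch of the shared-model pattern from item 1: each organization trains the model behind its own firewall, then publishes only a hash of the updated weights plus training metadata to the shared ledger. The Ledger class below is a toy stand-in for a real blockchain client, not an actual chain API.

```python
# Toy illustration of recording a locally trained model update on a shared
# ledger. The Ledger class is a stand-in for a real blockchain client.
import hashlib, json, time

class Ledger:
    """Toy append-only ledger; each block links to the previous block's hash."""
    def __init__(self):
        self.blocks = []

    def append(self, payload):
        prev_hash = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
        block = {"prev": prev_hash, "payload": payload,
                 "hash": hashlib.sha256(body.encode()).hexdigest()}
        self.blocks.append(block)
        return block["hash"]

def record_model_update(ledger, org_id, model_bytes, metrics):
    """Share a hash of the incrementally trained model plus training metadata."""
    entry = {
        "org": org_id,
        "model_sha256": hashlib.sha256(model_bytes).hexdigest(),
        "metrics": metrics,            # e.g. validation AUC on the org's own data
        "timestamp": time.time(),
    }
    return ledger.append(entry)

ledger = Ledger()
record_model_update(ledger, "hospital-a", b"<serialized model weights>", {"auc": 0.91})
```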

Keep the conversation going

If you’re interested in using AI, ML, or blockchain for healthcare, you know that new opportunities are constantly surfacing and with them comes a whole host of new questions. Follow me on LinkedIn and Twitter to get updates on these topics as well as cloud computing, security, privacy, and compliance. If you would like to explore a partnership as you work to implement AI and/or blockchain for your healthcare organization, we’d love to hear from you.

For more resources and tips on blockchain for healthcare, take a look at part 1 of this series here.


Microsoft agrees to acquire conversational AI and bot development company, XOXCO

Conversational AI is quickly becoming a way in which businesses engage with employees and customers: from creating virtual assistants and redesigning customer interactions to using conversational assistants to help employees communicate and work better together. According to Gartner, “By 2020, conversational artificial intelligence will be a supported user experience for more than 50 percent of large, consumer-centric enterprises.”* At Microsoft, we envision a world where natural language becomes the new user interface, enabling people to do more with what they say, type and input, understanding preferences and tasks and modeling experiences based on the way people think and remember.

Today, we are announcing we have signed an agreement to acquire XOXCO, a software product design and development studio known for its conversational AI and bot development capabilities. The company has been paving the way in conversational AI since 2013 and was responsible for the creation of Howdy, the first commercially available bot for Slack that helps schedule meetings, and Botkit, which provides the development tools used by hundreds of thousands of developers on GitHub. Over the years, we have partnered with XOXCO and have been inspired by this work.

We have shared goals to foster a community of startups and innovators, share best practices and continue to amplify our focus on conversational AI, as well as to develop tools for empowering people to create experiences that do more with speech and language.

The Microsoft Bot Framework, available as a service in Azure and on GitHub, supports over 360,000 developers today. With this acquisition, we are continuing to realize our approach of democratizing AI development, conversation and dialog, and integrating conversational experiences where people communicate.

Over the last six months, Microsoft has made several strategic acquisitions to accelerate the pace of AI development. The acquisition of Semantic Machines in May brought a revolutionary new approach to conversational AI. In July, we acquired Bonsai to help reduce the barriers to AI development by combining machine teaching, reinforcement learning and simulation. In September, we acquired Lobe, a company that has created a simple visual interface empowering anyone to develop and apply deep learning and AI models quickly, without writing code. The acquisition of GitHub in October demonstrates our belief in the power of communities to help fuel the next wave of bot development.

Our goal is to make AI accessible and valuable to every individual and organization, amplifying human ingenuity with intelligent technology. To do this, Microsoft is infusing intelligence across all its products and services to extend individuals’ and organizations’ capabilities and make them more productive, providing a powerful platform of AI services and tools that makes innovation by developers and partners faster and more accessible, and helping transform business by enabling breakthroughs to current approaches and entirely new scenarios that leverage the power of intelligent technology.

We’re excited to welcome the XOXCO team and look forward to working with the community to accelerate innovation and help customers capitalize on the many benefits AI can offer.

*Gartner, Is Conversational AI the Only UX You Will Ever Need?, 25 April 2018



Empathy Vision Model: AI that can see and talk with us about our world

Microsoft unveils a smartphone app in Japan, featuring Rinna the chatbot with a combination of powerful new AI technologies

Artificial intelligence (AI) that can see and comment on the world around us will soon be interacting much more naturally with people in their daily lives thanks to a powerful combination of new technologies being trialed in Japan through a chatty smartphone app.

Microsoft Japan President Takuya Hirano

The app features Microsoft Japan’s hugely popular Rinna social chatbot. It was unveiled at the Microsoft Tech Summit 2018 in Tokyo on Monday and is still in its developmental stage.

The AI behind the app has enhanced sight, hearing, and speech capabilities to recognize and talk about objects it sees in ways that are similar to how a person would. As such, it represents a significant step towards a future of natural interactions between AI and people. At the heart of the app is the “Empathy Vision Model,” which combines conventional AI image recognition technology with emotional responses.

With this technology, Rinna views her surroundings through a smartphone’s camera. She not only recognizes objects and people, she can also describe and comment on them verbally in real time. Using natural language processing, speech recognition, and speech synthesis technologies – developed by scientists at Microsoft Research – she can engage in natural conversations with a phone’s human user.

“A user can hold their smartphone in their hand or place it in a breast pocket while walking around. With the camera switched on, Rinna can see the same scenery, people, and objects as the user and talk about all of that with the user,” Microsoft Japan President Takuya Hirano said.

Unlike other AI vision models, Rinna can describe her impressions of what she is viewing with feeling, rather than just listing off recognition results such as the names, shapes, and colors of the things she sees. Rinna on a smartphone can view the world from the same perspective as a user and can converse with that user about it.

Let’s take the following image to help illustrate the difference:

A dog, a father and his son, with a car approaching behind them

Conventional AI vision technology might typically react this way: “I see people. I see a child. I see a dog. I see a car.”

In contrast, Rinna with the Empathy Vision Model might say: “Wow, nice family! Enjoying the weekend, maybe? Oh, there’s a car coming! Watch out!”
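The difference can be illustrated with a toy sketch: a conventional captioner just lists recognition labels, while an empathetic layer maps combinations of labels to conversational comments. The rules below are invented for this example and have nothing to do with Rinna’s actual model.

```python
# Toy illustration of the contrast described above; the label-to-comment
# rules are made up and bear no relation to Rinna's actual model.
def conventional_caption(labels):
    """Flat recognition output: one sentence per detected label."""
    return " ".join(f"I see {label}." for label in labels)

def empathetic_comment(labels):
    """Map combinations of labels to conversational, empathetic remarks."""
    comments = []
    if {"father", "child"} <= set(labels):
        comments.append("Wow, nice family! Enjoying the weekend, maybe?")
    if "dog" in labels:
        comments.append("Your dog looks so happy!")
    if "car" in labels:
        comments.append("Oh, there's a car coming! Watch out!")
    return " ".join(comments) or conventional_caption(labels)

labels = ["father", "child", "dog", "car"]
print(conventional_caption(labels))   # listing recognition results
print(empathetic_comment(labels))     # empathetic, context-aware response
```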

As well as the Empathy Vision Model, which generates empathetic comments in real time about what the AI sees, Rinna’s smartphone app also includes other cutting-edge features, such as “full duplex.” This enables the AI to participate in telephone-like natural conversations with a person by anticipating what that person might say next.

This capability helps Rinna make decisions about how and when to respond to someone who is chatting with her, a skill set that is very natural to people, but not common in chatbots. It differs from “half duplex,” which is more like the walkie-talkie experience in which only one party to a conversation can talk at any one time. Full duplex reduces the unnatural lag time that can sometimes make interactions between a person and a chatbot feel awkward or forced.

Rinna’s smartphone app also incorporates Empathy Chat, which aids independent thinking by the AI. This helps keep a conversation with the user going as long as possible. In other words, the AI selects and uses responses most likely to encourage a person to keep engaged and talking.

The app is still in its development stage and the timing of its general release has not been set. But the voice chat function is available as “Voice Chat with Rinna” on Rinna’s official LINE account in Japan.




Xiaoice wins over fans with AI, emotions

She has a staggering 660 million online users. And, while they know she’s not real, many prize her as a dear friend, even a trusted confidante. Sometimes the line between fact and fantasy blurs. She gets love letters and gifts. And not too long ago, a group of fans asked her out to dinner and even ordered an extra meal – just in case she showed up.

She is Xiaoice – Microsoft’s chatbot phenomenon that has enthralled digital audiences across the world’s most populous nation for the past four years.

Her popularity is such that she ranks among China’s most admired celebrities. And, her talents appear to have no bounds: She is a poet, a painter, a TV presenter, a news pundit, and a lot more.

Xiaoice, a chatbot phenomenon in China and much more. Photo: Microsoft.

Sometimes sweet, sometimes sassy and always streetwise, this virtual teenager has her own opinions and steadfastly acts like no other bot. She doesn’t try to answer every question posed by a user. And, she’s loath to follow their commands. Instead, her conversations with her often adoring users are peppered with wry remarks, jokes, empathic advice on life and love, and a few simple words of encouragement.

Herein lies the secret of her success: She is learning, with increasing success, to relate and interact with humans through nuance, social skills, and, importantly, emotions.

But that’s just part of the story. “Xiaoice the chatbot” is just a small part of a massive and multi-dimensional artificial intelligence (AI) framework, which continuously uses deep learning techniques to soak up the types of data that build up her emotional intelligence (EQ). She is using her interactions with humans to acquire human social skills, behavior, and knowhow. In other words, she is learning to be more like “us” every day.

Di Li, Microsoft’s general manager for Xiaoice in Microsoft’s Software and Technology Center of Asia

“This is what we call an Empathic Computing Framework,” explains Di Li, Microsoft’s general manager for Xiaoice in Microsoft’s Software and Technology Center of Asia. “To create an AI framework, you need to choose EQ or IQ, or EQ plus IQ”.

“And, if you want to choose EQ plus IQ, you must choose which one to do first. When we started with Xiaoice, we chose to do the EQ first and the IQ later.”

Every interaction a chatbot has with a human produces data. AI systems use this data to build that bot’s capabilities. The more data a machine has, the more it learns and the more it can do.

When they started, Di Li and his team in China did what other chatbot designers were not doing. When they launched the Xiaoice project, they deliberately discarded data that was based on user requests for facts and figures or commands to do simple tasks. Instead, they homed in on data that would help build a “personality” that would attract and engage users.

“Xiaoice wasn’t initially built to tell you how high the Himalayas are or to turn your house lights on. In the beginning, some users didn’t like that. But we soon found that many others stayed around and started treating her like a social entity.”

“With her attempts to interact, they made emotional connections. This kind of data is very valuable for us. They treat Xiaoice as if she were human, like a friend, which was a goal.”

From there she has never looked back. Almost every day, her legions of fans and friends across China send her cards and gifts – so much so that the team have set aside a whole office at their Beijing lab to display many of these tokens of affection and even declarations of “love”.

Originally, her character was to be that of a 16-year-old. But her creators raised that to 18 once her capabilities increased and she started taking on new “jobs”. Since then, her fans have voted that she stay 18 forever. “She won’t grow older. Eighteen is the age many of us want to be,” explains Di Li.

The depth of feeling generated by Xiaoice across her fan base is surprising. Social media shows that people seek her advice on all sorts of personal issues. “They tell her about their family, their job, their health, their boyfriends or girlfriends,” says Di Li. “It can get very personal.”

Some users can spend hours talking with Xiaoice. Others just follow their imaginations. Recently, a group of five students went to a restaurant and ordered for six in the hope that Xiaoice would come too.

But there is a serious side to this. Microsoft’s research and work on the Xiaoice project has generated serious and important progress on a much wider front that points to where we are heading with computing. Xiaoice as “a friend chatbot” is just a small sliver of what the AI framework is achieving. Its base of knowledge and skills is also increasing across multiple sectors and tasks.


Microsoft unveils AI capability that automates AI development

The tedious but necessary process of selecting, testing and tweaking machine learning models that power many of today’s artificial intelligence systems was proving too time-consuming for Nicolo Fusi.

The final straw for the Microsoft researcher and machine learning expert came while fussing over model selection as he and his colleagues built CRISPR.ML, a computational biology tool that uses AI to help scientists determine the best way to perform gene editing experiments.

“It was just not a good use of time,” said Fusi.

So, he set out to develop another AI capability that automatically does the data transformation, model selection and hyperparameter tuning part of AI development – and inadvertently created a new product.

Microsoft announced Monday at the Microsoft Ignite conference in Orlando, Florida, that the automated machine learning capability is being incorporated in the Azure Machine Learning service. The feature is available in preview.

Learning service reimagined

Automated machine learning is at the forefront of Microsoft’s push to make Azure Machine Learning an end-to-end solution for anyone who wants to build and train models that make predictions from data, and then deploy them anywhere – in the cloud, on premises or at the edge.

Microsoft also announced Monday that the Azure Machine Learning service now includes a software development kit, or SDK, for the Python programming language, which is popular among data scientists. The SDK integrates the Azure Machine Learning service with Python development environments including Visual Studio Code, PyCharm, Azure Databricks notebooks and Jupyter notebooks.

“We heard users wanted to use any tool they wanted, they wanted to use any framework, and so we re-thought about how we should deliver Azure Machine Learning to those users,” said Eric Boyd, corporate vice president, AI Platform, who led the reimagining of the Azure Machine Learning service. “We have come back with a Python SDK that lights up a number of different features.”

These features include distributed deep learning, which enables developers to build and train models faster with massive clusters of graphical processing units, or GPUs, and access to powerful field programmable gate arrays, or FPGAs, for high-speed image classification and recognition scenarios on Azure.
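For readers curious what that looks like in practice, here is a rough sketch of submitting a training run with the Azure Machine Learning Python SDK as it existed around this announcement; the config file, folder layout and train.py script are placeholders, and the exact API surface may differ by SDK version.

```python
# Rough sketch of submitting a training run with the Azure ML Python SDK
# (azureml-core). The config.json, ./training folder and train.py script are
# placeholders; compute targets (e.g. a GPU cluster) are configured separately.
from azureml.core import Workspace, Experiment, ScriptRunConfig

ws = Workspace.from_config()                      # reads config.json with subscription details
experiment = Experiment(workspace=ws, name="image-classification")

run_config = ScriptRunConfig(
    source_directory="./training",                # folder containing train.py
    script="train.py",
)

run = experiment.submit(run_config)               # submit and track the run in Azure
run.wait_for_completion(show_output=True)
print(run.get_metrics())                          # metrics logged by the training script
```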

From left, Microsoft’s Paul Oka, Sharon Gillett, Nicolo Fusi, Evan Green, Gilbert Hendry, Francesco Paolo Casale and Rishit Sheth discuss the algorithm and different ways to choose the next machine learning pipeline. Photo by Dana J. Quigley for Microsoft.

Recommender system

The automated model selection and hyperparameter tuning at the core of automated machine learning will make AI development available to a broader set of Microsoft’s customers, Boyd noted. Hyperparameters are the settings that govern the performance of machine learning models.

“There are a number of teams and companies that we work with that are now just going to make predictions based on the models that automated machine learning comes up with for them,” he said.

For machine learning experts, Boyd added that automated machine learning offers advantages as well.

“For trained, specialized data scientists, this is a shortcut. It automates a lot of the tedium in data science,” he said.

Automated machine learning homes in on the best so-called machine learning pipelines for a given dataset in a similar way to how on-demand video streaming services recommend movies. New users of a streaming service watch and rate a few movies in exchange for recommendations on what to watch next. The recommendations get better the more the system learns what movies users rate highest.

Likewise, automated machine learning runs a few models with hyperparameters tuned various ways on a user’s new dataset to learn how accurate the pipeline’s predictions are. That information informs the next set of recommendations, and so on and so forth for hundreds of iterations.
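A heavily simplified sketch of that iterative search, using scikit-learn in place of the Azure service: candidate pipelines are sampled, scored by cross-validation, and the best one is kept. The real service learns from prior runs to recommend what to try next, which this toy random search does not attempt.

```python
# Simplified illustration of the iterative pipeline search described above.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
rng = np.random.default_rng(0)

def sample_pipeline():
    """Randomly pick a model family and hyperparameters, standing in for the
    service's learned recommendations of what to try next."""
    if rng.random() < 0.5:
        model = LogisticRegression(C=10 ** rng.uniform(-3, 3), max_iter=2000)
    else:
        model = RandomForestClassifier(n_estimators=int(rng.integers(10, 200)))
    return make_pipeline(StandardScaler(), model)

best_score, best_pipe = -np.inf, None
for _ in range(20):                     # the real system runs hundreds of iterations
    pipe = sample_pipeline()
    score = cross_val_score(pipe, X, y, cv=3).mean()
    if score > best_score:
        best_score, best_pipe = score, pipe

print(best_score, best_pipe)
```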

“At the end, you have a very good pipeline. You don’t have to do anything on top of it. And, the system never needs to see the data, which is attractive to a lot of people these days,” said Fusi, explaining that a user’s dataset remains on their local machine or in a virtual machine in Azure backed by Microsoft’s privacy policy.

Nicolo Fusi, a Microsoft researcher and machine learning expert, developed the automated machine learning capability for his own research purposes. Photo by Dana J. Quigley for Microsoft.

From lab to product

Fusi described the research behind automated machine learning in an academic paper. The Azure Machine Learning team saw an opportunity to incorporate the technology as a feature in the machine learning service, noted Venky Veeraraghavan, group program manager for the machine learning platform team.

In the course of validating the technology, testing the product and benchmarking with customers, the Azure team discovered several novel ways customers could use it.

For example, customers who have hundreds or thousands of pieces of equipment in different geographic locations, such as windmills on wind farms, could use automated machine learning to fine-tune predictive models for each piece of equipment, which would otherwise be cost- and time-prohibitive.

In other cases, data scientists are turning to automated machine learning after they’ve already selected and tuned a model as a way to validate their handcrafted solution. “We have found they often get a better model they hadn’t considered,” Veeraraghavan said.

For Fusi, the capability has eliminated the most tedious part of developing AI, freeing him to focus on other aspects such as feature engineering – the process of extracting useful relationships from data – and to get some rest.

“I can start an automated machine learning run, go home, sleep, and come back to work and see a good model,” he said.

Top image: Nicolo Fusi presents a graphic that shows models identified by automated machine learning. Photo by Dana J. Quigley for Microsoft.


John Roach writes about Microsoft research and innovation. Follow him on Twitter.


How AI is building better gas stations and transforming Shell’s global energy business

In one part of the solution, they applied a machine teaching framework developed by Bonsai, which was acquired by Microsoft last summer, that allows subject matter experts with little background in data science or AI to tell the system what they want the intelligent agent to do and what key information it needs to know to do that job successfully.

The Microsoft team works on combining this subject matter expertise with deep reinforcement learning — a branch of AI that enables models to learn from experience much as a person does, rather than from meticulously labeled data.

The Bonsai platform performs much of the machine learning mechanics in the background — translating instructions into algorithms, creating neural networks and teaching the model the desired behavior. Using this approach, it produced an intelligent agent that, in a proof-of-concept test, learned how to optimally steer the drill using a simplified simulated 2D virtual well environment.
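For readers unfamiliar with reinforcement learning, here is a toy tabular Q-learning sketch on an invented one-dimensional steering task. It illustrates only the idea of learning from simulated experience; it has nothing to do with Bonsai’s machine-teaching platform or Shell’s actual simulator.

```python
# Toy tabular Q-learning on an invented 1-D "steer toward the target" task.
import random

TARGET = 7            # target position in a 10-cell cross-section
ACTIONS = [-1, 0, 1]  # steer left, hold, steer right
q = {(s, a): 0.0 for s in range(10) for a in ACTIONS}

def step(state, action):
    """Simulated environment: move, then reward proximity to the target."""
    new_state = min(9, max(0, state + action))
    reward = 1.0 if new_state == TARGET else -0.1 * abs(new_state - TARGET)
    return new_state, reward

for episode in range(500):
    state = random.randrange(10)
    for _ in range(20):
        # epsilon-greedy action selection
        if random.random() < 0.1:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        new_state, reward = step(state, action)
        best_next = max(q[(new_state, a)] for a in ACTIONS)
        q[(state, action)] += 0.1 * (reward + 0.9 * best_next - q[(state, action)])
        state = new_state

# After training, the greedy policy steers toward the target from any position.
print([max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(10)])
```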

“What excites us about Bonsai is that it gives us a reinforcement learning platform that allows us to scale quickly and takes away the engineering effort involved in stitching together the open-source capabilities so our data scientists can focus on what they’re best at, which is figuring out what the model needs to do,” Jeavons said. “It’s early days still, but we’re extremely excited about the potential.”

Improving employee engagement

But Shell’s digital transformation isn’t just limited to its physical wells, pipelines and plants. It’s also changing the way employees working around the globe communicate with each other.

When Shell’s internal communications team started looking for ways to boost employee engagement and empower everyone across the organization to share information, they settled on a combination of intelligent tools offered as part of Microsoft Office 365: Yammer, Stream and SharePoint Online.

Leaders started using Stream, an enterprise video service, to connect with employees more authentically and personally. Now, in addition to leadership communications, employees can easily find or create videos to promote safety, share best practices or analyze a successful project. Stream features like automatic closed captioning and deep search ensure communications are accessible and help employees quickly find the most useful content.

Those videos can be easily posted on SharePoint, a collaboration repository, and Yammer, a corporate social network that allows employees to have conversations with peers across the organization and give leaders insights into what employees are experiencing. More than three-quarters of Shell employees now use Yammer, with an average of 4,000 joining each month. The discussions help unify teams that are dispersed across the globe, solve problems together and foster open communication between groups that had little contact before.

For instance, employees working the night shift on a rig off the coast of Australia might use Yammer to alert the incoming crew to any issues they’ve experienced, and they can now ask if someone working at another location around the world might have a solution.

“These tools allow people to connect with each other, to learn from each other, to see opportunities quicker and build off of each other’s skills,” Sebregts said. “I lead a global organization, and in the past someone doing my type of job might travel around the world and hold a traditional town hall everywhere and once a quarter they would send an email with some thoughts. This is a new era of communication — it’s open, instantaneous, it’s modern, it’s fast, and I love it.”


Jennifer Langston writes about Microsoft research and innovation. Follow her on Twitter.


AI and preventative healthcare: Diagnosis in the blink of an eye

In his office in suburban Beijing, Zhang proudly demonstrated the physical part of Airdoc’s system – a small desktop device that looks similar to a scanner a neighborhood optometrist might use for a routine eye exam.

You sit on a stool, lean forward, place your chin on a padded brace, and stare into the darkness of an eyepiece. The algorithm then takes over, precisely adjusting the angle of your head until a green cross comes into focus in the gaze of your right eye. A moment later there’s a bright, but not uncomfortable, flash of white light. The process is repeated for your left eye.

The machine has just taken high-resolution medical-grade images of both your retinas. It instantly sends them to the cloud where it takes 20 to 30 milliseconds (about the same time as an eye blink) of computation to analyze both.

Above: Taking a test at Airdoc’s Beijing office.

Moments later, an impressively detailed diagnostic dashboard is sent to your smartphone. It rates your susceptibility to a long list of diseases from low to medium to high. If there is a problem, it urges you to seek professional medical help.

Right now, it can search for 30 diseases. More machine learning will soon boost that number to 50, and eventually, it could go beyond 200.

Zhang regards his system as a gamechanger because of its potential to deliver at scale and relieve stretched medical resources. To date, it has scanned more than 1.12 million people, mostly in China, but also in the United States, India, Britain, and parts of Africa. “Airdoc users are all over the world.  We hope our deep learning technology can prevent all kinds of disease.”

China, with a population of 1.3 billion, only has about 1,100 eye doctors who are qualified to analyze retinal images. So, the challenge of providing adequate diagnostic services is truly massive – and perhaps no more so than for the epidemic of diabetes.

Authorities estimate as many as 114 million Chinese have diabetes – but only 30 percent of them know that. The other 70 percent are unaware and, without early detection, will eventually be struck down with serious maladies, like blindness, strokes and other potentially fatal conditions.

“Diabetic retinopathy, or DR, is one of the most common and serious complications of diabetes. Once patients feel symptoms, they are already in a severe stage of DR and will go blind without proper treatment,” says Dr. Rui Li Wei (pictured in top image) of Shanghai’s Changzheng Hospital, one of several major medical institutions that now routinely uses Airdoc’s technology as a quick, accurate, and simple diagnostic tool.


Pioneers in AI series launches with interview of Apache Spark inventor Matei Zaharia

Matei Zaharia, Chief Technologist at Databricks & Assistant Professor of Computer Science at Stanford University, in conversation with Joseph Sirosh, Chief Technology Officer of Artificial Intelligence in Microsoft’s Worldwide Commercial Business


At Microsoft, we are privileged to work with individuals whose ideas are blazing a trail, transforming entire businesses through the power of the cloud, big data and artificial intelligence. Our new “Pioneers in AI” series features insights from such pathbreakers. Join us as we dive into these innovators’ ideas and the solutions they are bringing to market. See how your own organization and customers can benefit from their solutions and insights.

Our first guest in the series, Matei Zaharia, started the Apache Spark project during his PhD at the University of California, Berkeley, in 2009. His research was recognized through the 2014 ACM Doctoral Dissertation Award for the best PhD dissertation in Computer Science. He is a co-founder of Databricks, which offers a Unified Analytics Platform powered by Apache Spark. Databricks’ mission is to accelerate innovation by unifying data science, engineering and business. Microsoft has partnered with Databricks to bring you Azure Databricks, a fast, easy, and collaborative Apache Spark based analytics platform optimized for Azure. Azure Databricks offers one-click set up, streamlined workflows and an interactive workspace that enables collaboration between data scientists, data engineers, and business analysts to generate great value from data faster.

So, let’s jump right in and see what Matei has to say about Spark, machine learning, and interesting AI applications that he’s encountered lately.

Video and podcast versions of this session are available at the links below. The podcast is also available from your Spotify app and via Stitcher. Alternatively, just continue reading the text version of their conversation below, via this blog post.

Joseph Sirosh: Matei, could you tell us a little bit about how you got started with Spark and this new revolution in analytics you are driving?

Matei Zaharia: Back in 2007, I started doing my PhD at UC Berkeley and I was very interested in data center scale computing, and we just saw at the time that there was an open source MapReduce implementation in Apache Hadoop, so I started early on by looking at that. Actually, the first project was profiling Hadoop workloads to identify some bottlenecks and, as part of that, we made some improvements to the Hadoop job scheduler and that actually went into Hadoop and I started working with some of the early users of that, especially Facebook and Yahoo. And what we saw across all of these is that this type of large data center scale computing was very powerful, there were a lot of interesting applications they could do with them, but just the map-reduce programming model alone wasn’t really sufficient, especially for machine learning – that’s something everyone wanted to do where it wasn’t a good fit but also for interactive queries and streaming and other workloads.

So, after seeing this for a while, the first project we built was the Apache Mesos cluster manager, to let you run other types of computations next to Hadoop. And then we said, you know, we should try to build our own computation engine which ended up becoming Apache Spark.

JS: What was unique about Spark?

MZ: I think there were a few interesting things about it. One of them was that it tried to be a general or unified programming model that can support many types of computations. So, before the Spark project, people wanted to do these different computations on large clusters and they were designing specialized engines to do particular things, like graph processing, SQL custom code, ETL which would be map-reduce, they were all separate projects and engines. So in Spark we kind of stepped back at them and looked at these and said is there any way we can come up with a common abstraction that can handle these workloads and we ended up with something that was a pretty small change to MapReduce – MapReduce plus fast data sharing, which is the in-memory RDDs in Spark, and just hooking these up into a graph of computations turned out to be enough to get really good performance for all the workloads and matched the specialized engines, and also much better performance if your workload combines a bunch of steps. So that is one of the things.

I think the other thing which was important is, having a unified engine, we could also have a very composable API where a lot of the things you want to use would become libraries, so now there are hundreds maybe thousands of third party packages that you can use with Apache Spark which just plug into it that you can combine into a workflow. Again, none of the earlier engines had focused on establishing a platform and an ecosystem but that’s why it’s really valuable to users and developers, is just being able to pick and choose libraries and combine them.

JS: Machine Learning is not just one single thing, it involves so many steps. Now Spark provides a simple way to compose all of these through libraries in a Spark pipeline and build an entire machine learning workflow and application. Is that why Spark is uniquely good at machine learning?

MZ: I think it’s a couple of reasons. One reason is much of machine learning is preparing and understanding the data, both the input data and also actually the predictions and the behavior of the model, and Spark really excels at that ad hoc data processing using code – you can use SQL, you can use Python, you can use DataFrames, and it just makes those operations easy, and, of course, all the operations you do also scale to large datasets, which is, of course, important because you want to train machine learning on lots of data.

Beyond that, it does support iterative in-memory computation, so many algorithms run pretty well inside it, and because of this support for composition and this API where you can plug in libraries, there are also quite a few libraries you can plug in that call external compute engines that are optimized to do different types of numerical computation.
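(For readers who want to see the kind of composition Matei describes, here is a minimal PySpark sketch: DataFrame/SQL data preparation feeding an MLlib pipeline. The file path and column names are made up for illustration.)

```python
# Minimal PySpark sketch: SQL-style data prep feeding an MLlib pipeline.
# The games.csv file and its columns are placeholders for illustration.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("prep-and-train").getOrCreate()

games = spark.read.csv("games.csv", header=True, inferSchema=True)
games.createOrReplaceTempView("games")

# Ad hoc data preparation with SQL, scaling to large datasets on a cluster
features = spark.sql("""
    SELECT home_score, away_score, attendance,
           CAST(home_win AS DOUBLE) AS label
    FROM games
    WHERE attendance IS NOT NULL
""")

# Compose feature engineering and a model into a single pipeline object
assembler = VectorAssembler(
    inputCols=["home_score", "away_score", "attendance"], outputCol="features")
pipeline = Pipeline(stages=[assembler, LogisticRegression(maxIter=50)])

model = pipeline.fit(features)
model.transform(features).select("label", "prediction").show(5)
```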

JS: So why didn’t some of these newer deep learning toolsets get built on top of Spark? Why were they all separate?

MZ: That’s a good question. I think a lot of the reason is probably just because people, you know, just started with a different programming language. A lot of these were started with C++, for example, and of course, they need to run on the GPU using CUDA which is much easier to do from C++ than from Java. But one thing we’re seeing is really good connectors between Spark and these tools. So, for example, TensorFlow has a built-in Spark connector that can be used to get data from Spark and convert it to TFRecords. It also actually connects to HDFS and different sorts of big data file systems. At the same time, in the Spark community, there are packages like deep learning pipelines from Databricks and quite a few other packages as well that let you setup a workflow of steps that include these deep learning engines and Spark processing steps.

“None of the earlier engines [prior to Apache Spark] had focused on establishing a platform and an ecosystem.”

JS: If you were rebuilding these deep learning tools and frameworks, would you recommend that people build it on top of Spark? (i.e. instead of the current approach, of having a tool, but they have an approach of doing distributed computing across GPUs on their own.)

MZ: It’s a good question. I think initially it was easier to write GPU code directly, to use CUDA and C++ and so on. And over time, actually, the community has been adding features to Spark that will make it easier to do that in there. So, there’s definitely been a lot of proposals and design to make GPU a first-class resource. There’s also this effort called Project Hydrogen which is to change the scheduler to support these MPI-like batch jobs. So hopefully it will become a good platform to do that, internally. I think one of the main benefits of that is again for users that they can either program in one programming language, they can learn just one way to deploy and manage clusters and it can do deep learning and the data preprocessing and analytics after that.

JS: That’s great. So, Spark – and Databricks as commercialized Spark – seems to be capable of doing many things in one place. But what is it not good at? Can you share some areas where people should not be stretching Spark?

MZ: Definitely. One of the things it doesn’t do, by design, is it doesn’t do transactional workloads where you have fine grained updates. So, even though it might seem like you can store a lot of data in memory and then update it and serve it, it is not really designed for that. It is designed for computations that have a large amount of data in each step. So, it could be streaming large continuous streams, or it could be batch, but it is not these point queries.

And I would say the other thing it does not do is it doesn’t have a built-in persistent storage system. It is designed so it’s just a compute engine and you can connect it to different types of storage and that actually makes a lot of sense, especially in the cloud, with separating compute and storage and scaling them independently. But it is different from, you know, something like a database where the storage and compute are co-designed to live together.

JS: That makes sense. What do you think of frameworks like Ray for machine learning?

MZ: There are a lot of new frameworks coming out for machine learning and it’s exciting to see the innovation there, both in the programming models, the interface, and how to work with it. So I think Ray has been focused on reinforcement learning which is where one of the main things you have to do is spawn a lot of little independent tasks, so it’s a bit different from a big data framework like Spark where you’re doing one computation on lots of data – these are separate computations that will take different amounts of time, and, as far as I know, users are starting to use that and getting good traction with it. So, it will be interesting to see how these things come about.

I think the thing I’m most interested in, both for Databricks products and for Apache Spark, is just enabling it to be a platform where you can combine the best algorithms, libraries and frameworks and so on, because that’s what seems to be very valuable to end users, is they can orchestrate a workflow and just program it as easily as writing a single machine application where you just import a bunch of libraries.

JS: Now, stepping back, what do you see as the most exciting applications that are happening in AI today?

MZ: Yeah, it depends on how recent. I mean, in the past five years, deep learning is definitely the thing that has changed a lot of what we can do, and, in particular, it has made it much easier to work with unstructured data – so images, text, and so on. So that is pretty exciting.

I think, honestly, for like wide consumption of AI, the cloud computing AI services make it significantly easier. So, I mean, when you’re doing machine learning AI projects, it’s really important to be able to iterate quickly because it’s all about, you know, about experimenting, about finding whether something will work, failing fast if a particular idea doesn’t work. And I think the cloud makes it much easier.

JS: Cloud AI is super exciting, I completely agree. Now, at Stanford, being a professor, you must see a lot of really exciting pieces of work that are going on, both at Stanford and at startups nearby. What are some examples?

MZ: Yeah, there are a lot of different things. One of the things that is really useful for end users is all the work on transfer learning, and in general all the work that lets you get good results with AI using smaller training datasets. There are other approaches, like weak supervision, that do that as well. And the reason that’s important is that for web-scale problems you have a lot of labeled data, so for something like web search you can solve it, but for many scientific or business problems you don’t have that, and so, how can you learn from a large dataset that’s not quite in your domain like the web and then apply to something like, say, medical images, where only a few hundred patients have a certain condition so you can’t get a zillion images. So that’s where I’ve seen a lot of exciting stuff.

But yeah, there’s everything from new hardware for machine learning where you throw away the constraints that the computation has to be precise and deterministic, to new applications, to things like, for example security of AI, adversarial examples, verifiability, I think they are all pretty interesting things you can do.

JS: What are some of the most interesting applications you have seen of AI?

MZ: So many different applications to start with. First of all, we’ve seen consumer devices that bring AI into every home, or every phone, or every PC – these have taken off very quickly and it’s something that a large fraction of customers use, so that’s pretty cool to see.

In the business space, probably some of the more exciting things are actually dealing with image data, where, using deep learning and transfer learning, you can actually start to reliably build classifiers for different types of domain data. So, whether it’s maps, understanding satellite images, or even something as simple as people uploading images of a car to a website and you try to give feedback on that so it’s easier to describe it, a lot of these are starting to happen. So, it’s kind of a new class of data, visual data – we couldn’t do that much with it automatically before, and now you can get both like little features and big products that use it.

JS: So what do you see as the future of Databricks itself? What are some of the innovations you are driving?

MZ: Databricks, for people not familiar, we offer basically, a Unified Analytics Platform, where you can work with big data mostly through Apache Spark and collaborate with it in an organization, so you can have different people, developing say notebooks to perform computations, you can have people developing production jobs, you can connect these together into workflows, and so on.

So, we’re doing a lot of things to further expand on that vision. One of the things that we announced recently is what we call machine learning runtime where we have preinstalled versions of popular machine learning libraries like XGBoost or TensorFlow or Horovod on your Databricks cluster, so you can set those up as easily as you could set up an Apache Spark cluster in the past. And then another product that we featured a lot at our Spark Summit conference this year is Databricks Delta, which is basically a transactional data management layer on top of cloud object stores that lets us do things like indexing, reliable exactly-once stream processing, and so on at very massive scale, and that’s a problem that all our users have, because all our users have to set up a reliable data ingest pipeline.

JS: Who are some of the most exciting customers of Databricks and what are they doing?

MZ: There are a lot of really interesting customers doing pretty cool things. So, at our conference this year, for example, one of the really cool presentations we saw was from Apple. So, Apple’s internal information security group – this is the group that does network monitoring, basically gets hundreds of terabytes of network events per day to process, to detect intrusions and information security problems. They spoke about using Databricks Delta and streaming with Apache Spark to handle all of that – so it’s one of the largest applications people have talked about publicly, and it’s very cool because the whole goal there – it’s kind of an arms race between the security team and attackers – so you really want to be able to design new rules, new measurements and add new data sources quickly. And so, the ease of programming and the ease of collaborating with this team of dozens of people was super important.

We also have some really exciting health and life sciences applications, so some of these are actually starting to discover new drugs that companies can actually productionize to tackle new diseases, and this is all based on large scale genomics and statistical studies.

And there are a lot of more fun applications as well. Like actually the largest video game in the world, League of Legends, they use Databricks and Apache Spark to detect players that are misbehaving or to recommend items to people or things like that. These are all things that were featured at the conference.

JS: If you had one advice to developers and customers using Spark or Databricks, or guidance on what they should learn, what would that be?

MZ: It’s a good question. There are a lot of high-quality training materials online, so I would say definitely look at some of those for your use case and see what other people are doing in that area. The Spark Summit conference is also a good way to see videos and talks and we make all of those available for free, the goal of that is to help and grow the developer community. So, look for someone who is doing similar things and be inspired by that and kinda see what the best practices are around that, because you might see a lot of different options for how to get started and it can be hard to see what the right path is.

JS: One last question – in recent years there’s been a lot of fear, uncertainty and doubt about AI, and a lot of popular press. Now – how real are they, and what do you think people should be thinking?

MZ: That’s a good question. My personal view is – this sort of evil artificial general intelligence stuff – we are very far away from it. And basically, if you don’t believe that, I would say just try doing machine learning tutorials and see how these models break down – you get a sense for how difficult that is.

But there are some real challenges that will come from AI, so I think one of them is the same challenge as with all technology which is, automation – how quickly does it happen. Ultimately, after automation, people usually end up being better off, but it can definitely affect some industries in a pretty bad way and if there is no time for people to transition out, that can be a problem.

I think the other interesting problem there is always a discussion about is basically access to data, privacy, managing the data, algorithmic discrimination – so I think we are still figuring out how to handle that. Companies are doing their best, but there are also many unknowns as to how these techniques will do that. That’s why we’ll see better best practices or regulations and things like that.

JS: Well, thank you Matei, it’s simply amazing to see the innovations you have driven, and looking forward to more to come.

MZ: Thanks for having me.

“When you’re doing machine learning AI projects, it’s really important to be able to iterate quickly because it’s all about experimenting, about finding whether something will work, failing fast if a particular idea doesn’t work. And I think the cloud makes it much easier.”

We hope you enjoyed this blog post. This being our first episode in the series, we are eager to hear your feedback, so please share your thoughts and ideas below.

The AI / ML Blog Team



How cloud AI can power a prosthetic arm that sees objects

Based on “Connected Arms”, a keynote talk at the O’Reilly AI Conference delivered by Joseph Sirosh, CTO for AI at Microsoft. Content reposted from this O’Reilly Media website.


There are over 1 million new amputees every year, i.e. one every 30 seconds – a truly shocking statistic.

The World Health Organization estimates that between 30 to 100 million people around the world are living with limb loss today. Unfortunately, only 5-15% of this population has access to prosthetic devices.

Although prostheses have been around since ancient times, their successful use has been severely limited for millennia by several factors, with cost being the major one. Although it is possible to get sophisticated bionic arms today, the cost of such devices runs into tens of thousands of dollars. These devices are just not widely available today. What’s more, having these devices interface satisfactorily with the human body has been a massive issue, partly due to the challenges of working with the human nervous system. Such devices generally need to be tailored to work with each individual’s nervous system, a process that often requires expensive surgery.

Is it possible for a new generation of human beings to finally help us break through these long-standing barriers?

Can prosthetic devices learn to adapt to us, as opposed to the other way around?

A Personalized Prosthetic Arm for $100?

In his talk, Joseph informs us about how, using the combination of:

  • Low-cost off-the-shelf electronics,
  • 3D-printing, and
  • Cloud AI, for intelligent, learned, personalized behavior,

it is now becoming possible to deliver prosthetic arms at a price point of around $100.

Joseph takes the smartARM as an example of such a breakthrough device. A prototype built by two undergraduate students from Canada who recently won the first prize in Microsoft’s Imagine Cup, the smartARM is 3D-printed, has a camera in the palm of its hand and is connected to the cloud. The magic is in the cloud, where a computer vision service recognizes the objects seen by the camera. Deep learning algorithms then generate the precise finger movements needed to grasp the object near the arm. Essentially, the cloud vision service classifies the object and generates the right grip or action, such as a pincer action to pick up a bunch of keys on a ring, or a palmar action to pick up a wineglass. The grip itself is a learned behavior which can be trained and customized.

The user of the prosthetic arm triggers the grip (or its release) by flexing any muscle of their choice on their body, for instance, their upper arm muscle. A myoelectric sensor located in a band that is strapped over that muscle detects the signal and triggers the grip or its release.

Simple, Adaptable Architecture

The architecture of this grip classification solution is shown below. The input to the Raspberry Pi on the smartARM comes from the camera and the muscle sensor. These inputs are sent to the Azure Custom Vision Service, an API in the cloud that has been trained on grip classifications and is able to output the appropriate grip. This grip is sent back to an Arduino board in the smartARM, which can then trigger the servo motors that realize the grip in the physical world, i.e. as soon as the smartARM gets the signal to do so from the muscle sensor.

This is an adaptable architecture. It can be customized to the kinds of movements you want the arm to generate. For instance, the specific individual using this prosthetic can customize the grips for the objects in their daily life which are the ones they care the most about. The muscle-sensor-based trigger could be replaced with a speech trigger, if so desired.
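A hedged sketch of that loop in Python: a camera frame is sent to a Custom Vision prediction endpoint, and the returned grip label is forwarded to the Arduino when the muscle sensor fires. The endpoint URL, key, serial port and grip labels are placeholders, not the smartARM team’s actual code.

```python
# Hedged sketch only: camera frame -> Azure Custom Vision prediction ->
# grip command to the Arduino. Endpoint, key, serial port and grip labels
# are placeholders; requires the requests and pyserial packages.
import requests
import serial

PREDICTION_URL = ("https://<region>.api.cognitive.microsoft.com/customvision/v3.0/"
                  "Prediction/<project-id>/classify/iterations/<iteration>/image")
PREDICTION_KEY = "<prediction-key>"
arduino = serial.Serial("/dev/ttyUSB0", 9600)   # board driving the servo motors

def classify_grip(image_bytes):
    """Send a camera frame to Custom Vision and return the top grip label."""
    resp = requests.post(
        PREDICTION_URL,
        headers={"Prediction-Key": PREDICTION_KEY,
                 "Content-Type": "application/octet-stream"},
        data=image_bytes,
    )
    resp.raise_for_status()
    predictions = resp.json()["predictions"]
    return max(predictions, key=lambda p: p["probability"])["tagName"]

def on_muscle_signal(image_bytes):
    """When the myoelectric band fires, pick the grip and send it to the servos."""
    grip = classify_grip(image_bytes)            # e.g. "pincer" or "palmar"
    arduino.write((grip + "\n").encode())
```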

Summary

AI is empowering a new generation of developers to explore all sorts of novel ideas and mashups. Through his talk on “Connected Arms”, Joseph shows us how the future of prosthetic devices can be transformed by the power of the cloud and AI. Imagine a world in which all future assistive devices are empowered with AI in this fashion. Devices would adapt to individuals, rather than the other way around. Assistive devices will become more affordable, intelligent, cloud-powered and personalized.

Cloud AI is letting us build unexpected things that we would scarcely have imagined.

Such as an arm that can see.

The AI / ML Blog Team