Data Science Takes on Cancer
Dr. Lena Maier-Hein is the Head of the Division of Computer Assisted Medical Interventions at the German Cancer Research Center (DKFZ). Her team of roughly twenty international researchers works to improve interventional healthcare through data collection and analysis, approaching medicine cross-disciplinarily from the perspectives of machine learning, biophotonics, knowledge modeling, and evidence-based medicine. In this interview she chats with us about computer-aided surgery, data access and privacy, and more.
DWIH: Thanks for taking the time to chat, Dr. Maier-Hein. Your personal background is in computer science at Karlsruhe Institute of Technology (KIT), and you are now working in the field of medical technologies at the DKFZ. Could you explain to us why this intersection of medicine – specifically cancer research – and technology fascinates you, and what your reasons were for specializing in this field?
Maier-Hein: With a research area at the intersection of computer science and medicine, I found exactly the job I was always looking for. I don’t have to be a health worker but can leverage the talents and skills that I have – which are more on the technical side – to contribute to improving healthcare. With cancer care as a field of application, I have also found a domain of especially high socioeconomic relevance.
DWIH: Your research in the Division of Computer Assisted Medical Interventions aims to provide physicians with “the right data at the right time.” In what ways has this field of research expanded with advances in big data, machine learning and artificial intelligence? How can data “revolutionize” clinical interventions?
Maier-Hein: Surgery has undergone revolutionary changes in the past. Examples are the introduction of anesthesia and antiseptics in the 19th century or the introduction of minimally invasive surgery in the 20th century. I believe that the next paradigm shift will be triggered by data science. The ability to automatically record and interpret everything occurring within and around the treatment process, and to link decisions with outcomes, will enable us to move from subjective to objective decision-making and towards truly context-aware assistance in the operating room. Clinical applications are manifold – ranging from surgical training, to choosing the best treatment option for a patient in a personalized manner, all the way to robot-assisted therapy. The key to success is a technical infrastructure that paves the way for clinical success stories. Within the scope of the Surgical Oncology program of the National Center for Tumor Diseases (NCT) Heidelberg, we are currently working on such a research-enabling infrastructure.
DWIH: Could you illustrate the current focus of your research by giving an example of a project you have worked on or are working on?
Maier-Hein: Media reports in the past few years have been full of data science success stories: algorithms outperforming radiologists, cardiologists, dermatologists – you name it. But if you look closely, such success stories are lacking in surgery. While there are many reasons for this (as discussed in our Nature Biomedical Engineering 2017 paper), the lack of access to masses of annotated training data is clearly the core bottleneck. Judging from the high-impact publications in radiological data science, algorithms have often been trained on hundreds of thousands – if not millions – of data sets. Publications in the field of surgical data science work with data sets that are orders of magnitude smaller. My team has thus been working on methods to address this bottleneck.
In our European Research Council (ERC)-funded project COMBIOSCOPY, for example, we developed a deep learning-based approach to perfusion monitoring with the mission of making cancer therapy safer in the future. In collaboration with clinicians from the University Clinic Heidelberg and engineers from Imperial College London, we developed a special imaging device that can acquire so-called multispectral images of the tissue during minimally invasive surgery. Our hypothesis was that a machine learning-based algorithm would be able to convert these high-dimensional multispectral data into clinically relevant tissue properties, such as tissue oxygenation. The trouble was that we had no ground truth labels with which to train the algorithm. We tackled this bottleneck by leveraging all our prior knowledge of tissue properties, as well as of the principles by which light interacts with tissue, to generate masses of simulated data. These simulated training data, with perfect ground truth labels of tissue properties, were then used to train our deep learning algorithm. In collaboration with Prof. Dogu Teber from Karlsruhe, this concept is now being tested in clinical studies in the context of ischemia detection during partial kidney resection.
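To make the simulation-based training idea concrete, here is a minimal, hypothetical sketch in Python/PyTorch: a toy "simulator" draws tissue oxygenation values (the perfect ground-truth labels) and produces oxygenation-dependent spectra, and a small network is trained to regress oxygenation from those spectra. The band count, network architecture, and the simplistic spectral model are illustrative assumptions, not the actual COMBIOSCOPY pipeline, which relies on physics-based light-transport simulation.

```python
# Hypothetical sketch: train a regressor on simulated multispectral pixels
# (spectrum -> tissue oxygenation). All numbers and the toy "simulator"
# below are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn

N_BANDS = 16        # assumed number of multispectral bands
N_SAMPLES = 20_000  # number of simulated training pixels

def simulate_spectra(n, rng):
    """Toy stand-in for a physics-based light-tissue simulation.

    Draws an oxygenation value (ground-truth label) and produces a noisy,
    oxygenation-dependent spectrum. A real pipeline would use Monte Carlo
    photon-transport simulation instead of this synthetic shape.
    """
    sto2 = rng.uniform(0.0, 1.0, size=(n, 1))              # oxygenation in [0, 1]
    bands = np.linspace(0.0, 1.0, N_BANDS)[None, :]
    spectra = np.exp(-(bands - sto2) ** 2 / 0.05)           # label-dependent shape
    spectra += rng.normal(scale=0.01, size=spectra.shape)   # measurement noise
    return spectra.astype(np.float32), sto2.astype(np.float32)

rng = np.random.default_rng(0)
X, y = simulate_spectra(N_SAMPLES, rng)

# Small regression network: spectrum in, oxygenation estimate out.
model = nn.Sequential(
    nn.Linear(N_BANDS, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1), nn.Sigmoid(),   # oxygenation is a fraction in [0, 1]
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

X_t, y_t = torch.from_numpy(X), torch.from_numpy(y)
for epoch in range(5):
    for i in range(0, N_SAMPLES, 256):
        xb, yb = X_t[i:i + 256], y_t[i:i + 256]
        opt.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        opt.step()

# At inference time, real multispectral pixels recorded during surgery would
# be fed through the trained model to estimate oxygenation per pixel.
```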
DWIH: Many scholars in your field have cited limited data access as a hindrance to their research. Can you speak to this?
Maier-Hein: I can only second that. According to an international poll that we initiated last summer, no commonly recognized surgical data science success stories exist to date. The vast majority of researchers regard the lack of large amounts of annotated training data as the main reason. In fact, some of the relevant data – endoscopic video data, for example – is not even recorded in clinical routine. I am therefore a big fan of the OR Black Box® project initiated by Dr. Teodor Grantcharov. The OR Black Box® was introduced in analogy to the black box in airplanes and is essentially a platform that allows healthcare professionals to identify, understand, and mitigate risks that impact patient safety. It combines input from video cameras, microphones, and other sensors with human and automated processing to produce insights from the operating room that lead to improved efficiency and fewer adverse events. I am very grateful to have become an affiliated professor at the Li Ka Shing Knowledge Institute of St. Michael’s Hospital (Toronto), as I can now work on some of the exciting data science aspects with the team.
DWIH: Conversely, medical data is some of the most private personal data. What suggestions do you have for protecting valuable medical data while allowing data science to continue to improve diagnosis and therapy?
Maier-Hein: There are a lot of creative methods out there these days. One of the most promising directions, from my point of view, is federated learning. Technically speaking, federated learning is a machine learning concept in which algorithms are trained in a decentralized manner. Data samples are held locally – in our case in hospitals – and do not need to be exchanged across sites. Hence, rather than bringing the data to the algorithms, you do the opposite: you take the algorithms to the data, so that the data never have to leave the hospital. Knowledge about the data is then implicitly represented in the models that were trained on them. This research field is in its infancy, and many issues remain to be addressed, but I think it’s definitely worth pursuing.
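To illustrate the concept, here is a minimal, hypothetical sketch of federated averaging in Python: each "hospital" trains a simple linear model on its own synthetic data, and only the model weights – never the raw data – are sent back and averaged into a global model. The hospitals, data, and hyperparameters are illustrative assumptions; real deployments add secure aggregation and further privacy safeguards.

```python
# Minimal federated-averaging sketch with a linear model and synthetic
# per-hospital data. Everything here is illustrative, not a clinical system.
import numpy as np

rng = np.random.default_rng(42)
true_w = np.array([2.0, -1.0, 0.5])

def make_local_data(n):
    """Synthetic stand-in for one hospital's private dataset (never shared)."""
    X = rng.normal(size=(n, 3))
    y = X @ true_w + rng.normal(scale=0.1, size=n)
    return X, y

# Three hypothetical hospitals with differently sized local datasets.
hospitals = [make_local_data(n) for n in (200, 500, 300)]

def local_update(w, X, y, lr=0.05, epochs=5):
    """One hospital trains the shared model on its own data only."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)   # gradient of mean squared error
        w -= lr * grad
    return w

w_global = np.zeros(3)
for round_ in range(20):
    # Each site receives the current global model and returns updated weights;
    # raw patient data never leaves the hospital.
    local_weights = [local_update(w_global, X, y) for X, y in hospitals]
    sizes = np.array([len(y) for _, y in hospitals], dtype=float)
    # Weighted average of the locally trained models (federated averaging).
    w_global = np.average(local_weights, axis=0, weights=sizes)

print("learned:", np.round(w_global, 2), "true:", true_w)
```

The design choice captured here is exactly the one described above: the algorithm travels to the data, and only aggregated model knowledge travels back.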