Insights-Based Decision Making: Julia Ling, Citrine Informatics

This blog is an Innovation Insights video podcast transcript. In this episode of Innovation Insights, Emerald’s Sector Specialist Graham Carey discusses ‘Materials informatics’ with Julia Ling, Chief Technology Officer at Citrine Informatics.


Graham Carey: Hi, I’m Graham Carey from Emerald Technology Ventures. Welcome to Innovation Insights. With me today is Dr. Julia Ling from Citrine Informatics. Julia is the CTO of Citrine and is a proven pioneer in the use of data-driven technology for scientific and engineering applications, particularly in the materials science arena. At Citrine, she leads the data science, data engineering, software engineering and security teams, and she also led the development of Citrine’s IP, including patented technology that applies AI to make strategic R&D project decisions. Prior to Citrine, Julia was a Harry S. Truman Fellow at Sandia National Labs, where she led multiple research projects applying AI to the physical sciences, so she has deep interest and deep knowledge in this space. Welcome, Julia, and thanks so much for your presentation at the European Venture Fair. It was a fantastic talk on Citrine’s interests and activities in this space. You participated in a session we called Digitalizing Chemical Innovation. For those new to the concept of materials informatics, can you remind us how you define MI and how it differs from what’s been done in the past?

Julia Ling: Thanks, and thanks for having me, Graham. The key idea of materials informatics is to bring data into the picture, to spark new insights and make new connections. Many electronic lab notebooks and LIMS systems today get used mainly for compliance or reference, but not necessarily for forward-looking analysis. In materials informatics, you want to bring together all the relevant sources of information: an individual researcher’s experimental data and domain expertise, along with all of their colleagues’ and institutional data and domain expertise. You then use that aggregated information to look forward at what should come next: what experiment, what project, what R&D investment. And there are two main goals for materials informatics: increasing innovation and increasing agility. Bringing together all this information and using AI approaches can make new connections and uncover new opportunities for innovation. In terms of agility, materials informatics lets you be much more responsive to changing market conditions or supply chain disruptions or urgent customer requirements, because it lets you build an AI workflow that can process those new requirements and help you respond quickly. At Citrine, the company where I work, we’ve built a materials informatics platform to help companies realize both of those benefits.

Graham: So materials informatics seems to offer a really compelling value proposition here, at a time when, like you say, innovation is increasingly time and resource intensive. This platform can help you identify new materials and products, and, as you indicate, it can also help you determine the most effective directions for new development. Why has this approach, this technology, not already become the norm across the board with large and medium chemical companies? What do you see as the key barriers to broader adoption in this space?

Julia: That’s a great question, Graham. Over the last few years, I have seen a shift in the materials and chemicals industry, where materials informatics is becoming more and more mainstream. But I think there are a few reasons why it’s taken so long. The first is that when these companies first start their digital initiatives, they often focus on really well-trodden ground first: they apply data science to business analytics and supply chain management. And so R&D doesn’t make it to the top of the list, even though it’s where they really stand to gain the most in terms of increasing market share. The second reason is that materials informatics hasn’t always been easy to adopt. Data structuring and AI modelling are one side of the coin, and the other side of the coin is change management in the R&D org. One of our main focuses over the last year at Citrine has been trying to streamline that initial adoption process through phased data onboarding and hands-on training, so that teams can adopt and start to see value much more quickly, see value along the way, and not feel like they have to boil the ocean in terms of data management.

Graham: It’s interesting, you mentioned this onboarding process and the need to get your data into the system. That seems to be one of the areas where we’ve seen, at Emerald, a lot of the barriers, both in terms of the willingness and the comfort of some of these big companies to adopt these sorts of technologies. There’s this “We don’t want our data getting out” concern from many of these companies. Do you see something similar, and how does Citrine approach balancing the need for this data, obviously, to build strong models and strong recommendations, while avoiding data leakage or IP leakage, which could be really detrimental to your customers?

Julia: Absolutely. Yeah, data security is one of the key factors in materials informatics. In many cases, we’re talking about the crown jewels for these companies, information that forms a key part of their competitive advantage. So of course, they want to be careful with it. And it’s really important that as companies invest in digital initiatives, they think about data security for their team, as well as for any vendor they work with. One of the trends I’ve noticed, though, is that more and more companies are realizing that on-prem is not actually more secure than using AWS or Azure or other cloud services that have really invested in having serious, industry-leading security practices. At Citrine, we take this really seriously as well. Our platform is built on top of AWS, and each customer gets a separate deployment of our platform in its own secure environment, so that there’s no chance of data leakage between customers.

Graham: Now, Citrine was founded, I think, correct me if I’m wrong, back in 2013. You’re, by this point, an old hand, an established name in this space. But in recent years in particular, we’re seeing a growing number of other players entering this space, both smaller companies and also some of the larger companies starting to try to do this on their own, internally. If I was leading innovation at some materials company today and I was leading a new thrust to adopt this kind of materials informatics technology, what are the big differentiating factors that I should look for in a successful platform and a successful approach?

Julia: Sure. There are two main things I’d look for. First, it should be materials-specific. A generic AI platform would not be able to work effectively on the sparse, limited data that are typical in materials development applications. It’s incredibly important that the materials developers be able to combine their domain expertise with their data, to make the most efficient use possible of that limited data, and it’s really hard to do that without a platform tailored to materials and chemicals. The key thing I’d look for in this case is how extensively the platform lets researchers embed their scientific domain knowledge into their AI workflows. Second, it should be scalable. By that, I mean it shouldn’t just give you success on a single project with a single all-star team. You can have a successful single project if you cherry-pick your best team for it, and they use all custom-built in-house code. But that doesn’t let you transform your entire organization. For that, you need a platform that lets all of your scientists benefit from data-driven workflows and that lets them share their data, their AI models and their domain expertise across an organization, with clear access controls and best-in-class data security.

Julia: Key things to look for in terms of scalability are advanced access controls that allow sharing of data and models with specific teams or projects, user interfaces that are intuitive for your scientists, and data management that contextualizes data, so that people can actually understand where the data came from, even if they weren’t the ones who compiled it in the first place. Those are some of the key factors in assessing scalability of a platform across your organization.

Graham: Perfect. Thanks so much. Dr. Ling, thanks again for your time and your expertise, and for helping us understand a little better the emerging and, I think, fast-growing space of materials informatics and how it can really change chemical and materials innovation for years to come. Thanks.

Julia: Thanks for having me. It was great to talk to you about it.