Get to know Erin LeDell, Chief Scientist at Distributional

Scott Clark
November 20, 2024

We’re excited to announce that Erin LeDell has joined Distributional as Chief Scientist. As the creator of the world’s first open-source enterprise AutoML algorithm and a co-founder of the industry benchmark for AutoML systems, AMLB, Erin brings unparalleled expertise to the company. With over a decade of experience developing, designing, and shipping AI software, Erin will enable Distributional to fulfill our mission of helping our customers build reliable AI through testing.

We sat down with Erin for a brief Q&A about her previous work with AutoML, why she decided to join Distributional instead of building her own company, and what’s next for her at Distributional.

How would you describe your role at Distributional?

As Chief Scientist, my role is a mix of scientific leadership and product strategy. Drawing on my experience designing and developing AI software, I’m collaborating closely with the go-to-market, engineering, product, research, and customer success teams to help ensure that we’re building a product that is an invaluable component of the modern AI stack. I’m also using my expertise in AI evaluation and statistical analysis to help improve the depth and breadth of the product. Overall, I’m eager to contribute to making AI more reliable and trustworthy, which has been a long-standing interest of mine.

What was your background prior to joining Distributional?

I’ve spent more than a decade in the AI evaluation and benchmarking space, and I co-created the industry benchmark for AutoML systems, AMLB. Today, large commercial AI labs such as Amazon and Microsoft continue to use AMLB to evaluate their AutoML systems, and the benchmark has been a major driver of the improvements in performance and reliability of AutoML systems over the past half decade. In 2021, I delivered a keynote at NeurIPS about benchmarking AI algorithms based on my work in this area.

I spent eight years as Chief Machine Learning Scientist at the enterprise AI software company H2O.ai. I was an early employee, so I had the opportunity to work on a wide variety of projects over many years that spanned the entire AI stack. For example, I led scientific and product efforts on the open source enterprise ML platform H2O, which has been adopted by numerous Fortune 500 companies for critical applications in high-impact production environments. I also developed H2O’s model explainability suite and integrated novel statistical bias detection and fairness algorithms into the H2O platform. Building on my PhD research, I created the world’s first open-source enterprise AutoML algorithm, H2O AutoML, a pioneering platform that remains one of the most widely used AutoML solutions to date.

More recently, I spent the past year advising companies on building GenAI products through my consulting firm, DataScientific, Inc., and spent a lot of time researching and talking to founders and investors in the AI eval and testing space. My passion for meticulously and robustly quantifying the behavior of machine learning algorithms, and for instrumenting that as software, is what led me to Distributional’s innovative AI testing platform. I was in the process of founding a company in this area when I reconnected with Scott, and I was so impressed by the product differentiation and vision that I decided to join forces with Distributional instead of building my own company.

What appealed to you about the Distributional team?

I know Scott from the early days of SigOpt, his first startup, which was later acquired by Intel. I have huge respect for his experience and knowledge of the AI industry. I value working with a seasoned leadership team who, like me, have been through the first wave of the enterprise AI software industry in the 2010s. There were many critical lessons learned in that era that we need to bring into the next wave of AI software, for example the importance of having trust in your models, not just in development, but on an ongoing basis in production. I wanted to join a team that is experienced in developing enterprise AI software, has a strong vision, a strategic plan, and the ability to execute, and is deeply technical yet user-focused. I found that in Distributional.

What appealed to you about Distributional’s product?

As I was doing product research and advising companies over the past year, I became very familiar with the products in the AI eval, guardrails, and observability space, but I felt that there was a lack of depth in the solutions available. I immediately appreciated the statistically rigorous approach that Distributional takes to the AI testing problem. Distributional addresses the new challenges that arise with AI applications, especially with applications built with LLMs—they are non-stationary, non-deterministic, and complex, multi-component software systems.
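To make that concrete, here’s a rough sketch of the general idea (an illustration of distributional testing in general, not Distributional’s actual product or API). Because a non-deterministic system won’t reproduce exact outputs, you instead test whether the distribution of some output property has shifted between a baseline run and a new run, for example with a two-sample Kolmogorov–Smirnov test. The property and data below are simulated stand-ins.

```python
# Illustrative only: compare the distribution of an output property
# (e.g., response length) across two runs of a non-deterministic system,
# instead of asserting exact outputs. The data here is simulated.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
baseline = rng.normal(loc=120, scale=15, size=500)   # property values from a baseline run
candidate = rng.normal(loc=135, scale=15, size=500)  # same property after, say, a model upgrade

stat, p_value = ks_2samp(baseline, candidate)
if p_value < 0.01:
    print(f"Distribution shift detected (KS statistic={stat:.3f}, p={p_value:.2g})")
else:
    print("No significant shift detected")
```

The shape of the test is the point: output properties become random variables, and pass/fail becomes a statistical decision rather than an exact string comparison.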

Distributional’s approach is unique, and it became clear to both Scott and me that we should work together on building the flagship enterprise platform for AI testing. We both value depth and expertise in a market full of broad yet shallow platforms trying to do too much. A company that is entirely focused on AI testing is much more likely to produce a best-in-class testing product than one trying to be the whole AI stack, or another that’s primarily focused on model development, debugging, or logging, for example.

Why does AI testing matter?

Rigorous testing provides confidence in AI models, which is far more valuable than knowing your model overfit some performance metric on a benchmark, and I say this as someone who built many of those benchmarks! I have long been an advocate of increasing the fairness and safety of models, as I’ve seen many ways that unpredictable, unfair, or unsafe models can cause real harm to people, not to mention financial loss to the companies deploying them.

Looking ahead, it’s crucial for companies to leverage AI as a strategic advantage to stay ahead of competitors and drive long-term value. AI is here to stay and will only play an increasing role in company operations, as well as in B2B and consumer products. Given that, I am very motivated to make sure that there is sufficient tooling to minimize harms of all kinds. Infusing GenAI models into already complex AI software systems increases complexity and stochasticity, so robust AI testing has never been more important than it is right now.

What’s a misconception around AI that you wish could change?

A common misconception about AI is the belief that it is intelligent in the way humans are. People may assume that because AI can generate impressive language or solve complex problems, there is thought or awareness behind it. In reality, AI is a pattern-matching system that processes data and makes predictions based on statistical relationships. Even chain-of-thought reasoning, which allows GenAI models to break down complex tasks into a series of intermediate steps, only mimics reasoning.

The misconception that AI is “intelligent” often leads to unrealistic expectations about its capabilities, particularly when it comes to reliability. In reality, AI models can be susceptible to biases and errors, especially in novel or unfamiliar scenarios. As tasks and models get more complex, these issues can propagate and cause unwanted or even dangerous behavior. To mitigate these risks, comprehensive testing, validation, and human oversight are crucial to ensuring AI systems meet the required standards for real-world deployment and behave as expected even in unanticipated situations.

When you think about potential impacts of AI, what’s the future you want to see become real?

My dream for AI is that it’s used in a way that brings society closer together instead of further apart. I’m really excited about the potential for AI applications to improve medicine and human health. I think that in the near future it will be possible to access an extremely knowledgeable medical AI that can help with diagnostics and recommend and improve treatment regimens. I would love to see consumer medical products that empower people to take charge of their own healthcare.

I also think using AI to do research and design products will accelerate all industries across the board. My hope is that we can collectively decide that trust in AI models is paramount, and prioritize the safety mechanisms that make that trust possible. Our ability to rely on AI is only as good as our trust in it, and we need better testing tools like Distributional to get there.

Last but not least, what’s your favorite AI model or algorithm?

In traditional ML, Stacked Ensembles will probably always be my favorite algorithm, as it’s the primary machine learning algorithm I worked with during my PhD and the foundation of H2O AutoML. We’re also seeing ensembles play a larger role in LLMs. For LLMs, I like to stay current with the cutting-edge commercial models, and I use Llama 3 and its derivatives locally.
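For readers who haven’t used it, here is a minimal H2O AutoML sketch based on its public Python API (the dataset and target column follow H2O’s own documentation example; treat the details as illustrative). AutoML trains a cross-validated set of base models and then fits Stacked Ensembles over them, which typically end up at the top of the leaderboard.

```python
# Minimal H2O AutoML sketch (requires the `h2o` package and a Java runtime).
# Dataset and target column follow H2O's public docs example; adjust for your data.
import h2o
from h2o.automl import H2OAutoML

h2o.init()
train = h2o.import_file(
    "https://s3.amazonaws.com/h2o-public-test-data/smalldata/higgs/higgs_train_10k.csv"
)
y = "response"
train[y] = train[y].asfactor()  # treat the target as a binary class label

aml = H2OAutoML(max_models=10, seed=1)  # trains base models, then Stacked Ensembles
aml.train(y=y, training_frame=train)
print(aml.leaderboard.head())           # ensembles usually lead the board
```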
