Containerization: The New Standard for Reproducible Scientific Computing

Containers resolve deployment and reproducibility issues in Life Science computing.

Bioinformatics software and scientific computing applications are crucial parts of the Life Science workflow. Researchers increasingly depend on third-party software to generate insights and advance their research goals.

These third-party software applications typically undergo frequent changes and updates. While these updates may improve functionalities, they can also impede scientific progress in other ways.

Research pipelines that rely on computationally intensive methodologies are often not easily reproducible. This is a significant challenge for scientific advancement in the Life Sciences, where replicating experimental results – and the insights gleaned from analyzing those results – is key to scientific progress.

The Reproducibility Problem Explained 

For Life Science researchers, reproducibility falls into four major categories:

Direct Replication is the effort to reproduce a previously observed result using the same experimental conditions and design as an earlier study.

Analytic Replication aims to reproduce scientific findings by subjecting an earlier data set to new analysis.

Systemic Replication attempts to reproduce a published scientific finding under different experimental conditions.

Conceptual Replication evaluates the validity of an experimental phenomenon using a different set of experimental conditions.

Researchers face challenges in some of these categories more than others. Improving training and policy can help make direct and analytic replication more accessible. Systemic and conceptual replication are significantly harder to address effectively.

These challenges are not new. They have been hampering research efficiency for years. In 2016, Nature published a survey of more than 1,500 researchers in which more than 70% reported having tried and failed to reproduce another scientist’s experiments.

There are multiple factors responsible for the ongoing “reproducibility crisis” facing the life sciences. One of the most important challenges scientists need to overcome is the inability to easily assemble software tools and their associated libraries into research pipelines.

This problem doesn’t fall neatly into one of the categories above, but it impacts each one of them differently. Computational reproducibility forms the foundation that direct, analytic, systemic, and conceptual replication techniques all rely on.

Challenges to Computational Reproducibility 

Advances in computational technology have enabled scientists to generate large, complex data sets during research. Analyzing and interpreting this data often depends heavily on specific software tools, libraries, and computational workflows.

It is not enough to reproduce a biotech experiment on its own. Researchers must also reproduce the original analysis, using the computational techniques that previous researchers used, and do so in the same computing environment. Every step of the research pipeline has to conform with the original study in order to truly test whether a result is reproducible or not.

This is where advances in bioinformatic technology present a bottleneck to scientific reproducibility. Researchers cannot always assume they will have access to (or expertise in) the technologies used by the scientists whose work they wish to reproduce. As a result, achieving computational reproducibility turns into a difficult, expensive, and time-consuming experience – if it’s feasible at all.
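One small piece of this problem, knowing exactly which software versions an analysis ran against, can be illustrated with a short sketch. The Python example below is only an illustration under stated assumptions: the function names (capture_environment, fingerprint, diff_packages) are hypothetical, and it covers Python packages only, not system libraries or other tools a real pipeline would depend on.

```python
import hashlib
import json
import platform
from importlib import metadata


def capture_environment() -> dict:
    """Record the interpreter, OS, and installed Python package versions."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that lack a name
            packages[name.lower()] = dist.version
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
    }


def fingerprint(manifest: dict) -> str:
    """Hash a manifest so two environments can be compared at a glance."""
    canonical = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def diff_packages(original: dict, current: dict) -> dict:
    """Return packages whose versions differ, as name -> (original, current)."""
    a, b = original["packages"], current["packages"]
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

A manifest saved alongside published results would let a later researcher run diff_packages against their own environment and see which versions have drifted. It does not recreate the original environment, which is the gap containers are meant to close.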

How Containerization Enables Reproducibility 

Put simply, a container consists of an entire runtime environment: an application plus all the dependencies, libraries, other binaries, and configuration files needed to run it, bundled into one package. By containerizing the application platform and its dependencies, differences in OS distributions and underlying infrastructure are abstracted away.

If a researcher publishes experimental results and provides a containerized copy of the application used to analyze those results, other scientists can immediately reproduce those results with the same data. Likewise, future generations of scientists will be able to do the same regardless of upcoming changes to computing infrastructure.

Containerized experimental analyses enable life scientists to benefit from the work of their peers and contribute their own in a meaningful way. Packaging complex computational methodologies into a unique, reproducible container ensures that any scientist can achieve the same results with the same data.
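One way to check the "same results with the same data" claim in practice is to compare a checksum of a regenerated output file against one published with the original study. The sketch below is a minimal illustration, not a prescribed workflow: the helper names (sha256_of_file, matches_published) are hypothetical, and it assumes the analysis is deterministic, so that rerunning it inside the same container yields byte-identical output.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large result files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, published_checksum: str) -> bool:
    """Compare a regenerated result file with a published SHA-256 checksum."""
    return sha256_of_file(path) == published_checksum.strip().lower()
```

Analyses that use random sampling or produce timestamps will not match byte for byte even in an identical container, so a comparison like this only applies once such sources of variation have been fixed or removed.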

Bringing Containerization to the Life Science Research Workflow

Life Science researchers will only enjoy the true benefits of containerization once the process itself is automatic and straightforward. Biotech and pharmaceutical research organizations cannot expect their researchers to manage software dependencies, isolate analyses away from local computational environments, and virtualize entire scientific processes for portability while also doing cutting-edge scientific research.

Scientists need to be able to focus on the research they do best while resting assured that their discoveries and insights will be recorded in a reproducible way. Choosing the right technology stack for reproducibility is a job for an experienced biotech IT consultant with expertise in developing R&D workflows for the biotech and pharmaceutical industries.

RCH Solutions helps Life Science researchers develop and implement container strategies that enable scalable reproducibility. If you’re interested in exploring how a container strategy can support your lab’s ability to grow, contact our team to learn more.

RCH Solutions is a global provider of computational science expertise, helping Life Sciences and Healthcare firms of all sizes clear the path to discovery for nearly 30 years. If you’re interested in learning how RCH can support your goals, get in touch with us here.

Once Upon A Time….

One of my favorite communications leaders and public speakers is Conor Neill. 

Among the many presentations he’s given on effective communication is a particularly popular speech on, of all topics, “How to Start a Speech.”

Paradoxically, he begins by telling you what not to do and then offers this very simple and practical advice: Speak like you’re talking to a child. From Neill’s perspective, perhaps beginning your talk with one of the most recognizable lines from almost any classic children’s story … “Once Upon A Time” … not only gets your audience’s attention but creates a sense of anticipation for the story that is about to unfold before them.

The reason for this, I offer, is that a story is not only easy to understand, it also, in the best examples, transports you to a place other than where you are. It suspends time, takes you on a journey, and makes you believe.

Think of history’s greatest storytellers: Mark Twain, Walt Disney, Oprah Winfrey, Steve Jobs, Warren Buffett. You may not enjoy or believe what some of them had to say but, you have to admit, they are great storytellers. You understand the message they’re trying to deliver. This is especially true of Buffett, who has a great ability to take a complicated financial topic and explain it in terms that (almost) everyone can understand.

After working in this business for more than 25 years and leading RCH Solutions for the last 15 of those, I’m a firm believer in telling a story. Why? Because despite our tenure in the industry, I’m often asked “what exactly does RCH Solutions do?” 

The challenge is not that I’m unable to articulate our services or value, but rather that RCH is in a unique business, one that supports a relatively narrow audience and solves some very specific and challenging problems. So, the answer requires some level of explanation even to our ideal customers. And many times, the explanation is not brief and requires detail that demonstrates the very specific problem in a very specific market that we aim to solve. Besides, would you rather listen to a solution pitch or hear a story about a customer, just like you, who experienced the same challenge and then realized a great outcome?

If RCH made software or technology (which, by the way, we don’t), the answer would be much simpler and likely the same no matter who’s asking.

But how do you present or deliver a complex answer in a concise manner, to one individual or perhaps more, in a way they can understand? 

Like Neill says, the answer is to tell a story. So here goes. 

Once upon a time, there was a kind and wise young man, David, who was sent to disarm the mighty and powerful Goliath. Against all odds, the seemingly inferior David defeated the battle-wise behemoth by hurling a stone at the center of his head. Through a cunning combination of skill, agility, speed, and the effective use of tools, he proved that the things many believe to be an advantage, like size, actually have little to do with ability.

And what does this have to do with RCH Solutions, you might ask? I’ll tell you. 

Although RCH is in a unique market serving a specific customer base, we do battle competition as well (or at least other vendors who are believed to offer similar products or solutions). Often, we are the challenger: the David to the incumbent Goliath. And what lessons have we learned from that story?

  1. Speed and Agility Beat Size
  2. Focus on a Specific Area
  3. Not all Rules Apply
  4. Embrace Emerging Technologies

The Stories that Matter Most to Our Customers

At RCH Solutions, we have a top-notch marketing group. They have prepared some terrific material to tell the story of RCH: what we do, what we offer, and, most importantly, the value the customer will see from our services. Like most companies, we have pitch decks to tell our story. After all, customers expect you to have material. They expect you to have slides and slicks and sales pieces that help prove your value to internal stakeholders.

However, whenever we have a chance to speak with prospective customers (who, by the way, are some of the smartest people on the planet), I always ask the same questions. Would you rather have material about what we do? Or would you like to hear a story? A story of how RCH helped a customer, just like you, solve a very specific challenge?

Every single time, they choose a story. 

So I tell them about a time we tackled a challenge like theirs, improved an outcome similar to what they hope to realize, or finished a project started by some other vendor who had the size but none of the skill to actually get it done. Pulling from our experience, I can help our prospective customers understand what we do through their lens, helping them see why our service would be valuable to them.

If a vendor can’t tell a story, then the book isn’t yet finished. Wouldn’t you rather hear a story with a great ending?

Let RCH tell you a story. It begins with “Once upon a time in Scientific Computing….”
