HPC Migration in the Cloud: Getting it Right from the Start
High-Performance Computing (HPC) has long been an incredible accelerant in the race to discover and develop novel drugs and therapies for both new and well-known diseases. And an HPC migration to the Cloud might be your next step to maintain or grow your organization’s competitive advantage.
Whether it’s a full HPC migration to the Cloud or a uniquely architected hybrid approach, evolving your HPC ecosystem to the Cloud brings critical advantages, including:
- Flexibility and scalability
- Optimized costs
- Enhanced security
- Compliance
- Backup, recovery, and failover
- Simplified management and monitoring
And with careful planning, strategic design, effective implementation, and the right support, migrating your HPC systems to the Cloud can lead to truly accelerated breakthroughs in drug discovery.
But with this level of promise and performance come challenges and caveats that require strategic consideration throughout all phases of your supercomputing and HPC development, migration, and management.
So, before you commence your HPC migration from on-premises data centers or traditional HPC clusters to the Cloud, here are some key considerations to keep in mind throughout your planning phase.
1. Assess & Understand Your Legacy HPC Environment
Building a comprehensive migration plan and strategy from inception is necessary for optimization and sustainable outcomes. A proper assessment evaluates the current state of your legacy hardware, software, and data resources, as well as the system’s capabilities, reliability, scalability, and flexibility, with security and maintenance as ongoing priorities.
Gaining a deep and thorough understanding of your current infrastructure and computing environment will help identify existing technical constraints or bottlenecks and inform the order in which workloads should be migrated. That level of insight can streamline the process and help you circumvent major, arguably avoidable, hurdles your organization might otherwise face.
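For teams whose legacy environment runs a scheduler such as Slurm, even a small inventory script can help establish that baseline. The sketch below is illustrative only and assumes Slurm’s standard command-line tools (sinfo, squeue) are available on a login node; adapt the format strings to whatever your scheduler actually exposes.

```python
import subprocess
from collections import Counter

def run(cmd):
    """Run a command and return its stdout, or None if the tool is unavailable."""
    try:
        return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    except (OSError, subprocess.CalledProcessError):
        return None

# Capture a coarse snapshot of partitions, node counts, CPUs, and memory per node.
sinfo_out = run(["sinfo", "--noheader", "-o", "%P %D %c %m"])
if sinfo_out:
    print("Partition  Nodes  CPUs/node  Mem/node(MB)")
    for line in sinfo_out.strip().splitlines():
        print(" ", line)

# Tally queued and running jobs by state to gauge current demand on the cluster.
squeue_out = run(["squeue", "--noheader", "-o", "%T"])
if squeue_out:
    states = Counter(squeue_out.split())
    print("Job states:", dict(states))
```

Even a rough snapshot like this, collected over a few weeks, makes it far easier to size Cloud instances and prioritize which workloads move first.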
2. Determine the Right Cloud Provider and Tooling
Determining the right HPC Cloud provider for your organization can be a complex process, but an undeniably critical one. In fact, your entire computing environment depends on it. It involves researching the available options, comparing features and services, and evaluating cost, reputation, and performance.
Amazon Web Services, Microsoft Azure, and Google Cloud – to name just the three biggest – offer storage and Cloud computing services that drive accelerated innovation by providing fast networking and virtually unlimited infrastructure to store and manage massive data sets, along with the computing power required to analyze them. Ultimately, many vendors offer different types of cloud infrastructure capable of running large, complex simulations and deep learning workloads, so it is important to first select the model that best meets the needs of your unique HPC workloads: public cloud, private cloud, or hybrid cloud infrastructure.
3. Plan for the Right Design & Deployment
To effectively plan for an HPC migration in the Cloud, it is important to clearly define the objectives, determine the requirements and constraints, identify the expected outcomes, and establish a timeline for the project.
From a more technical perspective, it is important to consider each application’s specific requirements and inherent capabilities, including storage requirements, memory capacity, and other components that may be needed to run the application. If a workload requires a particular operating system, for example, then the environment should be chosen accordingly.
Finally, it is important to understand the networking and security requirements of the application before working through the design, and definitely the deployment phase, of your HPC Migration.
The HPC Migration Journey Begins Here…
By properly considering all of these factors, you can effectively plan your organization’s HPC migration and position it to leverage the power of supercomputing in drug discovery.
Even assuming your plan is comprehensive, effective, and sustainable, implementing your HPC migration is still a massive undertaking, particularly for research IT teams that are likely already overstretched, or for an existing Bio-IT vendor lacking specialized knowledge and skills.
So, if your team is ready to take the leap and begin your HPC migration, get in touch with our team today.
The Next Phase of Your HPC Migration in the Cloud
An HPC migration to the Cloud can be an incredibly complex process, but with strategic planning and design, effective implementation, and the right support, your team will be well on its way to sustainable success. Click below and get in touch with our team to learn more about our comprehensive HPC Migration services that support all phases of your HPC migration journey, regardless of which stage you are in.
Learn the key considerations for evaluating and selecting the right application for your Cloud environment.
Good software means faster work for drug research and development, particularly concerning proteins. Proteins serve as the basis for many treatments, and learning more about their structures can accelerate the development of new treatments and medications.
With more software now infusing an artificial intelligence element, researchers expect to significantly streamline their work and revolutionize the drug industry. When it comes to protein folding software, two names have become industry frontrunners: AlphaFold and Openfold.
Learn the differences between the two programs, including insights into how RCH is supporting and informing our customers about the strategic benefits the AlphaFold and Openfold applications can offer based on their environment, priorities and objectives.
About AlphaFold2
Developed by DeepMind, AlphaFold2 uses AI technology to predict a protein’s 3D structure based on its amino acid sequence. Its structure database, built in partnership with EMBL’s European Bioinformatics Institute, is hosted on Google Cloud Storage and is free to access and use.
The newer model, AlphaFold2, won the CASP competition in November 2020, achieving more accurate results than any other entry. AlphaFold2 scored above 90 for more than two-thirds of the proteins in CASP’s global distance test, which measures how closely the computationally predicted structure matches the lab-determined structure.
To date, there are more than 200 million known proteins, each with a unique 3D shape. AlphaFold2 aims to simplify the once time-consuming and expensive process of modeling these proteins. Its speed and accuracy are accelerating research and development in nearly every area of biology, helping scientists tackle diseases, discover new medicines and cures, and understand more about life itself.
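For teams that want to explore that free database programmatically, the sketch below shows one possible way to retrieve a predicted structure. It assumes the database’s public REST endpoint and response fields (such as pdbUrl) remain as published at the time of writing; check the official AlphaFold database documentation before relying on them.

```python
import requests

# Hypothetical example: fetch the predicted structure for a UniProt accession
# (P69905, human hemoglobin alpha subunit) from the public AlphaFold database.
ACCESSION = "P69905"
API_URL = f"https://alphafold.ebi.ac.uk/api/prediction/{ACCESSION}"

response = requests.get(API_URL, timeout=30)
response.raise_for_status()
entry = response.json()[0]          # the API returns a list of model entries

# Download the PDB-format coordinates for downstream analysis or visualization.
# "pdbUrl" is the field name as documented at the time of writing.
pdb = requests.get(entry["pdbUrl"], timeout=60)
pdb.raise_for_status()
with open(f"{ACCESSION}_alphafold.pdb", "wb") as fh:
    fh.write(pdb.content)
print(f"Saved predicted structure for {ACCESSION}")
```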
Exploring Openfold Protein Folding Software
Another player in the protein folding software space, Openfold is a PyTorch-based reproduction of DeepMind’s AlphaFold2. Founded by three Seattle biotech companies (Cyrus Biotechnology, Outpace Bio, and Arzeda), the project aims to support open-source development in the protein folding software space, and its data is registered on the AWS Open Data Registry. The project is part of the nonprofit Open Molecular Software Foundation and has received support from the AWS Open Data Sponsorship Program.
Despite being more of a newcomer to the scene, Openfold is quickly turning heads with its open-source model and greater “completeness” compared to AlphaFold. In fact, it has been billed as a faster and more powerful alternative.
Like AlphaFold, Openfold is designed to streamline the process of discovering how proteins fold in on themselves, but potentially faster and more comprehensively. The model has undergone more than 100,000 hours of training on NVIDIA A100 Tensor Core GPUs, reaching more than 90% of its final accuracy within the first 3,000 hours.
AlphaFold vs. Openfold: Our Perspective
Despite Openfold being a reproduction of AlphaFold, there are several key differences between the two.
AlphaFold2 and Openfold boast similar accuracy ratings, but Openfold may have a slight advantage. Openfold’s inference is also about twice as fast as AlphaFold’s when modeling short proteins; for long protein strands, the speed advantage is minimal.
Openfold’s optimized memory usage allows it to handle much longer protein sequences—up to 4,600 residues on a single 40GB A100.
One of the clearest differences between AlphaFold2 and Openfold is that Openfold is trainable, a capability that AlphaFold lacks. This makes it valuable for our customers working in niche or specialized research areas.
Key Use Cases from Our Customers
Both AlphaFold and Openfold have offered game-changing functionality for our customers’ drug research and development. That’s why many of the organizations we’ve supported have even considered a hybrid approach rather than making an either/or decision.
Both protein folding software packages can be deployed across a variety of use cases, including:
New Drug Discovery
The speed and accuracy with which protein folding software can model protein strands make it a powerful tool in new drug development, particularly for diseases that have largely been neglected. These illnesses often disproportionately affect individuals in developing countries. Examples include parasitic diseases, such as Chagas disease or leishmaniasis.
Combating Antibiotic Resistance
As the usage of antibiotics continues to rise, so does the risk of individuals developing antibiotic resistance. Previous data from the CDC shows that nearly one in three prescriptions for antibiotics is unnecessary. It’s estimated that antibiotic resistance costs the U.S. economy nearly $55 billion every year in healthcare and productivity losses.
What’s more, when people become resistant to antibiotics, it leaves the door wide open for the creation of “superbugs.” Since these bugs cannot be killed with typical antibiotics, illnesses can become more severe.
Professionals from the University of Colorado, Boulder, are putting AlphaFold to the test to learn more about proteins involved in antibiotic resistance. The protein folding software is helping researchers identify protein structures that they can then confirm via crystallography.
Vaccine Development
Learning more about protein structures is proving useful in developing new vaccines, such as a multi-agency collaboration on a new malaria vaccine. The WHO endorsed the first malaria vaccine in 2021. However, researchers at the University of Oxford and the National Institute of Allergy and Infectious Diseases are working together to create a more effective version that better prevents transmission.
Using AlphaFold and crystallography, the two teams identified the first complete structure of the protein Pfs48/45. This breakthrough could pave the way for future vaccine developments.
Learning More About Genetic Variations
Genetics has long fascinated scientists and may hold the key to learning more about general health, predisposition to diseases, and other traits. A professor at ETH Zurich is using AlphaFold to learn more about how a person’s health may change over time or what traits they will exhibit based on specific mutations in their DNA.
AlphaFold has proven useful in reviewing proteins in different species over time, though the accuracy diminishes the further back in time the proteins are reviewed. Seeing how proteins evolve over time can help researchers predict how a person’s traits might change in the future.
How RCH Solutions Can Help
Selecting protein folding software for your research facility is easier with a trusted partner like RCH Solutions. Not only can we inform the selection process, but we also provide support in implementing new solutions. We’ll work with you to uncover your greatest needs and priorities and align the selection process with your end goals, with budget in mind.
Contact us to learn how RCH Solutions can help.
Sources:
https://www.nature.com/articles/d41586-022-00997-5
https://www.deepmind.com/research/highlighted-research/alphafold
https://www.drugdiscoverytrends.com/7-ways-deepmind-alphafold-used-life-sciences/
https://www.cdc.gov/media/releases/2016/p0503-unnecessary-prescriptions.html
In Life Sciences, and medical fields in particular, there is a premium on expertise and the role of a specialist. When it comes to scientists, researchers, and doctors, even a single high performer who brings advanced knowledge in their field often contributes more value than a few average generalists with only peripheral knowledge. Despite this premium placed on specialization and top talent as an industry norm, many Life Science organizations don’t apply the same measure when sourcing vendors or partners, particularly those in the IT space.
And that’s a misstep. Here’s why.
Why “A” Talent Matters
I’ve seen far too many organizations that had, or still have, the above strategy, and also many that focus on acquiring and retaining top talent. The difference? The former experienced slow adoption that stalled outcomes, often with major impacts on their short- and long-term objectives. The latter propelled their outcomes out of the gates, circumventing crippling mistakes along the way. For this reason and more, I’m a big believer in attracting and retaining only “A” talent. The best talent and top performers (quality) will always outshine and out-deliver a group of average ones. Most often, those individuals are inherently motivated and engaged, and when put in an environment where their skills are both nurtured and challenged, they thrive.
Why Expertise Prevails
While low-cost IT service providers with deep rosters may be able to throw a greater number of people at problems than their smaller, boutique counterparts, often the outcome is simply more people and more problems. Instead, Life Science teams should aim to follow their R&D talent acquisition processes and focus on value and what it will take to achieve the best outcomes in this space. Most often, it’s not about the quantity of support, advice, or execution resources, but about quality.
Why Our Customers Choose RCH
Our customers are like-minded and also employ top talent, which is why they value RCH: we consistently serve them with the best. While some organizations feel that throwing bodies (quantity) at a problem is one answer, often one for optics, RCH does not. We never have. Sometimes you can get by with a generalist; however, in our industry, we have found that our customers require and deserve specialists. The outcomes are more successful. The results are what they seek: seamless transformation.
In most cases, we are engaged with a customer who has employed the services of a very large professional services or system integration firm. Increasingly, those customers are turning to RCH to deliver on projects typically reserved for those large, expensive, process-laden companies. The reason is simple. There is much to be said for a focused, agile and proven company.
Why Many Firms Don’t Restrategize
So why do organizations continue to complain yet rely on companies such as these? The answer has become clear: risk aversion. But the outcomes of that reliance are typically increased costs, missed deadlines, major strategic adjustments later on – or all of the above. Why not choose an alternative strategy from inception? I’m not suggesting turning over all business to a smaller organization. But how about a few projects? How about those that require proven focus, expertise, and a track record of delivery? I wrote a piece last year on the risk of mistaking “static for safe,” and stifling innovation in the process. The message still holds true.
We all know that scientific research is well on its way to becoming, if not already, a multi-disciplinary, highly technical process that requires diverse and cross-functional teams to work together in new ways. Engaging a quality Scientific Computing partner that matches that expertise with only “A” talent, and with the specialized skills, service model, and experience to meet research needs, can be a difference-maker in the success of a firm’s research initiatives.
My take? Quality trumps quantity—always in all ways. Choose a scientific computing partner whose services reflect the specialized IT needs of your scientific initiatives and can deliver robust, consistent results. Get in touch with me below to learn more.
Data science has earned a prominent place on the front lines of precision medicine – the ability to target treatments to the specific physiological makeup of an individual’s disease. As cloud computing services and open-source big data have accelerated the digital transformation, small, agile research labs all over the world can engage in development of new drug therapies and other innovations.
Previously, the necessary open-source databases and high-throughput sequencing technologies were accessible only by large research centers with the necessary processing power. In the evolving big data landscape, startup and emerging biopharma organizations have a unique opportunity to make valuable discoveries in this space.
The drive for real-world data
Through big data, researchers can connect with previously untold volumes of biological data. They can harness the processing power to manage and analyze this information to detect disease markers and otherwise understand how we can develop treatments targeted to the individual patient. Genomic data alone will likely exceed 40 exabytes by 2025, according to 2015 projections published in the journal PLOS Biology. And as data volume increases and the cost of big data technologies decreases, accessibility for emerging researchers improves.
A recent report from Accenture highlights the importance of big data in downstream medicine, specifically oncology. Among surveyed oncologists, 65% said they want to work with pharmaceutical reps who can fluently discuss real-world data, while 51% said they expect they will need to do so in the future.
The application of artificial intelligence in precision medicine relies on massive databases the software can process and analyze to predict future occurrences. With AI, your teams can quickly assess the validity of data and connect with decision support software that can guide the next research phase. You can find links and trends in voluminous data sets that wouldn’t necessarily be evident in smaller studies.
Applications of precision medicine
Among the oncologists Accenture surveyed, the most common applications for precision medicine included matching drug therapies to patients’ gene alterations, gene sequencing, liquid biopsy, and clinical decision support. In one example of the power of big data for personalized care, the Cleveland Clinic Brain Study is reviewing two decades of brain data from 200,000 healthy individuals to look for biomarkers that could potentially aid in prevention and treatment.
AI is also used to create new designs for clinical trials. These programs can identify possible study participants who have a specific gene mutation or meet other granular criteria much faster than a team of researchers could determine this information and gather a group of the necessary size.
A study published in the journal Cancer Treatment and Research Communications illustrates the impact of big data on cancer treatment modalities. The research team used AI to mine National Cancer Institute medical records and find commonalities that may influence treatment outcomes. They determined that taking certain antidepressant medications correlated with longer survival rates among the patients included in the dataset, opening the door for targeted research on those drugs as potential lung cancer therapies.
Other common precision medicine applications of big data include:
- New population-level interventions based on socioeconomic, geographic, and demographic factors that influence health status and disease risk
- Delivery of enhanced care value by providing targeted diagnoses and treatments to the appropriate patients
- Flagging adverse reactions to treatments
- Detection of the underlying cause of illness through data mining
- Human genomics decoding with technologies such as genome-wide association studies and next-generation sequencing software programs
These examples only scratch the surface of the endless research and development possibilities big data unlocks for start-ups in the biopharma sector. Consult with the team at RCH Solutions to explore custom AI applications and other innovations for your lab, including scalable cloud services for growing biotech and pharma research organizations.
Do You Need Support with Your Cloud Strategy?
Cloud services are swiftly becoming standard for those looking to create an IT strategy that is both scalable and elastic. But when it comes time to implement that strategy—particularly for those working in life sciences R&D—there are a number of unique combinations of services to consider.
Here is a checklist of key areas to examine when deciding if you need expert support with your Cloud strategy.
- Understand the Scope of Your Project
Just as critical as knowing what should be in the Cloud is knowing what should not be. Mapping out the on-premises vs. Cloud-based solutions in your strategy will help demonstrate exactly what your needs are and where some help may be beneficial.
- Map Out Your Integration Points
Speaking of on-premises vs. in the Cloud, do you have an integration strategy for getting Cloud solutions talking to each other as well as to on-premises solutions?
- Does Your Staff Match Your Needs?
When needs change on the fly, your staff often needs to adjust. However, those adjustments are not always easily implemented, which can lead to gaps. So when creating your Cloud strategy, ensure you have the right team to help understand the capacity, uptime, and security requirements unique to a Cloud deployment.
Check out our free eBook, Cloud Infrastructure Takes Research Computing to New Heights, to help uncover the best Cloud approach for your team. Download Now
- Do Your Solutions Meet Your Security Standards?
There are more than enough examples to show the importance of data security. It’s no longer enough, however, to understand just your own data security needs. You now must know the risk management and data security policies of providers as well.
- Don’t Forget About Data
Life Sciences is awash with data, and that is a good thing. But all this data does have consequences, including within your Cloud strategy, so ensure your approach can handle all your bandwidth needs.
- Agree on a Timeline
Finally, it is important to know the timeline of your needs and determine whether or not your team can achieve your goals. After all, the right solution is only effective if you have it at the right time. That means it is imperative you have the capacity and resources to meet your time-based goals.
Using RCH Solutions to Implement the Right Solution with Confidence
Leveraging the Cloud to meet the complex needs of scientific research workflows requires a uniquely high level of ingenuity and experience that is not always readily available to every business. Thankfully, our Cloud Managed Service solution can help. Steeped in more than 30 years of experience, it is based on a process to uncover, explore, and help define the strategies and tactics that align with your unique needs and goals.
We support all the Cloud platforms you would expect, such as AWS and others, and enjoy partner-level status with many major Cloud providers. Speak with us today to see how we can help deliver objective advice and support on the solution most suitable for your needs.
Studied benefits of Cloud computing in the biotech and pharma fields.
Cloud computing has become one of the most common investments in the pharmaceutical and biotech sectors. If your research and development teams don’t have the processing power to keep up with the deluge of available data for drug discovery and other applications, you’ve likely looked into the feasibility of a digital transformation.
Real-world research offers examples that highlight the powerful effects of Cloud-based computing environments for start-up and growing biopharma companies.
Competitive Advantage
As more competitors move to the Cloud, adopting this agile approach saves your organization from lagging behind. Consider these statistics:
- According to a February 2022 report in Pharmaceutical Technology, mentions of keywords related to Cloud computing in pharma company filings increased by 50% between the second and third quarters of 2021. What’s more, such mentions increased by nearly 150% over the five-year period from 2016 to 2021.
- An October 2021 McKinsey & Company report indicated that 16 of the top 20 pharmaceutical companies have referenced the Cloud in recent press releases.
- As far back as 2020, a PwC survey found that 60% of execs in pharma had either already invested in Cloud tech or had plans for this transition underway.
Accelerated Drug Discovery
In one example cited by McKinsey, Moderna’s first potential COVID-19 vaccine entered clinical trials just 42 days after virus sequencing. CEO Stéphane Bancel credited this unprecedented turnaround time to Cloud technology, which enables scalable, flexible access to droves of existing data and, as Bancel put it, doesn’t require you “to reinvent anything.”
Enhanced User Experience
Both employees and customers prefer to work with brands that show a certain level of digital fluency. In the PwC survey cited above, 42% of health services and pharma leaders reported that better UX was the key priority for Cloud investment. Most participants – 91% – predicted that this level of patient engagement will improve individuals’ ability to manage chronic diseases that require medication.
Rapid Scaling Capabilities
Cloud computing platforms can be almost instantly scaled to fit the needs of expanding companies in pharma and biotech. Teams can rapidly increase the capacity of these systems to support new products and initiatives without the investment required to scale traditional IT frameworks. For example, the McKinsey study estimates that companies can reduce the expense associated with establishing a new geographic location by up to 50% by using a Cloud platform.
Are you ready to transform organizational efficiency by shifting your biopharmaceutical lab to a Cloud-based environment? Connect with RCH today to learn how we support our customers in the Cloud with tools that facilitate smart, effective design and implementation of an extendible, scalable Cloud platform customized for your organizational objectives.
References
https://www.mckinsey.com/industries/life-sciences/our-insights/the-case-for-cloud-in-life-sciences
https://www.pharmaceutical-technology.com/dashboards/filings/cloud-computing-gains-momentum-in-pharma-filings-with-a-50-increase-in-q3-2021/
https://www.pwc.com/us/en/services/consulting/fit-for-growth/cloud-transformation/pharmaceutical-life-sciences.html
Consider the Advantages of Guardrails in the Cloud
Cloud integration has quite deservedly become the go-to digital transformation strategy across industries, particularly for businesses in the pharmaceutical and biotech sectors. By integrating Cloud technology into your IT approach, your organization can access unprecedented flexibility while taking advantage of real-time collaboration tools. What’s more, because companies can easily scale Cloud platforms in tandem with accelerating growth, Cloud solutions deliver sustained value compared to on-premises solutions, which require resources (both time and money) to upgrade and maintain the associated hardware.
At the same time, leaders must carefully balance the flexibility and adaptability of Cloud technology with the need for robust security and access controls. With effective guardrails administered appropriately, emerging biopharma companies can optimize research and development within boundaries that shield valuable data and ensure regulatory compliance. Explore these advantages of adding the right guardrails to your biotech or pharmaceutical organization’s digital landscape to inform your planning process.
Prevent unintended security risks
One of the most appealing aspects of the Cloud is the ability to leverage its incredible ecosystem of knowledge, tools, and solutions within your own platform. Having effective guardrails in place allows your team to quickly install and benefit from these tools, including brand-new improvements and implementations, without inadvertently creating a security risk.
Researchers can work freely in the digital setting while the guardrail monitors activity and alerts users in the event of a security risk. As a result, the organization can avoid these common issues that lead to data breaches:
- Maintaining open access to completed projects that should have privileges in place
- Disabling firewalls or Secure Shell systems to access remote systems
- Using sensitive data for testing and development purposes
- Collaborating on sensitive data without proper access controls
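As a concrete illustration of what an automated guardrail might look like in practice, the sketch below scans an AWS account for S3 buckets that do not fully block public access. It is a minimal example, not a complete control: it assumes AWS credentials are already configured, and in a real deployment the alert would feed an existing monitoring or ticketing workflow.

```python
import boto3
from botocore.exceptions import ClientError

# Minimal guardrail sketch: flag S3 buckets that lack a full public-access block.
s3 = boto3.client("s3")

for bucket in s3.list_buckets()["Buckets"]:
    name = bucket["Name"]
    try:
        config = s3.get_public_access_block(Bucket=name)["PublicAccessBlockConfiguration"]
        fully_blocked = all(config.values())
    except ClientError:
        fully_blocked = False  # no public-access block is configured at all
    if not fully_blocked:
        # In practice this would raise a ticket or chat alert rather than print.
        print(f"ALERT: bucket '{name}' does not fully block public access")
```

Similar scheduled checks can watch for open security groups, disabled logging, or unencrypted storage, so researchers keep their freedom to experiment while the guardrail catches misconfigurations early.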
Honor the shared responsibility model
Biopharma companies tend to appreciate the autonomous, self-service approach of Cloud platforms, as the dynamic infrastructure offers nearly endless experimentation. At the same time, most security issues in the Cloud result from user errors such as misconfiguration. The implementation of guardrails creates a stopgap so that even with the shortest production schedules, researchers won’t accidentally expose the organization to potential threats. Guardrails also help your company comply with your Cloud service provider’s shared responsibility policy, which outlines and defines the security responsibilities of both organizations.
Establish and maintain best practices for data integrity
Adolescent biopharma companies often experience such accelerated growth that they can’t keep up with the need to create and follow organizational best practices for data management. By putting guardrails in place, you also create standardized controls that ensure predictable, consistent operation. Available tools abound, including access and identity management permissions, security groupings, network policies, and automatic enforcement of these standards as they apply to critical Cloud data.
A solid information security and management strategy becomes even more critical as your company expands. Entrepreneurs who want to prepare for future acquisitions should be ready to show evidence of a culture that prizes data integrity.
According to IBM, the average cost of a single data breach approached $4 million in 2020. Guardrails provide a solution that truly serves as a golden mean, preserving critical Cloud components such as accessibility and collaboration without sacrificing your organization’s valuable intellectual property, creating compliance issues, or compromising research objectives.
Bio-IT teams must focus on five major areas in order to improve research efficiency and outcomes
Life Science research organizations need to collect, maintain, and analyze a large amount of data in order to achieve research outcomes. The need to develop efficient, compliant data management solutions is growing throughout the Life Science industry, but Bio-IT leaders face diverse challenges to optimization.
These challenges are increasingly becoming obstacles to Life Science research, where data accessibility is crucial for gaining analytic insight. We’ve identified five main areas where data management challenges are holding Life Science research teams back from developing life-saving drugs and treatments.
Five Data Management Challenges for Life Science Research Firms
Many of the popular applications that Life Science researchers use to manage regulated data are not designed specifically for the Life Science industry. This is one of the main reasons why Life Science research teams are facing data management and compliance challenges. Many of these challenges stem from the implementation of technologies not well-suited to meet the demands of scientific research.
Here, we’ve identified five areas where improvements in data management can help drug R&D efficiency and reliability.
1. Manual Compliance Processes
Some drug research teams and their Bio-IT partners are dedicated to leveraging software to automate tedious compliance-related tasks. These include creating audit trails, monitoring for personally identifiable information, and classifying large volumes of documents and data in ways that keep pace with the speed of scientific discovery.
However, many Life Science researchers remain outside of this trend towards compliance automation. Instead, they perform compliance operations manually, which creates friction when collaborating with partners and drags down the team’s ability to meet regulatory scrutiny.
Automation can become a key value-generating asset in the Life Science research process. When properly implemented and subjected to a coherent, purpose-built data governance structure, it improves data accessibility without sacrificing quality, security, or retention.
2. Data Security and Integrity
The Life Science industry needs to be able to protect electronic information from unauthorized access. At the same time, certain data must be available to authorized third parties when needed. Balancing these two crucial demands is an ongoing challenge for Life Science researchers and Bio-IT teams.
When data is scattered across multiple repositories and management has little visibility into the data lifecycle, striking that key balance becomes difficult. Determining who should have access to data and how permission to that data should be assigned takes on new levels of complexity as the organization grows.
Life Science research organizations need to implement robust security frameworks that minimize the exposure of sensitive data to unauthorized users. This requires core security services that include continuous user analysis, threat intelligence, and vulnerability assessments, on top of an MDM-based data infrastructure that enables secure encryption and permissioning of sensitive data, including intellectual properties.
3. Scalable, FAIR Data Principles
Life Science organizations increasingly operate like big data enterprises. They generate large amounts of data from multiple sources and use emerging technologies like artificial intelligence to analyze that data. Where an enterprise may source its data from customers, applications, and third-party systems, Life Science researchers get theirs from clinical studies, lab equipment, and drug development experiments.
The challenge that most Life Science research organizations face is the storage of this data in organizational silos. This impacts the team’s ability to access, analyze, and categorize the data appropriately. It also makes reproducing experimental results much more difficult and time-consuming than it needs to be.
The solution to this challenge involves implementing FAIR data principles in a secure, scalable way. The FAIR data management system relies on four main characteristics:
Findability. In order to be useful, data must be findable. This means it must be indexed according to terms that researchers, auditors, and other stakeholders are likely to search for. It may also mean implementing a Master Data Management (MDM) or metadata-based solution for managing high-volume data.
Accessibility. It’s not enough to simply find data. Authorized users must also be able to access it, and easily. When thinking about accessibility—while clearly related to security and compliance, including proper provisioning, permissions, and authentication—ease of access and speed can be a difference-maker, which leads to our next point.
Interoperability. When data is formatted in multiple different ways, it falls on users to navigate complex workarounds to derive value from it. If certain users don’t have the technical skills to immediately use data, they will have to wait for the appropriate expertise from a bio-IT team member, which will drag down overall productivity.
Reusability. Reproducibility is a serious and growing concern among Life Science professionals. Data reusability plays an important role in ensuring experimental insights can be reproduced by independent teams around the world. This can be achieved through containerization technologies that establish a fixed environment for experimental data.
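To make these principles concrete, the sketch below shows one lightweight way a team might catalog datasets with FAIR-style metadata. The record fields and example values are purely illustrative assumptions, not a prescribed schema; a production catalog would typically sit behind an MDM or metadata service rather than in application code.

```python
from dataclasses import dataclass, field, asdict
import json

@dataclass
class DatasetRecord:
    """Illustrative FAIR-style metadata record; the fields shown are examples only."""
    dataset_id: str                                 # findable: stable, unique identifier
    title: str
    keywords: list = field(default_factory=list)    # findable: indexed search terms
    access_url: str = ""                            # accessible: where authorized users retrieve it
    data_format: str = "csv"                        # interoperable: a common, documented format
    license: str = "internal-use"                   # reusable: clear terms for reuse
    provenance: str = ""                            # reusable: how and where the data was generated

catalog = [
    DatasetRecord(
        dataset_id="assay-2023-0114",
        title="Kinase inhibition dose-response, plate 14",
        keywords=["kinase", "dose-response", "HTS"],
        access_url="s3://example-research-bucket/assay-2023-0114.csv",
        provenance="Instrument export, normalized by pipeline v2.1",
    ),
]

# Findability in action: search the catalog by keyword.
hits = [asdict(r) for r in catalog if "kinase" in r.keywords]
print(json.dumps(hits, indent=2, default=str))
```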
4. Storage Solutions
The way your research team stores and communicates data is an integral component of your organization’s overall productivity and flexibility. Organizational silos create bottlenecks that become obstacles to scientific advancement, while robust, accessible data storage platforms enable on-demand analysis that improves time-to-value for research applications.
The three major categories of storage solutions are Cloud, on-premises, and hybrid systems. Each of these presents a unique set of advantages and disadvantages, which serve specific research goals based on existing infrastructure and support. Organizations should approach this decision with their unique structure and goals in mind.
Life Science research firms that implement MDM solutions are able to take important steps towards storing their data while improving security and compliance. Master data management provides a single reference point for Life Science data, as well as a framework for enacting meaningful cybersecurity policies that prevent unauthorized access while encouraging secure collaboration.
MDM solutions exist as Cloud-based software-as-a-service licenses, on-premises hardware, and hybrid deployments. Biopharma executives and scientists will need to choose a deployment style that fits within their projected scope and budget for driving transformational data management in the organization.
Without an MDM solution in place, Bio-IT teams must expend a great deal of time and effort to organize data effectively. This can be done through a data fabric-based approach, but only if the organization is willing to devote more resources to developing a robust universal IT framework.
5. Monetization
Many Life Science research teams don’t adequately monetize data due to compliance and quality control concerns. This is especially true of Life Science research teams that still use paper-based quality management systems, as they cannot easily identify the data that they have – much less the value of the insights and analytics it makes possible.
This becomes an even greater challenge when data is scattered throughout multiple repositories, and bio-IT teams have little visibility into the data lifecycle. There is no easy method to collect these data for monetization or engage potential partners towards commercializing data in a compliant way.
Life Science research organizations can monetize data through a wide range of potential partnerships. Organizations to which you may be able to offer high-quality research data include:
- Healthcare providers and their partners
- Academic and research institutes
- Health insurers and payer intermediaries
- Patient engagement and solution providers
- Other pharmaceutical research organizations
- Medical device manufacturers and suppliers
In order to do this, you will have to assess the value of your data and provide an accurate estimate of the volume of data you can provide. As with any commercial good, you will need to demonstrate the value of the data you plan on selling and ensure the transaction falls within the regulatory framework of the jurisdiction you do business in.
Overcome These Challenges Through Digital Transformation
Life Science research teams who choose the right vendor for digitizing compliance processes are able to overcome these barriers to implementation. Vendors who specialize in Life Sciences can develop compliance-ready solutions designed to meet the needs of drug R&D, making fast, efficient transformation a possibility.
RCH Solutions can help you capitalize on the data your Life Science research team generates and give you the competitive advantage you need to make valuable discoveries. Rely on our help to streamline research workflows, secure sensitive data, and improve drug R&D outcomes.
RCH Solutions is a global provider of computational science expertise, helping Life Sciences and Healthcare firms of all sizes clear the path to discovery for nearly 30 years. If you’re interested in learning how RCH can support your goals, get in touch with us here.
There are good reasons to balance Cloud infrastructure between multiple vendors.
In Part One of this series, we discussed some of the applications and workflows best-suited for public Cloud deployment. But public Cloud deployments are not the only option for life science researchers and biopharmaceutical IT teams. Hybrid Cloud and multi-Cloud environments can offer the same benefits in a way that’s better aligned to stakeholder interests.
What is a Multi-Cloud Strategy?
Multi-Cloud refers to an architectural approach that uses multiple Cloud computing services in parallel. Organizations that adopt a multi-Cloud strategy are able to distribute computing resources across their deployments and minimize over-reliance on a single vendor.
Multi-Cloud deployments allow Life Science researchers and Bio-IT teams to choose between multiple public Cloud vendors when distributing computing resources. Some Cloud platforms are better suited for certain tasks than others, and being able to choose between multiple competing vendors puts the organization at an overall advantage.
Why Bio-IT Teams Might Want to Adopt a Multi-Cloud Strategy
Working with a single Cloud computing provider for too long can make it difficult to move workloads and datasets from one provider to another, especially as needs and requirements change, which, as we know, happens quite often within Life Sciences organizations. Highly centralized IT infrastructure also tends to accumulate data gravity – the tendency for data analytics and other applications to converge on large data repositories – making it difficult to scale data capabilities outwards.
This may go unnoticed until business or research goals demand migrating data from one platform to another. At that point, the combination of data gravity and vendor lock-in can suddenly impose unexpected technical, financial, and legal costs.
Cloud vendors do not explicitly prevent users from migrating data and workflow applications. However, they have a powerful economic incentive to make the act of migration as difficult as possible. Letting users flock to their competitors is not strictly in their interest.
Not all Cloud vendors do this, but any Cloud vendor can decide to. Since Cloud computing agreements can change over time, users who deploy public Cloud technology with a clear strategy for avoiding complex interdependencies will generally fare better than users who simply go “all in” with a single vendor.
Multi-Cloud deployments offer Life Science research organizations a structural way to eliminate over-reliance on a single Cloud vendor. Working with multiple vendors from the start demands that researchers and IT teams plan for data and application portability from the beginning.
Multi-Cloud deployments also allow IT teams to better optimize diverse workflows with scalable computing resources. When researchers demand new workloads, their IT partners can choose an optimal platform for each one of them on a case-by-case basis.
This allows researchers and IT teams to coordinate resources more efficiently. One research application’s use of sensitive data may make it better suited for a particular Cloud provider, while another workflow demands high-performance computing resources only available from a different provider. Integrating multiple Cloud providers under a single framework can enable considerable efficiencies through each stage of the research process.
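One practical way to preserve that portability is to keep provider-specific details behind a thin storage interface so workflows never call a vendor SDK directly. The sketch below is a simplified illustration: the class names, bucket names, and archive function are assumptions, and a production version would add error handling, credential management, and logging.

```python
from abc import ABC, abstractmethod

class ObjectStore(ABC):
    """Thin, provider-neutral interface the rest of the pipeline codes against."""

    @abstractmethod
    def upload(self, local_path: str, remote_key: str) -> None: ...

    @abstractmethod
    def download(self, remote_key: str, local_path: str) -> None: ...

class S3Store(ObjectStore):
    def __init__(self, bucket: str):
        import boto3                      # imported lazily so other backends need no AWS deps
        self._client = boto3.client("s3")
        self._bucket = bucket

    def upload(self, local_path, remote_key):
        self._client.upload_file(local_path, self._bucket, remote_key)

    def download(self, remote_key, local_path):
        self._client.download_file(self._bucket, remote_key, local_path)

class GCSStore(ObjectStore):
    def __init__(self, bucket: str):
        from google.cloud import storage  # google-cloud-storage package
        self._bucket = storage.Client().bucket(bucket)

    def upload(self, local_path, remote_key):
        self._bucket.blob(remote_key).upload_from_filename(local_path)

    def download(self, remote_key, local_path):
        self._bucket.blob(remote_key).download_to_filename(local_path)

# Workflows depend only on ObjectStore, so swapping providers is a one-line change.
def archive_results(store: ObjectStore, run_id: str) -> None:
    store.upload(f"results/{run_id}.parquet", f"archive/{run_id}.parquet")
```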
What About Hybrid Cloud?
Hybrid Clouds are IT architectures that rely on a combination of private Cloud resources alongside public Cloud systems. Private Cloud resources are simply Cloud-based architectures used exclusively by one organization.
For example, imagine your life science firm hosts some research applications on its own internal network but also uses Microsoft Azure and Amazon AWS. This is a textbook example of a multi-Cloud architecture that is also hybrid.
Hybrid Cloud environments may offer benefits to Life Science researchers that need security and compliance protection beyond what public Cloud vendors can easily offer. Private Cloud frameworks are ideal for processing and storing sensitive data.
Hybrid Cloud deployments may also present opportunities to reduce overall operating expenses over time. If researchers are sure they will consistently use certain Cloud computing resources frequently for years, hosting those applications on a private Cloud deployment may end up being more cost-efficient over that period.
It’s common for Bio-IT teams to build private on-premises Cloud systems for small, frequently used applications and then use easily scalable public Cloud resources to handle less frequent high-performance computing jobs. This hybrid approach allows life science research organizations to get the best of both worlds.
Optimize your Cloud Strategy for Achieving Research Goals
Life Science research organizations generate value by driving innovation in a dynamic and demanding field. Researchers who can perform computing tasks on the right platform and infrastructure for their needs are ideally equipped to make valuable discoveries. Public Cloud, multi-Cloud, and hybrid Cloud deployments are all viable options for optimizing Life Science research with proven technology.
RCH Solutions is a global provider of computational science expertise, helping Life Sciences and Healthcare firms of all sizes clear the path to discovery for nearly 30 years. If you’re interested in learning how RCH can support your goals, get in touch with us here.
Life Science researchers are beginning to actively embrace public Cloud technology. Research labs that manage IT operations more efficiently have more resources to spend on innovation.
As more Life Science organizations migrate IT infrastructure and application workloads to the public Cloud, it’s easier for IT leaders to see what works and what doesn’t. The nature of Life Science research makes some workflows more Cloud-friendly than others.
Why Implement Public Cloud Technology in the Life Science Sector?
Most enterprise sectors invest in public Cloud technology in order to gain cost benefits or accelerate time to market. These are not the primary driving forces for Life Science research organizations, however.
Life Science researchers in drug discovery and early research see public Cloud deployment as a way to consolidate resources and better utilize in-house expertise on their core deliverable—data. Additionally, the Cloud’s ability to deliver on-demand scalability plays well to Life Science research workflows with unpredictable computing demands.
These factors combine to make public Cloud deployment a viable solution for modernizing Life Science research and fostering transformation. It can facilitate internal collaboration, improve process standardization, and extend researchers’ IT ecosystem to more easily include third-party partners and service providers.
Which Applications and Workflows are Best-Suited to Public Cloud Deployment?
For Life Science researchers, the primary value of any technology deployment is its ability to facilitate innovation. Public Cloud technology is no different. Life Science researchers and IT leaders are going to find the greatest and most immediate value utilizing public Cloud technology in collaborative workflows and resource-intensive tasks.
1. Analytics
Complex analytical tasks are well-suited for public Cloud deployment because they typically require intensive computing resources for brief periods of time. A Life Science organization that invests in on-premises analytics computing solutions may find that its server farm is underutilized most of the time.
Public Cloud deployments are valuable for modeling and simulation, clinical trial analytics, and other predictive analytics processes that enable scientists to save time and resources by focusing their efforts on the compounds that are likely to be the most successful. They can also help researchers glean insight from translational medicine applications and biomarker pathways and ultimately, bring safer, more targeted, and more effective treatments to patients. Importantly, they do this without the risk of overpaying and underutilizing services.
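As a simplified illustration of that on-demand model, the sketch below submits a small, bursty parameter sweep to AWS Batch and then polls the job states. The queue name, job definition, script, and compound identifiers are all placeholders; it assumes those Batch resources have already been provisioned, and the capacity scales down again once the jobs finish.

```python
import boto3

# Minimal sketch: burst a parameter sweep onto elastic compute with AWS Batch.
batch = boto3.client("batch")

compounds = ["CHEMBL25", "CHEMBL112", "CHEMBL521"]   # illustrative identifiers only
job_ids = []
for compound in compounds:
    resp = batch.submit_job(
        jobName=f"docking-{compound}",
        jobQueue="analytics-queue",          # placeholder queue name
        jobDefinition="docking-sim",         # placeholder job definition
        containerOverrides={"command": ["python", "score.py", "--compound", compound]},
    )
    job_ids.append(resp["jobId"])

# Check job states; compute capacity exists only while the sweep is running.
for job in batch.describe_jobs(jobs=job_ids)["jobs"]:
    print(job["jobName"], job["status"])
```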
2. Development and Testing
The ability to rapidly and securely build multiple development environments in parallel is a collaborative benefit that facilitates Life Science innovation. Again, this is an area where life science firms typically have the occasional need for high-performance computing resources – making on-demand scalability an important cost-benefit.
Public Cloud deployments allow IT teams to perform large system stress tests in a streamlined way. System integration testing and user acceptance testing are also well-suited to the scalable public Cloud environment.
3. Infrastructure Storage
In a hardware-oriented life science environment, keeping track of the various development ecosystems used to glean insight is a challenge. It is becoming increasingly difficult for hardware-oriented Life Science research firms to ensure the reproducibility of experimental results, simply because of infrastructural complexity.
Public Cloud deployments enable cross-collaboration and ensure experimental reproducibility by enabling researchers to save infrastructure as data. Containerized research applications can be opened, tested, and communicated between researchers without the need for extensive pre-configuration.
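A minimal sketch of that idea, using the Docker SDK for Python, appears below: a collaborator reruns an analysis inside a pinned container image so the software environment travels with the experiment. The image name, command, and paths are placeholders for illustration.

```python
import docker

# Rerun an analysis inside a pinned container image so a collaborator gets the
# same environment byte-for-byte. Image tag and paths below are placeholders.
client = docker.from_env()

logs = client.containers.run(
    image="ghcr.io/example-lab/rnaseq-pipeline:1.4.2",   # pinned tag, never "latest"
    command=["python", "run_analysis.py", "--input", "/data/raw", "--out", "/data/results"],
    volumes={"/shared/project-x": {"bind": "/data", "mode": "rw"}},
    remove=True,        # clean up the container after the run completes
)
print(logs.decode())
```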
4. Desktop and Devices
Research firms that invest in public Cloud technology can spend less time and resources provisioning validated environments. They can provision virtual desktops to vendors and contractors in real-time, without having to go through a lengthy and complicated hardware process.
Life Science research organizations that share their IT platform with partners and contractors are able to utilize computing resources more efficiently and reduce their data storage needs. Instead of storing data in multiple places and communicating an index of that data to multiple partners, all of the data can be stored securely in the Cloud and made accessible to the individuals who need it.
5. Infrastructure Computing
Biopharmaceutical manufacturing is a non-stop process that requires a high degree of reliability and security. Reproducible high-performance computing (HPC) environments in the Cloud allow researchers to create and share computational biology data and biostatistics in a streamlined way.
Cloud-enabled infrastructure computing also helps Life Science researchers monitor supply chains more efficiently. Interacting with supply chain vendors through a Cloud-based application enables researchers to better predict the availability of research materials, and plan their work accordingly.
Hybrid Cloud and Multi-Cloud Models May Offer Greater Efficiencies
Public Cloud technology is not the only infrastructural change happening in the Life Science industry. Certain research organizations can maximize the benefits of cloud computing through hybrid and multi-Cloud models, as well. The second part of this series will cover what those benefits are, and which Life Science research firms are best-positioned to capitalize on them.
RCH Solutions is a global provider of computational science expertise, helping Life Sciences and Healthcare firms of all sizes clear the path to discovery for nearly 30 years. If you’re interested in learning how RCH can support your goals, get in touch with us here.
Transformative change means rethinking the scientific computing workflow.
The need to embrace and enhance data science within the Life Sciences has never been greater. Yet, many Life Sciences organizations performing drug discovery face significant obstacles when transforming their legacy workflows.
Multiple factors contribute to the friction between the way Life Science research has traditionally been run and the way it needs to run moving forward. Companies that overcome these obstacles will be better equipped to capitalize on tomorrow’s research advances.
5 Obstacles to the Cloud-First Data Strategy and How to Address Them
Life Science research organizations are right to dedicate resources towards maximizing research efficiency and improving outcomes. Enabling the full-scale Cloud transformation of a biopharma research lab requires identifying and addressing the following five obstacles.
1. Cultivating a Talent Pool of Data Scientists
Life Science researchers use a highly developed skill set to discover new drugs, analyze clinical trial data, and perform biostatistics on the results. These skills do not always overlap with the demands of next-generation data science infrastructure. Life Science research firms that want to capitalize on emerging data science opportunities will need to cultivate data science talent they can rely on.
Aligning data scientists with therapy areas and enabling them to build a nuanced understanding of drug development is key to long-term success. Biopharmaceutical firms need to embed data scientists in the planning and organization of clinical studies as early as possible and partner them with biostatisticians to build productive long-term relationships.
2. Rethinking Clinical Trials and Collaborations
Life Science firms that begin taking a data science-informed approach to clinical studies in early drug development will have to ask difficult questions about past methodologies:
- Do current trial designs meet the needs of a diverse population?
- Are we including all relevant stakeholders in the process?
- Could decentralized or hybrid trials drive research goals in a more efficient way?
- Could we enhance patient outcomes and experiences using the tools we have available?
- Will manufacturers accept and build the required capabilities quickly enough?
- How can we support a global ecosystem for real-world data that generates higher-quality insights than what was possible in the past?
- How can we use technology to make non-data personnel more capable in a cloud-first environment?
- How can we make them data-enabled?
All of these questions focus on the ability for data science-backed cloud technology to enable new clinical workflows. Optimizing drug discovery requires addressing inefficiencies in clinical trial methodology.
3. Speeding Up the Process of Achieving Data Interoperability
Data silos are among the main challenges that Life Science researchers face with legacy systems. Many Life Science organizations lack a company-wide understanding of the total amount of data and insights they have available. So much data is locked in organizational silos that merely taking stock of existing data assets is not possible.
The process of cleaning and preparing data to fuel AI-powered data science models is difficult and time-consuming. Manually transforming terabyte-sized databases containing millions of person records into curated, AI-ready databases is slow, expensive, and prone to human error.
Automated interoperability pipelines can reduce the time spent on this process to a matter of hours. The end result is a clean, accurate database fully ready for AI-powered data science. Researchers can now create longitudinal person records (LPRs) with ease.
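As a simplified illustration of what such a pipeline produces, the sketch below joins two siloed extracts into a basic longitudinal view with pandas. The file names and column names are assumptions for the example only; real LPR pipelines involve many more sources, identity resolution, and validation steps.

```python
import pandas as pd

# Assemble a simple longitudinal person record (LPR) from two siloed extracts.
demographics = pd.read_csv("demographics.csv")   # person_id, birth_year, sex
encounters = pd.read_csv("encounters.csv")       # person_id, encounter_date, diagnosis_code

encounters["encounter_date"] = pd.to_datetime(encounters["encounter_date"])

# One row per encounter, enriched with person-level attributes and ordered in time.
lpr = (
    demographics.merge(encounters, on="person_id", how="inner")
    .sort_values(["person_id", "encounter_date"])
)

# A compact per-person timeline: first/last contact and number of encounters.
summary = lpr.groupby("person_id").agg(
    first_contact=("encounter_date", "min"),
    last_contact=("encounter_date", "max"),
    encounter_count=("encounter_date", "size"),
)
print(summary.head())
```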
4. Building Infrastructure for Training Data Models
Transforming legacy operations into fast, accurate AI-powered ones requires transparent access to many different data sources. Setting up the necessary infrastructure takes time and resources, and it can introduce complexity when determining how to manage multiple different data architectures. Data quality itself may be inconsistent between sources.
Building a scalable pipeline for training AI data models requires scalable cloud technology that can work with large training datasets quickly. Without reputable third-party infrastructure in place, the process of training data models can take months.
5. Protecting Trade Secrets and Patient Data
Life Science research often relies on sensitive technologies and proprietary compounds that constitute trade secrets for the company in question. Protecting intellectual property has always been a critical challenge in the biopharmaceutical industry, and today’s cybersecurity landscape only makes it more important.
Clinical trial data, test results, and confidential patient information must be protected in compliance with privacy regulations. Life Science research organizations need to develop centralized policies that control the distribution of sensitive data to internal users and implement automated approval process workflows for granting access to sensitive data.
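A centralized policy check might look something like the minimal sketch below; the dataset names, roles, and approval rule are hypothetical and stand in for whatever governance tooling an organization actually uses.

```python
from dataclasses import dataclass

@dataclass
class AccessRequest:
    user: str
    role: str
    dataset: str
    purpose: str

# Central policy: which roles may request which data classes, and whether
# a human approver is still required before access is granted.
POLICY = {
    "clinical_trial_results": {"allowed_roles": {"biostatistician", "data_scientist"},
                               "requires_approval": True},
    "deidentified_labs":      {"allowed_roles": {"data_scientist", "analyst"},
                               "requires_approval": False},
}

def evaluate(request: AccessRequest) -> str:
    """Return 'denied', 'pending_approval', or 'granted' for a request."""
    rule = POLICY.get(request.dataset)
    if rule is None or request.role not in rule["allowed_roles"]:
        return "denied"
    return "pending_approval" if rule["requires_approval"] else "granted"

print(evaluate(AccessRequest("jdoe", "data_scientist",
                             "clinical_trial_results", "exploratory analysis")))
```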
Endpoint security solutions help ensure sensitive data is only downloadable to approved devices and shared according to protocol. This enables Life Science researchers to share information with partners and supply chain vendors without compromising confidentiality.
A Robust Cloud-First Strategy is Your Key to Life Science Modernization
Deploying emergent technologies in the Life Science industry can lead to optimal research outcomes and better use of company resources. Developing a cloud computing strategy that either supplements or replaces aspects of your legacy system requires input and buy-in from every company stakeholder it impacts. Consult with the expert Life Science research consultants at RCH Solutions to find out how your research team can capitalize on the digital transformation taking place in Life Science.
RCH Solutions is a global provider of computational science expertise, helping Life Sciences and Healthcare firms of all sizes clear the path to discovery for nearly 30 years. If you’re interested in learning how RCH can support your goals, get in touch with us here.
Key Takeaways from NVIDIA’s GTC Conference Keynote
I recently attended NVIDIA’s GTC conference. Billed as the “number one AI conference for innovators, technologists, and creatives,” the keynote by NVIDIA’s always dynamic CEO, Jensen Huang, did not disappoint.
Over the course of his lively talk, Huang detailed how NVIDIA’s DGX line, which RCH has been selling and supporting since shortly after its inception, continues to mature as a full-blown AI enabler.
How? Scale, essentially.
More specifically, though, NVIDIA’s increasing lineup of available software and models will facilitate innovation by removing much of the software infrastructure work and providing frameworks and baselines on which to build.
In other words, one will not be stuck reinventing the wheel when implementing AI (a powerful and somewhat ironic analogy when you consider the impact of both technologies—the wheel and artificial intelligence—on human civilization).
The result, just as RCH promotes in Scientific Compute, is that the workstation, server, and cluster look the same to the users so that scaling is essentially seamless.
While cynics could see what they’re doing as a form of vendor lock-in, I’m looking at it as prosperity via an ecosystem. Much as I, and millions of other people around the world, are locked into Apple because we enjoy the “Apple ecosystem”, NVIDIA’s vision will enable the company to transcend its role as simply an emerging technology provider (which, to be clear, is no small feat in and of itself) to become the facilitator of a complete AI ecosystem. In such a situation, as with Apple, the components connect and work together seamlessly to create a next-level, friction-free experience for the user.
From my perspective, the potential benefit of that outcome—particularly within drug research/early development where the barriers to optimizing AI are high—is enormous.
The Value of an AI Ecosystem in Drug Discovery
The Cliff’s Notes version of how NVIDIA plans to operationalize its vision (and my take on it) is this:
- Application Sharing: NVIDIA touted Omniverse as a collaborative platform for “universal” sharing of applications and 3D assets.
- Data Centralization: The software-defined data center (BlueField-2 & 3 / DPU) was also quite compelling, though in the world of R&D we live in at RCH, it’s really more about Science and Analytics than Infrastructure. Nonetheless, I think we have to acknowledge the potential here.
- Virtualization: GPU virtualization was also impressive (though like BlueField, this is not new but evolved). In my mind, I wrestle with virtualization for density when it comes to Scientific Compute, but we (collectively) need to put more thought into this.
- Processing: NVIDIA is pushing its own ARM-based CPU as the final component in the mix. ARM is clearly going to be a force moving forward, and Intel x86_64 is aging, but we also have to acknowledge that this will be an evolution, not a flash cut.
What’s interesting is how this approach could play to enhance in-silico Science.
Our world is Cloud-first. Candidly, I’m a proponent of that for what I see as legitimate reasons (you can read more about that here). But like any business, Public Cloud vendors need to cater to a wide audience to better the chances of commercial success. While this philosophy leads to many beneficial services, it can also be a blocker for specialized/niche needs, like those in drug R&D.
To this end, Edge Computing (for those still catching up: a high-bandwidth, very-low-latency specialty compute strategy in which co-location centers sit topologically close to the Cloud) is a solution.
Edge Computing is a powerful paradigm in Cloud Computing, enabling niche features and cost controls while maintaining a Cloud-first tack. Teams can take advantage of the benefits of a Public Cloud for data storage while augmenting what Public Cloud providers can offer by keeping compute on the Edge. It’s a model that enables data to move faster than in the more traditional scenario; and in NVIDIA’s equation, DGX and possibly BlueField work as the Edge of the Cloud.
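As a hedged sketch of that storage-in-the-Cloud, compute-at-the-Edge pattern, the snippet below pulls an object from public-cloud object storage to a local edge node, processes it there, and pushes only the compact result back; the bucket, keys, and analysis step are illustrative assumptions, not NVIDIA’s or any vendor’s reference design.

```python
import boto3

s3 = boto3.client("s3")

# 1. The raw data lives in public-cloud object storage.
s3.download_file("example-research-bucket", "raw/assay_batch_042.parquet",
                 "/scratch/assay_batch_042.parquet")

# 2. The heavy compute runs on the edge node (placeholder for a real,
#    GPU-backed analysis step).
def analyze(path: str) -> bytes:
    with open(path, "rb") as fh:
        return str(len(fh.read())).encode()   # stand-in for a real result

result = analyze("/scratch/assay_batch_042.parquet")

# 3. Only the compact result moves back to the cloud.
s3.put_object(Bucket="example-research-bucket",
              Key="results/assay_batch_042.summary", Body=result)
```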
More interesting, though, is how this strategy could help Life Sciences companies dip their toes into the still-unexplored waters of Quantum Computing through cuQuantum, NVIDIA’s SDK for Quantum (qubit) circuit simulation on GPUs, for early research and discovery.
I can’t yet say how well this works in application, but the idea that we could use a simulator to test Quantum Compute code, as well as train people in this discipline, has the potential to be downright disruptive. Talking to those in the Quantum Compute industry, there are anywhere from 10 to 35 people in the world who can code in this manner (today). I see this simulator as a more cost-effective way to explore the technology, and potentially even a development platform that grows into more user-friendly, OS-type services for Quantum.
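For readers new to the idea, the sketch below shows in plain NumPy what a statevector qubit simulator fundamentally does, preparing a two-qubit Bell state; cuQuantum accelerates this same kind of linear algebra on GPUs at far larger qubit counts, but this toy example deliberately makes no use of the cuQuantum API itself.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)     # Hadamard gate
I = np.eye(2)
CNOT = np.array([[1, 0, 0, 0],                   # controlled-NOT on two qubits
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

state = np.zeros(4)
state[0] = 1.0                                   # start in |00>

state = np.kron(H, I) @ state                    # Hadamard on the first qubit
state = CNOT @ state                             # entangle the two qubits

print(np.round(state, 3))                        # ~[0.707, 0, 0, 0.707]: a Bell state
```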
A Solution for Reducing the Pain of Data Movement
In summary, what NVIDIA is proposing may simplify the path to a more synergistic computing paradigm by enabling teams to remain—or become—Cloud-first without sacrificing speed or performance.
Further, while the Public Cloud is fantastic, nothing is perfect. The Edge, enabled by innovations like those NVIDIA is introducing, could become a model that offers the upside of on-prem for niche workloads while reducing the sometimes-maligned task of data movement.
While only time will tell for sure how well NVIDIA’s tools will solve Scientific Computing challenges such as these, I have a feeling that Jensen and his team—like our most ancient of ancestors who first carved stone into a circle—just may be on to something here.
RCH Solutions is a global provider of computational science expertise, helping Life Sciences and Healthcare firms of all sizes clear the path to discovery for nearly 30 years. If you’re interested in learning how RCH can support your goals, get in touch with us here.