Pharmaceutical giant AstraZeneca, along with its research partner Oxford University, has been at the forefront of the global effort to research, develop and rapidly deploy a vaccine to curb the spread of Covid-19 across the world.
The first of the 100 million doses of the Oxford-AstraZeneca vaccine the UK government has acquired was administered at the start of January 2021, and – at the time of writing – GP surgeries across the country are starting to take delivery of their stocks.
While much of the media attention the Oxford-AstraZeneca vaccine has attracted to date has focused on the rapid pace of its development, much of the behind-the-scenes work that has made it possible to confer widescale protection against Covid-19 within the UK population has relied on cloud computing.
AstraZeneca’s global infrastructure services director, Scott Hunter, is responsible for the pharmaceutical company’s cloud platforms and innovation solutions for the cyber security and infrastructure part of its business.
The company relies on the platforms of four major public cloud providers to carry out its work: Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform and Alibaba Cloud.
It also draws on the interconnection capabilities of colocation giant Equinix to make all four of these firms’ respective cloud technologies available via AstraZeneca’s own datacentres.
“We’ve strapped our own methods of operations on top of that, but at the same time we take advantage of some of the niche capabilities like natural language processing (NLP) with search on Azure, and we use a lot of infrastructure as a service [IaaS] on AWS for research and development,” says Hunter.
Hunter’s 112-person team has responsibility for architecture, design and governance, and controls what the company does with its hybrid multicloud environment, known as the AZ Cloud.
Data ingest processes
With regard to AstraZeneca’s Covid-19 vaccine development and production efforts, the biggest challenge it has faced is a lack of time.
Pre-Covid, a vaccine would typically be tested on a small subset of the population, with trials gradually scaled up over a period of 12-24 months or longer. With that time unavailable, AstraZeneca instead needed to ingest the adverse event data from two billion doses being deployed simultaneously around the world.
“It’s incredibly important to collect data, particularly any adverse event data, to make sure that’s fed back to the relevant organisations in real time. Part of that process allows us to cross-reference this data with existing data for our own patients to ensure there’s no adversity with our patients and therapies they’re currently taking,” says Hunter.
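Conceptually, the cross-referencing step Hunter describes amounts to joining incoming adverse event reports against records of the therapies patients are already taking. The sketch below illustrates the idea only; the schema, identifiers and data are hypothetical, not AstraZeneca’s actual systems:

```python
import pandas as pd

# Hypothetical adverse event reports ingested from vaccination sites
adverse_events = pd.DataFrame({
    "patient_id": ["P001", "P002", "P003"],
    "event": ["headache", "fever", "fatigue"],
})

# Hypothetical record of therapies those patients are currently taking
current_therapies = pd.DataFrame({
    "patient_id": ["P002", "P003"],
    "therapy": ["TherapyA", "TherapyB"],
})

# Cross-reference: flag any adverse event reported by a patient who is
# also on an existing therapy, so it can be routed for safety review
flagged = adverse_events.merge(current_therapies, on="patient_id", how="inner")
print(flagged)
```

An inner join keeps only the reports that intersect with existing therapy records; in practice this kind of check would run continuously as new reports arrive.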
To do this, Hunter’s team has worked closely with AstraZeneca’s connectivity teams to ensure that the company’s data is used at the point of need – thanks to an abstraction layer provided by storage giant NetApp.
However, this was not always the way the company worked with its data platforms. About five years ago, the pharmaceutical company was predominantly using Hitachi Data Systems (HDS) and Hewlett Packard Enterprise (HPE), with NetApp used for its file-based solutions. It decided to review this arrangement and issued a request for information (RFI).
“We found that as well as offering similar capabilities to what AstraZeneca was getting with HDS and HPE, the differences in how the company could use NetApp in the cloud were compelling, and from a price point per terabyte it was better – so the optimisation of workloads and consolidation with one single provider and getting a good commercial deal made sense for us,” says Hunter.
NetApp is being used to provide a data fabric that enables AstraZeneca to collate the data across the four cloud providers it works with, federate this data with partners and research institutes, and to inform the development of other Covid-19 treatments and therapies.
“The important element of the data fabric used for hybrid multicloud was the movement of workloads from a private cloud to a public cloud – the benefits for scientists is they know the data is going to be available all of the time,” adds Hunter.
“The biggest challenge before the hybrid multicloud approach was data conversion. The data fabric allows us to effectively run NetApp in the cloud, so we don’t have to do any conversion – we can move data between on-premise and public cloud, and start the service as quickly as under 10 minutes.”
Vaccine deployment underway
Now that the vaccine is rolling out to priority groups across the UK, and in other territories worldwide, the benefits of having access to a global, hybrid cloud environment built on technologies from multiple providers are really coming to the fore.
“We host a lot of workloads and services in public cloud because, as you can imagine, rolling out a worldwide vaccine in countries that don’t have a datacentre location or are not close to one of our key datacentres, it makes sense that we take advantage of any one of the four public clouds to ensure it’s got the best range of services in there so we can get a common approach to learning what we need to do,” says Hunter.
These lessons centre on developing new patient advice, governance, details on adverse effects and tracking, based on the feedback the company gets.
Additionally, AstraZeneca is leaning on tools such as NLP to ensure it can set up websites featuring information about the vaccine in different languages. Normally, that information would be made available to patients in the form of a physical leaflet, but – as time is of the essence – that information now needs to be relayed online.
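As a rough illustration of that online delivery step, serving leaflet content in a patient’s preferred language can be done by negotiating against the browser’s Accept-Language header. The language set and content strings below are assumptions for illustration, not AstraZeneca’s implementation:

```python
# Illustrative per-language leaflet content (placeholder strings)
LEAFLETS = {
    "en": "Vaccine information leaflet (English)",
    "fr": "Notice d'information sur le vaccin (French)",
    "de": "Impfstoff-Informationsblatt (German)",
}

def pick_leaflet(accept_language: str, default: str = "en") -> str:
    """Return the leaflet for the first supported language in an
    HTTP Accept-Language header, falling back to a default."""
    for part in accept_language.split(","):
        code = part.split(";")[0].strip().lower()
        primary = code.split("-")[0]  # e.g. "en-GB" -> "en"
        if primary in LEAFLETS:
            return LEAFLETS[primary]
    return LEAFLETS[default]

print(pick_leaflet("fr-FR,fr;q=0.9,en;q=0.8"))  # serves the French leaflet
```

A real deployment would also need translated content produced and reviewed per market, which is where the NLP tooling Hunter mentions comes in.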
The company has implemented a security-by-design approach, which Hunter suggests has been essential, particularly at a time when R&D data and information is a prime target for cyber criminals.
“As you can imagine, right now AstraZeneca is at the front and foremost thought for lots of bad actors – we have five to six million events per day so security is key for us,” he says. “If you think about it from a disaster recovery or data protection point, the data fabric ensures we don’t inappropriately lose any data.”