Background
A space science operations center (SOC) is a facility where scientists use data to conduct research and/or manage the operations of a spacecraft. Traditionally, data is made available for analysis by being downloaded to each user’s local system. As data volumes rapidly increase with a new generation of sensors, the process of delivering data to each scientist for analysis is becoming unsustainable. The purpose of this research is to invert the process and push analysis software to the data.
Approach
Through a novel process known as “containerization” (in which a functional piece of a system is distilled into a recipe and stored in an image) and the emergence of cloud technology, it should be possible to run many, if not all, elements of a space science operations center using a cloud-based solution. Our existing data analysis, data mining, and visualization software will be modernized, containerized, and then deployed on the cloud. To aid our research, as well as for security and to potentially keep some elements proprietary, a locally available cluster will be used as a “pseudo-cloud.”

Figure 1. Various services possible with a cloud or cloud-like infrastructure. Ideally, most of the services are available on the cloud or adjacent to it, without having to download the data locally.
Accomplishments
Our entire development process has been modernized and containerized, including our build system, bug-tracking system, and continuous integration system. Significantly, our entire generic SOC, including visualization, data mining, and analysis, has been containerized into a multi-container system. Each major piece of a SOC now resides in its own container, which can be deployed anywhere, including on commercial cloud service providers such as Amazon or on our on-premises computing cluster. This hybrid approach allows cost savings to be realized by migrating to the cloud gradually instead of taking an all-at-once “lift-and-shift” approach to cloud development. It also allows the cloud to be used only for specific use cases, such as those requiring compute speed or large disk arrays. Finally, as part of this research, both client and server support for the Heliophysics API (HAPI), which serves data to other requesting data centers, has been developed.
The biggest lesson learned was that the cloud is not the panacea that was predicted. Transforming to a full cloud-native infrastructure can offer much if one truly has the use case, but in many cases it is overkill, which can come at a high cost. Using several of the cloud technologies on on-premises systems and then migrating to commercial cloud services on demand seems like the best approach.
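To illustrate the HAPI data serving mentioned above, the following is a minimal sketch of how a client might build a data request URL for a containerized HAPI server. It is not the project's actual client code. The server address, dataset name, and parameter names are hypothetical, and the query keys (id, time.min, time.max, parameters) assume the HAPI 2.x convention; later versions of the specification use different names.

```python
from urllib.parse import urlencode


def hapi_data_url(server, dataset, start, stop, parameters=None):
    """Build a HAPI 2.x-style data request URL. No network access is performed.

    server     -- base URL of the HAPI endpoint (hypothetical in this sketch)
    dataset    -- dataset identifier as listed in the server's catalog
    start/stop -- ISO 8601 time strings bounding the request
    parameters -- optional list of parameter names to restrict the response
    """
    # Query keys follow the HAPI 2.x naming convention (an assumption here).
    query = {"id": dataset, "time.min": start, "time.max": stop}
    if parameters:
        query["parameters"] = ",".join(parameters)
    # urlencode percent-encodes reserved characters such as ':' and ','.
    return f"{server.rstrip('/')}/data?{urlencode(query)}"


if __name__ == "__main__":
    # Hypothetical server and dataset names, for illustration only.
    print(hapi_data_url(
        "https://soc.example.org/hapi",
        "mag_l2",
        "2020-01-01T00:00:00Z",
        "2020-01-02T00:00:00Z",
        parameters=["Bx", "By"],
    ))
```

Because the request is an ordinary URL, the same client works whether the server container runs on the on-premises cluster or on a commercial cloud provider. Only the base address changes.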