ARTEQ implements archive solution at MARIN

MARIN prepares for data explosion: from 1 to 8 petabytes in five years

MARIN is facing a data explosion. The Wageningen-based research institute anticipates that its volume of valuable research data will grow from approximately 1 petabyte to over 8 petabytes in five years. To prepare for this, MARIN implemented Oracle Hierarchical Storage Manager (HSM) in collaboration with its partner Arteq.

The Maritime Research Institute Netherlands (MARIN) is one of the five applied research institutes in the Netherlands. As an independent institution, MARIN has been helping its clients for over 85 years to make structures in water, such as ships, platforms, and ports, smarter, cleaner, and safer. It does so through model tests and simulations, such as testing the design of the Second Maasvlakte when it was still in the planning stage, with ships that had not yet been conceived. “We do this with around 400 experts based in Wageningen for clients all over the world, including both companies and governments,” says Kelvin van Brakel, System Administrator at MARIN.

These model tests are increasingly complemented by numerical simulations (CFD, Computational Fluid Dynamics) for which MARIN deploys computing clusters. “In some cases, this is the same work as with the model tests, but done digitally, often faster and more cost-effective. Sometimes a simulation provides more information than model tests because with model tests, you can only measure where you place sensors on the model, whereas with such a simulation, you get results from every location.”

Especially due to the deployment of the computing cluster added last year, MARIN is expecting an explosive increase in the amount of stored data. Currently, MARIN already has tens of millions of files, totaling approximately 1 petabyte (1000 terabytes) of research data. Without the use of the new computing cluster, MARIN expects this amount to increase by 40 percent each year, resulting in 3.2 petabytes of data in five years.

However, with the new computing cluster, this estimate is raised to 8.1 petabytes.
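The projections above can be sanity-checked with simple compound-growth arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes growth compounds annually from the 1 petabyte baseline, which the article does not state. Under that assumption, 3.2 petabytes after five years corresponds to roughly 26 percent growth per year and 8.1 petabytes to roughly 52 percent, while 40 percent compounded annually would give about 5.4 petabytes, so the figures quoted presumably rest on a different baseline or growth model.

```python
def projected_volume(start_pb: float, annual_growth: float, years: int) -> float:
    """Volume in petabytes after compounding annual_growth (0.40 = 40 percent) for `years` years."""
    return start_pb * (1 + annual_growth) ** years


def implied_annual_growth(start_pb: float, end_pb: float, years: int) -> float:
    """Constant annual growth rate that takes start_pb to end_pb in `years` years."""
    return (end_pb / start_pb) ** (1 / years) - 1


if __name__ == "__main__":
    # Assumption: 1 PB baseline today, five-year horizon, annual compounding.
    print(f"40% per year for 5 years: {projected_volume(1.0, 0.40, 5):.2f} PB")
    print(f"Implied rate for 3.2 PB:  {implied_annual_growth(1.0, 3.2, 5):.1%}")
    print(f"Implied rate for 8.1 PB:  {implied_annual_growth(1.0, 8.1, 5):.1%}")
```

Rerunning the calculation with the actual baseline and growth assumptions would show how much tape capacity each scenario requires per year.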

MARIN's new data strategy: scalable archiving

“With the anticipated growth in mind, we needed a scalable storage solution that also provided robust archiving capabilities,” emphasizes Van Brakel. “As a research institute, we want to be able to preserve data for the long term and retrieve it when needed. It’s much more cost-effective to archive this data than to keep it on active storage disks.”

MARIN had been missing this archiving function in recent years. “Years ago, we used a solution that, when archiving, replaced files in the active storage with small stub files of just a few kilobytes. Users thought they could open such a file directly, but actually retrieving a file was much more complicated and could take several days. We didn’t want that anymore.”

Therefore, MARIN decided to disable the archive and expand the active storage. “This wasn’t a strategy we could sustain for long, especially considering the expected growth in data volume. Disks for active storage are too expensive to use for data that should be in an archive.” Additionally, the distinction between active and archive data disappeared, making it challenging to classify data and place it in the right locations for users.

For Van Brakel, one thing was clear: “We needed an archiving solution again.” After a tendering process, MARIN chose Oracle HSM.

Future-proof data storage: MARIN's collaboration with Arteq and Oracle HSM

The solution was implemented in collaboration with Arteq, which is responsible for periodic maintenance such as updates and upgrades. “The communication and interaction between us run very smoothly,” says Van Brakel.

While competing solutions rely solely on disk storage, Oracle HSM offers a combination of disk and tape storage. Wim Huijbers, Managing Director of Arteq, explains: “Essentially, all data is stored on tape, but Oracle Hierarchical Storage Manager keeps track of where it is stored so that a file can be retrieved quickly without the intervention of an administrator. This happens automatically, increasing the speed of operations while significantly reducing operational costs.”

Through close collaboration between MARIN, Oracle, distributor Techdata, and Arteq, the right hardware, software, and implementation were selected. “In terms of hardware, we are prepared for a storage capacity of over 8 petabytes in five years,” Van Brakel explains. “Currently, we have a tape robot that is sized for the scenario where we will need 3.2 petabytes of storage in five years, but we have also chosen an expansion unit with enough capacity to scale up to 8.1 petabytes.”

Furthermore, the solution is fully redundant. MARIN’s secondary data center in Ede houses a storage environment identical to the one at the main location in Wageningen. Oracle HSM writes data to both environments, keeping them in sync. “For example, if ransomware affects the storage in Wageningen, the data is still available in Ede,” explains Huijbers.

The migration of data to the new environment is still ongoing. Huijbers says, “MARIN has so much data to archive that the migration takes a considerable amount of time. We are talking about hundreds of terabytes of data that need to be transferred.”

However, MARIN is already reaping the benefits of Oracle HSM. Van Brakel states, “The limited space on active storage used to cause problems regularly. When you run out of space on active storage, processes can stall. By archiving data, we have been able to free up space. We no longer need to free up space ad hoc to keep working.

“Furthermore, the distinction between active and archived data is now clearer. This allows us to work more smoothly and shift our focus to other projects.”

Contact:

Would you like to explore how Arteq can assist your organization with advanced storage solutions and seamless management of growing data volumes?

Contact us today! Our team is ready to discuss your specific needs and provide customized solutions.