PaaS Part 4: How PaaS facilitates data migration: A path to implementation success

May 30, 2017
Crosslake

Recently, a SaaS company within the financial technology industry acquired a competitor and launched an initiative to merge both companies’ critical customer-facing applications and data. This was the latest in a series of acquisitions, leaving the company with several platforms that needed migrating.

In this post, we explore the problems that hampered the company’s productivity and the underlying causes of the difficulties it faced. The example shows how PaaS (Platform-as-a-Service) facilitates data migration and where it delivers value: here, a PaaS approach could have turned a troubled migration project into a successful one.

Cause #1: Absence of modernization efforts

The company had all of its data in a single, massive database that supported the various applications and services, including the core platform that served their Windows applications. The database was accessed in different ways (ORM, direct SQL, stored procedures) and from various platforms (.NET, Java, Ruby on Rails). This presented several problems (and opportunities for architecture evolution): no standard repository pattern was used across the different applications, and multiple services shared the same database.

A modernization effort could have started by focusing on the data service layer. PaaS could have significantly helped here by facilitating a microservices-based approach, easing the development, testing and deployment of a centralized data service.

The data service would have provided a single place to manage database access (authentication, authorization, and data transfer) and would have standardized persistence patterns and practices. Because it is a service, it would also have been independently scalable.

Once a service had been established, this abstraction could have also facilitated building out a hybrid data store for the various persistence needs.
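To make the idea concrete, here is a minimal sketch of such a data service, assuming a repository-style contract behind a single entry point. All names (Repository, InMemoryRepository, DataService, the client IDs) are hypothetical and chosen for illustration; the in-memory store simply stands in for a real database so the example runs anywhere.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


class Repository(Protocol):
    """Persistence contract every backing store would satisfy (hypothetical)."""

    def get(self, key: str) -> dict | None: ...

    def put(self, key: str, record: dict) -> None: ...


@dataclass
class InMemoryRepository:
    """Stand-in for a real database; an assumption made so the sketch is runnable."""

    _rows: dict[str, dict] = field(default_factory=dict)

    def get(self, key: str) -> dict | None:
        return self._rows.get(key)

    def put(self, key: str, record: dict) -> None:
        # Store a copy so callers cannot mutate persisted data by accident.
        self._rows[key] = dict(record)


class DataService:
    """Single entry point for data access: authorization plus persistence."""

    def __init__(self, repository: Repository, allowed_clients: set[str]) -> None:
        self._repository = repository
        self._allowed_clients = allowed_clients

    def _authorize(self, client_id: str) -> None:
        # One place to enforce access rules, instead of one per application.
        if client_id not in self._allowed_clients:
            raise PermissionError(f"client {client_id!r} may not access data")

    def read(self, client_id: str, key: str) -> dict | None:
        self._authorize(client_id)
        return self._repository.get(key)

    def write(self, client_id: str, key: str, record: dict) -> None:
        self._authorize(client_id)
        self._repository.put(key, record)
```

Because applications depend only on DataService, the repository behind it could later be swapped for a different store without touching the .NET, Java or Rails callers, which is the property the hybrid data store relies on.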

Cause #2: Lack of data abstraction

Without a data service, company managers were locked into keeping all their data in the same database, because they feared that pulling out data elements might cause unforeseen breakages. Nobody knew how removing certain data types might affect one or more applications. As a result, the company ended up with a single database housing several different types of data (transactional, analytical, XML documents, logs).

Of the 800 GB of data in the database, approximately 500 GB did not need to be there: 300 GB of audit (write-once) log data and 200 GB of XML documents (simple document data).

Using PaaS, the company could have built a data service, and then begun the process of separating out the different types of data into data stores that are designed to handle them (e.g., SQL Server OLTP for transactional data and OLAP for analytical data, MongoDB for XML documents and ELK for logging). PaaS could have provided the various data stores as managed services and exposed the data service as a registered component that the multiple applications discover via a service registry.

Applications wouldn’t need to know where data is stored, as the data service would handle that abstraction. In the end, each data store would have held only the kind of data it is designed for, making storage far more efficient and the applications faster.
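The routing described above can be sketched as follows. This is an illustration under stated assumptions, not the company's design: DataKind, STORE_REGISTRY and RoutingDataService are hypothetical names, the store labels echo the examples in the text, and plain Python lists stand in for the managed services so that no real client library is involved.

```python
from enum import Enum


class DataKind(Enum):
    """The four kinds of data the article says shared one database."""

    TRANSACTIONAL = "transactional"
    ANALYTICAL = "analytical"
    DOCUMENT = "document"
    LOG = "log"


# Hypothetical registry: maps each kind of data to the store meant to hold it.
STORE_REGISTRY = {
    DataKind.TRANSACTIONAL: "sqlserver-oltp",
    DataKind.ANALYTICAL: "olap",
    DataKind.DOCUMENT: "mongodb",
    DataKind.LOG: "elk",
}


class RoutingDataService:
    """Callers say what kind of data they have; the service picks the store."""

    def __init__(self, registry: dict) -> None:
        self._registry = dict(registry)
        # In-memory lists stand in for the real managed data stores.
        self._stores = {name: [] for name in self._registry.values()}

    def save(self, kind: DataKind, payload: dict) -> str:
        store_name = self._registry[kind]
        self._stores[store_name].append(payload)
        return store_name

    def count(self, store_name: str) -> int:
        return len(self._stores[store_name])


service = RoutingDataService(STORE_REGISTRY)
service.save(DataKind.LOG, {"event": "login", "user": "u-1"})
service.save(DataKind.TRANSACTIONAL, {"account": "a-1", "amount": 25})
```

An application calling save never names a database, so moving the audit logs or XML documents out of the relational store would be a change to the registry rather than to every caller.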

Cause #3: No on-demand deployment capability

Each time the company wanted to undertake a refactoring effort, it faced extended test and deployment cycles, because the complete functionality of the platform had to be tested and deployed at once. These efforts were significantly hampered by the lack of easily reproducible environments on which integration and regression tests could be run.

Testing requirements were made more stringent by the regulatory environment of the industry; the company had to take extreme care in handling its data. This meant being able to scrub sensitive data (e.g., individuals’ mailing addresses, phone numbers and other personal identifiers) before running any tests on production data.
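As a rough illustration of that cleaning step, the sketch below redacts a fixed set of sensitive fields from each record before it reaches a test environment. The field names and the scrub_record and scrub_rows helpers are hypothetical, since the article does not describe the real schema, and a regulated environment would likely need more than simple redaction.

```python
from __future__ import annotations

from typing import Any, Iterable, Mapping

# Hypothetical field names; the real schema is not given in the article.
SENSITIVE_FIELDS = frozenset({"mailing_address", "phone_number", "national_id"})


def scrub_record(
    record: Mapping[str, Any],
    sensitive: frozenset = SENSITIVE_FIELDS,
    placeholder: str = "REDACTED",
) -> dict[str, Any]:
    """Return a copy of the record with sensitive values replaced."""
    return {
        key: (placeholder if key in sensitive else value)
        for key, value in record.items()
    }


def scrub_rows(rows: Iterable[Mapping[str, Any]]) -> list[dict[str, Any]]:
    """Scrub every row of a production extract before loading it for tests."""
    return [scrub_record(row) for row in rows]


production_rows = [
    {"customer_id": 1, "mailing_address": "1 Main St", "phone_number": "555-0100"},
    {"customer_id": 2, "mailing_address": "2 Oak Ave", "phone_number": "555-0101"},
]
test_rows = scrub_rows(production_rows)
```

The originals are left untouched and non-sensitive columns such as customer_id pass through unchanged, so tests that join on identifiers still work against the scrubbed copy.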

Using PaaS, the company could have created and deployed instances easily, automatically and quickly, allowing the team to put the new code through its paces properly. Moreover, these instances would have been production replicas (along with production-level test data), giving the team the highest-fidelity test model: an environment that contains all of the randomness of data and user input, machine behavior and networking issues that users will face in production.

Instead, testing was limited due to the effort it took to create a suitable environment.

For the data migration team to perform their tasks, IT had to set up a test environment. It took approximately 24 hours to back up and restore production data onto the staging environment due to the size of the database. Without automated deployment and a platform to support it, team members could not quickly recover from a mistake in the process.
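A back-of-the-envelope calculation shows how much the database size contributed to that delay. It uses only the figures quoted in this post (800 GB total, 300 GB of audit logs, 200 GB of XML documents, roughly 24 hours per backup and restore) and assumes, purely for illustration, that restore time scales linearly with data volume; real restore times depend on many other factors.

```python
TOTAL_GB = 800            # full database size quoted in the post
AUDIT_LOG_GB = 300        # write-once audit log data
XML_DOCUMENT_GB = 200     # simple document data
RESTORE_HOURS = 24        # approximate backup-and-restore time

# Effective throughput implied by the quoted figures.
throughput_gb_per_hour = TOTAL_GB / RESTORE_HOURS

# Data left in the relational store if logs and documents lived elsewhere.
remaining_gb = TOTAL_GB - AUDIT_LOG_GB - XML_DOCUMENT_GB

# Assumption: restore time is proportional to data volume.
estimated_hours = remaining_gb * RESTORE_HOURS / TOTAL_GB

print(f"Implied throughput: {throughput_gb_per_hour:.1f} GB/hour")
print(f"Remaining relational data: {remaining_gb} GB")
print(f"Estimated restore time: {estimated_hours:.1f} hours")
```

Under that linear assumption, moving the logs and documents out of the relational database would cut the restore from about 24 hours to about 9, before any benefit from automated environment provisioning.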

Conclusion

PaaS provides organizations with effective tools for testing, re-platforming, and re-architecting their monolithic applications. The ease of deployment, management, and scalability of PaaS should attract development and operations teams that want flexibility and data security as they pursue new initiatives in support of the business.

Organizations that continue to develop in legacy environments will be challenged to compete on time-to-market, productivity, and the cost of building innovation. Crosslake helps companies accelerate their efforts with PaaS. If you’d like to explore how PaaS facilitates data migration or other ways in which it might benefit your development efforts, contact us.