As I noted in a previous post, containers are a big part of any cloud transition as they combine an application and its underlying prerequisites into neat packages. IBM Cloud Pak for Business Automation (CP4BA) brings that concept of containerization to the business automation portfolio. That portfolio, of course, includes the FileNet content services platform. As always, containers bring advantages to any FileNet implementation. For cloud, they are mandatory.
A major healthcare provider uses FileNet for, as you would expect, patient medical records. But their larger use is as the repository of record for all enterprise digital content, including email. The sheer volume of content and the speed at which it had to be ingested led to a very large system, perhaps the largest and highest-performance FileNet implementation in the world. IBM Enterprise Records manages retention for it all.
Their current FileNet infrastructure includes Linux virtual machines with Oracle as the database. One production instance has more than eighty active FileNet servers (CPE and Search).
The enterprise direction is to move all applications to the cloud. In this case, the chosen cloud deployment platform is Google Cloud Platform (GCP). The client created a team charged with learning about GCP and working through what it will take to move all major applications over. They asked us to join the team to help with understanding how FileNet will fit into the broader plan.
Like all major cloud providers, GCP has its own container platform, Google Kubernetes Engine (GKE). While the desired deployment target was GKE, the client is also a large user of Operational Decision Manager (ODM), and that team had decided to implement on GCP using Red Hat OpenShift. That gave us an opportunity to work with the ODM team on an OpenShift deployment and work in parallel to deploy FileNet on GKE.
An apparent licensing challenge prevented deploying Oracle on GCP, so the FileNet POC used Db2. Storage was a mixture of file systems (NFS) and S3 Object Storage for testing and comparison. IBM Directory supplied LDAP services.
Both deployments used FileNet containers instead of the broader CP4BA deployment. That was just a project choice at the time. Using FileNet containers results in a simpler system. It just has fewer parts. We also containerized and deployed their enterprise content services API stack. The client was interested in seeing how current on-premises content-enabled applications would interact with the cloud FileNet deployment.
In the end, we found minor differences between the GKE and OpenShift deployments. While that was true for a FileNet container deployment, there could be greater contrasts with the added complexity of a CP4BA-based system. Although GKE is one of the IBM “supported” Kubernetes environments, we did run into an interesting situation where IBM Support asked that we try to replicate an issue we had with GKE on the OpenShift side. It makes sense that the support team has more experience and available testing capability on OpenShift.
The client declared the GCP POC a success for FileNet and for the other applications. However, they are a large Oracle shop and are not ready to move the database onto GCP. From what we hear there are licensing challenges and performance concerns. An on-premises production OpenShift is in place and hosting other IBM applications. They have scheduled FileNet for deployment on that OpenShift with a goal of having a smaller P8 environment up by end of year. That deployment will be CP4BA using the on-premises Oracle DB and the on-premises directory. Storage architecture decisions are pending.
We did learn from this effort. A few of our experiences include:
- New release deployment is still a challenge. The changes occurring in each CP4BA release require changes in deployment processes and scripts. I suspect that will settle down over time. But for now, our recommendation is to disable automatic upgrades.
- Cloud resource management at large enterprise client sites, as you would expect, is strict. We commonly ran into locked-down features and capabilities that made it almost impossible to troubleshoot issues without active help from the cloud resources team. To be successful, whoever is deploying needs broad access and authority.
- Make sure you have an upfront plan for ALL the domain names and their associated certificates. We found changing those attributes after deployment to be extremely difficult.
- Plan the networking (VLANs, subnets, and so on) before starting, because all of that is configured in OpenShift, not in the container. It is easy to get configurations into the deployment script up front.
- Have a solid plan for storage before you start. The deployment assumes unlimited disk resources, and logs alone can quickly fill up any allocated space. Some mechanism, commercial or otherwise, for managing the logs is a big help.
- Develop a plan for how you expect to monitor the system once it is up and running. Logs, of course, are a big part of that. Both the cloud platform and OpenShift offer features that help, but only if they are part of the initial configuration.
- Be prepared for the bill. Containerized deployments have advantages, but they require more computing resources than traditional deployments.
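To make the log-management point in the list above concrete, here is a minimal sketch of a home-grown pruning job, assuming file-based logs on a mounted volume. The function name `prune_logs`, the `*.log` pattern, and the size budget are all hypothetical choices for illustration. This is not a FileNet or CP4BA feature, and a commercial log-management tool may well be the better answer.

```python
from pathlib import Path


def prune_logs(log_dir, max_total_bytes, pattern="*.log"):
    """Delete the oldest matching files until the total size fits the budget.

    Returns the list of deleted paths, oldest first.
    """
    # Oldest files first, based on last-modified time.
    files = sorted(
        (p for p in Path(log_dir).glob(pattern) if p.is_file()),
        key=lambda p: p.stat().st_mtime,
    )
    total = sum(p.stat().st_size for p in files)

    deleted = []
    for path in files:
        if total <= max_total_bytes:
            break  # Already within budget; keep the remaining (newer) files.
        total -= path.stat().st_size
        path.unlink()
        deleted.append(path)
    return deleted
```

A scheduled job could call something like `prune_logs("/path/to/logs", 5 * 1024**3)` to hold a volume to roughly 5 GiB of logs; both the path and the budget are placeholders.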
Scaling, once everything is up and running, is quite simple. Just start another instance of the over-stressed container. Of course, one of the promises of containerization is elasticity: the ability for the system to grow and shrink itself as demand grows and shrinks. We did encounter some challenges with elasticity, seemingly related to the Liberty application server. Liberty would occasionally spike the processor load to 100% and, with automated scaling enabled, a new container instance would start. Then the processor load would quickly drop back down to 10% or 20% and the new container would shut down, kind of a teeter-totter thing. One can easily resolve such behavior by writing custom elasticity rules, but we did not do that as part of the POCs.
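We did not write custom elasticity rules in the POCs, so the following is only an illustration of the idea, not what we deployed. It is a small simulation of one common damping approach: act only when the load stays above (or below) a threshold for several consecutive samples. The function name `plan_replicas`, the thresholds, and the sample values are all assumptions made for the example.

```python
def plan_replicas(cpu_samples, high=80.0, low=30.0, sustain=3,
                  start=1, max_replicas=10):
    """Return the replica count after each CPU sample (percent).

    A replica is added only after `sustain` consecutive samples at or above
    `high`, and removed only after `sustain` consecutive samples at or
    below `low`. Short spikes and dips are ignored.
    """
    replicas = start
    above = below = 0  # lengths of the current high / low streaks
    history = []
    for cpu in cpu_samples:
        above = above + 1 if cpu >= high else 0
        below = below + 1 if cpu <= low else 0
        if above >= sustain and replicas < max_replicas:
            replicas += 1
            above = 0
        elif below >= sustain and replicas > 1:
            replicas -= 1
            below = 0
        history.append(replicas)
    return history


# Brief spikes like the ones we saw from Liberty (hypothetical values).
spiky = [15, 100, 15, 20, 100, 10]
print(plan_replicas(spiky, sustain=1))  # reacts to every sample
print(plan_replicas(spiky, sustain=3))  # waits for sustained load
```

With `sustain=1` the simulated replica count bounces between one and two on every spike, which is the teeter-totter described above. With `sustain=3` it stays at one, yet a genuinely sustained load still triggers a scale-up. Real autoscalers offer comparable controls. The Kubernetes HorizontalPodAutoscaler, for example, has a stabilization window setting.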
Containers are a way to make infrastructure management easier and are certainly the future for all applications. Just remember that transitioning to them requires new designs and does require a new set of skills.