The customer required support in setting up, operating, optimizing and further developing a global cloud management environment. Under the managed service, the external service provider was also responsible for the complete management of the infrastructure, the coordination of the connected units and the dismantling of the entire cloud environment at the end of the operating phase.
Brief description:
As a managed service provider, Netlution was responsible over a period of six years for the error-free 24/7 operation of a global cloud solution in which internal units could book and use demos in virtual showrooms. We were responsible for all levels required to provide the portal and its functions. Without this demo infrastructure, existing and new customers would only have been able to familiarize themselves with new software, releases, scenarios and systems at great expense.
Situation:
The customer operated an internal and external cloud infrastructure and service to provide its global presales and sales organization with product demonstration environments for partners and customers. The internal landscape consisted of over 100 hypervisors with approximately 130 TB RAM and almost 5,000 CPUs. The total capacity of the storage systems amounted to approx. 2,000 TB. On average, around 100 product demo environments were permanently available worldwide.
Depending on the complexity of the use case, the demo environments provided consisted of several virtual machines (including databases and application systems), which were allocated corresponding resources in terms of computing power (CPU), RAM, network and storage capacities. The demo environments could largely be requested and used by customers via a self-service portal. To ensure high-quality, stable and cost-effective operation and further development of the environment, Netlution took over the operation of the cloud management environment as part of a managed service.
Customer request:
- In virtual showrooms, the customer was able to provide its sales organization, partners and new customers with a global cloud solution for demos
- In addition to four data centers and over 100 hypervisors, Netlution was responsible for orchestrating all levels (including virtualization) required to provide the portal and its functions in this context
The customer’s aim was to operate the cloud-based product demo environment economically and reliably. For this reason, the operation of the cloud management environment was handed over to an external service provider as a managed service.
A competent team of experts was required for the service, which took over operational responsibility over the entire product life cycle, further developed the service and operated it stably, securely and economically until the end of life.
Challenges:
- Semi-automated monitoring of VMware vCenter and infrastructure
- Consulting in an orchestrating and coordinating function between all internal customer units and external providers
- Optimization of internal information processing and dissemination
- Implementation of an average of over 1,000 service requests per year
- Resolving an average of 260 incidents per year
- Administration of the backend application (Veritas eCM)
- Management of the 1st/2nd level providers
The aim of the project was to create a company-wide, standardized concept for hardware, software and backup & recovery and to improve support at the same time. This included, among other things:
- Centralized resource and deployment management transparent for the customer
- Standardized and consistent billing basis, resulting in planning security and administrative relief
- Centralized assurance of the required service level agreements (SLAs)
- Continuous, needs-based further development of the environment
- Optimal integration into internal processes
- Independence from individuals
- Ensuring permanent external know-how
A concept had to be developed for cost-effective 24/7 operation (with on-call duty) within a highly complex multi-vendor structure. The IT specialists deployed had to remain continuously familiar with the operating processes, yet the net capacity requirement only allowed these employees to be deployed pro rata temporis. In addition, the skill requirements and capacity needs of the expert team fluctuated considerably depending on the phase of the product lifecycle. A solution had to be found for this.
KPIs:
- SLA compliance on a monthly basis
- Ensuring error-free operation (entire infrastructure, the backend application (Veritas eCM) and the demo portal)
- 24/7 on-call service
- Support in the implementation and planning of projects
- Needs analyses
- Capacity and resource management
- Incident management & service requests:
- SLA reporting (IRT, MPT)
- Overview of the severity of tickets
- Incidents: overview of MoD involvement / escalations
- Overview of changes processed during the month and outlook for upcoming changes (incl. success rate, explanation of problems/complications encountered)
- Overview of problems processed during the month and analysis of recurring incidents
- Outage reporting with statistics on the root cause
- Efficiency indicators
- Quality indicators
- Service improvement plan (CSI)
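As an illustration only, the monthly SLA figures listed above could be derived from ticket timestamps roughly as follows. This is a minimal sketch and not the reporting actually used in the project: it assumes that IRT stands for initial response time and MPT for maximum processing time, and the `Ticket` structure, the severity names and the target values in `TARGETS` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical (IRT, MPT) targets per severity; the contractual values
# are not stated in this case study.
TARGETS = {
    "high": (timedelta(minutes=30), timedelta(hours=4)),
    "medium": (timedelta(hours=2), timedelta(hours=24)),
}


@dataclass
class Ticket:
    severity: str
    opened: datetime
    first_response: datetime
    resolved: datetime


def sla_compliance(tickets):
    """Return (irt_rate, mpt_rate): the share of tickets that met each target."""
    if not tickets:
        return (1.0, 1.0)  # assumption: a month without tickets counts as compliant
    irt_ok = mpt_ok = 0
    for t in tickets:
        irt_target, mpt_target = TARGETS[t.severity]
        if t.first_response - t.opened <= irt_target:
            irt_ok += 1
        if t.resolved - t.opened <= mpt_target:
            mpt_ok += 1
    n = len(tickets)
    return (irt_ok / n, mpt_ok / n)


# Example month with four invented tickets.
base = datetime(2020, 1, 6, 9, 0)
tickets = [
    Ticket("high", base, base + timedelta(minutes=10), base + timedelta(hours=3)),
    Ticket("high", base, base + timedelta(minutes=45), base + timedelta(hours=5)),
    Ticket("medium", base, base + timedelta(hours=1), base + timedelta(hours=30)),
    Ticket("medium", base, base + timedelta(hours=1), base + timedelta(hours=20)),
]
irt_rate, mpt_rate = sla_compliance(tickets)
print(f"IRT {irt_rate:.0%}, MPT {mpt_rate:.0%}")  # prints "IRT 75%, MPT 50%"
```

Reporting the two rates separately keeps a slow first response distinguishable from a slow resolution, which matches the way the KPI list names IRT and MPT as separate figures.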
Netlution solution:
From 2014 onward, support was provided as a managed service in a rolling deployment system (RES) with > 10 Netlution (senior) consultants and two Netlution service managers, together with the agreement to permanently maintain a trained capacity reserve of 40%. Because the environment had to be available across different time zones, Netlution ensured its availability around the clock. A 3rd level on-call service (24/7) was established for this purpose. Alarms were raised by telephone with a defined response time.
Project duration:
The project started in 2014. Netlution dismantled the global cloud environment when the operating phase ended at the end of 2020.
Netlution services:
- Ensuring service, administration, optimization and further development of the cloud environment including all infrastructure components and services
- Troubleshooting 2nd / 3rd level (ticket-based)
- Service management (incl. customer contact person, review meetings, steering committee, etc.)
- Transparent, centralized resource and deployment management
- KPI / SLA reporting
- Documentation (incl. knowledge transfer management, e.g. wiki, logbook)
- Managed service in a rolling deployment system (RES) with > 10 Netlution (senior) consultants
- Two Netlution Service Managers
- Provision of a built-in capacity reserve of 40%
- Planning security through standardized, consistent billing basis
Results:
With one cloud landscape per participant, training courses can be offered that could not be implemented with the classic shared backend variant (landscape per training course), for example courses involving destructive user behavior or database upgrades.
- Provision of cloud landscapes at international events with 5,000+ visitors each
- Up to approx. 1,000 VMs used simultaneously
- Event agent runs on over 1,500 laptops