Brief description:
As a managed service, we ensure the error-free operation of the application that controls the global central compliance monitoring and patch management. This automation solution makes it possible to roll out configuration changes to a large number of servers globally in a uniform manner.
The managed service works with various interfaces and partners within the Group and its subsidiaries.
Situation:
The customer wants to use its server automation solution to ensure uniform management and operation of its global infrastructure (more than 150,000 servers). At the time of the initial introduction, several landscapes were already in operation that had been set up independently of each other for various reasons.
The unified solution was introduced as a strategic product. Over the years, features beyond the usual server automation were added, and the range of functions for end customers was continuously expanded.
This includes, in particular, daily reports on configuration and patch compliance, as well as customer-specific reports that help the customer’s server management teams fulfill their tasks and meet specific requirements, such as SOX and other audits.
In addition, the solution is used by various teams for patch installation and also offers the option of using “auto-remediation” to fix security gaps in the configuration of services and applications without further user interaction. This remediation can be specifically adapted to the requirements of the customer team and take into account various guidelines in terms of security and general standards.
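The interplay of compliance reporting and team-specific auto-remediation described above can be illustrated with a minimal Python sketch. It is not the customer's or BladeLogic's implementation: the `Finding` record, the rule names, the server names and the `team_policy` set are all hypothetical and serve only to show the idea of remediating just those findings a team has approved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One compliance check result (hypothetical record layout)."""
    server: str
    rule: str
    compliant: bool


def compliance_rate(findings):
    """Share of compliant findings in percent (100.0 if there are none)."""
    if not findings:
        return 100.0
    ok = sum(1 for f in findings if f.compliant)
    return 100.0 * ok / len(findings)


def remediation_candidates(findings, allowed_rules):
    """Non-compliant findings whose rule the team approved for auto-remediation."""
    return [f for f in findings if not f.compliant and f.rule in allowed_rules]


# Hypothetical sample data; real results would come from the automation tool.
findings = [
    Finding("srv-01", "ssh-root-login-disabled", True),
    Finding("srv-02", "ssh-root-login-disabled", False),
    Finding("srv-02", "password-max-age", False),
    Finding("srv-03", "password-max-age", True),
]
# Assumed team policy: only this rule may be fixed without user interaction.
team_policy = {"ssh-root-login-disabled"}

print(f"{compliance_rate(findings):.1f}% compliant")
for f in remediation_candidates(findings, team_policy):
    print(f"auto-remediate {f.rule} on {f.server}")
```

In this sketch, the `password-max-age` violation on `srv-02` is reported but not remediated, because the assumed team policy does not cover that rule.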
Customer request:
The aim of the project was to consolidate and optimize the existing global server automation solution and then to operate it independently as a managed service in line with the agreed SLA.
- Increasing and reporting compliance at infrastructure, database and application level
- Automatic detection and – according to customer requirements – elimination of vulnerabilities through the optimal use of BladeLogic
- Consulting in server management / further development of the infrastructure required for operation
- Taking over dedicated support for users of the automation solution, including an on-call service outside normal service hours
Challenges:
- Development of compliance checks
- Configuration compliance at OS, database and application level
- General security compliance
- Provision of on-demand scanning
- Creation and further development of a SOAP interface for connecting and interacting with external systems
- Definition and creation of interfaces to various CMDB and reporting systems
- Automation of infrastructure operations
- Support in creating patch jobs and specific reporting
- Support in the creation of targeted reports and analyses across a large number of servers and infrastructure operators, including hyperscalers
- Consulting in the area of process design
- Optimization of internal processes
- Design and implementation of highly automated processes within server automation
- Advice on the revision of customer processes
- Further development of the server automation solution
- Integration of new customers
- Holistic requirements analysis and subsequent implementation
- Stable service operation
- Continuous optimization through automation
To this end, the service was initially analyzed and taken over as a managed service as part of a transition phase. In addition to operating the environments, Netlution is responsible for the further development of the landscapes, the implementation and monitoring of compliance checks, the implementation of auto-remediations, and the stabilization and improvement of the existing environments. New environments are also set up and integrated into the service.
KPIs:
- Ensuring the secure operation of the customer infrastructure by monitoring more than 150,000 servers
- Ensuring BladeLogic application operation (approx. 100 servers in several landscapes worldwide)
- Sustainable increase in configuration compliance thanks to automatic monitoring, reporting and remediation
- Resolved more than 550 incidents in the last 12 months
- Supported approx. 700 service requests in the same period
- Average response within 30 minutes during on-call duty
Technologies used:
- BladeLogic / TrueSight Server Automation (versions 8.x to 21.02)
- More than 10 landscapes in operational use, approx. 100 application and linked servers
- Scripting
- Windows PowerShell
- Bash, KSH, NSH
- Python
- BLCLI
- C# (SOAP interface)
- MS SQL and file server in a failover cluster
- SLES as repository server
Netlution solution:
- Managed service in 5×11 operation
- Ensuring continuous operation through a rolling deployment system including on-site deployments at the customer’s premises and remote provision
- Establishing service management and continuous reporting to our customers
- Close cooperation within the project framework and in continuous service improvement
Project duration:
The managed service including transition started in 2016 and ends in June 2021.
Netlution services:
- Operational support of all BladeLogic landscapes for end users
- Configuration and integration of new landscapes
- Operation of the BladeLogic landscapes (incl. patching, compliance support and troubleshooting)
- Support of the database infrastructure (incl. troubleshooting)
- Creation and support of BladeLogic jobs, packages and templates
- Complete role-based user management
- Development and support of own and customer-specific scripts to meet new requirements