Helper Scripts and Commands
Comprehensive operational runbook, diagnostic commands, and utility script references for the MCM Platform.
This guide serves as a comprehensive operational runbook for managing, diagnosing, and maintaining the Multi-Cloud Management (MCM) platform. It details environment-specific control commands (3-Tier vs. Single VM), host-level resource diagnostics, container platform statistics, SSL/TLS certificate inspections, and utility script executions.
1. Platform Control & Monitoring (Docker Swarm & Systemd)
These commands are used to check the cluster topology, service health, scheduling, and to manage the platform's background system daemon.
Cluster Node Status
Verify that all virtual machines are joined, ready, and active inside the cluster management plane (run on Access Node for 3-tier):
docker node ls
- Explanation: Lists all nodes participating in the orchestrated cluster, their status (Ready/Down), availability (Active/Drain), and their manager status (e.g., the Access Node appears as Leader; worker nodes show an empty MANAGER STATUS column).
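As a rough sketch, the node check can be scripted by asking docker node ls for a machine-readable format and flagging any node that is not both Ready and Active. The helper name unhealthy_nodes is illustrative and is not part of the MCM tooling.

```shell
# Hypothetical helper (not shipped with MCM): reads lines of
# "<hostname> <status> <availability>" on stdin and prints every node
# that is not both Ready and Active.
unhealthy_nodes() {
  awk '$2 != "Ready" || $3 != "Active" { print }'
}

# Example usage on the Access Node (needs a Swarm manager):
#   docker node ls --format '{{.Hostname}} {{.Status}} {{.Availability}}' | unhealthy_nodes
```

Empty output means every node reported Ready and Active at the time of the check.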
Service Replica Health
Verify that all backend application and database services are running with their target replicas (run on Access Node for 3-tier):
docker service ls
- Explanation: Returns a high-level list of all deployed services, their scheduling mode, and replica counts. Healthy services display 1/1 (indicating the service task is running and has passed health checks). Initialization tasks (prefixed with mcm_init-) transition to 0/1 or exit once they complete database schema migrations or routing setup.
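The replica check lends itself to a small filter. The sketch below assumes the Name and Replicas format fields of docker service ls and skips the mcm_init- tasks, since those are expected to fall to 0/1 after finishing. The function name degraded_services is hypothetical.

```shell
# Hypothetical helper: reads "<name> <running>/<desired>" lines on stdin
# and prints services whose running count differs from the desired count.
# Services prefixed with mcm_init- are skipped because they are expected
# to drop to 0/1 once their one-off work is done.
degraded_services() {
  awk '
    $1 ~ /^mcm_init-/ { next }
    {
      split($2, r, "/")
      if (r[1] != r[2]) print $1, $2
    }'
}

# Example usage on the Access Node:
#   docker service ls --format '{{.Name}} {{.Replicas}}' | degraded_services
```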
Stack Task Mapping
Inspect the scheduling and actual node mapping of container instances across the cluster (run on Access Node for 3-tier):
docker stack ps mcm
- Explanation: Displays exactly which node each service container is currently running on, its current state (Running, Preparing, or Shutdown), and the error message if a container failed to start.
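To see how tasks are spread across the cluster at a glance, the task list can be reduced to a per-node count. This is a minimal sketch that assumes the Node and CurrentState format fields of docker stack ps; tasks_per_node is an illustrative name.

```shell
# Hypothetical helper: reads "<node> <current state...>" lines on stdin
# and prints "<node> <count>" for tasks whose state starts with Running.
tasks_per_node() {
  awk '$2 == "Running" { count[$1]++ }
       END { for (n in count) print n, count[n] }' | sort
}

# Example usage on the Access Node:
#   docker stack ps mcm --format '{{.Node}} {{.CurrentState}}' | tasks_per_node
```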
Tail Service Application Logs
To inspect application-level outputs and debug issues, tail the stdout/stderr logs of a specific cluster service (run on Access Node for 3-tier):
# General service tailing
docker service logs mcm_<service-name> --tail 100 --follow
# View Keycloak IAM initialization logs
docker service logs mcm_keycloak --tail 100
# View Backend Core API connectivity logs
docker service logs mcm_mcm-api --tail 100
- Explanation: docker service logs fetches aggregated logs from all container replicas associated with that service across all cluster hosts.
Restart a Specific Service / Container
To restart any individual service or container in the Swarm cluster (run on Access Node for 3-tier):
docker service update --force mcm_<service-name>
- Explanation: Performs a rolling restart of all container replicas for the designated service without taking down the rest of the platform.
Tear Down the Stack
To safely stop and delete all service configurations, containers, and networks in the platform stack (run on Access Node for 3-tier):
docker stack rm mcm
- Explanation: Removes the orchestrated stack deployment. This tears down running containers and virtual networks without deleting persistent volumes (databases remain intact).
Systemd Service Management
Control the background system service wrapper:
# Check service status
sudo systemctl status mcm.service
# Start the platform
sudo systemctl start mcm.service
# Stop the platform
sudo systemctl stop mcm.service
# Restart the platform
sudo systemctl restart mcm.service
# Enable auto-start at system boot
sudo systemctl enable mcm.service
# Disable auto-start at system boot
sudo systemctl disable mcm.service
- Explanation: Manages the host operating system service state for the platform orchestrator daemon.
Service Log Inspection
View system logs generated by the systemd daemon:
# View live real-time log streams
sudo journalctl -u mcm.service -f
# View the last 100 lines of log output
sudo journalctl -u mcm.service -n 100 --no-pager
# View logs generated within a specific timeframe
sudo journalctl -u mcm.service --since "30 minutes ago" --no-pager
- Explanation: Displays output logs captured from stdout/stderr of the systemd wrapper service, which is helpful for troubleshooting container compilation or host launch script execution.
2. Host-Level Resource Diagnostics (Common)
Before running the installer or when diagnosing performance degradation, use these commands on any VM to inspect host hardware resources.
Checking System Memory
Verify available, used, and cached RAM on the host virtual machine:
free -h
- Explanation: Displays the total RAM, how much is consumed, how much is free, and the swap usage. The -h flag converts bytes into human-readable formats (e.g., GB, MB).
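For scripted checks, the megabyte output of free is easier to parse than the human-readable form. The sketch below assumes a procps version whose free output includes the "available" column as the seventh field of the Mem: row; available_mb is a hypothetical helper name.

```shell
# Hypothetical helper: reads `free -m` output on stdin and prints the
# "available" memory in MiB (seventh field of the "Mem:" row).
available_mb() {
  awk '/^Mem:/ { print $7 }'
}

# Example usage:
#   free -m | available_mb
```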
Checking Disk Space
Verify storage capacity and disk usage across all mounted file systems:
df -h
- Explanation: Shows disk partition allocations. Ensure the root partition (/) has at least 15–20% free space to prevent write blocks on database and container volumes.
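The free-space guideline can be turned into a quick check. This is a minimal sketch that derives the free percentage from the Capacity column of POSIX-format df output; the helper name free_percent and the 15 threshold in the usage line are illustrative.

```shell
# Hypothetical helper: reads `df -P <path>` output on stdin and prints
# the free percentage, computed as 100 minus the Capacity (use%) column.
free_percent() {
  awk 'NR == 2 { sub(/%/, "", $5); print 100 - $5 }'
}

# Example usage:
#   pct="$(df -P / | free_percent)"
#   [ "$pct" -ge 15 ] || echo "WARNING: only ${pct}% free on /"
```

Because df rounds the Capacity column, the result is an approximation to within about one percentage point.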
CPU Core Count
Identify the number of logical processor cores available on the VM:
nproc
- Explanation: Returns the active processor count, which helps verify if the instance size matches the minimum required specifications.
Process CPU & Memory Utilization
Inspect real-time CPU and memory usage of the top running processes:
top -bn1 | head -22
- Explanation: Runs the system monitor in batch mode (-b), performs a single iteration (-n1), and prints the header plus the top 15 resource-consuming processes to diagnose CPU spikes or memory leaks.
3. Container Platform Diagnostics (Common)
Use these commands on any cluster VM to inspect the local container engine daemon, storage pools, and performance.
Container Engine Info
Inspect the container platform daemon's configuration, active runtime, and overall state:
docker info
- Explanation: Returns system-wide information including the number of running/paused/stopped containers, active storage driver (usually overlay2), and network plugin configurations.
Container Storage Allocation
Check disk space usage allocated to images, container runtimes, and local volumes:
docker system df
- Explanation: Displays a breakdown of space consumed by images, active container writable layers, local volumes, and build cache, highlighting reclaimable space.
Live Container Statistics
View real-time CPU, memory, network, and disk I/O usage statistics for all running containers on the current host:
docker stats --no-stream
- Explanation: Outputs a single snapshot (--no-stream) of resource consumption for all active local containers, helping identify memory-heavy or CPU-unbounded services.
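To pick out memory-heavy containers from the snapshot, the output can be filtered by memory percentage. The sketch below assumes the Name and MemPerc format fields of docker stats; the helper name high_memory_containers and the default limit of 80 are illustrative choices, not MCM recommendations.

```shell
# Hypothetical helper: reads "<container> <mem%>" lines on stdin and prints
# those at or above the given percentage (first argument, default 80).
high_memory_containers() {
  awk -v limit="${1:-80}" '{
    pct = $2
    sub(/%/, "", pct)
    if (pct + 0 >= limit + 0) print $1, $2
  }'
}

# Example usage on any cluster VM:
#   docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | high_memory_containers 80
```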
Host Network Port Bindings
Identify which processes or sockets are occupying specific network ports:
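ss -tlnp | grep -E ':(80|443)[[:space:]]'
- Explanation: Lists active TCP sockets (-t) in listening mode (-l) with numerical ports (-n) and owning process IDs (-p). Matching the whitespace after the port number keeps the filter from also reporting ports such as 8080. Run the command with sudo to see processes owned by other users. Helpful for verifying that ports 80 and 443 are free before deploying the access node gateway.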
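A pre-deployment port check can also be scripted by comparing exact port numbers instead of pattern matching. This is a minimal sketch that assumes the default column layout of ss -tln, where the local address and port are the fourth field; ports_in_use is a hypothetical helper name.

```shell
# Hypothetical helper: reads `ss -tln` output on stdin and prints each of
# the ports given as arguments that already has a listening socket.
ports_in_use() {
  awk -v ports="$*" '
    BEGIN { n = split(ports, want, " ") }
    NR > 1 {
      # Field 4 is "address:port"; the port follows the last colon,
      # which also covers IPv6 forms such as [::]:443.
      k = split($4, parts, ":")
      for (i = 1; i <= n; i++)
        if (parts[k] == want[i]) seen[want[i]] = 1
    }
    END { for (i = 1; i <= n; i++) if (seen[want[i]]) print want[i] }'
}

# Example usage:
#   ss -tln | ports_in_use 80 443
```

Empty output means none of the listed ports is occupied.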
4. SSL/TLS Certificate and Keystore Inspections (Common)
These commands help inspect and validate TLS certificate files and Java Keystores directly on the VM filesystems.
Certificate File Verification
List the certificates directory contents for a specific service:
# E.g., for the Core API service
ls -la /var/lib/mcm/mcm-api/certs/
- Explanation: Lists the private keys, certificate files, and truststores generated for the microservice.
Certificate Validity & Expiration
Check the issue and expiration dates of a certificate file to diagnose TLS handshake errors:
openssl x509 -in /var/lib/mcm/mcm-api/certs/mcm-api.crt -noout -dates
- Explanation: Reads the X.509 certificate file (-in) and prints only the start and expiration dates (-dates) without outputting the raw certificate text (-noout).
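The expiration date can be converted into a days-remaining figure for monitoring. The sketch below parses the notAfter line printed by openssl and assumes GNU date (for the -d option); the helper name days_until_expiry is hypothetical, and its optional argument (current time in epoch seconds) exists only to make the calculation repeatable.

```shell
# Hypothetical helper: reads openssl date output on stdin (a line such as
# "notAfter=Jan  1 00:00:00 2030 GMT") and prints whole days until expiry.
# A negative result means the certificate has already expired.
# Assumes GNU date; runs in a subshell so no variables leak to the caller.
days_until_expiry() (
  now="${1:-$(date +%s)}"
  end="$(sed -n 's/^notAfter=//p')"
  end_epoch="$(date -d "$end" +%s)" || exit 1
  echo $(( (end_epoch - now) / 86400 ))
)

# Example usage:
#   openssl x509 -in /var/lib/mcm/mcm-api/certs/mcm-api.crt -noout -dates | days_until_expiry
```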
Java Keystore Inspection
Inspect the certificates and credentials contained inside a Java keystore (keystore.p12):
keytool -list -v -keystore /var/lib/mcm/mcm-api/certs/keystore.p12 -storepass <KEYSTORE_PASSWORD>
- Explanation: Lists keystore entries in verbose mode (-v) to verify that the private key and certificate chain are correctly imported and valid.
5. Using Platform Control Scripts
The MCM deployment provisions automated lifecycle management scripts on the hosts to simplify operations.
Upgrade the Platform (upgrade.sh on Access Node for 3-tier)
Follow these steps to download the new version artifact, extract it, and execute the upgrade script.
Step 1: Download the New Version Archive
From the registry repository, download the deployment archive:
# Define credentials at the top
REGISTRY_USER="your_username"
REGISTRY_PASS="your_password"
# Download the archive from the registry repository
wget --user="$REGISTRY_USER" --password="$REGISTRY_PASS" -O mcm_artifacts_[NEW_VERSION].tar.gz "http://92.204.249.45:8080/repository/mcm-artifacts/[NEW_VERSION]/mcm_artifacts_[NEW_VERSION].tar.gz"
Step 2: Extract the Archive & Navigate
tar -xzvf mcm_artifacts_[NEW_VERSION].tar.gz
cd mcm_artifacts
Step 3: Run the Upgrade Script
Execute the upgrade script with root privileges:
# Full upgrade (reloads images and configurations)
sudo ./upgrade.sh
# Configuration-only upgrade (skips loading large image files)
sudo ./upgrade.sh --skip-images
- Explanation: ./upgrade.sh updates the container image references, resyncs configurations, and rolls out updates to the service containers without deleting database tables or volumes.
- The --skip-images flag is useful when you have only modified environment files (user_config.env) or certificate stores and want to apply configuration changes quickly without reloading large offline image archives.
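The three upgrade steps can be chained so that each one runs only if the previous one succeeded. This is a sketch under stated assumptions: the helper names artifact_name, artifact_url, and run_upgrade are hypothetical, the version passed in stands for [NEW_VERSION], and REGISTRY_USER and REGISTRY_PASS are expected to be set as in Step 1.

```shell
# Hypothetical helpers: derive the archive file name and download URL
# from the version string, following the naming pattern used in Step 1.
artifact_name() { printf 'mcm_artifacts_%s.tar.gz\n' "$1"; }
artifact_url() {
  printf 'http://92.204.249.45:8080/repository/mcm-artifacts/%s/%s\n' \
    "$1" "$(artifact_name "$1")"
}

# Sketch of the full flow: download, extract, then run upgrade.sh.
# Extra arguments (for example --skip-images) are passed to upgrade.sh.
# Runs in a subshell so the cd does not change the caller's directory.
run_upgrade() (
  version="$1"
  shift
  wget --user="$REGISTRY_USER" --password="$REGISTRY_PASS" \
    -O "$(artifact_name "$version")" "$(artifact_url "$version")" &&
    tar -xzvf "$(artifact_name "$version")" &&
    cd mcm_artifacts &&
    sudo ./upgrade.sh "$@"
)

# Example usage (replace <version> with the release being installed):
#   run_upgrade <version> --skip-images
```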
Launch the Platform (start.sh on Access Node for 3-tier)
Triggers the cluster nodes to pull/evaluate configurations and launch the stack:
sudo /opt/mcm/scripts/start.sh
- Explanation: Safely provisions the runtime networks and initiates stack scheduling.
Stop the Platform (stop.sh on Access Node for 3-tier)
Gracefully shuts down running service tasks and removes the stack from the runtime:
sudo /opt/mcm/scripts/stop.sh
- Explanation: Sends SIGTERM signals to running containers to allow database transactions to complete before shutting down.
Restart the Platform (restart.sh on Access Node for 3-tier)
Performs a sequential stop-and-start lifecycle sequence:
sudo /opt/mcm/scripts/restart.sh
- Explanation: Executes the stop script, waits for networks to clear, and runs the start script to apply new configuration values or reload certificates cleanly.