I Use This Smart Light as an Uptime Monitor for My Self-Hosted Services, and It Works Perfectly
At Magisk Modules, we’re always exploring innovative ways to manage and monitor our self-hosted services. Reliability is paramount, and quickly identifying downtime is crucial for maintaining a seamless user experience. While traditional monitoring solutions offer robust alerting mechanisms, we wanted a solution that was both immediately noticeable and required minimal technical overhead. Our answer? A smart light acting as a visual uptime indicator. This article details our implementation, the benefits we’ve experienced, and how you can easily replicate this setup for your own self-hosted infrastructure.
The Problem: Silent Downtime and Delayed Response
Traditional monitoring systems often rely on email or SMS alerts to notify administrators of service outages. While effective, these methods suffer from several drawbacks:
- Delayed Notification: Email alerts can be delayed due to spam filters or email server issues. SMS alerts, while faster, can be missed if your phone is on silent or out of reach.
- Information Overload: Receiving a constant stream of text or email alerts can lead to alert fatigue, where genuine outages are missed amidst the noise.
- Lack of Immediate Visual Indication: When an issue arises, there’s no immediate visual cue to indicate a problem. You need to actively check your monitoring dashboard or wait for an alert.
These limitations led us to seek a more immediate and visually compelling solution. We wanted a system that would instantly alert us to downtime, even when we weren’t actively monitoring our dashboards.
Our Solution: A Smart Light Uptime Monitor
Our solution uses a smart light bulb connected to our local network and controlled by a simple script that monitors our self-hosted services. The bulb changes color to reflect their status:
- Green: All services are running normally.
- Red: One or more services are experiencing downtime.
- Yellow: One or more services are experiencing a warning or degraded performance.
- Blue: Maintenance mode.
The flashing red light is very hard to miss, ensuring immediate attention to any potential issues.
Selecting the Right Smart Light for Uptime Monitoring
Choosing the right smart light is critical for the effectiveness of this solution. We considered several factors before settling on our preferred model:
- API Support: The smart light must have a well-documented API or a readily available library for controlling its color and brightness via code.
- Reliability: The light should be known for its reliability and consistent connectivity to the network. We didn’t want the light itself to become a source of downtime.
- Color Range: The light should be capable of producing a wide range of colors to accurately represent different service statuses.
- Brightness: The light should be bright enough to be easily visible even in well-lit environments.
- Price: Cost is always a consideration. We looked for a balance between features, reliability, and affordability.
We ultimately chose Philips Hue (the example script below talks to a Hue bridge) due to its robust API, wide color range, and proven reliability. The availability of a Python library for controlling the light made integration with our monitoring scripts straightforward.
Setting Up the Monitoring Script
The core of our smart light uptime monitor is a Python script that performs the following tasks:
- Service Monitoring: The script periodically checks the status of each of our self-hosted services. This can be done by sending HTTP requests, pinging servers, or querying specific ports.
- Status Aggregation: The script aggregates the status of all monitored services to determine the overall system status.
- Light Control: Based on the overall system status, the script sends commands to the smart light to change its color accordingly.
Here’s a simplified example of the Python script:
import requests
from phue import Bridge  # Replace with your light's API library
import time

# Configuration
BRIDGE_IP = "your_bridge_ip"
LIGHT_NAME = "Uptime Monitor Light"
SERVICES = {
    "Website": "https://magiskmodule.gitlab.io",
    "API Server": "https://your-api-server.com",
    "Database": "https://your-database-server.com"
}

# Initialize the Hue bridge
b = Bridge(BRIDGE_IP)
b.connect()
lights = b.get_light_objects('name')
light = lights[LIGHT_NAME]


def check_service_status(url):
    try:
        response = requests.get(url, timeout=5)
        response.raise_for_status()  # Raise HTTPError for bad responses (4xx or 5xx)
        return True
    except requests.exceptions.RequestException as e:
        print(f"Error checking {url}: {e}")
        return False


def update_light_color(status):
    if status == "green":
        light.brightness = 254
        light.xy = [0.4091, 0.518]  # Green color
    elif status == "red":
        light.brightness = 254
        light.xy = [0.675, 0.322]  # Red color
    elif status == "yellow":
        light.brightness = 254
        light.xy = [0.484, 0.448]  # Yellow color
    elif status == "blue":
        light.brightness = 254
        light.xy = [0.167, 0.04]  # Blue color
    else:
        light.brightness = 50
        light.xy = [0.3127, 0.3290]  # White color, dimmed


# Main loop
while True:
    all_services_up = True
    for service_name, service_url in SERVICES.items():
        if not check_service_status(service_url):
            all_services_up = False
            print(f"{service_name} is down!")
            break
    if all_services_up:
        update_light_color("green")
    else:
        update_light_color("red")
    time.sleep(60)  # Check every 60 seconds
Explanation:
- Imports: The script imports the necessary libraries: requests for making HTTP requests, phue (or your light’s API library) for controlling the smart light, and time for pausing between checks.
- Configuration: The BRIDGE_IP, LIGHT_NAME, and SERVICES variables need to be configured with your specific values.
- check_service_status(): This function checks the status of a single service by making an HTTP request to its URL. It returns True if the service is up and responding, and False otherwise.
- update_light_color(): This function sends commands to the smart light to change its color based on the specified status. The xy values are color coordinates that define the desired color.
- Main Loop: The main loop iterates through the list of services, checks their status, and updates the light color accordingly. It then pauses for 60 seconds before repeating the process.
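The example script only ever shows green or red. As a rough sketch of how the status aggregation step could also produce the yellow and blue states, the function below reduces per-service results to a single status. The ServiceResult type, the 2-second slow_after threshold, and the maintenance flag are assumptions made for illustration; they are not part of the original script.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ServiceResult:
    """Outcome of one service check (hypothetical helper type)."""
    name: str
    up: bool
    response_seconds: float = 0.0


def overall_status(results: Iterable[ServiceResult],
                   slow_after: float = 2.0,
                   maintenance: bool = False) -> str:
    """Reduce per-service results to one of the four light states."""
    if maintenance:
        return "blue"      # maintenance mode overrides everything else
    results = list(results)
    if any(not r.up for r in results):
        return "red"       # at least one service is down
    if any(r.response_seconds > slow_after for r in results):
        return "yellow"    # everything answers, but something is slow
    return "green"
```

The returned string can be passed directly to update_light_color(). With requests, the response time for a check is available as response.elapsed.total_seconds().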
Important Considerations:
- Error Handling: The script includes basic error handling to catch exceptions during the HTTP requests. You should enhance this with more robust error handling and logging.
- Security: Ensure that your smart light and its API are properly secured. Change the default passwords and consider using a firewall to restrict access to the light.
- Dependencies: Install the necessary libraries using pip install requests phue (or the equivalent for your light’s API library).
- Running the Script: Run the script in the background using nohup python your_script_name.py & so that it keeps running after you close your terminal. If python is not on your PATH, use the full path to the interpreter (or python3).
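One place where more robust error handling pays off is the light update itself: if the bridge is unreachable, an unhandled exception would stop the monitoring loop. The sketch below wraps the update call and logs the outcome. The safe_update name and the logger name are hypothetical, and catching Exception broadly is a deliberate simplification because the exact exception types depend on the light library in use.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
log = logging.getLogger("uptime-light")


def safe_update(update_fn, status):
    """Call update_fn(status); log failures instead of crashing the loop.

    Returns True when the light was updated, False otherwise.
    """
    try:
        update_fn(status)
    except Exception:
        # log.exception records the full traceback for later debugging.
        log.exception("Could not set light to %s", status)
        return False
    log.info("Light set to %s", status)
    return True
```

In the main loop, update_light_color("red") would become safe_update(update_light_color, "red").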
Integrating with Existing Monitoring Systems
Our smart light uptime monitor doesn’t replace our existing monitoring systems; it complements them. We integrate the smart light with our Prometheus and Grafana setup by having Prometheus trigger the script via a webhook when a specific alert fires (in a standard Prometheus deployment, the webhook itself is delivered by Alertmanager). This lets us keep the advanced alerting capabilities of Prometheus while still getting the immediate visual indication of the smart light.
Specifically, we configured Prometheus to send a webhook to a simple Flask application that runs on our server. This Flask application receives the webhook data from Prometheus, extracts the relevant information about the alert, and then calls the same update_light_color() function used in the standalone script to change the color of the smart light.
This integration allows us to use the smart light to visually represent the status of complex metrics and alerts that would be difficult to convey through a simple uptime check.
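A minimal sketch of such a webhook receiver is shown below. It assumes the payload follows the Alertmanager webhook format (a top-level alerts list whose entries carry a status and labels). The /alert route, the function names, and the mapping of a severity label value of "warning" to yellow are illustrative assumptions, not the configuration described above.

```python
def status_from_alertmanager(payload: dict) -> str:
    """Map an Alertmanager-style webhook payload to a light status."""
    firing = [a for a in payload.get("alerts", [])
              if a.get("status") == "firing"]
    if not firing:
        return "green"
    severities = {a.get("labels", {}).get("severity") for a in firing}
    # "warning" is an assumed label value; any other firing alert counts as red.
    return "yellow" if severities == {"warning"} else "red"


def create_app(update_light_color):
    """Build the Flask app around the script's update_light_color function."""
    # Imported lazily so the mapping above can be used and tested without Flask.
    from flask import Flask, request

    app = Flask(__name__)

    @app.route("/alert", methods=["POST"])
    def alert():
        payload = request.get_json(force=True, silent=True) or {}
        status = status_from_alertmanager(payload)
        update_light_color(status)
        return {"light": status}

    return app
```

The app could be started with create_app(update_light_color).run(port=5000), with a webhook receiver pointed at the /alert URL. Note that each notification only lists the alerts in its own group, so "green" here means that no alert in this notification is firing, not that every alert in the system is resolved.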
Addressing False Positives and Network Issues
Network connectivity issues or temporary glitches can sometimes trigger false positives, causing the smart light to incorrectly indicate downtime. To mitigate this, we implemented several strategies:
- Multiple Checks: The script performs multiple checks of each service before declaring it down. If a service fails one check, it waits a few seconds and tries again. Only if the service fails multiple consecutive checks is it considered down.
- Ping Monitoring: In addition to HTTP requests, we also monitor the services using ping. This helps to identify network connectivity issues that might not be apparent from HTTP requests alone.
- Alert Throttling: We configured Prometheus to throttle alerts, for example by requiring a condition to hold for a set duration before the alert fires, which prevents the smart light from flickering on and off due to transient issues.
- Dedicated Network: To ensure the reliability of the smart light, we connect it to a dedicated Wi-Fi network with minimal traffic. This minimizes the chances of the light being affected by network congestion.
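The multiple-checks strategy can be sketched as a small retry wrapper. The function name, the default of three attempts, and the 5-second delay are assumptions chosen for illustration; the sleep parameter exists only so the delay can be replaced in tests.

```python
import time
from typing import Callable


def is_up_with_retries(check: Callable[[], bool],
                       attempts: int = 3,
                       delay_seconds: float = 5.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True as soon as one check succeeds.

    The service is reported down only after `attempts` consecutive failures.
    """
    for attempt in range(1, attempts + 1):
        if check():
            return True
        if attempt < attempts:
            sleep(delay_seconds)  # brief pause before retrying
    return False
```

In the main loop, the call check_service_status(service_url) would become is_up_with_retries(lambda: check_service_status(service_url)). Keep in mind that retries lengthen each monitoring cycle when a service really is down.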
Expanding Functionality: Beyond Simple Uptime
The smart light uptime monitor can be extended to provide more than just a simple indication of uptime. Here are some ideas for expanding its functionality:
- CPU Usage: Change the light’s brightness based on the CPU usage of your server.
- Disk Space: Use different colors to indicate the amount of free disk space on your server.
- Custom Metrics: Integrate the smart light with your custom monitoring metrics to visualize any data you’re tracking.
- Time-Based Alerts: Schedule the light to change color at specific times of day to remind you of important tasks or deadlines.
For example, you could set the light to pulse slowly in yellow if the CPU usage exceeds 80% for more than 5 minutes. Or, you could have it flash red if disk space is running low.
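The "above 80% for more than 5 minutes" rule needs a little state between samples. One possible sketch is shown below; the SustainedThreshold class is hypothetical, and how the samples are collected (for instance with the third-party psutil package) and how the bulb is made to pulse are left out because they depend on your setup.

```python
import time
from typing import Optional


class SustainedThreshold:
    """Report True once a value has stayed above a limit for a full window."""

    def __init__(self, limit: float = 80.0, window_seconds: float = 300.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self._above_since: Optional[float] = None

    def update(self, value: float, now: Optional[float] = None) -> bool:
        """Feed one sample; `now` defaults to a monotonic clock reading."""
        now = time.monotonic() if now is None else now
        if value <= self.limit:
            self._above_since = None   # streak broken, start over
            return False
        if self._above_since is None:
            self._above_since = now    # first sample above the limit
        return now - self._above_since >= self.window_seconds
```

Calling update() once per monitoring cycle with the latest CPU percentage returns True only after the value has stayed above the limit for the whole window, at which point the script could switch the light to yellow.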
Benefits of Using a Smart Light for Uptime Monitoring
We’ve found that using a smart light as an uptime monitor offers several benefits:
- Immediate Visual Indication: The flashing red light instantly alerts us to downtime, even when we’re not actively monitoring our dashboards.
- Reduced Alert Fatigue: By providing a visual indication of downtime, we can reduce the number of email and SMS alerts we receive, minimizing alert fatigue.
- Improved Response Time: The immediate visual cue allows us to respond to issues more quickly, minimizing downtime.
- Cost-Effective: Smart lights are relatively inexpensive compared to dedicated monitoring solutions.
- Easy to Set Up: The setup process is straightforward and requires minimal technical expertise.
Conclusion: A Simple Yet Powerful Solution
Using a smart light as an uptime monitor is a simple yet powerful solution for improving the reliability of your self-hosted services. The immediate visual indication of downtime allows you to respond to issues more quickly, minimizing downtime and ensuring a seamless user experience. While it doesn’t replace traditional monitoring systems, it complements them by providing a visually compelling and easily accessible way to stay informed about the status of your infrastructure. At Magisk Modules, we highly recommend implementing this solution for any self-hosted environment.