Update README with comprehensive project documentation

DevOps Team, 3 months ago
Commit: 6d5afea4cb
1 file changed, 193 insertions, 1 deletion

+ 193 - 1
README.md

@@ -1 +1,193 @@
-ansible-playbook to install monitor system
+# FX-Monitor - Monitoring Infrastructure with Ansible
+
+## Overview
+FX-Monitor is an Ansible-based toolkit for deploying and managing a monitoring stack. It installs Prometheus, Grafana, Consul, and Node Exporter to collect and visualize infrastructure metrics.
+
+## Project Structure
+```
+.
+├── ansible.cfg                 # Ansible configuration
+├── deploy.yml                  # Main playbook
+├── hosts                       # Inventory file
+├── README.md                   # This file
+├── group_vars/
+│   └── all.yml                 # Group variables
+└── roles/
+    ├── prometheus/             # Prometheus monitoring server
+    │   ├── tasks/
+    │   │   └── main.yml
+    │   └── templates/
+    │       ├── prometheus.service.j2
+    │       └── prometheus.yml.j2
+    ├── grafana/                # Grafana dashboard and visualization
+    │   └── tasks/
+    │       └── main.yml
+    ├── node_exporter/          # Node metrics exporter
+    │   ├── tasks/
+    │   │   └── main.yml
+    │   └── templates/
+    │       └── node_exporter.service.j2
+    └── consul/                 # Consul service discovery
+        ├── tasks/
+        │   └── main.yml
+        └── templates/
+            ├── consul.conf.json.j2
+            └── consul.service.j2
+```
+
+## Components
+
+### 1. Prometheus
+- Central monitoring server and time-series database
+- Collects metrics from monitored systems
+- Provides alerting capabilities
+- **Tags**: `prom`
+
+### 2. Grafana
+- Web-based visualization platform
+- Creates dashboards from Prometheus data
+- Provides analytics and alerting UI
+- **Tags**: `grafana`
+
+### 3. Node Exporter
+- Collects system metrics from target machines
+- Exposes metrics in Prometheus format
+- Installed on all monitored hosts
+- **Tags**: `ne`
+
+### 4. Consul
+- Service discovery and registration
+- Health checking for services
+- Configuration management
+- **Tags**: `consul`
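+
+One common way to wire these components together is to have Prometheus discover Node Exporter targets through Consul. Whether `prometheus.yml.j2` in this repository does so is not stated here; the fragment below is only a sketch, assuming Consul listens at `10.0.0.1:8500` and using a hypothetical service name `node_exporter` registered in Consul.
+
+```yaml
+# Sketch of a Prometheus scrape job using Consul service discovery.
+scrape_configs:
+  - job_name: node
+    consul_sd_configs:
+      - server: "10.0.0.1:8500"    # assumed Consul address
+        services:
+          - node_exporter          # hypothetical service name in Consul
+```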
+
+## Prerequisites
+- Ansible 2.9+
+- Python 3.6+ on target systems
+- SSH access to target hosts
+- Root or sudo privileges on target machines
+
+## Installation & Deployment
+
+### 1. Configure Inventory
+Edit the `hosts` file to define your target machines:
+```
+[monitor]
+# Prometheus, Grafana, and Consul servers
+host1 ansible_host=10.0.0.1
+
+[exporter]
+# Node Exporter targets
+host2 ansible_host=10.0.0.2
+host3 ansible_host=10.0.0.3
+```
+
+### 2. Configure Variables
+Modify `group_vars/all.yml` with your environment-specific settings:
+```yaml
+# Example variables
+prometheus_port: 9090
+grafana_port: 3000
+consul_port: 8500
+node_exporter_port: 9100
+```
+
+### 3. Run the Playbook
+Deploy all components:
+```bash
+ansible-playbook deploy.yml
+```
+
+Deploy specific components using tags:
+```bash
+# Deploy only Prometheus
+ansible-playbook deploy.yml --tags prom
+
+# Deploy only Grafana
+ansible-playbook deploy.yml --tags grafana
+
+# Deploy only Node Exporter
+ansible-playbook deploy.yml --tags ne
+
+# Deploy only Consul
+ansible-playbook deploy.yml --tags consul
+```
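+
+The tag-based runs above depend on how `deploy.yml` attaches tags to roles. That file is not reproduced in this README, so the following is only a sketch of one plausible layout, assuming the `monitor` and `exporter` inventory groups and the tags `prom`, `grafana`, `consul`, and `ne` listed earlier; the play names are illustrative.
+
+```yaml
+# Hypothetical layout for deploy.yml; the real playbook may differ.
+- name: Deploy monitoring servers
+  hosts: monitor
+  become: yes
+  gather_facts: false
+  roles:
+    - { role: consul, tags: [consul] }
+    - { role: prometheus, tags: [prom] }
+    - { role: grafana, tags: [grafana] }
+
+- name: Deploy Node Exporter on monitored hosts
+  hosts: exporter
+  become: yes
+  gather_facts: false
+  roles:
+    - { role: node_exporter, tags: [ne] }
+```
+
+With a layout like this, `--tags ne` would skip the roles in the first play and run only `node_exporter`.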
+
+## Configuration
+
+### Ansible Configuration (ansible.cfg)
+- Inventory path: /hosts
+- Forks: 5 (maximum number of hosts handled in parallel)
+- User privilege: root (via become)
+- SSH port: 22
+- Key authentication: /root/.ssh/id_rsa
+- Timeout: 10 seconds
+- Logs: /var/log/ansible.log
+
+### Important Notes
+- All playbooks run with `gather_facts: false` for faster execution
+- Root privilege escalation (`become: yes`) is required
+- Ensure passwordless (key-based) SSH authentication is configured
+
+## Service Management
+
+After deployment, services can be managed on target hosts:
+
+```bash
+# Check service status
+systemctl status prometheus
+systemctl status grafana-server
+systemctl status node_exporter
+systemctl status consul
+
+# Restart services
+systemctl restart prometheus
+systemctl restart grafana-server
+systemctl restart node_exporter
+systemctl restart consul
+```
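+
+The same restarts can be driven from the control node instead of logging in to each host. The playbook below is a sketch, not part of this repository: the file name `restart.yml` is hypothetical, and it assumes the `exporter` group and the `node_exporter` unit name used above.
+
+```yaml
+# restart.yml (hypothetical): restart Node Exporter on every monitored host.
+- name: Restart Node Exporter
+  hosts: exporter
+  become: yes
+  gather_facts: false
+  tasks:
+    - name: Restart the node_exporter systemd unit
+      systemd:
+        name: node_exporter
+        state: restarted
+```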
+
+## Default Ports
+- Prometheus: 9090
+- Grafana: 3000
+- Node Exporter: 9100
+- Consul: 8500
+
+## Accessing Services
+
+### Prometheus
+- URL: http://<prometheus_host>:9090
+- Query API endpoint: http://<prometheus_host>:9090/api/v1/query
+
+### Grafana
+- URL: http://<grafana_host>:3000
+- Default credentials: admin/admin (change on first login)
+
+### Consul
+- URL: http://<consul_host>:8500/ui
+
+### Node Exporter
+- Metrics: http://<exporter_host>:9100/metrics
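+
+The endpoints above can also be checked from the control node. The playbook below is a sketch rather than part of this repository: the file name `verify.yml` is hypothetical, it assumes the `monitor` and `exporter` groups and the default ports from this README, and it assumes every inventory host sets `ansible_host`. The health paths (`/-/healthy`, `/api/health`, `/v1/status/leader`) are the standard ones exposed by Prometheus, Grafana, and Consul.
+
+```yaml
+# verify.yml (hypothetical): probe each service over HTTP from the control node.
+- name: Check monitoring servers
+  hosts: monitor
+  gather_facts: false
+  tasks:
+    - name: Expect HTTP 200 from each server endpoint
+      uri:
+        url: "http://{{ ansible_host }}:{{ item }}"
+        status_code: 200
+      delegate_to: localhost
+      become: false
+      loop:
+        - "9090/-/healthy"         # Prometheus
+        - "3000/api/health"        # Grafana
+        - "8500/v1/status/leader"  # Consul
+
+- name: Check Node Exporter targets
+  hosts: exporter
+  gather_facts: false
+  tasks:
+    - name: Expect HTTP 200 from the metrics endpoint
+      uri:
+        url: "http://{{ ansible_host }}:9100/metrics"
+        status_code: 200
+      delegate_to: localhost
+      become: false
+```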
+
+## Troubleshooting
+
+### Connection Issues
+- Verify SSH connectivity: `ssh -i ~/.ssh/id_rsa user@hostname`
+- Check firewall rules on target hosts
+- Ensure ports are not blocked
+
+### Service Issues
+- Check Ansible logs: `tail -f /var/log/ansible.log`
+- Verify service logs: `journalctl -u <service_name> -f`
+- Check port availability: `netstat -tuln | grep <port>`
+
+## License
+Internal use only
+
+## Author
+jiangkai
+
+## Support
+For issues and questions, contact the infrastructure team.