Gkrellm replacement

gkrellm/gkrellmd is the suxors.

Would be good to have a replacement in place. A few components would be nice:

Logging daemons. These would sit on each host and collect data every interval and save the data to (network available and local) log files so I could keep track of how things have gone over time. Could also serve the collected data over a socket interface.
Server side aggregator. This would sit on my webserver, get data from the logging daemons and the old log files, and make it available for a web client (or whatever) to display.
Client side viewers. This might be an ajaxy program thingy that displays all the realtime data and allows navigation of the historical data as well.
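A minimal sketch of the "collect data every interval" part of the logging daemon, in Python. The names run_sampler, collect and emit are hypothetical, not part of any existing tool. It schedules each sample against the original timeline rather than sleeping a fixed amount, so the time spent collecting does not pile up as drift.

```python
import time


def run_sampler(collect, emit, interval, samples,
                clock=time.monotonic, sleep=time.sleep):
    """Call collect() `samples` times, roughly every `interval` seconds.

    collect: hypothetical callable returning one sample.
    emit:    hypothetical callable that logs or serves the sample.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    next_due = clock()
    for _ in range(samples):
        emit(collect())
        # Advance the deadline by a fixed step so delays do not accumulate.
        next_due += interval
        delay = next_due - clock()
        if delay > 0:
            sleep(delay)
```

A real daemon would loop forever instead of taking a sample count, and emit would both append to the log file and push to any connected socket clients.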

Things to keep track of:

Temperatures
Fan speeds
Voltages
Processor speeds
Processor load
Number of processes
Memory utilization
Network traffic
Disk traffic
Battery state, rates, and remaining time estimates
Local date/time
Local hostname
Local uptime
RAID data

Where to find these things:

/sys/class/hwmon/*/device/ (for temps, fan speeds, voltages)
- in#_label(I) in#_input(o)
- fan#_label(I) fan#_input(o) pwm#(o)
- temp#_label(I) temp#_input(o)
- assume X# for label, and assume nothing for pwm
/sys/class/net/*/statistics/
- rx_bytes(o) rx_packets(o)
- tx_bytes(o) tx_packets(o)
/sys/devices/system/cpu/*/cpufreq/ (for cpu frequencies)
- cpuinfo_max_freq(I) scaling_max_freq(I)
- cpuinfo_cur_freq(o) scaling_cur_freq(o)
- cpuinfo_min_freq(I) scaling_min_freq(I)
/proc/acpi/thermal_zone/*/temperature
/proc/acpi/battery/*/info
- design capacity(I)
- last full capacity(o)
- design voltage(I)
- design capacity warning(I)
- design capacity low(I)
/proc/acpi/battery/*/state
- capacity state(o)
- charging state(o)
- present rate(o)
- remaining capacity(o)
- present voltage(o)
`hostname`(I)
`date`(o)
/proc/version(I)
/proc/loadavg(o)
- <1min> <5min> <15min> <cur>/<total> <last pid>
/proc/uptime(o)
- <up> <idle>
/proc/mdstat(I)
- md# : active raidN <discs>
  # blocks chunk size XX [4/4] UUUU
/proc/meminfo
- MemTotal(I)
- MemFree(o)
- Buffers(o)
- Cached(o)
- SwapCached(o)
- SwapTotal(I)
- SwapFree(o)
/proc/diskstats(o)
- <maj> <min> <dev> x x <blks read> x x x <blks written> x x x x
/proc/stat(o)
- cpu[#] <usr> <nice> <sys> <idle> *blah*
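
A rough sketch of how a few of the /proc files above could be parsed, in Python. All function names are made up for illustration. Each parser takes the file's text rather than a path, so it can be exercised without a live system; read_file is the only part that touches the filesystem. The diskstats parser follows the column layout noted above (the kernel counts these in sectors), and the CPU load figure only uses the four fields listed, ignoring the rest of the line.

```python
MEMINFO_KEYS = ("MemTotal", "MemFree", "Buffers", "Cached",
                "SwapCached", "SwapTotal", "SwapFree")


def read_file(path):
    """Return the full text of a /proc or /sys file."""
    with open(path) as f:
        return f.read()


def parse_loadavg(text):
    """Parse /proc/loadavg: <1min> <5min> <15min> <cur>/<total> <last pid>."""
    one, five, fifteen, procs = text.split()[:4]
    running, total = procs.split("/")
    return {"load1": float(one), "load5": float(five),
            "load15": float(fifteen),
            "running": int(running), "total": int(total)}


def parse_meminfo(text, keys=MEMINFO_KEYS):
    """Return the selected /proc/meminfo fields as integers (kB)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if name in keys and parts:
            values[name] = int(parts[0])
    return values


def parse_diskstats(text):
    """Return {device: (sectors read, sectors written)} from /proc/diskstats."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 10:
            stats[fields[2]] = (int(fields[5]), int(fields[9]))
    return stats


def parse_cpu_times(text):
    """Return {cpu name: (usr, nice, sys, idle)} from /proc/stat."""
    times = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[0].startswith("cpu"):
            times[fields[0]] = tuple(int(f) for f in fields[1:5])
    return times


def cpu_busy_fraction(prev, cur):
    """Busy fraction between two (usr, nice, sys, idle) samples."""
    deltas = [c - p for p, c in zip(prev, cur)]
    total = sum(deltas)
    return (total - deltas[3]) / total if total else 0.0
```

Typical use would be something like parse_loadavg(read_file("/proc/loadavg")). The counters in /proc/stat, /proc/diskstats and the net statistics files are cumulative, so rates come from the difference between two consecutive samples, as cpu_busy_fraction does.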

Each class of thing should be distilled down to a small set of numbers that fits on one line per sample time, and each class gets its own log file and its own data header in the served stream. Each time the daemon starts, it should write to each log file the current date and time, the hostname, the kernel version, what it is keeping track of, and what each column of numbers means. It should also record the intended interval between samples (though some timing error must be assumed to exist).
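
One possible shape for the log header and sample lines, sketched in Python. The "#" comment prefix, the field labels, and the function names format_header and format_sample are all assumptions for illustration; nothing here is a settled format.

```python
def format_header(hostname, kernel, started, interval, tracking, columns):
    """Build the block written to a log file each time the daemon starts."""
    lines = [
        f"# started: {started}",
        f"# host: {hostname}",
        f"# kernel: {kernel}",
        f"# tracking: {tracking}",
        f"# interval: {interval} s (nominal)",
        "# columns: " + " ".join(columns),
    ]
    return "\n".join(lines) + "\n"


def format_sample(timestamp, values):
    """Build one line per sample: a timestamp followed by the numbers."""
    fields = [f"{timestamp:.3f}"] + [f"{v:g}" for v in values]
    return " ".join(fields) + "\n"
```

Writing the actual timestamp on every line, rather than relying on the nominal interval, lets a reader of the log see how much the sampling drifted.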
