Difference between revisions of "Gkrellm replacement"

From Fernseher

Revision as of 21:58, 8 June 2008

gkrellm/gkrellmd is the suxors.

Would be good to have a replacement in place. A couple of components would be nice.

  • Logging daemons. These would sit on each host and collect data every interval and save the data to (network available and local) log files so I could keep track of how things have gone over time. Could also serve the collected data over a socket interface.
  • Server side aggregator. This would sit on my webserver, get data from the logging daemons and the old log files, and make it available for a web client (or whatever) to display.
  • Client side viewers. This might be an ajaxy program thingy that displays all the realtime data and allows navigation of the historical data as well.
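
A minimal sketch of the "collect data every interval" part of the logging daemon, in Python. Everything named here (`run_sampler`, `collect`, `emit`, the `samples` limit) is hypothetical and only for illustration; a real daemon would presumably loop forever and write to log files and sockets rather than call a plain function.

```python
import time


def run_sampler(collect, emit, interval_s, samples,
                sleep=time.sleep, clock=time.monotonic):
    """Call collect() once per interval and hand each record to emit().

    collect  -- hypothetical callable returning a list of record strings
    emit     -- hypothetical callable taking one record (e.g. a log writer)
    samples  -- number of samples to take; only here so the sketch terminates
    """
    next_due = clock()
    for _ in range(samples):
        for record in collect():
            emit(record)
        # Schedule against a fixed timeline so slow reads do not add drift.
        next_due += interval_s
        sleep(max(0.0, next_due - clock()))


if __name__ == "__main__":
    # Print a placeholder record three times, one second apart.
    run_sampler(lambda: ["uptime: 0"], print, interval_s=1, samples=3)
```

Scheduling against `next_due` instead of sleeping a fixed amount keeps the sample spacing close to the interval, though some error will still exist.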

Things to keep track of:

  • Temperatures
  • fan speeds
  • Voltages
  • Processor speeds
  • Processor load
  • Number of processes
  • memory utilization
  • network traffic
  • disk traffic
  • battery state, rates, and remaining time estimates
  • local date/time
  • local hostname
  • local uptime
  • raid data

Where to find these things:

  • /sys/class/hwmon/*/device/ (for temps, fan speeds, voltages)
    • in#_label(I) in#_input(o)
    • fan#_label(I) fan#_input(o) pwm#(o)
    • temp#_label(I) temp#_input(o)
    • assume X# for label, and assume nothing for pwm
      • hwmon-init: <dev><type><num> <label>
      • hwmon: <dev><type><num> <input> <pwm>
  • /sys/class/net/*/statistics/
    • rx_bytes(o) rx_packets(o)
    • tx_bytes(o) tx_packets(o)
      • net: <name> <rxb> <rxp> <txb> <txp>
  • /sys/devices/system/cpu/*/cpufreq/ (for cpu frequencies)
    • cpuinfo_max_freq(I) scaling_max_freq(I)
    • cpuinfo_cur_freq(o) scaling_cur_freq(o)
    • cpuinfo_min_freq(I) scaling_min_freq(I)
      • freq-init: <cpu> <max> <min>
      • freq: <cpu> <cur>
  • /proc/acpi/thermal_zone/*/temperature
  • /proc/acpi/battery/*/info
    • design capacity(I)
    • last full capacity(o)
    • design voltage(I)
    • design capacity warning(I)
    • design capacity low(I)
  • /proc/acpi/battery/*/state
    • capacity state(o)
    • charging state(o)
    • present rate(o)
    • remaining capacity(o)
    • present voltage(o)
      • battery-init: <name> <d cap> <d vlt> <d cap wrn> <d cap low>
      • battery: <name> <cap state> <last full> <cur cap> <cur vlt> <chg state> <cur rate>
  • `hostname`(I)
  • `date`(o)
      • date: <date>
  • /proc/version(I)
      • system-init: <host> <version>
  • /proc/loadavg(o)
    • <1min> <5min> <15min> <cur>/<total> *blah*
      • load: <1m> <5m> <15m>
      • proc: <cur> <total>
  • /proc/uptime(o)
    • <up> <idle>
      • uptime: <total uptime>
  • /proc/mdstat(I)
    • md# : active raidN <discs>
    • <blocks> <chunk size> [4/4] UUUU
      • md-init: <disk> <raidtype> <status> <disks active> <disks total>
  • /proc/meminfo
    • MemTotal(I)
    • MemFree(o)
    • Buffers(o)
    • Cached(o)
    • SwapCached(o)
    • SwapTotal(I)
    • SwapFree(o)
      • mem-init: <total> <swap total>
      • mem: <free> <buf> <cache> <swap free> <swap cache>
  • /proc/diskstats(o)
    • <maj> <min> <dev> x x <blks read> x x x <blks written> x x x x
      • disk: <name> <blocks read> <blocks written>
  • /proc/stat(o)
    • cpu[#] <usr> <nice> <sys> <idle> *blah*
      • cpu: <name> <usr> <nice> <sys> <idle>
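
A rough sketch of how three of the record lines above (`load:`/`proc:`, `cpu:`, `disk:`) could be built from the raw file contents, in Python. The function names are made up for illustration, and the functions take the file text as an argument so they can be tried without a Linux box. Field positions follow the layouts given in the list; the sketch also assumes that (I) means "read once at init" and (o) means "read every sample", which the list implies but never states.

```python
def load_records(loadavg_text):
    """Turn the /proc/loadavg line into 'load:' and 'proc:' records."""
    one, five, fifteen, procs = loadavg_text.split()[:4]
    cur, total = procs.split("/")
    return [f"load: {one} {five} {fifteen}", f"proc: {cur} {total}"]


def cpu_records(stat_text):
    """Turn the cpu and cpu# lines of /proc/stat into 'cpu:' records."""
    records = []
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("cpu"):
            name, usr, nice, sys_, idle = fields[:5]
            records.append(f"cpu: {name} {usr} {nice} {sys_} {idle}")
    return records


def disk_records(diskstats_text):
    """Turn /proc/diskstats lines into 'disk:' records.

    Uses the layout from the list: <maj> <min> <dev> x x <read> x x x <written>.
    """
    records = []
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 10:
            records.append(f"disk: {fields[2]} {fields[5]} {fields[9]}")
    return records


if __name__ == "__main__":
    with open("/proc/loadavg") as f:
        print("\n".join(load_records(f.read())))
```

The other sources (hwmon, net, cpufreq, battery, meminfo, mdstat) would need their own small parsers along the same lines.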

Each class of thing should be distilled down to a small number of numbers that all fit on one line per sample time, and each class gets its own log file and its own data header in the served stream. Each time the daemon starts, it should write to each log file the current date and time, the hostname, the kernel version, what it is keeping track of, and what each column of numbers means. It should also record the intended time interval between samples (though some error must be assumed to exist).
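
One possible shape for that startup header, sketched in Python. The `#`-prefixed line format, the key names, and the `build_header` function are all assumptions made for illustration; the notes above only say which pieces of information the header should carry.

```python
from datetime import datetime


def build_header(kind, columns, interval_s, host, kernel, started):
    """Return the header text written to one log file when the daemon starts.

    kind     -- what this log tracks, e.g. "load" or "net"
    columns  -- meaning of each column in the data lines, in order
    started  -- datetime of daemon startup
    """
    return "\n".join([
        f"# started: {started.isoformat(sep=' ')}",
        f"# host: {host}",
        f"# kernel: {kernel}",
        f"# tracking: {kind}",
        f"# columns: {' '.join(columns)}",
        f"# interval: {interval_s}s (nominal)",
    ])


if __name__ == "__main__":
    # Host and kernel strings here are placeholders, not real values.
    print(build_header("load", ["1m", "5m", "15m"], 10,
                       "examplehost", "example kernel version",
                       datetime.now()))
```

Marking the interval as nominal keeps the header honest about the sampling error mentioned above.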