NAME


harep - Ganeti auto-repair tool

SYNOPSIS


harep [-L socket | --luxi=socket] [--job-delay=seconds]

harep --version

DESCRIPTION


Harep is the Ganeti auto-repair tool. It is able to detect that an instance is broken and
to generate a sequence of jobs that will fix it, in accordance with the policies set by the
administrator.

Harep is able to recognize what state an instance is in (healthy, suspended, needs repair,
repair disallowed, pending repair, repair failed) and to lead it through a sequence of
steps that will bring the instance back to the healthy state. For this reason, harep is
mainly meant to be run regularly and frequently from a cron job, so that it can follow
the instance through the whole process. At every run, harep will update the tags it adds
to instances to describe their repair status, and will submit the jobs that actually
perform the required repair operations.

By default, harep only reports on the health status of instances and doesn't perform any
action, since repairs are potentially dangerous. Harep will only touch instances that it
has been explicitly authorized to work on.

The tags that enable harep can be attached to a single instance, to a nodegroup, or to
the whole cluster; in the last two cases they affect all the instances the object
contains. The possible tags share a common structure:

ganeti:watcher:autorepair:<type>

where <type> can have the following values:

· fix-storage: allow replacing disks or fixing the backend without affecting the instance
itself (for example, a broken DRBD secondary)

· migrate: allow instance migration

· failover: allow instance reboot on the secondary

· reinstall: allow disks to be recreated and the instance to be reinstalled

Each element in the list of tags includes all the authorizations of the previous one,
with fix-storage being the least powerful and reinstall being the most powerful.
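
The tag structure and the ordering of the repair types can be illustrated with a short Python sketch. It is not part of Ganeti: the names TAG_PREFIX, REPAIR_TYPES, parse_autorepair_tag and authorizes are hypothetical and chosen only for this example.

```python
# Sketch only: these helper names are illustrative, not part of Ganeti.
TAG_PREFIX = "ganeti:watcher:autorepair:"

# Ordered from least to most powerful, as listed above.
REPAIR_TYPES = ["fix-storage", "migrate", "failover", "reinstall"]


def parse_autorepair_tag(tag):
    """Return the repair type named by an enabling tag, or None otherwise."""
    if not tag.startswith(TAG_PREFIX):
        return None
    repair_type = tag[len(TAG_PREFIX):]
    return repair_type if repair_type in REPAIR_TYPES else None


def authorizes(granted, needed):
    """True if the granted type includes the authorization of the needed one."""
    return REPAIR_TYPES.index(granted) >= REPAIR_TYPES.index(needed)
```

With these definitions, authorizes("failover", "migrate") is true, while authorizes("fix-storage", "reinstall") is false.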

If multiple autorepair tags act on the same instance, only one of them can actually be
active. The conflict is resolved according to the following rules:

1. If multiple tags are on the same object, the least destructive takes precedence.

2. If the tags are on different objects, the nearest tag wins.

Example: A cluster has instances I1 and I2, where I1 has the failover tag, I2 has no tag
of its own, and the cluster has both fix-storage and reinstall. The I1 instance will be
allowed to fail over, the I2 instance only to fix-storage.
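
The two rules and the example can be sketched in Python as follows. This is an illustration under stated assumptions, not harep's actual implementation: the function name and the data layout are hypothetical, and "nearest" is assumed to mean instance first, then nodegroup, then cluster.

```python
# Sketch only: names and data layout are illustrative, not Ganeti's own.
# Repair types ordered from least to most powerful (and destructive).
REPAIR_TYPES = ["fix-storage", "migrate", "failover", "reinstall"]


def effective_repair_type(instance_types, group_types, cluster_types):
    """Pick the single active repair type for one instance.

    Each argument is the collection of repair types tagged on that object.
    """
    # Rule 2: the nearest object carrying any tag wins
    # (assumed order: instance, then nodegroup, then cluster).
    for types in (instance_types, group_types, cluster_types):
        if types:
            # Rule 1: within one object, the least destructive type wins.
            return min(types, key=REPAIR_TYPES.index)
    return None  # no enabling tag anywhere


cluster = {"fix-storage", "reinstall"}
print(effective_repair_type({"failover"}, set(), cluster))  # I1: failover
print(effective_repair_type(set(), set(), cluster))         # I2: fix-storage
```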

LIMITATIONS


Harep doesn't do any hardware failure detection on its own; it relies on nodes being
marked as offline by the administrator.

Also, harep currently works only for instances with the drbd and plain disk templates.

Both these issues will be addressed by a new maintenance daemon in future Ganeti versions,
which will supersede harep.

OPTIONS


The options that can be passed to the program are as follows:

-L socket, --luxi=socket
collect data via Luxi, optionally using the given socket path.

--job-delay=seconds
insert this much delay before the execution of repair jobs to allow the tool to
continue processing instances.
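
For scripted or scheduled runs, the command line can be assembled from these two options. The following Python sketch only builds the argument list and does not run anything; build_harep_command is a hypothetical helper, and the socket path and delay used in the usage note are placeholder values, not defaults.

```python
import shlex


def build_harep_command(luxi_socket=None, job_delay=None):
    """Build the argument list for a harep run (hypothetical helper)."""
    argv = ["harep"]
    if luxi_socket is not None:
        # Collect data via Luxi using the given socket path.
        argv.append("--luxi=" + luxi_socket)
    if job_delay is not None:
        # Delay, in seconds, inserted before repair jobs are executed.
        argv.append("--job-delay=" + str(job_delay))
    return argv


print(shlex.join(build_harep_command(job_delay=30)))
```

The resulting list can be passed to subprocess.run on a host where harep is installed, for example build_harep_command(luxi_socket="/path/to/luxi.sock", job_delay=30).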

REPORTING BUGS


Report bugs to the project website (http://code.google.com/p/ganeti/) or contact the
developers using the Ganeti mailing list (ganeti@googlegroups.com).
