
cr_checkpoint - Online in the Cloud


NAME


cr_checkpoint - checkpoints a process, process group, or session.

SYNOPSIS


cr_checkpoint [options] ID

DESCRIPTION


Invoking cr_checkpoint causes a process (with or without all of its descendants), all
processes within a process group, or all processes within a session, to be checkpointed.
The result is a checkpoint file (or a directory with one checkpoint file per process) that
contains all the state needed to restart the process(es) at a later time. Checkpointed
processes can be restarted via cr_restart(1).

To be checkpointed by cr_checkpoint, a process must have the libcr.so library (or one of
its relatives) loaded. This can be achieved by starting the program with cr_run(1), or by
linking your application with -lcr. Or, the library may be loaded by other libraries you
have linked with (such as a checkpoint-ready MPI library), or your system's parallel job
startup script, etc. Check your system documentation for details.
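
As a rough sketch of the cr_run route described above: the function below starts a program under cr_run and then requests a checkpoint of it. The function name and the CR_RUN/CR_CHECKPOINT override variables are illustrative and not part of BLCR. It also assumes that cr_run executes the program in place, so that $! is the program's PID, and a real script would likely wait until the program has finished initializing before checkpointing.

```sh
# Illustrative helper, not part of BLCR.
# CR_RUN and CR_CHECKPOINT are hypothetical override variables.
CR_RUN=${CR_RUN:-cr_run}
CR_CHECKPOINT=${CR_CHECKPOINT:-cr_checkpoint}

# run_and_checkpoint PROGRAM [ARGS...]
run_and_checkpoint() {
    # Start the program in the background with libcr loaded via cr_run.
    "$CR_RUN" "$@" &
    # Assumption: cr_run execs the program, so $! is the program's PID.
    app_pid=$!
    # Default scope is --tree; default destination is context.<PID> in the cwd.
    "$CR_CHECKPOINT" "$app_pid"
}
```

A call such as `run_and_checkpoint ./my_app` (with ./my_app standing in for your own program) would leave a context.PID file in the current directory if the checkpoint succeeds.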

File creation/replacement
By default (or if --atomic is passed) cr_checkpoint creates the new context file/directory
atomically: either the checkpoint fails (and any existing context file/directory is
unchanged), or it appears in the directory ready to be used by cr_restart. If an existing
checkpoint with the same file name exists, it will either be left unmodified (if the new
checkpoint fails for any reason), or replaced atomically (via rename(2)). If
--backup[=NAME] is passed, any existing checkpoint will be backed up instead, either to
NAME or with a numbered extension (.~1~, .~2~, etc., with more recent checkpoints having
higher numbers). If --clobber is passed, the checkpoint will immediately remove any
existing checkpoint files, and will write the checkpoint directly out into the target
file/directory: this option uses less disk space if an existing checkpoint is present,
since the old checkpoint is immediately discarded, but if the checkpoint fails, the pre-
existing checkpoint is lost. Finally, if --noclobber is passed, then the checkpoint will
fail if the target file/directory exists.
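
The four replacement policies can be compared side by side with a small wrapper. This is only a sketch: the function name, the policy keywords, and the CR_CHECKPOINT override variable are illustrative, while the flags themselves are the ones described above.

```sh
# Illustrative wrapper, not part of BLCR.
CR_CHECKPOINT=${CR_CHECKPOINT:-cr_checkpoint}

# checkpoint_to FILE POLICY PID
# POLICY is one of: atomic, backup, clobber, noclobber.
checkpoint_to() {
    file=$1 policy=$2 pid=$3
    case $policy in
        atomic|backup|clobber|noclobber) ;;
        *) echo "unknown policy: $policy" >&2; return 2 ;;
    esac
    # The policy keyword maps directly onto the option of the same name.
    "$CR_CHECKPOINT" "--$policy" --file "$file" "$pid"
}
```

For example, `checkpoint_to app.ckpt backup 23452` would keep any earlier app.ckpt as a numbered backup, whereas `checkpoint_to app.ckpt noclobber 23452` would fail rather than replace it.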

File sync
By default (or when --sync is passed), cr_checkpoint waits until the checkpoint is
complete in memory, and additionally calls fsync(2) on all files and directories involved
in the checkpoint (including back-up files) to flush them to disk before exiting. Passing
--nosync causes these fsync calls to be skipped.

Timeout
A maximum timeout in seconds can be set for a checkpoint via the --time flag: if the
checkpoint takes longer than this, cr_checkpoint will print an error message and exit with
an error. If a timeout occurs, the state of the process or processes that were being
checkpointed is undefined.
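
Since a timeout is reported through the exit status, a calling script may want to test it explicitly. The following is a minimal sketch; the function name, messages, and CR_CHECKPOINT override variable are illustrative, and it does not try to distinguish a timeout from other failures.

```sh
# Illustrative wrapper, not part of BLCR.
CR_CHECKPOINT=${CR_CHECKPOINT:-cr_checkpoint}

# checkpoint_with_timeout SECONDS PID
checkpoint_with_timeout() {
    secs=$1 pid=$2
    if "$CR_CHECKPOINT" --time "$secs" "$pid"; then
        echo "checkpoint of $pid completed"
    else
        status=$?   # exit status of cr_checkpoint
        echo "checkpoint of $pid failed or timed out (exit status $status)" >&2
        return "$status"
    fi
}
```

After a failure caused by a timeout, the caller should not assume the target processes are still in a usable state.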

Signals
By default checkpointed processes continue to run after a checkpoint is complete.
Alternatively, you may specify that they be stopped (via --stop), or
terminated/aborted/killed (via --term, --abort, or --kill). This is done by sending the
appropriate signal to every process that is part of the checkpoint. If the processes were
stopped at the time the checkpoint was requested, then --cont may be used to send SIGCONT
to all processes after the checkpoint is completed.
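
A common use of these options is to checkpoint a job and terminate it in one step, so that it can be resumed later. The sketch below assumes this pattern; the function name and the CR_CHECKPOINT override variable are illustrative, and the cr_restart invocation in the comment is an assumption to be checked against cr_restart(1).

```sh
# Illustrative wrapper, not part of BLCR.
CR_CHECKPOINT=${CR_CHECKPOINT:-cr_checkpoint}

# checkpoint_and_term PID [FILE]
# FILE defaults to context.PID, matching the default file name.
checkpoint_and_term() {
    pid=$1
    file=${2:-context.$pid}
    # SIGTERM is sent to every checkpointed process after the checkpoint.
    "$CR_CHECKPOINT" --term --file "$file" "$pid"
    # Later, the job could be resumed with something like (assumed usage):
    #   cr_restart "$file"
}
```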

Memory mapped files
By default, checkpoints do not include any files that are mmap()ed into the process
address space unless they are already unlinked at the time the checkpoint is taken. This
is a space/time saving optimization under the assumption that the files required will
still be present (and uncorrupted) at restart time. Typically the largest savings come
from not saving the executable file or dynamic (a.k.a. shared) libraries. However, options
exist to cause the checkpoint to save these files as well. The flag --save-exe will cause
the executable file to be included in the context file. The flag --save-private will
include in the context file any files that are mapped with the MAP_PRIVATE flag, which
under Linux includes the executable and dynamic/shared libraries. The flag --save-shared
is for saving files that are mapped with the MAP_SHARED flag. Note that this is not the
flag you want for shared libraries. At restart any file saved by these flags will be
mapped into the process regardless of whether any file exists at the original location.
If there is a file at the original location it remains untouched by the restart. Finally,
--save-all and --save-none will cause all (or none) of these optional mmap()ed files to be
saved. The default is --save-none. When several of these options are passed, they are
processed from left to right with all options being additive, except for --save-none, which
cancels the effects of any of these options appearing earlier.
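
The left-to-right rule can be illustrated with a small model. This is not BLCR code, only a sketch of the rule as stated above: the function name and output format are made up for illustration, and it tracks the three option categories rather than actual files.

```sh
# Illustrative model of the --save-* option processing order.
# effective_save_flags FLAG...
effective_save_flags() {
    exe=0 private=0 shared=0          # default is --save-none
    for flag in "$@"; do
        case $flag in
            --save-exe)     exe=1 ;;
            --save-private) private=1 ;;
            --save-shared)  shared=1 ;;
            --save-all)     exe=1 private=1 shared=1 ;;
            --save-none)    exe=0 private=0 shared=0 ;;   # cancels earlier options
        esac
    done
    echo "exe=$exe private=$private shared=$shared"
}

effective_save_flags --save-all --save-none --save-exe
# prints: exe=1 private=0 shared=0
```

Here --save-none cancels the earlier --save-all, and only the --save-exe that follows it remains in effect.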

Checkpointing ptrace()ed processes
There is (currently) no way to fully transparently deal with checkpoints of processes that
are being traced with ptrace(2). Therefore, the default behavior (also available via
--ptraced-error) is to return an error if any of the processes to be checkpointed are
currently being ptraced. However, there are two other possible behaviors to choose among:

--ptraced-skip
Ptraced processes will be silently excluded from the checkpoint. No error is
generated unless this results in zero processes checkpointed.

--ptraced-allow
Ptraced processes will be checkpointed just like any other processes. WARNING:
Because the checkpointed process and the BLCR kernel module must interact using
signals and system calls, the debugger (or other tracer) may need to `continue' the
target process(es), possibly more than once, to allow the checkpoint to complete.

Checkpointing ptrace()ing processes
There is (currently) no way to fully transparently deal with checkpoints of processes that
are tracing other processes using ptrace(2). Therefore, the default behavior (also
available via --ptracer-error) is to return an error if any of the processes to be
checkpointed are currently ptracing other processes. However --ptracer-skip is available
to cause cr_checkpoint to silently exclude such processes from the checkpoint. No error
is generated in that case unless this would result in zero processes checkpointed.
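
When a session may contain a debugger and its target, both skip options can be combined so that the rest of the session is still checkpointed. The following is a sketch only; the function name, the output file name, and the CR_CHECKPOINT override variable are illustrative.

```sh
# Illustrative wrapper, not part of BLCR.
CR_CHECKPOINT=${CR_CHECKPOINT:-cr_checkpoint}

# checkpoint_session_skip_ptrace SID
# Checkpoints session SID, leaving out processes that are being traced
# and processes that are tracing others.
checkpoint_session_skip_ptrace() {
    sid=$1
    "$CR_CHECKPOINT" --session --ptraced-skip --ptracer-skip \
        --file "session.$sid.ckpt" "$sid"
}
```

If every process in the session is skipped, cr_checkpoint still reports an error, as noted above.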

OPTIONS


General options:
-v, --verbose
print progress messages to stderr.

-q, --quiet
suppress error/warning messages to stderr.

-?, --help
print this message and exit.

--version
print version information and exit.

Options for scope of the checkpoint:
-T, --tree
ID identifies a process id. It and all of its descendants are to be checkpointed.
This is the default.

-p, --pid, --process
ID identifies a single process id.

-g, --pgid, --group
ID identifies a process group id.

-s, --sid, --session
ID identifies a session id.

Options for destination location of the checkpoint:
-c, --cwd
checkpoint saved as a single 'context.ID' file in cr_checkpoint's working directory
(default).

-d, --dir DIR
checkpoint saved in new directory DIR, with one 'context.ID' file per process
(unimplemented).

-f, --file FILE
checkpoint saved as FILE.

-F, --fd FD
checkpoint written to an open file descriptor.

Options for creation/replacement policy for checkpoint files:
--atomic
checkpoint created/replaced atomically (default).

--backup[=NAME]
checkpoint created atomically, and any existing checkpoint backed up to NAME or
*.~1~, *.~2~, etc.

--clobber
checkpoint written incrementally to target, overwriting any pre-existing
checkpoint.

--noclobber
checkpoint will fail if the target file exists.

These options are ignored if the destination is a file descriptor.

Options for signal sent to process(es) after checkpoint:
--run
no signal sent: continue execution (default).

-S, --signal NUM
signal NUM sent to all processes.

--stop
SIGSTOP sent to all processes.

--term
SIGTERM sent to all processes.

--abort
SIGABRT sent to all processes.

--kill
SIGKILL sent to all processes.

--cont
SIGCONT sent to all processes.

Options in this group are mutually exclusive. If more than one is given then only
the last will be honored.

Options for file system synchronization (default is --sync):
--sync
fsync checkpoint file(s) to disk (default).

--nosync
do not fsync checkpoint file(s) to disk.

Options to save optional portions of memory:
--save-exe
save the executable file.

--save-private
save private mapped files (executables and libraries are mapped this way).

--save-shared
save shared mapped files (System V IPC is mapped this way).

--save-all
save all of the above.

--save-none
save none of the above (the default).

Options for ptraced processes (default is --ptraced-error):
--ptraced-error
return an error if a checkpoint is requested of a process being ptraced.

--ptraced-skip
ptraced processes are silently excluded from the checkpoint request. If the
checkpoint scope is --tree, then this will also exclude any children of such
processes. No error is produced unless this results in zero processes
checkpointed.

--ptraced-allow
checkpoint ptraced processes normally. WARNING: This may require the tracer to
"continue" the target process(es), possibly more than once.

Options for processes ptracing others (default is --ptracer-error):
--ptracer-error
return an error if a checkpoint is requested of a process which is ptracing others.

--ptracer-skip
processes ptracing others are silently excluded from the checkpoint request. If
the checkpoint scope is --tree, then this will also exclude any children of such
processes. No error is produced unless this results in zero processes
checkpointed.

Options for kernel log messages (default is --kmsg-error):
--kmsg-none
don't report any kernel messages.

--kmsg-error
on checkpoint failure, report on stderr any kernel messages associated with the
checkpoint request.

--kmsg-warning
report on stderr any kernel messages associated with the checkpoint request,
regardless of success or failure. Messages generated in the absence of failure are
considered to be warnings.

Options in this group are mutually exclusive. If more than one is given then only
the last will be honored. Note that --quiet suppresses all stderr output,
including these messages.

Misc Options:
-t, --time SEC
allow only SEC seconds for target to complete checkpoint (default: wait
indefinitely).

EXAMPLES


To checkpoint the process with process ID 23452, saving its state to file context.23452:

cr_checkpoint -p 23452

To checkpoint all the processes in process group 68473, and save them to file groupie:

cr_checkpoint -g -f groupie 68473

To checkpoint all the processes in session 8362, and save separate 'context.PID' files for
each process in directory 'my_checkpoints' (note that --dir is listed as unimplemented
under OPTIONS):

cr_checkpoint -s -d my_checkpoints 8362
