bup-split - Online in the Cloud



bup-split - save individual files to bup backup sets


bup split [-t] [-c] [-n name] COMMON_OPTIONS

bup split -b COMMON_OPTIONS

bup split <--noop [--copy]|--copy> COMMON_OPTIONS

[-r host:path] [-v] [-q] [-d seconds-since-epoch] [--bench] [--max-pack-size=bytes]
[-#] [--bwlimit=bytes] [--max-pack-objects=n] [--fanout=count] [--keep-boundaries]
[--git-ids | filenames...]


bup split concatenates the contents of the given files (or if no filenames are given,
reads from stdin), splits the content into chunks of around 8k using a rolling checksum
algorithm, and saves the chunks into a bup repository. Chunks which have previously been
stored are not stored again (i.e., they are 'deduplicated').

Because of the way the rolling checksum works, chunks tend to be very stable across
changes to a given file, including adding, deleting, and changing bytes.

For example, if you use bup split to back up an XML dump of a database, and the XML file
changes slightly from one run to the next, nearly all the data will still be deduplicated
and the size of each backup after the first will typically be quite small.
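
The stability described above can be illustrated with a small content-defined chunker. This is only a sketch under stated assumptions: it uses a plain rolling sum over a 16-byte window and targets chunks of roughly 64 bytes, whereas bup uses its own "bupsplit" checksum and chunks of around 8k. The names split, WINDOW, and MASK are hypothetical and are not part of bup.

```python
import hashlib
import random

WINDOW = 16          # bytes covered by the rolling sum (illustrative value)
MASK = (1 << 6) - 1  # cut when the low 6 bits are all ones: ~64-byte chunks


def split(data):
    """Cut data after each byte where the rolling sum's low bits are all ones."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # drop the byte leaving the window
        if i + 1 >= WINDOW and (rolling & MASK) == MASK:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes form the last chunk
    return chunks


rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(20000))
edited = original[:10000] + b"X" + original[10000:]  # insert one byte mid-file

# Pretend the chunks of the first version are already in the repository.
stored = {hashlib.sha1(chunk).digest() for chunk in split(original)}
new_chunks = split(edited)
fresh = [c for c in new_chunks if hashlib.sha1(c).digest() not in stored]
print(f"{len(fresh)} of {len(new_chunks)} chunks need storing after the edit")
```

Because each cut depends only on the last WINDOW bytes, an insertion can disturb cut points only within a window's length of the edit. Every chunk that starts after that region is byte-for-byte identical to one already stored.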

Another technique is to pipe the output of the tar(1) or cpio(1) programs to bup split.
When individual files in the tarball change slightly or are added or removed, bup still
processes the remainder of the tarball efficiently. (Note that bup save is usually a more
efficient way to accomplish this, however.)

To get the data back, use bup-join(1).


These options select the primary behavior of the command, with -n being the most likely
choice.

-n, --name=name
after creating the dataset, create a git branch named name so that it can be
accessed using that name. If name already exists, the new dataset will be
considered a descendant of the old name. (Thus, you can continually create new
datasets with the same name, and later view the history of that dataset to see how
it has changed over time.) The original data will also be available as a top-level
file named "data" in the VFS, accessible via bup fuse, bup ftp, etc.

-t, --tree
output the git tree id of the resulting dataset.

-c, --commit
output the git commit id of the resulting dataset.

-b, --blobs
output a series of git blob ids that correspond to the chunks in the dataset.
Incompatible with -n, -t, and -c.

--noop read the data and split it into blocks based on the "bupsplit" rolling checksum
algorithm, but don't do anything with the blocks. This is mostly useful for
benchmarking. Incompatible with -n, -t, -c, and -b.

--copy like --noop, but also write the data to stdout. This can be useful for
benchmarking the speed of read+bupsplit+write for large amounts of data.
Incompatible with -n, -t, -c, and -b.


-r, --remote=host:path
save the backup set to the given remote server. If path is omitted, uses the
default path on the remote server (you still need to include the ':'). The
connection to the remote server is made with SSH. If you'd like to specify which
port, user or private key to use for the SSH connection, we recommend you use the
~/.ssh/config file. Even though the destination is remote, a local bup repository
is still required.

-d, --date=seconds-since-epoch
specify the date inscribed in the commit (seconds since 1970-01-01).
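
A seconds-since-epoch value for a given calendar date can be computed in many ways. The following is a minimal sketch using Python's standard library, assuming the date is meant as UTC. The helper name epoch_seconds is hypothetical.

```python
from datetime import datetime, timezone


def epoch_seconds(year, month, day, hour=0, minute=0, second=0):
    """Return whole seconds between 1970-01-01 00:00:00 UTC and the given UTC time."""
    moment = datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
    return int(moment.timestamp())


print(epoch_seconds(2000, 1, 1))  # 946684800
```

The printed number is the kind of value -d expects.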

-q, --quiet
disable progress messages.

-v, --verbose
increase verbosity (can be used more than once).

--git-ids
stdin is a list of git object ids instead of raw data. bup split will read the
contents of each named git object (if it exists in the bup repository) and split
it. This might be useful for converting a git repository with large binary files
to use bup-style hashsplitting instead. This option is probably most useful when
combined with --keep-boundaries.

--keep-boundaries
if multiple filenames are given on the command line, they are normally concatenated
together as if the content all came from a single file. That is, the set of
blobs/trees produced is identical to what it would have been if there had been a
single input file. However, if you use --keep-boundaries, each file is split
separately. You still only get a single tree or commit or series of blobs, but
each blob comes from only one of the files; the end of one of the input files
always ends a blob.
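
The difference between the two behaviors can be sketched with a toy splitter. As a stand-in for the rolling checksum, this example simply cuts after every newline byte. That rule, and the names toy_split and split_files, are assumptions made for illustration and do not reflect how bup chooses cut points.

```python
def toy_split(data):
    """Toy stand-in for hashsplitting: end a chunk after every newline byte."""
    chunks, start = [], 0
    for i, byte in enumerate(data):
        if byte == 0x0A:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # leftover bytes form the final chunk
    return chunks


def split_files(files, keep_boundaries=False):
    """Split file contents either as one concatenated stream or file by file."""
    if keep_boundaries:
        # Each file is split on its own, so a file's end always ends a chunk.
        return [chunk for content in files for chunk in toy_split(content)]
    # Default: the files behave like a single concatenated input.
    return toy_split(b"".join(files))


files = [b"aaa\nbb", b"cc\ndd"]
print(split_files(files))                        # [b'aaa\n', b'bbcc\n', b'dd']
print(split_files(files, keep_boundaries=True))  # [b'aaa\n', b'bb', b'cc\n', b'dd']
```

In the default mode the chunk b'bbcc\n' spans both inputs. With keep_boundaries=True no chunk contains bytes from more than one file.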

--bench
print benchmark timings to stderr.

--max-pack-size=bytes
never create git packfiles larger than the given number of bytes. Default is 1
billion bytes. Usually there is no reason to change this.

--max-pack-objects=n
never create git packfiles with more than the given number of objects. Default is
200 thousand objects. Usually there is no reason to change this.

--fanout=numobjs
when splitting very large files, try to keep the number of elements in trees to an
average of numobjs.

--bwlimit=bytes/sec
don't transmit more than bytes/sec bytes per second to the server. This is good
for making your backups not suck up all your network bandwidth. Use a suffix like
k, M, or G to specify multiples of 1024, 1024*1024, 1024*1024*1024 respectively.
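
The suffix arithmetic can be sketched as follows. This is an illustration of the multipliers only, not bup's own parser. The helper name parse_size is hypothetical, and it accepts only the exact suffix spellings k, M, and G listed above.

```python
SUFFIXES = {"k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a value such as '200k' into a plain number of bytes."""
    text = text.strip()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)  # no suffix: the value is already in bytes


print(parse_size("200k"))  # 204800
print(parse_size("1M"))    # 1048576
```

So a limit written as 200k corresponds to 204800 bytes per second.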

-#, --compress=#
set the compression level to # (a value from 0-9, where 9 is the highest and 0 is
no compression). The default is 1 (fast, loose compression).
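
To get a feel for the trade-off, the sketch below compresses a repetitive sample with Python's zlib module at levels 0, 1, and 9. It assumes the levels behave like zlib's 0-9 scale. The sample data is made up, and real backups will compress differently.

```python
import zlib

sample = b"bup split stores chunks in git packfiles. " * 500

# Compressed size at "no compression", the default level, and the highest level.
sizes = {level: len(zlib.compress(sample, level)) for level in (0, 1, 9)}
for level, size in sizes.items():
    print(f"level {level}: {size} bytes (input {len(sample)} bytes)")
```

Level 0 stores the data with a little framing overhead, so its output is slightly larger than the input, while level 1 already removes most of the redundancy in data like this.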


$ tar -cf - /etc | bup split -r myserver: -n mybackup-tar
tar: Removing leading `/' from member names
Indexing objects: 100% (196/196), done.

$ bup join -r myserver: mybackup-tar | tar -tf - | wc -l
