NAME


bmf - efficient Bayesian mail filter

SYNOPSIS


bmf [-t] [-n] [-s] [-N] [-S] [-f fmt] [-d db] [-i file] [-k n] [-m type] [-p]
[-v] [-V] [-h]

DESCRIPTION


bmf is a Bayesian mail filter. In its normal mode of operation, it takes an email message
or other text on standard input, does a statistical check against lists of "good" and
"spam" words, registers the new data, and returns a status code indicating whether or not
the message is spam. BMF is written with fast, zero-copy algorithms, coded directly in C,
and tuned for speed. It aims to be faster, smaller, and more versatile than similar
applications.

bmf supports both mbox and maildir mail storage formats. It will automatically process
multiple messages within an mbox file separately.

OPTIONS


Without command-line options, bmf processes the input, registers it as either "good" or
"spam", and returns the corresponding status code. The word list directory and any missing
word list files are created automatically.

-t Test to see if the input is spam. The word lists are not updated. A report is written
to stdout showing the final score and the tokens with the highest deviation from a mean of
0.5.

-n Register the input as non-spam.

-s Register the input as spam.

-N Register the input as non-spam and undo a prior registration as spam.

-S Register the input as spam and undo a prior registration as non-spam.

-f fmt Specify database format. Valid formats are text, db, and mysql. Text is always
valid. The others may not be available if the corresponding option was not enabled at
compile time. The default is db if available, else text.

-d db Specify database or directory for loading and saving word lists. The default is
~/.bmf in text mode.

-i file Use file for input instead of stdin.

-k n Specify the number of extrema (keepers) to use in the Bayes calculation. The default
is 15.

-m type Specify mail storage format. Valid formats are mbox and maildir. The default is to
automatically detect the mail storage format. This option is deprecated.

-p Copy the input to the output (passthrough) and insert spam headers in the style of
SpamAssassin. An X-Spam-Status header is always inserted with processing details. The
contents of this header always begin with either "Yes" or "No". If the input is judged to
be spam, the header "X-Spam-Flag: YES" is also inserted.
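
A downstream script can read the verdict back from the headers described under -p. The
following is a minimal sketch, not part of bmf: the function name spam_verdict and the
sample message are hypothetical, and the text after "Yes" or "No" in X-Spam-Status is
deliberately ignored because its exact layout is not specified here.

```python
from email import message_from_string


def spam_verdict(filtered_message: str) -> bool:
    """Return True if a passthrough-filtered message was judged to be spam.

    Relies only on the documented behaviour that X-Spam-Status begins
    with "Yes" or "No"; the remainder of the header is not inspected.
    """
    msg = message_from_string(filtered_message)
    status = msg.get("X-Spam-Status", "")
    return status.strip().lower().startswith("yes")


# Hypothetical filtered message; the processing details are a placeholder.
sample = (
    "From: someone@example.com\n"
    "Subject: hello\n"
    "X-Spam-Status: Yes, (processing details)\n"
    "X-Spam-Flag: YES\n"
    "\n"
    "body text\n"
)
print(spam_verdict(sample))
```

Checking X-Spam-Status rather than X-Spam-Flag works for both outcomes, since the flag
header is only present when the message is judged to be spam.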

-v Be more verbose. This option is not well supported yet.

-V Display version information.

-h Display usage information.

THEORY OF OPERATION


bmf treats its input as a bag of tokens. Each token is checked against "good" and "bad"
wordlists, which maintain counts of the numbers of times it has occurred in non-spam and
spam mails. These numbers are used to compute the probability that a mail in which the
token occurs is spam. After probabilities for all input tokens have been computed, a fixed
number of the probabilities that deviate furthest from average are combined using Bayes's
theorem on conditional probabilities.
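
The calculation described above can be sketched as follows. This is an illustration that
loosely follows Graham's essay, not bmf's actual code: the function names, the 0.4 value
for unseen tokens, and the 0.01/0.99 clamping are assumptions, and bmf's own formulas may
differ. The keepers parameter corresponds to the -k option.

```python
def token_spam_probability(good_count, spam_count, n_good, n_spam):
    """Estimate the probability that a mail containing this token is spam.

    good_count / spam_count: occurrences of the token in each word list.
    n_good / n_spam: number of messages registered in each list.
    """
    if good_count + spam_count == 0:
        return 0.4  # assumed neutral-ish value for tokens never seen before
    g = min(1.0, good_count / max(n_good, 1))
    b = min(1.0, spam_count / max(n_spam, 1))
    p = b / (g + b)
    # Clamp so that no single token can force the result to exactly 0 or 1.
    return min(0.99, max(0.01, p))


def combine(probs, keepers=15):
    """Combine the `keepers` probabilities that lie furthest from 0.5."""
    extrema = sorted(probs, key=lambda p: abs(p - 0.5), reverse=True)[:keepers]
    prod_spam = 1.0
    prod_good = 1.0
    for p in extrema:
        prod_spam *= p
        prod_good *= 1.0 - p
    return prod_spam / (prod_spam + prod_good)
```

With no tokens at all, both products stay at 1.0 and the combined score is 0.5, which is
the neutral point that the extrema are measured against.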

While this method sounds crude compared to the more usual pattern-matching approach, it
turns out to be extremely effective. Paul Graham's paper A Plan For Spam:
http://www.paulgraham.com/spam.html is recommended reading.

bmf improves on Graham's proposal by doing smarter lexical analysis. In particular,
hostnames and IP addresses are kept as tokens, while certain types of MTA information
(such as message IDs and dates) are discarded.
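
As a rough illustration of this kind of lexical analysis, the sketch below keeps dotted
names such as hostnames and IP addresses as single tokens and skips Message-ID and Date
header lines. It is not bmf's lexer: the regular expression, the list of skipped headers,
and the function name tokenize are all assumptions made for the example.

```python
import re

# Assumed set of header lines to drop; bmf's real rules are not given here.
SKIPPED_HEADERS = ("message-id:", "date:")

# Runs of word characters, optionally joined by dots, so that
# "mail.example.com" and "192.0.2.1" each survive as one token.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'-]+(?:\.[A-Za-z0-9$'-]+)*")


def tokenize(text):
    """Split text into tokens, dropping the assumed MTA header lines."""
    tokens = []
    for line in text.splitlines():
        if line.lower().startswith(SKIPPED_HEADERS):
            continue  # message ids and dates carry no useful signal
        tokens.extend(TOKEN_RE.findall(line))
    return tokens
```

For simplicity the sketch applies the header filter to every line, including body lines;
a real lexer would track whether it is still inside the header block.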

MIME and other attachments are not decoded. Experience from watching the token streams
suggests that spam with enclosures invariably gives itself away through cues in the
headers and non-enclosure parts. Nonetheless, I would like to add the ability to decode
quoted-printable and perhaps base64 encodings for textual attachments.

INTEGRATION WITH OTHER TOOLS


Please see /usr/share/doc/bmf/README.gz for samples and suggestions.

RETURN VALUES


In passthrough mode: zero for success, nonzero for failure.

In non-passthrough mode: 0 for spam; 1 for non-spam; 2 for I/O or other errors.
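
A calling script might map these status codes as in the sketch below. The helper names
are hypothetical, bmf is assumed to be on the PATH, and it is assumed that -t returns the
same non-passthrough status codes as the default mode.

```python
import subprocess

# Non-passthrough status codes as documented above.
VERDICTS = {0: "spam", 1: "non-spam"}


def interpret_exit_status(code: int) -> str:
    """Translate a non-passthrough bmf exit status into a label."""
    return VERDICTS.get(code, "error")


def classify(path: str) -> str:
    """Test a message file without updating the word lists.

    Assumes bmf is installed and that -t uses the same status codes
    as the default mode; the report written to stdout is discarded.
    """
    result = subprocess.run(
        ["bmf", "-t", "-i", path],
        stdout=subprocess.DEVNULL,
    )
    return interpret_exit_status(result.returncode)
```

Note that 0 means spam, so in a shell conditional a "successful" exit indicates a spam
message rather than a clean one.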
