Spam Assassin

The Spam Assassin public mail corpus.

Usage

var corpus = require( '@stdlib/datasets/spam-assassin' );

corpus()

Returns the Spam Assassin public mail corpus.

var data = corpus();
// returns [{...},{...},...]

Each array element has the following fields:

id: message id (relative to message group)
group: message group
checksum: object containing checksum info
text: message text (including headers)

The message group may be one of the following:

easy-ham-1: easier to detect non-spam e-mails (2500 messages)
easy-ham-2: easier to detect non-spam e-mails collected at a later date (1400 messages)
hard-ham-1: harder to detect non-spam e-mails (250 messages)
spam-1: spam e-mails (500 messages)
spam-2: spam e-mails collected at a later date (1396 messages)

The checksum object contains the following fields:

type: checksum type (e.g., MD5)
value: checksum value
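
As a minimal sketch of working with the fields described above, the following tallies messages per group and flags spam by group name. The `sample` array is hypothetical stand-in data (its ids, checksum values, and text are made up); in practice, pass the array returned by `corpus()`. The helpers `countByGroup` and `isSpam` are illustrative names and are not part of the package.

```javascript
// Hypothetical records that mirror the documented shape (id, group, checksum, text).
// All values are placeholders, not actual corpus contents.
var sample = [
    {
        'id': 'a',
        'group': 'easy-ham-1',
        'checksum': { 'type': 'MD5', 'value': 'placeholder-1' },
        'text': 'Subject: hello\n\nHi there.'
    },
    {
        'id': 'b',
        'group': 'easy-ham-1',
        'checksum': { 'type': 'MD5', 'value': 'placeholder-2' },
        'text': 'Subject: lunch\n\nNoon?'
    },
    {
        'id': 'a',
        'group': 'spam-2',
        'checksum': { 'type': 'MD5', 'value': 'placeholder-3' },
        'text': 'Subject: WIN NOW\n\nClick here.'
    }
];

// Count how many messages belong to each message group.
function countByGroup( data ) {
    var counts = {};
    var g;
    var i;
    for ( i = 0; i < data.length; i++ ) {
        g = data[ i ].group;
        counts[ g ] = ( counts[ g ] || 0 ) + 1;
    }
    return counts;
}

// Treat a message as spam when its group name begins with "spam".
function isSpam( message ) {
    return message.group.indexOf( 'spam' ) === 0;
}

var counts = countByGroup( sample );
var spamCount = sample.filter( isSpam ).length;

console.log( counts );
console.log( 'Spam messages: %d', spamCount );
```

Because the id is relative to the message group, a message is only uniquely identified by the combination of `group` and `id`.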

Examples

var corpus = require( '@stdlib/datasets/spam-assassin' );

var data;
var i;

data = corpus();
for ( i = 0; i < data.length; i++ ) {
    console.log( 'Character Count: %d', data[ i ].text.length );
}

CLI

Usage

Usage: spam-assassin [options]

Options:

  -h,    --help                Print this message.
  -V,    --version             Print the package version.
         --format fmt          Output format: 'txt' or 'ndjson'.

Notes

The CLI supports two output formats: plain text (txt) and newline-delimited JSON (ndjson). The default output format is txt.

Examples

$ spam-assassin
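
Output produced with `--format ndjson` can be consumed line by line from JavaScript. The following is a rough sketch, assuming each output line holds one JSON-serialized message object with the fields described for `corpus()`; that layout is an assumption here, not something this README states. The `parseNDJSON` helper and the `sampleOutput` string are hypothetical and stand in for real CLI output.

```javascript
// Hypothetical stand-in for CLI output captured with `--format ndjson`.
// JSON.stringify escapes embedded newlines, so each record stays on one line.
var sampleOutput = [
    JSON.stringify({
        'id': 'a',
        'group': 'spam-1',
        'checksum': { 'type': 'MD5', 'value': 'placeholder-1' },
        'text': 'Subject: offer\n\nBuy now.'
    }),
    JSON.stringify({
        'id': 'b',
        'group': 'hard-ham-1',
        'checksum': { 'type': 'MD5', 'value': 'placeholder-2' },
        'text': 'Subject: newsletter\n\nThis month...'
    })
].join( '\n' ) + '\n';

// Parse newline-delimited JSON into an array of objects, skipping blank lines.
function parseNDJSON( str ) {
    var lines = str.split( /\r?\n/ );
    var out = [];
    var i;
    for ( i = 0; i < lines.length; i++ ) {
        if ( lines[ i ].trim() === '' ) {
            continue;
        }
        out.push( JSON.parse( lines[ i ] ) );
    }
    return out;
}

var records = parseNDJSON( sampleOutput );
for ( var j = 0; j < records.length; j++ ) {
    console.log( '%s/%s: %d characters', records[ j ].group, records[ j ].id, records[ j ].text.length );
}
```

For the full corpus, reading standard input incrementally (for example, with Node's `readline` module) avoids buffering all of the output in memory before parsing.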

License

The data files (databases) are licensed under an Open Data Commons Public Domain Dedication & License 1.0 and their contents are licensed under Creative Commons Zero v1.0 Universal. The software is licensed under Apache License, Version 2.0.
