AboutCode is a family of FOSS projects to discover, report and manage metadata about software:
- Where does the software come from?
- What is its license? its copyright?
- What are its dependencies?
- Is the software well maintained?
- Are there security vulnerabilities?
- Are there licensing issues?
These are all important questions because there are millions of free and open source software components available on the web for reuse.
Knowing where a software package comes from, what its license is and whether it is vulnerable should be a solved problem, so that everyone can safely consume more free and open source software. We support not only open source software, but also open data, generated and curated by our applications.
AboutCode Projects Overview
AboutCode has been designed as a modular stack of applications, tools, libraries and data. All of the software is open source (primarily licensed under Apache-2.0) and all of the data is open (primarily licensed under CC-BY-SA-4.0).
The AboutCode stack supports important industry standards including:
Package-URL (PURL): a widely used standard to identify software packages of any type with simple, readable and concise URLs. The PURL standard is ECMA-427.
CycloneDX: (OWASP CycloneDX) is a full-stack Bill of Materials (BOM) standard (ECMA-424) that provides advanced supply chain capabilities for cyber risk reduction.
SPDX: (System Package Data Exchange) is a specification for representing systems with software components as SBOMs (Software Bill of Materials) and other AI, data and security references.
The following sections provide details about each project.
Application Projects
dejacode
DejaCode provides an enterprise-level application to automate open source license compliance and ensure software supply chain integrity, powered by ScanCode.
scancode.io
ScanCode.io provides a Web UI and API to run and review complex scans in rich scripted pipelines, on different kinds of containers, Docker images, package archives, manifests, etc., to get information on licenses, copyrights, sources, and vulnerabilities.
vulnerablecode
VulnerableCode provides a Web UI and API to access a database of known software package vulnerabilities with comprehensive information from upstream and downstream public sources including packages affected by a vulnerability and packages that fix a vulnerability. There is a public VulnerableCode database at: https://public.vulnerablecode.io/ and the project also provides the tools to build your own instance of the database.
ScanCode projects
scancode-toolkit
ScanCode Toolkit is a set of code scanning tools that detect the origin (copyrights), license and vulnerabilities of code, packages and dependencies in a codebase.
scancode-licensedb
ScanCode LicenseDB is a free and open database of software and related licenses with over 2400 curated license texts, their metadata and ScanCode license detection rules. There is a public database available at: https://scancode-licensedb.aboutcode.org/
scancode-workbench
ScanCode Workbench is an application to visualize and review scan results from ScanCode Toolkit scans. You can install and use the Workbench on a Linux, macOS or Windows desktop.
scancode-analyzer
Post-scan plugin to improve the accuracy of license detection by leveraging ScanCode scan data.
scancode-action
scancode-action enables you to run ScanCode.io pipelines from your Workflows.
scancode-plugins
Set of plugins delivered either as built-in scancode-toolkit plugins or as extra plugins.
matchcode-toolkit
Collection of plugins that makes matchcode-related functions available for scancode-toolkit and scancode.io.
federatedcode
federatedcode is a decentralized, federated metadata system for open source software code and security information.
aboutcode-toolkit
aboutcode-toolkit is a set of command line tools to document the provenance of your code and generate attribution notices. aboutcode-toolkit uses small yaml files to document code provenance inside a codebase.
Package-URL (PURL) projects
purldb
PurlDB provides tools to create and update a database of package metadata keyed by PURL (Package URL) and an API for the PURL data.
purldb-toolkit
Command line utility and library to use the PurlDB, its API and various related libraries.
purl-validator
Decentralized PURL validator so that libraries can use it offline and help them create better PURLs.
purlvalidator-go
Go library for validating Package URLs (PURLs). It works fully offline, including in air-gapped or restricted environments.
packageurl-python
Python library to parse and build Package-URLs aka PURLs.
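To illustrate what parsing a PURL involves, here is a minimal standard-library sketch that splits a `pkg:type/namespace/name@version?qualifiers#subpath` string into its components. It is not the packageurl-python API: `parse_purl` is a hypothetical name used only for this example, and the sketch skips the validation and type-specific normalization that a real implementation would apply.

```python
from urllib.parse import unquote


def parse_purl(purl: str) -> dict:
    """Split a PURL string into its components (simplified, hypothetical helper)."""
    scheme, sep, remainder = purl.partition(":")
    if not sep or scheme.lower() != "pkg":
        raise ValueError(f"not a PURL: {purl!r}")

    # The optional subpath follows the last '#'.
    subpath = None
    if "#" in remainder:
        remainder, subpath = remainder.rsplit("#", 1)
        subpath = unquote(subpath.strip("/")) or None

    # The optional qualifiers follow the last '?', as key=value pairs joined by '&'.
    qualifiers = {}
    if "?" in remainder:
        remainder, query = remainder.rsplit("?", 1)
        for pair in query.split("&"):
            key, _, value = pair.partition("=")
            if key and value:
                qualifiers[key.lower()] = unquote(value)

    # The type is the first path segment and is lowercased.
    remainder = remainder.lstrip("/")
    type_, sep, remainder = remainder.partition("/")
    if not sep or not type_:
        raise ValueError(f"missing type or name: {purl!r}")

    # The optional version follows the last '@'.
    version = None
    if "@" in remainder:
        remainder, version = remainder.rsplit("@", 1)
        version = unquote(version) or None

    # The name is the last path segment; anything before it is the namespace.
    namespace, _, name = remainder.strip("/").rpartition("/")
    if not name:
        raise ValueError(f"missing name: {purl!r}")

    return {
        "type": type_.lower(),
        "namespace": unquote(namespace) or None,
        "name": unquote(name),
        "version": version,
        "qualifiers": qualifiers,
        "subpath": subpath,
    }
```

For example, `parse_purl("pkg:npm/%40angular/animation@12.3.1")` yields the type `npm`, the namespace `@angular`, the name `animation` and the version `12.3.1`.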
univers
univers is a Python package to parse and compare package versions and package version ranges across ecosystems such as Debian, npm, PyPI, Ruby and more. It processes the version range specs and expressions of each of these ecosystems.
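As a rough illustration of why version comparison needs dedicated parsing, the sketch below compares plain dotted numeric versions and checks them against a list of range constraints. It is not the univers API: `version_key` and `satisfies` are hypothetical names, and the sketch assumes purely numeric versions, whereas real ecosystems add epochs, pre-release tags and other rules that it does not handle.

```python
import operator

# Comparison operators allowed in a constraint (an assumption of this sketch).
_OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}


def version_key(version: str) -> tuple:
    """Turn a dotted numeric version such as '1.10.0' into a comparable tuple."""
    parts = [int(part) for part in version.split(".")]
    # Drop trailing zeros so that '1.0' and '1.0.0' compare as equal.
    while parts and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def satisfies(version: str, constraints: list) -> bool:
    """Return True if the version meets every (operator, bound) constraint."""
    key = version_key(version)
    return all(_OPS[op](key, version_key(bound)) for op, bound in constraints)
```

Comparing the tuples rather than the raw strings matters: as strings, `"1.10.0"` sorts before `"1.9.3"`, while `version_key("1.10.0") > version_key("1.9.3")` holds as expected.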
Inspectors
android-inspector
Library of utilities to introspect source and binary Android apps and Android device firmware.
binary-inspector
binary-inspector is a utility to extract symbols from various kinds of binaries, i.e. ELF, Mach-O, WinPE and other binary formats.
container-inspector
container-inspector is a tool to analyze the structure and provenance of software components in Docker images using static analysis.
debian-inspector
debian-inspector is a tool to inspect Debian codebases.
dependency-inspector
General purpose, mostly universal software package dependency resolver.
elf-inspector
elf-inspector is a set of utilities to inspect binary ELF files and collect interesting data from them.
go-inspector
go-inspector is a utility to extract dependencies and symbols from Go binaries.
nuget-inspector
nuget-inspector is a tool to inspect manifests and code to resolve dependencies (vulnerable and non-vulnerable) for NuGet packages.
python-inspector
python-inspector is a tool to inspect manifests and code to resolve dependencies (vulnerable and non-vulnerable) for Python packages.
rpm-inspector
Python library to collect data from RPM packages including installed packages.
rust-inspector
rust-inspector is a utility to extract dependencies and symbols from Rust binaries.
source-inspector
source-inspector is a set of utilities to inspect and analyze source code and collect interesting data such as code symbols, strings and comments.
Libraries
ag-gen-code-search
Open source tools to find code that may have been generated using LLMs and GPT tools.
ahocode
Pure Python implementation for pyahocorasick.
bitcode
Pure Python implementation for intbitset.
commoncode
commoncode provides a set of common functions and utilities for handling various things like paths, dates, files and hashes.
extractcode
extractcode is a mostly universal file extraction library and CLI tool to extract almost any archive in a reasonably safe way on Linux, macOS and Windows.
fetchcode
fetchcode is a utility to reliably fetch any code via HTTP, FTP and version control systems such as git.
license-expression
license-expression is a library to parse, analyze, compare and normalize SPDX and SPDX-like license expressions using a boolean logic expression engine. The underlying boolean engine is at: https://github.com/bastikr/boolean.py.
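The first step in handling a license expression is to separate license symbols from operators and parentheses. The sketch below shows only that tokenizing step using the standard library. It is not the license-expression API: `license_symbols` is a hypothetical name, and matching the `AND`, `OR` and `WITH` operators case-insensitively is an assumption of this example.

```python
import re

# A token is either a single parenthesis or a run of non-space, non-parenthesis characters.
_TOKEN = re.compile(r"[()]|[^\s()]+")
_OPERATORS = {"AND", "OR", "WITH"}


def license_symbols(expression: str) -> list:
    """Return the unique license and exception symbols, in order of first appearance."""
    seen = []
    for token in _TOKEN.findall(expression):
        # Skip grouping parentheses and boolean operators.
        if token in ("(", ")") or token.upper() in _OPERATORS:
            continue
        if token not in seen:
            seen.append(token)
    return seen
```

A full implementation would go on to build a boolean expression tree from these tokens, which is what makes comparing and normalizing expressions possible.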
plugincode
Library that provides pluggable functionality with plugins, including Click plugins. It is used by ScanCode toolkit and related projects.
pygmars
Tool to craft simple regex-based lexers and parsers for small languages. It builds parsers from grammars and accepts Pygments lexers as input.
sanexml
sanexml is a fallback library for the lxml.etree module, so its functions have the same names and parameters.
saneyaml
Cleaner, simpler, safer and saner YAML parsing/serialization in Python, for YAML meant to be readable first, on top of PyYAML.
scorecode
Library to fetch and store various software package scores, such as OpenSSF Scorecard data.
turbo-spdx
Fast and lightweight Python library for parsing and writing SPDX JSON documents correctly.
typecode
Provides comprehensive filetype and mimetype detection using multiple detectors including libmagic (included as a dependency for Linux, Windows and macOS) and Pygments.
Supporters
The home for AboutCode software is the aboutcode-org organization on GitHub. AboutCode is managed by AboutCode Europe ASBL (a Brussels-based non-profit) and is supported by:
- Contributions from users like you
- Google, including the Google Summer of Code and Season of Docs programmes
- Mercedes-Benz Group
- Microsoft and Microsoft Azure
- nexB Inc.
- The European Commission NGI programme
- The NLnet Foundation
- The Swiss State Secretariat for Education, Research and Innovation (SERI)
- Zeiss
- and many others!