RecursiveExtractor is a .NET Standard 2.0/2.1 archive extraction library which can process 7zip, ar, bzip2, deb, gzip, iso, rar, tar, vhd, vhdx, vmdk, wim, xzip, and zip archives and any nested combination of the supported formats.

gat786/RecursiveExtractor

About

Recursive Extractor is a .NET Standard 2.0/2.1 Library for parsing archive files and disk images, including nested archives and disk images.

Recursive Extractor is available on NuGet as Microsoft.CST.RecursiveExtractor.

Supported File Types (alphabetical)

  • 7zip
  • ar
  • bzip2
  • deb
  • gzip
  • iso
  • rar
  • tar
  • vhd
  • vhdx
  • vmdk
  • wim
  • xzip
  • zip

Usage

This example prints the path of every file found in the archive, including files inside nested archives.

var path = "/Path/To/Your/Archive";
var extractor = new Extractor();
try {
    IEnumerable<FileEntry> results = extractor.ExtractFile(path);
    foreach(var found in results)
    {
        Console.WriteLine(found.FullPath);
    }
}
catch(OverflowException)
{
    // This means Recursive Extractor has detected a Quine or Zip Bomb
}

Advanced Usage

To process only the files that match a condition, pass a filter delegate through ExtractorOptions. The delegate has this signature:

public delegate bool PassFilter(FileEntryInfo fileEntryInfo);

For example, to only get files larger than 1000 bytes:

var path = "/Path/To/Your/Archive";
var extractor = new Extractor();
try {
    IEnumerable<FileEntry> results = extractor.ExtractFile(path, new ExtractorOptions() { Filter = SizeGreaterThan1000 });
    foreach(var found in results)
    {
        Console.WriteLine(found.FullPath);
    }
}
catch(OverflowException)
{
    // This means Recursive Extractor has detected a Quine or Zip Bomb
}

private bool SizeGreaterThan1000(FileEntryInfo fei)
{
    return fei.Size > 1000;
}

FileEntryInfo

The FileEntryInfo object exposes these read-only properties:

public string Name { get; }
public string ParentPath { get; }
public long Size { get; }
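
A filter can combine these properties. The following is a minimal sketch, not part of the library: the FilterExamples class and the IsLargeLog helper are hypothetical names, and the using directive assumes the namespace matches the NuGet package name. It also assumes Name ends with the file's extension.

```csharp
using System;
using System.IO;
using Microsoft.CST.RecursiveExtractor; // assumption: namespace matches the package name

public static class FilterExamples
{
    // Hypothetical helper: the rule itself, kept free of library types
    // so it can be checked without an archive.
    public static bool IsLargeLog(string name, long size)
    {
        var extension = Path.GetExtension(name);
        return size > 1000
            && string.Equals(extension, ".log", StringComparison.OrdinalIgnoreCase);
    }

    // Same shape as the PassFilter delegate: takes a FileEntryInfo, returns bool.
    public static bool LargeLogFilter(FileEntryInfo fei)
    {
        return IsLargeLog(fei.Name, fei.Size);
    }
}
```

It would be passed the same way as SizeGreaterThan1000, for example `new ExtractorOptions() { Filter = FilterExamples.LargeLogFilter }`.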

Exceptions

ExtractFile throws an OverflowException when a quine or zip bomb is detected.

Any other invalid file found while crawling is logged and skipped. RecursiveExtractor uses NLog for logging.
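
Because skipped files are reported only through NLog, the messages are not visible unless the host application configures a target. The following sketch uses NLog's programmatic configuration to send messages to the console. The LoggingSetup class is a hypothetical name, and the Debug minimum level is an assumption, since the level the library logs at is not stated here.

```csharp
using NLog;
using NLog.Config;
using NLog.Targets;

public static class LoggingSetup
{
    // Routes every NLog message at Debug level or above to the console.
    // Call this once, before extracting.
    public static void EnableConsoleLogging()
    {
        var config = new LoggingConfiguration();
        var console = new ConsoleTarget("console");

        // Assumption: Debug is low enough to capture the skip messages.
        config.AddRule(LogLevel.Debug, LogLevel.Fatal, console);

        LogManager.Configuration = config;
    }
}
```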

Feedback

If you have any issues or feature requests, please open a new Issue.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
