Tools for extracting figures, tables, text, and other elements from a PDF document.
Make sure Python 3 is installed on your computer. Installing on Ubuntu is recommended.
To get a local copy up and running, follow these steps.
The following must be installed before you can use the software:
- Detectron2
Requirements
- CUDA 10.1
- PyTorch >= 1.7.0
Installation instructions for CUDA 10.1: https://developer.nvidia.com/cuda-10.1-download-archive-base
Installation instructions for PyTorch: https://pytorch.org/
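Before moving on, it can help to confirm that the installed PyTorch matches the requirements listed above. The following is a minimal sketch, not part of this repository: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the check assumes the PyTorch build reports its CUDA version through `torch.version.cuda`.

```python
import importlib.util
import re


def parse_version(text):
    """Turn a version string such as '1.7.0+cu101' into a tuple like (1, 7, 0)."""
    # Drop any local build suffix after '+', then keep the first three numbers.
    numbers = re.findall(r"\d+", text.split("+")[0])
    numbers = (numbers + ["0", "0", "0"])[:3]  # pad short versions such as '1.7'
    return tuple(int(n) for n in numbers)


def meets_minimum(installed, minimum="1.7.0"):
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment():
    """Return a list of problems found; an empty list means nothing was flagged."""
    if importlib.util.find_spec("torch") is None:
        return ["PyTorch is not installed"]

    import torch  # imported lazily so the script still runs without PyTorch

    problems = []
    if not meets_minimum(str(torch.__version__)):
        problems.append(f"PyTorch {torch.__version__} is older than 1.7.0")
    if torch.version.cuda != "10.1":
        problems.append(
            f"PyTorch was built for CUDA {torch.version.cuda}, expected 10.1"
        )
    if not torch.cuda.is_available():
        problems.append("No CUDA device is visible to PyTorch")
    return problems


if __name__ == "__main__":
    found = check_environment()
    print("\n".join(found) if found else "Environment looks OK")
```

Saving this under any file name and running it with `python3` prints either a list of mismatches or a single confirmation line.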
After installing the packages above, follow the instructions below to install detectron2:
$ pip install cython pyyaml==5.1
$ pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu100/index.html
After installing detectron2, run:
$ pip install -r requirments.txt
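Once the install commands have finished, a quick import check can show whether the packages are visible to Python. This is a rough sketch under assumptions: the helper `missing_modules` is hypothetical, and the module list is a guess based on this README (`detectron2` from the install step, `streamlit` because the demo is started with `streamlit run`); the actual contents of the requirements file may differ.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed module names; adjust to match the requirements file.
    required = ["torch", "detectron2", "streamlit"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All listed modules were found")
```

The check only looks the modules up and does not import them, so it stays fast and does not need a GPU to run.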
- Clone the repo
git clone https://github.com/Wild-Rift/Document-Layout-Analysis.git
- Run demo
streamlit run virtualize.py