I have a PDF file that contains a table, e.g. employee (empID, empName, Title). I want to convert this PDF file to Excel and then load that table from Excel into a DataTable in my code.
Stop asking for quick responses from complete strangers. The fact that something is urgent to you does not make it so for us. – Oded, Nov 7, 2010 at 13:43
I have tried to parse the PDF using abcpdf.net. It converts the PDF to a text file, but the result is unstructured because my PDF file contains multiple tables. So I thought of converting the PDF to an Excel file and then working with the Excel file in my code. – hatem, Nov 7, 2010 at 14:20
Can you post an example of the PDF you are trying to extract from? Then we may be able to give you some extra clues. – Andrew Cash, Nov 8, 2010 at 5:04
1 Answer
If your file was created with Structured Content in it, then it may be possible to extract all the data as an XML file and then import that XML into Excel.
Otherwise, you are pretty much left with a bunch of text blocks, and there is probably nothing you can do about it.
For more information, check the great article about PDF text on JPedal's blog.
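That said, for a simple, regular table it can be worth trying a positional heuristic before giving up. The sketch below is only an illustration and rests on assumptions not confirmed anywhere in this thread: that your extraction tool can report each text block as an `(x, y, text)` tuple, that `y` is measured from the top of the page, and that cells in the same table row share roughly the same `y`. The function name `blocks_to_rows`, the `y_tolerance` parameter, and the sample data are all hypothetical. It is written in Python for brevity and does not read a PDF itself.

```python
def blocks_to_rows(blocks, y_tolerance=2.0):
    """Group (x, y, text) blocks into table rows.

    Blocks whose y differs from the first block of the current row by at
    most y_tolerance are treated as the same row; cells are ordered by x.
    """
    rows = []  # each entry: (y of the row's first block, [(x, text), ...])
    for x, y, text in sorted(blocks, key=lambda b: b[1]):
        if rows and abs(y - rows[-1][0]) <= y_tolerance:
            rows[-1][1].append((x, text))
        else:
            rows.append((y, [(x, text)]))
    # Order the cells of each row left to right and keep only the text.
    return [
        [text for _, text in sorted(cells, key=lambda c: c[0])]
        for _, cells in rows
    ]


# Hypothetical blocks, in the arbitrary order an extractor might return them.
blocks = [
    (200.0, 100.5, "empName"), (50.0, 100.0, "empID"), (350.0, 100.2, "Title"),
    (50.0, 120.0, "1"), (200.0, 120.3, "Alice"), (350.0, 119.8, "Engineer"),
]

for row in blocks_to_rows(blocks):
    print(row)
```

This breaks down quickly on real documents (wrapped cell text, empty cells, several tables side by side), which is why it should be treated as a heuristic rather than a general solution.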
1 Comment
hatem
Unfortunately, it is an unstructured PDF and I can't convert it to XML.