0 votes
1 answer
845 views

There is a new open-source Python library from Microsoft, markitdown (https://github.com/microsoft/markitdown). It basically works fine on my Docx documents (if anyone uses it, make sure you use it on ...
Bogdan_Ch's user avatar
  • 3,356
3 votes
1 answer
650 views

I'm doing an Android Kotlin project which will auto-generate certificates when we enter some details in the EditTexts. I have Word files (.docx) in my assets folder which have some variables which will ...
Rushi Mayur's user avatar
1 vote
0 answers
96 views

It's like the title states. If I omit the table width, it ends up with 0, and doesn't show up at all. The only way to make it show up at all is by doing: width: { size: 100, type: Docx.WidthType....
Bosko Sinobad's user avatar
1 vote
1 answer
11k views

I need to convert a PDF text document to Markdown while maintaining its structure (i.e. indexed numbered headers and subheaders should have their corresponding number of hashtags # in Markdown to keep ...
Guido's user avatar
  • 503
0 votes
1 answer
43 views

Respected Sir/Madam, I have a question regarding LZW BW 1200dpi TIFF file creation using “UDC driver 6.7/6.8 version”. If we disable “Perform High-Quality Smoothing”, then the output data are not visible in ...
Shant's user avatar
  • 11
0 votes
1 answer
1k views

I am trying to convert a PDF to HTML using Pandoc. I have installed the pandoc binary, added the environment variable path, and am then using import pypandoc import os os.environ.setdefault('PYPANDOC_PANDOC',...
SUBHRA SANKHA's user avatar
0 votes
0 answers
814 views

I have a requirement where I need to use openoffice in a standalone server and use a Java program for Document conversion. Right now, I have a setup where I have started openoffice in my linux ...
BlackViper's user avatar
0 votes
1 answer
66 views

I am creating multiple virtual documents and then I want to merge them into one PDF, without saving them somewhere. All I found for now are guides, in which they save the document as a PDF somewhere ...
GuterProgrammierer's user avatar
-1 votes
1 answer
652 views

I want to display a preview of files uploaded by users. For this reason, I have to convert docx files to PDF using Python 3.7. When looking for a library to do the job I found the following: ...
Jekson's user avatar
  • 3,322
1 vote
1 answer
418 views

I'm struggling to find a solution. I have a batch of Adobe InDesign files I'm trying to convert to PDFs. I know you can export InDesign -> PDF, then from Acrobat PDF -> PPTX. This would work ...
Joshua Jones's user avatar
9 votes
1 answer
15k views

I recently installed pandoc 2.4 on Windows, and the conversion fails with error 1 for all knitting. I can't knit HTML, Word, or PDF. The error says output file: template.knitmd pandoc.exe: ...
Moses Kim's user avatar
0 votes
1 answer
3k views

I am creating an application to convert HTML pages to ePub format. I tried converting the file to PDF, since I require a table of contents as the first page of the ePub file. I have used Spire PDF and ...
Shubashree Ravi's user avatar
0 votes
1 answer
801 views

I am trying to open and read a .docx file using Ruby, and extract portions of the text and objects/images and save into another (non .docx) file. Using Nokogiri, I am able to properly extract text ...
Noel Euzebe's user avatar
0 votes
1 answer
718 views

We are implementing a Question & Answering System using Watson Discovery Service (WDS). We require each answer unit to be available in a single document. We have complex PDF files as the corpus. The PDF files ...
Prashanth M's user avatar
0 votes
1 answer
720 views

We are changing systems and the new system only outputs .DOC or .TXT files for reports. Several of the reports that come out need to be converted to PDF so they are available for our web users on a ...
David M's user avatar
  • 43
0 votes
1 answer
119 views

I am planning to integrate the IBM Watson Document Conversion service with Salesforce. From there I am unable to send my pdf file directly to Watson and I'm getting Media Type not supported. I am ...
Umang's user avatar
  • 1
0 votes
1 answer
194 views

I have an html form that allows users to upload a file, which then uses IBM Watson's document conversion API to convert the text of the document into normalized text which is then inserted into a ...
Daniel La's user avatar
0 votes
1 answer
155 views

I recently implemented the Document Conversion API from IBM Watson. I always get an encoding error when converting a PDF document. #!/usr/bin/env python #coding: utf-8 import json from ...
Ikmel's user avatar
  • 11
0 votes
1 answer
82 views

Trying to use the Document Conversion service to capture the json key/value pairs for the pdf documents such as (w2/1040/etc forms.) Content of such forms in json response are coming as part of the "...
user7981462's user avatar
0 votes
1 answer
177 views

We use IBM's Document Conversion service as a core part of our Watson-based AI system. Recently I have been getting a lot of this error whilst building our corpus: Error SLM-THROTTLE occurred when ...
David Powell's user avatar
0 votes
0 answers
613 views

I am trying to convert a document using IBM's Document Conversion service. It is a basic PDF, 116 pages,1.1MB file. Nothing special about it that I can see, but the DC service returns the error "...
David Powell's user avatar
0 votes
1 answer
105 views

I am trying to convert documents using the Bluemix Document Conversion service with a Node.js application. I am getting nothing but errors in my app, but the test document I'm using converts fine ...
David Powell's user avatar
0 votes
1 answer
61 views

I am trying to use the DocumentConversionV1 function of the watson_developer_cloud API in Python. However, the response in my case comes only as "<"Response 200">". import sys import os as o import json ...
Sanjay Josh's user avatar
0 votes
2 answers
64 views

We recently implemented the Document Conversion API from IBM Watson. Can I use web files (www.something.com) as input with it? curl -X POST -u "username":"password" -F config="{\"conversion_target\":\"...
user94's user avatar
  • 419
0 votes
1 answer
179 views

I am still very new to Retrieve and Rank, and Document Conversion services, so I have been playing around with that lately. I encountered a problem where when I upload a large document (100+ pages) - ...
Ngoodles's user avatar
0 votes
0 answers
207 views

We recently implemented the Document Conversion API from IBM Watson. We always get the error, even though we specify the document type: 415 Unsupported Media Type - The media type of the input file ...
OSX55's user avatar
  • 170
0 votes
1 answer
120 views

I am trying to convert some documents into answer units with Watson's Document Conversion service, using the watson-developer-cloud Javascript library in Node.js. Certain ones (an example is at IBM ...
David Powell's user avatar
0 votes
1 answer
96 views

When I try to convert this document https://public.dhe.ibm.com/common/ssi/ecm/po/en/poq12347usen/POQ12347USEN.PDF with Watson's Document Conversion service, all I get is four answer units, one for ...
David Powell's user avatar
1 vote
1 answer
131 views

When I do this command: C:\curl -X POST -u "User":"Pass" -F config="{\"conversion_target\":\"answer_units\"}" -F file="D:\PATH\QeA.pdf;type=application/pdf" "https://gateway.watsonplatform.net/...
Marco Oliveira's user avatar
1 vote
2 answers
133 views

I am writing a program that takes advantage of IBM Watson's Document Conversion service to convert documents of various types into answer units. Each answer unit that is returned by the service ...
David Powell's user avatar
0 votes
1 answer
83 views

We are trying to use the IBM Watson Document Conversion service on Word documents and have noticed that text that is in the header (and is displayed when the doc file is viewed) is not returned by the ...
Christopher Hyland's user avatar
0 votes
1 answer
100 views

We are trying to convert a .docx – and later other potential file formats – into a kind of standard XML. This XML is going to be mapped through an XSLT to the XML of our choice (xsd). For the ...
sbadea's user avatar
  • 1
0 votes
2 answers
159 views

Trying to use Watson Document Conversion service from Node-Red with following payload setup and to feed into 'Convert' node, it always returns "Error: Lost connect to server". I'd think the setup is ...
nyker's user avatar
  • 57
0 votes
1 answer
207 views

I'm trying to convert a PDF document but I am having problems with the accents in words. The PDF is in Brazilian Portuguese. This is the command I'm running: curl -X POST -u "OMITTED":"...
Fred Miranda's user avatar
0 votes
1 answer
75 views

When trying the Watson Document Conversion service on the following redbook: http://www.redbooks.ibm.com/redbooks/pdfs/ga195486.pdf, I get a timeout error. I verified the size is less than 50 MB. Any ...
joe4k's user avatar
  • 21
2 votes
1 answer
14k views

Following the Document Conversion API example, I am trying to use Flask to convert an MS Word document to text, but it does not work. Here is the code: import os, json, requests from flask import Flask, ...
user6332732's user avatar
-1 votes
2 answers
408 views

I am trying to convert this document (http://www.redbooks.ibm.com/redbooks/pdfs/ga195486.pdf) to answer units in Watson's Document Conversion service using the watson-developer-cloud node.js library. ...
David Powell's user avatar
0 votes
1 answer
102 views

I am trying to convert this document: http://www.redbooks.ibm.com/redpapers/pdfs/redp5213.pdf to JSON answer units, but it (and many similar others) just won't process through the service. If I try ...
David Powell's user avatar
1 vote
1 answer
130 views

So I want to make classes for using Concept Insights on HTML documents converted from PDF thanks to Document Conversion. I am using an Eclipse IDE with a view of my Git directory. When I run it, I get ...
Tara E's user avatar
  • 13
-1 votes
1 answer
211 views

My goal is a single file of documents in JSON format, that would come from 50-100 MS Word or PDF documents. Is there a way to supply multiple documents to the "convert_document" command? I've tried ...
ralphearle's user avatar
  • 1,684
1 vote
4 answers
1k views

After spending hours and hours on StackOverflow and programmers' forums, I've decided to use SyncFusion on our project. Our main target is: convert to PDF/directly print existing Doc and Docx this ...
sstassin's user avatar
  • 388
3 votes
1 answer
340 views

How can I convert more than one document using the Document Conversion service? I have between 50-100 MS Word and PDF documents that I want to convert using the convert_document API method. For ...
German Attanasio's user avatar
1 vote
1 answer
113 views

I am trying to add a custom footer to PDFs created from docx files on my Liferay 6.2 installation. Specifically, I have linked up OpenOffice, and I am successfully converting the documents from docx ...
Joe Andersen's user avatar
1 vote
0 answers
525 views

I am able to convert most Word documents (doc & docx) to PDF on Windows. "soffice.exe" --headless --convert-to pdf --outdir "C:\Ok" "C:\Ok\Test_Original.doc" But a few documents are not ...
pingu's user avatar
  • 695
0 votes
1 answer
734 views

I am using unoconv (https://github.com/dagwieers/unoconv) to convert DOCX and DOC files to PDF, but I often get strange results for certain characters when they are rendered in the PDF. One ...
rkp333's user avatar
  • 391
1 vote
2 answers
1k views

I’m using Java 6. I have an XML template, which begins like so <?xml version="1.0" encoding="UTF-8"?> However, I notice when I parse and output it with the following code (using Apache Commons-...
Dave A's user avatar
  • 2,850
0 votes
1 answer
293 views

I'm triggering a Perl script from a Postfix email server every time an email is received for a specified domain. The Perl script basically extracts all attachments and then calls unoconv to ...
markus's user avatar
  • 6,638
-2 votes
1 answer
47 views

I'm looking for either a code snippet or other solution capable of converting a high volume (thousands) of .pdf's into .html or .doc while at the same time: maintaining hierarchical structure of ...
Cognitivity's user avatar
1 vote
4 answers
6k views

First, I tried to use Cloudconvert. It can convert between so many file types, but its PHP API causes memory leaks almost all the time. The second I tried was Pdfcrowd. It works perfectly, but it can ...
aleskva's user avatar
  • 1,845
1 vote
0 answers
1k views

I am using the following code snippet to convert a PDF file into an MS Word document. import java.io.FileOutputStream; import org.apache.poi.xwpf.usermodel.BreakType; import ...
Bhagyesh Jain's user avatar