rekrevs/python-docx

 
 

python-docx (xtend fork)

This is an experimental fork of python-docx with extensions to handle more of the full OOXML (.docx) format.

The upstream python-docx library provides excellent support for basic document operations. This fork adds read support, and in several cases write support, for advanced features commonly found in real-world Word documents.

Extensions

This fork adds the following capabilities:

Feature | Description
Content Controls (SDT) | Structured document tags: text, date, dropdown, comboBox types
Fields | Simple and complex fields: PAGE, DATE, TOC, REF, HYPERLINK, etc.
Footnotes & Endnotes | Full paragraph and table support in notes
Bookmarks | Named locations with rename/delete support
Track Changes | Insertions/deletions with accept/reject support
Floating Images | Anchored shapes with full modification support
Text Boxes | Content in mc:AlternateContent elements
Themes | Theme colors and fonts (read and modify)
Comments | Comment threads with author metadata
SVG Images | Recognition and parsing of SVG files
Math Equations | OMML equations with creation and iteration
Charts | Embedded chart detection and access
SmartArt | SmartArt diagram detection and access
Custom XML | Custom XML parts with read/write support
Modern Namespaces | Word 2013+ namespaces (w14, w15, w16, etc.)
Conformance Detection | Detect Strict vs Transitional, Word version

Installation

pip install git+https://github.com/sverker/python-docx.git@xtend

Example

>>> from docx import Document

>>> document = Document()
>>> document.add_paragraph("It was a dark and stormy night.")
<docx.text.paragraph.Paragraph object at 0x10f19e760>
>>> document.save("dark-and-stormy.docx")

>>> document = Document("dark-and-stormy.docx")
>>> document.paragraphs[0].text
'It was a dark and stormy night.'

Extension Examples

>>> from docx import Document
>>> from docx.shared import Inches
>>> doc = Document("complex-document.docx")

# Access content controls
>>> for cc in doc.content_controls:
...     print(f"{cc.tag}: {cc.text}")

# Access fields
>>> for field in doc.fields:
...     print(f"{field.field_type}: {field.field_code}")

# Access footnotes
>>> for fn in doc.footnotes:
...     print(fn.paragraphs[0].text)

# Access bookmarks
>>> for bm in doc.bookmarks:
...     print(f"{bm.name}: {bm.id}")

# Access track changes
>>> for rev in doc.revisions.insertions:
...     print(f"{rev.author}: {rev.text}")

# Access theme
>>> print(doc.theme.colors.accent1)  # RGBColor
>>> print(doc.theme.fonts.minor_latin)  # Font name

# Access math equations
>>> for eq in doc.equations:
...     print(eq.latex)

# Access charts
>>> for chart in doc.charts:
...     print(chart.name)

# Access SmartArt
>>> for smartart in doc.smartart_objects:
...     print(smartart.name)

# Create a text box
>>> text_box = doc.add_text_box(Inches(2), Inches(1))
>>> text_box.paragraphs[0].text = "Hello!"

# Modify floating shapes
>>> shape = doc.floating_shapes[0]
>>> shape.width = Inches(3)
>>> shape.pos_x = Inches(1)
>>> shape.delete()  # Remove from document

# Modify bookmarks
>>> bookmark = doc.bookmarks.get("MyBookmark")
>>> bookmark.name = "NewName"  # Rename
>>> bookmark.delete()  # Remove from document

# Modify fields (use one of the two operations on a given field)
>>> field = doc.fields[0]
>>> field.convert_to_text()  # Convert to static text
>>> # field.delete()  # Or remove the field instead
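
For orientation, the bookmark and track-change examples above correspond to `w:bookmarkStart` and `w:ins` elements in the underlying WordprocessingML. The sketch below reads those elements with the standard library only. It does not use or describe the fork's implementation, and the sample XML is hand-written for illustration rather than taken from a real file.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hand-written sample of a document body (illustrative, not from a real file).
SAMPLE = f"""
<w:document xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:bookmarkStart w:id="0" w:name="MyBookmark"/>
      <w:r><w:t>Hello</w:t></w:r>
      <w:bookmarkEnd w:id="0"/>
      <w:ins w:id="1" w:author="Alice" w:date="2024-01-01T00:00:00Z">
        <w:r><w:t> world</w:t></w:r>
      </w:ins>
    </w:p>
  </w:body>
</w:document>
""".strip()

root = ET.fromstring(SAMPLE)

# Bookmarks: name and id are attributes of w:bookmarkStart.
bookmarks = [
    (el.get(f"{{{W}}}name"), el.get(f"{{{W}}}id"))
    for el in root.iterfind(".//w:bookmarkStart", NS)
]

# Insertions: the author is an attribute; the text lives in nested w:t elements.
insertions = [
    (el.get(f"{{{W}}}author"), "".join(t.text or "" for t in el.iterfind(".//w:t", NS)))
    for el in root.iterfind(".//w:ins", NS)
]

print(bookmarks)   # [('MyBookmark', '0')]
print(insertions)  # [('Alice', ' world')]
```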

Documentation

For core python-docx functionality, see the python-docx documentation.

For a comprehensive API reference including all extensions, see WOTAN/docs/python-docx-api.md. This includes:

  • Complete API reference for all features
  • Pragmatics section with real-world document patterns
  • Tips for handling complex documents (nested content controls, machine-generated files, etc.)

Status

This is an experimental fork. All original python-docx tests pass. Extensions are additive and should not break existing functionality.
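
Because the extensions are additive, code that may also run against upstream python-docx can probe for an extension attribute before using it. A minimal sketch, assuming the extension collections are exposed as attributes on the document object as in the examples above. `iter_extension` is a hypothetical helper, and the two stand-in classes exist only so the snippet runs without either library installed.

```python
def iter_extension(document, attribute):
    """Yield items from an extension collection, or nothing if it is absent."""
    # getattr with a default avoids AttributeError when the attribute is missing.
    yield from getattr(document, attribute, ())


class UpstreamLikeDocument:
    """Stand-in for a document object without extension attributes."""


class ForkLikeDocument:
    """Stand-in for a document object exposing a `bookmarks` collection."""

    bookmarks = ["Intro", "Summary"]


print(list(iter_extension(UpstreamLikeDocument(), "bookmarks")))  # []
print(list(iter_extension(ForkLikeDocument(), "bookmarks")))  # ['Intro', 'Summary']
```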
