
airmang/python-hwpx


python-hwpx

Read, edit, generate, and validate HWPX documents in Python, without Hancom's Hangul word processor.



🧩 HWPX Stack (3 components)

| Layer | Repo | Role |
|---|---|---|
| 📦 Library | python-hwpx | Pure-Python core for parsing, editing, and generating HWPX |
| 🔌 MCP server | hwpx-mcp-server | Work with HWPX from MCP clients (Claude Desktop, VS Code, etc.) |
| 🎯 Agent skill | hwpx-skill | Official onboarding skill that lets agents use HWPX right away |

Why python-hwpx

  • No Hancom Office installation required: pure Python that runs anywhere
  • XML-first workflow: schema validation and unpack/pack are included
  • Agent- and automation-friendly: the MCP server and Skill plug directly into the same stack

Comparison with alternative libraries

| Item | python-hwpx | pyhwp(x)-style libraries | Manual OLE + binary handling |
|---|---|---|---|
| HWPX Open XML support | ✅ | ⚠️ Partial | ❌ |
| No Hancom Office installation required | ✅ | ✅ | ✅ |
| Editing/generation API | ✅ | ❌ Mostly read-only | ❌ |
| Schema validation | ✅ | ❌ | ❌ |
| AI agent integration (MCP) | ✅ (hwpx-mcp-server) | ❌ | ❌ |

⚡ See the value in 30 seconds

1. Open and modify an existing document

from hwpx import HwpxDocument

document = HwpxDocument.open("report.hwpx")
document.add_paragraph("This paragraph was added by automation.")
document.save_to_path("report-edited.hwpx")

2. Fill a form-style table from code

from hwpx import HwpxDocument

doc = HwpxDocument.open("application.hwpx")
# Each key is "<label text> > <direction>"; the value goes into the neighbouring cell.
result = doc.fill_by_path({
    "Name > right": "Hong Gildong",
    "Affiliation > right": "Platform Team",
})
doc.save_to_path("application-completed.hwpx")

print(result["applied_count"], result["failed_count"])
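
The returned dictionary can be used to gate a save or a pipeline step. The sketch below assumes only the two keys printed above (applied_count and failed_count). The helper name check_fill_result is hypothetical and not part of python-hwpx, and a plain dict stands in for a real result.

```python
def check_fill_result(result: dict) -> str:
    """Summarise a fill result; raise if any path failed (hypothetical helper)."""
    applied = result["applied_count"]
    failed = result["failed_count"]
    if failed:
        raise ValueError(f"{failed} path(s) could not be filled ({applied} applied)")
    return f"{applied} field(s) filled"


# A stand-in dict with the same two keys as the result printed above.
summary = check_fill_result({"applied_count": 2, "failed_count": 0})
print(summary)
```

Raising on any failed path is one possible policy; logging the failures and saving anyway may suit bulk jobs better.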

3. Extract text and validate structure

from hwpx import HwpxDocument

text = HwpxDocument.open("report.hwpx").export_markdown()
print(text[:500])

hwpx-validate-package report.hwpx
hwpx-analyze-template report.hwpx

4. Rich Markdown conversion (preserves formatting, tables, footnotes, and images)

export_markdown() is a plain-text extraction, while export_rich_markdown() preserves inline formatting (**bold**, *italic*, ~~strikethrough~~), tables (including nested tables, colspan/rowspan-safe), shape text, images, footnotes/endnotes, hyperlinks, and automatically detected headings (#/##).

from hwpx import HwpxDocument

doc = HwpxDocument.open("report.hwpx")

md = doc.export_rich_markdown(
    image_dir="out/images",          # extract BinData images to disk
    image_ref_prefix="images/",      # path prefix for ![](images/...) references in the Markdown
    detect_headings=True,            # auto-detect #/## headings from Ⅰ./1. numbering patterns
)
print(md)
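
To make the detect_headings idea concrete, here is a minimal standalone sketch of pattern-based heading detection. It is not the library's implementation: the function name to_heading and the exact mapping (Roman-numeral items to "#", Arabic-numbered items to "##") are assumptions made for illustration.

```python
import re

# Assumed mapping: "Ⅰ."-style items become "#", "1."-style items become "##".
ROMAN = re.compile(r"^[\u2160-\u216B]\.\s*(.+)$")   # U+2160..U+216B: Roman numerals
ARABIC = re.compile(r"^\d+\.\s*(.+)$")


def to_heading(line: str) -> str:
    """Return a Markdown heading for numbered lines, else the line unchanged."""
    match = ROMAN.match(line)
    if match:
        return "# " + match.group(1)
    match = ARABIC.match(line)
    if match:
        return "## " + match.group(1)
    return line


for sample in ["\u2160. Overview", "1. Scope", "Plain body text"]:
    print(to_heading(sample))
```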

๋ฌธ์ž์—ดยท๊ฒฝ๋กœยท๋ฐ”์ดํŠธ๋„ ๊ทธ๋Œ€๋กœ ๋ฐ›๋Š”๋‹ค:

from hwpx.tools.markdown_export import export_markdown

md = export_markdown("report.hwpx")          # path
md = export_markdown(open("a.hwpx", "rb").read())  # bytes

5. Add mixed formatting and hyperlinks to a footnote body

HwpxOxmlNote provides the body_paragraph, add_run, and add_hyperlink helpers, so inline formatting and links can be added to a footnote without handling its body paragraph directly.

# Assumes `section` is a section of an already opened HwpxDocument.
para = section.paragraphs[0]
note = para.add_footnote("")  # create an empty footnote, then build its body
note.add_run("For details, see the ")
note.add_run("official government site", bold=True)
note.add_run(": ")
note.add_hyperlink("https://www.kasa.go.kr", "Korea AeroSpace Administration")

์ฒ˜์Œ์—๋Š” open/new -> edit/extract -> save_to_path ํ๋ฆ„๋งŒ ์žก์œผ๋ฉด ๋œ๋‹ค. ํŒจํ‚ค์ง€ ๊ตฌ์กฐ, XML ํŒŒํŠธ, ํ…œํ”Œ๋ฆฟ ํšŒ๊ท€ ์ ๊ฒ€์€ ํ•„์š”ํ•  ๋•Œ๋งŒ ํ™•์žฅํ•˜๋ฉด ๋œ๋‹ค.

์–ด๋””๋ถ€ํ„ฐ ์ฝ์œผ๋ฉด ๋˜๋‚˜

ํ•„์š”ํ•œ ์ž‘์—…๋ถ€ํ„ฐ ๋ฐ”๋กœ ๋“ค์–ด๊ฐ€๋ฉด ๋œ๋‹ค.

  • ์ฒซ ํŒŒ์ผ์„ ์—ด๊ณ  ์ €์žฅํ•˜๋Š” ์ตœ์†Œ ๊ฒฝ๋กœ โ†’ docs/quickstart.md
  • ๋ฌธ๋‹จ, ํ‘œ, ๋ฉ”๋ชจ, ์„น์…˜ ํŽธ์ง‘ ํŒจํ„ด โ†’ docs/usage.md
  • ํ…์ŠคํŠธ ์ถ”์ถœ, ๊ตฌ์กฐ ์กฐํšŒ, ๊ฒ€์ฆ/ํŒจํ‚ค์ง€ ์ž‘์—… โ†’ docs/usage.md
  • ์‹คํ–‰ ๊ฐ€๋Šฅํ•œ ์˜ˆ์ œ ๋ชจ์Œ โ†’ docs/examples.md
  • ํŒจํ‚ค์ง€ ๊ตฌ์กฐ์™€ ์Šคํ‚ค๋งˆ ์‹ฌํ™” โ†’ docs/schema-overview.md
  • ์„ค์น˜ ๊ฒ€์ฆ๊ณผ ๊ฐœ๋ฐœ ํ™˜๊ฒฝ ํ™•์ธ โ†’ docs/installation.md

examples highlights

build_release_checklist.py
Generates a release-checklist HWPX that includes memos and style edits.
extract_text.py
Quickly extracts body text and nested-object text from the CLI.
find_objects.py
Locates OWPML XML nodes by tag and attribute.

Quick Start

To create a new document right away, start like this.

from hwpx import HwpxDocument

document = HwpxDocument.new()
document.add_paragraph("A new document created with python-hwpx")
document.save_to_path("new-document.hwpx")

💡 Context managers are also supported:

with HwpxDocument.open("report.hwpx") as doc:
    doc.add_paragraph("Resources are cleaned up automatically.")
    doc.save_to_path("output.hwpx")

Tables, memos, text extraction, validation, and package/XML details are covered next in docs/quickstart.md and docs/usage.md.

How is it different from pyhwpx / pyhwp?

| | python-hwpx | pyhwpx | pyhwp |
|---|---|---|---|
| Target format | .hwpx (OWPML/OPC) | .hwpx | .hwp (v5 binary) |
| Hangul installation | Not required | Required (Windows COM) | Not required |
| Cross-platform | ✅ Linux / macOS / Windows / CI | ❌ Windows only | ✅ |
| Approach | Direct XML parsing | COM automation | OLE parsing |

HWPX plugin usage

The per-host bundles in the hwpx-plugins repository consume python-hwpx through hwpx-mcp-server and the local quickcheck scripts. During local development, set PYTHON_HWPX_REPO=/absolute/path/to/python-hwpx so the plugin launcher uses this checkout as an editable dependency.

🌐 Cross-platform support

Because an HWPX file is a ZIP archive of XML parts, you can build read-and-edit workflows with Python alone, without the Hangul program.
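
The ZIP + XML point can be illustrated with the standard library alone. The sketch below builds a toy archive in memory and reads it back. The part name Contents/section0.xml and the tiny XML payload are illustrative assumptions, not a valid HWPX package, and real documents should go through HwpxDocument.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Build a toy archive in memory; the part name and XML are illustrative only.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("Contents/section0.xml", "<sec><p>Hello</p><p>HWPX</p></sec>")

# Reading it back needs nothing beyond the standard library.
with zipfile.ZipFile(buffer) as archive:
    part_names = archive.namelist()
    root = ET.fromstring(archive.read("Contents/section0.xml"))

paragraphs = [p.text for p in root.iter("p")]
print(part_names, paragraphs)
```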

| Platform | Read | Write | Notes |
|---|---|---|---|
| ✅ Windows | ✅ | ✅ | Hancom Office |
| ✅ macOS | ✅ | ✅ | Hancom Office Mac |
| ✅ Linux | ✅ | ✅ | Hancom Office Linux |
| ✅ CI/CD | ✅ | ✅ | Docker, GitHub Actions, etc. |

Key features at a glance

| Category | Feature | Description |
|---|---|---|
| 📄 Document I/O | Open/save/create | File, bytes, and stream I/O · atomic save · ZIP integrity verification |
| 📝 Paragraphs | Add/delete/edit/format | Set text, delete paragraphs (remove_paragraph), style references |
| ✏️ Run | Text fragments | Add, replace, bold/italic/underline/color formatting |
| 📊 Tables | Create/edit/merge | N×M table creation, cell text, cell merge/split, nested tables |
| 🧭 Table automation | Lookup/fill | Table map, label-based cell lookup, path-based batch fill |
| 📑 Sections | Add/delete | add_section(after=), remove_section(), automatic manifest management |
| 🖼️ Images | Embed/delete | Binary data management, automatic manifest registration |
| ✏️ Shapes | Line/rectangle/ellipse | Shape insertion that follows the OWPML specification |
| 📑 Headers/footers | Set/remove | Odd, even, or both pages |
| 💬 Memos | Add/delete | Anchor-based memos, memo shape references |
| 📌 Footnotes/endnotes | Add | Text access |
| 🔗 Bookmarks/hyperlinks | Insert/query | URL links, internal bookmarks |
| 📰 Multi-column editing | Column definitions | Multi-column layout control |
| 🔍 Text extraction | Pipeline | Section/paragraph traversal, annotation rendering, nested-object control |
| 🔎 Object search | Tag/attribute/XPath | Find specific elements, annotation iterator |
| 🎨 Style-based replacement | Format-based filter | Find and replace Runs by color, underline, or charPrIDRef |
| 📤 Export | Text/HTML/Markdown | Document conversion output |
| ✅ Validation | XSD + package structure | CLI (hwpx-validate, hwpx-validate-package) and API |
| 🧰 Work tools | unpack/pack/analyze/compare | Pack-ready working directory extraction and rebuild checks |
| 🏗️ Low-level XML | Dataclass mapping | Direct manipulation of OWPML schema ↔ Python objects |
| 🔄 Namespace compatibility | Automatic normalization | Automatic HWPML 2016 → 2011 conversion |
| 🏗️ Builder | Composable generation | hwpx.builder: assemble Section/Heading/Table/Image/Header, hard-gate save report |
| ✅ Editor open safety | validate_editor_open_safety | Gate for save/pack/repair/builder output, returns openSafety evidence |
| 🧪 Fuzzing convergence loop | hwpx.tools.fuzz | Seed-deterministic scenario generation · triple-oracle runner · failures captured as regression fixtures |
| 🖥️ Layout preview | hwpx.tools.layout_preview | Approximate HTML/PNG of page boxes, tables, and margins (for agent self-verification) |
| 🧷 Byte-preserving patch | hwpx.patch | Byte-level splice of section XML: bytes of unmodified regions are preserved |
| 📝 Formatting edits on existing documents | Paragraph · page | Alignment, line spacing, indentation, paragraph spacing; paper, margins, orientation; headers/page numbers; bullets/numbering |
| 🖊️ Click-here fields (누름틀) | Form fields | Query click-here fields, fill them while preserving formatting |
| 🏛️ Official-document tools | official_lint · approval box | Lint for item-marker hierarchy, the "끝." (end) mark, attachments, and date notation; approval-box presets |
| 📷 Advanced generators | advanced_generators | Photo sheets (image_grid), meeting nameplates, table-based organization charts |
| 🆚 Old/new comparison | doc_diff | Paragraph LCS diff, old-vs-new comparison table generation, reference-consistency lint |
| 📨 Mail merge · table calculation | mail_merge | Bulk generation of N copies from template + data, table sums and averages |
| 🪄 Style transplant | style_profile | Extract and apply a reference document's profile, template registry |
| 🛡️ Input hardening | opc.security | Guards against XML entity bombs and ZIP compression bombs |
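
As an illustration of what a ZIP compression bomb guard can look like, the sketch below inspects the sizes declared in the archive directory before extracting anything. It is a generic technique, not the code in opc.security; the function name and both thresholds are assumptions chosen for the example.

```python
import io
import zipfile

MAX_TOTAL_BYTES = 50 * 1024 * 1024   # assumed cap on total uncompressed size
MAX_RATIO = 100                      # assumed cap on per-member compression ratio


def looks_like_zip_bomb(data: bytes) -> bool:
    """Check declared member sizes in the ZIP directory without extracting."""
    total = 0
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            total += info.file_size
            ratio = info.file_size / max(info.compress_size, 1)
            if ratio > MAX_RATIO or total > MAX_TOTAL_BYTES:
                return True
    return False
```

Declared sizes can be forged, so a stricter guard would also count bytes while streaming each member and stop once a limit is exceeded.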

Feature details

📄 Document editing

Paragraphs, tables, memos, and headers/footers are handled as Python objects.

# Add and delete paragraphs
doc.add_paragraph("New paragraph")
doc.remove_paragraph(doc.paragraphs[-1])   # delete the last paragraph

# Add and delete sections
new_sec = doc.add_section()          # append a section at the end of the document
new_sec.add_paragraph("Second section content")
doc.remove_section(1)                # delete a section by index

# Headers and footers
doc.set_header_text("Confidential", page_type="BOTH")
doc.set_footer_text("1 / 10", page_type="BOTH")

# Merge and split table cells (`table` is an existing table object)
table.merge_cells(0, 0, 1, 1)   # merge (0,0) through (1,1)
table.set_cell_text(0, 0, "Merged cell", logical=True, split_merged=True)
table.set_cell_text(0, 0, "line 1\nline 2", split_paragraphs=True)

# Auto-fill a form-style table
form = doc.add_table(2, 2)
form.cell(0, 0).text = "Name:"
form.cell(1, 0).text = "Affiliation"

doc.find_cell_by_label("Name")    # {"matches": [...], "count": 1}
doc.fill_by_path({
    "Name > right": "Hong Gildong",
    "Affiliation > right": "Platform Team",
})

Indices in doc.paragraphs are 0-based and count only paragraphs that sit directly in the body. Paragraphs inside tables are not mixed into the body paragraph_index; they are addressed through the cell location (table_index, row, col, cell_paragraph_index) reported by get_table_map(). get_table_map() returns caption_text and preceding_paragraph_text as separate values, and keeps multiple paragraphs in a cell preview separated by \n.
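
The two addressing schemes can be pictured with a toy model. The sketch below is not the library's data structure: CellLocation, table_cells, cell_paragraph, and cell_preview are hypothetical names, and only the four-field location tuple and the \n-joined preview come from the description above.

```python
from typing import NamedTuple


class CellLocation(NamedTuple):
    table_index: int
    row: int
    col: int
    cell_paragraph_index: int


# Toy stand-in: body paragraphs and table-cell paragraphs live in separate index spaces.
body_paragraphs = ["Intro", "Closing"]
table_cells = {(0, 0, 0): ["Name"], (0, 0, 1): ["line 1", "line 2"]}


def cell_paragraph(loc: CellLocation) -> str:
    """Look up one paragraph inside a table cell by its four-part location."""
    cell = table_cells[(loc.table_index, loc.row, loc.col)]
    return cell[loc.cell_paragraph_index]


def cell_preview(table_index: int, row: int, col: int) -> str:
    """Join every paragraph of a cell with a newline, as a preview string."""
    return "\n".join(table_cells[(table_index, row, col)])


print(body_paragraphs[0])                        # body index 0, unaffected by tables
print(cell_paragraph(CellLocation(0, 0, 1, 1)))  # second paragraph of cell (0, 1)
```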

🔍 Text extraction & search

from hwpx import TextExtractor, ObjectFinder

# Extract text
with TextExtractor("document.hwpx") as extractor:
    for section in extractor.iter_sections():
        for para in extractor.iter_paragraphs(section):
            print(para.text())

# Find specific objects
for obj in ObjectFinder("document.hwpx").find_all(tag="tbl"):
    print(obj.tag, obj.path)

hp:tab and ctrl id="tab" are preserved as tab characters (\t). As a result, Paragraph.text, TextExtractor, and export_text()/export_html()/export_markdown() all keep the same tab semantics, so text can round-trip across them. If needed, preserve_breaks=False flattens line breaks and tabs into spaces.
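
The effect of flattening can be shown on a plain string. The helper below is a hypothetical stand-in written for this example, not the library's preserve_breaks implementation, and collapsing runs of tabs and line breaks into a single space is an assumption.

```python
import re


def flatten_breaks(text: str) -> str:
    """Collapse tabs and line breaks into single spaces (hypothetical helper)."""
    return re.sub(r"[\t\r\n]+", " ", text).strip()


original = "Item\t42\nNext line"
print(repr(original))                  # tab and newline kept as-is
print(repr(flatten_breaks(original)))  # whitespace-flattened variant
```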

🎨 Style-based text replacement

Runs are filtered by formatting (color, underline, charPrIDRef) and replaced selectively.

# Find and replace only red text
doc.replace_text_in_runs(
    "draft", "final",
    text_color="#FF0000",
)

# Find runs with a specific format
runs = doc.find_runs_by_style(underline_type="SINGLE")
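
The filtering idea behind style-based replacement can be sketched without the library. The Run dataclass and replace_in_runs function below are hypothetical stand-ins that model only a text and a color; they are not python-hwpx types.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Run:
    text: str
    text_color: Optional[str] = None


def replace_in_runs(runs: List[Run], old: str, new: str,
                    text_color: Optional[str] = None) -> int:
    """Replace `old` with `new` only in runs whose color matches; return the count."""
    changed = 0
    for run in runs:
        if text_color is not None and run.text_color != text_color:
            continue  # formatting filter: skip runs in any other color
        if old in run.text:
            run.text = run.text.replace(old, new)
            changed += 1
    return changed


runs = [Run("draft budget", "#FF0000"), Run("draft schedule", "#000000")]
changed = replace_in_runs(runs, "draft", "final", text_color="#FF0000")
print(changed, [run.text for run in runs])
```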

📤 Export

# Convert to text, HTML, or Markdown
text = doc.export_text()
html = doc.export_html()
md   = doc.export_markdown()

🏗️ Low-level XML control

Dataclasses mapped to the OWPML schema give direct access to the XML structure.

# Header reference lists
doc.border_fills    # border fills
doc.bullets         # bullets
doc.styles          # styles
doc.track_changes   # tracked changes

# Master page, history, and version parts
doc.master_pages
doc.histories
doc.version

Architecture

python-hwpx
├── hwpx.document        # High-level editing API (HwpxDocument)
├── hwpx.opc             # OPC container read/write (atomic save, ZIP integrity verification)
├── hwpx.oxml            # OWPML XML ↔ dataclass mapping
│   ├── document.py      #   Sections, paragraphs, tables, runs, memos, shapes, notes
│   ├── header.py        #   Header reference lists (styles, bullets, tracked changes, etc.)
│   ├── body.py          #   Typed body model
│   └── common.py        #   Generic XML ↔ dataclass
├── hwpx.tools
│   ├── archive_cli      #   unpack/pack CLI and repacking metadata
│   ├── text_extractor   #   Text extraction pipeline
│   ├── text_extract_cli #   Text extraction CLI
│   ├── object_finder    #   Object search utility
│   ├── exporter         #   Text/HTML/Markdown export
│   ├── validator        #   Schema validation (hwpx-validate CLI)
│   ├── package_validator#   ZIP/OPC/HWPX structure checks
│   ├── page_guard       #   Checks for signs of structural change
│   └── template_analyzer#   Reference document analysis/extraction
└── hwpx.templates       # Built-in blank document template
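
The "atomic save" mentioned for hwpx.opc usually refers to a write-then-rename pattern. The sketch below shows that generic pattern with the standard library. It is an assumption about the technique, not the code in hwpx.opc, and atomic_write is a hypothetical name.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write to a temp file in the same directory, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())   # make sure the bytes reach the disk
        os.replace(tmp_path, path)      # atomic when both paths share a filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)         # never leave a half-written temp file
        raise
```

Readers of the target path see either the old file or the complete new one, never a partially written archive.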

Documentation

📖 Full documentation: Sphinx-based API reference, usage guide, FAQ
🚀 Quick start: work with HWPX documents in 5 minutes
📚 Usage guide: 50+ practical usage patterns
🔧 API reference: detailed class and method specifications
📐 Schema overview: explanation of the OWPML schema structure
🧪 Stack integration materials: fixture, smoke, validation, and compatibility operations material

Supported formats

| Format | Extension | Read | Write |
|---|---|---|---|
| HWPX | .hwpx | ✅ | ✅ |
| HWP | .hwp | ❌ | ❌ |

Note: HWP (v5 binary) files are not supported. Convert them to HWPX in Hancom Office first.
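
A quick way to tell the two containers apart before opening a file is to look at the leading bytes: a ZIP package starts with a local file header signature, while an OLE compound file (the container parsed for binary .hwp, per the comparison table) starts with a fixed 8-byte signature. The helper name sniff_container is hypothetical and this check is not part of python-hwpx.

```python
ZIP_MAGIC = b"PK\x03\x04"                         # ZIP local file header
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"   # OLE compound file header


def sniff_container(data: bytes) -> str:
    """Classify a file by its leading bytes (hypothetical helper)."""
    if data.startswith(ZIP_MAGIC):
        return "zip"   # what an .hwpx package is expected to start with
    if data.startswith(OLE_MAGIC):
        return "ole"   # container used by binary .hwp v5 files
    return "unknown"


print(sniff_container(b"PK\x03\x04rest-of-archive"))
```

A "zip" result only says the container is a ZIP archive; hwpx-validate-package remains the way to confirm it is a well-formed HWPX package.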

Requirements

  • Python 3.10+
  • lxml ≥ 4.9

Known limitations

  • add_shape() / add_control() do not generate every child element that Hangul requires. When adding complex objects, open the result in Hangul to verify it.
  • Image insertion supports binary embedding, but complete automatic generation of the <hp:pic> element is not provided.
  • Encrypting or decrypting password-protected HWPX files is not supported.

Contributing

Bug reports, feature proposals, and PRs are all welcome. See CONTRIBUTING.md for development environment setup and testing instructions.

git clone https://github.com/airmang/python-hwpx.git
cd python-hwpx
pip install -e ".[dev]"
pytest

The list of contributors with merged changes is in CONTRIBUTORS.md.

License

Apache License 2.0. See LICENSE and NOTICE.


Maintainer

Primary maintainer/contact: 고규현, information and computer teacher at Gwanggyo High School (광교고등학교)

About

Pure Python HWPX automation: read, edit, generate, and validate documents without Hancom Office.
