How to Turn Markdown Into HTML for Publishing
Most content workflows start in Markdown. Most publishing systems ultimately serve HTML. Converting from one to the other is usually a build step: part of a static site generator, a CMS pipeline, or a document processing workflow. Knowing what the conversion changes, and where it can go wrong, prevents subtle publishing errors.
When the conversion happens
Markdown to HTML conversion is a standard step in several workflows:
- Static site generators (Hugo, Jekyll, Eleventy, Astro) convert Markdown files to HTML pages as part of the build
- CMSes with Markdown support (Ghost, Contentful, Sanity) convert on read when serving content via API or template
- Documentation tools (MkDocs, Docusaurus, GitBook) convert Markdown docs to a browsable HTML site
- Single-page conversions where you need to paste a Markdown document into a system that only accepts HTML
What Markdown elements become
## Heading → <h2>Heading</h2>
**bold** → <strong>bold</strong>
*italic* → <em>italic</em>
[text](url) → <a href="url">text</a>
`code` → <code>code</code>
```block``` → <pre><code>block</code></pre>
--- → <hr>
- item → <ul><li>item</li></ul>
GFM table syntax produces <table>, <thead>, <tbody>, <tr>, and <th>/<td> elements. Task list syntax (- [ ] and - [x]) produces <input type="checkbox"> elements inside the list items.
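To make the mappings above concrete, here is a minimal sketch that handles a few of them using only the Python standard library. The function names `render_inline` and `render_line` are hypothetical, and the regular expressions cover only the simplest cases; this is a toy for illustration, not a substitute for a real Markdown parser.

```python
import html
import re


def render_inline(text: str) -> str:
    """Convert inline code, bold, italic, and links to HTML (toy subset)."""
    # re.split with a capture group alternates plain text (even indexes)
    # and code spans (odd indexes), so code contents skip the other rules.
    parts = re.split(r"(`[^`]+`)", text)
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            out.append(f"<code>{html.escape(part[1:-1])}</code>")
            continue
        part = html.escape(part)  # escape raw text before adding our own tags
        part = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", part)
        part = re.sub(r"\*(.+?)\*", r"<em>\1</em>", part)
        part = re.sub(r"\[([^\]]+)\]\(([^)\s]+)\)", r'<a href="\2">\1</a>', part)
        out.append(part)
    return "".join(out)


def render_line(line: str) -> str:
    """Convert one line: ATX heading, horizontal rule, or paragraph."""
    heading = re.match(r"(#{1,6}) +(.+)", line)
    if heading:
        level = len(heading.group(1))
        return f"<h{level}>{render_inline(heading.group(2))}</h{level}>"
    if re.fullmatch(r"-{3,}", line.strip()):
        return "<hr>"
    return f"<p>{render_inline(line)}</p>"


print(render_line("## Heading"))            # <h2>Heading</h2>
print(render_inline("[text](url)"))         # <a href="url">text</a>
print(render_inline("`a*b*` and **bold**")) # <code>a*b*</code> and <strong>bold</strong>
print(render_line("---"))                   # <hr>
```

Note that this sketch escapes any raw HTML in the input rather than passing it through, and it does not handle lists, fenced blocks, tables, or nested emphasis.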
Sanitising user-generated content
Markdown can contain inline HTML. If a user writes <script>alert('xss')</script> in a Markdown comment field and you convert and render it directly, you have a cross-site scripting vulnerability. The script runs in every visitor's browser.
Sanitise the HTML output before rendering it whenever the Markdown came from user input. Libraries for this include DOMPurify (browser JavaScript), bleach (Python), and sanitize-html (Node.js). They remove or escape dangerous HTML while keeping safe formatting tags.
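The allowlist idea behind these libraries can be sketched with Python's standard `html.parser` module. Everything here (the `AllowlistSanitizer` class, the `sanitize` function, and the specific tag, attribute, and URL scheme lists) is an assumption chosen for illustration. It shows the technique only; for production use, prefer one of the maintained libraries, which handle many more edge cases.

```python
from html import escape
from html.parser import HTMLParser

# Illustrative allowlists; a real policy depends on your content.
ALLOWED_TAGS = {"p", "strong", "em", "code", "pre", "ul", "ol", "li",
                "a", "h2", "h3", "blockquote", "hr", "br"}
VOID_TAGS = {"hr", "br"}                # no closing tag is emitted
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = ("http://", "https://", "mailto:")  # relative URLs are dropped too
DROP_CONTENT = {"script", "style"}      # remove the tag and everything inside


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            allowed = name in ALLOWED_ATTRS.get(tag, ())
            if allowed and value and value.lower().startswith(SAFE_SCHEMES):
                kept.append(f' {name}="{escape(value)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(sanitize("<p>Hi <strong>there</strong><script>alert('xss')</script></p>"))
# <p>Hi <strong>there</strong></p>
print(sanitize('<a href="javascript:alert(1)" onclick="x()">go</a>'))
# <a>go</a>
```

The design choice worth noting is that the sketch lists what is permitted rather than what is forbidden: any tag, attribute, or URL scheme not explicitly allowed is dropped.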
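Some Markdown converters can be configured to pass raw HTML through to the output unchanged (for example allowDangerousHtml, python-markdown's html extension). Never enable this for user-generated content. It's appropriate only when you fully control the Markdown source.
Heading level considerations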
Markdown heading levels are relative — # means "top level heading" but in a published page that might mean H2 if the H1 is the page title. Many publishing systems shift heading levels by one to prevent Markdown content from generating H1 elements that conflict with the page's H1.
If your converted content has heading level mismatches — the visual hierarchy looks wrong in the rendered page — check whether the rendering pipeline applies a heading level offset and adjust your Markdown accordingly.
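When the pipeline does not apply an offset for you, the adjustment can be made on the Markdown source before conversion. The following is a minimal sketch, assuming ATX-style (#) headings that start at the beginning of a line; the function name `shift_headings` is hypothetical, and the fence detection is deliberately simplified (it toggles on any line that opens with three or more backticks or tildes).

```python
import re

FENCE = re.compile(r"^\s{0,3}(`{3,}|~{3,})")
HEADING = re.compile(r"^(#{1,6})(\s+\S.*)$")


def shift_headings(markdown_text: str, offset: int = 1) -> str:
    """Shift ATX heading levels by `offset`, clamped to the range 1 to 6."""
    in_fence = False
    shifted = []
    for line in markdown_text.split("\n"):
        if FENCE.match(line):
            # Lines inside fenced code blocks may start with '#'
            # (for example shell comments) and must be left alone.
            in_fence = not in_fence
        elif not in_fence:
            match = HEADING.match(line)
            if match:
                level = len(match.group(1)) + offset
                level = max(1, min(level, 6))
                line = "#" * level + match.group(2)
        shifted.append(line)
    return "\n".join(shifted)


print(shift_headings("# Title\n\nBody text\n\n## Section"))
# ## Title
#
# Body text
#
# ### Section
```

Setext headings (underlined with === or ---) and indented headings are not handled here; a parser-based transform would be more robust for mixed content.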