Hash: 96a92808
Date: 2016-11-26T14:00:40
| Git HTTP | https://git.kmx.io/kc3-lang/md4c.git |
|---|---|
| Git SSH | git@git.kmx.io:kc3-lang/md4c.git |
| Public access | public |
| Description | Fork of https://github.com/mity/md4c |
Home: http://github.com/mity/md4c
MD4C stands for “Markdown for C” and, unsurprisingly, it is a C Markdown parser implementation.
In short, Markdown is the markup language this `README.md` file is written in.
The following resources can explain more if you are unfamiliar with it:
MD4C is a C Markdown parser with the following features:
- Compliance: MD4C generally aims to be compliant with the latest version of the CommonMark specification. Right now we are quite close to CommonMark 0.27.
- Extensions: If explicitly enabled, the parser supports some commonly requested and accepted extensions. See below.
- Compactness: MD4C is implemented in one source file and one header file.
- Embedding: MD4C is easy to reuse in other projects; its API is very straightforward.
- Portability: MD4C builds and works on Windows and Linux, and it should be fairly trivial to build it on other systems as well.
- Encoding: MD4C can be compiled to recognize ASCII-only control characters, UTF-8 and, on Windows, also UTF-16 little endian, i.e. what is commonly called Unicode on Windows.
- Permissive license: MD4C is available under the MIT license.
- Performance: MD4C is quite fast.
The parser is implemented in a single C source file, `md4c.c`, and its accompanying header, `md4c.h`.
The main provided function is `md_parse()`. It takes a text in Markdown syntax as input, together with a renderer structure that holds pointers to a few callback functions. As `md_parse()` consumes the input, it calls the appropriate callbacks, allowing the application to convert the text into another format or render it onto the screen.
Refer to the header file for more details; the API is mostly self-explanatory and there are some explanatory comments.
An example implementation of a simple renderer is available in the `md2html` directory, which implements a conversion utility from Markdown to HTML.
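The callback-driven design described above can be sketched in a few lines of standalone C. This is only an illustration of the pattern, not MD4C's API: `demo_renderer`, `demo_parse()`, `demo_output` and `demo_to_html()` are hypothetical names, and the toy parser understands nothing but paragraphs separated by blank lines. The real declarations are in `md4c.h`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback table; NOT the renderer structure from md4c.h. */
typedef struct {
    int (*enter_paragraph)(void *userdata);
    int (*leave_paragraph)(void *userdata);
    int (*text)(const char *chunk, size_t size, void *userdata);
} demo_renderer;

/* Toy push parser: blank lines separate paragraphs, every other line is
 * reported as text. A non-zero return from any callback aborts parsing. */
int demo_parse(const char *input, size_t size,
               const demo_renderer *r, void *userdata)
{
    size_t pos = 0;
    int in_paragraph = 0;
    int ret;

    assert(input != NULL && r != NULL);

    while (pos < size) {
        /* Find the end of the current line. */
        size_t end = pos;
        while (end < size && input[end] != '\n')
            end++;

        if (end == pos) {
            /* Blank line: close the open paragraph, if any. */
            if (in_paragraph) {
                if ((ret = r->leave_paragraph(userdata)) != 0)
                    return ret;
                in_paragraph = 0;
            }
        } else {
            if (!in_paragraph) {
                if ((ret = r->enter_paragraph(userdata)) != 0)
                    return ret;
                in_paragraph = 1;
            }
            /* Line breaks inside a paragraph are simply dropped here. */
            if ((ret = r->text(input + pos, end - pos, userdata)) != 0)
                return ret;
        }
        pos = (end < size) ? end + 1 : end;
    }

    if (in_paragraph)
        return r->leave_paragraph(userdata);
    return 0;
}

/* A minimal "renderer" that collects HTML in a fixed-size buffer. */
typedef struct {
    char buf[256];
    size_t len;
} demo_output;

static int append(demo_output *out, const char *s, size_t n)
{
    if (out->len + n >= sizeof out->buf)
        return -1;                       /* buffer full: abort parsing */
    memcpy(out->buf + out->len, s, n);
    out->len += n;
    out->buf[out->len] = '\0';
    return 0;
}

static int on_enter(void *ud) { return append(ud, "<p>", 3); }
static int on_leave(void *ud) { return append(ud, "</p>\n", 5); }
static int on_text(const char *c, size_t n, void *ud) { return append(ud, c, n); }

/* Wires the callbacks to the parser and runs it over a C string. */
int demo_to_html(const char *input, demo_output *out)
{
    static const demo_renderer renderer = { on_enter, on_leave, on_text };
    out->len = 0;
    out->buf[0] = '\0';
    return demo_parse(input, strlen(input), &renderer, out);
}
```

Calling `demo_to_html("Hello\n\nWorld\n", &out)` fills `out.buf` with two `<p>` elements. The parser itself never allocates or formats output; everything format-specific lives in the callbacks, which is what makes this style easy to embed.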
By default, MD4C recognizes only elements defined by the CommonMark specification.
However, with the appropriate flags, the behavior of the MD4C parser can be tuned to enable some extensions or to allow some deviations from the specification.
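Option flags of this kind are commonly implemented in C as bit values that are combined with bitwise OR. The sketch below assumes that convention, which this text does not state explicitly, and uses hypothetical `DEMO_FLAG_*` constants with made-up values; the real `MD_FLAG_*` definitions, and the place where they are passed to the parser, are in `md4c.h`.

```c
#include <assert.h>

/* Hypothetical flags with illustrative values; not taken from md4c.h. */
enum {
    DEMO_FLAG_TABLES               = 0x01,
    DEMO_FLAG_NOHTML               = 0x02,
    DEMO_FLAG_NOINDENTEDCODEBLOCKS = 0x04
};

/* Returns non-zero if every bit of `flag` is set in `flags`. */
int demo_flag_enabled(unsigned flags, unsigned flag)
{
    assert(flag != 0);
    return (flags & flag) == flag;
}
```

An application would then build its option word as, for example, `DEMO_FLAG_TABLES | DEMO_FLAG_NOINDENTEDCODEBLOCKS` and the parser would test individual bits with a helper like `demo_flag_enabled()`.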
- `MD_FLAG_COLLAPSEWHITESPACE`: non-trivial whitespace is collapsed into a single space.
- `MD_FLAG_TABLES`: GitHub-style tables are supported.
- `MD_FLAG_PERMISSIVEURLAUTOLINKS`: permissive URL autolinks (not enclosed in '<' and '>') are supported.
- `MD_FLAG_PERMISSIVEAUTOLINKS`: ditto for e-mail autolinks.
- `MD_FLAG_NOHTMLSPANS` or `MD_FLAG_NOHTML`: raw inline HTML or raw HTML blocks, respectively, are disabled.
- `MD_FLAG_NOINDENTEDCODEBLOCKS`: indented code blocks are disabled.
The CommonMark specification generally assumes UTF-8 input, but on closer inspection Unicode is actually used on very few occasions.
MD4C exploits this property of the standard: its implementation is largely encoding-agnostic and assumes only that the encoding of your choice is compatible with ASCII.
By default, MD4C treats only ASCII characters as the ones that form marks in the document; all other input (the text) is passed through exactly as it appears in the input.
That said, Unicode is supported too:
- If you predefine the macro `MD4C_USE_UNICODE`, MD4C decodes UTF-8 only in the places where it matters.
- On Windows, if you predefine the macro `MD4C_USE_WIN_UNICODE`, MD4C uses `WCHAR` instead of `char` and assumes UTF-16LE encoding.
It should be relatively easy to add support for any other encoding, as long as its codepoints below 128 are compatible with ASCII.
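UTF-8 shows why the ASCII-compatibility assumption is enough for most of the work: every byte of a multi-byte UTF-8 sequence has its high bit set, so a plain byte-wise scan for an ASCII mark character can never match in the middle of a non-ASCII character. The helper below is a hypothetical illustration of that property and is not part of MD4C.

```c
#include <assert.h>
#include <stddef.h>

/* Counts occurrences of an ASCII mark character in a byte string.
 * Works unchanged on UTF-8 input, because bytes of multi-byte sequences
 * are always >= 0x80 and therefore never equal to an ASCII character. */
size_t demo_count_mark(const char *text, size_t size, char mark)
{
    size_t i;
    size_t count = 0;

    assert(text != NULL);
    assert((unsigned char)mark < 128);   /* the mark must be ASCII */

    for (i = 0; i < size; i++) {
        if (text[i] == mark)
            count++;
    }
    return count;
}
```

For the 7-byte UTF-8 string `"*\xc3\xa9t\xc3\xa9*"` (the word "été" wrapped in asterisks), scanning for `'*'` finds only the two real asterisks, even though the scanner knows nothing about UTF-8.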
MD4C is covered by the MIT license; see the file `LICENSE.md`.
If you encounter a bug, please be so kind as to report it. Unheard bugs cannot get fixed. You can submit bug reports here: