|
aeddaf58
|
2024-01-25T22:24:17
|
|
Simplify and fix handling of newline in code span.
Fixes #223 properly (one corner case went unnoticed, hidden by the test
suite normalization feature).
Fixes #230 (strictly speaking duplicate of the corner case).
|
|
d082cdd8
|
2024-01-25T21:25:26
|
|
test/run-testsuite.py: Allow disabling normalisation on per-unittest basis.
And use it for a few tests in regressions.txt where the whitespace
matters.
|
|
a3c510ac
|
2024-01-21T14:11:47
|
|
Improve coverage testing of UTF-8 routines.
|
|
cd7c326f
|
2024-01-21T13:20:38
|
|
Add code coverage test for MD_FLAG_COLLAPSEWHITESPACE.
|
|
65957f53
|
2024-01-19T10:37:33
|
|
Limit number of table columns to prevent explosion of output...
with input of the form generated by this one-liner:
$ python3 -c 'N=1000; print("x|" * N + "\n" + "-|" * N + "\n" + "x\n" * N)'
Here the amount of HTML output grows with N^2.
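A rough sketch of why this input is pathological, assuming a renderer that pads every body row to the full column count. The helper names are illustrative only and are not part of MD4C:

```python
def pathological_table(n):
    """Build the string the one-liner above prints: n header columns, n one-cell rows."""
    return "x|" * n + "\n" + "-|" * n + "\n" + "x\n" * n


def padded_cell_count(n):
    """Cells emitted if each of the 1 header + n body rows is padded to n columns."""
    return n * (1 + n)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # Input size grows linearly with n, the padded cell count quadratically.
        print(n, len(pathological_table(n)), padded_cell_count(n))
```

The input is only 6N+2 bytes long, while the number of padded cells is N*(N+1), which is why capping the column count bounds the output.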
|
|
70b247cf
|
2024-01-19T13:59:45
|
|
md_analyze_permissive_autolink: Accept path ending with '/'.
Fixes #226.
|
|
601ff053
|
2024-01-18T16:28:16
|
|
Fix handling new line at beginning/end of a code span.
Fixes #223.
|
|
23b14168
|
2024-01-18T15:11:22
|
|
pathological-tests.py: Fix output if a test unit ends with non-zero exit code.
|
|
a08f6a05
|
2024-01-18T12:29:31
|
|
Improve/fix latex math extension.
To mitigate false positives:
* We accept $ and $$ as a potential opener only if it's not preceded
with alnum char.
* Similarly closer cannot be followed with alnum char.
* We now also match closer with last preceding potential opener, not
the first one. (And to avoid nesting, any previous openers are
ignored.)
* Also revert an unintended change in 3fc207affaba313cc1f4ef3b4e9e57df89b0e028
which allowed keeping nested resolved marks in it.
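The first two rules above can be sketched roughly as follows. This is an illustration only: the function names are hypothetical, and `str.isalnum()` stands in for whatever character class the parser actually uses.

```python
def can_open_math(text, pos):
    """A '$' run starting at index pos may open only if not preceded by an alnum char."""
    return pos == 0 or not text[pos - 1].isalnum()


def can_close_math(text, end):
    """A '$' run ending just before index end may close only if not followed by an alnum char."""
    return end >= len(text) or not text[end].isalnum()
```

With these checks, the `$` in `a$b$` is not a potential opener, while the one at the start of `$x$ y` is.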
|
|
4728cd98
|
2024-01-17T16:04:14
|
|
md_analyze_tilde: Pop from chain tail like other emphasis.
The function incorrectly used header from the head, leading to wrong
result (incompatible with e.g. GFM) but even worse to bad internal state
md_rollback() is then potentially unable to solve.
Fixes #222.
|
|
f45dd442
|
2024-01-17T02:58:12
|
|
Add regression test for #213.
As it's now possible to add tests with multiple cmdline options easily.
|
|
d955c495
|
2024-01-17T02:48:57
|
|
Rework permissive autolinks. (#220)
* We have now dedicated run over the inline marks for them.
* We check more thoroughly whether it really looks like a URL or e-mail
address. The old implementation recognized even heavily broken ones.
* This allows us to be much more careful in order not to cross already
resolved marks.
* Share substantial parts of the code between all three types of the
permissive autolinks (URL, WWW, e-mail).
* Merge their tests into one file, spec-permissive-autolinks.txt.
* Add one pathological case which triggered quadratic behavior in the
old implementation.
|
|
a715b884
|
2024-01-16T15:29:35
|
|
Rename many files in test dir for better organization.
|
|
4b9e4d7c
|
2024-01-16T15:32:21
|
|
Move one more forgotten regression test to regressions.txt.
|
|
6685df9c
|
2024-01-16T15:09:33
|
|
Move all regression tests into new tests/regressions.txt.
(And update scripts/run-tests.sh accordingly.)
|
|
74e5f7a9
|
2024-01-16T14:56:09
|
|
Tests: Specify md2html command line options for each example as needed.
Previously the caller (or the script scripts/run_tests.sh) needed to
know what options to specify.
|
|
359406bf
|
2024-01-16T14:25:46
|
|
Test: Add support for per-example command line options.
(We also removed direct call support into the library. It was inherited
from cmark as the testsuite was originally taken from there, but it
actually was never updated to work with MD4C.)
|
|
78829427
|
2024-01-13T02:59:35
|
|
Fix some emphasis parsing issues.
* We incorrectly applied the infamous rule of three only to
asterisk-encoded emphasis, it has to be applied to underscore as
well.
* We incorrectly applied the rule of three only if the opener
and/or closer was inside a word. It has also to be applied if the
mark is both preceded and followed by punctuation.
Fixes #217.
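For reference, a minimal sketch of the rule of three as worded in the CommonMark spec (if either delimiter run can both open and close, the summed lengths must not be a multiple of 3 unless both lengths are). The function name and parameters are hypothetical and do not mirror MD4C internals:

```python
def rule_of_three_rejects(opener_len, closer_len, opener_can_close, closer_can_open):
    """Return True if the rule of three forbids pairing these two delimiter runs."""
    # The rule only applies when at least one run can both open and close.
    if not (opener_can_close or closer_can_open):
        return False
    # A sum that is not a multiple of 3 is always allowed.
    if (opener_len + closer_len) % 3 != 0:
        return False
    # A sum that is a multiple of 3 is allowed only if both lengths are too.
    return not (opener_len % 3 == 0 and closer_len % 3 == 0)
```

Per this commit, the check has to apply to `_` runs as well as `*` runs, and also to runs that are both preceded and followed by punctuation.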
|
|
5592352f
|
2024-01-13T00:30:08
|
|
HTML declaration doesn't require whitespace before the closer.
Fixes #216.
|
|
7497ea92
|
2024-01-13T00:17:08
|
|
Allow tabs after setext header underline.
Fixes #215.
|
|
0d10b60b
|
2024-01-12T22:44:31
|
|
Move test/fuzz-input/ to test/fuzzers/seed-corpus/.
|
|
821477b1
|
2024-01-10T17:35:46
|
|
Fix typo in fuzz-mdhtml.c, preventing oss-fuzz from working.
|
|
c6942ef0
|
2024-01-10T17:31:55
|
|
Treat TABLECELLBOUNDARIES chain as special one.
It's not an ordinary openers chain as (most of) the others, and
md_rollback() must not touch it.
Fixes #212.
|
|
ca169a92
|
2024-01-10T12:22:04
|
|
Fix HTML renderer to handle nested images correctly.
Fixes #210.
|
|
38303af3
|
2024-01-09T00:01:35
|
|
Make md_is_html_block_end_condition() reuse the same data...
... as md_is_html_block_start_condition() for the type 1 so that all
tags are used consistently there.
Fixes #207.
|
|
8699cd5d
|
2024-01-08T21:58:26
|
|
test/hard-soft-breaks.txt: Fix wording.
|
|
6ef3be6e
|
2024-01-08T20:09:57
|
|
`MD_FLAG_HARD_SOFT_BREAKS` (#193)
|
|
4d2f8a2e
|
2024-01-08T19:35:53
|
|
Add test for issue #201.
Seems the issue got fixed by combination of previous commits.
Fixes #201.
|
|
132c29dc
|
2024-01-08T19:31:37
|
|
Allow indented code block to follow any block except paragraph without a blank line.
Fixes #200.
|
|
601c8ab7
|
2024-01-08T19:06:04
|
|
Restore parent's block indentation when interrupting a list item with double blank line.
Fixes #190.
|
|
a27f8dc0
|
2023-12-12T19:31:30
|
|
test/fuzzers/fuzz-mdhtml.c: Remove stale comment.
|
|
d3c1c0bb
|
2022-01-14T17:27:05
|
|
fuzz-mdhtml.c: Cleanup of the code.
|
|
b42e7f5c
|
2022-01-10T11:41:25
|
|
md_resolve_links: Avoid link ref. def. lookup if...
... we know that the bracket pair contains nested brackets. That makes
the label invalid, so there is no link ref. def. to be found anyway.
In case of heavily nested bracket pairs, the lookup could lead to
quadratic parsing times.
Fixes #172.
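The shortcut can be illustrated with a small check, assuming CommonMark's rule that a link label may not contain unescaped brackets. The helper name is hypothetical; MD4C itself tracks this on its mark structures rather than rescanning the text:

```python
def label_has_nested_brackets(label):
    """Return True if the text between '[' and ']' contains an unescaped bracket."""
    i = 0
    while i < len(label):
        ch = label[i]
        if ch == "\\" and i + 1 < len(label):
            i += 2  # a backslash escapes the following character
            continue
        if ch in "[]":
            return True  # such a label is invalid, so no lookup is needed
        i += 1
    return False
```

When this returns True, the reference definition lookup can be skipped entirely, which avoids the repeated lookups on deeply nested input.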
|
|
7f44e1ad
|
2022-01-10T10:39:29
|
|
pathological_tests.py: Improve code alignment.
|
|
a8bb4d30
|
2022-01-06T16:01:55
|
|
md_is_table_underline: Remove requirement for minimal length of a cell underline.
Fixes #169.
|
|
c01aa6b3
|
2021-06-27T18:28:26
|
|
Update CommonMark spec file to v. 0.30
|
|
bcb55d0d
|
2021-04-14T09:18:09
|
|
md_resolve_links: Suppress bogus nested permissive autolink.
Fixes #152.
|
|
3478ec69
|
2021-02-23T14:01:31
|
|
Added fuzzer for oss-fuzz integration. (#151)
|
|
fd7b5fe0
|
2021-02-05T21:40:47
|
|
md_analyze_line: Fix implicit ending of HTML blocks...
... when the HTML block is not explicitly ended (before the enclosing
container block ends).
Fixes #149.
|
|
da5821ae
|
2020-12-14T19:53:40
|
|
Fix testcase for issue #142.
|
|
5a44e327
|
2020-12-14T18:59:56
|
|
md_link_label_cmp: Fix the loop end condition.
The old version likely could stop prematurely in a corner case when
there was a Unicode character at the end of either string, which
maps into multiple fold info codepoints.
Fixes #142.
|
|
3254b7cb
|
2020-11-13T12:02:39
|
|
md_process_table_block_contents: Suppress empty TBODY block generation.
When the table has no body rows, do not call the callback with
MD_BLOCK_TBODY events.
Fixes #138.
|
|
4585088a
|
2020-11-13T10:16:34
|
|
md_analyze_permissive_url_autolink: Better GFM compatibility.
The autolinks now allow unmatched parentheses; only the trailing
parenthesis closers are handled specially to deal with the situation where
the autolink is entirely inside outer parentheses.
Somehow our tests were broken and avoided the cases with unmatched
parentheses inside the auto-link. That's now fixed and in sync
with GFM specs too.
Fixes #135.
|
|
002f76c9
|
2020-10-18T09:37:45
|
|
md_resolve_links: Skip [...] used as a reference link/image label.
Fixes #131.
|
|
c501c891
|
2020-07-30T10:13:05
|
|
Fix spelling of "than" in many occurrences.
I often erroneously spell it as "then", a mistake I make way too
often when typing fast.
|
|
c595c2ed
|
2020-07-30T08:38:19
|
|
md_process_verbatim_block_contents: Fix off by 1 error.
This caused wrong indentation to be output inside fenced code blocks for
lines indented with more than 16 spaces.
Fixes #124.
|
|
0c4d7f3d
|
2020-07-28T07:18:23
|
|
test/normalize.py: Use html.escape instead of cgi.escape.
Fixes #123.
|
|
d0e3ed79
|
2020-03-12T22:45:32
|
|
md2html: Skip UTF-8 BOM, if present in the input.
|
|
9e6ab76c
|
2020-02-17T12:41:50
|
|
Minor fuzz-input cleanup.
Move some permissive links incorrectly placed in commonmark.md into
gfm.md.
|
|
cc9a9d28
|
2020-02-16T15:29:54
|
|
test/fuzz-test: Add some fuzzing testing initial input.
|
|
5d7c3597
|
2020-02-16T13:46:16
|
|
md_analyze_emph: Detect correctly opener chain when resolving the range.
Fixes #107.
|
|
b4c30cd6
|
2020-02-13T02:23:03
|
|
Improve wiki-link parsing.
* Get rid of MD_LINE::total_indent.
* Remove some special complicated branching for nested images: Instead
we use md_rollback() on the wiki-link destination span to kill _any_
marks resolved so far, including the images.
* Remove any length limit from label. Only destination length is
limited, regardless of whether '|' is present or not.
* Move the special handling of `[[foo|]]` from md_process_inlines()
into md_resolve_links(). We simply expand the closer mark to consume
the `|`.
* Do not modify the opener and closer marks until we really know it
is indeed a wiki-link.
|
|
403043bb
|
2020-01-16T16:15:08
|
|
md_mark_chain_append: Set next of the tail mark to -1.
Fixes #104.
|
|
e6661f23
|
2020-01-10T19:27:10
|
|
Implement an underline extension. (#103)
Closes #101.
|
|
82d7d087
|
2020-01-10T15:48:00
|
|
Rework/improve recognition of strike-through spans.
Closes #102.
|
|
561f52e0
|
2020-01-05T18:33:46
|
|
md_is_autolink_email: Fix an off-by-one error.
Fixes #100.
|
|
46f25f0b
|
2019-11-12T21:48:26
|
|
md_analyze_emph: Call md_resolve_range() with proper chain.
Erroneously, we have called md_resolve_range() with mark chain derived
from the closer mark. In the case that the opener and closer marks
differ in length (and we have split one or the other), we pass in an
incorrect chain, which may lead to strange behavior in subsequent
analysis.
Fixes #98.
|
|
e336e640
|
2019-11-04T15:20:59
|
|
Add support for Wiki links (#92)
With a new flag MD_FLAG_WIKILINKS, recognize wiki-style links
as [[foo]] and [[foo|bar]].
Update also the HTML renderer accordingly, to output a custom
HTML tag <x-wikilink> when seeing it.
|
|
ef85cfc2
|
2019-11-04T15:05:07
|
|
Simplify parsing of tables (#97)
We do so by removing the function md_is_table_row().
md_is_table_row() did some crazy inline parsing to detect whether the
line contains at least one pipe which is not inside a code span or other
high-priority inline element.
This was very complicated under the hood and it was actually breaking
the clean design which separates block analysis from inline analysis
of each block contents.
We now just use the table underline for determining the block is table
and its properties like e.g. the column count.
This means a paragraph now cannot interrupt a table. This is a change in
a behavior but likely acceptable one as it actually brings the behavior
closer to behavior of tables in cmark-gfm in this regard.
Last but not least, md_is_table_row() seemed to prevent adoption of other
useful features; for more about that, see the discussion in PR #92.
|
|
993c7b9b
|
2019-11-03T23:32:46
|
|
Render LaTeX math into HTML as a tag <x-equation>...
... instead of <equation>. This is to highlight that it is not a
standard HTML tag.
|
|
e97d0250
|
2019-11-03T13:44:29
|
|
Link label comparison fixes.
* md_link_label_cmp: To match the labels, the loop has to reach ends of
the labels for both of them.
* md_link_label_cmp_load_fold_info: Collapse consecutive whitespace
into a single ' ' for the label comparison purposes.
Fixes #96.
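The intended matching behavior can be approximated in a few lines. This is a sketch only: `normalize_label` is a hypothetical helper, Python's `str.split()` recognizes a somewhat broader whitespace set than CommonMark does, and `str.casefold()` stands in for MD4C's own fold-info tables.

```python
def normalize_label(label):
    """Strip the label, collapse internal whitespace runs to one space, then case-fold."""
    # split() with no arguments drops leading/trailing whitespace and splits on runs.
    return " ".join(label.split()).casefold()


def labels_match(a, b):
    """Two labels match only if their normalized forms are equal over their whole length."""
    return normalize_label(a) == normalize_label(b)
```

Comparing complete normalized strings reflects the first point above: a match requires reaching the end of both labels, not just the end of one.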
|
|
0354e1ab
|
2019-10-04T22:34:08
|
|
md_is_container_mark: Ordered list mark requires at least one digit.
Fixes #95.
|
|
97606369
|
2019-07-07T11:19:21
|
|
Fix the last test case in latex-math.txt.
|
|
099ce69b
|
2019-07-07T11:15:44
|
|
Add missing file into git.
|
|
2e965941
|
2019-07-07T10:59:20
|
|
Add/improve docs for the LaTeX math spans.
|
|
8bac86aa
|
2019-07-07T09:46:10
|
|
Added support for LaTeX math (#87)
Addresses #86.
|
|
ce8b5d94
|
2019-05-27T22:16:35
|
|
md_analyze_line: Blockquote with blank line can interrupt a paragraph.
Fixes #83.
|
|
51386164
|
2019-05-19T11:46:26
|
|
md_link_label_cmp: Fix handling non-trivial folding info.
Fixes #78.
|
|
4f6a9e54
|
2019-05-19T10:46:26
|
|
Update Unicode support to 12.1.
* scripts/build_*_map.py: Implement helper Python scripts used to
generate some Unicode search maps and data for helper Unicode
functions used in MD4C.
This should simplify updating to future Unicode versions.
* md_get_unicode_fold_info: Use data generated by the scripts.
* md_is_unicode_whitespace__: Ditto.
* md_is_unicode_punct__: Ditto.
|
|
aca5c27f
|
2019-05-16T22:48:08
|
|
test/spec.txt: Update from upstream head.
|
|
64a1bc37
|
2019-05-15T23:25:05
|
|
test/coverage.txt: Sort the regression test cases by the issue number.
|
|
919a0cc9
|
2019-05-08T07:38:33
|
|
test/*.txt: Fix some formatting.
|
|
1757ff55
|
2019-05-07T23:10:46
|
|
test/spec_tests.py: Make ready for spec.txt from cmark-gfm project.
This allows easier checking of our GFM dialect compatibility.
|
|
83047d3e
|
2019-05-07T22:24:29
|
|
md_analyze_permissive_url_autolink: Improve.
* Fix domain recognition so that it has to have at least two
dot-delimited components.
* Fix handling of parentheses so that they have to form balanced
pairs; i.e. the first ')' not having a preceding opener ends the
path.
Fixes #76.
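A minimal sketch of the balanced-parentheses rule described in this commit. The function name is hypothetical, and treating whitespace as the only other terminator is a simplifying assumption:

```python
def autolink_path_end(text, start):
    """Return the index where the autolink path starting at `start` ends."""
    depth = 0
    i = start
    while i < len(text) and not text[i].isspace():
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            if depth == 0:
                break  # first ')' without a preceding opener ends the path
            depth -= 1
        i += 1
    return i
```

For `http://example.com/a_(b)) ok` the path ends before the second `)`, so an autolink wrapped in outer parentheses does not swallow the closing one.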
|
|
609dfb0b
|
2019-05-05T15:56:51
|
|
md_analyze_line: Treat blank lines inside a HTML block more carefully...
... with respect to the parent list containers.
Fixes #10 (but now really).
|
|
95279131
|
2019-04-30T00:32:36
|
|
When undoing complete block from ctx->block_bytesp[], reset ctx->current_block properly.
Fixes #74.
|
|
d4d10915
|
2019-04-29T19:03:16
|
|
Improve parsing of inline raw HTML.
* Isolate some common code for scanning HTML closer into a new function
so most HTML scanner functions reuse the same code.
* Improve the scanning for the closer so that on failure we remember
the range where no closer is present. So any later scanning attempts
may fail early.
Fixes #73.
|
|
d7920b9c
|
2019-04-08T19:35:06
|
|
Merge pull request #67 from mity/spec-0.29
This merges all changes for CommonMark specification 0.28 -> 0.29 transition.
|
|
5b78f295
|
2019-04-08T11:00:27
|
|
test/spec.txt: Update from upstream head.
|
|
2a7b97ed
|
2019-04-05T08:18:54
|
|
test/spec.txt: Update from upstream head.
|
|
b8586987
|
2019-04-03T08:28:27
|
|
md_collect_mark: Add missing 'continue' to '~' branch.
Fixes #69.
|
|
855a1bfc
|
2019-03-27T02:04:24
|
|
test/spec.txt: Update from upstream head.
|
|
94c86fe2
|
2019-03-26T14:45:23
|
|
Revert "Fix problematic link destinations with angle brackets."
The updated specification now explicitly requests the behavior we
implemented before fixing #24.
This reverts commit 2e0a74ba990e291ef4eace047d50af05ca81daef.
Also remove associated regression test as it is no longer valid.
|
|
0959975a
|
2019-03-26T14:01:02
|
|
md_analyze_emph: Follow specs changes to the "rule of three".
|
|
98968e22
|
2019-03-26T13:33:05
|
|
Update spec.txt from upstream head.
(I previously used an updated revision of it by mistake.)
|
|
1edd0c9c
|
2019-03-26T11:49:25
|
|
test/spec.txt: Update to current upstream HEAD.
|
|
2dd96ab4
|
2019-03-12T09:56:11
|
|
Fix O(n^2) in handling the "rule of three".
We had to break the list of potential '*' openers into multiple ones so
we do not have to walk it when looking for matching length due to the
"rule of three" for intraword delimiter runs.
Fixes #63.
|
|
b2108652
|
2019-03-11T21:13:15
|
|
md_analyze_line: Fix O(n^2) in thematic break handling.
Fixes #66.
|
|
37104fc2
|
2019-03-11T20:26:58
|
|
md_is_code_span: Fix crash at EOF.
Fixes #65.
|
|
966b8e39
|
2019-03-11T19:56:46
|
|
md_is_link_title: Stop on ')' in ()-style title.
Fixes #60.
|
|
fc27108e
|
2019-03-11T19:55:08
|
|
test/pathological_tests.py: Output test durations.
|
|
53f65852
|
2019-03-11T19:03:34
|
|
test/spec.txt: Little update.
Somehow we had a slightly different spec.txt version than the one
from CommonMark repo tag 0.28. But we still pass all its compliance
test suite.
|
|
685b7144
|
2019-03-10T11:20:39
|
|
Move codespan detection from md_analyze_backtick() into...
md_is_code_span(), called from md_collect_marks().
We have to do this at the same time as detecting raw inline HTML to
follow CommonMark priority requirements.
Also it is done very differently now:
When scanning for the closer mark, we remember (the latest) position of
potential closers for all other lengths as well.
This means that:
(1) If we find it, we reduced the task because all subsequent scans shall
begin after the closer.
(2) If we do not find it, then we have to reach the end of the block and
hence we then know (for every allowed marker length) the position of last
such backtick sequence.
(3) That guarantees that any subsequent call will either succeed
in its scan (and reduce the task even further); or that we shall be able
to detect instantly there is no suitable closer.
I.e. every call either reduces the task by O(n) scan (1); or collects
all the data in O(n) because (2) happens at most once; or fails in O(1)
(3).
This guarantees O(n) complexity of the function.
Fixes #59.
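The memoization idea can be sketched as below. This is a simplified illustration rather than the MD4C implementation: the class and method names are hypothetical, and it assumes `find_closer()` is called with non-decreasing `start` offsets, as happens when openers are handled left to right.

```python
class BacktickScanner:
    """Remembers where backtick runs were seen so that hopeless scans fail in O(1)."""

    def __init__(self, text):
        self.text = text
        self.last_run = {}        # run length -> offset of the latest run seen so far
        self.scanned_all = False  # True once some scan has reached the end of the block

    def find_closer(self, start, length):
        """Return the offset of the first run of exactly `length` backticks at or after start, or -1."""
        # Case (3): everything up to the end is already known, so fail instantly.
        if self.scanned_all and self.last_run.get(length, -1) < start:
            return -1
        i, n = start, len(self.text)
        while i < n:
            if self.text[i] != "`":
                i += 1
                continue
            j = i
            while j < n and self.text[j] == "`":
                j += 1
            run = j - i
            if run == length:
                return i  # case (1): found, later scans start after this closer
            # Remember runs of other lengths for later calls.
            self.last_run[run] = max(self.last_run.get(run, -1), i)
            i = j
        self.scanned_all = True  # case (2): reached the end, happens at most once
        return -1
```

For the text `` `a``b```c ``, a failed search for a single-backtick closer records the runs of length 2 and 3, so a later search for a length-2 closer after offset 4 fails without rescanning.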
|
|
0cb61205
|
2019-03-10T10:50:23
|
|
Move raw inline HTML detection from md_analyze_lt_qt() into md_collect_marks().
Fixes #58:
For resolving raw inline HTML the function tried closer with all
potential openers, because raw HTML can have '<' inside of an attribute.
However this caused O(n^2) for input like "<><><><><><><>...".
We solved this by handling raw HTML in an earlier stage, directly in
md_collect_marks(), where we can scan linearly forward.
Fixes #61:
As a side effect, this also fixes the issue that MD_FLAG_NOHTMLSPANS
disabled also recognition of CommonMark autolinks.
|
|
8e01a769
|
2019-02-10T22:58:42
|
|
Implement task lists. (#50)
Fixes #30.
|
|
d32aa2e0
|
2019-02-09T10:40:52
|
|
Fix conflict in parsing permissive autolinks and ordinary links.
The issue is caused by the fact that we do not know the exact position
of a permissive auto-link at the time of md_collect_marks() because there
is no syntax to mark its end in the first place.
This causes that eventually, the closer mark in ctx->marks[] can be
out-of-order somewhat.
As a consequence, if some other mark range (e.g. ordinary link)
shadows the auto-link, the closer mark may be left outside the shadowed
range and survive till the phase when we generate the output.
We fix by using an extra mark flag to remember we did really output
the opener mark, and output the closer only in such case.
Fixes #53.
|
|
67401e70
|
2019-02-06T04:31:25
|
|
md_analyze_inlines: Resolve table cell boundaries before links.
This brings some corner cases closer to cmark-gfm.
Also fixes #51.
|
|
8fc692ba
|
2018-06-11T18:17:26
|
|
md_rollback: Do not touch TABLECELLBOUNDARIES chain.
This chain is not normal opener/closer inline mark chain.
Fixes #42.
|
|
e6e2ea4c
|
2018-06-11T11:43:47
|
|
md_analyze_line: Fix mixing list and table parsing.
If table header underline is not nested the same way as the preceding
line (i.e. the wannabe table header line), then it cannot form a table.
Fixes #41.
|
|
4ef024fb
|
2018-05-29T23:30:02
|
|
md_process_inlines: Fix link/image closers spanning over multiple lines.
Fixes #40.
|