|
aa53f82c
|
2024-02-07T11:44:39
|
|
Introduce an overall limit to link ref. def. instantiations.
This is to prevent time and output size explosion for input patterns
such as the one generated by this:
$ python -c 'N=1000; print("[x]: " + "x" * N + "\n[x]" * N)'
We roughly allow link reference definitions to blow up the input size
of the document 16 times or up to 1 MB, whichever is smaller. When the
threshold is reached, any following link reference definition
instantiations are left unresolved and sent to the output as plain text.
Fixes #238.
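
A minimal sketch of the budget described above, written in Python rather than the project's C. The function names and constants are hypothetical, and reading "1 MB" as 1024 * 1024 bytes is an assumption:

```python
# Hypothetical names; this is not md4c's API, only a model of the limit.
MAX_EXPANSION_FACTOR = 16            # "16 times" the input size
MAX_EXPANSION_BYTES = 1024 * 1024    # "up to 1 MB" (assumed to mean 1 MiB)


def ref_def_budget(input_size: int) -> int:
    """Upper bound on bytes that link ref. def. instantiations may add."""
    return min(MAX_EXPANSION_FACTOR * input_size, MAX_EXPANSION_BYTES)


def pathological_input(n: int) -> str:
    """Same pattern as the one-liner: one long definition, referenced n times."""
    return "[x]: " + "x" * n + "\n[x]" * n
```

For small documents the factor of 16 is the binding limit; for large ones the 1 MB cap takes over.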
|
|
30945d80
|
2024-02-01T22:16:22
|
|
md_is_link_label: Fix warning about potentially uninitialized variable...
... when built with gcc 13.2.0 in release build.
|
|
f37a89f5
|
2024-02-01T21:55:45
|
|
md_is_inline_link_spec: Use md_lookup_line() instead of walking.
Fixes #236.
|
|
a44a1cf8
|
2024-01-11T11:29:29
|
|
Update tags for HTML block starting condition.
Specifically, "<source>" has been removed, "<search>" added.
|
|
4aea320a
|
2024-01-09T02:25:05
|
|
md_is_html_comment: Reflect updated spec.txt.
* Accept "<!-->" and "<!--->" as valid HTML comments.
* An HTML comment can now contain "--".
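
The two rules above can be sketched as a regular expression. This is only an illustration in Python of the updated comment syntax (the helper name is hypothetical), not how md_is_html_comment is implemented:

```python
import re

# "<!-->" and "<!--->" are complete comments on their own; otherwise a
# comment runs from "<!--" to the first "-->" and may contain "--" inside.
HTML_COMMENT = re.compile(r"<!-->|<!--->|<!--.*?-->", re.DOTALL)


def match_html_comment(text: str, pos: int = 0):
    """Return the HTML comment starting at pos, or None if there is none."""
    m = HTML_COMMENT.match(text, pos)
    return m.group(0) if m else None
```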
|
|
ef4dcd41
|
2024-01-09T02:54:25
|
|
Updated spec.txt expands what's recognized as Unicode punctuation.
Namely all P and S general categories are now treated as punctuation.
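
As an illustration of the new classification, here is a sketch using Python's standard `unicodedata` module. md4c itself does not work this way, the helper name is hypothetical, and the result depends on the Unicode version bundled with the Python build:

```python
import unicodedata


def is_unicode_punctuation(ch: str) -> bool:
    """True if ch belongs to a P (punctuation) or S (symbol) general category."""
    return unicodedata.category(ch)[0] in ("P", "S")
```

With this rule, symbols such as "$" or "+" count as punctuation alongside characters like "!".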
|
|
5bd62241
|
2024-01-28T08:10:06
|
|
Fix warning about a shadowed variable (with -Wshadow).
Fixes #234.
|
|
90f8d964
|
2024-01-26T10:23:36
|
|
Put all compiler options in one place and unify them for all targets.
(And fix a newly triggered warning in md2html/md2html.c.)
|
|
3e8048db
|
2024-01-26T03:09:56
|
|
Improve/unify approach to line indexing.
* Consistently use type MD_SIZE for line indices.
* Remove pointer arithmetic on lines and replace it with line index
arithmetic.
This resolves some warnings in MSVC builds.
See PR #232.
Co-authored-by: Martin Mitas <mity@morous.org>
Co-authored-by: Shawn Rutledge <s@ecloud.org>
|
|
5178c585
|
2024-01-25T23:53:58
|
|
Fix uninitialized variable.
This was a regression introduced in the commit
aeddaf587f5e3faceb2d88a45ef7987bdebfe837.
|
|
aeddaf58
|
2024-01-25T22:24:17
|
|
Simplify and fix handling of newline in code span.
Fixes #223 properly (one corner case went unnoticed, hidden by the test
suite normalization feature).
Fixes #230 (strictly speaking duplicate of the corner case).
|
|
f46000c7
|
2024-01-24T09:49:59
|
|
Use UTF-8 in copyright notes.
|
|
2cb4f23f
|
2024-01-22T09:14:58
|
|
md_collect_marks: Improve pre-test for '.'.
|
|
23e7929b
|
2024-01-22T09:10:25
|
|
md_analyze_permissive_autolink: Check left boundary asap.
|
|
fcd3ca13
|
2024-01-21T15:20:49
|
|
Fix source indentation.
|
|
83e093fb
|
2024-01-21T11:50:18
|
|
md_opener_stack: Mark the default branch of switch as unreachable.
We were returning NULL previously, but that would lead to a crash
anyway; all callsites expect to get their respective stack anyway
and anything else would mean we are internally broken.
|
|
0672f27c
|
2024-01-21T11:45:02
|
|
md_process_table_row: Remove not needed freeing of ptr_stack.
This is already handled universally in
md_process_normal_block_contents() which is called from
md_process_table_row() via md_process_table_cell().
|
|
faf39849
|
2024-01-21T11:42:30
|
|
md_is_html_cdata: Remove not needed max_end shrinking.
md_scan_for_html_closer() handles that internally.
|
|
65957f53
|
2024-01-19T10:37:33
|
|
Limit number of table columns to prevent explosion of output...
with an input pattern of the form generated by this one-liner:
$ python3 -c 'N=1000; print("x|" * N + "\n" + "-|" * N + "\n" + "x\n" * N)'
Here the amount of HTML output grows with N^2.
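
A rough model of why the output is quadratic, assuming (as in GFM tables) that every body row is padded with empty cells up to the number of header columns. The function names are hypothetical and the count is a model, not a measurement of md4c's output:

```python
def pathological_table(n: int) -> str:
    """Same pattern as the one-liner: n columns, then n one-cell body rows."""
    return "x|" * n + "\n" + "-|" * n + "\n" + "x\n" * n


def expected_cell_count(n: int) -> int:
    """One header row plus n body rows, each rendered with n cells."""
    return (1 + n) * n
```

The input length is linear in n (6n + 2 characters), while the number of rendered cells is n^2 + n.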
|
|
70b247cf
|
2024-01-19T13:59:45
|
|
md_analyze_permissive_autolink: Accept path ending with '/'.
Fixes #226.
|
|
bbb43fe0
|
2024-01-18T17:30:44
|
|
Rename PUSH_MARK() to ADD_MARK().
This is to prevent confusion with opener stack operations.
|
|
246e105d
|
2024-01-18T17:22:54
|
|
Refactor mark chains. (#224)
* Rename MD_MARKCHAIN to MD_MARKSTACK to indicate its semantics much
more clearly.
* Simplify its implementation (single-linked list instead of
double-linked one).
* Where it was reused (misused?) for other, unrelated stuff, with other
semantics, it's now done explicitly. (i.e. got rid of
TABLECELLBOUNDARIES).
* PTR_CHAIN still uses the stack (we don't care about order there), but
it got separated from the array of ordinary opener stacks at least.
|
|
601ff053
|
2024-01-18T16:28:16
|
|
Fix handling new line at beginning/end of a code span.
Fixes #223.
|
|
c076698a
|
2024-01-18T16:10:46
|
|
md_collect_marks: Get rid of helper vars line_beg, line_end.
|
|
08728831
|
2024-01-18T13:39:48
|
|
md_rollback: Update outdated comment.
|
|
d40458b5
|
2024-01-18T12:39:36
|
|
md_rollback: Simplify the function.
We assume the provided opener_index and closer_index do not cross
boundaries of already resolved ranges. Previously the function tried to
deal with such a situation, but this code should not be needed; it was
very complex and, most importantly, broken anyway.
|
|
a08f6a05
|
2024-01-18T12:29:31
|
|
Improve/fix latex math extension.
To mitigate false positives:
* We accept $ and $$ as a potential opener only if it's not preceded
with alnum char.
* Similarly closer cannot be followed with alnum char.
* We now also match closer with last preceding potential opener, not
the first one. (And to avoid nesting, any previous openers are
ignored.)
* Also revert an unintended change in 3fc207affaba313cc1f4ef3b4e9e57df89b0e028
which allowed keeping nested resolved marks in it.
|
|
3fc207af
|
2024-01-18T10:56:12
|
|
Handle e-mail autolinks in a safer way.
For standard e-mail autolinks <user@host> we used to internally
transform '<' into '@' (permissive e-mail autolink) to unify the
handling of the missing "mailto:" prefix, which has to be added to the
destination attribute.
This is no longer the case; we now handle that specially.
It is actually what has bitten us in
https://oss-fuzz.com/testcase-detail/4815193402048512.
Even though this isn't the root cause of the issue, this change makes
the code safer and easier to understand.
|
|
4728cd98
|
2024-01-17T16:04:14
|
|
md_analyze_tilde: Pop from chain tail like other emphasis.
The function incorrectly popped from the chain head, leading to a wrong
result (incompatible with e.g. GFM) but, even worse, to a bad internal
state which md_rollback() is then potentially unable to resolve.
Fixes #222.
|
|
006611b9
|
2024-01-17T15:03:00
|
|
md_analyze_dollar: Call md_rollback() only when resolving.
Fixes #221.
|
|
d955c495
|
2024-01-17T02:48:57
|
|
Rework permissive autolinks. (#220)
* We now have a dedicated run over the inline marks for them.
* We check more thoroughly whether it really looks like a URL or e-mail
address. The old implementation recognized even heavily broken ones.
* This allows us to be much more careful in order not to cross already
resolved marks.
* Share substantial parts of the code between all three types of the
permissive autolinks (URL, WWW, e-mail).
* Merge their tests into one file, spec-permissive-autolinks.txt.
* Add one pathological case which triggered quadratic behavior in the
old implementation.
|
|
0ac9f35d
|
2024-01-16T09:53:41
|
|
md_analyze_marks: Skip analyzing marks if...
they fall into range of previously analyzed mark. That can happen if the
previous mark has been expanded. That typically happens for permissive
auto-links.
This fixes one case of pathological input leading to quadratic behavior.
|
|
b6777d78
|
2024-01-16T01:30:59
|
|
Wiki-links extension: Search for '|' only outside resolved ranges.
|
|
afeece29
|
2024-01-15T23:03:21
|
|
Fix line indentation calculation when interrupting list...
due to the "list item cannot begin with two blank lines" rule.
|
|
78829427
|
2024-01-13T02:59:35
|
|
Fix some emphasis parsing issues.
* We incorrectly applied the infamous rule of three only to
asterisk-encoded emphasis, it has to be applied to underscore as
well.
* We incorrectly applied the rule of three only if the opener
and/or closer was inside a word. It has also to be applied if the
mark is both preceded and followed by punctuation.
Fixes #217.
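
For reference, the rule of three as worded in the CommonMark spec can be sketched as follows. This is an illustrative Python predicate with hypothetical names, not the code in md4c:

```python
def rule_of_three_blocks(opener_len: int, closer_len: int,
                         opener_can_close: bool, closer_can_open: bool) -> bool:
    """True if the rule of three forbids pairing these two delimiter runs.

    The rule applies only when one of the runs can both open and close
    emphasis: the sum of the run lengths must then not be a multiple of 3,
    unless both lengths are multiples of 3.
    """
    if not (opener_can_close or closer_can_open):
        return False
    if (opener_len + closer_len) % 3 != 0:
        return False
    return not (opener_len % 3 == 0 and closer_len % 3 == 0)
```

The predicate takes no delimiter character as a parameter, which mirrors the first point above: the same check applies to both `*` and `_`.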
|
|
5592352f
|
2024-01-13T00:30:08
|
|
HTML declaration doesn't require whitespace before the closer.
Fixes #216.
|
|
7497ea92
|
2024-01-13T00:17:08
|
|
Allow tabs after setext header underline.
Fixes #215.
|
|
2750d9fa
|
2024-01-13T00:02:12
|
|
Add tags <h2>...<h6> as triggers for HTML block type 6.
Fixes #214.
|
|
4a64fee2
|
2024-01-11T13:12:55
|
|
Bump copyright years.
|
|
5204c30d
|
2024-01-11T12:41:40
|
|
md_is_html_block_end_condition: Fix return value.
|
|
f32a861e
|
2024-01-11T12:20:23
|
|
md_end_current_block: Fix EOL handling.
|
|
76abc636
|
2024-01-11T12:09:22
|
|
md_is_html_block_end_condition: Fix EOF handling.
|
|
4a7246de
|
2024-01-11T11:55:38
|
|
md_is_inline_link_spec: Fix EOL checking.
|
|
e25ea3d1
|
2024-01-11T03:34:24
|
|
Update list of named entities.
|
|
c6535ff3
|
2024-01-10T21:39:24
|
|
Fix eof handling in a middle of task list item.
|
|
ebbb12e5
|
2024-01-10T20:29:02
|
|
Revert most of PR #168
i.e. of the commit f436c3029850c138e54a0de055d61db45130409e.
It added a bunch of checks all over the place, but most of them
shouldn't be needed: If they are true, our internal state is
already broken. In other words, those checks are hiding real bugs
and making debugging harder.
Hopefully the underlying bugs are already fixed in some of the previous
commits addressing some fuzzing issues, like these:
* d775b5103ee130edbd808e21d1da6ca75f76a558
* c6942ef03ed46a67bd9b3af8ce1eefd781622777
|
|
d775b510
|
2024-01-10T18:33:32
|
|
More fixes of TABLECELLBOUNDARIES chain handling.
Fixes #213.
|
|
c6942ef0
|
2024-01-10T17:31:55
|
|
Treat TABLECELLBOUNDARIES chain as special one.
It's not an ordinary openers chain as (most of) the others, and
md_rollback() must not touch it.
Fixes #212.
|
|
ca169a92
|
2024-01-10T12:22:04
|
|
Fix HTML renderer to handle nested images correctly.
Fixes #210.
|
|
efcfd7e7
|
2024-01-09T02:32:17
|
|
Added MD_SPAN_A_DETAIL.is_autolink (#181)
This allows the processor to tell whether an <A> tag is the result of
an autolink, and customize its output. For example, I want to emit an
autolink of an image URL as an <IMG> tag, and an autolink of a YouTube
URL as a video embed.
|
|
61949ee9
|
2024-01-09T02:08:48
|
|
Update to Unicode 15.1.
|
|
38303af3
|
2024-01-09T00:01:35
|
|
Make md_is_html_block_end_condition() reuse the same data...
... as md_is_html_block_start_condition() for the type 1 so that all
tags are used consistently there.
Fixes #207.
|
|
319631f6
|
2024-01-08T21:52:30
|
|
Don't merge multiple HTML blocks together.
Fixes #202.
|
|
6ef3be6e
|
2024-01-08T20:09:57
|
|
`MD_FLAG_HARD_SOFT_BREAKS` (#193)
|
|
f554bf11
|
2024-01-08T20:55:54
|
|
Don't trim HTML block lines (MD_LINE_HTML) (#206)
Markdown 0.30 doesn't mandate right-trimming the contents of HTML lines.
Doing so is more work and breaks output compatibility with cmark, tested
with https://github.com/commonmark/cmark/commit/9393560.
|
|
132c29dc
|
2024-01-08T19:31:37
|
|
Allow indented code block to follow any block except paragraph without a blank line.
Fixes #200.
|
|
601c8ab7
|
2024-01-08T19:06:04
|
|
Restore parent's block indentation when interrupting a list item with double blank line.
Fixes #190.
|
|
28f253d7
|
2024-01-08T18:18:51
|
|
Fix some gcc warnings with -pedantic.
Fixes #187.
|
|
f7c8db75
|
2022-01-14T11:04:02
|
|
md_rollback: Fix dummization of virtual closers.
Fixes #173.
|
|
6abb7789
|
2022-01-14T10:13:28
|
|
Remove debug messages left by mistake in the previous commit.
|
|
62b60979
|
2022-01-14T10:00:09
|
|
Reset TABLECELLBOUNDARIES with ordinary opener chains.
This is needed because special handling of '|' is now done also if the
wiki-links extension is enabled so the chain is populated even with that
extension.
Fixes #174.
|
|
db9ab417
|
2022-01-12T16:16:00
|
|
Improve wiki-link parsing.
* md_rollback: Restore dummy marks changed to virtual zero-length
closers.
* md_analyze_links: Be more careful in how we roll back the contents
of a full wiki link (`[[destination|label]]`). The destination has to
be rolled back completely (MD_ROLLBACK_ALL) while the label only with
MD_ROLLBACK_CROSSING.
Fixes #173.
|
|
8dd35762
|
2022-01-11T20:53:04
|
|
md_analyze_dollar: Simplify the function.
|
|
4358c40a
|
2022-01-11T10:28:06
|
|
md_lookup_line: Advance to the next line even if the offset...
falls into a gap between two lines, instead of returning NULL.
Fixes NULL dereference in md_is_link_reference(). This was a regression
in 2e9b13cc512b5984b010a7934253702a6763f4f7.
|
|
c058e82c
|
2022-01-10T12:34:57
|
|
md_is_table_underline: Fix detection by the end of file.
This was a regression in a8bb4d3020eb1cfa07f01241c2aa668d91011cb5.
|
|
b42e7f5c
|
2022-01-10T11:41:25
|
|
md_resolve_links: Avoid link ref. def. lookup if...
we know that the bracket pair contains nested brackets. That makes
the label invalid anyway, therefore we know that there is no link ref.
def. to be found anyway.
In case of heavily nested bracket pairs, the lookup could lead to
quadratic parsing times.
Fixes #172.
|
|
2e9b13cc
|
2022-01-10T03:10:43
|
|
md_lookup_line: New function.
The function performs a binary search over an array of MD_LINE structs
to find the line the given offset lives on.
Replaced a few linear scans for such lines with a call to this function.
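
A minimal sketch of such a lookup, in Python for brevity. Lines are modeled as `(beg, end)` offset pairs sorted by position; treating `end` as inclusive is an assumption made for this illustration, and the function name is hypothetical:

```python
from typing import List, Optional, Tuple

Line = Tuple[int, int]  # (beg, end) offsets; end is treated as inclusive here


def lookup_line(lines: List[Line], off: int) -> Optional[int]:
    """Binary search for the index of the line containing offset off.

    Returns None when the offset falls into a gap between two lines or
    outside of all lines.
    """
    lo, hi = 0, len(lines) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        beg, end = lines[mid]
        if off < beg:
            hi = mid - 1
        elif off > end:
            lo = mid + 1
        else:
            return mid
    return None
```

Each lookup costs O(log n) instead of the O(n) of a linear scan, which matters when it is called repeatedly over a document with many lines.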
|
|
f436c302
|
2022-01-06T16:21:51
|
|
Fix buffer overflows and other errors found with fuzzing. (#168)
Fix multiple buffer overflows on input found with fuzzing.
|
|
eeb32ecc
|
2022-01-06T16:16:45
|
|
Merge pull request #167 from dtldarek/master
Two buffer overflow fixes.
|
|
a8bb4d30
|
2022-01-06T16:01:55
|
|
md_is_table_underline: Remove requirement for minimal length of a cell underline.
Fixes #169.
|
|
260cd339
|
2021-08-25T15:02:38
|
|
Fix buffer overflow on input found with fuzzing (in c-string format):
"\n# h1\nc hh##e2ked\n\n A | rong__ ___strong \u0000\u0000\u0000\u0000\u0000\u0000\a\u0000\u0000\u0000\u0000\n# h1\nh# #2\n### h3\n#### h4\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\\\n##### h5\n#*#####\u0000\n6"
|
|
933388a6
|
2021-08-25T14:41:49
|
|
This is a fix for a buffer overflow that happens on input found with fuzzing (in c-string format): "\xA9##r[](r[](".
|
|
ab422e83
|
2021-07-15T19:12:49
|
|
md4c-html.h: Fix typo in a comment.
|
|
ccc8b64a
|
2021-07-15T18:56:46
|
|
md_html: Add `~` to the list of characters not escaped in URIs.
Fixes #165.
|
|
82b226ff
|
2021-06-27T18:42:09
|
|
md_is_html_block_start_condition: Accept lower-case HTML declaration.
The change is mandated by the spec v. 0.30.
|
|
d50a0142
|
2021-06-27T18:31:02
|
|
md_is_html_block_start_condition: Update for 0.30.
The spec. 0.30 adds the tag <textarea> into the list of HTML block
start condition type 1.
|
|
e8285942
|
2021-06-14T09:47:17
|
|
Fix MSVC compiler level 3 warnings (#162)
Fix various C4244 warnings with the MSVC compiler for 64 bit
|
|
b2ee4b19
|
2021-04-14T18:27:19
|
|
md_resolve_links: Fix the test for the nested autolink covering whole link text.
This fixes the fix for #152.
|
|
bcb55d0d
|
2021-04-14T09:18:09
|
|
md_resolve_links: Suppress bogus nested permissive autolink.
Fixes #152.
|
|
4fc808d8
|
2021-03-29T12:51:48
|
|
md_analyze_line: Avoid reading 1 byte beyond the input size.
Fixes #155.
|
|
aa654230
|
2021-03-22T14:00:35
|
|
md_enter_child_containers: Propagate list mark character properly.
Fixes #153, #154.
|
|
fe2f2427
|
2021-02-11T11:35:54
|
|
Fix copy&paste error in a comment.
|
|
fd7b5fe0
|
2021-02-05T21:40:47
|
|
md_analyze_line: Fix implicit ending of HTML blocks...
... when the HTML block is not explicitly ended (before the enclosing
container block ends).
Fixes #149.
|
|
2b6ebdfa
|
2021-01-09T11:54:27
|
|
Fix use of the cmake package (#146)
Fix use of the cmake package and its imported targets.
Make sure that the include dir comes with the cmake targets
Put everything under md4cConfig so that the md4c-html can see
md4c.
Use md4c namespace so that the targets become md4c::md4c and
md4c::md4c-html following cmake standards for imported targets.
Fixes #145.
|
|
9ba57ccb
|
2020-12-14T19:53:58
|
|
md_link_label_cmp_load_fold_info: Remove a bogus code.
The input into the function is already guaranteed not to contain new line
characters. (And handling of them in the function was broken anyway.)
|
|
5a44e327
|
2020-12-14T18:59:56
|
|
md_link_label_cmp: Fix the loop end condition.
The old version likely could stop prematurely in a corner case when
there was a Unicode character at the end of either string which
maps into multiple fold info codepoints.
Fixes #142.
|
|
d4a78622
|
2020-12-14T18:49:35
|
|
Minor cleanup.
|
|
701a0626
|
2020-12-14T18:45:54
|
|
Make MD_UNICODE_FOLD_INFO::n_codepoints unsigned.
|
|
a45f839b
|
2020-12-14T12:21:50
|
|
Fix mixed signed/unsigned comparisons
Force both operands to unsigned. n_codepoints does not seem to ever
contain negative offsets anyhow, should it actually be unsigned?
|
|
6dd64346
|
2020-12-14T01:40:40
|
|
Silence "unused parameter" warnings
Merely added a suitable macro. Didn't refactor any code to
actually figure out why the parameters were not used.
|
|
569defae
|
2020-12-14T01:25:26
|
|
Silence -Wimplicit-fallthrough warnings
Use a macro that dispatches to the compiler-specific magic
to silence implicit fallthrough warnings when the fallthrough
was actually intended. The code already featured comments,
so these are actually safe to place.
(Unfortunately, Clang does not recognize any comment as
"fall through" comment, and GCC only recognizes some variations
of "fall through", not "pass through". Moreover, one of the
comments replaced here had a typo...)
|
|
e1b41876
|
2020-12-14T01:24:56
|
|
Enable more warnings when building under GCC/Clang
|
|
26003b88
|
2020-12-04T20:42:22
|
|
md_is_container_mark: Recognize list item marks just before EOF.
We were recognizing the list item marks when a new line or a blank
character follows.
However, given end-of-file implicitly also means an end-of-line, we
should recognize them in that situation too.
Fixes #139.
|
|
3254b7cb
|
2020-11-13T12:02:39
|
|
md_process_table_block_contents: Suppress empty TBODY block generation.
When the table has no body rows, do not call the callback with
MD_BLOCK_TBODY events.
Fixes #138.
|
|
a997cb21
|
2020-10-18T09:34:10
|
|
Add MD_BLOCK_TABLE_DETAIL.
This allows renderers to have the info about table dimension (table
column and row count) in advance and e.g. simplify their memory
allocation strategy.
|
|
4585088a
|
2020-11-13T10:16:34
|
|
md_analyze_permissive_url_autolink: Better GFM compatibility.
The autolinks now allow unmatched parentheses; only the trailing
closing parentheses are handled specially, to deal with the situation
where the autolink is entirely inside outer parentheses.
Somehow our tests were broken and avoided the cases with unmatched
parentheses inside the autolink. That's now fixed and in sync
with the GFM spec too.
Fixes #135.
|
|
c3a18d55
|
2020-11-13T09:27:10
|
|
md_collect_marks: continue -> break
Does not cause any change in behavior: we just avoid needless loop
iterations now.
|
|
baa1dd06
|
2020-11-09T16:02:06
|
|
Fix some English wording in comments.
|
|
125e8e03
|
2020-10-18T10:18:11
|
|
Initializes an uninitialized variable in md_analyze_emph
Fixes the following, reported by clang analysis:
src/md4c.c:3729:61: warning: variable 'opener_index' may be uninitialized when used here [-Wconditional-uninitialized]
MD_MARKCHAIN* opener_chain = md_mark_chain(ctx, opener_index);
^~~~~~~~~~~~
src/md4c.c:3686:25: note: initialize the variable 'opener_index' to silence this warning
int opener_index;
^
= 0
|
|
1a2f4816
|
2020-10-18T10:56:49
|
|
Adds missing field initializers (undefined behavior)
src/md4c.c:5667:72: warning: missing field 'beg' initializer [-Wmissing-field-initializers]
static const MD_LINE_ANALYSIS md_dummy_blank_line = { MD_LINE_BLANK, 0 };
|