    Commit

  • Hash : 463bbeec
    Author : Nick Wellnhofer
    Date : 2022-12-19T18:39:45

    entities: Rework entity amplification checks
    
    This commit implements robust detection of entity amplification attacks,
    better known as the "billion laughs" attack.
    
    We now limit the size of the document after substitution of entities to
    10 times the size before expansion. This guarantees linear behavior by
    definition. There already was a similar check before, but the accounting
    of "sizeentities" (size of external entities) and "sizeentcopy" (size of
    all copies created by entity references) wasn't accurate.
    
    We also need saturation arithmetic since we're historically limited to
    "unsigned long" which is 32-bit on many platforms.
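    
    A minimal sketch of what saturation arithmetic means here. The helper
    name is hypothetical and not necessarily what the patch uses: the
    counter sticks at ULONG_MAX instead of wrapping around to a small
    value that would slip past a later limit check.

```c
#include <limits.h>

/*
 * Hypothetical helper: add val to *dst, clamping at ULONG_MAX
 * instead of wrapping around on overflow.
 */
static void
sketch_saturated_add(unsigned long *dst, unsigned long val) {
    if (val > ULONG_MAX - *dst)
        *dst = ULONG_MAX;   /* the sum would overflow: saturate */
    else
        *dst += val;
}
```

    Comparing against ULONG_MAX - *dst before adding keeps the overflow
    test itself free of overflow.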
    
    Up to 10 MB of substitutions is always allowed, regardless of the
    ratio. This should help use cases like DITA, which have caused problems
    in the past.
    
    The old checks based on the number of entities were removed. Their
    purpose is now served by adding a fixed cost to each entity reference.
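    
    The accounting described above could be sketched roughly as follows.
    All names are hypothetical, the fixed cost of 20 is an illustrative
    value that does not come from this message, "10 MB" is read as 10^7
    bytes, and only substituted bytes are compared against the input
    size, which simplifies the real bookkeeping.

```c
#include <limits.h>

#define SKETCH_MAX_RATIO          10UL        /* from the message: 10x the input */
#define SKETCH_ALLOWED_EXPANSION  10000000UL  /* "10 MB", assumed to be 10^7 bytes */
#define SKETCH_ENT_FIXED_COST     20UL        /* illustrative value, an assumption */

struct sketch_counter {
    unsigned long input_size;     /* bytes before expansion */
    unsigned long expanded_size;  /* bytes charged to entity references */
};

/*
 * Charge one entity reference whose replacement text is entity_len
 * bytes long. Returns 1 if the document should be rejected.
 */
static int
sketch_entity_ref(struct sketch_counter *c, unsigned long entity_len) {
    unsigned long cost = entity_len;

    /* Every reference pays a fixed cost, so even empty entities count. */
    if (cost > ULONG_MAX - SKETCH_ENT_FIXED_COST)
        cost = ULONG_MAX;
    else
        cost += SKETCH_ENT_FIXED_COST;

    /* Accumulate with saturation so the counter never wraps. */
    if (cost > ULONG_MAX - c->expanded_size)
        c->expanded_size = ULONG_MAX;
    else
        c->expanded_size += cost;

    /* Small amounts of substitution are always allowed. */
    if (c->expanded_size <= SKETCH_ALLOWED_EXPANSION)
        return 0;

    /*
     * Divide rather than multiply so that input_size * 10 cannot
     * overflow; integer division leaves a few bytes of slack.
     */
    return c->expanded_size / SKETCH_MAX_RATIO > c->input_size;
}
```

    With a fixed cost per reference, a document made of many references
    to tiny or empty entities still hits the limit, which is the case the
    old count-based checks were meant to cover.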
    
    Entity amplification checks are now enabled even if XML_PARSE_HUGE is
    set. This option is mainly used to allow larger text nodes. Most users
    were unaware that it also disabled entity expansion checks.
    
    Some of the limits might be adjusted later. If this change turns out to
    affect legitimate use cases, we can add a separate parser option to
    disable the checks.
    
    Fixes #294.
    Fixes #345.