kc3-lang/brotli/scripts/dictionary/step-02-rfc-to-bin.py

    Commit

  • Author : Eugene Kliuchnikov
    Date : 2021-08-31 14:07:17
    Hash : 0e42caf3
    Message : Migrate to github actions (#920)
              Not all combinations are migrated to the initial configuration; corresponding TODOs added.
              Drive-by: additional combinations uncovered minor portability problems -> fixed
              Drive-by: remove no-longer used "script" files.
              Co-authored-by: Eugene Kliuchnikov <eustas@chromium.org>

  • scripts/dictionary/step-02-rfc-to-bin.py
  • # Step 02 - parse RFC.
    #
    # Static dictionary is described in "Appendix A" section in a hexadecimal form.
    # This tool locates dictionary data in RFC and converts it to raw binary format.
    
    import re
    
    rfc_path = "rfc7932.txt"
    
    with open(rfc_path, "r") as rfc:
      lines = rfc.readlines()
    
    # A data line is indented by six spaces and holds exactly 64 hex digits (32 bytes).
    re_data_line = re.compile("^      [0-9a-f]{64}$")
    
    appendix_a_found = False
    dictionary = []
    for line in lines:
      if appendix_a_found:
        if re_data_line.match(line) is not None:
          data = line.strip()
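          # Decode the 32 two-digit hex values on this line into byte values.
          for i in range(32):
            dictionary.append(int(data[2 * i:2 * i + 2], 16))
          # 122784 is the full size of the static dictionary in bytes; stop once reached.
          if len(dictionary) == 122784:
            break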
      else:
        if line.startswith("Appendix A."):
          appendix_a_found = True
    
    bin_path = "dictionary.bin"
    
    with open(bin_path, "wb") as output:
      output.write(bytearray(dictionary))
    
    print("Parsed and saved " + str(len(dictionary)) + " bytes to " + bin_path)
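
The per-line conversion above can also be sketched in isolation. The snippet below is only an illustration, not part of the script: the helper name `parse_data_line` and the sample line are made up for the example, and it assumes Python 3 (for `bytes.fromhex`). It applies the same pattern as `re_data_line` and decodes one matching line into 32 bytes.

```python
import re

# Same pattern as the script: six spaces of indent, then 64 lowercase hex digits.
RE_DATA_LINE = re.compile("^      [0-9a-f]{64}$")


def parse_data_line(line):
    """Return the 32 bytes encoded by one data line, or None if it does not match.

    Hypothetical helper for illustration; the original script inlines this logic.
    """
    if RE_DATA_LINE.match(line) is None:
        return None
    # bytes.fromhex decodes pairs of hex digits, like int(data[2*i:2*i+2], 16) per pair.
    return bytes.fromhex(line.strip())


# Made-up sample line (not copied from the RFC): bytes 0x00..0x1f written as hex.
sample = "      " + "".join("{:02x}".format(i) for i in range(32)) + "\n"
parsed = parse_data_line(sample)

# Lines that are not data lines (headings, page footers, prose) are rejected.
rejected = parse_data_line("Appendix A.  Static Dictionary Data\n")
```

Since every data line carries 32 bytes, the 122784-byte dictionary corresponds to 3837 matching lines; counting the matched lines would be one simple way to sanity-check a parse.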