lib/converters.h

Log

Author Commit Date CI Message
Bruno Haible 9f28054d 2024-12-15T12:52:18 Implement the //NON_IDENTICAL_DISCARD suffix from POSIX:2024. * include/iconv.h.in (ICONV_GET_DISCARD_INVALID, ICONV_SET_DISCARD_INVALID, ICONV_GET_DISCARD_NON_IDENTICAL, ICONV_SET_DISCARD_NON_IDENTICAL): New macros. * lib/converters.h (struct conv_struct): Change type of discard_ilseq to 'unsigned int'. (DISCARD_INVALID, DISCARD_UNCONVERTIBLE): New macros. * lib/iconv.c (iconv_open, iconv_open_into): Change type of discard_ilseq to 'unsigned int'. (iconvctl): Implement ICONV_GET_DISCARD_INVALID, ICONV_SET_DISCARD_INVALID, ICONV_GET_DISCARD_NON_IDENTICAL, ICONV_SET_DISCARD_NON_IDENTICAL. Change the implementation of ICONV_GET_DISCARD_ILSEQ, ICONV_SET_DISCARD_ILSEQ to test/set both bits. * lib/iconv_open1.h: Update comment. Recognize //NON_IDENTICAL_DISCARD. * lib/iconv_open2.h: Update comment. * lib/loop_unicode.h (mb_to_uc_write_replacement): Test the DISCARD_UNCONVERTIBLE bit of discard_ilseq. (unicode_loop_convert): Test the respective bit of discard_ilseq. (unicode_loop_reset): Test the DISCARD_UNCONVERTIBLE bit of discard_ilseq. * lib/loop_wchar.h (wchar_from_loop_convert, wchar_to_loop_convert): Test the DISCARD_INVALID bit of discard_ilseq. * man/iconv_open.3: Mention the //NON_IDENTICAL_DISCARD suffix. Mark as conforming to POSIX:2024. * man/iconv.3: Likewise. * man/iconv_close.3: Mark as conforming to POSIX:2024. * man/iconv.1: Likewise. * man/iconvctl.3: Document ICONV_GET_DISCARD_INVALID, ICONV_SET_DISCARD_INVALID, ICONV_GET_DISCARD_NON_IDENTICAL, ICONV_SET_DISCARD_NON_IDENTICAL. Revise the description of ICONV_GET_DISCARD_ILSEQ, ICONV_SET_DISCARD_ILSEQ. * tests/test-discard.c (test_default, test_translit, test_ignore, test_ignore_translit): Test also the ICONV_GET_DISCARD_INVALID, ICONV_GET_DISCARD_NON_IDENTICAL accessors. (test_nid, test_nid_translit, test_invd, test_invd_translit): New functions. (main): Add test cases with //NON_IDENTICAL_DISCARD suffix. * NEWS: Mention the change.
Bruno Haible e310efbf 2024-12-14T17:00:22 Remove left-overs of preloadable library (removed in 1.16). * include/iconv.h.in: Remove LIBICONV_PLUG conditionals. * lib/iconv.c: Likewise. * lib/converters.h: Likewise. * lib/loop_unicode.h: Likewise. * lib/loop_wchar.h: Likewise. * lib/iconv_open2.h: Likewise.
Bruno Haible a4c1470b 2024-12-13T23:55:13 Distinguish byte-order state and shift-state. Reported by Tomas Kalibera <tomas.kalibera@gmail.com> in <https://lists.gnu.org/archive/html/bug-gnu-libiconv/2024-12/msg00000.html>. * lib/converters.h (struct conv_struct): Add field 'ibyteorder'. * lib/iconv_open2.h: Initialize the ibyteorder field. * lib/ucs2.h (ucs2_mbtowc): Use the ibyteorder field instead of the istate field. * lib/ucs4.h (ucs4_mbtowc): Likewise. * lib/utf16.h (utf16_mbtowc): Likewise. * lib/utf32.h (utf32_mbtowc): Likewise. * tests/test-bom-state.c: New file. * tests/Makefile.in (check): Run test-bom-state. (test-bom-state, test-bom-state.@OBJEXT@): New targets. (clean): Remove test-bom-state. (SOURCE_FILES): Add test-bom-state.c. * NEWS: Mention the change.
Bruno Haible 6549d20c 2023-05-21T00:29:37 Implement GB18030 version 2022. * lib/encodings.def (GB18030): Add alias GB18030:2005. (GB18030:2022): New encoding. * lib/gb18030ext.h (gb18030_2005_ext_2uni_pagefe): Renamed from gb18030ext_2uni_pagefe. (gb18030_2022_ext_2uni_pagefe): New array. (gb18030_2005_ext_mbtowc): Renamed from gb18030ext_mbtowc. (gb18030_2022_ext_mbtowc): New function. (gb18030_2005_ext_wctomb): Renamed from gb18030ext_wctomb. (gb18030_2022_ext_wctomb): New function. * lib/gb18030uni.h (gb18030_2022_charset2uni_pua1, gb18030_2022_charset2uni_pua2): New arrays. (gb18030_2005_uni_mbtowc): Renamed from gb18030uni_mbtowc. (gb18030_2022_uni_mbtowc): New function. (gb18030_2022_uni2charset_pua1, gb18030_2022_uni2charset_pua2): New arrays. (gb18030_2005_uni_wctomb): Renamed from gb18030uni_wctomb. (gb18030_2022_uni_wctomb): New function. * lib/gb18030_2005.h: Renamed from lib/gb18030.h. Update comments. (gb18030_2005_mbtowc): Renamed from gb18030_mbtowc. (gb18030_2005_pua2charset): Renamed from gb18030_pua2charset. (gb18030_2005_wctomb): Renamed from gb18030_wctomb. * lib/gb18030_2022.h: New file, based on lib/gb18030_2005.h. * lib/converters.h: Don't include gb18030.h. Include gb18030_2005.h, gb18030_2022.h. * lib/Makefile.in (SOURCE_FILES): Remove gb18030.h. Add gb18030_2005.h, gb18030_2022.h. * tests/GB18030-2005-BMP.TXT: Renamed from tests/GB18030-BMP.TXT. * tests/GB18030-2005.IRREVERSIBLE.TXT: Renamed from tests/GB18030.IRREVERSIBLE.TXT. * tests/GB18030-2022-BMP.TXT: New file. * tests/Makefile.in (check): Test GB18030:2005 instead of GB18030. Also test GB18030:2022. (clean): Don't remove GB18030.TXT. Instead, remove GB18030-2005.TXT and GB18030-2022.TXT. (SOURCE_FILES): Update. Add GB18030-2022-BMP.TXT. * README: Mention the new encoding. * man/iconv_open.3: Likewise. * NEWS: Likewise.
Bruno Haible 19b6af5e 2023-04-03T04:12:01 Allow overriding the newline conversion for EBCDIC encodings. Reported by Mike Fulton <mikefultonpersonal@gmail.com> in <https://lists.gnu.org/archive/html/bug-gnu-libiconv/2023-04/msg00009.html>. * include/iconv.h.in (ICONV_SURFACE_NONE, ICONV_SURFACE_EBCDIC_ZOS_UNIX): New macros. (ICONV_GET_FROM_SURFACE, ICONV_SET_FROM_SURFACE, ICONV_GET_TO_SURFACE, ICONV_SET_TO_SURFACE): New macros. * lib/converters.h (struct conv_struct): Add the fields isurface, osurface. (swap_x15_x25): New macro. * lib/iconv.c (iconv_open, iconv_open_into): Add local variables from_surface, to_surface. (ALL_SURFACES): New macro. (iconvctl): Adjust ICONV_TRIVIALP implementation. Implement the ICONV_{GET,SET}_{FROM,TO}_SURFACE requests. * lib/iconv_open1.h: Parse a /ZOS_UNIX surface specifier. Set from_surface, to_surface. * lib/iconv_open2.h: Copy the values of from_surface, to_surface into the conversion descriptor. * lib/ebcdic*.h (*_mbtowc): Test the isurface. If requested, call swap_x15_x25 right after fetching an input byte. (*_wctomb): Test the osurface. If requested, call swap_x15_x25 right before storing an output byte. * man/iconvctl.3 (REQUEST VALUES): Document the ICONV_{GET,SET}_{FROM,TO}_SURFACE requests. * src/iconv.c (main): If ICONV_EBCDIC_ZOS_UNIX is set, set the from/to surfaces accordingly. * man/iconv.1 (ENVIRONMENT): New section. * tests/check-ebcdic: New file. * tests/Makefile.in (check): Invoke it. (SOURCE_FILES): Add it. * NEWS: Mention the new functionality.
Bruno Haible 59b4d2b4 2022-01-24T01:31:08 Optimize the EBCDIC table sizes. * lib/converters.h (DEDUPLICATE_TABLES): New macro. * lib/ebcdic1025.h: Deduplicate tables with ebcdic880.h. * lib/ebcdic1123.h: Deduplicate tables with ebcdic1025.h. * lib/ebcdic1132.h: Deduplicate tables with ebcdic838.h. * lib/ebcdic1153.h: Deduplicate tables with ebcdic870.h. * lib/ebcdic1154.h: Deduplicate tables with ebcdic880.h. * lib/ebcdic1155.h: Deduplicate tables with ebcdic1026.h. * lib/ebcdic1156.h: Deduplicate tables with ebcdic1112.h. * lib/ebcdic1157.h: Deduplicate tables with ebcdic1122.h. * lib/ebcdic1158.h: Deduplicate tables with ebcdic1154.h, ebcdic1123.h. * lib/ebcdic1160.h: Deduplicate tables with ebcdic838.h. * lib/ebcdic1164.h: Deduplicate tables with ebcdic1130.h. * lib/ebcdic1165.h: Deduplicate tables with ebcdic870.h. * lib/ebcdic1166.h: Deduplicate tables with ebcdic1154.h. * lib/ebcdic4971.h: Deduplicate tables with ebcdic875.h. * lib/ebcdic12712.h: Deduplicate tables with ebcdic424.h.
Bruno Haible 68ac8a9f 2022-01-23T23:37:30 New EBCDIC encodings. Reported by Ulrich Schwab and Calvin Buckley via Jack Woehr. * NOTES: Mention how to enable EBCDIC encodings. * tests/IBM-*.TXT: New files. * tools/8bit_tab_to_h.c (main): Emit copyright header with year 2022. * tools/Makefile: Add rules for generating ebcdic*.h. * lib/ebcdic*.h: New files, automatically generated by tools/Makefile. * lib/ebcdic838.h: Tweak reverse mapping manually. * lib/ebcdic1160.h: Likewise. * lib/converters.h: Include all ebcdic*.h. * lib/encodings_zos.def: New file. * lib/genaliases2.c: Include encodings_zos.def. * lib/genflags.c: Likewise. * Makefile.devel (lib/aliases_zos.h lib/canonical_zos.h): New rule. (lib/flags.h, totally-clean): Update. * lib/aliases2.h: Include aliases_zos.h. * lib/iconv.c (USE_ZOS): New macro. Include encodings_zos.def, canonical_zos.h. * README, man/iconv_open.3: Document the IBM-* encodings. * tests/Makefile.in (check-extra-yes): Also test the EBCDIC encodings.
Bruno Haible 91f96be0 2021-06-06T11:51:12 Change the license of the library from LGPL 2.0 to LGPL 2.1.
Bruno Haible 3acb1179 2020-04-04T14:58:34 Change the license of the library from LGPL 2.0 to LGPL 2.1.
Bruno Haible e54fc9c1 2018-09-17T18:28:56 Prefer https URLs where possible.
Bruno Haible 40924a62 2016-10-14T03:18:05 Use 'size_t', not 'int', for the length of a string.
Bruno Haible 48f31c74 2012-02-12T20:54:51 Replace FSF snail-mail address with URL.
Bruno Haible 3a33986e 2011-10-24T02:39:35 New encoding ISO-2022-CP-MS.
Bruno Haible fd7d5707 2010-11-24T03:33:29 Implement newer release of BIG5-HKSCS.
Bruno Haible 459ce580 2009-01-24T23:16:06 New converter for CP1131.
Bruno Haible bb8f7987 2008-09-07T23:28:41 More consistent behaviour when invalid input is preceded by a shift sequence.
Bruno Haible 614f279f 2007-05-25T23:41:00 Add support for the Kazakh RK1048 encoding.
Bruno Haible 7edefe50 2006-05-18T12:48:00 Implement newer releases of BIG5-HKSCS.
Bruno Haible 422b3b1d 2006-01-23T13:25:49 New feature: character-dependent substitutions.
Bruno Haible 05a280da 2005-12-15T12:43:45 CP936 is now different from GBK.
Bruno Haible 38981e01 2005-05-23T10:06:44 Implement BIG5-2003 encoding.
Bruno Haible 45bd190c 2005-05-19T17:14:19 Update FSF postal address.
Bruno Haible 9341db94 2005-05-06T11:04:28 Support for PT154 encoding.
Bruno Haible bb98761e 2005-03-29T13:55:27 Implement and document ATARIST converter.
Bruno Haible 719d85b4 2005-03-14T11:24:40 Introduce iconv_hooks.
Bruno Haible 7736228d 2004-07-22T12:49:09 New encoding ISO-8859-11.
Bruno Haible ed9ef091 2002-05-29T14:11:14 New encoding C99.
Bruno Haible e9eef361 2002-05-22T12:41:45 New encodings CP853, TDS565, RISCOS-LATIN1.
Bruno Haible e0569bb6 2002-05-14T17:17:41 New ASCII compatible encodings from IBM.
Bruno Haible efa47f6d 2002-05-15T16:16:17 More DOS encodings.
Bruno Haible 377fe8ed 2002-05-17T12:07:27 New JISX0213 based encodings.
Bruno Haible ca7aa552 2002-05-16T12:01:34 New configure option --enable-extra-encodings. Add the extra encodings and the platform dependent encodings to the testsuite.
Bruno Haible 33057762 2002-05-13T10:03:19 Add KOI8-T encoding.
Bruno Haible 4f99b68d 2002-01-15T12:47:34 Support for "iconv -c".
Bruno Haible e9d9e17d 2001-10-24T11:09:37 Add support for CP1125.
Bruno Haible 7c035296 2001-06-08T13:12:15 Handle Unicode 3.1 tag characters.
Bruno Haible 19bb4fef 2001-05-26T00:31:46 The multibyte to Unicode conversion may now be stateful. Add Unicode normalization to multibyte to Unicode direction for CP1255, CP1258, TCVN.
Bruno Haible db94e408 2001-05-25T19:21:53 Decouple the mbtowc and wctomb calling conventions.
Bruno Haible e91c0ce3 2001-04-12T12:55:41 Add UTF-32 encodings.
Bruno Haible d5ac1c2b 2001-03-20T20:35:46 Update copyright notice.
Bruno Haible e38d9caa 2001-03-01T15:21:15 Add CP775 to the encodings supported on DOS/DJGPP.
Bruno Haible 8c5fb204 2001-03-06T13:43:56 Add support for OSF/1 (Tru64) 5.1.
Bruno Haible 3d31606d 2001-02-26T13:36:17 Add support for DOS encodings.
Bruno Haible 06aa1274 2001-01-05T13:52:21 Add support for CP862.
Bruno Haible a615528b 2000-11-23T19:54:07 Move src/ to lib/, and install the iconv program.
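
The 2024-12-15 entry (9f28054d) changes discard_ilseq in struct conv_struct to 'unsigned int' and introduces the DISCARD_INVALID and DISCARD_UNCONVERTIBLE macros, with the older ICONV_GET_DISCARD_ILSEQ / ICONV_SET_DISCARD_ILSEQ requests now testing and setting both bits. A minimal sketch of how such a two-bit field might behave follows. It is not the library's code: the numeric bit values, the struct conv_sketch type, and all four helper functions are hypothetical, and reading "test both bits" as "true only when both are set" is an interpretation of the log message.

```c
#include <stdbool.h>

/* Assumed bit values; the log names these macros but does not state
   their values. */
#define DISCARD_INVALID        1u  /* discard invalid input sequences */
#define DISCARD_UNCONVERTIBLE  2u  /* discard characters that cannot be converted */

/* Hypothetical stand-in for the relevant field of struct conv_struct. */
struct conv_sketch {
    unsigned int discard_ilseq;
};

/* Set or clear the given bit(s) without touching the others. */
static void set_flag(struct conv_sketch *cd, unsigned int bits, bool on)
{
    if (on)
        cd->discard_ilseq |= bits;
    else
        cd->discard_ilseq &= ~bits;
}

/* Return true if any of the given bit(s) is set. */
static bool get_flag(const struct conv_sketch *cd, unsigned int bits)
{
    return (cd->discard_ilseq & bits) != 0;
}

/* Legacy-style setter: turns both bits on or off together. */
static void set_discard_ilseq(struct conv_sketch *cd, bool on)
{
    set_flag(cd, DISCARD_INVALID | DISCARD_UNCONVERTIBLE, on);
}

/* Legacy-style getter: true only when both bits are set
   (an interpretation, see the note above). */
static bool get_discard_ilseq(const struct conv_sketch *cd)
{
    const unsigned int both = DISCARD_INVALID | DISCARD_UNCONVERTIBLE;
    return (cd->discard_ilseq & both) == both;
}
```

With a single bit field, the per-kind accessors and the combined legacy accessors can share one storage location, which is presumably why the field type changed from a plain flag to 'unsigned int'.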