To: vim_dev@googlegroups.com Subject: Patch 8.2.2605 Fcc: outbox From: Bram Moolenaar Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ------------ Patch 8.2.2605 Problem: Vim9: string index and slice does not include composing chars. Solution: Include composing characters. (issue #6563) Files: runtime/doc/vim9.txt, src/vim9execute.c, src/testdir/test_vim9_expr.vim *** ../vim-8.2.2604/runtime/doc/vim9.txt 2021-02-19 21:42:51.540789780 +0100 --- runtime/doc/vim9.txt 2021-03-14 18:31:50.342134768 +0100 *************** *** 96,103 **** def CallMe(count: number, message: string): bool - Call functions without `:call`: > writefile(['done'], 'file.txt') ! - You cannot use `:xit`, `:t`, `:k`, `:append`, `:change`, `:insert` or ! curly-braces names. - A range before a command must be prefixed with a colon: > :%s/this/that - Unless mentioned specifically, the highest |scriptversion| is used. --- 96,103 ---- def CallMe(count: number, message: string): bool - Call functions without `:call`: > writefile(['done'], 'file.txt') ! - You cannot use `:xit`, `:t`, `:k`, `:append`, `:change`, `:insert`, `:open` ! or curly-braces names. - A range before a command must be prefixed with a colon: > :%s/this/that - Unless mentioned specifically, the highest |scriptversion| is used. *************** *** 279,286 **** variables, because they are not really declared. They can also be deleted with `:unlet`. ! Variables and functions cannot shadow previously defined or imported variables ! and functions. Variables may shadow Ex commands, rename the variable if needed. Global variables and user defined functions must be prefixed with "g:", also --- 279,286 ---- variables, because they are not really declared. They can also be deleted with `:unlet`. ! Variables, functions and function arguments cannot shadow previously defined ! or imported variables and functions in the same script file. Variables may shadow Ex commands, rename the variable if needed. 
Global variables and user defined functions must be prefixed with "g:", also *************** *** 307,320 **** const myList = [1, 2] myList = [3, 4] # Error! myList[0] = 9 # Error! ! muList->add(3) # Error! < *:final* `:final` is used for making only the variable a constant, the value can be changed. This is well known from Java. Example: > final myList = [1, 2] myList = [3, 4] # Error! myList[0] = 9 # OK ! muList->add(3) # OK It is common to write constants as ALL_CAPS, but you don't have to. --- 307,320 ---- const myList = [1, 2] myList = [3, 4] # Error! myList[0] = 9 # Error! ! myList->add(3) # Error! < *:final* `:final` is used for making only the variable a constant, the value can be changed. This is well known from Java. Example: > final myList = [1, 2] myList = [3, 4] # Error! myList[0] = 9 # OK ! myList->add(3) # OK It is common to write constants as ALL_CAPS, but you don't have to. *************** *** 341,347 **** Using `:call` is still possible, but this is discouraged. A method call without `eval` is possible, so long as the start is an ! identifier or can't be an Ex command. Examples: > myList->add(123) g:myList->add(123) [1, 2, 3]->Process() --- 341,348 ---- Using `:call` is still possible, but this is discouraged. A method call without `eval` is possible, so long as the start is an ! identifier or can't be an Ex command. For a function either "(" or "->" must ! be following, without a line break. Examples: > myList->add(123) g:myList->add(123) [1, 2, 3]->Process() *************** *** 412,418 **** *vim9-curly* To avoid the "{" of a dictionary literal to be recognized as a statement block ! wrap it in parenthesis: > var Lambda = (arg) => ({key: 42}) Also when confused with the start of a command block: > --- 413,419 ---- *vim9-curly* To avoid the "{" of a dictionary literal to be recognized as a statement block ! 
wrap it in parentheses: > var Lambda = (arg) => ({key: 42}) Also when confused with the start of a command block: > *************** *** 696,703 **** used, not "v:false" and "v:true" like in legacy script. "v:none" is not changed, it is only used in JSON and has no equivalent in other languages. ! Indexing a string with [idx] or [idx : idx] uses character indexes instead of ! byte indexes. Example: > echo 'bár'[1] In legacy script this results in the character 0xc3 (an illegal byte), in Vim9 script this results in the string 'á'. --- 697,705 ---- used, not "v:false" and "v:true" like in legacy script. "v:none" is not changed, it is only used in JSON and has no equivalent in other languages. ! Indexing a string with [idx] or taking a slice with [idx : idx] uses character ! indexes instead of byte indexes. Composing characters are included. ! Example: > echo 'bár'[1] In legacy script this results in the character 0xc3 (an illegal byte), in Vim9 script this results in the string 'á'. *************** *** 800,805 **** --- 802,809 ---- The 'edcompatible' option value is not used. The 'gdefault' option value is not used. + You may also find this wiki useful. It was written by an early adoptor of + Vim9 script: https://github.com/lacygoill/wiki/blob/master/vim/vim9.md ============================================================================== *************** *** 1029,1038 **** - Using a number where a string is expected. *E1024* One consequence is that the item type of a list or dict given to map() must ! not change. This will give an error in compiled code: > map([1, 2, 3], (i, v) => 'item ' .. i) ! E1012: Type mismatch; expected list<number> but got list<string> ! Instead use |mapnew()|. ============================================================================== --- 1033,1046 ---- - Using a number where a string is expected. *E1024* One consequence is that the item type of a list or dict given to map() must ! not change. 
This will give an error in Vim9 script: > map([1, 2, 3], (i, v) => 'item ' .. i) ! E1012: Type mismatch; expected number but got string ! Instead use |mapnew()|. If the item type was determined to be "any" it can ! change to a more specific type. E.g. when a list of mixed types gets changed ! to a list of numbers. ! Same for |extend()|, use |extendnew()| instead, and for |flatten()|, use ! |flattennew()| instead. ============================================================================== *************** *** 1084,1090 **** vim9script # Vim9 script commands go here This allows for writing a script that takes advantage of the Vim9 script ! syntax if possible, but will also work on an Vim version without it. This can only work in two ways: 1. The "if" statement evaluates to false, the commands up to `endif` are --- 1092,1098 ---- vim9script # Vim9 script commands go here This allows for writing a script that takes advantage of the Vim9 script ! syntax if possible, but will also work on a Vim version without it. This can only work in two ways: 1. The "if" statement evaluates to false, the commands up to `endif` are *************** *** 1107,1113 **** export class MyClass ... As this suggests, only constants, variables, `:def` functions and classes can ! be exported. {classes are not implemented yet} *E1042* `:export` can only be used in Vim9 script, at the script level. --- 1115,1121 ---- export class MyClass ... As this suggests, only constants, variables, `:def` functions and classes can ! be exported. {not implemented yet: export class} *E1042* `:export` can only be used in Vim9 script, at the script level. *** ../vim-8.2.2604/src/vim9execute.c 2021-03-14 12:13:30.192279488 +0100 --- src/vim9execute.c 2021-03-14 18:35:58.069458973 +0100 *************** *** 985,992 **** } /* ! * Return the character "str[index]" where "index" is the character index. If ! * "index" is out of range NULL is returned. 
*/ char_u * char_from_string(char_u *str, varnumber_T index) --- 985,993 ---- } /* ! * Return the character "str[index]" where "index" is the character index, ! * including composing characters. ! * If "index" is out of range NULL is returned. */ char_u * char_from_string(char_u *str, varnumber_T index) *************** *** 1005,1011 **** int clen = 0; for (nbyte = 0; nbyte < slen; ++clen) ! nbyte += MB_CPTR2LEN(str + nbyte); nchar = clen + index; if (nchar < 0) // unlike list: index out of range results in empty string --- 1006,1012 ---- int clen = 0; for (nbyte = 0; nbyte < slen; ++clen) ! nbyte += mb_ptr2len(str + nbyte); nchar = clen + index; if (nchar < 0) // unlike list: index out of range results in empty string *************** *** 1013,1027 **** } for (nbyte = 0; nchar > 0 && nbyte < slen; --nchar) ! nbyte += MB_CPTR2LEN(str + nbyte); if (nbyte >= slen) return NULL; ! return vim_strnsave(str + nbyte, MB_CPTR2LEN(str + nbyte)); } /* * Get the byte index for character index "idx" in string "str" with length ! * "str_len". * If going over the end return "str_len". * If "idx" is negative count from the end, -1 is the last character. * When going over the start return -1. --- 1014,1028 ---- } for (nbyte = 0; nchar > 0 && nbyte < slen; --nchar) ! nbyte += mb_ptr2len(str + nbyte); if (nbyte >= slen) return NULL; ! return vim_strnsave(str + nbyte, mb_ptr2len(str + nbyte)); } /* * Get the byte index for character index "idx" in string "str" with length ! * "str_len". Composing characters are included. * If going over the end return "str_len". * If "idx" is negative count from the end, -1 is the last character. * When going over the start return -1. *************** *** 1036,1042 **** { while (nchar > 0 && nbyte < str_len) { ! nbyte += MB_CPTR2LEN(str + nbyte); --nchar; } } --- 1037,1043 ---- { while (nchar > 0 && nbyte < str_len) { ! nbyte += mb_ptr2len(str + nbyte); --nchar; } } *************** *** 1056,1062 **** } /* ! 
* Return the slice "str[first:last]" using character indexes. * "exclusive" is TRUE for slice(). * Return NULL when the result is empty. */ --- 1057,1064 ---- } /* ! * Return the slice "str[first : last]" using character indexes. Composing ! * characters are included. * "exclusive" is TRUE for slice(). * Return NULL when the result is empty. */ *************** *** 1079,1085 **** end_byte = char_idx2byte(str, slen, last); if (!exclusive && end_byte >= 0 && end_byte < (long)slen) // end index is inclusive ! end_byte += MB_CPTR2LEN(str + end_byte); } if (start_byte >= (long)slen || end_byte <= start_byte) --- 1081,1087 ---- end_byte = char_idx2byte(str, slen, last); if (!exclusive && end_byte >= 0 && end_byte < (long)slen) // end index is inclusive ! end_byte += mb_ptr2len(str + end_byte); } if (start_byte >= (long)slen || end_byte <= start_byte) *************** *** 3249,3256 **** res = string_slice(tv->vval.v_string, n1, n2, FALSE); else // Index: The resulting variable is a string of a ! // single character. If the index is too big or ! // negative the result is empty. res = char_from_string(tv->vval.v_string, n2); vim_free(tv->vval.v_string); tv->vval.v_string = res; --- 3251,3259 ---- res = string_slice(tv->vval.v_string, n1, n2, FALSE); else // Index: The resulting variable is a string of a ! // single character (including composing characters). ! // If the index is too big or negative the result is ! // empty. 
res = char_from_string(tv->vval.v_string, n2); vim_free(tv->vval.v_string); tv->vval.v_string = res; *** ../vim-8.2.2604/src/testdir/test_vim9_expr.vim 2021-03-10 18:43:05.573396183 +0100 --- src/testdir/test_vim9_expr.vim 2021-03-14 18:37:13.489101329 +0100 *************** *** 2367,2372 **** --- 2367,2401 ---- assert_equal('abcd', g:teststring[: -3]) assert_equal('', g:teststring[: -9]) + # composing characters are included + g:teststring = 'àéû' + assert_equal('à', g:teststring[0]) + assert_equal('é', g:teststring[1]) + assert_equal('û', g:teststring[2]) + assert_equal('', g:teststring[3]) + assert_equal('', g:teststring[4]) + + assert_equal('û', g:teststring[-1]) + assert_equal('é', g:teststring[-2]) + assert_equal('à', g:teststring[-3]) + assert_equal('', g:teststring[-4]) + assert_equal('', g:teststring[-5]) + + assert_equal('à', g:teststring[0 : 0]) + assert_equal('é', g:teststring[1 : 1]) + assert_equal('àé', g:teststring[0 : 1]) + assert_equal('àéû', g:teststring[0 : -1]) + assert_equal('àé', g:teststring[0 : -2]) + assert_equal('à', g:teststring[0 : -3]) + assert_equal('', g:teststring[0 : -4]) + assert_equal('', g:teststring[0 : -5]) + assert_equal('àéû', g:teststring[ : ]) + assert_equal('àéû', g:teststring[0 : ]) + assert_equal('éû', g:teststring[1 : ]) + assert_equal('û', g:teststring[2 : ]) + assert_equal('', g:teststring[3 : ]) + assert_equal('', g:teststring[4 : ]) + # blob index cannot be out of range g:testblob = 0z01ab assert_equal(0x01, g:testblob[0]) *** ../vim-8.2.2604/src/version.c 2021-03-14 16:20:33.158928849 +0100 --- src/version.c 2021-03-14 18:20:48.259532113 +0100 *************** *** 752,753 **** --- 752,755 ---- { /* Add new patch number below this line */ + /**/ + 2605, /**/ -- Your company is doomed if your primary product is overhead transparencies. 
(Scott Adams - The Dilbert principle) /// Bram Moolenaar -- Bram@Moolenaar.net -- http://www.Moolenaar.net \\\ /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\ \\\ an exciting new programming language -- http://www.Zimbu.org /// \\\ help me help AIDS victims -- http://ICCF-Holland.org ///
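
Illustration (not part of the patch): the change in src/vim9execute.c replaces MB_CPTR2LEN() with mb_ptr2len(), so that a character index steps over a base character together with the composing characters that follow it. Below is a minimal standalone sketch of that difference. It does not use Vim's multi-byte code: the helper names cp_len(), char_len() and nth_char() are hypothetical, only UTF-8 is handled, and "composing" is assumed to mean just the range U+0300..U+036F, which is narrower than what Vim recognizes.

```c
#include <stddef.h>
#include <string.h>

// Byte length of the single UTF-8 code point at "p"; 1 for an invalid
// sequence.  Roughly the role MB_CPTR2LEN() had before the patch.
static int cp_len(const unsigned char *p)
{
    int len, i;

    if (*p < 0x80)
        return 1;
    if ((*p & 0xE0) == 0xC0)
        len = 2;
    else if ((*p & 0xF0) == 0xE0)
        len = 3;
    else if ((*p & 0xF8) == 0xF0)
        len = 4;
    else
        return 1;
    for (i = 1; i < len; ++i)
        if ((p[i] & 0xC0) != 0x80)  // also stops at the terminating NUL
            return 1;
    return len;
}

// Decode the code point of "len" bytes at "p" ("len" comes from cp_len()).
static unsigned decode(const unsigned char *p, int len)
{
    unsigned c;
    int i;

    if (len == 1)
        return p[0];
    c = p[0] & (0xFFu >> (len + 1));
    for (i = 1; i < len; ++i)
        c = (c << 6) | (p[i] & 0x3Fu);
    return c;
}

// ASSUMPTION: only U+0300..U+036F counts as composing in this sketch.
static int is_composing(unsigned c)
{
    return c >= 0x0300 && c <= 0x036F;
}

// Byte length of the character at "p" including any composing characters
// that follow it.  Roughly the role of mb_ptr2len() after the patch.
static int char_len(const unsigned char *p)
{
    int len;

    if (*p == '\0')
        return 0;
    len = cp_len(p);
    while (p[len] != '\0')
    {
        int n = cp_len(p + len);

        if (!is_composing(decode(p + len, n)))
            break;
        len += n;
    }
    return len;
}

// Copy the character at non-negative "index" of "s" into "out".
// Returns its byte length, or 0 (and an empty "out") when out of range.
static size_t nth_char(const char *s, int index, int with_composing,
                       char *out, size_t outsize)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t n;

    out[0] = '\0';
    for (; index > 0 && *p != '\0'; --index)
        p += with_composing ? char_len(p) : cp_len(p);
    if (*p == '\0')
        return 0;
    n = (size_t)(with_composing ? char_len(p) : cp_len(p));
    if (n >= outsize)
        return 0;
    memcpy(out, p, n);
    out[n] = '\0';
    return n;
}
```

With the string "e" followed by U+0301 and "x", nth_char(s, 0, 1, ...) yields the "e" together with its accent (3 bytes) and index 1 yields "x". With with_composing set to 0, index 0 yields only the bare "e" and index 1 yields the detached accent, which is the behavior the patch moves away from.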