
Originally posted by
Shadow Maker
@Ped7g, as a Linux user I am very interested in converting characters in "ENCODING" from UTF-8 to one of these (WIN/DOS). Unfortunately, it does not seem to work. Is it possible to implement? Or maybe I am missing something here. Actually, it would be nice to have a full-fledged iconv or similar, as it is currently limited to Russian encodings...
I'm also on Linux and used to UTF-8 everywhere, so I do get your idea, but I'm not sure what to add to sjasmplus, or how.
As a ZX asm programmer I expect everything to be 8-bit, so a UTF-8 -> 8-bit conversion makes sense, but by what rules? In our ZX demos we often use a custom encoding, not even some old DOS or Windows code page but a completely custom one, so no generic tool will help much with that.
Maybe with Russian texts you have less mixing and use only a few encodings, but I'm still not sure how to define one.
And finally, the sjasmplus project currently doesn't contain any UTF-8 implementation, so if I linked against iconv or similar, it would grow the dependency list; it would have to be a really well-working feature to make that cost worth it.
BTW, if you need some standard Windows/DOS encoding, I guess you can add a pre-build step to your makefile/build script that uses iconv itself: for example, keep the .asm files in UTF-8 and "build" .a80 files from them with an implicit makefile rule that runs iconv, similar to this:
Code:
$ echo "Мне одному кажется" | iconv -f UTF-8 -t CP1251 | hd
00000000 cc ed e5 20 ee e4 ed ee ec f3 20 ea e0 e6 e5 f2 |... ...... .....|
00000010 f1 ff 0a |...|
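and then build the ".a80" with sjasmplus. If you use the UTF-8 chars only within double quotes or comments, it should work, because the plain ASCII part of the source (instructions, labels) has the same bytes in both encodings.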
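A minimal sketch of such a pre-build step as a plain shell function, in case a build script is preferred over a makefile rule. The function name `convert_sources` and its two directory arguments are made up for illustration; only the `iconv -f UTF-8 -t CP1251` call comes from the example above, and the target encoding is an assumption you would adjust.

```shell
#!/bin/sh
# Hypothetical pre-build helper (name and layout are assumptions):
# convert every UTF-8 "*.asm" in src_dir into a CP1251 "*.a80" in out_dir.

convert_sources() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    for src in "$src_dir"/*.asm; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$src" ] || continue
        out="$out_dir/$(basename "$src" .asm).a80"
        # iconv exits non-zero when a character has no CP1251 equivalent,
        # so a bad string stops the build here.
        iconv -f UTF-8 -t CP1251 "$src" > "$out" || return 1
    done
}
```

Something like `convert_sources src build && sjasmplus build/main.a80` would then convert everything and assemble a (hypothetical) main file; the `|| return 1` makes an unconvertible character fail the build instead of silently producing a broken string.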
So under Linux, for a general conversion like UTF-8 -> CP1251, I don't feel sjasmplus needs any change; you can easily work around it (I have no idea whether other OSes have iconv and other powerful tools, maybe it's more difficult on other systems).
But if you guys have some cool ideas for how to bring something even better into sjasmplus, something that would do even things which iconv can't cover and/or make life easier for people who use a completely custom encoding, let me know.
But I can't imagine any particularly nice syntax for a new directive covering these special cases. When I was recently helping on one sjasmplus project which "scrambled" text strings with a custom XOR scheme to hide them from simple view, I ended up writing a macro using the {b adr} memory read, which turned the regular `db "some text"` into the final scrambled bytes in a DUP loop produced by the encoding macro. So even many custom encodings (those following some simple formula for 90% of chars and having only a few special rules) could be done quite easily in sjasmplus with post-processing macros.
It feels to me like it's not very difficult to resolve any of these use cases even with the current sjasmplus, but I have a difficult time imagining a change which would help and also be elegant and worth implementing (except the obvious internal UTF-8 -> CP1251 conversion, but that feels a bit useless to me, as I can already use `iconv` for that).
- - - Updated - - -
edit: to make that command line more complete, including sjasmplus itself...
(just for my own amusement and test)
Code:
$ echo "txt: db \"Мне одному кажется\"" | iconv -f UTF-8 -t WINDOWS-1251 | sjasmplus - --raw=- | hd
SjASMPlus Z80 Cross-Assembler v1.18.2 (https://github.com/z00m128/sjasmplus)
Pass 1 complete (0 errors)
Pass 2 complete (0 errors)
Pass 3 complete
Errors: 0, warnings: 0, compiled: 2 lines, work time: 0.001 seconds
00000000 cc ed e5 20 ee e4 ed ee ec f3 20 ea e0 e6 e5 f2 |... ...... .....|
00000010 f1 ff |..|