#36: file-position broken for utf16 and utf32
---------------------+------------------------------------------------------
 Reporter:  rtoy     |       Owner:  somebody
     Type:  defect   |      Status:  new
 Priority:  minor    |   Milestone:
Component:  Core     |     Version:  2010-01
Resolution:          |    Keywords:
---------------------+------------------------------------------------------
Comment(by rtoy):
One possible solution is to keep track of the number of octets used to create each character. This has a relatively high cost: the count has to be saved for every character, on every input, yet the data is only used by file-position. That seems really wasteful of CPU time and memory, since file-position is probably called much less often than characters are read.
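To make the bookkeeping cost concrete, here is a minimal sketch of the per-character octet tracking idea. It is written in Python purely for illustration and is not CMUCL code; the names `decode_with_octet_counts` and `octet_position` are hypothetical. It assumes that any BOM octets are charged to the first character decoded after them.

```python
import codecs

def decode_with_octet_counts(data, encoding):
    """Decode `data` one octet at a time.

    Returns (chars, counts), where counts[i] is the number of octets
    consumed to produce chars[i]. Octets that produce no character on
    their own (such as a BOM) are charged to the next character.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    chars, counts = [], []
    pending = 0  # octets fed since the last character was produced
    for i in range(len(data)):
        out = decoder.decode(data[i:i + 1])
        pending += 1
        for ch in out:
            chars.append(ch)
            counts.append(pending)
            pending = 0
    return chars, counts

def octet_position(counts, chars_read):
    """Octet offset in the file after `chars_read` characters were read."""
    return sum(counts[:chars_read])

# One BMP character plus one character that needs a surrogate pair.
data = "a\U0001D11E".encode("utf-16")  # Python prepends a BOM here
chars, counts = decode_with_octet_counts(data, "utf-16")
print(counts)
print(octet_position(counts, 1))
```

The extra list holds one integer per character for the lifetime of the buffer, which is the memory overhead the comment objects to.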
Another alternative would be to modify string-encode so that the BOM is not included. But that's a bit tricky too. Either we need a new method for each external format that needs it, or we need to add an extra parameter to the external format method to say that we don't want a BOM. Neither is hard to do, but it is some work to modify every format.
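The "extra parameter" variant might look roughly like the following. This is a Python sketch for illustration only, not the CMUCL external format interface; the function `encode_utf16_be` and its `emit_bom` parameter are hypothetical.

```python
import codecs

def encode_utf16_be(text, emit_bom=True):
    """Encode `text` as big-endian UTF-16, optionally without a BOM.

    `emit_bom` stands in for the extra parameter discussed above: a
    caller that only wants to count the octets for a run of characters
    passes False so the BOM does not inflate the count.
    """
    body = text.encode("utf-16-be")  # this codec never adds a BOM itself
    if emit_bom:
        return codecs.BOM_UTF16_BE + body
    return body

print(len(encode_utf16_be("ab")))                  # BOM included
print(len(encode_utf16_be("ab", emit_bom=False)))  # characters only
```

Every format that can emit a BOM would need the same treatment, which is the per-format work the comment refers to.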
Or maybe string-encode could take a new argument specifying the ef (external format) state. But then we would need a new ef function that gives us an ef state guaranteed not to produce a BOM.
Or, the most hackish but workable solution: look at the output of string-encode. If it begins with a BOM (two octets for utf16, four for utf32), adjust for that. A bit hackish, but it seems doable.
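A minimal sketch of the BOM check, again in Python for illustration rather than as CMUCL code. It assumes the octet count is obtained by re-encoding a run of characters and measuring the result; the name `encoded_length_without_bom` and the `BOMS` table are hypothetical.

```python
import codecs

# BOMs to look for, keyed by encoding name. Only encodings whose
# encoder may prepend a BOM are listed.
BOMS = {
    "utf-16": (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE),
    "utf-32": (codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE),
}

def encoded_length_without_bom(text, encoding):
    """Number of octets `text` occupies, not counting a leading BOM."""
    octets = text.encode(encoding)
    for bom in BOMS.get(encoding, ()):
        if octets.startswith(bom):
            return len(octets) - len(bom)
    return len(octets)

print(encoded_length_without_bom("ab", "utf-16"))
print(encoded_length_without_bom("ab", "utf-32"))
```

The check is keyed by encoding because the little-endian utf32 BOM (FF FE 00 00) begins with the little-endian utf16 BOM (FF FE). Looking only at the leading octets, without knowing the format, could misidentify one as the other.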