On Sat Feb 24, 2007 at 09:47:15PM +0100, Edi Weitz wrote:
> My guess is that the website sends wrong content-type headers. (Or, in other words, it claims to send UTF-8 but it doesn't.) This is not unusual. See the mailing list archive of the last weeks for similar problems and for workarounds.
> If you still think this is a bug in FLEXI-STREAMS, send a simple, reproducible test case and point out where in the sequence of characters FLEXI-STREAMS thinks it's not UTF-8 anymore although it is.
I believe you are right: the content-type is incorrectly identified. This gets it to work:
  (setf flexi-streams::*SUBSTITUTION-CHAR* (code-char #xA0))
  (setf flexi-streams::*PROVIDE-USE-VALUE-RESTART* t)
  (http-request "http://www.gifttree.com/Christmas/Christmas-gift-idea.html")
I have read about the performance hit associated with setting this up as a default. But it seems to raise some issues, at least for what I'm doing, which is trying to automate updating information about some sites I have no control over. In this case I set it to substitute a replacement for the 'bad' character. Is it possible for there to be more than one bad character? If so, how could that be handled?
And more generally, shouldn't there be a way to configure Drakma so that, even at the cost of a performance hit, it is guaranteed not to die on any HTML that is thrown at it?
Thanks,
--Jeff