On Tue, Oct 18, 2011 at 19:15, Christian von Essen <christian@mvonessen.de> wrote:
Hi,
I use CL to scrape several comic websites and generate a website that collects their daily strips. The (small) program's features:
- Easy definition of comic sources
- Uses xpath to get the comics
- Stores an archive of daily comics
- Generates web pages with the comics for each day
A feature that is IMHO still missing is an easier (maybe interactive) way to specify comics. Currently you had better get your XPath right.
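To make the idea concrete: a comic specification of this kind boils down to a name plus an XPath expression that points at the strip's image. The following is only a minimal sketch in Python using the standard library, not the code from comics.lisp; the comic name, the sample page, and the function name are all hypothetical and chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical comic definitions: name -> XPath locating the strip's <img>.
COMICS = {
    "example-comic": ".//div[@id='comic']/img",
}

# Made-up, well-formed sample page standing in for a downloaded comic site.
SAMPLE_PAGE = """<html><body>
<div id="nav"><img src="/logo.png"/></div>
<div id="comic"><img src="/strips/2011-10-18.png" alt="today"/></div>
</body></html>"""


def find_strip_url(page_source, xpath):
    """Return the src attribute of the first element matching xpath, or None."""
    root = ET.fromstring(page_source)
    node = root.find(xpath)
    return None if node is None else node.get("src")


print(find_strip_url(SAMPLE_PAGE, COMICS["example-comic"]))
```

An expression that matches nothing simply yields None here, which is why a slightly wrong XPath fails silently and an interactive way of building the specification would help. Real pages are rarely well-formed XML, so an actual scraper would need a tolerant HTML parser in front of the XPath step.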
I think it would be possible to promote CL with an application like that.
Could you have a look at the code and give me some hints on style and CL in general, so that the code actually becomes good enough for that purpose?
You can find it here: https://github.com/Neronus/Lisp-Utils/blob/master/comics.lisp
An example of the generated output is here: http://christian.ftwca.de/comics/
Thank you,
Christian
Dear Christian,
I'm interested in your web scraping technology in CL.
I'd like to build a distributed web proxy that persistently records everything one views, so that you can always read and share the pages you like even when the author dies, the servers are taken off-line, or the domain name is bought by someone else and the new owner puts up a new robots.txt telling archive.org not to display the pages anymore.
I don't know if this adventure tempts you, but I think the time is ripe for end-user-controlled, peer-to-peer distributed archival and sharing of information. An obvious application, beyond archival, is a distributed Facebook/G+ replacement.
PS: in shelisp, maybe you could use xcvb-driver:run-program/process-output-stream instead of yet another partial run-program interface. I really would like to see half-baked portability layers die. If you really need input as well as output, I could hack that into the xcvb-driver utility.
—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org