Antiweb seems to use many of the techniques we'd like for an
Erlang-in-Lisp implementation: processes that communicate through
messages, an efficient event loop to manage a great number of
connections, etc.
[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
A tautology is a thing which is tautological.
---------- Forwarded message ----------
From: <doug(a)hcsw.org>
Date: 2008/7/17
Subject: Invitation to beta-test new Common Lisp webserver
To: fahree(a)gmail.com
Hi Faré,
We spoke earlier on IRC and apparently have similar ideas
about server design so I thought you might be interested
in having a look at a webserver I have been working on:
http://hoytech.com/antiweb/
I'm appending the first section of the "Design of Antiweb"
page of the manual.
Best,
Doug
Antiweb is a webserver written in Common Lisp, C, and Perl by Doug Hoyte and
Hoytech. Antiweb is not a "proof of concept" and is not "exploratory code". We
intend the core design of Antiweb (as laid out in this design paper) to be
stable for the next 10+ years of use.
The two webservers that have had the largest influence on Antiweb4 are nginx and
lighttpd. We studied these and other excellent servers closely while designing
Antiweb4. Another, more obscure server that has influenced Antiweb is fhttpd.
Why another webserver? In our opinion, the biggest problem with the above
servers is that they aren't written in lisp. Many servers that we studied have
grafted on extension languages (e.g., Perl for nginx and Lua for lighttpd).
Antiweb is different. Instead of being a C program that uses some other
language, Antiweb is a lisp program that uses C (and Perl).
Like nginx, lighttpd, fhttpd, and Antiweb3, Antiweb4 is an asynchronous or
event-based server, meaning that a single thread of control multiplexes multiple
client connections. Antiweb is a collection of unix processes. Connections are
transferred between processes with sendmsg(). When this happens, any data that
was initially read from the socket is transferred along with the socket itself.
The socket is always closed in the sending process.
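As a rough illustration of the hand-off described above, here is a minimal Python sketch (not Antiweb's actual C or Lisp code). It assumes Python 3.9+ on a unix system, where socket.send_fds() wraps sendmsg() with an SCM_RIGHTS control message. The names hand_off and take_over are hypothetical.

```python
import socket


def hand_off(channel, conn, already_read):
    """Send conn's descriptor together with the bytes already read from it.

    already_read should be non-empty: on some systems a zero-length stream
    message does not carry ancillary data.
    """
    socket.send_fds(channel, [already_read], [conn.fileno()])
    conn.close()  # the sending process never keeps its copy of the socket


def take_over(channel, max_bytes=4096):
    """Receive one descriptor plus the data that was read before the hand-off."""
    data, fds, _flags, _addr = socket.recv_fds(channel, max_bytes, 1)
    return socket.socket(fileno=fds[0]), data
```

In a real deployment the two ends of channel would live in different processes (for example a hub and a worker) connected by a unix socket; the receiver continues parsing from data before reading anything new from the socket.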
To multiplex connections inside a process, Antiweb uses a state machine data
structure defined in src/libantiweb.h. Antiweb requires either the kqueue() or
epoll() stateful level-triggered event APIs.
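The sketch below shows the general shape of a level-triggered event loop driving one small state machine per connection. It is an assumption-laden illustration, not the structure from src/libantiweb.h: the states, the Conn class, and the toy "uppercase one line" protocol are all invented for the example. Python's selectors.DefaultSelector picks epoll() on Linux and kqueue() on the BSDs, both in level-triggered mode.

```python
import selectors
import socket

READING, WRITING, DONE = "reading", "writing", "done"


class Conn:
    """Per-connection state: which phase we are in and what is buffered."""

    def __init__(self, sock):
        self.sock = sock
        self.state = READING
        self.inbuf = b""
        self.outbuf = b""


def step(sel, conn, events):
    """Advance one connection's state machine after a readiness event."""
    if conn.state == READING and events & selectors.EVENT_READ:
        chunk = conn.sock.recv(4096)
        if not chunk:
            conn.state = DONE  # peer closed before sending a full line
        else:
            conn.inbuf += chunk
            if b"\n" in conn.inbuf:
                conn.outbuf = conn.inbuf.upper()  # toy "response"
                conn.state = WRITING
                sel.modify(conn.sock, selectors.EVENT_WRITE, conn)
    elif conn.state == WRITING and events & selectors.EVENT_WRITE:
        sent = conn.sock.send(conn.outbuf)
        conn.outbuf = conn.outbuf[sent:]
        if not conn.outbuf:
            conn.state = DONE
    if conn.state == DONE:
        sel.unregister(conn.sock)
        conn.sock.close()


def run_once(sel, timeout=1.0):
    """One pass of the loop: wait for events, then step each ready connection."""
    for key, events in sel.select(timeout):
        step(sel, key.data, events)
```

An idle connection costs only its Conn object and a registration with the kernel, which is the property that makes large numbers of inactive keepalive connections cheap.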
* On a 32-bit linux/CMUCL system, 10000 inactive keepalive connections
consume about 3M of user-space memory (in addition to two lisp images).
* The number of inactive keepalive connections has negligible performance
impact on new connections.
There are three modes for sending files: medium, small, and large:
1. Medium: These files are mmap()ed (memory mapped) to avoid copying the
file's data into user-space. The data is copied directly from the filesystem
to the kernel's socket buffer.
2. Small: These files are read into a user-space buffer because a small
read() is often cheaper than mmap()+munmap().
3. Large: Antiweb uses a user-space buffer for large files. This is to avoid
disk-thrashing when serving many large files to clients concurrently (idea
from lighttpd) and to avoid running out of address space on 32-bit systems.
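The three modes can be sketched as a size-based dispatch. This is only an illustration: the excerpt does not give Antiweb's real cut-off sizes, so SMALL_MAX and MEDIUM_MAX below are made-up thresholds, and choose_send_mode and open_for_sending are hypothetical names.

```python
import mmap
import os

SMALL_MAX = 4 * 1024            # hypothetical threshold, not Antiweb's value
MEDIUM_MAX = 16 * 1024 * 1024   # hypothetical threshold, not Antiweb's value


def choose_send_mode(size):
    """Pick a send strategy from the file size alone."""
    if size <= SMALL_MAX:
        return "small"
    if size <= MEDIUM_MAX:
        return "medium"
    return "large"


def open_for_sending(path):
    """Return (mode, payload): bytes, an mmap, or an open file, by mode."""
    mode = choose_send_mode(os.path.getsize(path))
    f = open(path, "rb")
    if mode == "small":
        with f:
            return mode, f.read()  # one cheap read() into user space
    if mode == "medium":
        with f:
            # The mapping stays valid after the file object is closed.
            return mode, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    return mode, f  # large: the caller reads fixed-size chunks into a buffer
```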
Super-size it: Because Antiweb uses a 64-bit off_t type and lisp's unlimited
precision integers on all systems, Antiweb can serve files of any size. It also
supports download resuming for all three file send modes.
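Download resuming is normally driven by the HTTP Range request header; the sketch below assumes that is the mechanism meant here and handles only the simple "bytes=START-" and "bytes=START-END" forms. parse_resume_range is a hypothetical helper. Python integers, like Lisp's, have unlimited precision, so offsets beyond 4 GB need no special care.

```python
import re

# Only single ranges with an explicit start are handled in this sketch.
_RANGE = re.compile(r"bytes=([0-9]+)-([0-9]*)")


def parse_resume_range(header, file_size):
    """Return (start, end_inclusive) for a resumable request, or None."""
    match = _RANGE.fullmatch(header.strip())
    if match is None:
        return None
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else file_size - 1
    end = min(end, file_size - 1)  # clamp to the last byte of the file
    if start > end:
        return None  # nothing left to send from that offset
    return start, end
```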
Antiweb's data structures are designed for pipelining. Antiweb uses vectored I/O
(also known as scatter-gather I/O) along with non-blocking I/O nearly
everywhere. Antiweb's internal message passing protocol uses pipelining also.
For example, an HTTP connection that pipelines two requests for small files
followed by one request for a medium file is responded to with a single writev()
system call consisting of the following:
* The HTTP headers and file contents for the first two small files
* The HTTP headers for the medium file
* As much of the memory mapped medium file as it takes to fill the kernel's
socket buffer.
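A minimal sketch of the single-writev() response described above, using Python's os.writev(). The header format and the helper names response_headers and send_pipelined are assumptions for illustration; Antiweb's real headers and buffer management are not shown in this excerpt.

```python
import mmap
import os
import socket


def response_headers(content_length):
    """Build a minimal, illustrative HTTP/1.1 response header block."""
    return b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n" % content_length


def send_pipelined(sock, buffers):
    """Hand every queued buffer to the kernel with one writev() call.

    Returns the number of bytes accepted, which may be less than the total
    when the socket buffer fills up; the caller keeps the rest for later.
    """
    return os.writev(sock.fileno(), buffers)
```

A caller would queue, in order, the headers and bodies of the two small files, the headers of the medium file, and a memoryview of the mapped medium file, then pass the whole list to send_pipelined once.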
Subsequently, all the generated log messages are written to the hub process with
another writev(). The hub will eventually append the log messages (as well as
any others that queue up) onto the axslog log file.
To see the connection statistics of a worker process, use the -stats command:
# antiweb -stats /var/aw/example.conf
...
Keepalive Time: 65 seconds
Total Connections: 41 HTTP requests: 72 Avg reqs/conn: 1.8
File descriptor usage (estimate): 17/32767
Current Connections: 11
Keepalives: 7 Sending files to: 2
Proxy: Sources: 0 Sinks: 0 Idle: 0
Timers: 0 Hub: 1 Unix Connections: 1
Lingering: 0 Zombies: 0
...
Notice that in addition to the HTTP traffic, there is also a connection to the
hub's unix socket that was opened on start-up, and one other open unix socket.
That other unix socket is you. You created a supervisor connection while asking
for stats info.
-stats will also tell you how hosts are mapped to directories on a worker:
# antiweb -stats /var/aw/example.conf
...
Host -> HTML root mappings:
localhost -> /var/www/testing
example.com -> /var/www/example.com
www.example.com -> /var/www/example.com
...
Although usually we love it, sometimes pipelining is bad. Antiweb deliberately
tears down persistent HTTP connections on certain responses:
* 4XX and 5XX HTTP Errors - This is to prevent blind web vulnerability
scanners like nikto from persisting or pipelining 95+ percent of their
requests.
* Directory Listings - To prevent pipelined recursive crawling.
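The teardown policy above amounts to a small predicate evaluated after each response. A hedged sketch, with keep_connection_open as a hypothetical name:

```python
def keep_connection_open(status, is_directory_listing=False):
    """Decide whether a persistent connection survives this response."""
    if 400 <= status <= 599:
        return False  # errors end the connection, which slows blind scanners
    if is_directory_listing:
        return False  # listings end it, which prevents pipelined crawling
    return True
```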
When finished with a connection, Antiweb will shut down the write direction of
the socket and linger as required by HTTP/1.1. Antiweb always gracefully
degrades for HTTP/0.9 and HTTP/1.0 clients. Antiweb has first-class IPv6
support. If you really do want to pipeline 4XX and 5XX errors, you have two
options:
1. Use Antiweb's rewrite module to change problematic requests into requests
for existing files.
2. Use Antiweb's fast-files module. This is a memory cache that supports
accelerated static content, pre-generation of HTTP headers, negative
caching, and persisting/pipelining 404 errors.
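Returning to the shutdown-and-linger behaviour mentioned before the two options: a lingering close can be sketched as follows. The timeout and discard limit are made-up values, and lingering_close is a hypothetical name; Antiweb does this inside its non-blocking event loop rather than with a blocking read as shown here.

```python
import socket


def lingering_close(sock, timeout=2.0, max_discard=65536):
    """Half-close, then drain what the client still sends before closing."""
    sock.shutdown(socket.SHUT_WR)  # tell the client no more data is coming
    sock.settimeout(timeout)
    discarded = 0
    try:
        while discarded < max_discard:
            chunk = sock.recv(4096)
            if not chunk:
                break  # client closed its side too
            discarded += len(chunk)
    except OSError:
        pass  # timeout or reset: give up lingering
    finally:
        sock.close()
    return discarded
```

Draining before close() avoids a reset that could destroy response data the client has not yet read.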
Antiweb was designed with security in mind from the beginning. Here are some of
the security decisions made during the Antiweb design process:
* Virtual hosts are privilege-separated without proxying. Once the hub has
determined which worker should handle a connection, it transfers the socket
to the worker process and has nothing further to do with the connection.
Worker processes run under different UIDs from the hub (and each other).
Workers are optionally chroot()ed.
* Workers have no access to log files: all log messages are sent to the hub
over the unix socket. This means that a compromised worker process cannot
steal previously created log messages or log messages created by other
workers.
* CGI processes can be restricted with resource limitations.
* Even on lisps without unicode support, Antiweb4 guarantees all internal
data and filenames are UTF-8 encoded. This includes verifying all code-points
are in their shortest possible representation and that there are no otherwise
invalid surrogates.
* Antiweb processes never try to clean up or recover in the event of an
unexpected condition. A process cannot do that because it has failed. Some
other process that hasn't failed will clean up after it.
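To illustrate the UTF-8 guarantee in the list above: a strict validator must reject overlong (non-shortest) encodings and encoded surrogates. The sketch below leans on Python 3's built-in strict decoder, which enforces both rules; it is not Antiweb's validator, and is_strict_utf8 is a hypothetical name.

```python
def is_strict_utf8(data):
    """True only for well-formed UTF-8: shortest form, no surrogates."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True
```

The shortest-form rule matters for security because, for example, the two bytes C0 AF are an overlong encoding of "/" that could slip past a naive path check.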
Antiweb also includes an experimental new technology for constructing webpages
called Anti Webpages. These are Perl-inspired programs that let you draw page
layouts with significant whitespace, glue together HTML/CSS/Javascript, and
more.
Antiweb was created for admins, by admins. Please let us know any ways you think
it could be better.
I'm just back from vacation, and will be more available for work
on Erlang-in-Lisp.
I have been asked to submit a mid-term review of Erlang-in-Lisp. Matt, can
you produce a short document that describes what has been done already, and
what you intend to focus on next?
NB: This post was seen on Planet Lisp:
http://www.foldr.org/~michaelw/log/programming/lisp/erlang-common-lisp
Someone else already seems to be tackling the part where a Lisp client
talks to an Erlang server, but it looks like it uses an ad-hoc
TCP/IP protocol rather than the native Erlang wire protocol. Distel
looks more promising as a code base for future Lisp and Erlang
integration.
[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
The common argument that crime is caused by poverty
is a kind of slander on the poor.
-- H. L. Mencken
For the last several weeks I've been doing my work in the epmd branch.
I merged all those changes into the master branch tonight and deleted
the epmd branch. If you were on the epmd branch you can:
git checkout master   # gets you back on the master branch
git pull http://common-lisp.net/project/erlang-in-lisp/git master   # gets the latest changes
Or if you are on the master branch already, the previous pull (or make
pull) will fetch the latest changes as usual. This is still pretty
SVN-ish behavior, but hey, it works for now.
--matt