I emailed Edi a week or so ago with some half-baked ideas about authorization/authentication and hunchentoot. He suggested continuing this discussion on the list, so I'm forwarding his initial comments below. For some background here, my initial approach to authorization with hunchentoot can be found at:
http://git.cyrusharmon.org/cgi-bin/gitweb.cgi?p=hunchentoot-auth.git
Quoting my initial message, (some of) the motivation for this is to have:
* HTML-based login, rather than the standard browser authentication dialog box
* some notion of users/groups/realms
* an easy way to serve content to:
** non-authenticated users
** authenticated but not necessarily authorized for this page users
** authenticated and authorized users
Edi's questions and some answers are:
- How do you enable users of hunchentoot-auth to use their own design for the login and rejection pages?
Yes, this is one of the main issues with the current design. Right now this is handled through the authorized-page macro. It allows for users (developers) to pass in a :login-page-function which should handle that.
- Where do you end up if the login was successful? Is it the same page for all users?
authorized-page is a macro that wraps each authorized page, so you end up on the selected page. Not the most elegant way to do things. I'd like to remove the use of the macro here and tap into ht's dispatch machinery to do this more elegantly.
- Do you have a mechanism where, if I enter the URL of a page which needs authorization, I first have to log in but will then be redirected to the page I wanted to go to in the first place?
Yes, that's what the authorized-page macro does. Again, there's probably a better approach.
This also raises the question how these are stored. Maybe provide hooks or a backend API so that one can use an existing database and/or combine with your own user class? (I.e. the USER class of hunchentoot-auth should probably be prepared to be a mixin.)
For ht-auth, this is stored in the realm class. It would certainly be simple enough to have a generic realm class along with a simple-realm or some such that has the current (admittedly trivial) functionality of storing the users/password-hashes in a local file.
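As a rough illustration of the generic-realm idea, here is a minimal sketch in plain Common Lisp. Every name in it (user-mixin, realm, simple-realm, find-user, add-user, toy-hash, check-password) is hypothetical and not taken from hunchentoot-auth, the store is an in-memory hash table rather than a local file, and SXHASH stands in for a real password hash purely to keep the example self-contained.

```lisp
;; Hypothetical sketch: none of these names come from hunchentoot-auth.

(defclass user-mixin ()
  ((name :initarg :name :reader user-name)
   (password-hash :initarg :password-hash :accessor user-password-hash))
  (:documentation "Meant to be mixed into an application's own user class."))

(defclass realm () ()
  (:documentation "Abstract storage backend for users."))

(defgeneric find-user (realm name)
  (:documentation "Return the user called NAME in REALM, or NIL."))

(defgeneric add-user (realm user)
  (:documentation "Store USER in REALM and return it."))

(defclass simple-realm (realm)
  ((users :initform (make-hash-table :test #'equal) :reader realm-users))
  (:documentation "Trivial in-memory backend. A file- or database-backed
realm would specialize the same generic functions."))

(defmethod find-user ((realm simple-realm) name)
  (values (gethash name (realm-users realm))))

(defmethod add-user ((realm simple-realm) (user user-mixin))
  (setf (gethash (user-name user) (realm-users realm)) user))

(defun toy-hash (password)
  "Placeholder only: SXHASH is NOT a secure password hash."
  (sxhash password))

(defun check-password (realm name password)
  "Return T if NAME exists in REALM and PASSWORD matches the stored hash."
  (let ((user (find-user realm name)))
    (and user
         (eql (user-password-hash user) (toy-hash password))
         t)))
```

An application could then define its own class, e.g. (defclass app-user (user-mixin) (...)), and a database-backed realm would only need its own FIND-USER and ADD-USER methods.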
Further thoughts/comments on authorization and hunchentoot from the list?
thanks,
Cyrus
On Apr 11, 2008, at 12:41 AM, Edi Weitz wrote:
Hi Cyrus,
I've looked at this (a bit!) now and I'm adding my thoughts about the issue below. Sorry for the delay.
On Thu, 3 Apr 2008 11:52:32 -0700, Cyrus Harmon cyrus@cyrusharmon.org wrote:
The issue is authentication. Hunchentoot kindly provides some basic services, but I'd like a more full-featured authentication mechanism with features like:
- HTML-based login, rather than the standard browser authentication dialog box
- some notion of users/groups/realms
- an easy way to serve content to:
** non-authenticated users
** authenticated but not necessarily authorized for this page users
** authenticated and authorized users
I agree that all of this is usually needed/wanted in a "modern" web application and I've often implemented at least parts of this myself.
Possible problems that I'm seeing here (if you want a flexible and general solution):
- How do you enable users of hunchentoot-auth to use their own design for the login and rejection pages?
- Where do you end up if the login was successful? Is it the same page for all users?
- Do you have a mechanism where, if I enter the URL of a page which needs authorization, I first have to log in but will then be redirected to the page I wanted to go to in the first place?
- an administrative UI for managing users/groups/realms
This also raises the question how these are stored. Maybe provide hooks or a backend API so that one can use an existing database and/or combine with your own user class? (I.e. the USER class of hunchentoot-auth should probably be prepared to be a mixin.)
- a UI for requesting new accounts, possibly with some sort of admin step hook or email based confirmation of accounts
Yep.
Would it make sense to continue this discussion on the mailing list?
Cheers, Edi.
Is this a server function or an app function? By the time you start rolling out full ACL capability, aren't you pretty far removed from the server?
On the app I'm maintaining (non-lisp), data can come in from several sources with different licenses. Some of the data sources give us site-wide licenses, others are limited to certain specific individuals. So it isn't just a question of is the user authorized to see a certain area of the site (and, if so, do they have read/add/edit capabilities within that area), but data searches need to see if this chunk of data has license restrictions which further limit the read/add/edit capability for that specific piece of data.
So a typical internal user will say - log me in and the app knows that this user is limited to area 123. The user asks "tell me everything we know about XYZ in Thailand." The app then looks for everything in the system about XYZ in Thailand which this user is allowed to view and builds a page from that data.
Since "pages" are built on the fly from the database search results, there is no such thing as an "authorized page". A different user who may be on more or fewer or just different data source licenses would see a completely "different" page. (Different in the sense that the data on the page would be different - the url would be exactly the same.) The user could even see a page with no data, just a generic error message that either we have no data on the particular question or they are not authorized for access to whatever data we have on the particular question. Since the webapp only knows that the database search returned no data, it has no idea whether there is no data in the system or just no data accessible by this user's permissions.
If a user has access to multiple areas within the webapp, they can choose the generic home page or an area specific home page, but even those pages are built on the fly. E.g., someone might have worldwide access, but their concentration is on Europe, so their default homepage after logging in shows only recent data updates for Europe.
Then again, I may be missing the whole point here.
Bryan
I would argue that this sort of plumbing sits between the server and the application. It's the kind of infrastructure that multiple applications use and it probably needs to know something about the server in order to function properly. But the whole distinction between application and server is rather nebulous and arbitrary.
What I have in mind is a relatively simple, extensible system for managing users/passwords/groups and for allowing one to serve pages with various scenarios in mind, such as requiring that a user become an authenticated user before seeing certain pages, that different users see different content based on various properties, etc... Clearly more complex applications may have more demanding requirements for this functionality, but a basic, extensible infrastructure sitting on top of hunchentoot seems like it would enable folks to have a leg up when beginning to write the kind of apps you mention.
Cyrus
On Apr 12, 2008, at 3:16 PM, Bryan Emrys wrote:
Is this a server function or an app function? By the time you start rolling out full ACL capability, aren't you pretty far removed from the server?
On the app I'm maintaining (non-lisp), data can come in from several sources with different licenses. Some of the data sources give us site-wide licenses, others are limited to certain specific individuals. So it isn't just a question of is the user authorized to see a certain area of the site (and, if so, do they have read/add/edit capabilities within that area), but data searches need to see if this chunk of data has license restrictions which further limit the read/add/edit capability for that specific piece of data.
So a typical internal user will say - log me in and the app knows that this user is limited to area 123. The user asks "tell me everything we know about XYZ in Thailand." The app then looks for everything in the system about XYZ in Thailand which this user is allowed to view and builds a page from that data.
Since "pages" are built on the fly from the database search results, there is no such thing as an "authorized page". A different user who may be on more or fewer or just different data source licenses would see a completely "different" page. (Different in the sense that the data on the page would be different - the url would be exactly the same.) The user could even see a page with no data, just a generic error message that either we have no data on the particular question or they are not authorized for access to whatever data we have on the particular question. Since the webapp only knows that the database search returned no data, it has no idea whether there is no data in the system or just no data accessible by this user's permissions.
If a user has access to multiple areas within the webapp, they can choose the generic home page or an area specific home page, but even those pages are built on the fly. E.g., someone might have worldwide access, but their concentration is on Europe, so their default homepage after logging in shows only recent data updates for Europe.
Then again, I may be missing the whole point here.
Bryan
Agreed. I got too mono-focused there.
Bryan
On Saturday 12 April 2008 03:25:33 pm Cyrus Harmon wrote:
I would argue that this sort of plumbing sits between the server and the application. It's the kind of infrastructure that multiple applications use and it probably needs to know something about the server in order to function properly. But the whole distinction between application and server is rather nebulous and arbitrary.
What I have in mind is a relatively simple, extensible system for managing users/passwords/groups and for allowing one to serve pages with various scenarios in mind, such as requiring that a user become an authenticated user before seeing certain pages, that different users see different content based on various properties, etc... Clearly more complex applications may have more demanding requirements for this functionality, but a basic, extensible infrastructure sitting on top of hunchentoot seems like it would enable folks to have a leg up when beginning to write the kind of apps you mention.
Cyrus
To me, this seems like pushing Hunchentoot more in the direction of being a web framework than just a webserver. Not that there's anything wrong with that, of course, but it could probably be just as easily done with a library that sits on top and implements a light-weight framework.
Rob
Robert,
Yes, I agree. This certainly doesn't have to be part of hunchentoot. However, I see an opportunity for a lightweight hunchentoot-specific user authentication/authorization package that could provide this functionality. Moreover, some minor tweaks to hunchentoot might make this an easier task. I'd like to stay away from a full-fledged web framework ala UCW or weblocks, however, and I'm willing to 1) make this library totally hunchentoot-specific and 2) if necessary propose modifications to hunchentoot that would facilitate the implementation of this library. In particular, the hunchentoot dispatch stuff, while flexible, could, I think, be improved in ways that would make the implementation of this library more facile. But I'm just brainstorming at the moment and don't have any concrete examples, other than the fact that the meta-dispatch stuff feels like it might be cleaner with some sort of CLOS custom method combination. Then again, it's flexible enough that this can be implemented on top of the existing stuff with little penalty, so perhaps I should explore that instead of just writing windy emails...
Yes, to be clear, this discussion is more about my half-baked ideas for the future of hunchentoot-auth than it is about changes to hunchentoot to include some user authentication framework.
Thanks for listening,
Cyrus
On Apr 12, 2008, at 5:45 PM, Robert Synnott wrote:
To me, this seems like pushing Hunchentoot more in the direction of being a web framework than just a webserver. Not that there's anything wrong with that, of course, but it could probably be just as easily done with a library that sits on top and implements a light-weight framework.
Rob
On 13 apr 2008, at 03:21, Cyrus Harmon wrote:
This certainly doesn't have to be part of hunchentoot.
Perhaps this can be resolved by considering Hunchentoot a web-server and, like Robert said, not a web application framework; if this deals with HTTP details (like HTTP layer authentication and authorization) the answer is simply 'yes'. If your authorization and authentication is not at the HTTP layer, I would say it is part of your application and not HTTP and thus should remain there.
Although a covers-everything web framework has allure at first, I've often experienced disappointment after actually using some, due to unlayered or incomplete design forcing you to do it their way or having to use horrible workarounds and hacks that make you feel dirty.
However, there is definitely a place for frameworks that sit on top of a generic web-server with a good HTTP layer API. I hope Hunchentoot will remain such a web-server.
In fact, I firmly believe there is even a place for a layer in-between (think Python's WSGI but perhaps richer) so that web frameworks can be written in a web-server agnostic way and can perhaps even cooperate and nest within each other.
Woops, sorry for hijacking your thread with that - I'll get off my soapbox now ;)
-Arjan
On Apr 13, 2008, at 03:21, Cyrus Harmon wrote:
I'd like to stay away from a full-fledged web framework ala UCW or weblocks
Yes, please. I use Hunchentoot primarily as a thin layer above HTTP, which shields me from dealing with the raw bytes on the wire. Also, I have to say I haven't seen a convincing CL web framework yet. They all seem to be UI centric, whereas I am dealing mostly with resources.
I'm willing to 1) make this library totally hunchentoot-specific and 2) if necessary propose modifications to hunchentoot that would facilitate the implementation of this library.
I have no problem with that. For me, the only reason to prefer something else over HT would be if I couldn't deploy behind mod_proxy, and then I'd probably rather write an adaptor for that situation (FastCGI, WSGI, ...).
In particular, the hunchentoot dispatch stuff, while flexible, could, I think, be improved in ways that would make the implementation of this library more facile.
I actually find it overly flexible. (This is perhaps another of the "organic growth" areas.) There are several ways to plug into the dispatcher. At the moment, I am using this:
(defvar *toplevel-routing-table*
  (let ((rt (make-instance 'ht-routing-table)))
    (shiftf (get-routes rt) hunchentoot:*dispatch-table* rt)
    rt))

(defmethod hunchentoot:dispatch-request ((table routing-table))
  (let ((controller (find-controller table *request*)))
    (handle-request controller *request*)))
However, another option for me would be to just push
(lambda ()
  (hunchentoot:dispatch-request *toplevel-routing-table* *request*))
onto hunchentoot:*dispatch-table*. And I haven't even looked at the meta-dispatcher stuff and starting multiple server instances.
There's probably a way to simplify all this without losing any power or convenience.
Cheers, Michael
BTW: The reasons behind all this:
* I like the mappings between URLs and handlers a little more descriptive than bare function designators, for example, to print out the mapping or appropriate Apache config stanzas. So I use CLOS objects. Alternatively, I could have used (:metaclass funcallable-standard-class).
* I like to be able to rearrange URL mappings while running in development. (make-prefix-matcher "/foo/") is a little too static for my taste.
* I bundle several end points (handlers) together (into a "controller"), because on their own, they don't make sense. Also, the end points don't know anything about the URL they are mapped to.
* I can deploy a single controller several times on different URL routes (e.g., "/~foo/...", "/~bar/...", etc.). The routing dissects the URL and provides parameters to controller and end points. Deploying multiple "web apps" comes for free.
* Authentication is done by Apache, for the moment, because it's convenient and works for files served statically, too.
* Authorization is done by Apache and by controllers (for, say, DB access), because all end points are usually subject to the same rules. End points can do additional checks with finer granularity.
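The route-object idea in the list above can be sketched in plain Common Lisp roughly as follows. This is not Michael's code: the class and function definitions (route, routing-table, add-route, find-route) are guesses made for illustration, a "controller" is reduced to a symbol, and nothing here touches Hunchentoot itself.

```lisp
;; Hypothetical sketch: routes as CLOS objects instead of bare functions.

(defclass route ()
  ((prefix :initarg :prefix :reader route-prefix)
   (controller :initarg :controller :reader route-controller)))

;; Because routes are objects, the mapping can be printed descriptively.
(defmethod print-object ((route route) stream)
  (print-unreadable-object (route stream :type t)
    (format stream "~S -> ~S" (route-prefix route) (route-controller route))))

(defclass routing-table ()
  ((routes :initform '() :accessor get-routes)))

(defun add-route (table prefix controller)
  "Map PREFIX to CONTROLLER. Can be called again while the image is running."
  (push (make-instance 'route :prefix prefix :controller controller)
        (get-routes table))
  table)

(defun find-route (table path)
  "Return the first route whose prefix matches PATH and, as a second value,
the rest of PATH after the prefix. Return NIL, NIL if nothing matches."
  (dolist (route (get-routes table) (values nil nil))
    (let* ((prefix (route-prefix route))
           (end (mismatch prefix path)))
      ;; MISMATCH returns NIL for equal strings, or the index in PREFIX
      ;; where the two strings first differ.
      (when (or (null end) (= end (length prefix)))
        (return (values route (subseq path (length prefix))))))))
```

Deploying one controller on several routes is then just (add-route table "/~foo/" 'home-controller) followed by (add-route table "/~bar/" 'home-controller); the controller only sees the remaining path and never needs to know which prefix it was mounted under.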
On Apr 13, 2008, at 3:49 AM, Michael Weber wrote:
I actually find it overly flexible. (This is perhaps another of the "organic growth" areas.) There are several ways to plug into the dispatcher. At the moment, I am using this:
Yes, I think this is part of the challenge here. There are many ways to plug the sort of functionality I'm thinking of into hunchentoot. I'm not satisfied with the current approach taken by hunchentoot-auth and the myriad of choices for plugging this stuff in is what initially precipitated this discussion.
At the risk of taking this discussion from the philosophical to the practical, allow me to discuss some options for wedging in the authorization stuff.
+ authorized-page macro
The existing approach is an authorized-page-style macro that wraps each page and does the following:
* check whether an https connection is required and, if so, that we're actually using one; otherwise, redirect to an https page on the appropriate port.
* either check that the supplied user and password are correct or that the user's session was properly authenticated. If necessary, squirrel away the user name in a per-session hash-table and set a flag in a per-session hash-table indicating that the user is authenticated.
The disadvantage of this approach is that it requires wrapping the code that generates each page with the authorized-page macro. One can't take arbitrary request handling code and make the page require authorization without somehow wrapping it, or another function that calls it, with this macro.
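To make the shape of such a wrapper concrete, here is a simplified sketch in plain Common Lisp. It is not the hunchentoot-auth implementation: the special variables stand in for real request and session state, the host example.invalid is a placeholder, the password check is left out (only the session flag is consulted), and every name other than authorized-page and :login-page-function is made up for the example.

```lisp
;; Hypothetical stand-ins for request/session state. A real version would
;; read these from the web server rather than from special variables.
(defvar *request-secure-p* nil "True when the request arrived over https.")
(defvar *request-uri* "/" "Path of the page being requested.")
(defvar *session* (make-hash-table :test #'eq) "Per-session key/value store.")

(defun default-login-page ()
  "Fallback login page. Remembers the page that was originally requested."
  (setf (gethash :return-to *session*) *request-uri*)
  (list :login-page :return-to *request-uri*))

(defun log-in (user-name)
  "Mark the session as authenticated and return the page to go back to."
  (setf (gethash :user-name *session*) user-name
        (gethash :authenticated-p *session*) t)
  (gethash :return-to *session* "/"))

(defmacro authorized-page ((&key (ssl t)
                                 (login-page-function '#'default-login-page))
                           &body body)
  "Run BODY only for an authenticated session, over https when SSL is true."
  `(cond ((and ,ssl (not *request-secure-p*))
          ;; Placeholder host; a real version would use the https port.
          (list :redirect
                (concatenate 'string "https://example.invalid" *request-uri*)))
         ((not (gethash :authenticated-p *session*))
          (funcall ,login-page-function))
         (t ,@body)))

;; Each protected page has to be wrapped explicitly.
(defun secret-page ()
  (authorized-page (:ssl nil)
    (list :ok "secret content")))
```

The sketch also shows the drawback described above: secret-page is only protected because its body was written inside the macro, so existing handler code cannot be protected without editing or re-wrapping it.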
+ *meta-dispatcher*
One could rig up *meta-dispatcher* such that it checked for authorization and possibly redirected things along the way. The problem with this is that there is only one meta-dispatcher, so you only get to do this once per hunchentoot instance.
And, of course, one could override the value of *dispatch-table*, which is what the default *meta-dispatcher* returns.
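For comparison, a dispatch-level check can be modelled in a few lines of plain Common Lisp. This is a toy model and not Hunchentoot's API: a request is a plist, a dispatcher is a function of the request that returns a handler or NIL, a dispatch table is a list of dispatchers, and all function names are invented for the example.

```lisp
;; Toy model, not Hunchentoot's API. A handler is a function of the request.

(defun make-prefix-dispatcher (prefix handler)
  "Return a dispatcher that selects HANDLER for paths starting with PREFIX."
  (lambda (request)
    (let* ((path (getf request :path))
           (end (mismatch prefix path)))
      (when (or (null end) (= end (length prefix)))
        handler))))

(defun require-authentication (dispatcher login-handler)
  "Wrap DISPATCHER so that unauthenticated requests get LOGIN-HANDLER."
  (lambda (request)
    (let ((handler (funcall dispatcher request)))
      (cond ((null handler) nil)             ; not our URL, let others try
            ((getf request :user) handler)   ; authenticated, pass through
            (t login-handler)))))            ; ask the user to log in first

(defun dispatch (table request)
  "Call the first handler selected by TABLE, or return :NOT-FOUND."
  (dolist (dispatcher table :not-found)
    (let ((handler (funcall dispatcher request)))
      (when handler
        (return (funcall handler request))))))

(defparameter *protected-table*
  (list (require-authentication
         (make-prefix-dispatcher "/admin/"
                                 (lambda (request)
                                   (declare (ignore request))
                                   :admin-page))
         (lambda (request) (list :login-for (getf request :path))))
        (make-prefix-dispatcher "/"
                                (lambda (request)
                                  (declare (ignore request))
                                  :public-page))))
```

In this model the handler code itself is untouched; the check lives in whatever builds the table, which is the property the macro-based approach lacks.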
+ server-dispatch-table
Similarly to the case of *meta-dispatcher*, one could use the dispatch-table slot of the server instance to hijack the dispatch and check for authentication. But it's not clear to me how the elements of this table should be ordered.
Interestingly, there's a dispatch-request generic function that could be used with a suitably defined class.
Clearly there's plenty of rope for extensibility here; the challenge, for hunchentoot-auth at least, is figuring out which of these hooks to exploit.
(defvar *toplevel-routing-table*
  (let ((rt (make-instance 'ht-routing-table)))
    (shiftf (get-routes rt) hunchentoot:*dispatch-table* rt)
    rt))

(defmethod hunchentoot:dispatch-request ((table routing-table))
  (let ((controller (find-controller table *request*)))
    (handle-request controller *request*)))
However, another option for me would be to just push
(lambda ()
  (hunchentoot:dispatch-request *toplevel-routing-table* *request*))
onto hunchentoot:*dispatch-table*. And I haven't even looked at the meta-dispatcher stuff and starting multiple server instances.
Right. Multiple server-instances is another issue and it becomes important for what I'm doing because I use two server instances, one for http and one for https. There's no built-in infrastructure for managing multiple "servers". One could imagine some sort of meta-server (or renaming the server class to a listener and allowing for the server to have multiple listeners, but I guess that's just a nomenclature issue).
There's probably a way to simplify all this without losing any power or convenience.
Right.
Cheers, Michael
BTW: The reasons behind all this:
- I like the mappings between URLs and handlers a little more descriptive than bare function designators, for example, to print out the mapping or appropriate Apache config stanzas. So I use CLOS objects. Alternatively, I could have used (:metaclass funcallable-standard-class).
I agree. I like having more of the "metadata" kept with the function too.
- I like to be able to rearrange URL mappings while running in development. (make-prefix-matcher "/foo/") is a little too static for my taste.
- I bundle several end points (handlers) together (into a "controller"), because on their own, they don't make sense. Also, the end points don't know anything about the URL they are mapped to.
- I can deploy a single controller several times on different URL routes (e.g., "/~foo/...", "/~bar/...", etc.). The routing dissects the URL and provides parameters to controller and end points. Deploying multiple "web apps" comes for free.
- Authentication is done by Apache, for the moment, because it's convenient and works for files served statically, too.
Hmm... I've taken the perhaps crazy approach of using ht for everything, static files, CGI scripts, etc...
- Authorization is done by Apache and by controllers (for, say, DB access), because all end points are usually subject to the same rules. End points can do additional checks with finer granularity.
Thanks for your comments,
Cyrus