Hi,
I was optimizing my code and noticed that I got a bunch of brute force searches for an extremum of a function of more then one value like so:
(bind (((best-x best-y best-z) (iter outer (for x ...) (iter (for y ...) (iter (for z ...) (in outer (finding (list x y z) maximizing (objective x y z)))))))) ...)
The above code is causing excessive consing in the (list x y z) part and one can't specify (values) instead.
With the attached patch I can write:
(bind (((values best-x best-y best-z) (iter outer (for x ...) (iter (for y ...) (iter (for z ...) (in outer (finding (values x y z) maximizing (objective x y z)))))) ...)
which does not cons and results in faster code.
Regards, Max
Hi, Max Mikhanosha suggests a multiple value extension to FINDING... MAXIMIZING &optional INTO: (finding (values x y z) maximizing (complex x y z))
I have good and bad news about that. Bad news first.
I plan to reject this patch, based on the following critique of the patch: - Generality: This syntax (VALUES ...) pattern based extension does not cover functions returning multiple values e.g. (finding (floor x y) maximizing (foo x y)) - Arbitrary implementation limit to 9 *result-vars* (MULTIPLE-VALUE-LIMIT is the only one that would make sense, but it's not practical) - A lot more complexity added to the source for just one clause. - Unclear yet whether this syntax extension makes sense for other clauses. - &optional INTO is not integrated. What would it look like? INTO (values a b c) INTO ((values a b c) max) INTO ((a b c) max) - Slight bug: clause as expression must define result and constantly yield either all, or only primary value, e.g. (for (values running-x running-y) = (finding ...)) [if that were the only thing, I'd fix it myself before applying the patch]
Now, the good news.
Iterate claims to be an extensible iteration facility. Thus I wonder why many contributors in recent years have spent their time understanding and modifying the low-level implementation, instead of using Iterate's documented extension mechanisms?
Why not use the following definition?
(defmacro-clause (find-values expr maximizing max-expr &optional into vars) "Please don't forget the documentation string here" ;; Not a good design: number of values differs whether or not INTO is used (unless (or vars (eq 'values (if (consp expr) (first expr)))) (error "Expression must start with VALUES: ~S" expr)) (let ((temps (or vars (mapcar #'(lambda (s) (gensym "VALUE")) (rest expr)))) ;;#L(gensym (if (symbolp !1) (symbol-name !1) "VALUE")) (max (gensym "MAX")) (temp (gensym "NEW"))) `(progn (with (,max .,temps)) (let ((,temp ,max-expr)) (when (or (first-time-p) (> ,temp ,max)) (setq ,max ,temp) ;;(multiple-value-setq ,temps ,expr) (dsetq (values .,temps) ,expr))) ,.(if vars () (list `(finally (return (values .,temps))))) )))
This code is much shorter than the proposed patch. If it's not of general usage, it can be embedded in your application code. Your application does not depend on the Iterate maintainers to accept your patch for you to make successful use of Iterate. Doesn't that sound great?
Test cases ;(iter(find-values (values x y z) maximizing (complex x y z))) ;(iter(find-values (values x y z) maximizing (complex x y z) into (a b c))) ;(iter (find-values (floor x y) maximizing (rem x y))) ; error ;(iter (find-values (floor x y) maximizing (rem x y) into (q r))) ;(iter (find-values (floor x y) maximizing (rem x y) into ((the integer q) r)))
Consider possible extension: destructuring: (dsetq (values a (b . c) d) #) ;(iter(find-values (values x y z) maximizing (complex x y z) into (a (b . c) d))) ; breaks WITH ;(iter(find-values (values x y z) maximizing (complex x y z) into (a (b c) d))) ; breaks WITH
What are the drawbacks of this macro w.r.t. a built-in clause? - No type inference (and type-dependent initial values) - No conflict when used together with other default accumulation clauses
The first is a general problem of Iterate's API. The second can be remedied as follows, using the technique described in the Iterate manual. With the update below, a conflict of gatherers will result in a "Duplicate variable: RESULT456" error. Not pretty, but better than silence.
(defmacro-clause (find-values expr maximizing max-expr &optional into vars) "Please don't forget the documentation string here" ;; Not a good design: number of values differs whether or not INTO is used (unless (or vars (eq 'values (if (consp expr) (first expr)))) (error "Expression must start with VALUES: ~S" expr)) (let ((temps (or vars ;; Bind iterate::*result-var* so Iterate knows it's a gatherer ;; and will signal a conflict with other gathering clauses. (cons iterate::*result-var* (mapcar #'(lambda (s) (gensym "VALUE")) (rest (rest expr)))))) ;;#L(gensym (if (symbolp !1) (symbol-name !1) "VALUE")) (max (gensym "MAX")) (temp (gensym "NEW"))) `(progn (with (,max .,temps)) (let ((,temp ,max-expr)) (when (or (first-time-p) (> ,temp ,max)) (setq ,max ,temp) ;;(multiple-value-setq ,temps ,expr) (dsetq (values .,temps) ,expr))) ,.(unless vars (list `(finally (return-from ,iter::*block-name* (values .,temps)))))))) ;(iter(find-values (values x y z) maximizing (complex x y z))(collect x)) ; error ; I think I'll add that to the testsuite as another example.
The latter also shows how to exit from a named loop via iter::*block-name* -- this is not really great, and not documented, but LEAVE does not work inside FINALLY (a bug in the example in the original documentation, recently fixed), because expressions inside finally are not walked (and I don't believe the solution is to walk clauses in FINALLY, as that could lead to infinite loops at run-time with malformed loops).
Note another current bug: Cannot have 2 clauses maximizing &optional into maximizing values &optional into at the same time (one exact prefix of another). Who jumps in to fix this?
Comments are welcome. - general mechanisms for multiple values? - better API for gatherer into the default value(s) vs. iter::*result-var*? - finally (leave foo) & *block-name* issue? - exact prefix bug caretaker?
Regards, Jorg Hohle
Hi,
Well I did not wanted to request a feature/wish-list item without first trying to solve it myself; so included patch was more to start the discussion and wasn't intended to be a finished product. Sorry I did not made that clear in the submission.
Now to the subject matter itself (citations re-arranged to have discussion in a more logical order, with the response to patch critiques in the end)
Iterate claims to be an extensible iteration facility. Thus I wonder why many contributors in recent years have spent their time understanding and modifying the low-level implementation, instead of using Iterate's documented extension mechanisms?
The reason number one is that I got intimidated by the reader macros part, ie I've seen sharp-L and bang-vars thing, and thought too myself "well this seems to be pretty complicated thing right there".
Another reason is that even as you said implementing this with defmacro-clause does not allow (finding-values (values (the fixnum a) (the fixnum b)) type of expression, that would make "MAX" variable type known. Since my whole reason for wanting this functionality was optimizing my inner loop code for speed, that would defeat the purpose.
Comments are welcome.
- general mechanisms for multiple values?
Should be part of the base library (IMHO of course)
What attracted me to iterate in in a first place is ability to do "finding (list a b c d e) maximizing (f a b c d e)" type of clause, and ability to execute it from inner loop.
Finding x,y,z maximizing f(x,y,z) is much more common (well at least in mine) AI/GA/optimization type of code then finding single values.
So while "finding" a one clause among many (iterate (collect/finding)) is probably 80% of all my usage of iterate, so making "finding" more efficient and easy to use is a priority.
- better API for gatherer into the default value(s) vs.
iter::*result-var*?
My *result-var-2..9* business was silly I realize that now. But one can change the main (iter) macro so that it does:
(if *result-var* `(values ,@(ensure-list *result-var)) nil)
And then making it so that when the clause returns multiple values it does something like:
(if (listp *result-var*) ;; other clause already returned (values) (unless (= (length *result-var*) ...num-values-being-returned...) (error "Previous clause(s) returned different number of values")) ;; else (cond ((have-binding-for *result-var*) (error "Previous clause(s) returned single value")) (t ... convert *result-var* to list and add additional vars here ...)
- exact prefix bug caretaker?
don't know enough about this. I assume code that parses this should have some kind of look-ahead added to solve ambiguities like that? I'm not volunteering tho.
Regarding the critiques of the submitted patch, some of them I can solve, let me know if you want me to work on it for a few days with the goal of inclusion. For my own uses I'm pretty happy with a quick-fix I got.
- Generality: This syntax (VALUES ...) pattern based extension does not cover functions returning multiple values e.g. (finding (floor x y) maximizing (foo x y))
This can only be fixed by differentiating on syntax. While I do not believe that finding clause like above would be used very often, I agree it should be supported.
Syntax variants that can be used (note we have to know the number of values being returned by the function):
- finding (values (foo x) (bar x)) ; returns 2 values
vs
- finding (n-values-of 2 (foo x)) ; returns 2 values returned by foo or - finding (2-values-of (foo x))) ; will have to parse symbol name
Also the following shortcut will have to be supported:
- finding (the fixnum (values ....))
being same as enclosing each individual value in fixnum, otherwise code looks too heavy with relatively many values like >4 being returned.. LOOP has similar shortcut (loop for (a b c) fixnum) and its very useful.
- Arbitrary implementation limit to 9 *result-vars* (MULTIPLE-VALUE-LIMIT is the only one that would make sense, but it's not practical)
Yea that was a thinko. Simple making *result-var* a list of var names can be used as a flag that one of the clauses returns (values) which gets rid of 9 vars limit and reduces number of lines changed.
- A lot more complexity added to the source for just one clause.
Its just one clause but not all clauses are created equal. IMHO finding is one of the most important iterate clauses, along with collect, so should get correspondingly more attention.
I can offer some fix for this by eliminating 9 additional result-vars ugliness, also some of the changes were really cosmetic ie I renamed the variables in the function var to vars, expr to exprs etc to reflect the fact that these were now lists.
- Unclear yet whether this syntax extension makes sense for other
clauses.
collect (values a b c) returns 3 lists? collect (values a b c) into (values alist blist clist) collects the same 3 lists into 3 variables?
;; convert a single list of something complicated that have 2 ;; attributes into 2 lists of each attribute
(iter (for elem in complex-list) (for a = (extract-a elem) (for b = (extract-b elem) (collect (values a b))))))
- &optional INTO is not integrated. What would it look like? INTO (values a b c) INTO ((values a b c) max) INTO ((a b c) max)
That one is easy to fix, I just forgot to do it. It will look like so:
INTO ((values a b c) max), ie element of least surprise. You just use (values..) form in the same place as the single variable in 1 variable case.
- Slight bug: clause as expression must define result and constantly yield either all, or only primary value, e.g. (for (values running-x running-y) = (finding ...)) [if that were the only thing, I'd fix it myself before applying the
patch]
Missed that one also, shoulde be fixable as already noted.
Regards, Max
Hi,
Max Mikhanosha wrote:
included patch was more to start the discussion and wasn't intended to be a finished product.
Ah, ok. Indeed, one can wonder whether Iterate could provide a standardized means to return multiple values. The discussion is open.
collect (values a b c) returns 3 lists?
That is too limited to be useful IMHO. In many cases, one wants to return values of different kind, e.g. a sum and a length, or a list. Iterates only mechanism for this is collecting into named variables. I believe no other mechanism would match Iterate's design (e.g. it's not Series or comprehensions), at least I can't think of any.
(if *result-var* `(values ,@(ensure-list *result-var)) nil)
I believe putting *result-var* to different uses, either as symbol or list of symbols, may cause more headaches than good (consider user written extensions as macros). But I have not thought through the situation (lack of time).
- finding (the fixnum (values ....))
being same as enclosing each individual value in fixnum, otherwise code looks too heavy with relatively many values like >4 being returned.. LOOP has similar shortcut (loop for (a b c) fixnum) and its very useful.
There's a comment about this in the code, which seems to exclude exactly this situation: ;;; Possible extension: ;;; When more than one variable can occur, we allow a single ;;; the-expression to cover them all. Unfortunately, this makes ;;; things rather hairy--probably better to avoid it. ;[defun distribute-type-spec commented out] The comment (from J. Amsterdam) does not say what exactly becomes hairy. I have no clue wether Amsterdam means a hairy design or a hairy implementation or whatever.
For my own uses I'm pretty happy with a quick-fix I got.
What do you mean? your patch, my defmacro-clause or something else?
Another reason is that even as you said implementing this with defmacro-clause does not allow (finding-values (values (the fixnum a) (the fixnum b)) type of expression, that would make "MAX" variable type known.
What do you mean? It should be straightforward to support this in my finding-values defmacro-clause, using THE. Generate (with (the type var1)) or (with (foo (the type var1) the ...)), with types taken from either the expression or the INTO variable list. Similarly for the syntax `maximizing (the type (foo))'
The macro slowly grows cumbersome, as standard CL syntax is (the (values type1 ...) expression), but so far my macro special-cased the pattern (VALUES ...) as expressions, as in (find-values (values (the integer (foo)) (the real (bar))) maximizing ...)
Maybe that would make a good tutorial example, as it shows how simple code grows more and more complex with the supported features?
You may wish to explore the THE path (together with iter:declare-variables) for your performance-critical iteration needs.
- finding (n-values-of 2 (foo x)) ; returns 2 values returned by foo
or
- finding (2-values-of (foo x))) ; will have to parse symbol name
What about (finding expr &optional VALUES 2 INTO var-spec) ? But then type decls would become cumbersome to parse, e.g. for (finding (the (values t1 t2) (floor x y)) values 2)
Also note that (the integer (values x y)) is a perfectly legal CL expression, but it's meaning is not to distribute the type across the 2 values. Thus one can argue that it's not allowed to extend the meaning of a legal expression to a different meaning. Such an extension could be plain wrong. E.g. consider a pseudo accessor FROB implemented as a macro, which expands to gethash. Saying (the integer (FROB)) means that I only care about the primary value, not the second "present-p" one, IMHO. (the (values integer integer) (FROB)) would be an error, as the second value is a boolean.
Regards, Jorg Hohle