Hi all,
I am using Git to manage a CL project I am working on, and have noticed
that Git has no predefined regular expression for finding "hunk headers"
in CL source.
If you look at 'git diff' output, each hunk (sequence of consecutive lines
with differences noted) has a header line, which is intended to show the
first line of the top-level syntactic object (function definition, class
declaration, etc.) that contains the hunk, so you can quickly see what
function, class, etc. it is within. To find the appropriate header line
for each hunk, Git scans backwards from the top of the hunk with a regular
expression, and takes the first line that matches. The regex to use, of
course, depends on the language of the source file.
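
A rough sketch of that backward scan, written in Python rather than taken from Git's own code. The function name find_hunk_header and the sample source lines are made up for illustration, and the sketch ignores details such as context lines:

```python
import re

def find_hunk_header(lines, hunk_start, pattern):
    """Return the nearest line above index hunk_start that matches
    pattern, or None if no earlier line matches."""
    regex = re.compile(pattern)
    # Walk upwards from the line just before the hunk.
    for i in range(hunk_start - 1, -1, -1):
        if regex.search(lines[i]):
            return lines[i]
    return None

source = [
    "(defun square (x)",
    "  (* x x))",
    "",
    "(defun cube (x)",
    "  ;; the changed line is the next one",
    "  (* x x x))",
]

# The hunk starts at index 5; the nearest line above it that begins
# with an open paren in column 0 is the defun for cube.
print(find_hunk_header(source, 5, r"^\("))
```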
Git has a table of built-in regexes for a number of languages, including
Scheme, but there is neither a generic Lisp entry nor a more specific CL
entry. I am inclined to submit one for inclusion, but I wanted to bounce
it off you folks first.
Here is the current Scheme regex. This is in POSIX Extended ("ERE") syntax:
^[\t ]*(\\(((define|def(struct|syntax|class|method|rules|record|proto|alias)?)[-*/ \t]|(library|module|struct|class)[*+ \t]).*)$
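This is unfortunately not general enough to work for CL: after 'def' it
requires either one of the listed suffixes or a delimiter, so it misses
'defun' (and likewise 'defmacro', 'defvar', 'defgeneric', etc.), though it
does match some CL constructs such as 'defstruct', 'defclass', and
'defmethod'.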
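
A quick way to check this is to run the Scheme pattern over a few CL top-level forms. The sketch below uses Python's re module, which is not POSIX ERE but should behave the same for this particular pattern. It assumes the doubled backslash in the quoted regex is C string escaping, so it is written here as a single "\(". The sample lines are made up:

```python
import re

# The Scheme pattern quoted above, with "\\(" reduced to "\(" on the
# assumption that the doubled backslash is C string escaping.
SCHEME_RE = re.compile(
    r"^[\t ]*(\(((define|def(struct|syntax|class|method|rules|record|proto|alias)?)[-*/ \t]"
    r"|(library|module|struct|class)[*+ \t]).*)$"
)

# Hypothetical CL top-level forms to test against.
CL_LINES = [
    "(defun foo (x)",
    "(defmacro with-foo (&body body)",
    "(defvar *x* 1)",
    "(defgeneric area (shape))",
    "(defstruct point x y)",
    "(defclass shape () ())",
    "(defmethod area ((s shape))",
]

def scheme_matches(line):
    """True if the Scheme hunk-header pattern matches the line."""
    return SCHEME_RE.search(line) is not None

for line in CL_LINES:
    print("match   " if scheme_matches(line) else "no match", line)
```

Only the last three lines match; 'defun', 'defmacro', 'defvar', and 'defgeneric' are all skipped.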
So I started to think about what would be good for CL. Some possibilities:
1. Simply match any line that starts with an open paren in column 0.
The upside of this simple rule is that it allows for arbitrary top-level
construct names. But if you indent your defuns for some reason, it will
overlook them.
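2. Match any line that starts with '(def', even if indented. This could
have false positives, e.g. an indented call to a function whose name
happens to begin with 'def', such as '(default-value x)'.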
3. Match either (1) or (2).
I'm leaning toward (3), but would like to hear your thoughts. Certainly,
we could use a more specific regex that matches only predefined CL
top-level constructs, but this seems wrong to me considering that CL
encourages us to define macros to add such constructs when we see a need.
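
To make the three options concrete, here is one guess at what each regex might look like, tried against a handful of made-up lines. These patterns are only a sketch, not a tested submission; they are checked with Python's re module rather than a POSIX ERE engine, and each wraps the header text in a capture group on the assumption that this is what Git wants:

```python
import re

# Candidate patterns (illustrative guesses, one per option above).
OPTION_1 = r"^(\(.*)$"                   # open paren in column 0
OPTION_2 = r"^[ \t]*(\(def.*)$"          # '(def', possibly indented
OPTION_3 = r"^((\(|[ \t]+\(def).*)$"     # either of the above

SAMPLES = [
    "(defun foo (x)",               # ordinary top-level defun
    "  (defmethod bar ((x t))",     # indented definition
    "(my-define-thing baz",         # user macro in column 0
    "  (default-value x)",          # indented call: false positive for (2) and (3)
    "  (let ((y 1))",               # indented non-definition
]

def which_match(pattern, lines):
    """Return a list of booleans, one per line, True where pattern matches."""
    regex = re.compile(pattern)
    return [regex.search(line) is not None for line in lines]

for name, pat in [("1", OPTION_1), ("2", OPTION_2), ("3", OPTION_3)]:
    print(name, which_match(pat, SAMPLES))
```

Option (1) picks up the user-defined macro but misses the indented defmethod; option (2) does the reverse and also matches the '(default-value x)' call; option (3) matches everything except the indented 'let'.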
-- Scott