I've been trying to convince the people over in the FBP group that this is much easier than it sounds (esp. if you've rolled your own RTOS, as I have done many times).
(I have to leave now, but will answer in more detail tomorrow.)
Here are some slides for a not-yet-presented talk:
https://github.com/guitarvydas/FBP-from-scratch
And here is a raw implementation of such extremely-green threads done in C (sorry :-):
https://github.com/guitarvydas/collate-fbp-classic
Given closures, this should be even easier.
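To illustrate the closure point, here is a minimal sketch in Python. It is not taken from either repo above, and the names `make_component`, `run`, and `demo` are made up for illustration. The assumption is that an "extremely green thread" can be modelled as a closure that owns an input queue and handles one message per call, with a plain loop acting as the dispatcher:

```python
from collections import deque

def make_component(handler):
    """Build one component: a private inbox captured by two closures."""
    inbox = deque()

    def send(msg):
        # Other components call this to deliver a message.
        inbox.append(msg)

    def step():
        # Handle at most one message, then hand control back to the dispatcher.
        if not inbox:
            return False
        handler(inbox.popleft())
        return True

    return send, step

def run(steps):
    """Round-robin dispatcher: keep calling every step until all are idle."""
    busy = True
    while busy:
        busy = False
        for step in steps:
            if step():
                busy = True

def demo():
    out = []
    # Sink component: collects whatever it receives.
    sink_send, sink_step = make_component(out.append)
    # Upstream component: upper-cases each message and forwards it to the sink.
    upper_send, upper_step = make_component(lambda m: sink_send(m.upper()))
    for word in ("a", "b", "c"):
        upper_send(word)
    run([upper_step, sink_step])
    return out
```

Calling `demo()` wires two components together and runs them to quiescence. There is no stack switching and no OS thread involved; each component's state lives in the variables its closures capture.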
'til later
pt