Threads, Quasar, Fibers and DSLs

Tuesday, 16 June 2015

Hi all,

we recently discussed the problem of modelling locks as Java
locks, which blocks our actual threads (or forces us to delegate to
large internal thread pools) just to model the "parking" of a thread
that is visiting the interceptor chain, when all we want is to model a
lock as a boolean, possibly with some metadata such as owners and
timestamps.
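
To make the "lock as a boolean with metadata" idea concrete, here is a minimal sketch. All names (LockRecord, Holder, tryAcquire, release) are hypothetical and not taken from Infinispan; it only assumes that a caller which fails to acquire is rescheduled by some other mechanism instead of being parked on a Java lock.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch: a lock modelled as plain data, never blocking a thread. */
final class LockRecord {

    /** Immutable metadata: who holds the lock and since when. */
    static final class Holder {
        final Object owner;
        final long acquiredAtNanos;

        Holder(Object owner, long acquiredAtNanos) {
            this.owner = owner;
            this.acquiredAtNanos = acquiredAtNanos;
        }
    }

    // null means "unlocked"; the boolean is implied by the presence of a holder
    private final AtomicReference<Holder> holder = new AtomicReference<>();

    /** Returns immediately: true if the owner holds the lock after the call. */
    boolean tryAcquire(Object owner) {
        if (holder.compareAndSet(null, new Holder(owner, System.nanoTime()))) {
            return true;
        }
        Holder current = holder.get();
        // Reentrant for the same owner; anyone else has to retry later
        return current != null && current.owner.equals(owner);
    }

    /** Releases only if the given owner is the current holder. */
    boolean release(Object owner) {
        Holder current = holder.get();
        return current != null
                && current.owner.equals(owner)
                && holder.compareAndSet(current, null);
    }

    boolean isLocked() {
        return holder.get() != null;
    }
}
```

A failed tryAcquire simply returns false, so what happens next (queueing the command, retrying, timing out) becomes an explicit decision of the caller rather than a parked thread.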

We all agreed that this needs to change, and we proposed some
high-level ideas about how to store the "on stack" state in some
container object. We know what we want to achieve, but I haven't yet
seen much progress on the details of how exactly to get there.

One basic first step we discussed is to split the internal contract of
the visitor chain into separate "up" and "down" methods (very much
in JGroups style). That would force us to identify the local variables
which need to be moved into an appropriate container.
We then briefly discussed how such a container should model a stack,
so that different visitors avoid name clashes, and how to do it
efficiently: we wouldn't like to allocate a vector for the stack, nor
use a dictionary mapping names to variables, but at the very least a
custom payload object would need to be coded for each visitor and
appended to the chain. That would have the allocation costs of a
linked list, plus a container object (to allow field access) allocated
for each visitor in the chain.
Pooling is also an option, but really it's all just looking horrible :-)
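
A rough sketch of the "up"/"down" split with per-visitor state follows. The names (TwoPhaseInterceptor, InvocationState, Chain, NamedInterceptor) are invented for illustration and are not the Infinispan or JGroups API. Note that it allocates one array per invocation, which is exactly the kind of cost discussed above; it is only meant to show the shape of the contract.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical contract: the work of each visitor is split in two phases. */
interface TwoPhaseInterceptor {
    /** Runs on the way in; returns what used to live in local variables. */
    Object down(List<String> trace);

    /** Runs on the way back, receiving the state its own down() returned. */
    void up(List<String> trace, Object state);
}

/** One slot per visitor, indexed by position, so names can never clash. */
final class InvocationState {
    final Object[] slots;

    InvocationState(int size) {
        this.slots = new Object[size];
    }
}

final class Chain {
    private final TwoPhaseInterceptor[] interceptors;

    Chain(TwoPhaseInterceptor... interceptors) {
        this.interceptors = interceptors;
    }

    /** After this returns, no thread is needed until goUp() is called. */
    InvocationState goDown(List<String> trace) {
        InvocationState state = new InvocationState(interceptors.length);
        for (int i = 0; i < interceptors.length; i++) {
            state.slots[i] = interceptors[i].down(trace);
        }
        return state;
    }

    /** Unwinds in reverse order, as the Java call stack would have done. */
    void goUp(List<String> trace, InvocationState state) {
        for (int i = interceptors.length - 1; i >= 0; i--) {
            interceptors[i].up(trace, state.slots[i]);
        }
    }
}

/** Toy visitor that only records the order in which it is called. */
final class NamedInterceptor implements TwoPhaseInterceptor {
    private final String name;

    NamedInterceptor(String name) {
        this.name = name;
    }

    @Override
    public Object down(List<String> trace) {
        trace.add(name + ":down");
        return name + "-state";
    }

    @Override
    public void up(List<String> trace, Object state) {
        trace.add(name + ":up(" + state + ")");
    }
}
```

The InvocationState returned by goDown() is the "parked" invocation: it can be stored next to a lock or a pending RPC and handed to goUp() by whichever thread notices that the wait is over.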

I have now been reading about Quasar and Fibers (credit to Mark Little
and Tim Fox for the pointer):
 - http://docs.paralleluniverse.co/quasar/

It seems very interesting, and relevant to our Infinispan discussion.
It does much more than we need, though, and the prospect of
supporting production systems which rely extensively on ASM and modify
definitions like threads and lock owners scares me a bit. So I'm not
advocating that we use Fibers directly, but it would be very nice to:
 - be able to experiment with it
 - see if something similar could be done without needing the bytecode
manipulation
 - maybe use it?

The overall impression I get is that it's a very complex task for us
to split the visitor chain at this point, and equally expensive
to make any such experiment viable, so embarking on an experiment to
evaluate something like Fibers gets the "Epic" JIRA level. But there
might be better ways!

If you look at our RPC Commands, a lot of the code is very repetitive,
following established patterns and conventions. The components of the
visitor chain are a bit less repetitive, but there are still some
patterns.

So, what if we had some basic templates for all this repeating code,
and used a custom DSL to define what a Command or a Visitor is
supposed to do?
I'm not suggesting we invent yet another general-purpose programming
language, but something very ad hoc which takes the input we need and
generates Java sources which we all understand, so issues in the code
generation should be easy to spot (more easily than with ASM!).
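
As a very rough illustration of generating readable Java sources from a small description, consider the sketch below. CommandGenerator and its input format (a class name plus an ordered map of field names to types) are purely hypothetical; a real tool would parse a proper DSL and use real templates, and would also emit the visitor and marshalling code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: turns a tiny command description into Java source. */
final class CommandGenerator {

    /** fields maps field name to Java type, in declaration order. */
    static String generate(String className, Map<String, String> fields) {
        StringBuilder src = new StringBuilder();
        src.append("public final class ").append(className).append(" {\n");

        // One final field per described attribute
        for (Map.Entry<String, String> f : fields.entrySet()) {
            src.append("    private final ").append(f.getValue())
               .append(' ').append(f.getKey()).append(";\n");
        }

        // Constructor taking every field, in the same order
        src.append("\n    public ").append(className).append('(');
        String separator = "";
        for (Map.Entry<String, String> f : fields.entrySet()) {
            src.append(separator).append(f.getValue()).append(' ').append(f.getKey());
            separator = ", ";
        }
        src.append(") {\n");
        for (String name : fields.keySet()) {
            src.append("        this.").append(name).append(" = ").append(name).append(";\n");
        }
        src.append("    }\n}\n");
        return src.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("key", "Object");
        fields.put("value", "Object");
        // Prints the generated source, ready to be reviewed like any other file
        System.out.print(generate("PutCommand", fields));
    }
}
```

Because the output is ordinary source code, a change to the threading model would show up as a reviewable diff in the generated classes rather than as modified bytecode.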

If we had such a tool, it would be sustainable to make across-the-board
changes to the threading model, experiment with things like Fibers,
and generate the visitor code and the correct stack handling without
getting our hands dirty with very boring code when we don't even know
how it will work out.

In addition, such a tool could optionally be enhanced to generate
metadata for tooling such as correctness evaluations and model
checkers, to inject correct trace logging without polluting our
design, and to produce output for visualizers or other debugging
helpers.
More importantly, it would significantly reduce the manpower needed to
experiment with model changes which would otherwise affect too large a
code base. For example, some configurations might not need the
InvocationContext at all, and to push the performance level to the
extreme that case would be better handled by generating an alternative
set of visitors (or by using bytecode manipulation).

Thanks,
Sanne
