Status, problems and future requirements
This is only a proof-of-concept implementation of the rudiments of a specification language. However, it is extensive enough to define a range of models including a variety of synapses and simple networks. This should help inform where further capabilities are needed. Some such areas are described below.
This proof of concept prioritizes various things that other specification languages treat as optional extras to be included in some distant version. In particular:
- There is a proper model for dimensions and units. Units are not just strings: they refer to a structured entity which specifies the dimensionality in terms of Mass, Length, Time and Current (you can't express luminous intensity as yet).
- All equations are checked to make sure they are dimensionally consistent.
- It supports extension and specialization among types (eg "this type has all the properties of that one and a few extras" or "it has all the properties of that one except some are set to particular values so the user only needs to set the rest").
- It supports component prototyping ("this model element is like that one but these bits are different").
- There is a clear distinction between types (categories of thing), components (a thing with a particular set of parameter values) and the model that is run (any number of instances corresponding to each set of parameter values).
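The first two points above can be sketched as follows. A dimension is just a vector of exponents over Mass, Length, Time and Current, and a unit is a power-of-ten scaling of a dimension. The element and attribute names below are illustrative only, not a fixed syntax:

```xml
<!-- Illustrative only: a dimension as exponents of M, L, T and I -->
<Dimension name="voltage" m="1" l="2" t="-3" i="-1"/>
<Dimension name="conductance" m="-1" l="-2" t="3" i="2"/>

<!-- A unit is a reference to a dimension plus a power-of-ten scale,
     not a bare string -->
<Unit symbol="mV" dimension="voltage" powTen="-3"/>
<Unit symbol="nS" dimension="conductance" powTen="-9"/>
```

Because every quantity carries one of these structured dimensions, checking an equation reduces to adding exponent vectors for multiplication and requiring equal vectors for addition, which is what makes the consistency check in the second point mechanical.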
Notably, it does not have any support at present for user-defined functions. Somewhat surprisingly, it is possible that these may not be needed. See for example the implementation of the conventional HH model. A priori, I would have expected any compact expression of this model to require user-defined functions, but it doesn't use any functions and is still relatively compact. One could, of course, introduce a generic function to express the functional form of, for example, the sigmoid used in the example, but it is not clear that this would make it more compact, readable or of lower entropy. At some stage, an external reference to a generic case is actually of higher entropy than a concise local expression with no other dependencies.
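For instance, a sigmoid rate of the kind used in the HH example can simply be written inline where it is needed (variable and attribute names here are hypothetical), so no function definition is required:

```xml
<!-- Hypothetical gate-rate definition: the sigmoid is written out
     inline rather than referenced as a user-defined function -->
<DerivedVariable name="minf" dimension="none"
    value="1 / (1 + exp((vhalf - v) / vscale))"/>
```

A shared function definition would replace the `value` expression with a call, but at the cost of a non-local dependency, which is the entropy trade-off discussed above.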
A number of technical issues exist with the specification and the interpreter. Some should be straightforward to resolve. Others may take more work.
- DerivedVariables currently require a dimension even though this should be deducible from their target
- Paths in derived variables use simple expressions, but only the simplest forms are supported by the interpreter. It needs a smarter XPath-like grammar and support for this in the interpreter*.
- You can't define and use functions yet
- The numerics are trivial: it could do with smarter numerics and code generation. Janino would be a good choice (as used with Catacomb) to dynamically compile component behaviors.
- Error reporting is somewhat cryptic and full of stack traces.
* With respect to accessing variables on the running simulation, a good solution may be to (virtually) expand the component definitions to a full XML tree and use genuine XPath expressions over that tree. This could be more difficult than it sounds because of constructs like the one in example3, which dynamically instantiate component instances (inserting synapses in this case) that are not defined in the component hierarchy itself. This may still work OK because the container for the instances is still present in the XPath, but a path to its contents won't resolve until after a model is built. On the other hand, since the focus here is on synapse modeling rather than on large networks, it is probably reasonable to map the instantiated model to XML and use a standard XPath processor on that.
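To make the gap concrete, the forms currently handled are bare chains of child names, while an XPath-like grammar would also allow predicate-style selection over built instances. Both examples below are hypothetical, using an invented Record element:

```xml
<!-- Currently supported: a simple chain of child names (illustrative) -->
<Record quantity="network/population1/cell3/v"/>

<!-- Not yet supported: predicate-style selection in the XPath manner,
     which only resolves once the model instances have been built -->
<Record quantity="network/population1/cell[3]/v"/>
```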
On another, related point: excluding visualization, parsing and file utilities, the current interpreter is about 4000 lines of Java (7000 with the parsers). I'd guess that, thanks largely to XPath, a model could be mapped to an XML representation of the runnable instantiation via XSLT with a similar, possibly smaller, amount of XSLT. I'm not sure if this would be useful, but it would be very interesting to know. In general, a good rule of thumb seems to be that a specification such as this shouldn't include anything that can't be processed relatively easily in XSLT. If a need for such a thing arises, it could suggest that the concept should be expanded into a "more declarative" form until it can be handled by straightforward XSLT.
A system allowing user-defined types can go wrong in a number of ways. It could fail to work at all. It could prove too hard to use for anyone to bother. It could be formally powerful but too complicated for anyone (else?) to write an interpreter. It could yield model representations that are too messy to appeal to users (high entropy models). It could make it too complicated to do simple things that users expect to be simple. It could force people to think in an unfamiliar way to the extent that they choose to do something else. It could end up as just another programming language.
The last point is a particular concern. After all, a programming language is a pretty powerful user-defined type system: the thing that differentiates it from a model specification language is precisely the restrictions in the latter. If you keep taking restrictions away, at some stage it ceases to achieve other objectives.
Of these, the most likely pitfalls here seem to be that it could require users to think in an unfamiliar way, and that it could become too complicated for anyone to write an interpreter. Both of these issues relate to the three-layer structure involving types, components and instances (for comparison, SBML has just a one-layer structure: models are the same as state instances). As far as I can see, three layers are the minimum for a low-entropy model description capable of expressing the type of models that need to be expressed, but I'm sure others disagree. The main counter-contender seems to be the NeuroSpaces approach with a smooth (rather than layered) hierarchy, which makes a seamless transition from type to instance within a single layer by using prototyping throughout (rather than the two class/instance divisions here).
The need for user defined types in NeuroML parallels in some ways the goals of NineML. However, NineML is layered by design. LEMS is slightly layered, but not very: one could compare the structure available for defining types to the NineML abstraction layer, and the structures for using them to the user layer. However, looking at the list of elements (ignoring the deprecated bits), 95% is to do with defining types, and only one paragraph describes how they are used. If someone only wanted to use types, they would also need some of the information in the types section, such as the syntax of path expressions, so there would be a little more than one paragraph to a "user layer" specification, but it would still be an extremely short document. On the other hand, every example defines a new bunch of types: if these were included as part of the user level specification, then it could rapidly become very large. But a more natural place for these seems to be some catalog of type definitions rather than a specification document. There is scope for selecting a preferred set of types (eg for HH or Kinetic Scheme channels) so you don't get a proliferation of similar but incompatible models, but the best mechanism for doing this is not clear.
Single element type in the user-layer
Given the simplicity of the model specification layer (defining components rather than types), where everything is a component or a parameter value, it could be argued that there should be some segregation into different component types (eg things that produce spikes, things that define connectivity etc). For convenience, I started with custom types for simulation and display elements but rapidly got rid of them. They are slightly easier to implement as custom types, but insidious problems keep cropping up. Eg, the runtime in a simulation specification element should have dimension time, but specific dimensions such as time are only defined in component-space, not type-space, so you can't actually say that a hard-coded component has a parameter with a particular dimensionality. Issues like this strongly suggest that everything in a model should just be a "component" corresponding to a particular user-defined type - no special cases. For expressive convenience, though, models don't have to be written as <Component type="MySynapse" .../>, but simply as <MySynapse .../>. This proves to be simple to implement and makes the models much more readable. But it would also cause confusion if there were any elements allowed that weren't the names of user-defined types - another reason to make everything a component.
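The shorthand works as in the following pair (the parameter names are invented for illustration); both forms denote exactly the same component:

```xml
<!-- Explicit form: a generic Component element naming its type -->
<Component type="MySynapse" gbase="1nS" erev="0mV"/>

<!-- Equivalent shorthand: the element name is the type name -->
<MySynapse gbase="1nS" erev="0mV"/>
```

Since the element name must resolve to a user-defined type for the shorthand to be unambiguous, allowing any element that is not a type name would break this scheme - hence the "everything is a component" rule.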
Specific types for networks and populations not needed
As well as starting with a hard-coded simulation element, the early examples also have hard-coded network and population elements. I had previously thought that this would be hard to avoid but it turns out not to be as bad as expected. The Build element with MultiInstantiate and ForEach children proved sufficient to replace the (albeit trivial) hard-coded population and connectivity elements with user defined types. Whether this extends to more subtle ways of specifying networks remains to be seen, but I suspect that by adding a few more constructs like ForEach (Choose, When, Otherwise, If etc) and using the selection rules effectively one could do most of what would be needed. This relates to the debate as to whether the NineML user layer needs network-specific constructs, and would tend to suggest that it doesn't.
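A population-plus-connectivity definition along these lines might look roughly as follows. This is a sketch with invented attribute names, not a fixed syntax:

```xml
<Build>
    <!-- Create 'size' instances of the referenced cell component -->
    <MultiInstantiate component="cell" number="size"/>

    <!-- Loop over source and target instances, connecting each pair -->
    <ForEach instances="../source" as="a">
        <ForEach instances="../target" as="b">
            <EventConnection from="a" to="b"/>
        </ForEach>
    </ForEach>
</Build>
```

Constructs like Choose/When/Otherwise would slot in at the same level as ForEach, gating which instantiations and connections are made.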
Joys of dimension checking
Even from limited experience making up the toy models, the automated dimension checking for equations and the transparent unit handling are invaluable. They cut out a whole family of time-consuming potential errors. It is just a shame there isn't support for this kind of thing in IDEs yet...
What to do if you get beyond point process models
Everything here is to do with point models without any spatial extent. While it would be easy to define types to represent, say, a cell morphology, I have no clue how to attach a behavior to them in a meaningful way. For a morphology, one could imagine associating a membrane potential state variable with each section, and computing resistances to the neighboring sections, but that would be some kind of crime against numerical analysis since it would actually be representing a different model (one where all the capacitance was at the midpoints of the segments). A correct approach would involve introducing behavior elements for scalar fields and geometrical volumes but this could well increase the complexity to the point where it becomes useless to try writing an interpreter. Perhaps an intermediate route via magic tags to say for example "this structure can be approximated by a 1D scalar field satisfying equation ... etc" might work.
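To spell out the objection: attaching a potential to each section and coupling neighbors through axial resistances gives, for section i with lumped membrane capacitance c_i, axial resistances r to its neighbors, and ionic membrane current I_ion,

```latex
c_i \frac{dV_i}{dt} = \frac{V_{i-1} - V_i}{r_{i-1,i}}
                    + \frac{V_{i+1} - V_i}{r_{i,i+1}}
                    - I^{\mathrm{ion}}_i(V_i)
```

This is exactly the model in which all of a segment's capacitance sits at its midpoint, so it faithfully represents that discrete circuit rather than the continuous cable the morphology actually describes - which is why bolting behavior onto morphology types this way misrepresents the model being specified.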
Retrofitting existing component types
After the false start with hard-coded elements for specifying networks and display settings, it proved surprisingly easy to retrofit user-defined types to these elements. In general, this was not even restricted to reproducing equivalent functionality: the retrofitted types could work with exactly the same XML. If this applies in general, then it might be possible to retrofit more extensive building-block languages such as NeuroML or PSICS with user-defined types. For the point models this could let a generic simulator run them. It wouldn't help with more complex, spatially extended models, but it would at least make it rather easier to read and process them. In a sense, the type definitions would just be acting like a rather restricted domain-specific schema.
Correspondence to XSL
If one ignores "apply-templates", the structures used for building populations and connections are more than a little reminiscent of XSL. In a way this is hardly surprising, given that they are both about processing nodes from a tree. It also suggests that maybe using XSL and XPath directly might work, and would avoid gradually introducing equivalents to half the element types in XSL as they prove to be needed.
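The resemblance is to standard XSLT constructs such as the following (shown for comparison; the select path is illustrative):

```xml
<!-- Standard XSLT: iterate over the nodes matched by a path,
     much as ForEach iterates over component instances -->
<xsl:for-each select="network/population/instance">
    <xsl:value-of select="@id"/>
</xsl:for-each>
```

XSLT's xsl:choose/xsl:when/xsl:otherwise likewise parallel the conditional constructs suggested above for network building.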
Correspondence to CSS
Somewhat surprisingly, there isn't any. None of the examples seemed to fit better with a CSS-like pattern than with an XPath-like one. Perhaps this is because you generally need to be sure which nodes you will hit, rather than relying on the more heterogeneous matching that CSS is best at.