The GSL Engine ============== The following gives an outline of the conceptual approach to simulate flow-system behavior and carries details of the implementation along. Introductionary remarks: ------------------------ The GSL Engine simulates signal flow systems by mapping them onto a network of signal processing modules which are connected via signal streams. The signal streams are value quantized and time discrete, where floats are used to store the quantized values and the samples are taken at arbitrary but equidistant points in time. Also, the sampling periods are assumed to be synchronous for all nodes. In the public GSL C API, engine modules are exposed as GslModule structures, for the internal engine implementation, each GslModule is embedded in an OpNode structure. Node Model: ----------- * a node has n_istreams input streams * a node has n_jstreams joint input streams facilities, that is, an unlimited number of ouput streams can be connected to each "joint" input stream * a node has n_ostreams output streams * all streams are equally value quantized as IEEE754 floats, usually within the range -1..1 * all streams are synchronously time discrete * the flow-system behavior can be iteratively approximated by calculating a node's output streams from its input streams * since all streams are equally time discrete, n output values for all output streams can be calculated from n input values at all input streams of a single network * some nodes always react delayed ("deferred" nodes) and can guarantee that they can always produce n output values ahead of receiving the corresponding n input values, with n>=1 * a node that has no output facilities (n_ostreams==0) is considered a "consumer" and has to be processed // FIXME Node Methods: ------------- ->process() This method specifies through one of its arguments the number of iterations the node has to perform, and therefore the number of values that are supplied in its stream input buffers and which have to be supplied in its stream output buffers. ->process_deferred() This method specifies the number of input values supplied and the number of output values that should be supplied. The number of input values may be smaller than the number of output values requested, in which case the node may return less output values than requested. Node Relationships: ------------------- Node B is an "input" of node A if: * one of A's input streams is connected to one of B's output streams, or * node C is an "input" of A and B is an "input" of C Processing Order: ----------------- If node A has an input node B and A is not a deferred node, B has to be processed prior to processing A. Connection Cycles: ------------------ Nodes A and B "form a cycle" if A is an input to B and B is an input to A. Invalid Connections: -------------------- For nodes A and B (not necessarily distinct) which form a cycle, the connections that the cycle consists of are only valid if the following is true: (C is a deferred node) and (C==A or C==B or (if C is completely disconnected, the nodes A and B do not anymore form the cycle)) Implementation Notes ==================== * if a node is deferred, all output channels are delayed * independent leaf nodes (nodes that have no inputs) can be scheduled separately * nodes contained in a cycle have to be scheduled together Scheduling Algorithm -------------------- To schedule a consumer and its dependency nodes, schedule_query() it: Query and Schedule Node: * tag current node * ignore scheduled input nodes * schedule_query_node on untagged input nodes, then do one of: * schedule input node (if it has no cycles) * resolve all input nodes cycles and then schedule the input nodes cycle (if not self in cycle) * take over cycle dependencies from input node * a tagged node is added to the precondition list (opens new cycle) * own leaf level is MAX() of input node leaf-levels + 1 * untag node Resolving Cycles: * eliminate child from precondition list, once the list is empty the cycle is resolved * at least one node being eliminated has to be deferred for the cycle to be valid Scheduling: * nodes need to be processed in the order of leaf-level * within leaf-levels, processing is determined by a per-node processing costs hint (cheap, normal, expensive) Implementation Considerations: ------------------------------ For deferred nodes, the number n specifying the amount of output values that are produced ahead of input can be considered mostly-fixed. that is, it's unlikely to change often and will do so only at block boundaries. Supporting n to be completely variable or considering it mostly fixed has certain implications. Here're the considerations that led to supporting a completely variable n for the implementation: n is block-boundary fixed: + for complex cycles (i.e. cycles that contain other cycles, "subcycles"), the subcycles can be scheduled separately if the n of the subcycle is >= block_size - if n is the only thing that changed at a block-boundary, rescheduling the flow-graph is required in the cases where n = old_n + x with old_n < block_size or if x < 0 - deferred nodes can not change their delay in response to values of an input stream n is variable for every iteration step: + no rescheduling is required if n changes at block-boundary - subcycles can not be scheduled separately from their outermost cycle + the delay of deferred nodes can correlate to an input stream Threads, communication, main loops ================================== Thread types: * UserThread; for the scope of the engine (the functions exposed in gslengine.h), only one user thread may execute API functions at a time. i.e. if more than one user thread need to call engine API functions, the user has to take measures to avoid concurrency in calling these functions, e.g. by using a GslMutex which is to be locked around engine API calls. * MasterThread; the engine, if configured accordingly, sets up one master thread which - processes transactions from the UserThread - schedules processing order of engine modules - processes single modules when required - processes module cycles when required - passes back processed transactions and flow jobs to the UserThread for garbage collection * SlaveThread; the engine can be configured to spawn slave threads which, in addition to the master thread - process single modules when required - process module cycles when required Communication at thread boundaries: * Job transaction queue; the UserThread constructs job transactions and enqueues them for the MasterThread. The UserThread also dequeues already processed transactions, in order for destroy functions of modules and accessors to only be executed within the UserThread. Also, the UserThread can wait (block) until all pending transactions have been processed by the MasterThread (in order to sync state with module network contained in the engine). * Flow job collection list; the MasterThread adds processed flow jobs into a collection queue, the UserThread then collects the queued flow jobs and frees them. * Module/cycle pool; the MasterThread fills in the module/cycle pool with modules which need to be processed. The MasterThread and the SlaveThreads pop modules/cycles from this pool, process them, and push back processed nodes. * load control; // FIXME Main loop integration: in order to process certain engine modules only from within the UserThread and to drive the engine even without master or slave threads, the engine can be hooked up to a main loop mechanism supplied by the UserThread. The engine provides API entry points to: - export file descriptors and timeout, suitable for main loop backends such as poll(2) - check whether dispatching is necessary - dispatch outstanding work to be performed by the engine FIXME: needs mentioning of pollfd/callback jobs TODO: ===== - virtualization (poke ibuffer into outbuffer) flag (needs memcpy for deferred nodes) - flag UserThread nodes - sample timed jobs - need global timestamp - need async-queues that have time-stamped jobs - process only so much samples until a time-stamped job needs to be processed - self-input cycles need to be resolved in parent as well - node-locale async timestamped jobs - engine-init: { block(pipe-fd), check } - sample-timed activation can offset node's block-boundary - need complete_blocks_only flag in node classes - cost rating: cost*n_inputs+cost*n_outputs - multi-input(joint) streams: job_disconnect(dmod,distr,smod,sistr); (jstreams) Jan 07 2002 Tim Janik * cosmetic updates, flow jobs Aug 19 2001 Tim Janik * notes on threads, communication, main loops Jul 29 2001 Tim Janik * wording/spelling fixups May 05 2001 Tim Janik * initial writeup LocalWords: GSL API GslModule OpNode istreams ostreams A's B's sync LocalWords: gslengine GslMutex UserThread MasterThread SlaveThread SlaveThreads