Tool for converting Bayesian Networks (in Hugin .net format) with discrete nodes of up to 3 states into Artificial Neural Networks by way of exhaustive state permutation.

MedPreceptor.com.br tries to emulate the decision-making ability of a human expert and, as such, positions itself as an **Expert System** (or at least a work in progress in that direction).

This tool was created as a way to accelerate and streamline the translation of medical flowcharts/guidelines into this expert system.

Currently, the workflow used by the author of this site ("Antibiotic Prescribing in Hospitals" project) can be illustrated as:

**1. Knowledge Base Development**
- Medical literature reading
- Medical specialist consultations
- Medical guidelines/flowchart analysis
- Knowledge base drafting
- Domain validation by human expert

**2. Bayesian Network Creation**
- Bayesian Network modeling from the validated knowledge base
- Bayesian Network creation
- Bayesian Network validation by human expert

**3. Artificial Neural Network (ANN) Creation**
- ANN creation by automatic conversion from the validated Bayesian Network (*BayesToANN was created specifically for this step*)
- ANN validation by human expert

**4. Website Implementation**
- Use of the validated ANN to create the dynamic website page, the final step of the automatic decision-making/expert system consultancy
- Website validation by human expert

**Spoiler alert!** This site is aimed at a broader audience, so if you're not into computer geek stuff and aren't a statistics fan, please skip the rest; otherwise you'll be bored to a very certain outcome.

BayesToANN is a tool that implements a heuristic to convert a Bayesian Network (in .net format, the Hugin Expert influence diagram file format) into a domain-restricted feed-forward Artificial Neural Network (in .EG format, the Encog file format).

It employs an exhaustive permutation of discrete node states (all possible state combinations of all user-selected input events in the Bayesian Network). At each iteration, after junction tree compilation and evidence propagation, the output event with the highest probability in the user-selected output pool is chosen as a training candidate for the corresponding ANN. BayesToANN uses junction tree compilation (courtesy of Projeto AMPLIA and the derived UNBBayes project), which is orders of magnitude more efficient than naive exact inference and even randomized sampling algorithms (while still being exact for non-loopy belief networks).
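A minimal Python sketch of the permutation-and-selection loop described above, for a toy two-input network. The `propagate` function and its made-up probabilities are a stand-in for the real junction-tree evidence propagation performed by UNBBayes:

```python
from itertools import product

# Toy stand-in for junction-tree evidence propagation: maps an input-state
# combination to a probability distribution over the output events.
# (In BayesToANN the real distribution comes from UNBBayes propagation.)
def propagate(evidence):
    fever, cough = evidence
    p_flu = 0.2 + 0.4 * fever + 0.3 * cough   # hypothetical numbers
    return {"flu": p_flu, "healthy": 1.0 - p_flu}

STATES = [0.0, 1.0]   # binary example; 3-state nodes would add -1.0
THRESHOLD = 0.5       # user-selected learning probability threshold

training_set = []
for combo in product(STATES, repeat=2):   # exhaustive state permutation
    dist = propagate(combo)
    winner = max(dist, key=dist.get)      # highest-probability output event
    if dist[winner] >= THRESHOLD:         # only train above the threshold
        training_set.append((combo, winner))
```

Each surviving `(input states, winning output)` pair becomes one record of the ANN training set.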

If more than one output event fulfills the training criterion (i.e., surpasses the user-selected learning probability threshold), an additional neural network (encog_**k+1**.EG file) is trained for each such output event (in decreasing order of probability), up to a maximum number of ANNs equal to the number of output events in the pool. The presence of additional output events is signaled by encog_**more_n**.EG ANNs, in a singly linked-list fashion: a positive result from the encog_**more_k**.EG ANN indicates that an additional output event (above the learning threshold) is reported by the encog_**k+1**.EG ANN.

The "no result" flag is raised by encog_**infer_n**.EG ANNs. A positive result from the encog_**infer_k**.EG ANN represents machine inference; that is, the input neurons are in a state that does not produce an output above the learning threshold. In that situation the result obtained by querying the corresponding encog_**k**.EG (i.e., the main ANN) should be used with caution, since the input neurons are in a non-anticipated (untrained) state.
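A hypothetical sketch of how a client might walk the resulting chain of ANN files: encog_**k**.EG gives an answer, encog_**more_k**.EG flags whether encog_**k+1**.EG holds a further answer, and encog_**infer_k**.EG flags an untrained input state. The `query(filename, inputs)` callable is an assumed stand-in for loading an Encog .EG file and evaluating it:

```python
def collect_outputs(query, inputs, max_k):
    """Walk the singly linked chain of ANNs produced by BayesToANN."""
    results = []
    for k in range(1, max_k + 1):
        # Untrained (machine-inferred) states are flagged, not discarded.
        flag = "inferred" if query(f"encog_infer_{k}.EG", inputs) else "trained"
        results.append((query(f"encog_{k}.EG", inputs), flag))
        if not query(f"encog_more_{k}.EG", inputs):
            break   # no further output events in the chain
    return results
```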

As a proof of concept, BayesToANN automatically created the ANN used by MedPreceptor for "Ventilator-Associated Pneumonia". This ANN is the first at MedPreceptor to use 3-state neurons. To illustrate: a "Bacterial Culture" node can be in one of three states at any given time: (1) "Negative culture -> value 0.0", (2) "Positive culture -> value 1.0", (3) "Undetermined -> value -1.0", meaning the culture was either not collected or is still in the lab. Each state influences a different outcome (a different antibiotic scheme per state). The state traversal from "Undetermined" to "Negative" and then to "Positive" is treated as a *continuum* by the ANN, but in fact the ANN is being asked to perform a nonparametric (rank) test: the distance from 'Undetermined' to 'Positive' shouldn't be any wider than from 'Negative' to 'Positive'. That presents a challenge to the ANN algorithm both for untrained data, because it weakens machine inference, and for trained data, because an ever-increasing number of hidden-layer neurons is needed to compensate for the lack of decision lines in covariate space.
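The rank-test objection above can be made concrete with a few lines of Python (the `distance` helper is illustrative only):

```python
# The 3-state encoding used by the "Bacterial Culture" nodes.
STATE_VALUES = {"positive": 1.0, "negative": 0.0, "undetermined": -1.0}

def distance(a, b):
    """Distance between two states along the input neuron's value line."""
    return abs(STATE_VALUES[a] - STATE_VALUES[b])

# On the input line, "undetermined" sits twice as far from "positive" as
# "negative" does, even though semantically it should be no farther:
print(distance("undetermined", "positive"))   # 2.0
print(distance("negative", "positive"))       # 1.0
```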

At present, Levenberg-Marquardt training (as implemented by Heaton Research) is the only working solution, in my experience, for 3-state-neuron ANNs (where there is a clear separation in logic between 'Negative' and 'Undetermined'). For training to converge at acceptable error rates, the ANN structure defaults to 2 hidden layers, each double the size of the input layer (4 layers in total).
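The default layer sizing described above can be sketched as follows (the `default_layout` helper is hypothetical; the real network is built through Encog's Java API):

```python
def default_layout(n_inputs, n_outputs):
    """Default BayesToANN topology: input, two hidden layers of 2x the
    input size, and output (4 layers in total)."""
    hidden = 2 * n_inputs
    return [n_inputs, hidden, hidden, n_outputs]
```

For example, a network with 6 input neurons and 4 output events would be laid out as 6-12-12-4.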

BayesToANN is therefore limited to that training algorithm. The downside is that the output neuron layer is always normalized in a "one-of-n" fashion, which comes at the expense of weaker machine-inference performance on untrained data (compared to equilateral normalization).
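A one-of-n output encoding, as used here, dedicates one output neuron per output event (a minimal sketch, not Encog's actual normalization code):

```python
def one_of_n(winner_index, n_outputs):
    """Encode the winning output event as a one-of-n target vector:
    1.0 for the winner's neuron, 0.0 for every other neuron."""
    return [1.0 if i == winner_index else 0.0 for i in range(n_outputs)]
```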

BayesToANN is also limited to Bayesian discrete nodes with at most 3 states. That restriction (imposed only on the input layer) aims at keeping the problem tractable: the training set grows as the number of event states **n** raised to the power of the input neuron layer size.
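To make the growth rate concrete:

```python
def permutation_count(n_states, n_inputs):
    """Number of input-state combinations the exhaustive permutation
    must propagate: states raised to the input layer size."""
    return n_states ** n_inputs

# Ten 3-state input nodes already require 3^10 = 59049 propagations.
print(permutation_count(3, 10))   # 59049
```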

Other constraints: the first state is always assumed to be 'Positive', the second 'Negative', and the optional third 'Undetermined/Unknown/Unset'; BayesToANN has not been tested with Decision or Utility nodes.

As a convenience, BayesToANN saves both the training sets (in Encog .csv and FANN .data formats) and the trained ANN(s) in the output folder. Also, modeled Bayesian nodes whose description string begins with I{index} or O{index} are selected and ordered as Input and Output nodes, respectively.
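The I{index}/O{index} description convention might be implemented along these lines (the node list, names, and `select_and_order` helper are hypothetical, for illustration only):

```python
import re

# Nodes whose description starts with "I<number>" become ordered inputs;
# "O<number>" become ordered outputs; everything else is ignored.
TAG = re.compile(r"^([IO])(\d+)")

def select_and_order(nodes):
    inputs, outputs = [], []
    for name, description in nodes:
        m = TAG.match(description)
        if m:
            bucket = inputs if m.group(1) == "I" else outputs
            bucket.append((int(m.group(2)), name))
    return [n for _, n in sorted(inputs)], [n for _, n in sorted(outputs)]
```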

Planned improvements:

- Make the BayesToANN algorithm aware of the junction tree clique structure during permutation, taking advantage of causal independence among supernodes. The idea is to minimize the size of the ANN training set (reducing training time without compromising exactness) by removing training records that include redundant evidences, namely evidences lying outside the *running intersection property* route in which the winning output event resides. The objective is to condense this redundancy into smaller training sets by heuristically turning positive and negative evidences into undetermined ones.
- Make the BayesToANN algorithm aware of d-separation among cliques during the permutation of evidences, applying a similar optimization as above.