Frustrated Inhibited Synapse Learning (FISL) and
Forward Inculcated Relief of Excitation (FIRE) Training

Neural Network Learning and Training Algorithms

David W. Croft
1993 Dec 20 Mon

Red: new or recovered material
Blue: discarded material

Objectives

Our primary objective is to build an intuitive learning system that will learn to duplicate any desired function. This will be done using a neural network training algorithm known as Frustrated Inculcated Synapse Training (FIST). The algorithm will be presented in 5 sections: the input rule, the activation rule, the learning rule, the connection rule, and the training rule. For each rule, a continuous and a discrete model are presented.

Theory and Definitions

A neuron is a device with inputs and an output whose function is determined by a simple "activation rule". An artificial neural network (ANN) is a network of neurons whose interconnections of inputs and outputs are weighted. An ANN has the structure to duplicate any function out of the infinite set of possible functions by choosing the appropriate set of interconnecting weights. Choosing the right set of weights to make your ANN duplicate your desired function is very hard because there are an infinite number of possible sets of weights. A neural network training algorithm is a method of choosing weights for an ANN. If the algorithm is good, it will cause the weights to converge to the correct or nearly-correct values in a reasonable amount of time.

The "neuron state" describes the state of the presynaptic neuron as it is seen by postsynaptic neurons. There are 2 states: firing and not firing. The "network state" describes the present state of all the neuron states in the network. In an ANN of N 2-state neurons, there are 2**N possible network states. The "output state" describes the pattern of the network as it changes from one network state to another as it stabilizes. Updating can be asynchronous, synchronous, or continuous.

To "inculcate" is "to teach or impress by urging or frequent repetition; instill." The word is derived from "to force upon" and "to trample" (with the heel). American Heritage Dictionary, 2nd College Ed., 1982, p653. To "ooze" is to "progress slowly but steadily". American Heritage Dictionary, 2nd College Ed., 1982, p869.

0.0 The Input Rule

0.1 The Continuous Model

The postsynaptic potential is sigmoidal with respect to the presynaptic spike amplitude (KSJ 196).

   Neurotransmitter ( Time ) := Threshold ( Presynaptic_Neuron_State ( Time - Delay ) );

The release must be thresholded so that it is never negative. Release only occurs after the Ca2+ delay and only if the depolarization is still present. It should actually decrease exponentially and should last as long as the presynaptic neuron is depolarized (KSJ 197).

   Synapse_Input ( Time ) := Neurotransmitter ( Time ) * Weight * exp ( -K * Time );

The synapse input must decay exponentially to ensure that it lasts for both the firing and tired states of the neuron; model it as an RC circuit (KSJ pp138-139, 149-151). Neuronal integration is dependent on the time constant: bigger is better (KSJ 167). The weights could be related to the number of available receptor channels, both inhibitory and excitatory; the probability of one being open with a given amount of neurotransmitter probably does not vary. Current declines exponentially much faster than voltage due to the membrane capacitance. The synapse input may also be called the postsynaptic potential (PSP).

[Need to formulate how T is calculated from the pre-synaptic neuron state. It should only be generated when the pre-synaptic neuron actually fires (otherwise we will be using linear neurons). Use the Nernst potential of Calcium.]

[dT/dt := -TanH ( TanH ( T(t+1) ) - 0.5 ) * T_L; that is, the amount of neurotransmitter released with each triggering will increase if it is rapidly absorbed by the receptor channels and decrease if it is not. Hypothesis: transmitter is poisonous to the axon; if the axon releases transmitter and nothing absorbs it, its efficacy will be reduced; on the other hand, the axon will grow in the direction in which transmitter is being most rapidly absorbed, such as toward glial cells or a positive electric field which attracts negatively-charged neurotransmitter, such as a depolarized neuron (guide post cells, KSJ 916).]
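A minimal sketch of the two input-rule equations above, assuming a concrete release threshold, decay constant K, weight, and presynaptic potential trace, none of which are specified in the text. Transmitter release is thresholded so it is never negative, and the synapse input is accumulated as a leaky (RC-style) integrator so that it decays exponentially:

   with Ada.Text_IO;                       use Ada.Text_IO;
   with Ada.Numerics.Elementary_Functions; use Ada.Numerics.Elementary_Functions;

   procedure Input_Rule_Sketch is
      Release_Threshold : constant Float   := 0.5;  -- assumed
      K                 : constant Float   := 1.0;  -- assumed decay constant
      Weight            : constant Float   := 0.8;  -- assumed synaptic weight
      Xmit_Delay        : constant Integer := 1;    -- one-step transmission delay
      --  Assumed presynaptic membrane potential sampled at discrete times.
      Pre_State : constant array (0 .. 5) of Float :=
        (0.0, 0.9, 0.9, 0.2, 0.0, 0.0);
      Synapse_Input : Float := 0.0;

      --  Thresholded release: zero unless the delayed presynaptic
      --  depolarization exceeds the release threshold.
      function Neurotransmitter (T : Integer) return Float is
         V : constant Float := Pre_State (T - Xmit_Delay);
      begin
         if V > Release_Threshold then
            return V - Release_Threshold;
         else
            return 0.0;
         end if;
      end Neurotransmitter;
   begin
      for T in 1 .. 5 loop
         --  Leaky (RC-style) integration: the old input decays by exp (-K)
         --  each step while newly released transmitter, scaled by the
         --  weight, is added in.
         Synapse_Input := Synapse_Input * Exp (-K)
                          + Neurotransmitter (T) * Weight;
         Put_Line ("t =" & Integer'Image (T) &
                   "  synapse input =" & Float'Image (Synapse_Input));
      end loop;
   end Input_Rule_Sketch;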
0.2 The Discrete Model

For each pre-synaptic weight that is in the Fire phase:

   Disconnected   Exh := Exh + 0.0;   Inh := Inh + 0.0;
   Inhibitory     Exh := Exh + 1.0;   Inh := Inh + 2.0;
   Excitatory     Exh := Exh + 2.0;   Inh := Inh + 1.0;

For Threshold := 1.0,

   Combined := Exh / ( 1.0 + Inh );

   Combined = 0.0               Input := Low
   0.0 < Combined < Threshold   Input := Medium
   Combined >= Threshold        Input := High
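A minimal sketch of this discrete input rule, with an assumed list of triggered weights; the Exh and Inh accumulators, the Combined ratio, and the Low/Medium/High thresholding follow the table above:

   with Ada.Text_IO; use Ada.Text_IO;

   procedure Discrete_Input_Sketch is
      type Weight_Kind is (Disconnected, Inhibitory, Excitatory);
      type Input_Value is (Low, Medium, High);

      Threshold : constant Float := 1.0;

      --  Assumed: four presynaptic weights, all in the Fire phase this step.
      Triggered : constant array (1 .. 4) of Weight_Kind :=
        (Excitatory, Excitatory, Inhibitory, Disconnected);

      Exh, Inh : Float := 0.0;
      Combined : Float;
      Input    : Input_Value;
   begin
      for W of Triggered loop
         case W is
            when Disconnected => null;                              -- +0.0, +0.0
            when Inhibitory   => Exh := Exh + 1.0; Inh := Inh + 2.0;
            when Excitatory   => Exh := Exh + 2.0; Inh := Inh + 1.0;
         end case;
      end loop;

      Combined := Exh / (1.0 + Inh);

      if Combined = 0.0 then
         Input := Low;
      elsif Combined < Threshold then
         Input := Medium;
      else
         Input := High;
      end if;

      Put_Line ("Combined =" & Float'Image (Combined) &
                "  Input = " & Input_Value'Image (Input));
   end Discrete_Input_Sketch;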
1.0 The Activation Rule

[Define "depolarized" and "hyperpolarize".] [Define "threshold" as it relates to dVm/dt.]

1.1 Neuron Phases

The "neuron phase" describes the phase of the membrane potential of an individual neuron: cold, resting, oncoming, firing, tiring, degenerating, and warm. [Need to add distinctions of phases based on membrane potential and activation of voltage-gated channels. Show figure.]

A neuron is considered to be in the cold state when it has received generally inhibitory inputs prior to the current inputs. If it continues to receive generally inhibitory inputs, it will remain cold. If it receives a net zero input, it will become resting. If it receives generally excitatory inputs that are not sufficient to exceed the threshold required to fire, it will transition to the oncoming state. If it receives more, it will transition to the firing state. It cannot transition directly to the tiring or degenerating states. The resting and oncoming phases transition under the same rules as the cold phase given the same inputs. If the neuron is firing, it will transition to the tiring state. If the neuron is tiring, it will transition to the degenerating state. If the neuron is degenerating, it will transition to the cold state. [Warm: occurs when slowly depolarized but dVm/dt is never greater than threshold.]

1.1.1 Continuous Model

For the continuous model, the next phase is dependent on the previous phase as well as the current inputs. Thus, the cold and oncoming phases will effectively shift the threshold required for the present inputs up and down, respectively.

   A_K   >= 0.0;              -- activation of voltage-gated Potassium channels
   A_Na  >= 0.0;              -- activation of voltage-gated Sodium channels
   Cm    : constant := 1.0;   -- postsynaptic neuron membrane capacitance
   E_K   : constant := -1.0;  -- Nernst potential of Potassium channel
   E_Na  : constant := +1.0;  -- Nernst potential of Sodium channel
   E_Ex  : constant := E_Na;  -- Nernst potential of excitatory receptor channel
   E_Cl  : constant := 0.0;   -- Nernst potential of inhibitory Chloride channel
   G_Ex  : constant := 1.0;   -- conductivity of excitatory receptor channels
   G_In  : constant := 1.0;   -- conductivity of inhibitory receptor channels
   G_K   : constant := 1.0;   -- conductivity of Potassium channels
   G_Na  : constant := 1.0;   -- conductivity of Sodium channels
   I_Ex                       -- excitatory current
   I_In                       -- inhibitory current
   I_K                        -- Potassium current
   I_Na                       -- Sodium current
   M_Ex  : 0.0 <= M_Ex < 1.0; -- permeability ratio of excitatory neurotransmitter
   M_In  : 0.0 <= M_In < 1.0; -- permeability ratio of inhibitory neurotransmitter
   Vm                         -- postsynaptic neuron membrane potential
   dVm                        -- change in Vm with respect to time
   Ch_Ex > 0.0;               -- number of excitatory receptor channels
   Ch_In > 0.0;               -- number of inhibitory receptor channels
   T(N)  >= 0.0;              -- neurotransmitter from a given presynaptic neuron
   Thr   : constant := +0.5;  -- threshold rate of change of potential

   for N in Neurons'range loop
      M_Ex := TanH ( T ( N ) / Ch_Ex );
      M_In := TanH ( T ( N ) / Ch_In );
      I_Ex := I_Ex + M_Ex * Ch_Ex * G_Ex ( N ) * ( Vm - E_Ex );
      I_In := I_In + M_In * Ch_In * G_In ( N ) * ( Vm - E_Cl );
   end loop;

   A_Na := ( TanH ( 10.0 * ( +dVm - Thr ) ) + 1.0 ) * ( TanH ( +Vm * 100.0 ) + 1.0 ) / 4.0;
   A_K  := ( TanH ( 10.0 * ( -dVm - Thr ) ) + 1.0 ) / 2.0;

   A_Na := 0.990 * A_Na + 1.0 * Sigmoid ( 100_000.0 * ( dVm - Thr ) );
   A_K  := 0.999 * A_K  + 0.2 * Sigmoid ( 100_000.0 * ( dVm - Thr ) );

   I_Na := G_Na * ( 1.0 + A_Na ) * ( Vm - E_Na );
   I_K  := G_K  * ( 1.0 + A_K  ) * ( Vm - E_K );

   dVm := - ( I_In + I_Ex + I_Na + I_K ) / Cm;

[Note how the threshold is dependent on dVm/dt, not Vm.]

1.1.2 Discrete Model

For the discrete model, we simplify by considering only four phases derived from the continuous model: Cold (degenerating and cold), Rest (resting), Ooze (oncoming and warm), and Fire (firing and tiring). We further simplify by limiting the inputs to just three values: Low, Medium, and High.

   Current Phase         Inputs   Next Phase
   -------------------   ------   ----------
   Cold, Rest, or Ooze   Low      Rest
   Cold, Rest, or Ooze   Medium   Ooze
   Cold, Rest, or Ooze   High     Fire
   Fire                  Any      Cold
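The table above can be read as a four-phase state machine. A minimal sketch, with an assumed input sequence chosen to walk the neuron through firing and its forced transition back to Cold:

   with Ada.Text_IO; use Ada.Text_IO;

   procedure Phase_Machine_Sketch is
      type Phase is (Cold, Rest, Ooze, Fire);
      type Input_Value is (Low, Medium, High);

      function Next_Phase (Current : Phase; Input : Input_Value) return Phase is
      begin
         case Current is
            when Fire =>
               return Cold;                 -- Fire + any input -> Cold
            when Cold | Rest | Ooze =>
               case Input is
                  when Low    => return Rest;
                  when Medium => return Ooze;
                  when High   => return Fire;
               end case;
         end case;
      end Next_Phase;

      --  Assumed input sequence for illustration.
      Inputs : constant array (1 .. 5) of Input_Value :=
        (Medium, Medium, High, High, Low);

      P : Phase := Rest;
   begin
      for I of Inputs loop
         P := Next_Phase (P, I);
         Put_Line (Input_Value'Image (I) & " -> " & Phase'Image (P));
      end loop;
   end Phase_Machine_Sketch;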
1.2 Neuron States

From the perspective of postsynaptic neurons, a presynaptic neuron has just two states: firing or not firing. Thus, information is not propagated when the neuron is in the cold, resting, oncoming, tiring, or degenerating phases. Postsynaptic synapses are "triggered" when the presynaptic neuron is in the firing phase. This trigger signal cannot be negative although, if connected to an inhibitory weight, it may have a negative influence on the membrane potential of the postsynaptic neuron.

2.0 The Learning Rule

Using FIST, a neuron learns by assuming that if it is presented with the same input in quick succession, it failed to respond appropriately to the input the first time. It will then modify its weights in such a way that it will respond differently when the input is presented again -- not firing if it fired earlier, or firing if it failed to fire.

2.1 Potentiation and Depression

The Learning Rule is composed of two parts: the Hebbian Rule, which potentiates synaptic weights, and the Frustration Rule, which depresses synaptic weights.

2.1.1 The Hebbian Rule

To "potentiate" a synaptic weight is to make it more excitatory and less inhibitory. The well-known Hebbian Rule states that a synaptic weight should be potentiated if the postsynaptic neuron is depolarized. [Get actual quote about pre- and post-synaptic activity.] "When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, ...". Organization of Behavior, Hebb, D.O., 1949. [Not sure about the resting phase.]

2.1.2 The Frustration Rule

To "depress" a synaptic weight is to make it more inhibitory and less excitatory. The Frustration Rule is crucial to the FIST algorithm. It is basically this: depress the synaptic weight if the postsynaptic neuron is hyperpolarized. [Need to explain why I call it the "Frustration" rule. Use door example.] [Sejnowski 1977?] [Rob Malenka: Ca2+ into the postsynaptic cell; LTD when ooze, not cold.]

2.2 ...

Learning occurs when an input arrives, whether inhibitory or excitatory, and the postsynaptic neuron is not in the resting phase. If the neuron is hyperpolarized and one of the synaptic weights attempts to excite it, the weight is depressed, as it is assumed that the weight is attempting to fire the neuron repetitiously after unsuccessfully achieving the desired output the first time. If the neuron is depolarized and one of the synaptic weights attempts to excite it, the weight is potentiated, as it is assumed that the weight is attempting to fire the neuron after being unsuccessful in doing so the first time. Only those synapses which are triggered are modified.

   Post-Synaptic Neuron Potential   Learning Mode
   ------------------------------   -------------
   Hyperpolarized (Vm < 0.0)        Depress
   Polarized      (Vm = 0.0)        Ignore
   Depolarized    (Vm > 0.0)        Potentiate

2.2.1 Continuous Model

For the continuous model, we allow the weights to take on analog values and we divide each weight into two components: inhibitory channels and excitatory channels.

   dCh_Ex : 0.0 < dCh_Ex < Ch_Ex;  -- change in Ch_Ex
   dCh_In : 0.0 < dCh_In < Ch_In;  -- change in Ch_In
   L      : 0.0 <= L <= 1.0;       -- learning rate

   dCh_Ex ( N ) := +Vm * M_Ex * Ch_Ex ( N ) * G_Ex * L;
   dCh_In ( N ) := -Vm * M_In * Ch_In ( N ) * G_In * L;
   dW ( N )     := dCh_Ex ( N ) - dCh_In ( N );

Note that when the neuron fires and the synapse is enhanced, the neuron will always become tired afterwards, and the synapse will then be frustrated, for a net learning effect of zero under a constant input above threshold. If, however, the enhancement is due to the neuron being oncoming, the neuron will not become tired, there will be no frustration, and the net synaptic learning will be positive. Thus, the synapse is likely to be enhanced only after being triggered successively, where the first trigger primes the neuron without causing it to fire. The net synaptic learning will also be positive if the positive input is removed before the neuron transitions from firing to tired.

2.2.2 Discrete Model

For the discrete model, we simplify by allowing the weights to take only three values: inhibitory, disconnected, and excitatory.

   Weight_Old     Post Phase     Weight_New
   ------------   ------------   ------------
   Disconnected   Any            Disconnected
   Inhibitory     Cold           Inhibitory
   Inhibitory     Rest           Inhibitory
   Inhibitory     Ooze or Fire   Excitatory
   Excitatory     Cold           Inhibitory
   Excitatory     Rest           Excitatory
   Excitatory     Ooze or Fire   Excitatory
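A minimal sketch of the Section 2.2.1 continuous learning rule for a single triggered synapse, with assumed channel counts, permeabilities, conductivities, and learning rate. Depolarization potentiates (dW > 0), hyperpolarization depresses (dW < 0), and a polarized neuron leaves the weight unchanged:

   with Ada.Text_IO; use Ada.Text_IO;

   procedure Learning_Rule_Sketch is
      L    : constant Float := 0.1;  -- learning rate, 0.0 <= L <= 1.0
      G_Ex : constant Float := 1.0;  -- excitatory channel conductivity
      G_In : constant Float := 1.0;  -- inhibitory channel conductivity
      M_Ex : constant Float := 0.5;  -- assumed permeability ratios
      M_In : constant Float := 0.5;

      Ch_Ex : Float := 2.0;          -- excitatory receptor channels
      Ch_In : Float := 2.0;          -- inhibitory receptor channels

      procedure Learn (Vm : Float) is
         dCh_Ex : constant Float := +Vm * M_Ex * Ch_Ex * G_Ex * L;
         dCh_In : constant Float := -Vm * M_In * Ch_In * G_In * L;
      begin
         Ch_Ex := Ch_Ex + dCh_Ex;
         Ch_In := Ch_In + dCh_In;
         Put_Line ("Vm =" & Float'Image (Vm) &
                   "  dW =" & Float'Image (dCh_Ex - dCh_In) &
                   "  W ="  & Float'Image (Ch_Ex - Ch_In));
      end Learn;
   begin
      Learn (+1.0);  -- depolarized: synapse potentiated (dW > 0)
      Learn (-1.0);  -- hyperpolarized: synapse depressed (dW < 0)
      Learn (0.0);   -- polarized at rest: no change
   end Learning_Rule_Sketch;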
3.0 The Connection Rule

3.1 Complexity and Distance

Since our primary objective is to build a "black box" that will duplicate any desired function, complexity issues must be dealt with after the learning algorithm is established. For very complex functions, the NOB should acquire more neurons. The number of neurons should be limited to just enough to handle the training set accurately, in order to maintain the network's property of generalization. If the system is trained in real time, it may need to increase its number of neurons on the fly.

Consider an ANN with an infinite number of neurons. There is a subset of neurons which have connections, with weight learning rates above zero, to themselves and to the neurons on which the inputs are placed. All of the other neurons have connection weights of zero and learning rates of zero with respect to this active subset. Although they may have non-zero learning rates among themselves, they are effectively disconnected from the input/active subset. The active subset will adjust its weights based upon the inputs and its dynamic inner states.

As all neurons are connected to the input neurons, it will be impossible to train weights without some correlation to the inputs. This creates difficulties if we wish to extract some information from a processed version of the inputs and then process it free of noise from the original inputs. We are also limited to the small subset of active neurons, which may make processing of extremely complex functions impossible.

To provide training free of the original inputs, we can pass a processed version of the inputs from the input subset to a less connected subset. We do this by increasing the learning rates of a few connections from the second subset to the input subset. We can do this ad infinitum. As we do not know how many neurons are required in the two subsets, we do not know how many of the connections should have their learning rates increased. We can compromise by smoothly varying the values of the learning rates from high to zero as we move from the input neurons to the neurons in the infinite reaches. We can also allow the secondary subsets to be somewhat isolated by exponentially decreasing their learning-rate connections left and right of their centers.

The learning rate L between neurons, as a function of the distance D in neurons, is

   L ( D ) = L0 / ( D ** 2 )    or    L ( D ) = L0 * exp ( -abs ( D ) ).

Thus, each of the infinite neurons is at the center of one of an infinite number of subsets of neurons. The input subset will reach out to the pool of infinite neurons for very complex functions but will find it increasingly harder to do so, preserving its property of generalization.
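A minimal sketch of the two proposed learning-rate profiles, with an assumed base rate L0; it simply tabulates how quickly each falls off with the distance D in neurons:

   with Ada.Text_IO;                       use Ada.Text_IO;
   with Ada.Numerics.Elementary_Functions; use Ada.Numerics.Elementary_Functions;

   procedure Connection_Rule_Sketch is
      L0 : constant Float := 1.0;  -- assumed learning rate at distance zero

      --  Inverse-square fall-off: L ( D ) = L0 / ( D ** 2 )
      function L_Inverse_Square (D : Float) return Float is
      begin
         return L0 / (D ** 2);
      end L_Inverse_Square;

      --  Exponential fall-off: L ( D ) = L0 * exp ( -abs ( D ) )
      function L_Exponential (D : Float) return Float is
      begin
         return L0 * Exp (-abs (D));
      end L_Exponential;
   begin
      for D in 1 .. 5 loop
         Put_Line ("D =" & Integer'Image (D) &
                   "  L0/D**2 =" & Float'Image (L_Inverse_Square (Float (D))) &
                   "  L0*exp(-D) =" & Float'Image (L_Exponential (Float (D))));
      end loop;
   end Connection_Rule_Sketch;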
In neurobiology, the connectivity and learning rate of neuronal subsets are determined by the constraints of the medium ("wetware") and evolution. This can be seen in the spatiotemporal decrease in connections and connection effectiveness with distance between neurons. The areas of process specialization, such as speech, hearing, etc., can be represented by our neuronal subsets. In addition to the connection rule given above, we may use evolutionary algorithms to adjust our artificial neural network learning rates.

[Hyper-evolve dL/dD. Assume we don't care what connections are made between subsets of neurons as long as it works. Let dL/dD be determined by an evolutionary algorithm and be hard-wired at birth. Build a virtual reality with ANNs in virtual robotic bodies. Hyper-evolve them in fast time by killing off those that don't perform as well as the others (competitive). Randomly mutate by changing dL/dD with each successive generation. When training is complete, transfer the ANNs from software to hardware and real robotic bodies, ensuring that the ANNs are slowed down from hyper-time to actual time. Could also be applied to any ANN problem, such as stock market algorithm evolution.]

[Show political diagrams using connections. Contrast Democracy with a Republic.]

3.2 Global Minima and Activity

In the example below, the learning rate is not a function of L but rather of the conductivity of the excitatory and inhibitory receptor channels. Note that both the slow-learning and fast-learning neurons have an initial weight of 0.0, where W := G_Ex - G_In. However, after applying the learning rule, the weight of the slow-learning neuron becomes +1.0 and the weight of the fast-learning neuron becomes +10.0.

   L := 1.0; Vm := 1.0; M := 0.5

              Slow    Fast
              ----    -----
   G_Ex        1.0    10.0
   G_In        1.0    10.0
   W           0.0     0.0
   dG_Ex      +0.5    +5.0
   dG_In      -0.5    -5.0
   G_Ex_New    1.5    15.0
   G_In_New    0.5     5.0
   W_New      +1.0   +10.0

Since it is more probable that the fast-learning neuron will significantly change the function of the neural network than the slow-learning neuron, it is more likely to cause the neural network to escape a local minimum. It would be desirable to have fast-learning neurons when we are trying to teach a neural network something new and slow-learning neurons when we are out of the training phase. If learning causes the conductances generally to increase while searching for the proper weight, and normal firings of the neuron with minimal net learning cause the conductances generally to decrease, we will have that attribute.
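A minimal sketch reproducing the arithmetic of the table above: both neurons start with a net weight of 0.0, but one application of the learning rule moves the fast learner ten times farther:

   with Ada.Text_IO; use Ada.Text_IO;

   procedure Fast_Slow_Sketch is
      L  : constant Float := 1.0;
      Vm : constant Float := 1.0;
      M  : constant Float := 0.5;

      --  One learning step: dG_Ex = +Vm * M * G_Ex * L, dG_In = -Vm * M * G_In * L.
      procedure Step (Name : String; G_Ex, G_In : in out Float) is
         dG_Ex : constant Float := +Vm * M * G_Ex * L;
         dG_In : constant Float := -Vm * M * G_In * L;
      begin
         G_Ex := G_Ex + dG_Ex;
         G_In := G_In + dG_In;
         Put_Line (Name & ": W_New =" & Float'Image (G_Ex - G_In));
      end Step;

      Slow_Ex, Slow_In : Float := 1.0;   -- slow learner: low conductances
      Fast_Ex, Fast_In : Float := 10.0;  -- fast learner: high conductances
   begin
      Step ("Slow", Slow_Ex, Slow_In);   -- prints W_New = +1.0
      Step ("Fast", Fast_Ex, Fast_In);   -- prints W_New = +10.0
   end Fast_Slow_Sketch;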
Short- and Long-Term Memory

Since neurons near the input set have high learning rates, they will quickly train to inputs, creating an effective short-term memory which will be overwritten by the next input pattern. If an input pattern is presented over a long period of time, the dynamic oscillations near the inputs will gradually drift out to train more distant neurons, creating a long-term memory. This long-term memory will reflect the average short-term memory over time.

Besides time-averaging the short-term memory, long-term memory may act as a noise filter by triggering only on highly correlated input patterns which are temporarily in short-term memory. Thus, if an input pattern is similar to a pattern stored in long-term memory, it is more likely to be able to pass on to deeper levels for memorization before it is overwritten by the next incoming pattern. Note that a neural network with a high learning rate is likely to have a good short-term memory, a poor long-term memory, and an improved ability to create analogies between disparate inputs.

Imagination, Dreams, and Intuition

Input neurons receive inputs from external real-world stimuli and, possibly, from other neurons, which cause them to fire, sending a signal that represents a processed version of the external stimuli. If the external stimuli are shut off and an input neuron continues to receive inputs from other neurons, it may continue to fire, producing a signal representing fictional external stimuli. Since the synaptic weights have been conditioned to minimize energy, the firings will be representative of the most probable or most frequently repeated real-world stimuli. Thus, if an input neuron is continuously "clamped" to external real-world stimuli, the synaptic weights from other internal inputs will adjust such that when the external inputs are released or "floating", the internal inputs will continue to cause the neuron to fire as though the external stimuli were still there.

As the neural network continues to process in the absence of external stimuli, each fictional stimulus signal will cause the system to create the most-trained, or minimal-energy, subsequent fictional stimulus, and so on. If one watches a sequence of images repetitively, such as a ball falling over and over, then when one closes one's eyes, one may imagine the scene over and over again. If one watches a ball falling just once and closes one's eyes, one may imagine the most probable outcome of the ball bouncing until it finally comes to a stop.

If one sleeps, the current events of the day, which may be oscillating in short-term memory, may be repeated over and over again until they have burned a groove in the energy state of the synaptic connections, causing the energy for that pattern to be minimized and effectively transferring it to long-term memory. On the other hand, the events may mix with internal inputs and follow the minimal-energy grooves to predict the most probable progressions of those events, giving the dreamer an intuitive prediction of the future based upon the inputs that were available. However, this prediction may become distorted as time progresses, due to incorrect internal energy models of the real world, which may lead to unusual dreams that deviate from possible reality.

A practical application of this process is to perform supervised training on the neural network where the input neurons are clamped to real-world historical data such as stock market prices. The system will then adjust its weights to minimize system energy and create a model of the stock market system. When the network is "played" with the historical data and then released where the data is not currently available, such as for future stock market prices, the network will dream the most probable future stock market prices based upon the energy model of the real stock market that it derived from its external inputs. The stock market dream will go astray, however, if there are excessive unknown inputs or if the actual system is so chaotic that the internal model can only approximate it by treating the unknown inputs as random variables or by stabilizing its energy model around the chaotic attractors.

4.0 The Training Rule

"Training" is the act of presenting inputs to the ANN in such a way that the ANN will learn to act upon those inputs in a desired manner. "Supervised training" trains the neurons given the desired neuron states. "Reinforcement training" trains the neurons given only information as to whether the output was correct or not.

Energy and Power Minimization

By applying this learning rule, the network will naturally minimize the energy required to perform its function. If a neuron fires unnecessarily to neurons that have already been fired, the weights between them will be decreased. If a neuron fires unnecessarily to neurons that have not fired, they will learn to respond by firing and perhaps changing the inputs that are driving the first neuron to fire unnecessarily. The system will seek stable oscillations, as chaotic patterns are likely to cause repetitious inputs which trigger the learning rule. As the learning rule is applied, the system weights change until the repetitious inputs cease, most probably when the network is stable.

[Define "power" as the number of neurons firing per unit time. Define "average power". Show a maximum limit on power with this model.]

[Need to contrast energy-minimization systems, which have "hot" and "cold" neurons, versus this power-minimization system, which has phases.]
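The bracketed note above asks for a definition of "power"; a minimal sketch of one plausible formalization, in which instantaneous power is the number of neurons in the Fire phase at a given step and average power is its mean over a window. The network history is assumed data:

   with Ada.Text_IO; use Ada.Text_IO;

   procedure Power_Sketch is
      type Phase is (Cold, Rest, Ooze, Fire);
      type Network is array (1 .. 4) of Phase;

      --  Assumed network states over three time steps.
      History : constant array (1 .. 3) of Network :=
        ((Fire, Rest, Ooze, Fire),
         (Cold, Fire, Rest, Cold),
         (Rest, Cold, Rest, Rest));

      Total : Natural := 0;
   begin
      for T in History'Range loop
         declare
            Power : Natural := 0;  -- neurons firing at step T
         begin
            for N of History (T) loop
               if N = Fire then
                  Power := Power + 1;
               end if;
            end loop;
            Total := Total + Power;
            Put_Line ("t =" & Integer'Image (T) &
                      "  power =" & Integer'Image (Power));
         end;
      end loop;
      Put_Line ("average power =" &
                Float'Image (Float (Total) / Float (History'Length)));
   end Power_Sketch;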
[If a neuron is warmed under a constant input, it will not fire but will go to full depolarization slowly. As it does so, G_Ex will go to infinity in order to reduce the power expended by the constant input. Power = IV = (I**2)*R = (I**2)/G.]

4.1 Hopfield Training

4.1.1 NOB

A "Neural Optimization Box" (NOB) is a specialized ANN that has undergone Hopfield training to set the weights such that it will optimize a specific problem. To find a good or best solution to a problem with many possible solutions, you should use your NOB.

[Portfolio optimization problem to go here. Stock market optimizer: find the minimum-variance portfolio given the Betas of all of the stocks available in the world. Take advantage of the correlation function of a neuron. Set weights between stocks to reflect the Betas. Where to read the number of each stock, X, to buy? The adjusted self-connection delay? Or in the energy of the system or the pulsings of each neuron? Is the cost of each stock important? (Probably not, since the MV point is not related to average return. May want to seek just the "efficient set".) Local minima problems? Assumes the network will automatically seek minimum variance based upon the energy minimization built into its learning rule. Corporate Finance 285. On p290, the stocks are arranged in an "N x N" matrix, Tij = Tji, and the % of $ to spend on each stock is given as the "weight"! Put in a section called Hopfield Training after the Supervised and Reinforcement Learning sections. p320 Betas: inflation, GNP, interest rates. Note that in a real stock market, any arbitrage will cause the stock prices to adjust themselves. Any comparison to the way self-connection weights adjust themselves to balance average inputs?]

4.2 Supervised Training

Consider a horse pulling a wagon and driver on a fixed route. Initially, the driver must steer the horse in the directions along the route. As the days go by, however, the horse will eventually learn to walk the route without supervision, and the driver may sleep along the way.

If the desired states of the individual neurons in response to a given input are known, training is simple: clamp the neurons at the desired states at the given times and allow the weights to modify themselves. When the neurons are released, they will continue to fire. If the weights that were trained while the neurons were clamped create stable patterns, no further learning will occur. If the weights cause the neurons to fire chaotically, the weights will continue to settle, causing your model to deviate while matching it as closely as possible given the number of neurons, inputs, and complexity.
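A minimal sketch of the clamping procedure for a single output neuron, with an assumed one-input network and a simplified potentiation step standing in for the full learning rule. While clamped, the neuron is held at the desired depolarized state and the triggered weight grows; once released, the neuron continues to respond as taught:

   with Ada.Text_IO; use Ada.Text_IO;

   procedure Clamp_Sketch is
      L  : constant Float := 0.5;  -- assumed learning rate
      X  : constant Float := 1.0;  -- presynaptic (triggered) input
      W  : Float := 0.0;           -- synaptic weight, initially disconnected
      Vm : Float;
   begin
      --  Training: clamp Vm to the desired (depolarized) response and let
      --  a simplified learning step potentiate the triggered synapse.
      for Step in 1 .. 5 loop
         Vm := 1.0;                -- clamped to the desired state
         W  := W + L * Vm * X;     -- potentiate while depolarized
      end loop;

      --  Released: the neuron is now driven by its own trained weight and
      --  continues to fire as it was taught.
      Vm := W * X;
      Put_Line ("after release, Vm =" & Float'Image (Vm));
   end Clamp_Sketch;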
4.3 Reinforcement Training

[In reinforcement training, unlike supervised training, there is no teacher. Instead, the correct output must be determined based on an indicator. That indicator in biology is pain. Pain alone, however, would be just another input unless we define pain and our rules in such a way that pain forces the system to change. Since our rules cause the weights to change with an increase in the frequency of an input, pain will not be ignored. The system will stop changing when the weight changes cause a reduction of pain due to a change in the network function, or when the network finds a stable state (local minimum) with the new pain input. Pain is our universal "cost", which should not have to be tailor-made to each problem. The rule is to increase the input frequency based on our error. Continuous and Discrete Models forthcoming.]

Training is more difficult if only the desired output is known and the desired states for the neurons between the input and the output are not. As it is unknown what intermediate states will create the desired output, the individual neurons must be trained based upon a comparison of the output results to the desired outputs, which indicates only whether the error of the whole network was small or large. The training algorithm must perform without directly modifying any of the intermediate neuron states, with only the network output error as a guide.

The training algorithm depends upon the timing of the input presentation. If a neuron sees an input presented repeatedly, it will assume that it responded incorrectly to the input the first time and modify its weights, settling only when the input presentation stops for a sufficient length of time. Where the input is presented indirectly to the neuron, such as in multi-layer networks with hidden layers, the timing becomes complicated, as it is difficult to train indirectly connected neurons without untraining directly connected neurons.

One strategy is to make the input presentation frequency proportional to the error in the final output. As the error diminishes, the input is presented less frequently, allowing the weights to settle. If the error increases, the input frequency is increased, causing the weights to continue to change until the new network function produces an output closer to the desired output. The input presentation frequency is bounded between zero and faster than the neuron can react. If the error is still unacceptable after reaching one of these limits, it may be necessary to reverse direction or to oscillate gradually around the desired minimum-error frequency to allow the network to settle.
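A minimal sketch of the error-proportional presentation strategy, with an assumed maximum frequency and an assumed error decay standing in for an actual network response:

   with Ada.Text_IO; use Ada.Text_IO;

   procedure Pain_Frequency_Sketch is
      Max_Frequency : constant Float := 10.0;  -- assumed upper bound (Hz)
      Error         : Float := 1.0;            -- normalized output error
   begin
      for Trial in 1 .. 5 loop
         declare
            --  Presentation frequency is proportional to the output error,
            --  the "level of pain".
            Frequency : constant Float := Max_Frequency * Error;
         begin
            Put_Line ("trial" & Integer'Image (Trial) &
                      "  error =" & Float'Image (Error) &
                      "  presentation frequency =" & Float'Image (Frequency));
            --  Assume each repetition trains the network and reduces the error.
            Error := Error * 0.5;
         end;
      end loop;
   end Pain_Frequency_Sketch;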
Pain and Pleasure

This training algorithm can be said to put a neural network in "pain" when it repetitiously presents the inputs in order to train the network. The neural network will modify its weights until its function is sufficiently different to minimize the output error, which is proportional to the input frequency, or the level of pain. The training algorithm's definition of pleasure can be read in the response to the following joke.

David: "Ann, why do you keep punching your head with your fist?"
Ann: "Because it feels so good when I stop!"

4.3.1 NOUS

Neural Optimization Utilization System

4.3.2 ORBS

[Robotic bugs that eat algae. They have no inputs except their hunger. They have 4 legs to move them in the 4 cardinal directions. They will move about the field in patterns that are dictated by their hunger input. They will learn to rest when not hungry, cruise when hungry and not finding food in the area, and twirl when hungry and finding food in the area. A take-off on a Scientific American "Computer Recreations" article that used evolution, instead of a neural network, to train movement patterns. Note the difference between "life" (genetic bugs) and "intelligent life" (neural bugs). The difference is in the "pain" (hunger) input, which causes an optimized behavioral modification. Relate how we could be "bugs" to solve a TSP (green algae => cities with money). For use in supervised training, the driver should be blind to the surroundings, as are the bugs.]

These bugs could be said to be in a different ORB, or Optimizing Reality Basis, from our own.

4.4 Unsupervised Training

[Music is a pattern of sounds, as opposed to noise, which is patternless or random. Note the human mind's affinity toward learning patterns and relate it to stability.]

Things to Possibly Add or Change

Cerebellum. Nose-touch control. Drive-reinforcement? Gain is maintained at 1 as the eyes track an image while the head moves. The cerebellum is the teacher. "Homunculus"? The inferior olive and climbing fiber provide the error signal. LTD occurs at Purkinje cells, which are (exclusively?) inhibitory.

Forward inculcation is limited. If there could be pain signals going to neurons that don't have direct contact with the inputs other than the pain itself, the ANN would have a better clue as to what part of itself needs to be modified. That is, it is easier to learn not to put your hand in the fire if, every time you do so, you feel pain in your hand as opposed to your foot. Additionally, "forward" inculcation may be a misnomer in that FISL changes the "weights" of both the axons and the synapses (backwards and forwards, respectively).

Monitoring chat mode on a computer communications system would be an easy way to get training data for a Turing System undergoing supervised training. As two users chat, two ANNs could be undergoing output clamping. SysOps-In-A-Box (SIABs) have been around for a while and fool many users into believing that they are chatting with a human. To demonstrate FIRE training, users would converse with the SIAB, repeating their input whenever the SIAB gives a meaningless response. FIRE training could be combined with or seeded by supervised training to accelerate the initial learning.

Note that the equations for the activation of ion-gated channels are not time-dependent. If I throw a ball straight up and take a snapshot of it sometime later when it is 10' up, you cannot tell from the snapshot whether it was going up or on its way down when I photographed it, unless you can photograph inertia or momentum. If you can take a snapshot of momentum, you can determine the next state of the system without depending on any previous states except the current state.

If I flip a weighted coin, will it be heads (+1.0) or tails (0.0)? Possible answers include: +1.0, 0.0, don't know, +1.0 and 0.0, 0.5, 0.5 + or - 0.5. If I now give more information, the answer may improve: +1.0, 0.0, don't know, +1.0 nine times and 0.0 once, 0.9, 0.9 + 0.1 or - 0.9. If I allow only one answer of either +1.0 or 0.0, it should be +1.0. If I allow an analog response, it should be 0.9. If I allow only +1.0 or 0.0 but allow the system to give me 10 responses, it should be +1.0 nine times and 0.0 once. The "don't know" answer may or may not be desirable given a threshold of uncertainty. Suppose now that the coin periodically lands tails on every 10th flip and heads otherwise.

Create a bestiary of garden slugs that are blind to all but smell and wind direction, monsters that eat chess pieces, bugs that only sense hunger but can move, etc. Set up an ORB and model the history of data. When run given the same initial conditions, it will always come to the same conclusion. Allow interaction at any point in time of the system to allow "what-if" scenarios.

Training can spin the ANN off into a local minimum unless the training inputs are carefully selected and presented in a set order. Stochastic Training seems to avoid this problem. "Learning with Hints" may be a method of presenting the training inputs in a fashion which avoids local minima. Consider trying to teach calculus to students before they know basic arithmetic. Consider stereotyping given a biased set of inputs.

Does the dL/dD rule already incorporate Kohonen self-association as the neurons gradually reach out to neurons with closely correlated patterns?
Or would we have to actually modify the distances between neurons over time, d(dL/dD)/dt? Put into the Unsupervised Training section.

Compare stochastic learning's ability to escape local minima to dL/dActivity. Stochastic Learning: calculate the probability that the weight will change to another value given the current value of the weight and all of the possible inputs to that weight.

What happens if the horse sees something unusual along the way? Will new input variables throw off supervised-training networks once out of the training phase? If a creature is hyper-sensitive to pain (high-frequency input), will it become paralyzed (saturated learning)? If a creature is insensitive to pain, will it do nothing to learn and die?

Hypothesis: the conversion from polyneuronal innervation during development to single innervation at maturity is due to the Frustration Rule (phase competition), not competition for trophic factors (KSJ 943). Use for a competitive Kohonen example. Demonstrate with the 4 motor axons of a "bug" being connected to all 4 legs initially and then eventually only being connected to one leg each.

Does Ch_Max saturation help in the 3-neuron layered XOR problem? Or just stochastic training without a Ch_Max? XOR with 1 neuron: inputs A and B as constant currents; the neuron has a single weight to itself. Put in the software demo. Note the phase relation.

Klopf's Drive-Reinforcement Learning (y from 0 to 1):

   dW/dt = dy/dt * sum ( L * abs ( Wj ) * dXj/dt )

Croft's FIST Learning (y from -1 to +1):

   dW/dt = y * L * abs ( W ) * X

These seem close, but Klopf deals in frequency changes. Same behavioral characteristics on the macro level? Possibly the biological explanation for Klopf's work on the micro (pulsed) level?
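A minimal side-by-side sketch of the two update rules, reduced to a single synapse with illustrative signal values. Klopf's rule correlates changes in pre- and postsynaptic activity, while the FIST rule correlates the presynaptic input directly with the signed postsynaptic level, so only FIST can depress a weight when y goes negative:

   with Ada.Text_IO; use Ada.Text_IO;

   procedure Klopf_Fist_Sketch is
      L : constant Float := 0.1;  -- assumed learning rate

      --  Klopf (single synapse): dW = dY * L * abs ( W ) * dX   (Y in 0.0 .. 1.0)
      function Klopf_dW (dY, W, dX : Float) return Float is
      begin
         return dY * L * abs (W) * dX;
      end Klopf_dW;

      --  FIST: dW = Y * L * abs ( W ) * X                       (Y in -1.0 .. +1.0)
      function Fist_dW (Y, W, X : Float) return Float is
      begin
         return Y * L * abs (W) * X;
      end Fist_dW;
   begin
      --  Input and output rising together: both rules potentiate.
      Put_Line ("Klopf dW =" & Float'Image (Klopf_dW (0.5, 1.0, 0.5)));
      Put_Line ("FIST  dW =" & Float'Image (Fist_dW (0.5, 1.0, 0.5)));
      --  Hyperpolarized (Y < 0): only FIST depresses the weight.
      Put_Line ("FIST  dW =" & Float'Image (Fist_dW (-0.5, 1.0, 0.5)));
   end Klopf_Fist_Sketch;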
Change FIST to FIRE (Frustrated-Inhibit, Responsive-Enhance)? Give the analogy of pain (a high-frequency input) associated with sticking one's fist in a fire. One will quickly learn to move it out, not into the fire, nor just leave it there (unless a local minimum -- out of the frying pan, into the fire).

Given a neuron with a non-pulsed constant input and a single connection to itself with a random weight and some delay, we would expect its weight to converge to some value over time. This value would be dependent upon the value of the constant input and the self-connection delay. The weight would reflect the correlation (autocorrelation, or possibly even variance?) between the neuron's state and its delayed previous state. If the delayed signal arrived when the neuron was tired, the weight would be -1. If it arrived when the neuron was ready to fire again, it would be +1. It would correlate the phase difference due to the delay and would be dependent on the constant value of the input (the average value of pulsed inputs? Klopf?). Neurotransmitter and Cm charge-up delay could probably be modeled by Cm alone.

Encryption. One can encrypt, encode, or compress by sending a signal that is processed in parallel (real-time) by an ANN with a specific "combination" of weights. May be able to perform one-way password protection. May not work if some kind of "energy" analysis allows one to separate and decode the data (Kohonen algorithm). Best if decrypted on a non-continuous neural network that is very chaotic.

Note that this model allows neurons to have connections to themselves without causing the neurons to fire repetitiously. Since inputs from itself will arrive, after a delay, when the neuron has transitioned to the tired state, the synapse will be frustrated to the point of becoming disconnected. That is, this training algorithm will automatically zero the diagonal of its weight matrix. Question: does this training algorithm tend to make Tij = Tji, since it seeks stability?

What happens when neurons fire faster than desired (the network plays a pattern too fast)? Increase precision to slow down? Increase the number of neurons to create delay loops which build over time before firing the desired output? Show how you can trade off the precision of the inputs by increasing the number of neurons.

Hypothesis: synapses learn best when there is a high correlation between their firing frequencies; this may occur at periodic multiples; they learn best when phase differences are low. Synapses become weaker (less excitatory) when continuously fired in phase with the receiving neuron's tired state. This effect is seen in real-life neurobiology.

Check for the effect of a tetanus on the model. It should enhance the weights. How is this related to Klopf's work? Equivalent to a positive correlation function of the phase (but not a negative one)?

Optimizer: settles into a small loop of network states; becomes stable. Sequencer: plays a large loop of network states (music box, movie player). Are both content-addressable memories? Create a sequencer that plays a sinusoid.

Test with software that will do generic pattern matching to evaluate the theory and measure the performance. Initial software for the continuous model has been developed and is looking promising. It exhibits the desirable characteristic of input phase-difference detection. Will create a menu of options to demonstrate interesting single- and multi-neuron characteristics.

Find mathematical limitations: stability constraints, number of neurons per memory, training time, universal computation (any binary function). Compare and contrast to the approaches of Hopfield and Klopf. Show biological plausibility.

Weight changes may be due to one or both of the following: a change in the number of receptor channels and/or a change in the amount of neurotransmitter released. When the neuron state is tired, current may attempt to flow the wrong way through the channels, causing the channels to be destroyed or created, and vice versa when the neuron state is oncoming or firing; on the presynaptic side, the amount of neurotransmitter released may be based upon the previous amounts that were absorbed -- it may learn by sensing the gradient in the synapse.

Music Box: collections of neurons represent different frequencies; the weights dictate timing and sequence. Use for audible observations of instability, error, and training strategies.

Discarded Material

Learning occurs when an excitatory input arrives and the neuron is either tired, oncoming, or firing.

A "tetanus" is a short burst of high-frequency excitatory inputs. (It may have only a short-duration effect on presynaptic neurotransmitter release, related to a temporary build-up of intracellular residual Ca2+ (KSJ 207); uninteresting from our standpoint.)

Already incorporated using the dL/dD rule? Is long-term potentiation (KSJ Ch 65) interesting?

Predict the results of the application of the activation rule for the given inputs, compare the predicted activated responses to the desired responses and change the current neuron states if necessary, then apply the learning rule. If the activation rule dictated that the neuron should transition to a state that was not the desired response state, the learning rule will modify the weights correctly.

Normal stability rules may not apply to pulse neurons.
There should be a unifying field theory between complexity, stability, and energy minimization that would allow the complexity issue to be an integral part of FIST.

   Weight_Delta := Learning_Rate * Neurotransmitter * Vm

where the quantity of neurotransmitter released is always non-negative. Or maybe,

   W_D := L_R * abs ( W ) * Nt * Vm

Consider whether to allow a negative correlation (inhibitory weight) when out of phase, or to just go to zero. If it should just go to zero, then

   W_D := L_R * Sgn ( W ) * Nt * Vm

which may be necessary to incorporate the 5th state and prevent saturation in the inhibitory direction?

The "synaptic learning mode" describes a synapse's propensity to learn at a given time: enhanced, normal, or frustrated.

Frustrated Inculcated Synapse Training (FIST): a neural network training algorithm to enable an ANN to duplicate any desired function.

A_Na and A_K indicate the activation of voltage-gated ion channels. They might be related to the second derivative of Vm. For now, let A_Na := 1.0 when in the firing phase and 0.0 in any other, and let A_K := 1.0 when in the tiring and degenerating phases and 0.0 in any other. Another way to express A_Na and A_K is as follows:

   if ( dVm >= Thr ) and ( Vm >= 0.0 ) then
      A_Na := 10.0;
   else
      A_Na := 0.0;
   end if;

   if ( dVm <= -Thr ) then
      A_K := 10.0;
   else
      A_K := 0.0;
   end if;

The next state of each neuron is indifferent to the previous state and is dependent only upon the current inputs, unless the previous state was Firing or Tired.