diff --git a/src/ontology/aio-src.csv b/src/ontology/aio-src.csv
index 7dd0b36..973ba47 100644
--- a/src/ontology/aio-src.csv
+++ b/src/ontology/aio-src.csv
@@ -1,28 +1,28 @@
Name,ID,Category,Synomyms,Related synonyms,Description,Reference (DOI),Layers,Parent
LABEL,ID,,A oio:hasExactSynonym SPLIT=|,A oio:hasRelatedSynonym SPLIT=|,A IAO:0000115,>A oio:hasDbXref,A rdfs:comment,SC % SPLIT=|
-Denoising Auto Encoder,AIO:DenoisingAutoEncoder,Network,DAE,,"Denoising Auto Encoders (DAEs) take a partially corrupted input and are trained to recover the original undistorted input. In practice, the objective of denoising autoencoders is that of cleaning the corrupted input, or denoising. (https://en.wikipedia.org/wiki/Autoencoder)",,"Noisy Input, Hidden, Matched Output-Input",AIO:AE
-Deep Convolutional Inverse Graphics Network,AIO:DeepConvolutionalInverseGraphicsNetwork,Network,DCIGN,,A Deep Convolution Inverse Graphics Network (DC-IGN) is a model that learns an interpretable representation of images. This representation is disentangled with respect to transformations such as out-of-plane rotations and lighting variations. The DC-IGN model is composed of multiple layers of convolution and de-convolution operators and is trained using the Stochastic Gradient Variational Bayes (SGVB) algorithm. (https://arxiv.org/abs/1503.03167),,"Input, Kernel, Convolutional/Pool, Probabilistic Hidden, Convolutional/Pool, Kernel, Output",AIO:AE
-Sparse AE,AIO:SparseAE,Network,SAE,,"Sparse autoencoders may include more (rather than fewer) hidden units than inputs, but only a small number of the hidden units are allowed to be active at the same time (thus, sparse). This constraint forces the model to respond to the unique statistical features of the training data. (https://en.wikipedia.org/wiki/Autoencoder)",,"Input, Hidden, Matched Output-Input",AIO:AE
-Variational Auto Encoder,AIO:VariationalAutoEncoder,Network,VAE,,Variational autoencoders are meant to compress the input information into a constrained multivariate latent distribution (encoding) to reconstruct it as accurately as possible (decoding). (https://en.wikipedia.org/wiki/Variational_autoencoder),,"Input, Probabilistic Hidden, Matched Output-Input",AIO:AE
-Deep Neural Network,AIO:DeepNeuralNetwork,Network,DNN,,"A deep neural network (DNN) is an artificial neural network (ANN) with multiple layers between the input and output layers.[13][2] There are different types of neural networks but they always consist of the same components: neurons, synapses, weights, biases, and functions. (https://en.wikipedia.org/wiki/Deep_Learning#:~:text=A%20deep%20neural%20network%20(DNN,weights%2C%20biases%2C%20and%20functions.)",,,AIO:ANN
-Feedback Network,AIO:FeedbackNetwork,Network,FBN,,A feedback based approach in which the representation is formed in an iterative manner based on a feedback received from previous iteration's output. (https://arxiv.org/abs/1612.09508),,"Input, Hidden, Output, Hidden",AIO:ANN
-Perceptron,AIO:Perceptron,Network,Single Layer Perceptron|SLP,,"The perceptron is an algorithm for supervised Learning of binary classifiers. A binary classifier is a function which can decide whether or not an input, represented by a vector of numbers, belongs to some specific class. It is a type of linear classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector. (https://en.wikipedia.org/wiki/Perceptron)",,"Input, Output",AIO:ANN
-word2vec-CBOW,AIO:word2vec-CBOW,Network,W2V-CBOW,CBOW,"In the continuous bag-of-words architecture, the model predicts the current word from a window of surrounding context words. The order of context words does not influence prediction (bag-of-words assumption). (https://en.wikipedia.org/wiki/Word2vec)",https://en.wikipedia.org/wiki/Word2vec,"Input, Hidden, Output",AIO:ANN
-word2vec-SkipGram,AIO:word2vec-SkipGram,Network,W2V-SkipGram,SkipGram,"In the continuous skip-gram architecture, the model uses the current word to predict the surrounding window of context words. The skip-gram architecture weighs nearby context words more heavily than more distant context words.",https://en.wikipedia.org/wiki/Word2vec,"Input, Hidden, Output",AIO:ANN
-Restricted Boltzmann Machine,AIO:RestrictedBoltzmannMachine,Network,RBM,,A restricted Boltzmann machine (RBM) is a generative stochastic artificial neural network that can learn a probability distribution over its set of inputs.,https://en.wikipedia.org/wiki/Restricted_Boltzmann_machine,"Backfed Input, Probabilistic Hidden",AIO:BM
-Deep Convolutional Network,AIO:DeepConvolutionalNetwork,Network,DCN|Convolutional Neural Network|CNN|ConvNet,,"A convolutional neural network (CNN, or ConvNet) is a class of artificial neural network, most commonly applied to analyze visual imagery. They are also known as shift invariant or space invariant artificial neural networks (SIANN), based on the shared-weight architecture of the convolution kernels or filters that slide along input features and provide translation equivariant responses known as feature maps. CNNs are regularized versions of multilayer perceptrons. (https://en.wikipedia.org/wiki/Convolutional_neural_network)",https://en.wikipedia.org/wiki/Convolutional_neural_network,"Input, Kernel, Convolutional/Pool, Hidden, Output",AIO:DNN
-Deep FeedFoward,AIO:DeepFeedFoward,Network,DFF|Feedforward Network|FFN|Multilayer Perceptoron|MLP,,"The feedforward neural network was the first and simplest type of artificial neural network devised. In this network, the information moves in only one direction—forward—from the input nodes, through the hidden nodes (if any) and to the output nodes. There are no cycles or loops in the network.",https://en.wikipedia.org/wiki/Feedforward_neural_network,"Input, Hidden, Output",AIO:DNN
-Deconvolutional Network,AIO:DeconvolutionalNetwork,Network,DN,,"Deconvolutional Networks, a framework that permits the unsupervised construction of hierarchical image representations. These representations can be used for both low-level tasks such as denoising, as well as providing features for object recognition. Each level of the hierarchy groups information from the level beneath to form more complex features that exist over a larger scale in the image. (https://ieeexplore.ieee.org/document/5539957)",https://ieeexplore.ieee.org/document/5539957,"Input, Kernel, Convolutional/Pool, Output",AIO:DNN
-Graph Convolutional Network,AIO:GraphConvolutionalNetwork,Network,GCN,,GCN is a type of convolutional neural network that can work directly on graphs and take advantage of their structural information. (https://arxiv.org/abs/1609.02907),https://arxiv.org/abs/1609.02907,"Input, Hidden, Hidden, Output",AIO:DNN
-Recurrent Neural Network,AIO:RecurrentNeuralNetwork,Network,RecNN|Recurrent Network|RN,,"A recurrent neural network (RNN) is a class of artificial neural networks where connections between nodes form a directed graph along a temporal sequence. This allows it to exhibit temporal dynamic behavior. Derived from feedforward neural networks, RNNs can use their internal state (memory) to process variable length sequences of inputs.",https://en.wikipedia.org/wiki/Recurrent_neural_network,"Input, Memory Cell, Output",AIO:DNN
-Recursive Neural Network,AIO:RecursiveNeuralNetwork,Network,RecuNN|RvNN,,"A recursive neural network is a kind of deep neural network created by applying the same set of weights recursively over a structured input, to produce a structured prediction over variable-size input structures, or a scalar prediction on it, by traversing a given structure in topological order. Recursive neural networks, sometimes abbreviated as RvNNs, have been successful, for instance, in Learning sequence and tree structures in natural language processing, mainly phrase and sentence continuous representations based on word embedding.",https://en.wikipedia.org/wiki/Recursive_neural_network,,AIO:DNN
-Residual Neural Network,AIO:ResidualNeuralNetwork,Network,ResNN|Deep Residual Network|ResNet|DRN,,"A residual neural network (ResNet) is an artificial neural network (ANN) of a kind that builds on constructs known from pyramidal cells in the cerebral cortex. Residual neural networks do this by utilizing skip connections, or shortcuts to jump over some layers. Typical ResNet models are implemented with double- or triple- layer skips that contain nonlinearities (ReLU) and batch normalization in between. An additional weight matrix may be used to learn the skip weights; these models are known as HighwayNets. Models with several parallel skips are referred to as DenseNets. In the context of residual neural networks, a non-residual network may be described as a 'plain network'.",https://en.wikipedia.org/wiki/Residual_neural_network,"Input, Weight, BN, ReLU, Weight, BN, Addition, ReLU",AIO:DNN
-Transformer Network,AIO:TransformerNetwork,Network,,,"A transformer is a deep Learning model that adopts the mechanism of attention, differentially weighing the significance of each part of the input data. It is used primarily in the field of natural language processing (NLP) and in computer vision (CV). (https://en.wikipedia.org/wiki/Transformer_(machine_Learning_model))",https://en.wikipedia.org/wiki/Transformer_(machine_Learning_model),,AIO:DNN
-Extreme Learning Machine,AIO:ExtremeLearningMachine,Network,ELM,,"Extreme Learning machines are feedforward neural networks for classification, regression, clustering, sparse approximation, compression and feature Learning with a single layer or multiple layers of hidden nodes, where the parameters of hidden nodes (not just the weights connecting inputs to hidden nodes) need not be tuned. These hidden nodes can be randomly assigned and never updated (i.e. they are random projection but with nonlinear transforms), or can be inherited from their ancestors without being changed. In most cases, the output weights of hidden nodes are usually learned in a single step, which essentially amounts to Learning a linear model. (https://en.wikipedia.org/wiki/Extreme_Learning_machine)",https://en.wikipedia.org/wiki/Extreme_Learning_machine,"Input, Hidden, Output",AIO:FBN
-Radial Basis Network,AIO:RadialBasisNetwork,Network,RBN|Radial Basis Function Network|RBFN,,"Like recurrent neural networks (RNNs), transformers are designed to handle sequential input data, such as natural language, for tasks such as translation and text summarization. However, unlike RNNs, transformers do not necessarily process the data in order. Rather, the attention mechanism provides context for any position in the input sequence.",https://en.wikipedia.org/wiki/Radial_basis_function_network,"Input, Hidden, Output",AIO:DFF
-Neural Turing Machine Network,AIO:NeuralTuringMachineNetwork,Network,NTM,,"A Neural Turing machine (NTMs) is a recurrent neural network model. The approach was published by Alex Graves et al. in 2014. NTMs combine the fuzzy pattern matching capabilities of neural networks with the algorithmic power of programmable computers. An NTM has a neural network controller coupled to external memory resources, which it interacts with through attentional mechanisms. The memory interactions are differentiable end-to-end, making it possible to optimize them using gradient descent. An NTM with a long short-term memory (LSTM) network controller can infer simple algorithms such as copying, sorting, and associative recall from examples alone.",https://en.wikipedia.org/wiki/Neural_Turing_machine,"Input, Hidden, Spiking Hidden, Output",AIO:DFF|AIO:LSTM
-Graph Convolutional Policy Network,AIO:GraphConvolutionalPolicyNetwork,Network,GPCN,,"Graph Convolutional Policy Network (GCPN), a general graph convolutional network based model for goal-directed graph generation through reinforcement Learning. The model is trained to optimize domain-specific rewards and adversarial loss through policy gradient, and acts in an environment that incorporates domain-specific rules.",https://arxiv.org/abs/1806.02473,"Input, Hidden, Hidden, Policy, Output",AIO:GCN
-Gated Recurrent Unit,AIO:GatedRecurrentUnit,Network,GRU,,"Gated recurrent units (GRUs) are a gating mechanism in recurrent neural networks, introduced in 2014 by Kyunghyun Cho et al. The GRU is like a long short-term memory (LSTM) with a forget gate, but has fewer parameters than LSTM, as it lacks an output gate. GRU's performance on certain tasks of polyphonic music modeling, speech signal modeling and natural language processing was found to be similar to that of LSTM.[4][5] GRUs have been shown to exhibit better performance on certain smaller and less frequent datasets.",https://en.wikipedia.org/wiki/Gated_recurrent_unit,"Input, Memory Cell, Output",AIO:LSTM
+Denoising Auto Encoder,AIO:DenoisingAutoEncoder,Network,DAE|Denoising Autoencoder,,"Denoising Auto Encoders (DAEs) take a partially corrupted input and are trained to recover the original undistorted input. In practice, the objective of denoising autoencoders is that of cleaning the corrupted input, or denoising. (https://en.wikipedia.org/wiki/Autoencoder)",https://doi.org/10.1145/1390156.1390294,"Noisy Input, Hidden, Matched Output-Input",AIO:AutoEncoderNetwork
+Deep Convolutional Inverse Graphics Network,AIO:DeepConvolutionalInverseGraphicsNetwork,Network,DCIGN,,A Deep Convolution Inverse Graphics Network (DC-IGN) is a model that learns an interpretable representation of images. This representation is disentangled with respect to transformations such as out-of-plane rotations and lighting variations. The DC-IGN model is composed of multiple layers of convolution and de-convolution operators and is trained using the Stochastic Gradient Variational Bayes (SGVB) algorithm. (https://arxiv.org/abs/1503.03167),TBD,"Input, Kernel, Convolutional/Pool, Probabilistic Hidden, Convolutional/Pool, Kernel, Output",AIO:AutoEncoderNetwork
+Sparse Auto Encoder,AIO:SparseAutoEncoder,Network,SAE|Sparse AE|Sparse Autoencoder,,"Sparse autoencoders may include more (rather than fewer) hidden units than inputs, but only a small number of the hidden units are allowed to be active at the same time (thus, sparse). This constraint forces the model to respond to the unique statistical features of the training data. (https://en.wikipedia.org/wiki/Autoencoder)",TBD,"Input, Hidden, Matched Output-Input",AIO:AutoEncoderNetwork
+Variational Auto Encoder,AIO:VariationalAutoEncoder,Network,VAE,,Variational autoencoders are meant to compress the input information into a constrained multivariate latent distribution (encoding) to reconstruct it as accurately as possible (decoding). (https://en.wikipedia.org/wiki/Variational_autoencoder),TBD,"Input, Probabilistic Hidden, Matched Output-Input",AIO:AutoEncoderNetwork
+Deep Neural Network,AIO:DeepNeuralNetwork,Network,DNN,,"A deep neural network (DNN) is an artificial neural network (ANN) with multiple layers between the input and output layers. There are different types of neural networks but they always consist of the same components: neurons, synapses, weights, biases, and functions. (https://en.wikipedia.org/wiki/Deep_Learning#:~:text=A%20deep%20neural%20network%20(DNN,weights%2C%20biases%2C%20and%20functions.)",TBD,,AIO:ArtificialNeuralNetwork
+Feedback Network,AIO:FeedbackNetwork,Network,FBN,,A feedback-based approach in which the representation is formed in an iterative manner based on feedback received from the previous iteration's output. (https://arxiv.org/abs/1612.09508),TBD,"Input, Hidden, Output, Hidden",AIO:ArtificialNeuralNetwork
+Perceptron,AIO:Perceptron,Network,Single Layer Perceptron|SLP,,"The perceptron is an algorithm for supervised learning of binary classifiers. A binary classifier is a function which can decide whether or not an input, represented by a vector of numbers, belongs to some specific class. It is a type of linear classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector. (https://en.wikipedia.org/wiki/Perceptron)",TBD,"Input, Output",AIO:ArtificialNeuralNetwork
+word2vec-CBOW,AIO:word2vec-CBOW,Network,W2V-CBOW,CBOW,"In the continuous bag-of-words architecture, the model predicts the current word from a window of surrounding context words. The order of context words does not influence prediction (bag-of-words assumption). (https://en.wikipedia.org/wiki/Word2vec)",https://en.wikipedia.org/wiki/Word2vec,"Input, Hidden, Output",AIO:ArtificialNeuralNetwork
+word2vec-SkipGram,AIO:word2vec-SkipGram,Network,W2V-SkipGram,SkipGram,"In the continuous skip-gram architecture, the model uses the current word to predict the surrounding window of context words. The skip-gram architecture weighs nearby context words more heavily than more distant context words.",https://en.wikipedia.org/wiki/Word2vec,"Input, Hidden, Output",AIO:ArtificialNeuralNetwork
+Restricted Boltzmann Machine,AIO:RestrictedBoltzmannMachine,Network,RBM,,A restricted Boltzmann machine (RBM) is a generative stochastic artificial neural network that can learn a probability distribution over its set of inputs.,https://en.wikipedia.org/wiki/Restricted_Boltzmann_machine,"Backfed Input, Probabilistic Hidden",AIO:BoltzmannMachineNetwork
+Deep Convolutional Network,AIO:DeepConvolutionalNetwork,Network,DCN|Convolutional Neural Network|CNN|ConvNet,,"A convolutional neural network (CNN, or ConvNet) is a class of artificial neural network, most commonly applied to analyze visual imagery. They are also known as shift invariant or space invariant artificial neural networks (SIANN), based on the shared-weight architecture of the convolution kernels or filters that slide along input features and provide translation equivariant responses known as feature maps. CNNs are regularized versions of multilayer perceptrons. (https://en.wikipedia.org/wiki/Convolutional_neural_network)",https://en.wikipedia.org/wiki/Convolutional_neural_network,"Input, Kernel, Convolutional/Pool, Hidden, Output",AIO:DeepNeuralNetwork
+Deep FeedForward,AIO:DeepFeedForward,Network,DFF|Feedforward Network|FFN|Multilayer Perceptron|MLP,,"The feedforward neural network was the first and simplest type of artificial neural network devised. In this network, the information moves in only one direction—forward—from the input nodes, through the hidden nodes (if any) and to the output nodes. There are no cycles or loops in the network.",https://en.wikipedia.org/wiki/Feedforward_neural_network,"Input, Hidden, Output",AIO:DeepNeuralNetwork
+Deconvolutional Network,AIO:DeconvolutionalNetwork,Network,DN,,"Deconvolutional Networks, a framework that permits the unsupervised construction of hierarchical image representations. These representations can be used for both low-level tasks such as denoising, as well as providing features for object recognition. Each level of the hierarchy groups information from the level beneath to form more complex features that exist over a larger scale in the image. (https://ieeexplore.ieee.org/document/5539957)",https://ieeexplore.ieee.org/document/5539957,"Input, Kernel, Convolutional/Pool, Output",AIO:DeepNeuralNetwork
+Graph Convolutional Network,AIO:GraphConvolutionalNetwork,Network,GCN,,GCN is a type of convolutional neural network that can work directly on graphs and take advantage of their structural information. (https://arxiv.org/abs/1609.02907),https://arxiv.org/abs/1609.02907,"Input, Hidden, Hidden, Output",AIO:DeepNeuralNetwork
+Recurrent Neural Network,AIO:RecurrentNeuralNetwork,Network,RecNN|Recurrent Network|RN,,"A recurrent neural network (RNN) is a class of artificial neural networks where connections between nodes form a directed graph along a temporal sequence. This allows it to exhibit temporal dynamic behavior. Derived from feedforward neural networks, RNNs can use their internal state (memory) to process variable length sequences of inputs.",https://en.wikipedia.org/wiki/Recurrent_neural_network,"Input, Memory Cell, Output",AIO:DeepNeuralNetwork
+Recursive Neural Network,AIO:RecursiveNeuralNetwork,Network,RecuNN|RvNN,,"A recursive neural network is a kind of deep neural network created by applying the same set of weights recursively over a structured input, to produce a structured prediction over variable-size input structures, or a scalar prediction on it, by traversing a given structure in topological order. Recursive neural networks, sometimes abbreviated as RvNNs, have been successful, for instance, in Learning sequence and tree structures in natural language processing, mainly phrase and sentence continuous representations based on word embedding.",https://en.wikipedia.org/wiki/Recursive_neural_network,,AIO:DeepNeuralNetwork
+Residual Neural Network,AIO:ResidualNeuralNetwork,Network,ResNN|Deep Residual Network|ResNet|DRN,,"A residual neural network (ResNet) is an artificial neural network (ANN) of a kind that builds on constructs known from pyramidal cells in the cerebral cortex. Residual neural networks do this by utilizing skip connections, or shortcuts to jump over some layers. Typical ResNet models are implemented with double- or triple- layer skips that contain nonlinearities (ReLU) and batch normalization in between. An additional weight matrix may be used to learn the skip weights; these models are known as HighwayNets. Models with several parallel skips are referred to as DenseNets. In the context of residual neural networks, a non-residual network may be described as a 'plain network'.",https://en.wikipedia.org/wiki/Residual_neural_network,"Input, Weight, BN, ReLU, Weight, BN, Addition, ReLU",AIO:DeepNeuralNetwork
+Transformer Network,AIO:TransformerNetwork,Network,,,"A transformer is a deep learning model that adopts the mechanism of attention, differentially weighting the significance of each part of the input data. It is used primarily in the field of natural language processing (NLP) and in computer vision (CV). (https://en.wikipedia.org/wiki/Transformer_(machine_Learning_model))",https://en.wikipedia.org/wiki/Transformer_(machine_Learning_model),,AIO:DeepNeuralNetwork
+Extreme Learning Machine,AIO:ExtremeLearningMachine,Network,ELM,,"Extreme Learning machines are feedforward neural networks for classification, regression, clustering, sparse approximation, compression and feature Learning with a single layer or multiple layers of hidden nodes, where the parameters of hidden nodes (not just the weights connecting inputs to hidden nodes) need not be tuned. These hidden nodes can be randomly assigned and never updated (i.e. they are random projection but with nonlinear transforms), or can be inherited from their ancestors without being changed. In most cases, the output weights of hidden nodes are usually learned in a single step, which essentially amounts to Learning a linear model. (https://en.wikipedia.org/wiki/Extreme_Learning_machine)",https://en.wikipedia.org/wiki/Extreme_Learning_machine,"Input, Hidden, Output",AIO:FeedbackNetwork
+Radial Basis Network,AIO:RadialBasisNetwork,Network,RBN|Radial Basis Function Network|RBFN,,"A radial basis function network is an artificial neural network that uses radial basis functions as activation functions. The output of the network is a linear combination of radial basis functions of the inputs and neuron parameters. (https://en.wikipedia.org/wiki/Radial_basis_function_network)",https://en.wikipedia.org/wiki/Radial_basis_function_network,"Input, Hidden, Output",AIO:DeepFeedForward
+Neural Turing Machine Network,AIO:NeuralTuringMachineNetwork,Network,NTM,,"A Neural Turing machine (NTM) is a recurrent neural network model. The approach was published by Alex Graves et al. in 2014. NTMs combine the fuzzy pattern matching capabilities of neural networks with the algorithmic power of programmable computers. An NTM has a neural network controller coupled to external memory resources, which it interacts with through attentional mechanisms. The memory interactions are differentiable end-to-end, making it possible to optimize them using gradient descent. An NTM with a long short-term memory (LSTM) network controller can infer simple algorithms such as copying, sorting, and associative recall from examples alone.",https://en.wikipedia.org/wiki/Neural_Turing_machine,"Input, Hidden, Spiking Hidden, Output",AIO:DeepFeedForward|AIO:LongShortTermMemory
+Graph Convolutional Policy Network,AIO:GraphConvolutionalPolicyNetwork,Network,GCPN,,"Graph Convolutional Policy Network (GCPN), a general graph convolutional network based model for goal-directed graph generation through reinforcement learning. The model is trained to optimize domain-specific rewards and adversarial loss through policy gradient, and acts in an environment that incorporates domain-specific rules.",https://arxiv.org/abs/1806.02473,"Input, Hidden, Hidden, Policy, Output",AIO:GraphConvolutionalNetwork
+Gated Recurrent Unit,AIO:GatedRecurrentUnit,Network,GRU,,"Gated recurrent units (GRUs) are a gating mechanism in recurrent neural networks, introduced in 2014 by Kyunghyun Cho et al. The GRU is like a long short-term memory (LSTM) with a forget gate, but has fewer parameters than LSTM, as it lacks an output gate. GRU's performance on certain tasks of polyphonic music modeling, speech signal modeling and natural language processing was found to be similar to that of LSTM. GRUs have been shown to exhibit better performance on certain smaller and less frequent datasets.",https://en.wikipedia.org/wiki/Gated_recurrent_unit,"Input, Memory Cell, Output",AIO:LongShortTermMemory
Artificial Neural Network,AIO:ArtificialNeuralNetwork,Network,ANN|NN,,"An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons. An artificial neuron receives a signal then processes it and can signal neurons connected to it. The ""signal"" at a connection is a real number, and the output of each neuron is computed by some non-linear function of the sum of its inputs. The connections are called edges. Neurons and edges typically have a weight that adjusts as Learning proceeds. The weight increases or decreases the strength of the signal at a connection. Neurons may have a threshold such that a signal is sent only if the aggregate signal crosses that threshold. Typically, neurons are aggregated into layers. Different layers may perform different transformations on their inputs. Signals travel from the first layer (the input layer), to the last layer (the output layer), possibly after traversing the layers multiple times.",https://en.wikipedia.org/wiki/Artificial_neural_network,,AIO:Network
Kohonen Network,AIO:KohonenNetwork,Network,KN|Self-Organizing Map|SOM|Self-Organizing Feature Map|SOFM,,"A self-organizing map (SOM) or self-organizing feature map (SOFM) is an unsupervised machine Learning technique used to produce a low-dimensional (typically two-dimensional) representation of a higher dimensional data set while preserving the topological structure of the data. For example, a data set with p variables measured in n observations could be represented as clusters of observations with similar values for the variables. These clusters then could be visualized as a two-dimensional ""map"" such that observations in proximal clusters have more similar values than observations in distal clusters. This can make high-dimensional data easier to visualize and analyze. An SOM is a type of artificial neural network but is trained using competitive Learning rather than the error-correction Learning (e.g., backpropagation with gradient descent) used by other artificial neural networks. The SOM was introduced by the Finnish professor Teuvo Kohonen in the 1980s and therefore is sometimes called a Kohonen map or Kohonen network.[1][2] The Kohonen map or network is a computationally convenient abstraction building on biological models of neural systems from the 1970s[3] and morphogenesis models dating back to Alan Turing in the 1950s.",https://en.wikipedia.org/wiki/Self-organizing_map,"Input, Hidden",AIO:Network
Liquid State Machine Network,AIO:LiquidStateMachineNetwork,Network,LSM,,"A liquid state machine (LSM) is a type of reservoir computer that uses a spiking neural network. An LSM consists of a large collection of units (called nodes, or neurons). Each node receives time varying input from external sources (the inputs) as well as from other nodes. Nodes are randomly connected to each other. The recurrent nature of the connections turns the time varying input into a spatio-temporal pattern of activations in the network nodes. The spatio-temporal patterns of activation are read out by linear discriminant units. The soup of recurrently connected nodes will end up computing a large variety of nonlinear functions on the input. Given a large enough variety of such nonlinear functions, it is theoretically possible to obtain linear combinations (using the read out units) to perform whatever mathematical operation is needed to perform a certain task, such as speech recognition or computer vision. The word liquid in the name comes from the analogy drawn to dropping a stone into a still body of water or other liquid. The falling stone will generate ripples in the liquid. The input (motion of the falling stone) has been converted into a spatio-temporal pattern of liquid displacement (ripples). (https://en.wikipedia.org/wiki/Liquid_state_machine)",https://en.wikipedia.org/wiki/Liquid_state_machine,"Input, Spiking Hidden, Output",AIO:Network
@@ -30,31 +30,32 @@ Markov Chain,AIO:MarkovChain,Model,MC|Markov Process|MP,,"A Markov chain or Mark
Symmetrically Connected Network,AIO:SymmetricallyConnectedNetwork,Network,SCN,,"Like recurrent networks, but the connections between units are symmetrical (they have the same weight in both directions).",https://ieeexplore.ieee.org/document/287176,,AIO:Network
Support Vector Machine,AIO:SupportVectorMachine,Network,SVM|Supper Vector Network|SVN,,"In machine Learning, support-vector machines (SVMs, also support-vector networks) are supervised Learning models with associated Learning algorithms that analyze data for classification and regression analysis. Developed at AT&T Bell Laboratories by Vladimir Vapnik with colleagues (Boser et al., 1992, Guyon et al., 1993, Vapnik et al., 1997) SVMs are one of the most robust prediction methods, being based on statistical Learning frameworks or VC theory proposed by Vapnik (1982, 1995) and Chervonenkis (1974). Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier (although methods such as Platt scaling exist to use SVM in a probabilistic classification setting). SVM maps training examples to points in space so as to maximise the width of the gap between the two categories. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall.",https://en.wikipedia.org/wiki/Support-vector_machine,"Input, Hidden, Output",AIO:Network
Unsupervised Pretrained Network,AIO:UnsupervisedPretrainedNetwork,Network,UPN,,"Unsupervised pre-training initializes a discriminative neural net from one which was trained using an unsupervised criterion, such as a deep belief network or a deep autoencoder. This method can sometimes help with both the optimization and the overfitting issues.","https://metacademy.org/graphs/concepts/unsupervised_pre_training#:~:text=Unsupervised%20pre%2Dtraining%20initializes%20a,optimization%20and%20the%20overfitting%20issues",,AIO:Network
-Echo State Network,AIO:EchoStateNetwork,Network,ESN,,"The echo state network (ESN) is a type of reservoir computer that uses a recurrent neural network with a sparsely connected hidden layer (with typically 1% connectivity). The connectivity and weights of hidden neurons are fixed and randomly assigned. The weights of output neurons can be learned so that the network can produce or reproduce specific temporal patterns. The main interest of this network is that although its behaviour is non-linear, the only weights that are modified during training are for the synapses that connect the hidden neurons to output neurons. Thus, the error function is quadratic with respect to the parameter vector and can be differentiated easily to a linear system.","https://en.wikipedia.org/wiki/Echo_state_network#:~:text=The%20echo%20state%20network%20(ESN,are%20fixed%20and%20randomly%20assigned","Input, Recurrent, Output",AIO:RecNN
-Long Short Term Memory,AIO:LongShortTermMemory,Network,LSTM,,"Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture used in the field of deep Learning. Unlike standard feedforward neural networks, LSTM has feedback connections. It can process not only single data points (such as images), but also entire sequences of data (such as speech or video). For example, LSTM is applicable to tasks such as unsegmented, connected handwriting recognition, speech recognition and anomaly detection in network traffic or IDSs (intrusion detection systems). A common LSTM unit is composed of a cell, an input gate, an output gate and a forget gate. The cell remembers values over arbitrary time intervals and the three gates regulate the flow of information into and out of the cell.",https://en.wikipedia.org/wiki/Long_short-term_memory,"Input, Memory Cell, Output",AIO:RecNN
-Boltzmann Machine Network,AIO:BoltzmannMachineNetwork,Network,BM|stochastic Hopfield network with hidden units|Sherrington–Kirkpatrick model with external field|stochastic Ising-Lenz-Little model,,"A Boltzmann machine is a type of stochastic recurrent neural network. It is a Markov random field. It was translated from statistical physics for use in cognitive science. The Boltzmann machine is based on a stochastic spin-glass model with an external field, i.e., a Sherrington–Kirkpatrick model that is a stochastic Ising Model[2] and applied to machine Learning.",https://en.wikipedia.org/wiki/Boltzmann_machine,"Backfed Input, Probabilistic Hidden",AIO:SCN
-Hopfield Network,AIO:HopfieldNetwork,Network,HN|Ising model of a neural network|Ising–Lenz–Little model,,"A Hopfield network is a form of recurrent artificial neural network and a type of spin glass system popularised by John Hopfield in 1982 as described earlier by Little in 1974 based on Ernst Ising's work with Wilhelm Lenz on the Ising model. Hopfield networks serve as content-addressable (""associative"") memory systems with binary threshold nodes, or with continuous variables. Hopfield networks also provide a model for understanding human memory. (https://en.wikipedia.org/wiki/Hopfield_network)",https://en.wikipedia.org/wiki/Hopfield_network,Backfed input,AIO:SCN
-Auto Encoder Network,AIO:AutoEncoderNetwork,Network,AE,,"An autoencoder is a type of artificial neural network used to learn efficient codings of unlabeled data (unsupervised Learning). The encoding is validated and refined by attempting to regenerate the input from the encoding. The autoencoder learns a representation (encoding) for a set of data, typically for dimensionality reduction, by training the network to ignore insignificant data (“noise”). (https://en.wikipedia.org/wiki/Autoencoder)",https://en.wikipedia.org/wiki/Autoencoder,"Input, Hidden, Matched Output-Input",AIO:UPN
-Deep Belief Network,AIO:DeepBeliefNetwork,Network,DBN,,"In machine Learning, a deep belief network (DBN) is a generative graphical model, or alternatively a class of deep neural network, composed of multiple layers of latent variables (""hidden units""), with connections between the layers but not between units within each layer. When trained on a set of examples without supervision, a DBN can learn to probabilistically reconstruct its inputs. The layers then act as feature detectors. After this Learning step, a DBN can be further trained with supervision to perform classification. DBNs can be viewed as a composition of simple, unsupervised networks such as restricted Boltzmann machines (RBMs) or autoencoders, where each sub-network's hidden layer serves as the visible layer for the next. An RBM is an undirected, generative energy-based model with a ""visible"" input layer and a hidden layer and connections between but not within layers. This composition leads to a fast, layer-by-layer unsupervised training procedure, where contrastive divergence is applied to each sub-network in turn, starting from the ""lowest"" pair of layers (the lowest visible layer is a training set). The observation that DBNs can be trained greedily, one layer at a time, led to one of the first effective deep Learning algorithms. (https://en.wikipedia.org/wiki/Deep_belief_network)",https://en.wikipedia.org/wiki/Deep_belief_network,"Backfed Input, Probabilistic Hidden, Hidden, Matched Output-Input",AIO:UPN
-Generative Adversarial Network,AIO:GenerativeAdversarialNetwork,Network,GAN,,"A generative adversarial network (GAN) is a class of machine Learning frameworks designed by Ian Goodfellow and his colleagues in 2014. Two neural networks contest with each other in a game (in the form of a zero-sum game, where one agent's gain is another agent's loss). Given a training set, this technique learns to generate new data with the same statistics as the training set. For example, a GAN trained on photographs can generate new photographs that look at least superficially authentic to human observers, having many realistic characteristics. Though originally proposed as a form of generative model for unsupervised Learning, GANs have also proven useful for semi-supervised Learning, fully supervised Learning,[ and reinforcement Learning. The core idea of a GAN is based on the ""indirect"" training through the discriminator,[clarification needed] which itself is also being updated dynamically. This basically means that the generator is not trained to minimize the distance to a specific image, but rather to fool the discriminator. This enables the model to learn in an unsupervised manner.",https://en.wikipedia.org/wiki/Generative_adversarial_network,"Backfed Input, Hidden, Matched Output-Input, Hidden, Matched Output-Input",AIO:UPN
-node2vec-CBOW,AIO:node2vec-CBOW,Network,N2V-CBOW,CBOW,"In the continuous bag-of-words architecture, the model predicts the current node from a window of surrounding context nodes. The order of context nodes does not influence prediction (bag-of-words assumption).",https://en.wikipedia.org/wiki/Word2vec,"Input, Hidden, Output",AIO:W2VCBOW
-node2vec-SkipGram,AIO:node2vec-SkipGram,Network,N2V-SkipGram,SkipGram,"In the continuous skip-gram architecture, the model uses the current node to predict the surrounding window of context nodes. The skip-gram architecture weighs nearby context nodes more heavily than more distant context nodes. (https://en.wikipedia.org/wiki/Word2vec)",https://en.wikipedia.org/wiki/Word2vec,"Input, Hidden, Output",AIO:W2VSkipGram
+Echo State Network,AIO:EchoStateNetwork,Network,ESN,,"The echo state network (ESN) is a type of reservoir computer that uses a recurrent neural network with a sparsely connected hidden layer (with typically 1% connectivity). The connectivity and weights of hidden neurons are fixed and randomly assigned. The weights of output neurons can be learned so that the network can produce or reproduce specific temporal patterns. The main interest of this network is that although its behaviour is non-linear, the only weights that are modified during training are for the synapses that connect the hidden neurons to output neurons. Thus, the error function is quadratic with respect to the parameter vector and can be differentiated easily to a linear system.","https://en.wikipedia.org/wiki/Echo_state_network#:~:text=The%20echo%20state%20network%20(ESN,are%20fixed%20and%20randomly%20assigned","Input, Recurrent, Output",AIO:RecurrentNeuralNetwork
+Long Short Term Memory,AIO:LongShortTermMemory,Network,LSTM,,"Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture used in the field of deep Learning. Unlike standard feedforward neural networks, LSTM has feedback connections. It can process not only single data points (such as images), but also entire sequences of data (such as speech or video). For example, LSTM is applicable to tasks such as unsegmented, connected handwriting recognition, speech recognition and anomaly detection in network traffic or IDSs (intrusion detection systems). A common LSTM unit is composed of a cell, an input gate, an output gate and a forget gate. The cell remembers values over arbitrary time intervals and the three gates regulate the flow of information into and out of the cell.",https://en.wikipedia.org/wiki/Long_short-term_memory,"Input, Memory Cell, Output",AIO:RecurrentNeuralNetwork
+Boltzmann Machine Network,AIO:BoltzmannMachineNetwork,Network,BM|stochastic Hopfield network with hidden units|Sherrington–Kirkpatrick model with external field|stochastic Ising-Lenz-Little model,,"A Boltzmann machine is a type of stochastic recurrent neural network. It is a Markov random field. It was translated from statistical physics for use in cognitive science. The Boltzmann machine is based on a stochastic spin-glass model with an external field, i.e., a Sherrington–Kirkpatrick model that is a stochastic Ising model and applied to machine learning.",https://en.wikipedia.org/wiki/Boltzmann_machine,"Backfed Input, Probabilistic Hidden",AIO:SymmetricallyConnectedNetwork
+Hopfield Network,AIO:HopfieldNetwork,Network,HN|Ising model of a neural network|Ising–Lenz–Little model,,"A Hopfield network is a form of recurrent artificial neural network and a type of spin glass system popularised by John Hopfield in 1982 as described earlier by Little in 1974 based on Ernst Ising's work with Wilhelm Lenz on the Ising model. Hopfield networks serve as content-addressable (""associative"") memory systems with binary threshold nodes, or with continuous variables. Hopfield networks also provide a model for understanding human memory. (https://en.wikipedia.org/wiki/Hopfield_network)",https://en.wikipedia.org/wiki/Hopfield_network,Backfed input,AIO:SymmetricallyConnectedNetwork
+Auto Encoder Network,AIO:AutoEncoderNetwork,Network,AE,,"An autoencoder is a type of artificial neural network used to learn efficient codings of unlabeled data (unsupervised learning). The encoding is validated and refined by attempting to regenerate the input from the encoding. The autoencoder learns a representation (encoding) for a set of data, typically for dimensionality reduction, by training the network to ignore insignificant data (“noise”). (https://en.wikipedia.org/wiki/Autoencoder)",https://en.wikipedia.org/wiki/Autoencoder,"Input, Hidden, Matched Output-Input",AIO:UnsupervisedPretrainedNetwork
+Deep Belief Network,AIO:DeepBeliefNetwork,Network,DBN,,"In machine Learning, a deep belief network (DBN) is a generative graphical model, or alternatively a class of deep neural network, composed of multiple layers of latent variables (""hidden units""), with connections between the layers but not between units within each layer. When trained on a set of examples without supervision, a DBN can learn to probabilistically reconstruct its inputs. The layers then act as feature detectors. After this Learning step, a DBN can be further trained with supervision to perform classification. DBNs can be viewed as a composition of simple, unsupervised networks such as restricted Boltzmann machines (RBMs) or autoencoders, where each sub-network's hidden layer serves as the visible layer for the next. An RBM is an undirected, generative energy-based model with a ""visible"" input layer and a hidden layer and connections between but not within layers. This composition leads to a fast, layer-by-layer unsupervised training procedure, where contrastive divergence is applied to each sub-network in turn, starting from the ""lowest"" pair of layers (the lowest visible layer is a training set). The observation that DBNs can be trained greedily, one layer at a time, led to one of the first effective deep Learning algorithms. (https://en.wikipedia.org/wiki/Deep_belief_network)",https://en.wikipedia.org/wiki/Deep_belief_network,"Backfed Input, Probabilistic Hidden, Hidden, Matched Output-Input",AIO:UnsupervisedPretrainedNetwork
+Generative Adversarial Network,AIO:GenerativeAdversarialNetwork,Network,GAN,,"A generative adversarial network (GAN) is a class of machine learning frameworks designed by Ian Goodfellow and his colleagues in 2014. Two neural networks contest with each other in a game (in the form of a zero-sum game, where one agent's gain is another agent's loss). Given a training set, this technique learns to generate new data with the same statistics as the training set. For example, a GAN trained on photographs can generate new photographs that look at least superficially authentic to human observers, having many realistic characteristics. Though originally proposed as a form of generative model for unsupervised learning, GANs have also proven useful for semi-supervised learning, fully supervised learning, and reinforcement learning. The core idea of a GAN is based on the ""indirect"" training through the discriminator, which itself is also being updated dynamically. This basically means that the generator is not trained to minimize the distance to a specific image, but rather to fool the discriminator. This enables the model to learn in an unsupervised manner.",https://en.wikipedia.org/wiki/Generative_adversarial_network,"Backfed Input, Hidden, Matched Output-Input, Hidden, Matched Output-Input",AIO:UnsupervisedPretrainedNetwork
+node2vec-CBOW,AIO:node2vec-CBOW,Network,N2V-CBOW,CBOW,"In the continuous bag-of-words architecture, the model predicts the current node from a window of surrounding context nodes. The order of context nodes does not influence prediction (bag-of-words assumption).",https://en.wikipedia.org/wiki/Word2vec,"Input, Hidden, Output",AIO:word2vec-CBOW
+node2vec-SkipGram,AIO:node2vec-SkipGram,Network,N2V-SkipGram,SkipGram,"In the continuous skip-gram architecture, the model uses the current node to predict the surrounding window of context nodes. The skip-gram architecture weighs nearby context nodes more heavily than more distant context nodes. (https://en.wikipedia.org/wiki/Word2vec)",https://en.wikipedia.org/wiki/Word2vec,"Input, Hidden, Output",AIO:word2vec-SkipGram
Bayesian Network,AIO:BayesianNetwork,Network,,,A probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG).,https://en.wikipedia.org/wiki/Bayesian_network,,AIO:Network
-Contrastive Learning,AIO:ContrastiveLearning,Network,,,Learning that encourages augmentations (views) of the same input to have more similar representations compared to augmentations of different inputs.,https://arxiv.org/abs/2202.14037,,AIO:DNN
-Continual Learning,AIO:ContinualLearning,Network,Incremental Learning|Life-Long Learning,,"A concept to learn a model for a large number of tasks sequentially without forgetting knowledge obtained from the preceding tasks, where the data in the old tasks are not available any more during training new ones.",https://paperswithcode.com/task/continual-learning,,AIO:DNN
-Federated Learning,AIO:FederatedLearning,Network,,,"A technique that trains an algorithm across multiple decentralized edge devices or servers holding local data samples, without exchanging them.",https://en.wikipedia.org/wiki/Federated_learning,,AIO:DNN
-Deep Active Learning,AIO:DeepActiveLearning,Network,DeepAL,,"The combination of deep learning and active learning, where active learning attempts to maximize a model’s performance gain while annotating the fewest samples possible.",https://arxiv.org/pdf/2009.00236.pdf,,AIO:DNN
-One-shot Learning,AIO:One-shotLearning,Network,OSL,,"A method which aims to classify objects from one, or only a few, examples.",https://en.wikipedia.org/wiki/One-shot_learning,,AIO:DNN
-Generalized Few-shot Learning,AIO:GeneralizedFew-shotLearning,Network,GFSL,,"Methods that can learn novel classes from only few samples per class, preventing catastrophic forgetting of base classes, and classifier calibration across novel and base classes.",https://paperswithcode.com/paper/generalized-and-incremental-few-shot-learning/review/,,AIO:DNN
-Incremenetal Few-shot Learning,AIO:IncremenetalFew-shotLearning,Network,IFSL,,"Methods that train a network on a base set of classes and then is presented several novel classes, each with only a few labeled examples.",https://arxiv.org/abs/1810.07218,,AIO:DNN
-Metric Learning,AIO:MetricLearning,Network,Distance Metric Learning,,Methods which can learn a representation function that maps objects into an embedded space.,https://paperswithcode.com/task/metric-learning,,AIO:DNN
-Multimodal Deep Learning,AIO:MultimodalDeepLearning,Network,,,Methods which can create models that can process and link information using various modalities.,https://arxiv.org/abs/2105.11087,,AIO:DNN
-Deep Transfer Learning,AIO:DeepTransferLearning,Network,,,"Deep transfer learning methods relax the hypothesis that the training data must be independent and identically distributed (i.i.d.) with the test data, which motivates us to use transfer learning to solve the problem of insufficient training data.",https://arxiv.org/abs/1808.01974,,AIO:DNN
-Zero-shot Learning,AIO:Zero-shotLearning,Network,ZSL,,"Methods where at test time, a learner observes samples from classes, which were not observed during training, and needs to predict the class that they belong to.",https://en.wikipedia.org/wiki/Zero-shot_learning,,AIO:DNN
-Representation Learning,AIO:RepresentationLearning,Network,Feature Learning,,Methods that allow a system to discover the representations required for feature detection or classification from raw data.,https://en.wikipedia.org/wiki/Feature_learning,,AIO:DNN
+Contrastive Learning,AIO:ContrastiveLearning,Network,,,Learning that encourages augmentations (views) of the same input to have more similar representations compared to augmentations of different inputs.,https://arxiv.org/abs/2202.14037,,AIO:DeepNeuralNetwork
+Continual Learning,AIO:ContinualLearning,Network,Incremental Learning|Life-Long Learning,,"A concept to learn a model for a large number of tasks sequentially without forgetting knowledge obtained from the preceding tasks, where the data in the old tasks are not available any more during training new ones.",https://paperswithcode.com/task/continual-learning,,AIO:DeepNeuralNetwork
+Federated Learning,AIO:FederatedLearning,Network,,,"A technique that trains an algorithm across multiple decentralized edge devices or servers holding local data samples, without exchanging them.",https://en.wikipedia.org/wiki/Federated_learning,,AIO:DeepNeuralNetwork
+Deep Active Learning,AIO:DeepActiveLearning,Network,DeepAL,,"The combination of deep learning and active learning, where active learning attempts to maximize a model’s performance gain while annotating the fewest samples possible.",https://arxiv.org/pdf/2009.00236.pdf,,AIO:DeepNeuralNetwork
+One-shot Learning,AIO:One-shotLearning,Network,OSL,,"A method which aims to classify objects from one, or only a few, examples.",https://en.wikipedia.org/wiki/One-shot_learning,,AIO:DeepNeuralNetwork
+Generalized Few-shot Learning,AIO:GeneralizedFew-shotLearning,Network,GFSL,,"Methods that can learn novel classes from only a few samples per class, while preventing catastrophic forgetting of base classes and maintaining classifier calibration across novel and base classes.",https://paperswithcode.com/paper/generalized-and-incremental-few-shot-learning/review/,,AIO:DeepNeuralNetwork
+Incremental Few-shot Learning,AIO:IncremenetalFew-shotLearning,Network,IFSL,,"Methods that train a network on a base set of classes and are then presented with several novel classes, each with only a few labeled examples.",https://arxiv.org/abs/1810.07218,,AIO:DeepNeuralNetwork
+Metric Learning,AIO:MetricLearning,Network,Distance Metric Learning,,Methods which can learn a representation function that maps objects into an embedded space.,https://paperswithcode.com/task/metric-learning,,AIO:DeepNeuralNetwork
+Multimodal Deep Learning,AIO:MultimodalDeepLearning,Network,,,Methods which can create models that can process and link information using various modalities.,https://arxiv.org/abs/2105.11087,,AIO:DeepNeuralNetwork
+Deep Transfer Learning,AIO:DeepTransferLearning,Network,,,"Deep transfer learning methods relax the hypothesis that the training data must be independent and identically distributed (i.i.d.) with the test data, which motivates us to use transfer learning to solve the problem of insufficient training data.",https://arxiv.org/abs/1808.01974,,AIO:DeepNeuralNetwork
+Zero-shot Learning,AIO:Zero-shotLearning,Network,ZSL,,"Methods where at test time, a learner observes samples from classes, which were not observed during training, and needs to predict the class that they belong to.",https://en.wikipedia.org/wiki/Zero-shot_learning,,AIO:DeepNeuralNetwork
+Representation Learning,AIO:RepresentationLearning,Network,Feature Learning,,Methods that allow a system to discover the representations required for feature detection or classification from raw data.,https://en.wikipedia.org/wiki/Feature_learning,,AIO:DeepNeuralNetwork
Sparse Learning,AIO:SparseLearning,Network,Sparse dictionary Learning|Sparse coding,,Methods which aim to find sparse representations of the input data in the form of a linear combination of basic elements as well as those basic elements themselves.,https://en.wikipedia.org/wiki/Sparse_dictionary_learning,,AIO:RepresentationLearning
-Layer,AIO:Layer,Class,,,Network layer parent class,,,
-Network,AIO:Network,Class,,,Network parent class,,,
+Layer,AIO:Layer,Class,,,Network layer parent class,TBD,,
+Network,AIO:Network,Class,,,Network parent class,TBD,,
+Function,AIO:Function,Class,,,Function parent class,TBD,,
ELU Function,AIO:ELUFunction,Function,Exponential Linear Unit|ELU,,The exponential linear unit (ELU) with alpha > 0 is: x if x > 0 and alpha * (exp(x) - 1) if x < 0 The ELU hyperparameter alpha controls the value to which an ELU saturates for negative net inputs. ELUs diminish the vanishing gradient effect. ELUs have negative values which pushes the mean of the activations closer to zero. Mean activations that are closer to zero enable faster Learning as they bring the gradient closer to the natural gradient. ELUs saturate to a negative value when the argument gets smaller. Saturation means a small derivative which decreases the variation and the information that is propagated to the next layer.,https://www.tensorflow.org/api_docs/python/tf/keras/activations/elu,,AIO:Function
Exponential Function,AIO:ExponentialFunction,Function,,,The exponential function is a mathematical function denoted by f(x)=exp or e^{x}.,https://www.tensorflow.org/api_docs/python/tf/keras/activations/exponential,,AIO:Function
GELU Function,AIO:GELUFunction,Function,Gaussian Error Linear Unit|GELU,,"Gaussian error linear unit (GELU) computes x * P(X <= x), where P(X) ~ N(0, 1). The (GELU) nonlinearity weights inputs by their value, rather than gates inputs by their sign as in ReLU.",https://www.tensorflow.org/api_docs/python/tf/keras/activations/gelu,,AIO:Function
@@ -286,9 +287,8 @@ Representation Bias,AIO:RepresentationBias,Bias,,,"Arises due to non-random samp
Simpon's Paradox Bias,AIO:Simpon'sParadoxBias,Bias,Simpson's Paradox,,"A statistical phenomenon where the marginal association between two categorical variables is qualitatively different from the partial association between the same two variables after controlling for one or more other variables. For example, the statistical association or correlation that has been detected between two variables for an entire population disappears or reverses when the population is divided into subgroups.",https://doi.org/10.6028/NIST.SP.1270,,AIO:SelectionAndSamplingBias
Temporal Bias,AIO:TemporalBias,Bias,,,Bias that arises from differences in populations and behaviors over time.,https://doi.org/10.6028/NIST.SP.1270,,AIO:SelectionAndSamplingBias
Uncertainty Bias,AIO:UncertaintyBias,Bias,,,"Arises when predictive algorithms favor groups that are better represented in the training data, since there will be less uncertainty associated with those predictions.",https://doi.org/10.6028/NIST.SP.1270,,AIO:SelectionAndSamplingBias
-Method,AIO:Method,Class,,,Method parent class.,,,
-Machine Learning,AIO:MachineLearning,Method,,,"A field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks.",https://en.wikipedia.org/wiki/Machine_learning,,AIO:Method
-Multimodal Learning,AIO:MultimodalLearning,Network,,,Methods which can represent the joint representations of different modalities.,,,AIO:MachineLearning
+Machine Learning,AIO:MachineLearning,Method,,,"A field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks.",https://en.wikipedia.org/wiki/Machine_learning,,
+Multimodal Learning,AIO:MultimodalLearning,Network,,,Methods which can represent the joint representations of different modalities.,TBD,,AIO:MachineLearning
Meta-Learning,AIO:Meta-Learning,Network,,,Automatic learning algorithms applied to metadata about machine Learning experiments.,https://en.wikipedia.org/wiki/Meta_learning_(computer_science),,AIO:MachineLearning
Transfer Learning,AIO:TransferLearning,Network,,,Methods which can reuse or transfer information from previously learned tasks for the Learning of new tasks.,https://en.wikipedia.org/wiki/Transfer_learning,,AIO:MachineLearning
Active Learning,AIO:ActiveLearning,Machine Learning,Query Learning,,Methods which can interactively query a user (or some other information source) to label new data points with the desired outputs.,https://en.wikipedia.org/wiki/Active_learning_(machine_learning),,AIO:MachineLearning
@@ -328,7 +328,7 @@ Probabilistic Topic Model,AIO:ProbabilisticTopicModel,Machine Learning,,,"Method
Linear Regression,AIO:LinearRegression,Machine Learning,,,A linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables).,https://en.wikipedia.org/wiki/Linear_regression,,AIO:RegressionAnalysis
Spatial Regression,AIO:SpatialRegression,Machine Learning,,,Regression method used to model spatial relationships.,https://gisgeography.com/spatial-regression-models-arcgis/,,AIO:RegressionAnalysis
Survival Analysis,AIO:SurvivalAnalysis,Machine Learning,,,"Methods for analyzing the expected duration of time until one event occurs, such as death in biological organisms and failure in mechanical systems.",https://en.wikipedia.org/wiki/Survival_analysis,,AIO:MachineLearning
-Proportional Hazards Model,AIO:ProportionalHazardsModel,Machine Learning,,,A surival modeling method where the unique effect of a unit increase in a covariate is multiplicative with respect to the hazard rate.,https://en.wikipedia.org/wiki/Proportional_hazards_modelProportional Hazards Model,,AIO:RegressionAnalysis|AIO:SurvivalAnalysis
+Proportional Hazards Model,AIO:ProportionalHazardsModel,Machine Learning,,,A survival modeling method where the unique effect of a unit increase in a covariate is multiplicative with respect to the hazard rate.,https://en.wikipedia.org/wiki/Proportional_hazards_model,,AIO:RegressionAnalysis|AIO:SurvivalAnalysis
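To make the "multiplicative with respect to the hazard rate" phrasing concrete, here is a small hedged sketch of a Cox-style hazard function; the baseline hazard passed in is an arbitrary placeholder, not something defined in the row.

```python
import numpy as np

def hazard(t, x, beta, baseline_hazard):
    # h(t | x) = h0(t) * exp(beta . x): a unit increase in covariate x_j
    # multiplies the baseline hazard by exp(beta_j).
    return baseline_hazard(t) * np.exp(np.dot(beta, x))

h0 = lambda t: 0.1  # placeholder constant baseline hazard
print(hazard(5.0, [1.0, 0.0], [0.7, -0.2], h0))  # 0.1 * exp(0.7)
```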
Least-squares Analysis,AIO:Least-squaresAnalysis,Machine Learning,,,A standard approach in regression analysis to approximate the solution of overdetermined systems (sets of equations in which there are more equations than unknowns) by minimizing the sum of the squares of the residuals (a residual being the difference between an observed value and the fitted value provided by a model) made in the results of each individual equation.,https://en.wikipedia.org/wiki/Least_squares,,AIO:RegressionAnalysis
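A minimal NumPy illustration of the least-squares idea described above, fitting an intercept and slope to a small made-up overdetermined system (the data values are arbitrary examples):

```python
import numpy as np

# Four equations, two unknowns (intercept and slope): an overdetermined system.
A = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([6.0, 5.0, 7.0, 10.0])

# Choose beta to minimize the sum of squared residuals ||A @ beta - y||^2.
beta, residuals, rank, _ = np.linalg.lstsq(A, y, rcond=None)
print(beta)       # fitted [intercept, slope]
print(residuals)  # remaining sum of squared residuals
```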
Principal Component Analysis,AIO:PrincipalComponentAnalysis,Machine Learning,PCA,,"A method for analyzing large datasets containing a high number of dimensions/features per observation, increasing the interpretability of data while preserving the maximum amount of information, and enabling the visualization of multidimensional data.",https://en.wikipedia.org/wiki/Principal_component_analysis,,AIO:DimensionalityReduction
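A compact sketch of the PCA projection via SVD of the centered data matrix; this is one standard way to compute it and is shown only to illustrate the row above.

```python
import numpy as np

def pca_project(X, n_components=2):
    # Center the observations, then project onto the top right-singular
    # vectors (the principal axes of maximum variance).
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
print(pca_project(X).shape)  # (100, 2)
```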
t-Distributed Stochastic Neighbor embedding,AIO:t-DistributedStochasticNeighborembedding,Machine Learning,tSNE|t-SNE,,A statistical method for visualizing high-dimensional data by giving each datapoint a location in a two or three-dimensional map.,https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding,,AIO:DimensionalityReduction
@@ -338,82 +338,83 @@ Data Imputation,AIO:DataImputation,Machine Learning,,,Methods that replace missi
Fixed Effects Model,AIO:FixedEffectsModel,Machine Learning,FEM,,A statistical model in which the model parameters are fixed or non-random quantities.,https://en.wikipedia.org/wiki/Fixed_effects_model,,AIO:RegressionAnalysis
Random Effects Model,AIO:RandomEffectsModel,Machine Learning,REM,,A statistical model where the model parameters are random variables.,https://en.wikipedia.org/wiki/Random_effects_model,,AIO:RegressionAnalysis
Self-supervised Learning,AIO:Self-supervisedLearning,Machine Learning,,,"Methods regarded as an intermediate form between supervised and unsupervised learning, in which supervisory signals are derived from the unlabeled data itself.",https://en.wikipedia.org/wiki/Self-supervised_learning,,AIO:MachineLearning
-Advanced Techniques,AIO:AdvancedTechniques,Model,,,An abstract parent class grouping LLMs based on advanced techniques.,,,AIO:LLM
-Enhancement Strategies,AIO:EnhancementStrategies,Model,,,An abstract parent class grouping LLMs based on enhancement strategies.,,,AIO:LLM
-Application Focus,AIO:ApplicationFocus,Model,,,An abstract parent class grouping LLMs based on application focus.,,,AIO:LLM
-Training Paradigm,AIO:TrainingParadigm,Model,,,An abstract parent class grouping LLMs based on training paradigm.,,,AIO:LLM
-Model Architecture,AIO:ModelArchitecture,Model,,,An abstract parent class grouping LLMs based on model architecture.,,,AIO:LLM
Large Language Model,AIO:LargeLanguageModel,Model,LLM,,"A large language model (LLM) is a language model consisting of a neural network with many parameters (typically billions of weights or more), trained on large quantities of unlabeled text using self-supervised learning or semi-supervised learning.",https://en.wikipedia.org/wiki/Large_language_model,,
-Encoder LLM,AIO:EncoderLLM,Model,LLM,,The Encoder-only architecture is used when only encoding the input sequence is required and the decoder is not necessary. The input sequence is encoded into a fixed-length representation and then used as input to a classifier or a regressor to make a prediction. These models have a pre-trained general-purpose encoder but will require fine-tuning of the final classifier or regressor.,"https://www.practicalai.io/understanding-transformer-model-architectures/#:~:text=Encoder%2Donly&text=These%20models%20have%20a%20pre,Named%20entity%20recognition",,AIO:LLMModelArchitecture
-Decoder LLM,AIO:DecoderLLM,Model,LLM,,"In the decoder-only architecture, the model consists of only a decoder, which is trained to predict the next token in a sequence given the previous tokens. The critical difference between the Decoder-only architecture and the Encoder-Decoder architecture is that the Decoder-only architecture does not have an explicit encoder to summarize the input information. Instead, the information is encoded implicitly in the hidden state of the decoder, which is updated at each step of the generation process.","https://www.practicalai.io/understanding-transformer-model-architectures/#:~:text=Encoder%2Donly&text=These%20models%20have%20a%20pre,Named%20entity%20recognition",,AIO:LLMModelArchitecture
-Encoder-Decoder LLM,AIO:Encoder-DecoderLLM,Model,LLM,,The Encoder-Decoder architecture was the original transformer architecture introduced in the Attention Is All You Need (https://arxiv.org/abs/1706.03762) paper. The encoder processes the input sequence and generates a hidden representation that summarizes the input information. The decoder uses this hidden representation to generate the desired output sequence. The encoder and decoder are trained end-to-end to maximize the likelihood of the correct output sequence given the input sequence.,"https://www.practicalai.io/understanding-transformer-model-architectures/#:~:text=Encoder%2Donly&text=These%20models%20have%20a%20pre,Named%20entity%20recognition",,AIO:LLMModelArchitecture
-Autoregressive Language Model,AIO:AutoregressiveLanguageModel,Model,Autoregressive Language Model,generative language model|sequence-to-sequence model,"An autoregressive language model is a type of language model that generates text sequentially, predicting one token at a time based on the previously generated tokens. It excels at natural language generation tasks by modeling the probability distribution over sequences of tokens.",,,AIO:LLM
-Masked Language Model,AIO:MaskedLanguageModel,Model,Masked Language Model,bidirectional encoder|denoising autoencoder,"A masked language model is a type of language model that is trained to predict randomly masked tokens in a sequence, based on the remaining unmasked tokens. This allows it to build deep bidirectional representations that can be effectively transferred to various NLP tasks via fine-tuning.",,,AIO:EncoderLLM
-Multimodal Transformer,AIO:MultimodalTransformer,Model,Multimodal Transformer,vision-language model|unified encoder,"A multimodal transformer is a transformer architecture that can process and relate information from different modalities, such as text, images, and audio. It uses a shared embedding space and attention mechanism to learn joint representations across modalities.",,,AIO:LLMApplicationFocus
-Generative Adversarial Network-Augmented LLM,AIO:GenerativeAdversarialNetwork-AugmentedLLM,Model,GAN-LLM|Generative Adversarial Network-Augmented LLM,text generation|adversarial training,"A GAN-augmented LLM incorporates a generative adversarial network (GAN) into its training process, using a discriminator network to provide a signal for generating more realistic and coherent text. This adversarial training can improve the quality and diversity of generated text.",,,AIO:LLMApplicationFocus
-Retrieval-Augmented LLM,AIO:Retrieval-AugmentedLLM,Model,Retrieval-Augmented LLM,knowledge grounding|open-book question answering,"A retrieval-augmented LLM combines a pre-trained language model with a retrieval system that can access external knowledge sources. This allows the model to condition its generation on relevant retrieved knowledge, improving factual accuracy and knowledge grounding.",,,AIO:LLMApplicationFocus
-Reinforcement Learning LLM,AIO:ReinforcementLearningLLM,Model,RL-LLM|Reinforcement Learning LLM,reward modeling|decision transformers,"An RL-LLM is a language model that is fine-tuned using reinforcement learning, where the model receives rewards for generating text that satisfies certain desired properties or objectives. This can improve the quality, safety, or alignment of generated text.",,,AIO:LLMTrainingParadigm
-Continual Learning LLM,AIO:ContinualLearningLLM,Model,CL-LLM|Continual Learning LLM,lifelong learning|catastrophic forgetting,"A continual learning LLM is designed to continually acquire new knowledge and skills over time, without forgetting previously learned information. This allows the model to adapt and expand its capabilities as new data becomes available.",,,AIO:LLMTrainingParadigm
-Compositional Generalization LLM,AIO:CompositionalGeneralizationLLM,Model,Compositional Generalization LLM,systematic generalization|out-of-distribution generalization,"A compositional generalization LLM is trained to understand and recombine the underlying compositional structures in language, enabling better generalization to novel combinations and out-of-distribution examples.",,,AIO:LLM
-Prompt-based Fine-Tuning LLM,AIO:Prompt-basedFine-TuningLLM,Model,Prompt-tuned LLM|Prompt-based Fine-Tuning LLM,in-context learning|few-shot learning,"A prompt-tuned LLM is fine-tuned on a small number of examples or prompts, rather than full task datasets. This allows for rapid adaptation to new tasks with limited data, leveraging the model's few-shot learning capabilities.",,,AIO:LLMTrainingParadigm
-Sparse LLM,AIO:SparseLLM,Model,Sparse LLM,model compression|parameter efficiency,"A sparse LLM uses techniques like pruning or quantization to reduce the number of non-zero parameters in the model, making it more parameter-efficient and easier to deploy on resource-constrained devices.",,,AIO:LLMTrainingParadigm
-Federated LLM,AIO:FederatedLLM,Model,Federated LLM,privacy-preserving|decentralized training,"A federated LLM is trained in a decentralized manner across multiple devices or silos, without directly sharing private data. This enables collaborative training while preserving data privacy and security.",,,AIO:LLMTrainingParadigm
-Energy-Based LLM,AIO:Energy-BasedLLM,Model,Energy-Based LLM,energy scoring|explicit density modeling,"An energy-based LLM models the explicit probability density over token sequences using an energy function, rather than an autoregressive factorization. This can improve modeling of long-range dependencies and global coherence.",,,AIO:LLMEnhancementStrategies
-Controllable LLM,AIO:ControllableLLM,Model,Controllable LLM,conditional generation|guided generation,"A controllable LLM allows for explicit control over certain attributes of the generated text, such as style, tone, topic, or other desired characteristics, through conditioning or specialized training objectives.",,,AIO:LLMEnhancementStrategies
-Knowledge-Grounded LLM,AIO:Knowledge-GroundedLLM,Model,Knowledge-Grounded LLM,factual grounding|knowledge integration,"A knowledge-grounded LLM incorporates external knowledge sources or knowledge bases into the model architecture, enabling it to generate more factually accurate and knowledge-aware text.",,,AIO:LLMApplicationFocus
-Domain-Adapted LLM,AIO:Domain-AdaptedLLM,Model,Domain-Adapted LLM,transfer learning|domain robustness,"A domain-adapted LLM is first pre-trained on a broad corpus, then fine-tuned on domain-specific data to specialize its capabilities for particular domains or applications, like scientific literature or code generation.",,,AIO:LLMApplicationFocus
-Dialogue LLM,AIO:DialogueLLM,Model,Dialogue LLM,conversational AI|multi-turn dialogue,"A dialogue LLM is optimized for engaging in multi-turn conversations, understanding context and generating relevant, coherent responses continuously over many dialogue turns.",,,AIO:LLMApplicationFocus
-Instruction-Tuned LLM,AIO:Instruction-TunedLLM,Model,Instruction-Tuned LLM,natural language instructions|constitutional AI,"An instruction-tuned LLM is fine-tuned to follow natural language instructions accurately and safely, learning to map from instructions to desired model behavior in a more controlled and principled way.",,,AIO:LLMApplicationFocus
-Modular LLM,AIO:ModularLLM,Model,Modular LLM,component skills|skill composition,"A modular LLM consists of multiple specialized components or skills that can be dynamically composed and recombined to solve complex tasks, mimicking the modular structure of human cognition.",,,AIO:LLMModelArchitecture
-Recursive LLM,AIO:RecursiveLLM,Model,Recursive LLM|Self-Attending LLM,self-attention|iterative refinement,"A recursive or self-attending LLM incorporates recursive self-attention mechanisms, allowing it to iteratively refine its own outputs and capture long-range dependencies more effectively.",,,AIO:LLMModelArchitecture
-Mixture-of-Experts LLM,AIO:Mixture-of-ExpertsLLM,Model,MoE LLM|Mixture-of-Experts LLM,conditional computation|model parallelism,"A Mixture-of-Experts LLM dynamically selects and combines outputs from multiple expert submodels, allowing for efficient scaling by conditionally activating only a subset of model components for each input.",,,AIO:LLMModelArchitecture
-Meta-Learning LLM,AIO:Meta-LearningLLM,Model,Meta-Learning LLM,few-shot adaptation|learning to learn,"A meta-learning LLM is trained in a way that allows it to quickly adapt to new tasks or datasets through only a few examples or fine-tuning steps, leveraging meta-learned priors about how to efficiently learn.",,,AIO:LLMTrainingParadigm
-Neuro-Symbolic LLM,AIO:Neuro-SymbolicLLM,Model,Neuro-Symbolic LLM,knowledge reasoning|symbolic grounding,"A neuro-symbolic LLM combines neural language modeling with symbolic reasoning components, leveraging structured knowledge representations and logical inferences to improve reasoning capabilities.",,,AIO:LLMApplicationFocus
-Causal LLM,AIO:CausalLLM,Model,Causal LLM,unidirectional|autoregressive,"A causal LLM only attends to previous tokens in the sequence when generating text, modeling the probability distribution autoregressively from left-to-right or causally.",,,AIO:LLMAdvancedTechniques
-Ordinal LLM,AIO:OrdinalLLM,Model,Ordinal LLM,ranking|preference modeling,"An ordinal LLM is trained to model ordinal relationships and rank outputs, rather than model probability distributions over text sequences directly.",,,AIO:LLMAdvancedTechniques
-Differentiable LLM,AIO:DifferentiableLLM,Model,Differentiable LLM,end-to-end training|fully backpropagable,"A differentiable LLM has an architecture amenable to full end-to-end training via backpropagation, without relying on teacher forcing or unlikelihood training objectives.",,,AIO:LLMModelArchitecture
-Generative Language Circuit,AIO:GenerativeLanguageCircuit,Model,GLC|Generative Language Circuit,analogy circuit|program synthesis,"A generative language circuit is a type of LLM that learns to generate programs or code by training on input-output pairs, effectively learning the underlying ""program"" that maps inputs to outputs.",,,AIO:LLMAdvancedTechniques
-Unsupervised LLM,AIO:UnsupervisedLLM,Model,Unsupervised LLM,self-supervised,"An unsupervised LLM is trained solely on unlabeled data using self-supervised objectives like masked language modeling, without any supervised fine-tuning.",,,AIO:LLMTrainingParadigm
-Semi-Supervised LLM,AIO:Semi-SupervisedLLM,Model,Semi-Supervised LLM,self-training,A semi-supervised LLM combines self-supervised pretraining on unlabeled data with supervised fine-tuning on labeled task data.,,,AIO:LLMTrainingParadigm
-Multilingual LLM,AIO:MultilingualLLM,Model,Multilingual LLM,cross-lingual transfer,"A multilingual LLM is trained on text from multiple languages, learning shared representations that enable zero-shot or few-shot transfer to new languages.",,,AIO:LLM
-Personalized LLM,AIO:PersonalizedLLM,Model,Personalized LLM,user adaptation,"A personalized LLM adapts its language modeling and generation to the preferences, style and persona of individual users or audiences.",,,AIO:LLM
-Embodied LLM,AIO:EmbodiedLLM,Model,Embodied LLM,multimodal grounding,"An embodied LLM integrates language with other modalities like vision, audio, robotics to enable grounded language understanding in real-world environments.",,,AIO:LLMApplicationFocus
-Ethical LLM,AIO:EthicalLLM,Model,Ethical LLM,value alignment|constituitional AI,"An ethical LLM is trained to uphold certain ethical principles, values or rules in its language generation to increase safety and trustworthiness.",,,AIO:LLMApplicationFocus
-Multi-Task LLM,AIO:Multi-TaskLLM,Model,Multi-Task LLM,transfer learning,"A multi-task LLM is trained jointly on multiple language tasks simultaneously, learning shared representations that transfer across tasks.",,,AIO:LLMApplicationFocus
-Rational LLM,AIO:RationalLLM,Model,Rational LLM,reasoning|logical inferences,"A rational LLM incorporates explicit reasoning capabilities, leveraging logical rules, axioms or external knowledge to make deductive inferences during language tasks.",,,AIO:LLM
-Multimodal Fusion LLM,AIO:MultimodalFusionLLM,Model,Multimodal Fusion LLM,cross-modal grounding,"A multimodal fusion LLM learns joint representations across different modalities like text, vision and audio in an end-to-end fashion for better cross-modal understanding and generation.",,,AIO:LLMAdvancedTechniques
-Memory-Augmented LLM,AIO:Memory-AugmentedLLM,Model,Memory-Augmented LLM,external memory,"A memory-augmented LLM incorporates external writeable and readable memory components, allowing it to store and retrieve information over long contexts.",,,AIO:LLM
-Hierarchical LLM,AIO:HierarchicalLLM,Model,Hierarchical LLM,multi-scale representations,"A hierarchical LLM models language at multiple levels of granularity, learning hierarchical representations that can capture both low-level patterns and high-level abstractions.",,,AIO:LLMAdvancedTechniques
-Generative Commonsense LLM,AIO:GenerativeCommonsenseLLM,Model,Generative Commonsense LLM,physical reasoning|causal modeling,"A generative commonsense LLM is trained to understand and model basic physics, causality and common sense about how the real world works.",,,AIO:LLMApplicationFocus
-Implicit Language Model,AIO:ImplicitLanguageModel,Model,Implicit Language Model,Energy-based models|Token-level scoring,An implicit language model uses an energy function to score full sequences instead of factorizing probabilities autoregressively. This can better capture global properties and long-range dependencies.,,,AIO:LLM
-Factorized LLM,AIO:FactorizedLLM,Model,Factorized LLM,Conditional masking|Product of experts,A factorized LLM decomposes the full language modeling task into multiple sub-components or experts that each focus on a subset of the information. This enables more efficient scaling.,,,AIO:LLM
-Graph LLM,AIO:GraphLLM,Model,Graph LLM,Structured representations,"A graph LLM operates over structured inputs/outputs represented as graphs, enabling reasoning over explicit relational knowledge representations during language tasks.",,,AIO:LLMAdvancedTechniques
-Data-to-Text LLM,AIO:Data-to-TextLLM,Model,Data-to-Text LLM|Meaning representation,"A data-to-text LLM generates natural language descriptions from structured data sources like tables, graphs, knowledge bases, etc. Requiring grounding meaning representations.",,,,AIO:LLMAdvancedTechniques
-Recursive Language Model,AIO:RecursiveLanguageModel,Model,Recursive Language Model,Compositional generalization,"A recursive language model uses recursive neural network architectures like TreeLSTMs to learn syntactic composition functions, improving systematic generalization abilities.",,,AIO:LLMAdvancedTechniques
-Language Interface LLM,AIO:LanguageInterfaceLLM,Model,Language Interface LLM,Interactive learning,"A language interface LLM supports interactive semantic parsing, enabling users to provide feedback/corrections which are used to dynamically refine and update the language model.",,,AIO:LLMAdvancedTechniques
-Multimodal Fusion LLM,AIO:MultimodalFusionLLM,Model,Multimodal Fusion LLM,cross-modal grounding,"A multimodal fusion LLM learns joint representations across different modalities like text, vision and audio in an end-to-end fashion for better cross-modal understanding and generation.",,,AIO:LLMAdvancedTechniques
-Lifelong Learning LLM,AIO:LifelongLearningLLM,Model,Lifelong Learning LLM|Continual Learning LLM,Catastrophic forgetting|Plasticity-Stability balance,"A lifelong learning LLM can continually acquire new knowledge over time without forgetting previously learned information, maintaining a balance between plasticity and stability.",,,AIO:LLMAdvancedTechniques
-Curriculum Learning LLM,AIO:CurriculumLearningLLM,Model,Curriculum Learning LLM,Learning progression,"A curriculum learning LLM is trained by presenting learning examples in a meaningful order from simple to complex, mimicking the learning trajectory followed by humans.",,,AIO:LLMAdvancedTechniques
-Self-Supervised LLM,AIO:Self-SupervisedLLM,Model,Self-Supervised LLM,Pretext tasks,"A self-supervised LLM learns rich representations by solving pretext tasks that involve predicting parts of the input from other observed parts of the data, without relying on human-annotated labels.",,,AIO:LLMTrainingParadigm
-Contrastive Learning LLM,AIO:ContrastiveLearningLLM,Model,Contrastive Learning LLM,Representation learning,"A contrastive learning LLM is trained to pull semantically similar samples closer together and push dissimilar samples apart in the representation space, learning high-quality features useful for downstream tasks.",,,AIO:LLMAdvancedTechniques
-Generative Language Interface,AIO:GenerativeLanguageInterface,Model,Generative Language Interface,Interactive generation,"A generative language interface enables users to engage in an interactive dialogue with an LLM, providing feedback to guide and refine the generated outputs iteratively.",,,AIO:LLM
-Explainable LLM,AIO:ExplainableLLM,Model,Explainable Language Model|XAI LLM,interpretability|model understanding,"An explainable LLM is designed to provide insights into its decision-making process, making it easier for users to understand and trust the model's outputs. It incorporates mechanisms for interpreting and explaining its predictions in human-understandable terms.",,,AIO:LLMAdvancedTechniques
-Adversarial-Resistant LLM,AIO:Adversarial-ResistantLLM,Model,Robust LLM,adversarial attacks|robustness,"An adversarial-resistant LLM is engineered to withstand or mitigate the effects of adversarial attacks, ensuring reliable performance even in the presence of deliberately misleading input designed to confuse the model.",,,AIO:LLMEnhancementStrategies
-Cross-Domain LLM,AIO:Cross-DomainLLM,Model,Domain-General LLM,domain adaptation|cross-domain transfer,"A cross-domain LLM is capable of performing well across a wide range of domains without significant loss in performance, facilitated by advanced domain adaptation techniques.",,,AIO:LLMEnhancementStrategies
-Zero-Shot Learning LLM,AIO:Zero-ShotLearningLLM,Model,Zero-Shot LLM,zero-shot learning,"A zero-shot learning LLM is able to perform tasks or understand concepts it has not explicitly been trained on, demonstrating a high degree of generalization and understanding.",,,AIO:LLMEnhancementStrategies
-Transfer Learning LLM,AIO:TransferLearningLLM,Model,Transfer LLM,transfer learning,"A transfer learning LLM leverages knowledge acquired during training on one task to improve performance on different but related tasks, facilitating more efficient learning and adaptation.",,,AIO:LLMEnhancementStrategies
-Low-Resource LLM,AIO:Low-ResourceLLM,Model,Low-Resource Language Model,resource-efficient|low-resource languages,"A low-resource LLM is optimized for performance in scenarios with limited data, computational resources, or for languages with sparse datasets.",,,AIO:LLMEnhancementStrategies
-Interpretable LLM,AIO:InterpretableLLM,Model,Interpretable Language Model,model transparency|interpretability,"An interpretable LLM prioritizes transparency and ease of understanding in its operations, making its decision-making processes clear and rational to human users.",,,AIO:LLMEnhancementStrategies
-Evolutionary LLM,AIO:EvolutionaryLLM,Model,Evolutionary Language Model,evolutionary algorithms|genetic programming,"An evolutionary LLM applies principles of evolutionary computation to optimize its structure and parameters, evolving over time to improve performance.",,,AIO:LLMEnhancementStrategies
-Data Preparation,AIO:DataPreparation,Preprocessing,Data Curation|Data Processing|Data Assembly,,"Techniques focused on preparing raw data for training, including cleaning, normalization, and tokenization.",,,
-Model Efficiency,AIO:ModelEfficiency,Preprocessing,Model Optimization|Computational Efficiency,,"Techniques aimed at making models more efficient, such as knowledge distillation.",,,
-Training Strategies,AIO:TrainingStrategies,Preprocessing,Learning Techniques|Instructional Methods,,Specific strategies or methodologies employed during model training.,,,
-Data Enhancement,AIO:DataEnhancement,Preprocessing,Data Improvement|Data Refinement,,"Methods that enhance the training data or its representation, including augmentation and feature extraction.",,,
-Distillation,AIO:Distillation,Preprocessing,Refining|Purification,Knowledge compression|Teacher-student model,"Knowledge distillation involves training a smaller model to replicate the behavior of a larger model, aiming to compress the knowledge into a more compact form without significant loss of performance.",,,AIO:ModelEfficiency
-Tokenization And Vocabulary Reduction,AIO:TokenizationAndVocabularyReduction,Preprocessing,Vocabulary Condensation|Lexical Simplification,Tokenization|Vocabulary size reduction,Breaking down text data into manageable pieces called tokens and reducing the model's vocabulary to streamline processing.,,,AIO:DataPreparation
-Cleaning And Normalization,AIO:CleaningAndNormalization,Preprocessing,Data Cleansing|Standardization,Data cleaning|Text normalization,"Removing irrelevant data, correcting typos, and standardizing text to reduce noise and ensure consistency in the data.",,,AIO:DataPreparation
-Subword Segmentation,AIO:SubwordSegmentation,Preprocessing,Fragmentation|Part-word Division,Byte Pair Encoding|SentencePiece,"Utilizing techniques like Byte Pair Encoding (BPE) or SentencePiece to break down words into smaller units, allowing the model to handle a wide range of vocabulary with a fixed-size list.",,,AIO:DataPreparation
-Data Augmentation,AIO:DataAugmentation,Preprocessing,Data Expansion|Data Enrichment,Paraphrasing|Synonym replacement,Expanding the training dataset artificially by modifying existing data points to improve the model's robustness and generalization ability.,,,AIO:DataEnhancement
-Curriculum Learning,AIO:CurriculumLearning,Preprocessing,Sequential Learning|Structured Learning,Sequential learning|Complexity grading,"Training the model on simpler tasks or easier data first, then gradually introducing more complex tasks to improve learning efficiency and performance.",,,AIO:TrainingStrategies
-Transfer Learning,AIO:TransferLearning,Preprocessing,Knowledge Transfer|Skill Acquisition,Pretrained models|Adaptation,Starting the training from a model already trained on a related task to reduce training time and improve performance on tasks with limited data.,,,AIO:TrainingStrategies
-Feature Extraction,AIO:FeatureExtraction,Preprocessing,Attribute Extraction|Feature Isolation,Syntactic information|Semantic embeddings,"Extracting specific features or patterns from the text before training to guide the model's learning process, including syntactic information or semantic embeddings.",,,AIO:DataEnhancement
\ No newline at end of file
+Encoder LLM,AIO:EncoderLLM,Model,LLM,,The Encoder-only architecture is used when only encoding the input sequence is required and the decoder is not necessary. The input sequence is encoded into a fixed-length representation and then used as input to a classifier or a regressor to make a prediction. These models have a pre-trained general-purpose encoder but will require fine-tuning of the final classifier or regressor.,"https://www.practicalai.io/understanding-transformer-model-architectures/#:~:text=Encoder%2Donly&text=These%20models%20have%20a%20pre,Named%20entity%20recognition",,AIO:ModelArchitecture
+Decoder LLM,AIO:DecoderLLM,Model,LLM,,"In the decoder-only architecture, the model consists of only a decoder, which is trained to predict the next token in a sequence given the previous tokens. The critical difference between the Decoder-only architecture and the Encoder-Decoder architecture is that the Decoder-only architecture does not have an explicit encoder to summarize the input information. Instead, the information is encoded implicitly in the hidden state of the decoder, which is updated at each step of the generation process.","https://www.practicalai.io/understanding-transformer-model-architectures/#:~:text=Encoder%2Donly&text=These%20models%20have%20a%20pre,Named%20entity%20recognition",,AIO:ModelArchitecture
+Encoder-Decoder LLM,AIO:Encoder-DecoderLLM,Model,LLM,,The Encoder-Decoder architecture was the original transformer architecture introduced in the Attention Is All You Need (https://arxiv.org/abs/1706.03762) paper. The encoder processes the input sequence and generates a hidden representation that summarizes the input information. The decoder uses this hidden representation to generate the desired output sequence. The encoder and decoder are trained end-to-end to maximize the likelihood of the correct output sequence given the input sequence.,"https://www.practicalai.io/understanding-transformer-model-architectures/#:~:text=Encoder%2Donly&text=These%20models%20have%20a%20pre,Named%20entity%20recognition",,AIO:ModelArchitecture
+Distillation,AIO:Distillation,Preprocessing,Refining|Purification,Knowledge compression|Teacher-student model,"Knowledge distillation involves training a smaller model to replicate the behavior of a larger model, aiming to compress the knowledge into a more compact form without significant loss of performance.",https://doi.org/10.48550/arXiv.2105.13093,,AIO:ModelEfficiency
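To ground the teacher-student description in the Distillation row, here is a hedged PyTorch sketch of the usual softened-softmax objective; the temperature value and the T^2 scaling follow the common Hinton-style recipe and are assumptions, not details taken from the row.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Soften both distributions with a temperature, then train the student
    # to match the teacher via KL divergence (scaled by T^2).
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

student = torch.randn(4, 10)  # toy logits: batch of 4, 10 classes
teacher = torch.randn(4, 10)
print(distillation_loss(student, teacher))
```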
+Tokenization And Vocabulary Reduction,AIO:TokenizationAndVocabularyReduction,Preprocessing,Vocabulary Condensation|Lexical Simplification,Tokenization|Vocabulary size reduction,Breaking down text data into manageable pieces called tokens and reducing the model's vocabulary to streamline processing.,TBD,,AIO:DataPreparation
+Cleaning And Normalization,AIO:CleaningAndNormalization,Preprocessing,Data Cleansing|Standardization,Data cleaning|Text normalization,"Removing irrelevant data, correcting typos, and standardizing text to reduce noise and ensure consistency in the data.",TBD,,AIO:DataPreparation
+Subword Segmentation,AIO:SubwordSegmentation,Preprocessing,Fragmentation|Part-word Division,Byte Pair Encoding|SentencePiece,"Utilizing techniques like Byte Pair Encoding (BPE) or SentencePiece to break down words into smaller units, allowing the model to handle a wide range of vocabulary with a fixed-size list.",TBD,,AIO:DataPreparation
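The BPE procedure named in the Subword Segmentation row boils down to repeatedly merging the most frequent adjacent symbol pair. A toy sketch of one way to do that (the tiny corpus is invented purely for illustration):

```python
from collections import Counter

def most_frequent_pair(words):
    pairs = Counter()
    for w in words:
        pairs.update(zip(w, w[1:]))  # count adjacent symbol pairs
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    a, b = pair
    merged = []
    for w in words:
        out, i = [], 0
        while i < len(w):
            if i + 1 < len(w) and (w[i], w[i + 1]) == pair:
                out.append(a + b)  # replace the pair with one merged symbol
                i += 2
            else:
                out.append(w[i])
                i += 1
        merged.append(out)
    return merged

corpus = [list("low") + ["</w>"], list("lower") + ["</w>"], list("lowest") + ["</w>"]]
for _ in range(3):  # three BPE merge steps
    corpus = merge_pair(corpus, most_frequent_pair(corpus))
print(corpus)
```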
+Data Augmentation,AIO:DataAugmentation,Preprocessing,Data Expansion|Data Enrichment,Paraphrasing|Synonym replacement,Expanding the training dataset artificially by modifying existing data points to improve the model's robustness and generalization ability.,TBD,,AIO:DataEnhancement
+Curriculum Learning,AIO:CurriculumLearning,Preprocessing,Sequential Learning|Structured Learning,Sequential learning|Complexity grading,"Training the model on simpler tasks or easier data first, then gradually introducing more complex tasks to improve learning efficiency and performance.",TBD,,AIO:TrainingStrategies
+Knowledge Transfer,AIO:KnowledgeTransfer,Preprocessing,Skill Acquisition|Inductive Transfer,Pretrained models|Adaptation,Starting the training from a model already trained on a related task to reduce training time and improve performance on tasks with limited data.,https://doi.org/10.1016/j.knosys.2015.01.010,,AIO:TrainingStrategies
+Feature Extraction,AIO:FeatureExtraction,Preprocessing,Attribute Extraction|Feature Isolation,Syntactic information|Semantic embeddings,"Extracting specific features or patterns from the text before training to guide the model's learning process, including syntactic information or semantic embeddings.",TBD,,AIO:DataEnhancement
+Data Preparation,AIO:DataPreparation,Preprocessing,Data Curation|Data Processing|Data Assembly,,"Techniques focused on preparing raw data for training, including cleaning, normalization, and tokenization.",TBD,,AIO:Preprocessing
+Model Efficiency,AIO:ModelEfficiency,Preprocessing,Model Optimization|Computational Efficiency,,"Techniques aimed at making models more efficient, such as knowledge distillation.",https://doi.org/10.1145/3578938,,AIO:Preprocessing
+Training Strategies,AIO:TrainingStrategies,Preprocessing,Learning Techniques|Instructional Methods,,Specific strategies or methodologies employed during model training.,TBD,,AIO:Preprocessing
+Data Enhancement,AIO:DataEnhancement,Preprocessing,,,"The processes and techniques used to improve data quality and value for better decision-making, analysis, and AI model training.",TBD,,AIO:Preprocessing
+Preprocessing,AIO:Preprocessing,Preprocessing,,,A range of techniques and processes applied to data before it is used in machine learning models or AI algorithms.,https://doi.org/10.1109/ICDE.2019.00245,,
+Recursive Language Model,AIO:RecursiveLanguageModel,Model,RLM,Compositional generalization,"A recursive language model uses recursive neural network architectures like TreeLSTMs to learn syntactic composition functions, improving systematic generalization abilities.",https://doi.org/10.1609/aaai.v33i01.33017450,,AIO:ModelArchitecture
+Language Interface LLM,AIO:LanguageInterfaceLLM,Model,Language Interface LLM,Interactive learning,"A language interface LLM supports interactive semantic parsing, enabling users to provide feedback/corrections which are used to dynamically refine and update the language model.",TBD,,AIO:InterfaceandIntegration
+Multimodal Fusion LLM,AIO:MultimodalFusionLLM,Model,Multimodal Fusion LLM,cross-modal grounding,"A multimodal fusion LLM learns joint representations across different modalities like text, vision and audio in an end-to-end fashion for better cross-modal understanding and generation.",TBD,,AIO:ApplicationFocus
+Lifelong Learning LLM,AIO:LifelongLearningLLM,Model,Lifelong Learning LLM|Continual Learning LLM,Catastrophic forgetting|Plasticity-Stability balance,"A lifelong learning LLM can continually acquire new knowledge over time without forgetting previously learned information, maintaining a balance between plasticity and stability.",TBD,,AIO:LearningParadigms
+Curriculum Learning LLM,AIO:CurriculumLearningLLM,Model,Curriculum Learning LLM,Learning progression,"A curriculum learning LLM is trained by presenting learning examples in a meaningful order from simple to complex, mimicking the learning trajectory followed by humans.",TBD,,AIO:LearningParadigms
+Self-Supervised LLM,AIO:Self-SupervisedLLM,Model,Self-Supervised LLM,Pretext tasks,"A self-supervised LLM learns rich representations by solving pretext tasks that involve predicting parts of the input from other observed parts of the data, without relying on human-annotated labels.",TBD,,AIO:LearningParadigms
+Contrastive Learning LLM,AIO:ContrastiveLearningLLM,Model,Contrastive Learning LLM,Representation learning,"A contrastive learning LLM is trained to pull semantically similar samples closer together and push dissimilar samples apart in the representation space, learning high-quality features useful for downstream tasks.",TBD,,AIO:LearningParadigms
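As a sketch of the "pull similar samples together, push dissimilar ones apart" objective described above, here is a minimal InfoNCE-style loss in PyTorch; the temperature and the use of in-batch negatives are common assumptions, not specifics from the row.

```python
import torch
import torch.nn.functional as F

def info_nce(anchors, positives, temperature=0.07):
    # anchors, positives: (batch, dim); row i of each forms a positive pair,
    # and every other row in the batch acts as a negative.
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    logits = a @ p.T / temperature
    labels = torch.arange(a.size(0))  # the matching row is the correct "class"
    return F.cross_entropy(logits, labels)

print(info_nce(torch.randn(8, 32), torch.randn(8, 32)))
```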
+Generative Language Interface,AIO:GenerativeLanguageInterface,Model,Generative Language Interface,Interactive generation,"A generative language interface enables users to engage in an interactive dialogue with an LLM, providing feedback to guide and refine the generated outputs iteratively.",TBD,,AIO:ApplicationFocus
+Explainable LLM,AIO:ExplainableLLM,Model,Explainable Language Model|XAI LLM,interpretability|model understanding,"An explainable LLM is designed to provide insights into its decision-making process, making it easier for users to understand and trust the model's outputs. It incorporates mechanisms for interpreting and explaining its predictions in human-understandable terms.",TBD,,AIO:InterpretabilityandEthics
+Adversarial-Resistant LLM,AIO:Adversarial-ResistantLLM,Model,Robust LLM,adversarial attacks|robustness,"An adversarial-resistant LLM is engineered to withstand or mitigate the effects of adversarial attacks, ensuring reliable performance even in the presence of deliberately misleading input designed to confuse the model.",TBD,,AIO:EnhancementStrategies
+Cross-Domain LLM,AIO:Cross-DomainLLM,Model,Domain-General LLM,domain adaptation|cross-domain transfer,"A cross-domain LLM is capable of performing well across a wide range of domains without significant loss in performance, facilitated by advanced domain adaptation techniques.",TBD,,AIO:ApplicationFocus
+Zero-Shot Learning LLM,AIO:Zero-ShotLearningLLM,Model,Zero-Shot LLM,zero-shot learning,"A zero-shot learning LLM is able to perform tasks or understand concepts it has not explicitly been trained on, demonstrating a high degree of generalization and understanding.",TBD,,AIO:LearningParadigms
+Transfer Learning LLM,AIO:TransferLearningLLM,Model,Transfer LLM,transfer learning,"A transfer learning LLM leverages knowledge acquired during training on one task to improve performance on different but related tasks, facilitating more efficient learning and adaptation.",TBD,,AIO:LearningParadigms
+Low-Resource LLM,AIO:Low-ResourceLLM,Model,Low-Resource Language Model,resource-efficient|low-resource languages,"A low-resource LLM is optimized for performance in scenarios with limited data, computational resources, or for languages with sparse datasets.",TBD,,AIO:ApplicationFocus
+Interpretable LLM,AIO:InterpretableLLM,Model,Interpretable Language Model,model transparency|interpretability,"An interpretable LLM prioritizes transparency and ease of understanding in its operations, making its decision-making processes clear and rational to human users.",TBD,,AIO:InterpretabilityandEthics
+Evolutionary LLM,AIO:EvolutionaryLLM,Model,Evolutionary Language Model,evolutionary algorithms|genetic programming,"An evolutionary LLM applies principles of evolutionary computation to optimize its structure and parameters, evolving over time to improve performance.",TBD,,AIO:LearningParadigms
+Data-to-Text LLM,AIO:Data-to-TextLLM,Model,Data-to-Text LLM|Meaning representation,,"A data-to-text LLM generates natural language descriptions from structured data sources such as tables, graphs, and knowledge bases, which requires grounded meaning representations.",TBD,,AIO:ApplicationFocus
+Autoregressive Language Model,AIO:AutoregressiveLanguageModel,Model,Autoregressive Language Model,generative language model|sequence-to-sequence model,"An autoregressive language model is a type of language model that generates text sequentially, predicting one token at a time based on the previously generated tokens. It excels at natural language generation tasks by modeling the probability distribution over sequences of tokens.",TBD,,AIO:ModelArchitecture
+Masked Language Model,AIO:MaskedLanguageModel,Model,Masked Language Model,bidirectional encoder|denoising autoencoder,"A masked language model is a type of language model that is trained to predict randomly masked tokens in a sequence, based on the remaining unmasked tokens. This allows it to build deep bidirectional representations that can be effectively transferred to various NLP tasks via fine-tuning.",TBD,,AIO:ModelArchitecture
+Multimodal Transformer,AIO:MultimodalTransformer,Model,Multimodal Transformer,vision-language model|unified encoder,"A multimodal transformer is a transformer architecture that can process and relate information from different modalities, such as text, images, and audio. It uses a shared embedding space and attention mechanism to learn joint representations across modalities.",TBD,,AIO:ModelArchitecture
+Generative Adversarial Network-Augmented LLM,AIO:GenerativeAdversarialNetwork-AugmentedLLM,Model,GAN-LLM|Generative Adversarial Network-Augmented LLM,text generation|adversarial training,"A GAN-augmented LLM incorporates a generative adversarial network (GAN) into its training process, using a discriminator network to provide a signal for generating more realistic and coherent text. This adversarial training can improve the quality and diversity of generated text.",TBD,,AIO:ModelArchitecture
+Retrieval-Augmented LLM,AIO:Retrieval-AugmentedLLM,Model,Retrieval-Augmented LLM,knowledge grounding|open-book question answering,"A retrieval-augmented LLM combines a pre-trained language model with a retrieval system that can access external knowledge sources. This allows the model to condition its generation on relevant retrieved knowledge, improving factual accuracy and knowledge grounding.",TBD,,AIO:ModelArchitecture
+Reinforcement Learning LLM,AIO:ReinforcementLearningLLM,Model,RL-LLM|Reinforcement Learning LLM,reward modeling|decision transformers,"An RL-LLM is a language model that is fine-tuned using reinforcement learning, where the model receives rewards for generating text that satisfies certain desired properties or objectives. This can improve the quality, safety, or alignment of generated text.",TBD,,AIO:LearningParadigms
+Continual Learning LLM,AIO:ContinualLearningLLM,Model,CL-LLM|Continual Learning LLM,lifelong learning|catastrophic forgetting,"A continual learning LLM is designed to continually acquire new knowledge and skills over time, without forgetting previously learned information. This allows the model to adapt and expand its capabilities as new data becomes available.",TBD,,AIO:LearningParadigms
+Compositional Generalization LLM,AIO:CompositionalGeneralizationLLM,Model,Compositional Generalization LLM,systematic generalization|out-of-distribution generalization,"A compositional generalization LLM is trained to understand and recombine the underlying compositional structures in language, enabling better generalization to novel combinations and out-of-distribution examples.",TBD,,AIO:LearningParadigms
+Prompt-based Fine-Tuning LLM,AIO:Prompt-basedFine-TuningLLM,Model,Prompt-tuned LLM|Prompt-based Fine-Tuning LLM,in-context learning|few-shot learning,"A prompt-tuned LLM is fine-tuned on a small number of examples or prompts, rather than full task datasets. This allows for rapid adaptation to new tasks with limited data, leveraging the model's few-shot learning capabilities.",TBD,,AIO:LearningParadigms
+Sparse LLM,AIO:SparseLLM,Model,Sparse LLM,model compression|parameter efficiency,"A sparse LLM uses techniques like pruning or quantization to reduce the number of non-zero parameters in the model, making it more parameter-efficient and easier to deploy on resource-constrained devices.",TBD,,AIO:ModelArchitecture
+Federated LLM,AIO:FederatedLLM,Model,Federated LLM,privacy-preserving|decentralized training,"A federated LLM is trained in a decentralized manner across multiple devices or silos, without directly sharing private data. This enables collaborative training while preserving data privacy and security.",TBD,,AIO:EnhancementStrategies
+Energy-Based LLM,AIO:Energy-BasedLLM,Model,Energy-Based LLM,energy scoring|explicit density modeling,"An energy-based LLM models the explicit probability density over token sequences using an energy function, rather than an autoregressive factorization. This can improve modeling of long-range dependencies and global coherence.",TBD,,AIO:ModelArchitecture
+Controllable LLM,AIO:ControllableLLM,Model,Controllable LLM,conditional generation|guided generation,"A controllable LLM allows for explicit control over certain attributes of the generated text, such as style, tone, topic, or other desired characteristics, through conditioning or specialized training objectives.",TBD,,AIO:EnhancementStrategies
+Knowledge-Grounded LLM,AIO:Knowledge-GroundedLLM,Model,Knowledge-Grounded LLM,factual grounding|knowledge integration,"A knowledge-grounded LLM incorporates external knowledge sources or knowledge bases into the model architecture, enabling it to generate more factually accurate and knowledge-aware text.",TBD,,AIO:EnhancementStrategies
+Domain-Adapted LLM,AIO:Domain-AdaptedLLM,Model,Domain-Adapted LLM,transfer learning|domain robustness,"A domain-adapted LLM is first pre-trained on a broad corpus, then fine-tuned on domain-specific data to specialize its capabilities for particular domains or applications, like scientific literature or code generation.",TBD,,AIO:EnhancementStrategies
+Dialogue LLM,AIO:DialogueLLM,Model,Dialogue LLM,conversational AI|multi-turn dialogue,"A dialogue LLM is optimized for engaging in multi-turn conversations, understanding context and generating relevant, coherent responses continuously over many dialogue turns.",TBD,,AIO:ApplicationFocus
+Instruction-Tuned LLM,AIO:Instruction-TunedLLM,Model,Instruction-Tuned LLM,natural language instructions|constitutional AI,"An instruction-tuned LLM is fine-tuned to follow natural language instructions accurately and safely, learning to map from instructions to desired model behavior in a more controlled and principled way.",TBD,,AIO:ApplicationFocus
+Modular LLM,AIO:ModularLLM,Model,Modular LLM,component skills|skill composition,"A modular LLM consists of multiple specialized components or skills that can be dynamically composed and recombined to solve complex tasks, mimicking the modular structure of human cognition.",TBD,,AIO:ModelArchitecture
+Recursive LLM,AIO:RecursiveLLM,Model,Recursive LLM|Self-Attending LLM,self-attention|iterative refinement,"A recursive or self-attending LLM incorporates recursive self-attention mechanisms, allowing it to iteratively refine its own outputs and capture long-range dependencies more effectively.",TBD,,AIO:ModelArchitecture
+Mixture-of-Experts LLM,AIO:Mixture-of-ExpertsLLM,Model,MoE LLM|Mixture-of-Experts LLM,conditional computation|model parallelism,"A Mixture-of-Experts LLM dynamically selects and combines outputs from multiple expert submodels, allowing for efficient scaling by conditionally activating only a subset of model components for each input.",TBD,,AIO:ModelArchitecture
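To illustrate the conditional computation described in the Mixture-of-Experts row, here is a small hedged PyTorch sketch of top-k gating over linear "experts"; real MoE LLMs use transformer feed-forward experts and load-balancing losses, which are omitted here.

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, dim, n_experts=4, k=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.gate = nn.Linear(dim, n_experts)
        self.k = k

    def forward(self, x):
        # Route each input to its top-k experts and mix their outputs with
        # renormalized gate weights; unselected experts are never evaluated.
        scores = self.gate(x).softmax(dim=-1)
        weights, idx = scores.topk(self.k, dim=-1)
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

print(TopKMoE(dim=16)(torch.randn(8, 16)).shape)  # torch.Size([8, 16])
```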
+Meta-Learning LLM,AIO:Meta-LearningLLM,Model,Meta-Learning LLM,few-shot adaptation|learning to learn,"A meta-learning LLM is trained in a way that allows it to quickly adapt to new tasks or datasets through only a few examples or fine-tuning steps, leveraging meta-learned priors about how to efficiently learn.",TBD,,AIO:LearningParadigms
+Neuro-Symbolic LLM,AIO:Neuro-SymbolicLLM,Model,Neuro-Symbolic LLM,knowledge reasoning|symbolic grounding,"A neuro-symbolic LLM combines neural language modeling with symbolic reasoning components, leveraging structured knowledge representations and logical inferences to improve reasoning capabilities.",TBD,,AIO:InterpretabilityandEthics
+Causal LLM,AIO:CausalLLM,Model,Causal LLM,unidirectional|autoregressive,"A causal LLM only attends to previous tokens in the sequence when generating text, modeling the probability distribution autoregressively from left-to-right or causally.",TBD,,AIO:InterpretabilityandEthics
+Ordinal LLM,AIO:OrdinalLLM,Model,Ordinal LLM,ranking|preference modeling,"An ordinal LLM is trained to model ordinal relationships and rank outputs, rather than model probability distributions over text sequences directly.",TBD,,AIO:InterpretabilityandEthics
+Differentiable LLM,AIO:DifferentiableLLM,Model,Differentiable LLM,end-to-end training|fully backpropagable,"A differentiable LLM has an architecture amenable to full end-to-end training via backpropagation, without relying on teacher forcing or unlikelihood training objectives.",TBD,,AIO:ModelArchitecture
+Unsupervised LLM,AIO:UnsupervisedLLM,Model,Unsupervised LLM,self-supervised,"An unsupervised LLM is trained solely on unlabeled data using self-supervised objectives like masked language modeling, without any supervised fine-tuning.",TBD,,AIO:LearningParadigms
+Semi-Supervised LLM,AIO:Semi-SupervisedLLM,Model,Semi-Supervised LLM,self-training,A semi-supervised LLM combines self-supervised pretraining on unlabeled data with supervised fine-tuning on labeled task data.,TBD,,AIO:LearningParadigms
+Multilingual LLM,AIO:MultilingualLLM,Model,Multilingual LLM,cross-lingual transfer,"A multilingual LLM is trained on text from multiple languages, learning shared representations that enable zero-shot or few-shot transfer to new languages.",TBD,,AIO:LearningParadigms
+Personalized LLM,AIO:PersonalizedLLM,Model,Personalized LLM,user adaptation LLM,"A personalized LLM adapts its language modeling and generation to the preferences, style and persona of individual users or audiences.",TBD,,AIO:ApplicationFocus
+Embodied LLM,AIO:EmbodiedLLM,Model,Embodied LLM,multimodal grounding,"An embodied LLM integrates language with other modalities like vision, audio, robotics to enable grounded language understanding in real-world environments.",TBD,,AIO:ApplicationFocus
+Ethical LLM,AIO:EthicalLLM,Model,Ethical LLM,value alignment|constitutional AI,"An ethical LLM is trained to uphold certain ethical principles, values or rules in its language generation to increase safety and trustworthiness.",TBD,,AIO:ApplicationFocus
+Multi-Task LLM,AIO:Multi-TaskLLM,Model,Multi-Task LLM,transfer learning,"A multi-task LLM is trained jointly on multiple language tasks simultaneously, learning shared representations that transfer across tasks.",TBD,,AIO:InterpretabilityandEthics
+Rational LLM,AIO:RationalLLM,Model,Rational LLM,reasoning|logical inferences,"A rational LLM incorporates explicit reasoning capabilities, leveraging logical rules, axioms or external knowledge to make deductive inferences during language tasks.",TBD,,AIO:ApplicationFocus
+Multimodal Fusion LLM,AIO:MultimodalFusionLLM,Model,Multimodal Fusion LLM,cross-modal grounding,"A multimodal fusion LLM learns joint representations across different modalities like text, vision and audio in an end-to-end fashion for better cross-modal understanding and generation.",TBD,,AIO:EnhancementStrategies
+Memory-Augmented LLM,AIO:Memory-AugmentedLLM,Model,Memory-Augmented LLM,external memory,"A memory-augmented LLM incorporates external writeable and readable memory components, allowing it to store and retrieve information over long contexts.",TBD,,AIO:ApplicationFocus
+Hierarchical LLM,AIO:HierarchicalLLM,Model,Hierarchical LLM,multi-scale representations,"A hierarchical LLM models language at multiple levels of granularity, learning hierarchical representations that can capture both low-level patterns and high-level abstractions.",TBD,,AIO:ApplicationFocus
+Generative Commonsense LLM,AIO:GenerativeCommonsenseLLM,Model,Generative Commonsense LLM,physical reasoning|causal modeling,"A generative commonsense LLM is trained to understand and model basic physics, causality and common sense about how the real world works.",TBD,,AIO:ApplicationFocus
+Implicit Language Model,AIO:ImplicitLanguageModel,Model,Implicit Language Model,Energy-based models|Token-level scoring,An implicit language model uses an energy function to score full sequences instead of factorizing probabilities autoregressively. This can better capture global properties and long-range dependencies.,TBD,,AIO:InterpretabilityandEthics
+Factorized LLM,AIO:FactorizedLLM,Model,Factorized LLM,Conditional masking|Product of experts,A factorized LLM decomposes the full language modeling task into multiple sub-components or experts that each focus on a subset of the information. This enables more efficient scaling.,TBD,,AIO:ModelArchitecture
+Graph LLM,AIO:GraphLLM,Model,Graph LLM,Structured representations,"A graph LLM operates over structured inputs/outputs represented as graphs, enabling reasoning over explicit relational knowledge representations during language tasks.",https://doi.org/10.48550/arXiv.2311.12399,,AIO:ModelArchitecture
+Model Architecture,AIO:ModelArchitecture,Model,,,An abstract parent class grouping LLMs based on model architecture.,TBD,,AIO:LargeLanguageModel
+Learning Paradigms,AIO:LearningParadigms,Model,,,An abstract parent class grouping LLMs based on model learning paradigms.,TBD,,AIO:LargeLanguageModel
+Application Focus,AIO:ApplicationFocus,Model,,,An abstract parent class grouping LLMs based on model application focus.,TBD,,AIO:LargeLanguageModel
+Enhancement Strategies,AIO:EnhancementStrategies,Model,,,An abstract parent class grouping LLMs based on model enhancement strategies.,TBD,,AIO:LargeLanguageModel
+Interpretability and Ethics,AIO:InterpretabilityandEthics,Model,,,An abstract parent class grouping LLMs based on model interpretability and ethics.,TBD,,AIO:LargeLanguageModel
+Interface and Integration,AIO:InterfaceandIntegration,Model,,,An abstract parent class grouping LLMs based on model interfaces and integration.,TBD,,AIO:LargeLanguageModel
\ No newline at end of file
diff --git a/src/ontology/components/aio-component.owl b/src/ontology/components/aio-component.owl
index 2fabcd2..b01bdb9 100644
--- a/src/ontology/components/aio-component.owl
+++ b/src/ontology/components/aio-component.owl
@@ -303,16 +303,6 @@
-
-
-
-
- An abstract parent class grouping LLMs based on advanced techniques.
- Advanced Techniques
-
-
-
-
@@ -323,6 +313,12 @@
robustness
Adversarial-Resistant LLM
+
+
+
+ An adversarial-resistant LLM is engineered to withstand or mitigate the effects of adversarial attacks, ensuring reliable performance even in the presence of deliberately misleading input designed to confuse the model.
+ TBD
+
@@ -394,9 +390,15 @@
- An abstract parent class grouping LLMs based on application focus.
+ An abstract parent class grouping LLMs based on model application focus.
Application Focus
+
+
+
+ An abstract parent class grouping LLMs based on model application focus.
+ TBD
+
@@ -488,13 +490,19 @@
-
+
An autoregressive language model is a type of language model that generates text sequentially, predicting one token at a time based on the previously generated tokens. It excels at natural language generation tasks by modeling the probability distribution over sequences of tokens.
Autoregressive Language Model
generative language model
sequence-to-sequence model
Autoregressive Language Model
+
+
+
+ An autoregressive language model is a type of language model that generates text sequentially, predicting one token at a time based on the previously generated tokens. It excels at natural language generation tasks by modeling the probability distribution over sequences of tokens.
+ TBD
+
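
To make the left-to-right factorization in this definition concrete, here is a minimal, hypothetical Python sketch of autoregressive generation: it samples one token at a time conditioned on what has already been produced. The bigram table, vocabulary, and special tokens are invented for illustration; a real LLM conditions on the whole prefix with a neural network rather than on just the previous token.

```python
import random

# Toy conditional distribution p(next token | previous token); the vocabulary
# and probabilities are invented for illustration only.
BIGRAM = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a":   {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "</s>": 0.3},
    "dog": {"sat": 0.7, "</s>": 0.3},
    "sat": {"</s>": 1.0},
}

def generate(max_len: int = 10, seed: int = 0) -> list:
    """Sample tokens left to right, each conditioned on the sequence so far."""
    random.seed(seed)
    tokens = ["<s>"]
    while len(tokens) < max_len and tokens[-1] != "</s>":
        dist = BIGRAM[tokens[-1]]                                   # condition on the history
        nxt = random.choices(list(dist), weights=list(dist.values()))[0]
        tokens.append(nxt)                                          # extend the sequence
    return tokens[1:]

print(generate())
```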
@@ -887,13 +895,19 @@
-
+
A causal LLM only attends to previous tokens in the sequence when generating text, modeling the probability distribution autoregressively from left-to-right or causally.
Causal LLM
autoregressive
unidirectional
Causal LLM
+
+
+
+ A causal LLM only attends to previous tokens in the sequence when generating text, modeling the probability distribution autoregressively from left-to-right or causally.
+ TBD
+
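
The defining constraint of a causal LLM is that position i may only attend to positions at or before i. The NumPy sketch below, with invented dimensions, shows the usual way that constraint is imposed: a strictly upper-triangular mask sets future attention scores to negative infinity before the softmax.

```python
import numpy as np

def causal_attention(scores: np.ndarray) -> np.ndarray:
    """Mask out future positions, then softmax each row.

    `scores` is a (seq_len, seq_len) matrix of raw attention scores;
    entry (i, j) says how much position i attends to position j.
    """
    seq_len = scores.shape[0]
    mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)  # strictly upper triangle
    scores = np.where(mask, -np.inf, scores)        # forbid attending to the future
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

np.random.seed(0)
attn = causal_attention(np.random.randn(4, 4))
print(np.round(attn, 2))   # upper triangle is exactly 0: no token sees its future
```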
@@ -940,6 +954,12 @@
Text normalization
Cleaning And Normalization
+
+
+
+ Removing irrelevant data, correcting typos, and standardizing text to reduce noise and ensure consistency in the data.
+ TBD
+
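
As a rough illustration of the cleaning and standardization steps named here, the following sketch applies a few common normalizations (Unicode folding, lowercasing, URL stripping, whitespace collapsing). The specific steps are assumptions; production pipelines vary widely.

```python
import re
import unicodedata

def normalize(text: str) -> str:
    """Basic cleaning: Unicode normalization, lowercasing, whitespace collapsing."""
    text = unicodedata.normalize("NFKC", text)   # fold equivalent Unicode forms
    text = text.lower()                          # case-fold
    text = re.sub(r"https?://\S+", " ", text)    # drop URLs as irrelevant noise
    text = re.sub(r"\s+", " ", text)             # collapse runs of whitespace
    return text.strip()

print(normalize("  Visit   https://example.org and say HELLO!\n"))
```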
@@ -979,13 +999,19 @@
-
+
A compositional generalization LLM is trained to understand and recombine the underlying compositional structures in language, enabling better generalization to novel combinations and out-of-distribution examples.
Compositional Generalization LLM
out-of-distribution generalization
systematic generalization
Compositional Generalization LLM
+
+
+
+ A compositional generalization LLM is trained to understand and recombine the underlying compositional structures in language, enabling better generalization to novel combinations and out-of-distribution examples.
+ TBD
+
@@ -1108,7 +1134,7 @@
-
+
A continual learning LLM is designed to continually acquire new knowledge and skills over time, without forgetting previously learned information. This allows the model to adapt and expand its capabilities as new data becomes available.
CL-LLM
Continual Learning LLM
@@ -1116,6 +1142,12 @@
lifelong learning
Continual Learning LLM
+
+
+
+ A continual learning LLM is designed to continually acquire new knowledge and skills over time, without forgetting previously learned information. This allows the model to adapt and expand its capabilities as new data becomes available.
+ TBD
+
@@ -1138,12 +1170,18 @@
-
+
A contrastive learning LLM is trained to pull semantically similar samples closer together and push dissimilar samples apart in the representation space, learning high-quality features useful for downstream tasks.
Contrastive Learning LLM
Representation learning
Contrastive Learning LLM
+
+
+
+ A contrastive learning LLM is trained to pull semantically similar samples closer together and push dissimilar samples apart in the representation space, learning high-quality features useful for downstream tasks.
+ TBD
+
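
One common way to realize "pull similar samples together, push dissimilar ones apart" is an InfoNCE-style objective. The NumPy sketch below is a hypothetical illustration on random vectors, not the training objective of any specific model.

```python
import numpy as np

def info_nce_loss(anchors: np.ndarray, positives: np.ndarray, temperature: float = 0.1) -> float:
    """InfoNCE-style contrastive loss on L2-normalized embeddings.

    Row i of `anchors` and row i of `positives` are two views of the same
    sample; every other row in the batch serves as a negative.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                  # cosine similarities, scaled
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The "correct" pairing is the diagonal: sample i should match positive i.
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
loss_matched = info_nce_loss(x, x + 0.01 * rng.normal(size=x.shape))   # similar views -> low loss
loss_random  = info_nce_loss(x, rng.normal(size=(8, 16)))              # unrelated views -> higher loss
print(loss_matched, loss_random)
```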
@@ -1157,6 +1195,12 @@
guided generation
Controllable LLM
+
+
+
+ A controllable LLM allows for explicit control over certain attributes of the generated text, such as style, tone, topic, or other desired characteristics, through conditioning or specialized training objectives.
+ TBD
+
@@ -1401,13 +1445,19 @@
-
+
A cross-domain LLM is capable of performing well across a wide range of domains without significant loss in performance, facilitated by advanced domain adaptation techniques.
Domain-General LLM
cross-domain transfer
domain adaptation
Cross-Domain LLM
+
+
+
+ A cross-domain LLM is capable of performing well across a wide range of domains without significant loss in performance, facilitated by advanced domain adaptation techniques.
+ TBD
+
@@ -1422,30 +1472,48 @@
Sequential learning
Curriculum Learning
+
+
+
+ Training the model on simpler tasks or easier data first, then gradually introducing more complex tasks to improve learning efficiency and performance.
+ TBD
+
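
A minimal sketch of the "simpler first" idea in this definition: order examples by an assumed difficulty score before feeding them to training. Sequence length is used here purely as a stand-in heuristic.

```python
def curriculum_order(examples, difficulty=len):
    """Order training examples from easy to hard.

    `difficulty` is any scoring function; sequence length is a crude proxy,
    chosen here only for illustration.
    """
    return sorted(examples, key=difficulty)

corpus = ["a short sentence".split(),
          "one".split(),
          "a much longer and more complicated training sentence".split()]

for stage, example in enumerate(curriculum_order(corpus), start=1):
    # Earlier stages see shorter (easier) sequences; later stages see harder ones.
    print(stage, example)
```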
-
+
A curriculum learning LLM is trained by presenting learning examples in a meaningful order from simple to complex, mimicking the learning trajectory followed by humans.
Curriculum Learning LLM
Learning progression
Curriculum Learning LLM
+
+
+
+ A curriculum learning LLM is trained by presenting learning examples in a meaningful order from simple to complex, mimicking the learning trajectory followed by humans.
+ TBD
+
-
+
+ A data-to-text LLM generates natural language descriptions from structured data sources such as tables, graphs, and knowledge bases, which requires grounding in meaning representations.
Data-to-Text LLM
Meaning representation
- A data-to-text LLM generates natural language descriptions from structured data sources like tables, graphs, knowledge bases, etc. Requiring grounding meaning representations.
Data-to-Text LLM
+
+
+
+ A data-to-text LLM generates natural language descriptions from structured data sources such as tables, graphs, and knowledge bases, which requires grounding in meaning representations.
+ TBD
+
@@ -1460,6 +1528,12 @@
Synonym replacement
Data Augmentation
+
+
+
+ Expanding the training dataset artificially by modifying existing data points to improve the model's robustness and generalization ability.
+ TBD
+
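
The synonym-replacement technique listed above can be illustrated with a small sketch; the synonym table and replacement rate below are invented, and real pipelines typically draw synonyms from a lexical resource or embedding neighborhood instead.

```python
import random

# A hand-made synonym table, invented for illustration.
SYNONYMS = {"quick": ["fast", "speedy"], "happy": ["glad", "cheerful"]}

def synonym_augment(sentence: str, rate: float = 0.5, seed: int = 0) -> str:
    """Return a perturbed copy of `sentence` with some words swapped for synonyms."""
    random.seed(seed)
    out = []
    for word in sentence.split():
        if word in SYNONYMS and random.random() < rate:
            out.append(random.choice(SYNONYMS[word]))   # replace with a synonym
        else:
            out.append(word)                            # keep the original word
    return " ".join(out)

print(synonym_augment("the quick dog looks happy"))
```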
@@ -1483,11 +1557,16 @@
- Methods that enhance the training data or its representation, including augmentation and feature extraction.
- Data Improvement
- Data Refinement
- Data Enhancement
+
+ The processes and techniques used to improve data quality and value for better decision-making, analysis, and AI model training.
+ Data Enhancement
+
+
+
+ The processes and techniques used to improve data quality and value for better decision-making, analysis, and AI model training.
+ TBD
+
@@ -1526,12 +1605,19 @@
+
Techniques focused on preparing raw data for training, including cleaning, normalization, and tokenization.
Data Assembly
Data Curation
Data Processing
Data Preparation
+
+
+
+ Techniques focused on preparing raw data for training, including cleaning, normalization, and tokenization.
+ TBD
+
@@ -1630,6 +1716,12 @@
Input, Kernel, Convolutional/Pool, Probabilistic Hidden, Convolutional/Pool, Kernel, Output
Deep Convolutional Inverse Graphics Network
+
+
+
+ A Deep Convolution Inverse Graphics Network (DC-IGN) is a model that learns an interpretable representation of images. This representation is disentangled with respect to transformations such as out-of-plane rotations and lighting variations. The DC-IGN model is composed of multiple layers of convolution and de-convolution operators and is trained using the Stochastic Gradient Variational Bayes (SGVB) algorithm. (https://arxiv.org/abs/1503.03167)
+ TBD
+
@@ -1656,13 +1748,7 @@
-
-
-
-
-
-
-
+
The feedforward neural network was the first and simplest type of artificial neural network devised. In this network, the information moves in only one direction—forward—from the input nodes, through the hidden nodes (if any) and to the output nodes. There are no cycles or loops in the network.
DFF
@@ -1671,10 +1757,10 @@
MLP
Multilayer Perceptoron
Input, Hidden, Output
- Deep FeedFoward
+ Deep FeedForward
-
+
The feedforward neural network was the first and simplest type of artificial neural network devised. In this network, the information moves in only one direction—forward—from the input nodes, through the hidden nodes (if any) and to the output nodes. There are no cycles or loops in the network.
https://en.wikipedia.org/wiki/Feedforward_neural_network
@@ -1690,6 +1776,12 @@
DNN
Deep Neural Network
+
+
+
+ A deep neural network (DNN) is an artificial neural network (ANN) with multiple layers between the input and output layers.[13][2] There are different types of neural networks but they always consist of the same components: neurons, synapses, weights, biases, and functions. (https://en.wikipedia.org/wiki/Deep_Learning#:~:text=A%20deep%20neural%20network%20(DNN,weights%2C%20biases%2C%20and%20functions.)
+ TBD
+
@@ -1715,9 +1807,16 @@
Denoising Auto Encoders (DAEs) take a partially corrupted input and are trained to recover the original undistorted input. In practice, the objective of denoising autoencoders is that of cleaning the corrupted input, or denoising. (https://en.wikipedia.org/wiki/Autoencoder)
DAE
+ Denoising Autoencoder
Noisy Input, Hidden, Matched Output-Input
Denoising Auto Encoder
+
+
+
+ Denoising Auto Encoders (DAEs) take a partially corrupted input and are trained to recover the original undistorted input. In practice, the objective of denoising autoencoders is that of cleaning the corrupted input, or denoising. (https://en.wikipedia.org/wiki/Autoencoder)
+ https://doi.org/10.1145/1390156.1390294
+
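
To underline the key point of this definition, that reconstruction is scored against the clean input while the encoder only ever sees a corrupted copy, here is a toy NumPy sketch using masking noise and a one-layer linear denoiser fit in closed form. The data, dimensions, and noise level are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt(x: np.ndarray, drop_prob: float = 0.3) -> np.ndarray:
    """Masking noise: randomly zero out entries of the input."""
    return x * (rng.random(x.shape) > drop_prob)

# Toy data: 200 samples lying in a 3-dimensional subspace of an 8-dimensional space.
basis = rng.normal(size=(3, 8))
clean = rng.normal(size=(200, 3)) @ basis
noisy = corrupt(clean)

# A one-layer linear "denoiser": map the corrupted input back toward the clean
# input.  Crucially, the regression target is the *clean* data, which is the
# defining property of a denoising autoencoder (fit here in closed form).
W, *_ = np.linalg.lstsq(noisy, clean, rcond=None)

before = np.mean((noisy - clean) ** 2)
after = np.mean((noisy @ W - clean) ** 2)
print(f"reconstruction error: corrupted={before:.3f}, denoised={after:.3f}")
```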
@@ -1827,19 +1926,31 @@
multi-turn dialogue
Dialogue LLM
+
+
+
+ A dialogue LLM is optimized for engaging in multi-turn conversations, understanding context and generating relevant, coherent responses continuously over many dialogue turns.
+ TBD
+
-
+
A differentiable LLM has an architecture amenable to full end-to-end training via backpropagation, without relying on teacher forcing or unlikelihood training objectives.
Differentiable LLM
end-to-end training
fully backpropagable
Differentiable LLM
+
+
+
+ A differentiable LLM has an architecture amenable to full end-to-end training via backpropagation, without relying on teacher forcing or unlikelihood training objectives.
+ TBD
+
@@ -1887,19 +1998,31 @@
Teacher-student model
Distillation
+
+
+
+ Knowledge distillation involves training a smaller model to replicate the behavior of a larger model, aiming to compress the knowledge into a more compact form without significant loss of performance.
+ https://doi.org/10.48550/arXiv.2105.13093
+
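
A hypothetical NumPy sketch of the teacher-matching part of knowledge distillation: the student is trained to match the teacher's temperature-softened output distribution. Real recipes usually add a hard-label loss term, which is omitted here, and the logits below are random placeholders.

```python
import numpy as np

def softmax(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    z = logits / temperature
    z -= z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0) -> float:
    """Cross-entropy between softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    log_p_student = np.log(softmax(student_logits, temperature) + 1e-12)
    return float(-np.mean(np.sum(p_teacher * log_p_student, axis=-1)))

rng = np.random.default_rng(0)
teacher = 3 * rng.normal(size=(4, 10))                        # logits from the large model
print(distillation_loss(teacher + 0.1, teacher))              # student matches teacher -> minimal loss
print(distillation_loss(3 * rng.normal(size=(4, 10)), teacher))  # mismatched student -> larger loss
```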
-
+
A domain-adapted LLM is first pre-trained on a broad corpus, then fine-tuned on domain-specific data to specialize its capabilities for particular domains or applications, like scientific literature or code generation.
Domain-Adapted LLM
domain robustness
transfer learning
Domain-Adapted LLM
+
+
+
+ A domain-adapted LLM is first pre-trained on a broad corpus, then fine-tuned on domain-specific data to specialize its capabilities for particular domains or applications, like scientific literature or code generation.
+ TBD
+
@@ -2046,6 +2169,12 @@
multimodal grounding
Embodied LLM
+
+
+
+ An embodied LLM integrates language with other modalities such as vision, audio, and robotics to enable grounded language understanding in real-world environments.
+ TBD
+
@@ -2102,13 +2231,19 @@
-
+
An energy-based LLM models the explicit probability density over token sequences using an energy function, rather than an autoregressive factorization. This can improve modeling of long-range dependencies and global coherence.
Energy-Based LLM
energy scoring
explicit density modeling
Energy-Based LLM
+
+
+
+ An energy-based LLM models the explicit probability density over token sequences using an energy function, rather than an autoregressive factorization. This can improve modeling of long-range dependencies and global coherence.
+ TBD
+
@@ -2116,9 +2251,15 @@
- An abstract parent class grouping LLMs based on enhancement strategies.
+ An abstract parent class grouping LLMs based on model enhancement strategies.
Enhancement Strategies
+
+
+
+ An abstract parent class grouping LLMs based on model enhancement strategies.
+ TBD
+
@@ -2165,6 +2306,12 @@
value alignment
Ethical LLM
+
+
+
+ An ethical LLM is trained to uphold certain ethical principles, values or rules in its language generation to increase safety and trustworthiness.
+ TBD
+
@@ -2187,13 +2334,19 @@
-
+
An evolutionary LLM applies principles of evolutionary computation to optimize its structure and parameters, evolving over time to improve performance.
Evolutionary Language Model
evolutionary algorithms
genetic programming
Evolutionary LLM
+
+
+
+ An evolutionary LLM applies principles of evolutionary computation to optimize its structure and parameters, evolving over time to improve performance.
+ TBD
+
@@ -2216,7 +2369,7 @@
-
+
An explainable LLM is designed to provide insights into its decision-making process, making it easier for users to understand and trust the model's outputs. It incorporates mechanisms for interpreting and explaining its predictions in human-understandable terms.
Explainable Language Model
XAI LLM
@@ -2224,6 +2377,12 @@
model understanding
Explainable LLM
+
+
+
+ An explainable LLM is designed to provide insights into its decision-making process, making it easier for users to understand and trust the model's outputs. It incorporates mechanisms for interpreting and explaining its predictions in human-understandable terms.
+ TBD
+
@@ -2264,13 +2423,19 @@
-
+
A factorized LLM decomposes the full language modeling task into multiple sub-components or experts that each focus on a subset of the information. This enables more efficient scaling.
Factorized LLM
Conditional masking
Product of experts
Factorized LLM
+
+
+
+ A factorized LLM decomposes the full language modeling task into multiple sub-components or experts that each focus on a subset of the information. This enables more efficient scaling.
+ TBD
+
@@ -2285,19 +2450,31 @@
Syntactic information
Feature Extraction
+
+
+
+ Extracting specific features or patterns from the text before training to guide the model's learning process, including syntactic information or semantic embeddings.
+ TBD
+
-
+
A federated LLM is trained in a decentralized manner across multiple devices or silos, without directly sharing private data. This enables collaborative training while preserving data privacy and security.
Federated LLM
decentralized training
privacy-preserving
Federated LLM
+
+
+
+ A federated LLM is trained in a decentralized manner across multiple devices or silos, without directly sharing private data. This enables collaborative training while preserving data privacy and security.
+ TBD
+
@@ -2342,6 +2519,12 @@
Input, Hidden, Output, Hidden
Feedback Network
+
+
+
+ A feedback based approach in which the representation is formed in an iterative manner based on a feedback received from previous iteration's output. (https://arxiv.org/abs/1612.09508)
+ TBD
+
@@ -2416,7 +2599,16 @@
-
+
+ Function parent class
+ Function
+
+
+
+
+ Function parent class
+ TBD
+
@@ -2591,7 +2783,7 @@
-
+
A GAN-augmented LLM incorporates a generative adversarial network (GAN) into its training process, using a discriminator network to provide a signal for generating more realistic and coherent text. This adversarial training can improve the quality and diversity of generated text.
GAN-LLM
Generative Adversarial Network-Augmented LLM
@@ -2599,6 +2791,12 @@
text generation
Generative Adversarial Network-Augmented LLM
+
+
+
+ A GAN-augmented LLM incorporates a generative adversarial network (GAN) into its training process, using a discriminator network to provide a signal for generating more realistic and coherent text. This adversarial training can improve the quality and diversity of generated text.
+ TBD
+
@@ -2612,32 +2810,30 @@
physical reasoning
Generative Commonsense LLM
-
-
-
-
-
-
-
- A generative language circuit is a type of LLM that learns to generate programs or code by training on input-output pairs, effectively learning the underlying "program" that maps inputs to outputs.
- GLC
- Generative Language Circuit
- analogy circuit
- program synthesis
- Generative Language Circuit
-
+
+
+
+ A generative commonsense LLM is trained to understand and model basic physics, causality and common sense about how the real world works.
+ TBD
+
-
+
A generative language interface enables users to engage in an interactive dialogue with an LLM, providing feedback to guide and refine the generated outputs iteratively.
Generative Language Interface
Interactive generation
Generative Language Interface
+
+
+
+ A generative language interface enables users to engage in an interactive dialogue with an LLM, providing feedback to guide and refine the generated outputs iteratively.
+ TBD
+
@@ -2788,12 +2984,18 @@
-
+
A graph LLM operates over structured inputs/outputs represented as graphs, enabling reasoning over explicit relational knowledge representations during language tasks.
Graph LLM
Structured representations
Graph LLM
+
+
+
+ A graph LLM operates over structured inputs/outputs represented as graphs, enabling reasoning over explicit relational knowledge representations during language tasks.
+ https://doi.org/10.48550/arXiv.2311.12399
+
@@ -2942,6 +3144,12 @@
multi-scale representations
Hierarchical LLM
+
+
+
+ A hierarchical LLM models language at multiple levels of granularity, learning hierarchical representations that can capture both low-level patterns and high-level abstractions.
+ TBD
+
@@ -3081,13 +3289,19 @@
-
+
An implicit language model uses an energy function to score full sequences instead of factorizing probabilities autoregressively. This can better capture global properties and long-range dependencies.
Implicit Language Model
Energy-based models
Token-level scoring
Implicit Language Model
+
+
+
+ An implicit language model uses an energy function to score full sequences instead of factorizing probabilities autoregressively. This can better capture global properties and long-range dependencies.
+ TBD
+
@@ -3268,6 +3482,12 @@
natural language instructions
Instruction-Tuned LLM
+
+
+
+ An instruction-tuned LLM is fine-tuned to follow natural language instructions accurately and safely, learning to map from instructions to desired model behavior in a more controlled and principled way.
+ TBD
+
@@ -3287,16 +3507,54 @@
+
+
+
+
+ An abstract parent class grouping LLMs based on model interfaces and integration.
+ Interface and Integration
+
+
+
+
+ An abstract parent class grouping LLMs based on model interfaces and integration.
+ TBD
+
+
+
+
+
+
+
+
+ An abstract parent class grouping LLMs based on model interpretability and ethics.
+ Interpretability and Ethics
+
+
+
+
+ An abstract parent class grouping LLMs based on model interpretability and ethics.
+ TBD
+
+
+
+
-
+
An interpretable LLM prioritizes transparency and ease of understanding in its operations, making its decision-making processes clear and rational to human users.
Interpretable Language Model
interpretability
model transparency
Interpretable LLM
+
+
+
+ An interpretable LLM prioritizes transparency and ease of understanding in its operations, making its decision-making processes clear and rational to human users.
+ TBD
+
@@ -3374,13 +3632,39 @@
-
+
A knowledge-grounded LLM incorporates external knowledge sources or knowledge bases into the model architecture, enabling it to generate more factually accurate and knowledge-aware text.
Knowledge-Grounded LLM
factual grounding
knowledge integration
Knowledge-Grounded LLM
+
+
+
+ A knowledge-grounded LLM incorporates external knowledge sources or knowledge bases into the model architecture, enabling it to generate more factually accurate and knowledge-aware text.
+ TBD
+
+
+
+
+
+
+
+
+ Starting the training from a model already trained on a related task to reduce training time and improve performance on tasks with limited data.
+ Inductive Transfer
+ Skill Acquisition
+ Adaptation
+ Pretrained models
+ Knowledge Transfer
+
+
+
+
+ Starting the training from a model already trained on a related task to reduce training time and improve performance on tasks with limited data.
+ https://doi.org/10.1016/j.knosys.2015.01.010
+
@@ -3406,12 +3690,6 @@
-
-
-
-
-
-
@@ -3499,12 +3777,18 @@
-
+
A language interface LLM supports interactive semantic parsing, enabling users to provide feedback/corrections which are used to dynamically refine and update the language model.
Language Interface LLM
Interactive learning
Language Interface LLM
+
+
+
+ A language interface LLM supports interactive semantic parsing, enabling users to provide feedback/corrections which are used to dynamically refine and update the language model.
+ TBD
+
@@ -3546,6 +3830,12 @@
Network layer parent class
Layer
+
+
+
+ Network layer parent class
+ TBD
+
@@ -3722,6 +4012,22 @@
+
+
+
+
+ An abstract parent class grouping LLMs based on model learning paradigms.
+ Learning Paradigms
+
+
+
+
+ An abstract parent class grouping LLMs based on model learning paradigms.
+ TBD
+
+
+
+
@@ -3741,7 +4047,7 @@
-
+
A lifelong learning LLM can continually acquire new knowledge over time without forgetting previously learned information, maintaining a balance between plasticity and stability.
Continual Learning LLM
Lifelong Learning LLM
@@ -3749,6 +4055,12 @@
Plasticity-Stability balance
Lifelong Learning LLM
+
+
+
+ A lifelong learning LLM can continually acquire new knowledge over time without forgetting previously learned information, maintaining a balance between plasticity and stability.
+ TBD
+
@@ -3936,20 +4248,25 @@
-
+
A low-resource LLM is optimized for performance in scenarios with limited data, computational resources, or for languages with sparse datasets.
Low-Resource Language Model
low-resource languages
resource-efficient
Low-Resource LLM
+
+
+
+ A low-resource LLM is optimized for performance in scenarios with limited data, computational resources, or for languages with sparse datasets.
+ TBD
+
-
A field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks.
Machine Learning
@@ -4001,13 +4318,19 @@
-
+
A masked language model is a type of language model that is trained to predict randomly masked tokens in a sequence, based on the remaining unmasked tokens. This allows it to build deep bidirectional representations that can be effectively transferred to various NLP tasks via fine-tuning.
Masked Language Model
bidirectional encoder
denoising autoencoder
Masked Language Model
+
+
+
+ A masked language model is a type of language model that is trained to predict randomly masked tokens in a sequence, based on the remaining unmasked tokens. This allows it to build deep bidirectional representations that can be effectively transferred to various NLP tasks via fine-tuning.
+ TBD
+
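
The masking objective described here can be sketched in a few lines: hide a random subset of tokens and keep the originals as prediction targets. This simplification ignores BERT-style refinements such as occasionally substituting a random token or leaving the token unchanged.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Return (corrupted input, prediction targets) for masked-LM training.

    Targets are None except where a token was masked; the model is trained to
    recover exactly those positions from the bidirectional context.
    """
    random.seed(seed)
    inputs, targets = [], []
    for tok in tokens:
        if random.random() < mask_prob:
            inputs.append(MASK)      # hide the token from the model
            targets.append(tok)      # ...but keep it as the label to predict
        else:
            inputs.append(tok)
            targets.append(None)     # unmasked positions contribute no loss
    return inputs, targets

print(mask_tokens("the cat sat on a mat".split(), mask_prob=0.3))
```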
@@ -4176,12 +4499,18 @@
-
+
A memory-augmented LLM incorporates external writeable and readable memory components, allowing it to store and retrieve information over long contexts.
Memory-Augmented LLM
external memory
Memory-Augmented LLM
+
+
+
+ A memory-augmented LLM incorporates external writeable and readable memory components, allowing it to store and retrieve information over long contexts.
+ TBD
+
@@ -4220,22 +4549,19 @@
-
+
A meta-learning LLM is trained in a way that allows it to quickly adapt to new tasks or datasets through only a few examples or fine-tuning steps, leveraging meta-learned priors about how to efficiently learn.
Meta-Learning LLM
few-shot adaptation
learning to learn
Meta-Learning LLM
-
-
-
-
-
-
- Method parent class.
- Method
-
+
+
+
+ A meta-learning LLM is trained in a way that allows it to quickly adapt to new tasks or datasets through only a few examples or fine-tuning steps, leveraging meta-learned priors about how to efficiently learn.
+ TBD
+
@@ -4283,6 +4609,12 @@
model parallelism
Mixture-of-Experts LLM
+
+
+
+ A Mixture-of-Experts LLM dynamically selects and combines outputs from multiple expert submodels, allowing for efficient scaling by conditionally activating only a subset of model components for each input.
+ TBD
+
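
A minimal sketch of the conditional-activation idea, assuming a softmax gate with top-k routing; the experts here are random linear maps invented for illustration. Only the selected experts are evaluated, which is where the efficiency gain described above comes from.

```python
import numpy as np

def moe_forward(x: np.ndarray, experts, gate_weights: np.ndarray, top_k: int = 2) -> np.ndarray:
    """Route input `x` to the top-k experts chosen by a softmax gate."""
    logits = gate_weights @ x                              # one score per expert
    top = np.argsort(logits)[-top_k:]                      # indices of the k best experts
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                                   # renormalize over the chosen experts
    return sum(p * experts[i](x) for p, i in zip(probs, top))

rng = np.random.default_rng(0)
dim, n_experts = 4, 8
experts = [lambda x, W=rng.normal(size=(dim, dim)): W @ x for _ in range(n_experts)]
gate = rng.normal(size=(n_experts, dim))
print(moe_forward(rng.normal(size=dim), experts, gate))
```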
@@ -4305,21 +4637,34 @@
-
+
An abstract parent class grouping LLMs based on model architecture.
Model Architecture
+
+
+
+ An abstract parent class grouping LLMs based on model architecture.
+ TBD
+
+
Techniques aimed at making models more efficient, such as knowledge distillation.
Computational Efficiency
Model Optimization
Model Efficiency
+
+
+
+ Techniques aimed at making models more efficient, such as knowledge distillation.
+ https://doi.org/10.1145/3578938
+
@@ -4349,18 +4694,30 @@
skill composition
Modular LLM
+
+
+
+ A modular LLM consists of multiple specialized components or skills that can be dynamically composed and recombined to solve complex tasks, mimicking the modular structure of human cognition.
+ TBD
+
-
+
A multi-task LLM is trained jointly on multiple language tasks simultaneously, learning shared representations that transfer across tasks.
Multi-Task LLM
transfer learning
Multi-Task LLM
+
+
+
+ A multi-task LLM is trained jointly on multiple language tasks simultaneously, learning shared representations that transfer across tasks.
+ TBD
+
@@ -4417,12 +4774,18 @@
-
+
A multilingual LLM is trained on text from multiple languages, learning shared representations that enable zero-shot or few-shot transfer to new languages.
Multilingual LLM
cross-lingual transfer
Multilingual LLM
+
+
+
+ A multilingual LLM is trained on text from multiple languages, learning shared representations that enable zero-shot or few-shot transfer to new languages.
+ TBD
+
@@ -4445,12 +4808,19 @@
-
+
+
A multimodal fusion LLM learns joint representations across different modalities like text, vision and audio in an end-to-end fashion for better cross-modal understanding and generation.
Multimodal Fusion LLM
cross-modal grounding
Multimodal Fusion LLM
+
+
+
+ A multimodal fusion LLM learns joint representations across different modalities like text, vision and audio in an end-to-end fashion for better cross-modal understanding and generation.
+ TBD
+
@@ -4461,19 +4831,31 @@
Methods which can represent the joint representations of different modalities.
Multimodal Learning
+
+
+
+ Methods which can represent the joint representations of different modalities.
+ TBD
+
-
+
A multimodal transformer is a transformer architecture that can process and relate information from different modalities, such as text, images, and audio. It uses a shared embedding space and attention mechanism to learn joint representations across modalities.
Multimodal Transformer
unified encoder
vision-language model
Multimodal Transformer
+
+
+
+ A multimodal transformer is a transformer architecture that can process and relate information from different modalities, such as text, images, and audio. It uses a shared embedding space and attention mechanism to learn joint representations across modalities.
+ TBD
+
@@ -4516,6 +4898,12 @@
Network parent class
Network
+
+
+
+ Network parent class
+ TBD
+
@@ -4541,13 +4929,19 @@
-
+
A neuro-symbolic LLM combines neural language modeling with symbolic reasoning components, leveraging structured knowledge representations and logical inferences to improve reasoning capabilities.
Neuro-Symbolic LLM
knowledge reasoning
symbolic grounding
Neuro-Symbolic LLM
+
+
+
+ A neuro-symbolic LLM combines neural language modeling with symbolic reasoning components, leveraging structured knowledge representations and logical inferences to improve reasoning capabilities.
+ TBD
+
@@ -4619,13 +5013,19 @@
-
+
An ordinal LLM is trained to model ordinal relationships and rank outputs, rather than model probability distributions over text sequences directly.
Ordinal LLM
preference modeling
ranking
Ordinal LLM
+
+
+
+ An ordinal LLM is trained to model ordinal relationships and rank outputs, rather than model probability distributions over text sequences directly.
+ TBD
+
@@ -4671,6 +5071,12 @@
Input, Output
Perceptron
+
+
+
+ The perceptron is an algorithm for supervised Learning of binary classifiers. A binary classifier is a function which can decide whether or not an input, represented by a vector of numbers, belongs to some specific class. It is a type of linear classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector. (https://en.wikipedia.org/wiki/Perceptron)
+ TBD
+
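
The classic perceptron update rule described in this definition fits in a few lines; the toy dataset below is invented and linearly separable, so the rule converges.

```python
import numpy as np

def train_perceptron(X: np.ndarray, y: np.ndarray, epochs: int = 20, lr: float = 1.0):
    """Perceptron rule: nudge the weights whenever a point is misclassified.

    `y` must contain labels in {-1, +1}; a bias term is folded in by
    appending a constant feature.
    """
    Xb = np.hstack([X, np.ones((len(X), 1))])      # add bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            if yi * (w @ xi) <= 0:                 # misclassified (or on the boundary)
                w += lr * yi * xi                  # move the boundary toward the point
    return w

# A linearly separable toy problem: label is the sign of x0 + x1.
X = np.array([[2.0, 1.0], [1.0, 3.0], [-2.0, -1.0], [-1.0, -3.0]])
y = np.array([1, 1, -1, -1])
w = train_perceptron(X, y)
print(np.sign(np.hstack([X, np.ones((4, 1))]) @ w))   # matches y after training
```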
@@ -4693,12 +5099,18 @@
-
+
A personalized LLM adapts its language modeling and generation to the preferences, style and persona of individual users or audiences.
Personalized LLM
- user adaptation
+ user adaptation LLM
Personalized LLM
+
+
+
+ A personalized LLM adapts its language modeling and generation to the preferences, style and persona of individual users or audiences.
+ TBD
+
@@ -4750,6 +5162,21 @@
+
+
+
+ A range of techniques and processes applied to data before it is used in machine learning models or AI algorithms
+ Preprocessing
+
+
+
+
+ A range of techniques and processes applied to data before it is used in machine learning models or AI algorithms
+ https://doi.org/10.1109/ICDE.2019.00245
+
+
+
+
@@ -4854,7 +5281,7 @@
-
+
A prompt-tuned LLM is fine-tuned on a small number of examples or prompts, rather than full task datasets. This allows for rapid adaptation to new tasks with limited data, leveraging the model's few-shot learning capabilities.
Prompt-based Fine-Tuning LLM
Prompt-tuned LLM
@@ -4862,6 +5289,12 @@
in-context learning
Prompt-based Fine-Tuning LLM
+
+
+
+ A prompt-tuned LLM is fine-tuned on a small number of examples or prompts, rather than full task datasets. This allows for rapid adaptation to new tasks with limited data, leveraging the model's few-shot learning capabilities.
+ TBD
+
@@ -4901,7 +5334,7 @@
-
+
Like recurrent neural networks (RNNs), transformers are designed to handle sequential input data, such as natural language, for tasks such as translation and text summarization. However, unlike RNNs, transformers do not necessarily process the data in order. Rather, the attention mechanism provides context for any position in the input sequence.
RBFN
RBN
@@ -5132,13 +5565,19 @@
-
+
A rational LLM incorporates explicit reasoning capabilities, leveraging logical rules, axioms or external knowledge to make deductive inferences during language tasks.
Rational LLM
logical inferences
reasoning
Rational LLM
+
+
+
+ A rational LLM incorporates explicit reasoning capabilities, leveraging logical rules, axioms or external knowledge to make deductive inferences during language tasks.
+ TBD
+
@@ -5223,18 +5662,30 @@
self-attention
Recursive LLM
+
+
+
+ A recursive or self-attending LLM incorporates recursive self-attention mechanisms, allowing it to iteratively refine its own outputs and capture long-range dependencies more effectively.
+ TBD
+
-
+
A recursive language model uses recursive neural network architectures like TreeLSTMs to learn syntactic composition functions, improving systematic generalization abilities.
- Recursive Language Model
+ RLM
Compositional generalization
Recursive Language Model
+
+
+
+ A recursive language model uses recursive neural network architectures like TreeLSTMs to learn syntactic composition functions, improving systematic generalization abilities.
+ https://doi.org/10.1609/aaai.v33i01.33017450
+
@@ -5310,7 +5761,7 @@
-
+
An RL-LLM is a language model that is fine-tuned using reinforcement learning, where the model receives rewards for generating text that satisfies certain desired properties or objectives. This can improve the quality, safety, or alignment of generated text.
RL-LLM
Reinforcement Learning LLM
@@ -5318,6 +5769,12 @@
reward modeling
Reinforcement Learning LLM
+
+
+
+ An RL-LLM is a language model that is fine-tuned using reinforcement learning, where the model receives rewards for generating text that satisfies certain desired properties or objectives. This can improve the quality, safety, or alignment of generated text.
+ TBD
+
@@ -5477,13 +5934,19 @@
-
+
A retrieval-augmented LLM combines a pre-trained language model with a retrieval system that can access external knowledge sources. This allows the model to condition its generation on relevant retrieved knowledge, improving factual accuracy and knowledge grounding.
Retrieval-Augmented LLM
knowledge grounding
open-book question answering
Retrieval-Augmented LLM
+
+
+
+ A retrieval-augmented LLM combines a pre-trained language model with a retrieval system that can access external knowledge sources. This allows the model to condition its generation on relevant retrieved knowledge, improving factual accuracy and knowledge grounding.
+ TBD
+
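
A rough sketch of retrieve-then-generate conditioning: embed the query, pull the most similar documents, and prepend them to the prompt. The document store, embeddings, and prompt template are invented placeholders for a real encoder, vector index, and LLM call.

```python
import numpy as np

# A toy document store with invented embeddings.
DOCS = {
    "Paris is the capital of France.": np.array([0.9, 0.1, 0.0]),
    "The mitochondria is the powerhouse of the cell.": np.array([0.0, 0.2, 0.9]),
    "The Eiffel Tower is in Paris.": np.array([0.8, 0.3, 0.1]),
}

def retrieve(query_vec: np.ndarray, k: int = 2):
    """Return the k documents whose embeddings are most similar to the query."""
    def score(vec):
        return float(query_vec @ vec / (np.linalg.norm(query_vec) * np.linalg.norm(vec)))
    return sorted(DOCS, key=lambda doc: score(DOCS[doc]), reverse=True)[:k]

def build_prompt(question: str, query_vec: np.ndarray) -> str:
    """Condition generation on retrieved evidence by prepending it to the prompt."""
    context = "\n".join(retrieve(query_vec))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

print(build_prompt("Where is the Eiffel Tower?", np.array([0.9, 0.2, 0.0])))
```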
@@ -5559,12 +6022,18 @@
-
+
A self-supervised LLM learns rich representations by solving pretext tasks that involve predicting parts of the input from other observed parts of the data, without relying on human-annotated labels.
Self-Supervised LLM
Pretext tasks
Self-Supervised LLM
+
+
+
+ A self-supervised LLM learns rich representations by solving pretext tasks that involve predicting parts of the input from other observed parts of the data, without relying on human-annotated labels.
+ TBD
+
@@ -5587,12 +6056,18 @@
-
+
A semi-supervised LLM combines self-supervised pretraining on unlabeled data with supervised fine-tuning on labeled task data.
Semi-Supervised LLM
self-training
Semi-Supervised LLM
+
+
+
+ A semi-supervised LLM combines self-supervised pretraining on unlabeled data with supervised fine-tuning on labeled task data.
+ TBD
+
@@ -5759,28 +6234,42 @@
-
+
-
+
Sparse autoencoders may include more (rather than fewer) hidden units than inputs, but only a small number of the hidden units are allowed to be active at the same time (thus, sparse). This constraint forces the model to respond to the unique statistical features of the training data. (https://en.wikipedia.org/wiki/Autoencoder)
SAE
+ Sparse AE
+ Sparse Autoencoder
Input, Hidden, Matched Output-Input
- Sparse AE
+ Sparse Auto Encoder
+
+
+
+ Sparse autoencoders may include more (rather than fewer) hidden units than inputs, but only a small number of the hidden units are allowed to be active at the same time (thus, sparse). This constraint forces the model to respond to the unique statistical features of the training data. (https://en.wikipedia.org/wiki/Autoencoder)
+ TBD
+
-
+
A sparse LLM uses techniques like pruning or quantization to reduce the number of non-zero parameters in the model, making it more parameter-efficient and easier to deploy on resource-constrained devices.
Sparse LLM
model compression
parameter efficiency
Sparse LLM
+
+
+
+ A sparse LLM uses techniques like pruning or quantization to reduce the number of non-zero parameters in the model, making it more parameter-efficient and easier to deploy on resource-constrained devices.
+ TBD
+
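
One simple route to the parameter sparsity mentioned here is magnitude pruning, which zeroes the smallest weights; quantization and structured sparsity are alternatives not shown. The weight matrix below is random and purely illustrative.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float = 0.8) -> np.ndarray:
    """Zero out the smallest-magnitude weights until `sparsity` of them are zero."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))
W_sparse = magnitude_prune(W, sparsity=0.8)
print("nonzero fraction:", np.count_nonzero(W_sparse) / W.size)   # roughly 0.2
```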
@@ -5942,6 +6431,12 @@
SentencePiece
Subword Segmentation
+
+
+
+ Utilizing techniques such as Byte Pair Encoding (BPE) or SentencePiece to break words down into smaller units, allowing the model to handle a wide range of words with a fixed-size vocabulary.
+ TBD
+
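
The BPE procedure named in this definition can be sketched directly: repeatedly merge the most frequent adjacent symbol pair. The toy word-frequency table is the standard illustrative example, and the naive string replacement used here is only safe for a toy this small.

```python
from collections import Counter

def get_pair_counts(vocab):
    """Count adjacent symbol pairs across the word-frequency table."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    """Replace every occurrence of the pair with a single merged symbol."""
    merged, new_symbol = " ".join(pair), "".join(pair)
    return {word.replace(merged, new_symbol): freq for word, freq in vocab.items()}

# Words pre-split into characters, with an end-of-word marker.
vocab = {"l o w </w>": 5, "l o w e r </w>": 2, "n e w e s t </w>": 6, "w i d e s t </w>": 3}

for step in range(5):
    best = max(get_pair_counts(vocab), key=get_pair_counts(vocab).get)  # most frequent pair
    vocab = merge_pair(best, vocab)
    print(f"merge {step + 1}: {best}")
print(vocab)
```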
@@ -6276,27 +6771,30 @@
Vocabulary size reduction
Tokenization And Vocabulary Reduction
-
-
-
-
-
-
-
- An abstract parent class grouping LLMs based on training paradigm.
- Training Paradigm
-
+
+
+
+ Breaking down text data into manageable pieces called tokens and reducing the model's vocabulary to streamline processing.
+ TBD
+
+
Specific strategies or methodologies employed during model training.
Instructional Methods
Learning Techniques
Training Strategies
+
+
+
+ Specific strategies or methodologies employed during model training.
+ TBD
+
@@ -6304,13 +6802,7 @@
-
Methods which can reuse or transfer information from previously learned tasks for the Learning of new tasks.
- Starting the training from a model already trained on a related task to reduce training time and improve performance on tasks with limited data.
- Knowledge Transfer
- Skill Acquisition
- Adaptation
- Pretrained models
Transfer Learning
@@ -6325,12 +6817,18 @@
-
+
A transfer learning LLM leverages knowledge acquired during training on one task to improve performance on different but related tasks, facilitating more efficient learning and adaptation.
Transfer LLM
transfer learning
Transfer Learning LLM
+
+
+
+ A transfer learning LLM leverages knowledge acquired during training on one task to improve performance on different but related tasks, facilitating more efficient learning and adaptation.
+ TBD
+
@@ -6423,12 +6921,18 @@
-
+
An unsupervised LLM is trained solely on unlabeled data using self-supervised objectives like masked language modeling, without any supervised fine-tuning.
Unsupervised LLM
self-supervised
Unsupervised LLM
+
+
+
+ An unsupervised LLM is trained solely on unlabeled data using self-supervised objectives like masked language modeling, without any supervised fine-tuning.
+ TBD
+
@@ -6555,6 +7059,12 @@
Input, Probabilistic Hidden, Matched Output-Input
Variational Auto Encoder
+
+
+
+ Variational autoencoders are meant to compress the input information into a constrained multivariate latent distribution (encoding) to reconstruct it as accurately as possible (decoding). (https://en.wikipedia.org/wiki/Variational_autoencoder)
+ TBD
+
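
The "constrained multivariate latent distribution" in this definition is usually a diagonal Gaussian regularized toward N(0, I). The sketch below shows the reparameterization trick and that KL term; the encoder outputs mu and log_var are invented numbers, and the reconstruction half of the objective is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu: np.ndarray, log_var: np.ndarray) -> np.ndarray:
    """Sample z ~ N(mu, sigma^2) so gradients can flow through mu and log_var."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu: np.ndarray, log_var: np.ndarray) -> float:
    """KL divergence between the diagonal Gaussian q(z|x) and the N(0, I) prior.

    This is the term that constrains the latent distribution; the other half
    of the VAE objective is a reconstruction loss, not shown here.
    """
    return float(-0.5 * np.sum(1 + log_var - mu ** 2 - np.exp(log_var)))

mu = np.array([0.5, -0.3])          # pretend encoder outputs for one input
log_var = np.array([-1.0, 0.2])
z = reparameterize(mu, log_var)     # latent code fed to the decoder
print(z, kl_to_standard_normal(mu, log_var))
```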
@@ -6577,12 +7087,18 @@
-
+
A zero-shot learning LLM is able to perform tasks or understand concepts it has not explicitly been trained on, demonstrating a high degree of generalization and understanding.
Zero-Shot LLM
zero-shot learning
Zero-Shot Learning LLM
+
+
+
+ A zero-shot learning LLM is able to perform tasks or understand concepts it has not explicitly been trained on, demonstrating a high degree of generalization and understanding.
+ TBD
+
diff --git a/src/ontology/imports/swo_import.owl b/src/ontology/imports/swo_import.owl
index 825f97a..6e0232a 100644
--- a/src/ontology/imports/swo_import.owl
+++ b/src/ontology/imports/swo_import.owl
@@ -7,9 +7,9 @@ Prefix(rdfs:=)
Ontology(
-
+
Annotation( )
-Annotation(owl:versionInfo "2024-03-28")
+Annotation(owl:versionInfo "2024-03-29")
Declaration(Class())
Declaration(Class())
@@ -1259,7 +1259,7 @@ AnnotationAssertion(rdfs:label "has
SubObjectPropertyOf( )
ObjectPropertyRange( )
-# Object Property: (has role)
+# Object Property: (has_role)
AnnotationAssertion( "this person has role this investigator role (more colloquially: this person has this role of investigator)"@en)
AnnotationAssertion( "a relation between an independent continuant (the bearer) and a role, in which the role specifically depends on the bearer for its existence"@en)