- Entities are embedded in a continuous low-dimensional vector space, while relations are embedded in a different vector space.
- Whether a triplet (head entity, relation, tail entity) can be considered a fact is judged by a distance-based similarity.
- The entity vectors are first projected into a vector space determined by both the entity and the relation, and the squared L2 norm is then used as the similarity measure ($\textbf{I}$ is the identity matrix, and the subscript $p$ marks projection vectors):
$$ f(\textbf{h},\textbf{r},\textbf{t}) = \| (\textbf{r}_{p}\textbf{h}_{p}^{\top}+\textbf{I})\textbf{h}+\textbf{r}-(\textbf{r}_{p}\textbf{t}_{p}^{\top}+\textbf{I})\textbf{t} \|_{2}^{2} $$
- Negative samples for training are constructed by corrupting the fact triples.
- A loss is computed from the scores of the fact triples and their negative samples, and the model is finally updated by backpropagation (BP).
- Moreover, the entity embedding vectors and the relation embedding vectors can have different dimensions.
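
The steps above can be sketched in NumPy as follows. This is a minimal illustration and not the OpenHGNN implementation: the function names (`project`, `score`, `corrupt`, `margin_loss`) are hypothetical, the rectangular identity `np.eye(m, n)` is an assumption that lets the entity dimension `n` differ from the relation dimension `m`, and the margin-based ranking loss with `margin=1.0` is an assumed, commonly used choice for translation-based models.

```python
import numpy as np

def project(e, e_p, r_p):
    """Map an entity vector e (dim n) to the relation space (dim m)
    with the mapping matrix (r_p e_p^T + I)."""
    m, n = r_p.shape[0], e.shape[0]
    mapping = np.outer(r_p, e_p) + np.eye(m, n)  # assumption: rectangular identity
    return mapping @ e

def score(h, h_p, r, r_p, t, t_p):
    """Squared L2 distance f(h, r, t); lower means more plausible."""
    diff = project(h, h_p, r_p) + r - project(t, t_p, r_p)
    return float(np.sum(diff ** 2))

def corrupt(triple, num_entities, rng):
    """Build a negative sample by replacing the head or the tail
    with a different, randomly drawn entity id (needs num_entities >= 2)."""
    h, r, t = triple
    replace_head = rng.random() < 0.5
    while True:
        e = int(rng.integers(num_entities))
        if replace_head and e != h:
            return (e, r, t)
        if not replace_head and e != t:
            return (h, r, e)

def margin_loss(pos_score, neg_score, margin=1.0):
    """Assumed margin-based ranking loss: max(0, margin + f(pos) - f(neg))."""
    return max(0.0, margin + pos_score - neg_score)
```

Under these assumptions the loss becomes zero once a fact triple scores at least `margin` lower than its corrupted counterpart, so only violated pairs contribute gradients during backpropagation.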
- Clone the Openhgnn-DGL repository.
- For the link prediction task, run `python main.py -m TransD -t link_prediction -d FB15k -g 0 --use_best_config`.
- If you do not have a GPU, set `-g -1`.
- FB15k

Number of entities and relations

entities | relations |
---|---|
14,951 | 1,345 |

Size of dataset

set type | size |
---|---|
train set | 483,142 |
validation set | 50,000 |
test set | 59,071 |

- WN18

Number of entities and relations

entities | relations |
---|---|
40,493 | 18 |

Size of dataset

set type | size |
---|---|
train set | 141,442 |
validation set | 5,000 |
test set | 5,000 |

Evaluation metric: mrr
Dataset | Mean Rank | Hits@10 |
---|---|---|
FB15k(raw) | 187 | 50.0 |
FB15k(filt.) | 67 | 67.3 |
WN18(raw) | 464 | 56.1 |
WN18(filt.) | 212 | 60.3 |
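
For reference, the ranking metrics reported above can be computed as in the sketch below. It assumes that a lower score means a more plausible triple and that Hits@10 is reported as a percentage; the helper names `rank_of_true` and `ranking_metrics` are hypothetical and not part of OpenHGNN. The `known_idx` argument illustrates the filtered (filt.) setting, in which other candidates already known to form true triples are skipped before ranking.

```python
import numpy as np

def rank_of_true(scores, true_idx, known_idx=()):
    """Rank (1 = best) of the true entity among all candidates.
    scores[i] is the distance score of candidate entity i (lower is better)."""
    scores = np.asarray(scores, dtype=float).copy()
    for i in known_idx:
        if i != true_idx:
            scores[i] = np.inf  # filtered setting: ignore other known-true candidates
    return int(np.sum(scores < scores[true_idx])) + 1

def ranking_metrics(ranks, k=10):
    """Aggregate per-triple ranks into Mean Rank, MRR and Hits@k (in percent)."""
    ranks = np.asarray(ranks, dtype=float)
    return {
        "mean_rank": float(ranks.mean()),
        "mrr": float((1.0 / ranks).mean()),
        f"hits@{k}": float((ranks <= k).mean() * 100.0),
    }
```

Calling `rank_of_true` without `known_idx` corresponds to the raw setting, so for the same scores the filtered rank can never be worse than the raw rank.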
TrainerFlow: TransX flow
You can modify the parameters[TransE] in openhgnn/config.ini
Xiaoke Yang
Submit an issue or email to x.k.yang@qq.com.