Node centrality Lifting (Graph to Hypergraph) #46
…yperedges in incidence matrix results in too many added nodes and hence an out of index error. This error does not happen in the given KNN example, because the number of hyperedges equals the number of nodes and hence doubles the number of nodes anyway. It occurs, however, if the number of hyperedges is less than the number of nodes
…network backbone(s) as hyperedges
Hello @mbanf! Thank you for your submission. As we near the end of the challenge, I am collecting participant info for the purpose of selecting and announcing winners. Please email me (or have one member of your team email me) at guillermo_bernardez@ucsb.edu so I can share access to the voting form. In your email, please include:
Before July 12, make sure that your submission respects all Submission Requirements laid out on the challenge page. Any submission that fails to meet these criteria will be automatically disqualified.
Motivation
This is a novel lifting that creates hyperedges from central, i.e. highly influential, nodes in the network. Each node is connected to the specific nodes of the network that exert a distinct and potentially competing influence on it, a relationship that is naturally modelled by hyperedges. Using shortest path distance to identify the most influential nodes for any given node also allows each node's connection to a hyperedge to be weighted (by the inverse shortest path distance to the influential node that the hyperedge represents) in order to model influence decay across the network.
To define and identify influential nodes in the network, we use PageRank, i.e. the variant of Eigenvector Centrality that adds random teleportations to the transitions of the underlying Markov chain.
Background
Eigenvector Centrality is an algorithm that measures the transitive influence of nodes. Relationships originating from high-scoring nodes contribute more to the score of a node than connections from low-scoring nodes. A high eigenvector score means that a node is connected to many nodes that themselves have high scores.
The algorithm computes the eigenvector associated with the largest absolute eigenvalue. To compute that eigenvector, the algorithm applies the power iteration approach. Within each iteration, the centrality score for each node is derived from the scores of its incoming neighbors. In the power iteration method, the eigenvector is L2-normalized after each iteration, leading to normalized results by default.
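The power iteration described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code of the submission: the function name `eigenvector_centrality`, its parameters and the toy graph are all assumptions made for the example, and the graph is chosen to be connected and non-bipartite so that the plain iteration converges.

```python
import math


def eigenvector_centrality(in_neighbors, max_iter=200, tol=1e-12):
    """Power iteration sketch; in_neighbors maps node -> list of incoming neighbors."""
    nodes = list(in_neighbors)
    # Start from a uniform, L2-normalized score vector.
    score = {v: 1.0 / math.sqrt(len(nodes)) for v in nodes}
    for _ in range(max_iter):
        # Each node's new score is the sum of its incoming neighbors' scores.
        new = {v: sum(score[u] for u in in_neighbors[v]) for v in nodes}
        norm = math.sqrt(sum(s * s for s in new.values()))
        if norm == 0.0:  # graph without edges: nothing to iterate on
            return score
        # L2-normalize after every iteration.
        new = {v: s / norm for v, s in new.items()}
        delta = max(abs(new[v] - score[v]) for v in nodes)
        score = new
        if delta < tol:
            break
    return score


# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 0.
toy_graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
scores = eigenvector_centrality(toy_graph)
most_central = max(scores, key=scores.get)
```

In this toy graph node 0 has the most neighbors, two of which are themselves well connected, so it ends up with the highest score.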
The PageRank [1] variant of Eigenvector Centrality adds a teleportation step to every iteration of the power method, controlled by a damping factor $\alpha$: with probability $\alpha$ the random walk continues to follow the transition matrix, and with probability $1 - \alpha$ it teleports to a random node. These random teleportations have been shown to be an effective way to ensure that the transition matrix and its Markov chain are ergodic, which makes them easier to analyze and guarantees convergence.
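As a rough illustration of how the teleportation term enters the power iteration, here is a minimal pure-Python PageRank sketch. It assumes the common formulation in which the walk follows an outgoing edge with probability `alpha` and jumps to a uniformly random node otherwise, and in which nodes without outgoing edges spread their mass uniformly. The function name, the default `alpha=0.85` and the toy graph are assumptions for the example, not taken from the submission.

```python
def pagerank(out_neighbors, alpha=0.85, max_iter=200, tol=1e-10):
    """PageRank sketch; out_neighbors maps node -> list of nodes it links to."""
    nodes = list(out_neighbors)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Mass sitting on dangling nodes (no outgoing edges) is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_neighbors[v])
        # Teleportation term (1 - alpha) / n plus the redistributed dangling mass.
        new = {v: (1.0 - alpha) / n + alpha * dangling / n for v in nodes}
        # Follow outgoing edges with probability alpha.
        for u in nodes:
            targets = out_neighbors[u]
            for v in targets:
                new[v] += alpha * rank[u] / len(targets)
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical directed toy graph: a -> b, a -> c, b -> c, c -> a.
toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(toy_links)
```

Because every node receives at least `(1 - alpha) / n` in each step, the scores stay strictly positive and sum to one, which is the practical effect of the ergodicity argument above.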
Method
Our approach is applicable to both directed and undirected as well as weighted and unweighted networks. It works as follows:
(1) Calculate the node centrality of all nodes in the graph.
(2) Select the top $n$ most influential nodes in the graph as hyperedges based on a given quantile.
(3) Assign all nodes in the network to $m \geq 1$ most influential nodes (with $m \leq n$), i.e. their respective hyperedges, based on their shortest path distance $d$ to each influential node.
(4, optional) Model individual connection weights per node to a hyperedge via the inverse shortest path distance (i.e. $1/d$) to the hyperedge's corresponding most influential node.
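Steps (2) to (4) can be sketched as follows, assuming the centrality scores of step (1) are already available as a dictionary. This is a simplified illustration for undirected, unweighted graphs and not the implementation of the pull request. The names `centrality_lifting` and `bfs_distances`, the reading of "quantile" as "keep the top `1 - quantile` share of nodes", the weight of 1 for an influential node in its own hyperedge (where $d = 0$), and the toy scores are all assumptions.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """Unweighted shortest path distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def centrality_lifting(adj, centrality, quantile=0.7, m=1, weighted=True):
    """Return (hubs, incidence), with incidence[node][hub] = membership weight."""
    nodes = list(adj)
    # Step (2): keep the top (1 - quantile) share of nodes as hubs (assumed reading).
    n_hubs = max(1, math.ceil((1.0 - quantile) * len(nodes)))
    hubs = sorted(nodes, key=lambda v: centrality[v], reverse=True)[:n_hubs]
    dist = {h: bfs_distances(adj, h) for h in hubs}
    incidence = {v: {} for v in nodes}
    for v in nodes:
        # Step (3): assign v to its m nearest reachable hubs.
        reachable = [h for h in hubs if v in dist[h]]
        for h in sorted(reachable, key=lambda h: dist[h][v])[:m]:
            d = dist[h][v]
            # Step (4, optional): weight 1/d; a hub gets weight 1 in its own hyperedge.
            incidence[v][h] = 1.0 / d if (weighted and d > 0) else 1.0
    return hubs, incidence


# Hypothetical toy graph: two stars with centers A and B, joined by the edge A-B.
toy_adj = {
    "A": ["a1", "a2", "B"], "a1": ["A"], "a2": ["A"],
    "B": ["b1", "b2", "A"], "b1": ["B"], "b2": ["B"],
}
# Illustrative scores only; in practice these would come from step (1).
toy_scores = {"A": 0.30, "B": 0.28, "a1": 0.105, "a2": 0.105, "b1": 0.105, "b2": 0.105}

hubs, incidence_m1 = centrality_lifting(toy_adj, toy_scores, quantile=0.7, m=1)
_, incidence_m2 = centrality_lifting(toy_adj, toy_scores, quantile=0.7, m=2)
```

With `m=1` every leaf belongs only to the hyperedge of its own star center. With `m=2` each leaf also joins the hyperedge of the other center at distance 2, with weight 0.5.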
Remarks on the proposed additional feature lifting for influential nodes
Note that the algorithm can support the utilization of the ProjectionSum feature lifting to model the inverse relationship between all nodes towards their shared most influential node.
In order to model the direct influence of the influential node on all individual nodes via the hyperedge, we have, however, further implemented an (optional) straightforward feature lifting via assignment of the hyperedge's corresponding node's features, thereby bypassing the ProjectionSum feature lifting. In combination with the individual connection weights per node to a hyperedge via the inverse shortest path distance (i.e. $1/d$) to the hyperedge's corresponding most influential node, this direct assignment of the influential node's features to the hyperedges lends itself ideally to modelling the decaying influence (i.e. $1/d$ per node) of the influential node on all nodes assigned to the hyperedge across the network.
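The difference between the two feature liftings can be illustrated with a small sketch. It assumes that a projection-sum style lifting builds each hyperedge feature as the incidence-weighted sum of its member nodes' features, whereas the direct variant copies the features of the influential node that the hyperedge represents. The function names, the dictionary layout and the toy values are hypothetical and do not reflect the actual ProjectionSum class or the submitted code.

```python
def projection_sum(incidence, features, hubs):
    """Hyperedge feature = incidence-weighted sum of member node features (assumed form)."""
    dim = len(next(iter(features.values())))
    out = {h: [0.0] * dim for h in hubs}
    for node, memberships in incidence.items():
        for hub, weight in memberships.items():
            for k in range(dim):
                out[hub][k] += weight * features[node][k]
    return out


def hub_feature_assignment(features, hubs):
    """Hyperedge feature = features of the influential node the hyperedge represents."""
    return {h: list(features[h]) for h in hubs}


# Hypothetical toy data: two hyperedges (hubs A and B) and four nodes.
toy_hubs = ["A", "B"]
toy_incidence = {
    "A": {"A": 1.0},
    "a1": {"A": 1.0},
    "B": {"B": 1.0},
    "b1": {"B": 1.0, "A": 0.5},  # b1 also belongs to A's hyperedge with weight 1/d = 0.5
}
toy_features = {"A": [1.0, 0.0], "a1": [0.0, 2.0], "B": [3.0, 0.0], "b1": [0.0, 4.0]}

summed = projection_sum(toy_incidence, toy_features, toy_hubs)
direct = hub_feature_assignment(toy_features, toy_hubs)
```

In the summed variant the hyperedge feature mixes in the features of every member node. In the direct variant the hyperedge carries only the influential node's own features, so the $1/d$ incidence weights alone determine how strongly that influence reaches each member.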
Submission by Team PerelynAI
Max Schattauer (@max-perelyn), Liliya Imasheva (@liliya-imasheva), Dominik Filipiak (@DominikFilipiak), Michael Banf (@mbanf)