PageRank Algorithm. #1644

Merged
merged 1 commit into from
Nov 4, 2024
98 changes: 98 additions & 0 deletions Graph Algorithms/PageRank Algorithm/Program.c
@@ -0,0 +1,98 @@
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define MAX_NODES 100
#define DAMPING_FACTOR 0.85
#define CONVERGENCE_THRESHOLD 0.0001

// Structure to store graph edges
typedef struct {
    int src;
    int dest;
} Edge;

// Initialize adjacency matrix and out-degree array
void initializeGraph(int adjMatrix[MAX_NODES][MAX_NODES], int outDegree[MAX_NODES], int numNodes) {
    for (int i = 0; i < numNodes; i++) {
        for (int j = 0; j < numNodes; j++) {
            adjMatrix[i][j] = 0;
        }
        outDegree[i] = 0;
    }
}

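// Add an edge to the graph; a repeated edge is ignored so that
// outDegree stays consistent with the 0/1 adjacency matrix
void addEdge(int adjMatrix[MAX_NODES][MAX_NODES], int outDegree[MAX_NODES], Edge edge) {
    if (adjMatrix[edge.src][edge.dest] == 1) {
        return;
    }
    adjMatrix[edge.src][edge.dest] = 1;
    outDegree[edge.src]++;
}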

// PageRank calculation
void calculatePageRank(int adjMatrix[MAX_NODES][MAX_NODES], int outDegree[MAX_NODES], int numNodes, double pageRank[MAX_NODES]) {
    double tempRank[MAX_NODES];
    int converged;

    // Initialize PageRank values
    for (int i = 0; i < numNodes; i++) {
        pageRank[i] = 1.0 / numNodes;
    }

    do {
        converged = 1;

        // Copy current page ranks to temporary array
        for (int i = 0; i < numNodes; i++) {
            tempRank[i] = pageRank[i];
        }

        // Update page ranks for each node
        for (int i = 0; i < numNodes; i++) {
            double incomingRankSum = 0.0;

            for (int j = 0; j < numNodes; j++) {
                if (adjMatrix[j][i] == 1) {
                    incomingRankSum += tempRank[j] / outDegree[j];
                }
            }

            pageRank[i] = (1 - DAMPING_FACTOR) / numNodes + DAMPING_FACTOR * incomingRankSum;

            if (fabs(pageRank[i] - tempRank[i]) > CONVERGENCE_THRESHOLD) {
                converged = 0;
            }
        }
    } while (!converged);
}

int main() {
    int numNodes = 4; // Example number of nodes
    int numEdges = 4; // Example number of edges
    Edge edges[] = {
        {0, 1},
        {0, 2},
        {1, 2},
        {2, 0}
    };

    int adjMatrix[MAX_NODES][MAX_NODES];
    int outDegree[MAX_NODES];
    double pageRank[MAX_NODES];

    // Initialize graph and add edges
    initializeGraph(adjMatrix, outDegree, numNodes);
    for (int i = 0; i < numEdges; i++) {
        addEdge(adjMatrix, outDegree, edges[i]);
    }

    // Calculate PageRank
    calculatePageRank(adjMatrix, outDegree, numNodes, pageRank);

    // Print PageRank values
    printf("PageRank values:\n");
    for (int i = 0; i < numNodes; i++) {
        printf("Node %d: %.6f\n", i, pageRank[i]);
    }

    return 0;
}
75 changes: 75 additions & 0 deletions Graph Algorithms/PageRank Algorithm/Readme.md
@@ -0,0 +1,75 @@
# PageRank Algorithm


# Description

This project implements the PageRank algorithm, originally developed by Larry Page and Sergey Brin, in C. PageRank ranks web pages by importance, as determined by the link structure of the web. The premise is that more important pages tend to receive more links from other pages, so a page's score can be used to order search results or to measure influence in a social network.

# Key Features

>Damping Factor: The probability that a user keeps following links rather than jumping to a random page. A commonly used value is 0.85, which balances the influence of direct links against random jumps.

>Convergence: The algorithm updates page ranks iteratively until they stabilize, that is, until no score changes by more than a threshold (0.0001 in Program.c) between two iterations.

>Scalability: PageRank itself is applied to large graphs, such as web page networks and social graphs. This implementation stores the graph in a fixed-size adjacency matrix (MAX_NODES is 100), so it is intended for small examples.

>Versatile Applications: Beyond web page ranking, PageRank is used in social network analysis to gauge user influence based on their connections.


# Problem Definition

Given a directed graph where nodes represent web pages and edges represent hyperlinks, the goal is to assign each node a PageRank score based on the scores of pages linking to it and the number of outgoing links from those pages.

>The PageRank score $PR(P)$ of a page $P$ is calculated as follows:

$$PR(P) = \frac{1 - d}{N} + d \sum_{Q \in M(P)} \frac{PR(Q)}{L(Q)}$$

Where:

>PR(P): PageRank score of page P

>d: Damping factor (typically 0.85)

>N: Total number of pages

>M(P): Set of pages linking to P

>L(Q): Number of outbound links from page Q

The algorithm iteratively updates each page’s rank until the change between iterations falls below a set threshold, indicating convergence.
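As a worked illustration, take the example graph from Program.c (edges 0→1, 0→2, 1→2, 2→0, with node 3 isolated), d = 0.85, N = 4, and every score starting at 1/4. Applying the formula once to each node gives:

$$
\begin{aligned}
PR(0) &= \frac{1 - 0.85}{4} + 0.85 \cdot \frac{PR(2)}{L(2)} = 0.0375 + 0.85 \cdot \frac{0.25}{1} = 0.25 \\
PR(1) &= 0.0375 + 0.85 \cdot \frac{PR(0)}{L(0)} = 0.0375 + 0.85 \cdot \frac{0.25}{2} = 0.14375 \\
PR(2) &= 0.0375 + 0.85 \left( \frac{PR(0)}{L(0)} + \frac{PR(1)}{L(1)} \right) = 0.0375 + 0.85 \left( \frac{0.25}{2} + \frac{0.25}{1} \right) = 0.35625 \\
PR(3) &= 0.0375
\end{aligned}
$$

These four values sum to 0.7875 rather than 1, because node 3 has no outgoing links and the formula as stated does not pass its share on to any other page.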

# Algorithm Review

>Iterative Update Process

The PageRank algorithm is implemented as an iterative process, following these steps:

1. Initialization: Assign an initial PageRank score to each page, usually 1/N where N is the total number of pages.

2. Iterative Update: For each page P, update its score based on the scores of pages linking to it, adjusted by the damping factor.

3. Convergence Check: Repeat the update process until the change in PageRank scores between iterations is less than a pre-set convergence threshold.

4. Normalization: Normalize scores if required to ensure consistency in ranking scale.
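Program.c does not include a normalization step. When some pages have no outgoing links, the scores can sum to less than 1, so step 4 may be worth applying after convergence. A minimal sketch of one way to do it follows; `normalizeRanks` is a hypothetical helper, not a function from Program.c, and it simply rescales the scores so they sum to 1.

```c
#include <assert.h>

/* Hypothetical helper (not part of Program.c): rescale the scores in
 * place so that they sum to 1. If the total is not positive, the
 * array is left unchanged. */
static void normalizeRanks(double ranks[], int numNodes) {
    assert(numNodes > 0);

    double total = 0.0;
    for (int i = 0; i < numNodes; i++) {
        total += ranks[i];
    }
    if (total <= 0.0) {
        return; /* nothing sensible to rescale */
    }
    for (int i = 0; i < numNodes; i++) {
        ranks[i] /= total;
    }
}
```

It could be called as `normalizeRanks(pageRank, numNodes);` right after `calculatePageRank` returns. Rescaling keeps the relative order of the pages unchanged; it only fixes the scale.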

# Time Complexity

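The time complexity of each PageRank iteration is approximately:
>O(N + E) when the graph is stored as an edge list or adjacency list, where:

1. N is the number of nodes and E is the number of edges in the graph.

2. Program.c stores the graph in an adjacency matrix and checks every pair of nodes, so each of its iterations takes O(N²) regardless of the number of edges.

3. The number of iterations required for convergence depends on the graph's structure, the damping factor, and the convergence threshold.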
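A cost proportional to the number of edges requires iterating over the edges directly instead of scanning an adjacency matrix. The sketch below shows one update sweep over an edge list, which touches each node once and each edge once. `pageRankStepEdgeList` and its parameters are hypothetical names, not part of Program.c, and the sketch assumes `outDegree` was counted from the same edge list.

```c
#include <assert.h>

/* Same shape as the Edge struct in Program.c. */
typedef struct {
    int src;
    int dest;
} Edge;

/* Hypothetical helper: one PageRank sweep over an edge list.
 * Reads oldRank and writes newRank. Assumes outDegree[s] counts the
 * edges in `edges` whose source is s, so it is at least 1 for every
 * source that appears in the list. */
static void pageRankStepEdgeList(const Edge *edges, int numEdges,
                                 const int *outDegree, int numNodes,
                                 double damping,
                                 const double *oldRank, double *newRank) {
    assert(numNodes > 0);

    /* Every node starts with the random-jump share (1 - d) / N. */
    for (int i = 0; i < numNodes; i++) {
        newRank[i] = (1.0 - damping) / numNodes;
    }

    /* Each edge passes a share of its source's old score to its target. */
    for (int e = 0; e < numEdges; e++) {
        int s = edges[e].src;
        int t = edges[e].dest;
        newRank[t] += damping * oldRank[s] / outDegree[s];
    }
}
```

A full implementation would call this in a loop, swapping the two rank arrays and stopping once no score changes by more than the convergence threshold.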

# Applications

The PageRank algorithm has broad applications across multiple domains, including:

1. Web Search Engines: Ranks web pages to display more relevant results based on importance.

2. Social Network Analysis: Determines influence in networks, such as identifying central users in social media.

3. Recommendation Systems: Used to rank items or content based on link structure, enhancing user experience.

4. Biological Networks: Assists in analyzing protein-protein interaction networks to identify key proteins.

# Conclusion

The PageRank algorithm provides a robust approach for ranking nodes in a directed graph, gauging the importance of each node from the link structure. With applications spanning search engines, social networks, and beyond, this implementation serves as a practical resource for those exploring graph theory and influence-based algorithms. The adjustable damping factor and convergence threshold make it easy to experiment with; handling large-scale graphs would require replacing the fixed-size adjacency matrix with a sparse representation.