A prototype search engine written in Go, featuring a web crawler and full-text search exposed through both a web UI and a JSON API.
Live app: https://gosearch.up.railway.app/
- Web crawler
- Custom full text search engine
- Cronjobs
- Admin UI to configure the web crawler
- JSON API with JWT auth and rate limiting
- Go framework: Fiber
- Database: PostgreSQL
- ORM: Gorm
- Auth: JWT tokens stored in cookies
- UI framework: Templ, HTMX, DaisyUI, TailwindCSS
- Load testing: Grafana k6
- Continuous integration (CI): GitHub Actions
- Deployment: Docker, Air, Railway
To try out the API, I recommend downloading and using Postman.
- Base URL (prod): https://gosearch.up.railway.app
- Base URL (localhost): http://localhost:3000

Note: to get the full URL, append the target API endpoint to the base URL (e.g. http://localhost:3000/api/v1/search).
The search endpoint (/api/v1/search) allows users to search for indexed web pages that contain the query terms.
JSON request body:
{
  "query": "sebastian nunez"
}
JSON response:
{
  "results": [
    {
      "id": "b1a34930-b2a3-4f1f-9454-10e926754d55",
      "url": "https://www.sebastian-nunez.com/",
      "indexed": true,
      "success": true,
      "crawlDuration": 89205,
      "statusCode": 200,
      "lastTested": "2024-12-30T01:30:40.975935-05:00",
      "title": "Sebastian Nunez",
      "description": "Sebastian Nunez - Software Engineer Intern at Google | ex. UKG, JPMorgan Chase & Co. | Google Tech Exchange Scholar | Apple Pathways Alliance | Computer Science student at Florida International University",
      "headings": "Hey, I'm Sebastian!",
      "createdAt": "2024-12-30T01:31:34.671422-05:00",
      "updatedAt": "2024-12-30T01:31:34.671422-05:00",
      "deletedAt": null
    }
    // More results...
  ],
  "total": 1
}
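For programmatic access, here is a minimal Go client sketch. It assumes the endpoint accepts a POST request with the JSON body shown above; since the API uses JWT auth and rate limiting, you may also need to attach a valid token cookie first.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// SearchResult mirrors a subset of the fields in the JSON response above.
type SearchResult struct {
	ID    string `json:"id"`
	URL   string `json:"url"`
	Title string `json:"title"`
}

type SearchResponse struct {
	Results []SearchResult `json:"results"`
	Total   int            `json:"total"`
}

func main() {
	// Build the request body shown in the example above.
	body, err := json.Marshal(map[string]string{"query": "sebastian nunez"})
	if err != nil {
		log.Fatal(err)
	}

	resp, err := http.Post("http://localhost:3000/api/v1/search", "application/json", bytes.NewReader(body))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		log.Fatalf("unexpected status: %s", resp.Status)
	}

	var sr SearchResponse
	if err := json.NewDecoder(resp.Body).Decode(&sr); err != nil {
		log.Fatal(err)
	}

	fmt.Printf("%d result(s)\n", sr.Total)
	for _, r := range sr.Results {
		fmt.Printf("- %s (%s)\n", r.Title, r.URL)
	}
}
```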
Access the production deployment of the search engine: https://gosearch.up.railway.app/
When running the app locally, the main caveat is seeding the initial data into the database. For now, these are the required actions (SQL queries are provided below for convenience):
- Insert admin credentials to login
- Insert the initial crawler settings
- Insert the first URL(s) for the crawler to start exploring
Here are the installation steps:

- Clone the repo: git clone https://github.com/sebastian-nunez/golang-search-engine
- Install Go 1.23 or greater
- Install Docker Desktop
- Install Air: go install github.com/air-verse/air@latest
- Run Docker Compose to start the database: docker compose up
- Insert your admin credentials into the database (see the Login credentials section below)
- Create the initial crawler settings (see the Initial crawler settings section below)
- Seed the initial URL(s) into the database for the crawler to begin exploring (see the Seeding URL(s) for the crawler section below)
- Run the app: air
- Open the crawler settings dashboard: http://localhost:3000/dashboard
- Check out the API reference section
For testing purposes, admin credentials can be inserted into the database. You can use the SQL query below:
INSERT INTO users (id, email, password, is_admin, created_at, updated_at)
VALUES (uuid_generate_v4(), '<your_email>', '<your_password_hash>', true, NOW(), NOW());
Note: password is your hashed password, so use bcrypt to hash your plain-text password (cost factor = 10).
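If you need to generate the hash, here is a minimal Go sketch using golang.org/x/crypto/bcrypt with cost factor 10 (the placeholder password is yours to swap in):

```go
package main

import (
	"fmt"
	"log"

	"golang.org/x/crypto/bcrypt"
)

func main() {
	// GenerateFromPassword hashes the plain-text password with the given cost.
	hash, err := bcrypt.GenerateFromPassword([]byte("<your_password>"), 10)
	if err != nil {
		log.Fatal(err)
	}
	// Paste this output into <your_password_hash> in the SQL query above.
	fmt.Println(string(hash))
}
```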
The crawler must be configured before starting up the program for the first time. This is the SQL query:
INSERT INTO crawler_settings (id, urls_per_hour, search_on, add_new_urls, updated_at)
VALUES (1, 25, true, true, NOW());
With a fresh database, the crawler will not have any websites to visit, so we must give it one or more starting points. For now, the only way is through an insert query into PostgreSQL:
INSERT INTO crawled_pages (id, url)
VALUES (uuid_generate_v4(), '<your_url>');
Quick tip: sites like https://news.yahoo.com/ are good initial seeds since they contain a lot of external links to other pages.