This project was born to help a friend of mine who was struggling to find a job. The idea was to experiment a little with Puppeteer, to have fun, learn something new, and... sure, maybe scrape a few job portals to help her.
It consists of two parts:
- a backend REST API written in Nest (this repo)
- a frontend client application, natjob-webclient, written with Vue 3.0 and tailwindcss
App demo: http://natjob.matteopiazza.wtf/#/
$ npm install
# or
$ yarn install
Each part (backend and frontend) behaves as an independent application and needs to be configured and hosted separately.
CORS is enabled in main.ts; set the environment variable WEBCLIENT_ORIGIN to the origin of the web client:
// file: /src/main.ts
app.enableCors({
origin: process.env.WEBCLIENT_ORIGIN
});
You can use a .env file, as in the repo's example, or configure your environment variable as you prefer.
# development
$ npm run start
# watch mode
$ npm run start:dev
# production mode
$ npm run start:prod
Navigate to {{your domain}}/jobs to test the API and check the returned JSON.
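The same check can be scripted instead of done in the browser. The following is a minimal sketch, not code from this repo: `jobsUrl`, `fetchJobs` and the `FetchLike` type are hypothetical names, and `http://localhost:3000` is only an assumed local address. The response body is left as `unknown` because its exact shape is not assumed here.

```typescript
// Minimal shape of the fetch function this sketch relies on, declared locally
// so the snippet does not depend on DOM typings. (Hypothetical helper type.)
type FetchLike = (
  url: string,
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Builds the jobs endpoint URL from a base URL, tolerating trailing slashes.
function jobsUrl(baseUrl: string): string {
  return `${baseUrl.replace(/\/+$/, '')}/jobs`;
}

// Requests the jobs endpoint and returns the parsed JSON body.
// Rejects when the server answers with a non-2xx status.
async function fetchJobs(baseUrl: string, fetchImpl: FetchLike): Promise<unknown> {
  const response = await fetchImpl(jobsUrl(baseUrl));
  if (!response.ok) {
    throw new Error(`GET /jobs failed with status ${response.status}`);
  }
  return response.json();
}

// Usage, assuming a runtime with a global fetch (a browser or Node 18+)
// and an API running at the assumed address:
// fetchJobs('http://localhost:3000', fetch).then(console.log);
```

Passing the fetch implementation as a parameter keeps the helper easy to exercise with a stub, without a running server.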
The main controller, jobs, is located under /src/jobs.
Each scraped website has its own nuances (a different HTML structure, or a different AJAX strategy for searching and listing job ads), so the /src/jobs/services folder helps organize the code for each website.
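One way to picture this organization is a shared contract that every site-specific service implements. This is only a sketch under assumptions: `SiteService`, `ExampleBoardService`, `OtherBoardService` and `scrapeAll` are hypothetical names, not the repo's actual classes, and the stand-in services return fixed data where real ones would drive Puppeteer.

```typescript
// Hypothetical contract: every site-specific service exposes the same method,
// while selectors and AJAX handling stay private to each service.
interface SiteService<T> {
  readonly name: string; // website name
  readonly url: string; // base website url
  scrape(): Promise<T[]>;
}

// Stand-in services returning fixed titles; real ones would scrape each site.
class ExampleBoardService implements SiteService<string> {
  readonly name = 'example-board';
  readonly url = 'https://example.com';
  async scrape(): Promise<string[]> {
    return ['Frontend developer', 'Backend developer'];
  }
}

class OtherBoardService implements SiteService<string> {
  readonly name = 'other-board';
  readonly url = 'https://example.org';
  async scrape(): Promise<string[]> {
    return ['Data analyst'];
  }
}

// The caller depends only on the shared contract, not on per-site details.
// Services run in parallel and results are keyed by website name.
async function scrapeAll<T>(services: SiteService<T>[]): Promise<Map<string, T[]>> {
  const entries = await Promise.all(
    services.map(async (s): Promise<[string, T[]]> => [s.name, await s.scrape()]),
  );
  return new Map(entries);
}
```

With this shape, supporting a new website would mean adding one more service to the list, leaving the caller unchanged.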
Each job is defined through a TypeScript interface:
export interface Job {
title: string, // job title, as exposed by the original source
location: string, // location of the offered job
publicationDate: string, // online since
url: string, // original job link from website
originalSource: string, // job original source
originalSourceJobsUrl: string, // original source jobs list page
description: string // job extended description
}
Jobs are then collected and grouped in the results property of the JobsSource interface, which is used to normalize the different outputs from different pages and send a common JSON to the client app:
import { Job } from './Job.interface';
export interface JobsSource {
name: string, // website name
url: string, // base website url
results: Job[],
error?: string
}
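The optional error field suggests that a failing website can still be reported to the client without breaking the whole response. The sketch below illustrates that idea under assumptions: `toJobsSource` is a hypothetical helper, not a function from this repo, and the two interfaces are repeated only so the snippet is self-contained.

```typescript
interface Job {
  title: string;
  location: string;
  publicationDate: string;
  url: string;
  originalSource: string;
  originalSourceJobsUrl: string;
  description: string;
}

interface JobsSource {
  name: string;
  url: string;
  results: Job[];
  error?: string;
}

// Hypothetical helper: runs one site's scraper and always resolves to a
// JobsSource, so a failure on one website becomes an error message
// instead of a rejected promise.
async function toJobsSource(
  name: string,
  url: string,
  scrape: () => Promise<Job[]>,
): Promise<JobsSource> {
  try {
    return { name, url, results: await scrape() };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { name, url, results: [], error: message };
  }
}
```

A client can then check whether error is set on each source and show a per-website notice while still listing the results from the other sources.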
# unit tests
$ npm run test
# e2e tests
$ npm run test:e2e
# test coverage
$ npm run test:cov