
Commit

Merge pull request #2 from ExcitingTheory/add-docs-and-licnese
Improved documentation
csfx authored Jul 9, 2023
2 parents 4241f7e + ddc7378 commit 1892509
Showing 8 changed files with 1,712 additions and 67 deletions.
21 changes: 21 additions & 0 deletions LICENSE.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2023 Colin Barrett-Fox, Exciting Theory

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
85 changes: 48 additions & 37 deletions README.md
@@ -1,54 +1,65 @@
# Next.js Amplify Spiders v1
# Amplify Spiders v1

## How to use
<img style="position:relative; left:25%" src="./images/purple-spider.gif" alt="Amplify Spiders v1" width="50%">

Install it and run:

```sh
npm install
npm run dev
```
Amplify Spiders v1 is an AWS Amplify project that hosts a Next.js site with real-time data and custom Lambda handlers, including container-based Lambdas and TensorFlow.js. It provides crawlers for several search engines, intended for competitive analysis only.

## NOTES on setup
## Getting Started

```bash
amplify init
To get started with Amplify Spiders v1, follow these steps:

amplify add auth
1. Clone the repository to your local machine.
2. Install the necessary dependencies by running `npm install`.
3. Set up your AWS Amplify environment by following the instructions in the [amplify/README.md](amplify/README.md) file.
4. Run the project locally by running `npm run dev`.
5. Deploy the project to the cloud by running `amplify push`.

user groups free, paid, and admin.
Facebook, amazon and google login.
## Features and Functionality

amplify push
```
Amplify Spiders v1 includes the following features and functionality:

--force? didn't end up doing that
https://github.com/aws-amplify/amplify-adminui/issues/472
- [x] Next.js site
- [x] Real-time data
- [x] Custom Lambda handlers
- [x] Lambda containers
- [x] TensorFlow.js with the Universal Sentence Encoder to compare search results to the search query
- [x] Crawler for Google custom search engine
- [x] Crawler for Citysearch
- [x] Crawler for Yelp
- [x] Crawler for Yellow Pages
- [x] Crawler for FourSquare
- [x] Can create a new user
- [x] Can login as a user
- [x] Can create a domain to monitor
- [x] Can view all domains
- [x] Can view domain details
- [x] Can view historical rankings for a domain and a search engine as a line chart
- [x] Detects if the domain is in the first page search results for a search engine
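The first-page detection above can be sketched as a small helper that normalizes result URLs to bare hostnames before comparing. This is an illustrative sketch; the helper names are not from this repo:

```javascript
// Sketch of first-page detection: normalize each result URL to a bare
// hostname, then check whether the monitored domain appears among them.
// Helper names are illustrative, not taken from the repo.
const domainFromUrl = (u) => new URL(u).hostname.replace(/^www\./, "");

const isOnFirstPage = (domain, resultUrls) =>
  resultUrls.some((u) => domainFromUrl(u) === domain.replace(/^www\./, ""));
```

Normalizing away the `www.` prefix on both sides avoids false negatives when a search engine links the bare domain but the monitored domain was stored with `www.`.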

What ended up working was creating auth without federation.
## Roadmap

Amplify Spiders v1 is currently in development. The following features and functionality are planned for future releases:

```bash
amplify publish
```

Python Lambda keys are stored in SSM:
Use the AWS SSM GetParameter API to retrieve secrets in your Lambda function.
More information: https://docs.aws.amazon.com/systems-manager/latest/APIReference/API_GetParameter.html
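A minimal sketch of that lookup from a Node.js Lambda, assuming the AWS SDK v3 `@aws-sdk/client-ssm` package (bundled with recent Node.js Lambda runtimes); the parameter name is a placeholder:

```javascript
// Sketch: read a decrypted secret from SSM Parameter Store.
// Assumes @aws-sdk/client-ssm is available (bundled with Node.js 18+
// Lambda runtimes); "/spiders/api-key" is a placeholder parameter name.
const getSecret = async (name) => {
  const { SSMClient, GetParameterCommand } = await import("@aws-sdk/client-ssm");
  const client = new SSMClient({});
  const { Parameter } = await client.send(
    new GetParameterCommand({ Name: name, WithDecryption: true })
  );
  return Parameter.Value;
};

// const apiKey = await getSecret("/spiders/api-key");
```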
- [ ] Finish the main site menu bar (login works, but logout does not yet) IN PROGRESS
- [ ] Remove Lambda containers by tree-shaking this library to reduce bundle size.
- [ ] Crawler for Facebook Business: need to get the app approved by Facebook for the demo site
- [ ] Crawler for Bing?
- [ ] Crawler for Yahoo?
- [ ] CI/CD for container Lambda handlers?
- [ ] Find good sources of regional statistical and demographic data for cross-referencing with search results?

graphql access in lambdas
## Contributing

```bash
API_SPIDERS1_GRAPHQLAPIENDPOINTOUTPUT
API_SPIDERS1_GRAPHQLAPIIDOUTPUT
ENV
REGION
```
If you'd like to contribute to Amplify Spiders v1, please follow these steps:

Example:
1. Fork the repository.
2. Create a new branch for your changes.
3. Make your changes and commit them.
4. Push your changes to your fork.
5. Create a pull request.

```graphql
mutation MyMutation {
crawlEngines(search:"bend brewing company", postalCode:"97702")
}
```
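A sketch of how a Lambda could send this mutation to the AppSync endpoint using the injected environment variables. Auth headers are omitted here; what goes in them depends on the API's auth mode (API key, IAM, or Cognito):

```javascript
// Build the HTTP request for the mutation above. The endpoint comes from
// the environment variable Amplify injects into the Lambda; auth headers
// are omitted and depend on the API's configured auth mode.
const buildGraphQLRequest = (query, variables = {}) => ({
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ query, variables }),
});

const request = buildGraphQLRequest(
  'mutation MyMutation { crawlEngines(search: "bend brewing company", postalCode: "97702") }'
);

// const res = await fetch(process.env.API_SPIDERS1_GRAPHQLAPIENDPOINTOUTPUT, request);
```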
## License

Amplify Spiders v1 is licensed under the MIT License. See `LICENSE.txt` for more information.
47 changes: 47 additions & 0 deletions amplify/README.md
@@ -0,0 +1,47 @@
# Amplify Specific Setup

See the official [Amplify Documentation](https://docs.amplify.aws/cli/start/install) for more information.

## Install Amplify CLI

To install the Amplify CLI, run the following command:

```bash
npm install -g @aws-amplify/cli
```

## Configure Amplify CLI

To configure the Amplify CLI, run the following command:

```bash
amplify configure
```

## Clone as a Sample Project

If you'd like to clone this as a sample project, you can do so by running the following command:

```bash
amplify init --app <github-url>
```


## Pull backend environment

If you have an existing implementation of Amplify Spiders v1, you can pull the backend environment to your local machine by running the following command:

```bash
amplify pull --appId APP-ID --envName ENV-NAME
```

## Push backend environment

If you have made changes to the backend environment, you can push those changes to the cloud by running the following command:

```bash
amplify push
```

8 changes: 8 additions & 0 deletions amplify/backend/custom/spiderLanguage/src/index.js
@@ -50,6 +50,13 @@ const query = /* GraphQL */ `

/**
* @type {import('@types/aws-lambda').APIGatewayProxyHandler}
* @param {import('@types/aws-lambda').APIGatewayProxyEvent} event
* @param {import('@types/aws-lambda').Context} context
* @returns {Promise<import('@types/aws-lambda').APIGatewayProxyResult>}
* @see https://docs.aws.amazon.com/lambda/latest/dg/nodejs-handler.html
*
* @description
* This is the entry point for the Lambda function invoked by the API Gateway.
*/
export const handler = async (event) => {
console.log(`EVENT: ${JSON.stringify(event)}`);
@@ -268,6 +275,7 @@ export const handler = async (event) => {
let response;

try {
// Save the data to the database
response = await fetch(request);
body = await response.json();
if (body.errors) statusCode = 400;
112 changes: 86 additions & 26 deletions amplify/backend/custom/spiderLanguage/src/lib.js
@@ -44,7 +44,14 @@ export const stringSimilarity = function (str1, str2, gramSize = 2) {
return hits / total;
}
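Most of `stringSimilarity` is collapsed in this diff view; a self-contained sketch of the same idea — Dice-style overlap of character bigrams — not the repo's exact code:

```javascript
// Dice-style string similarity over character bigrams: the fraction of
// bigrams the two strings share. A sketch of the idea behind
// stringSimilarity, not the repo's exact implementation.
const grams = (str, size = 2) => {
  const s = str.toLowerCase();
  const out = [];
  for (let i = 0; i <= s.length - size; i++) out.push(s.slice(i, i + size));
  return out;
};

const bigramSimilarity = (str1, str2, gramSize = 2) => {
  const g1 = grams(str1, gramSize);
  const counts = new Map();
  for (const g of grams(str2, gramSize)) counts.set(g, (counts.get(g) || 0) + 1);
  let hits = 0;
  for (const g of g1) {
    const n = counts.get(g) || 0;
    if (n > 0) {
      hits++;
      counts.set(g, n - 1);
    }
  }
  const total = Math.max(g1.length, grams(str2, gramSize).length);
  return total ? hits / total : 0;
};

// bigramSimilarity("night", "nacht") → 0.25 (only "ht" is shared)
```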


/**
* Computes the dot product of two numeric vectors.
*
* @param {number[]} a
* @param {number[]} b
* @returns {number} The dot product of the two vectors.
*
* https://stackoverflow.com/questions/43013238/how-to-calculate-dot-product-of-two-vectors-in-javascript
*/
const dot = function (a, b) {
var hasOwnProperty = Object.prototype.hasOwnProperty;
var sum = 0;
@@ -56,6 +63,13 @@ const dot = function (a, b) {
return sum
}

/**
* Computes the cosine similarity of two vectors.
*
* @param {number[]} a
* @param {number[]} b
* @returns {number|false} The cosine similarity, or false if either vector has zero magnitude.
*/
const similarity = function (a, b) {
var magnitudeA = Math.sqrt(dot(a, a));
var magnitudeB = Math.sqrt(dot(b, b));
@@ -64,7 +78,11 @@ const similarity = function (a, b) {
else return false
}


/**
* Computes pairwise cosine similarities between the rows of a matrix.
*
* @param {number[][]} matrix
* @returns {number[][]} A matrix of cosine similarities between the rows of the input matrix.
*/
const cosineSimilarityMatrix = function (matrix) {
let cosine_similarity_matrix = [];
for (let i = 0; i < matrix.length; i++) {
@@ -81,13 +99,24 @@ const cosineSimilarityMatrix = function (matrix) {
return cosine_similarity_matrix;
}
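A runnable, condensed version of the dot-product / cosine-similarity helpers above (array-only; the repo's `similarity` returns `false` for zero-magnitude vectors, mirrored here):

```javascript
// Condensed, runnable version of the helpers above: dot product, cosine
// similarity, and the pairwise similarity matrix over rows.
const dotProduct = (a, b) => a.reduce((sum, v, i) => sum + v * b[i], 0);

const cosine = (a, b) => {
  const magA = Math.sqrt(dotProduct(a, a));
  const magB = Math.sqrt(dotProduct(b, b));
  // Mirror the repo's behavior: zero-magnitude vectors yield false.
  return magA && magB ? dotProduct(a, b) / (magA * magB) : false;
};

const similarityMatrix = (rows) => rows.map((a) => rows.map((b) => cosine(a, b)));
```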

/**
* Embeds the sentences with the Universal Sentence Encoder and computes
* their pairwise cosine similarities.
*
* @param {string[]} sentences
* @returns {Promise<number[][]>} A matrix of syntactic similarities between the sentences.
*/
export const syntacticSimilarity = async function (sentences) {
const model = await use.load()
const embeddings = await model.embed(sentences)
let arr = cosineSimilarityMatrix(embeddings.arraySync())
return arr
}

/**
* Performs a GET request with error logging.
*
* @param {string} url
* @param {Object} params
* @returns The response from the GET request.
*/
export const getRequest = async function (url, params = {}) {
let data = axios.get(url, params).catch((error) => {
if (error.response) {
@@ -111,6 +140,18 @@ export const getRequest = async function (url, params = {}) {
return data
}


/**
* Takes the response data from the Citysearch API and returns the most
* likely matching location.
*
* @param {Object} data
* @param {string} search
* @param {string} postalCode
* @param {string} domainName
* @param {string} dateStamp
* @returns The most likely location from the data.
*/
export const parseCitysearch = async function (data, search, postalCode, domainName, dateStamp) {
const toSearch = data?.results?.locations || [];
let mostLikely = -1
@@ -138,13 +179,6 @@ export const parseCitysearch = async function (data, search, postalCode, domainN

syntax.push(name)

// const address = {
// street: element?.address?.street,
// city: element?.address?.city,
// state: element?.address?.state,
// postalCode: element?.address?.postal_code
// }

const address = element?.address?.street

let scoreInfo = {
@@ -168,6 +202,7 @@ export const parseCitysearch = async function (data, search, postalCode, domainN
return result.value
}
})

const bumpChart = results.map(function (result) {
if (result?.name) {
const y = result?.key >= 0 ? result.key + 1 : null
@@ -182,15 +217,7 @@ export const parseCitysearch = async function (data, search, postalCode, domainN
}
})

// console.log(JSON.stringify({
// results,
// highScore,
// foundWebsite,
// mostLikely
// }))

const websiteCheckIndex = exactNameMatch >= 0 ? exactNameMatch : mostLikely
// console.log('toSearch[websiteCheckIndex]["website"]', toSearch[websiteCheckIndex]["website"])
if (toSearch[websiteCheckIndex] && toSearch[websiteCheckIndex]["website"]) {
let website
try {
@@ -250,6 +277,17 @@ export const parseCitysearch = async function (data, search, postalCode, domainN
}
}

/**
* Takes the response data from the Google API and returns the most
* likely matching location.
*
* @param {Object} data
* @param {string} search
* @param {string} postalCode
* @param {string} domainName
* @param {string} dateStamp
* @returns The most likely location from the data.
*/
export const parseGoogle = async function (data, search, postalCode, domainName, dateStamp) {
const toSearch = data?.items || [];
// console.log(toSearch)
@@ -361,15 +399,14 @@ export const parseGoogle = async function (data, search, postalCode, domainName,

/**
*
* @param {Object} data
* @param {String} search
* @param {Number} postalCode
* @param {String} domainName
* @returns {
results, // array of first page results
highScore, // Highest score
mostLikely // Key from highest score
}
* @param {*} data
* @param {*} search
* @param {*} postalCode
* @param {*} domainName
* @param {*} dateStamp
* @returns The most likely location from the data.
*
* This function takes the data from the Foursquare API and returns the most likely location.
*/
export const parseFoursquare = async function (data, search, postalCode, domainName, dateStamp) {
const toSearch = data?.results || [];
@@ -468,6 +505,17 @@ export const parseFoursquare = async function (data, search, postalCode, domainN
}
}

/**
* Takes the response data from the Yelp API and returns the most
* likely matching location.
*
* @param {Object} data
* @param {string} search
* @param {string} postalCode
* @param {string} domainName
* @param {string} dateStamp
* @returns The most likely location from the data.
*/
export const parseYelp = async function (data, search, postalCode, domainName, dateStamp) {
const toSearch = data?.businesses || [];
let mostLikely = -1
@@ -599,6 +647,18 @@ export const parseYelp = async function (data, search, postalCode, domainName, d
}
}


/**
* Takes the response data from the Yellowpages API and returns the most
* likely matching location.
*
* @param {Object} data
* @param {string} search
* @param {string} postalCode
* @param {string} domainName
* @param {string} dateStamp
* @returns The most likely location from the data.
*/
export const parseYellowpages = async function (data, search, postalCode, domainName, dateStamp) {
const toSearch = data?.searchResult?.searchListings?.searchListing || [];
let mostLikely = -1
