22 commits
df869db
no secrets
dendenso Apr 30, 2025
b25bd60
removed cached files and ignore compilation removed testing method
dendenso Apr 30, 2025
59249db
removing testing method in main
dendenso Apr 30, 2025
50414e4
reset pyc and pycache files
dendenso Apr 30, 2025
37ee253
place files back in
dendenso Apr 30, 2025
534f0d6
direct commit
dendenso Apr 30, 2025
83a4e06
Merge branch 'primary' into EbaySearch
James-Cheaper Apr 30, 2025
c6a4df5
Merge pull request #6 from James-Cheaper/EbaySearch
James-Cheaper Apr 30, 2025
54265bb
Merge branch 'primary' of https://github.com/James-Cheaper/cheaper in…
dendenso May 1, 2025
fbe2803
Orms created UserAccount Product models with prod list view
johnnvij May 5, 2025
12df7ce
Update .gitignore to ignore environment, cache, and local files
AbhiDoshi2000 May 6, 2025
122db4a
issue 5 documentation
gmanhas12 May 7, 2025
3c60b9c
Merge pull request #15 from James-Cheaper/update-gitignore
James-Cheaper May 9, 2025
6dc657b
Merge pull request #17 from James-Cheaper/GagansBranch
James-Cheaper May 9, 2025
9fe2f75
Merge pull request #14 from James-Cheaper/orm_setup
James-Cheaper May 11, 2025
50768d6
used docker support with Gunicorn and environment setup
johnnvij May 24, 2025
b597966
testing
johnnvij May 28, 2025
a8b61fc
testin1
johnnvij May 28, 2025
eb2a3fd
Retrieve a item from Ebay api #7 (#19)
gmanhas12 May 30, 2025
9e04373
remove .egg file
johnnvij Jun 2, 2025
03298ce
#30 final
johnnvij Jun 6, 2025
3a0c011
Merge branch 'primary' into dgc-setup
johnnvij Jun 6, 2025
8 changes: 8 additions & 0 deletions .dockerignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
__pycache__/
*.pyc
*.pyo
*.pyd
.env
venv/
.envrc
.git
Empty file removed .env
Empty file.
58 changes: 58 additions & 0 deletions .gitignore
@@ -0,0 +1,58 @@
# === Python build artifacts ===
*.pyc
*.pyo
*.pyd
__pycache__/
**/__pycache__/
*.egg-info/
dist/
build/
*.log

# === SQLite & output files ===
*.sqlite3
*.db
output.json

# === Environment variables ===
.env
.env.*
*.env

# === Virtual environments ===
venv/
.venv/
.env/

# === VSCode project settings ===
.vscode/

# === macOS system files ===
.DS_Store

# === Pytest and test cache ===
htmlcov/
.coverage
.cache/
.pytest_cache/
.tox/

# === Jupyter Notebook ===
.ipynb_checkpoints/

# === Django migration artifacts (optional to ignore) ===
# Uncomment the lines below if you want to regenerate migrations often
# **/migrations/*.py
# **/migrations/*.pyc
# !**/migrations/__init__.py

# === FastAPI-specific artifacts ===
fastapi_email/email_db.sqlite3

# === IDE-specific ===
.idea/
*.sublime-project
*.sublime-workspace

# === GitHub Codespaces or devcontainers ===
.devcontainer/
14 changes: 14 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,14 @@
{
    "python.analysis.extraPaths": [
        "./webscraper/ABC"
    ],
    "python.testing.unittestArgs": [
        "-v",
        "-s",
        "./webscraper",
        "-p",
        "*test*.py"
    ],
    "python.testing.pytestEnabled": false,
    "python.testing.unittestEnabled": true
}
18 changes: 18 additions & 0 deletions Dockerfile
@@ -0,0 +1,18 @@
FROM continuumio/miniconda3

WORKDIR /app

COPY environment.yml .

RUN conda install -n base -c conda-forge mamba && \
    mamba env update -n base -f environment.yml && \
    conda clean --all --yes

COPY . .

ENV PYTHONUNBUFFERED=1

EXPOSE 8000

CMD ["gunicorn", "--bind", "0.0.0.0:8000", "wsgi_entry:application"]
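
The `CMD` above points Gunicorn at `wsgi_entry:application`, a module that does not appear in this diff. Gunicorn's `module:callable` argument expects a WSGI callable, so the sketch below shows the minimum such an entry point has to provide. The body is a hypothetical stand-in; the project's real `wsgi_entry.py` would presumably wrap the Django application instead.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Smallest useful WSGI callable: always answers 200 with a text body."""
    body = b"ok\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# Call the app directly with a fake request, without starting a server.
environ = {}
setup_testing_defaults(environ)
captured = {}


def start_response(status, headers):
    # Record what the app reports, as a real server would.
    captured["status"] = status
    captured["headers"] = headers


response_body = b"".join(application(environ, start_response))
print(captured["status"], response_body)
```

If the module were saved as `wsgi_entry.py` in `/app`, the `CMD` line would resolve `application` by import, which is why the name and location of that file matter for the image to start.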

14 changes: 13 additions & 1 deletion README.md
@@ -11,7 +11,7 @@ Initial Landing page![Initial Landing page](https://github.com/user-attachments/

-To run the scraper, execute the main.py script by running the command

-python src/main.py
+python main.py

-Make sure you are in the webscraper directory when you run the command

@@ -25,3 +25,15 @@ if __name__ == "__main__":
##what file needs to be located and what variables would need to be changed if you wanted to scrape another website?

-To scrape another website, locate the file main.py and change the variables “scraper” and “pages” to the new website and its URL paths. Also make sure the website allows scraping.



Documentation on connecting the database to VS Code with the PostgreSQL extension

1. Install the PostgreSQL extension in VS Code.
2. Make sure PostgreSQL is running locally.
3. Click the extension icon in the left sidebar.
4. Click the plus button to create a new connection.
5. Fill in the connection details: server = localhost, database = cheaper_local, user = postgres, port = 5432 (default), password = the password you set when installing PostgreSQL.
6. You should now be connected: a confirmation message appears and the connected database is listed in the extension.
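
The connection values in the steps above could also be mirrored on the Django side. The sketch below assumes the project is meant to use that same local PostgreSQL database; this diff does not show `settings.py`, so that is an assumption. The helper name `build_databases` and the `POSTGRES_PASSWORD` environment variable are hypothetical. `django.db.backends.postgresql` is Django's standard PostgreSQL engine path.

```python
import os


def build_databases(env=os.environ):
    """Return a Django-style DATABASES dict for the local PostgreSQL instance."""
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": "cheaper_local",
            "USER": "postgres",
            # Read the password from the environment so it never lands in git.
            "PASSWORD": env.get("POSTGRES_PASSWORD", ""),
            "HOST": "localhost",
            "PORT": "5432",
        }
    }


DATABASES = build_databases()
print(DATABASES["default"]["NAME"])
```

Keeping the password in an environment variable fits the `.env` entries this PR adds to `.gitignore` and `.dockerignore`.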

1 change: 1 addition & 0 deletions accounts/__init__.py
@@ -0,0 +1 @@
# test
Binary file removed accounts/__pycache__/__init__.cpython-312.pyc
Binary file not shown.
Binary file removed accounts/__pycache__/admin.cpython-312.pyc
Binary file not shown.
Binary file removed accounts/__pycache__/apps.cpython-312.pyc
Binary file not shown.
Binary file removed accounts/__pycache__/models.cpython-312.pyc
Binary file not shown.
@@ -0,0 +1,55 @@
# Generated by Django 5.2 on 2025-05-05 19:14

import django.db.models.deletion
from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ("accounts", "0001_initial"),
    ]

    operations = [
        migrations.RemoveField(
            model_name="product",
            name="name",
        ),
        migrations.RemoveField(
            model_name="product",
            name="source_url",
        ),
        migrations.RemoveField(
            model_name="useraccount",
            name="password",
        ),
        migrations.AddField(
            model_name="product",
            name="product_name",
            field=models.CharField(default="Unnamed Product", max_length=255),
        ),
        migrations.AddField(
            model_name="product",
            name="url",
            field=models.TextField(default="https://example.com"),
        ),
        migrations.AddField(
            model_name="product",
            name="user",
            field=models.ForeignKey(
                null=True,
                on_delete=django.db.models.deletion.CASCADE,
                to="accounts.useraccount",
            ),
        ),
        migrations.AddField(
            model_name="useraccount",
            name="password_hash",
            field=models.CharField(default="defaultpass123", max_length=100),
        ),
        migrations.AlterField(
            model_name="product",
            name="price",
            field=models.DecimalField(decimal_places=2, default=0.0, max_digits=10),
        ),
    ]
Binary file not shown.
Binary file not shown.
13 changes: 8 additions & 5 deletions accounts/models.py
@@ -14,18 +14,21 @@ def validate_email(value):

 class UserAccount(models.Model):
     email = models.EmailField(max_length=50, unique=True)
-    password = models.CharField(max_length=100)
+    password_hash = models.CharField(max_length=100)
+    # password_hash = models.CharField(max_length=100, default='defaultpass123') # added default

     def clean(self):
         validate_email(self.email)

     def __str__(self):
         return self.email


 class Product(models.Model):
-    name = models.CharField(max_length=200)
-    price = models.CharField(max_length=10)
-    source_url = models.URLField(max_length=150)
+    product_name = models.CharField(max_length=255, default='Unnamed Product')
+    price = models.DecimalField(max_digits=10, decimal_places=2, default=0.00)
+    url = models.TextField(default='https://example.com')
+    user = models.ForeignKey(UserAccount, on_delete=models.CASCADE, null=True)

     def __str__(self):
-        return self.name
+        return self.product_name
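
The rename from `password` to `password_hash` suggests the column is meant to hold a hash rather than plain text. The sketch below shows one way such a value could be produced with the standard library. It assumes PBKDF2-HMAC-SHA256 and a hypothetical `iterations$salt$digest` layout, neither of which is confirmed by this PR. In a Django project, the built-in `make_password` and `check_password` helpers in `django.contrib.auth.hashers` would be the more usual choice.

```python
import base64
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor for this sketch


def hash_password(password: str) -> str:
    """Return 'iterations$salt$digest' using PBKDF2-HMAC-SHA256."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return "$".join([
        str(ITERATIONS),
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    ])


def verify_password(password: str, stored: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    iterations, salt_b64, digest_b64 = stored.split("$")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, int(iterations))
    return hmac.compare_digest(candidate, expected)


stored = hash_password("correct horse")
print(len(stored), verify_password("correct horse", stored))
```

With a 16-byte salt and a 32-byte digest, the encoded string stays under the `max_length=100` declared on the field.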
7 changes: 7 additions & 0 deletions accounts/views.py
@@ -1,3 +1,10 @@
from django.shortcuts import render
from django.http import JsonResponse
from .models import Product

def product_list(request):
    products = Product.objects.all()
    data = [{"name": p.product_name, "price": float(p.price), "url": p.url} for p in products]
    return JsonResponse(data, safe=False)

# Create your views here.
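
The view converts each `DecimalField` value with `float(p.price)`, which can lose exactness for some decimal amounts. The sketch below is one possible alternative, not the author's method: it builds the same payload but keeps the price as a string. `serialize_products` is a hypothetical helper, and `SimpleNamespace` objects stand in for `Product` rows so the snippet runs without Django.

```python
from decimal import Decimal
from types import SimpleNamespace


def serialize_products(products):
    """Build the list of dicts the view returns, with price kept as a string."""
    return [
        {"name": p.product_name, "price": str(p.price), "url": p.url}
        for p in products
    ]


# Stand-in objects with the same attributes as the Product model.
sample = [
    SimpleNamespace(product_name="Widget", price=Decimal("19.99"), url="https://example.com"),
]
print(serialize_products(sample))
```

Whether clients prefer a numeric or a string price is a design choice. The string form simply round-trips the stored value unchanged.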
Binary file removed cheaper/__pycache__/__init__.cpython-312.pyc
Binary file not shown.
Binary file removed cheaper/__pycache__/settings.cpython-312.pyc
Binary file not shown.
Binary file removed cheaper/__pycache__/urls.cpython-312.pyc
Binary file not shown.
2 changes: 2 additions & 0 deletions cheaper/urls.py
@@ -16,7 +16,9 @@
"""
from django.contrib import admin
from django.urls import path
from accounts.views import product_list

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', product_list, name='product_list'),  # This sets the homepage
]
Binary file modified db.sqlite3
Binary file not shown.
77 changes: 77 additions & 0 deletions dockerREADME.md
@@ -0,0 +1,77 @@
# Docker Deployment Guide

### 1. Prerequisites

- Install [Docker Desktop](https://www.docker.com/products/docker-desktop)
- Make sure Docker Engine is running

### 2. Project Structure (Relevant Parts)

```
cheaper/
├── cheaper/
│ └── wsgi.py
├── environment.yml
├── Dockerfile
├── .dockerignore
├── main.py
├── setup.py
└── ...
```

---

### 3. Dockerfile

We're using Miniconda and `environment.yml` (not `requirements.txt`) for dependency management.

```dockerfile
FROM continuumio/miniconda3:latest

WORKDIR /app

COPY environment.yml .

RUN conda install -n base -c conda-forge mamba && \
    mamba env update -n base -f environment.yml && \
    conda clean --all --yes

COPY . .

# ⏱️ Gunicorn timeout is increased to handle long scraping time
CMD ["gunicorn", "--timeout", "120", "cheaper.wsgi:application", "-b", "0.0.0.0:8000"]
```

---

### 4. .dockerignore

```dockerignore
__pycache__/
*.pyc
*.pyo
*.pyd
env/
venv/
.git
```

---

### 5. Build and Run

```bash
# Build the Docker image
docker build -t cheaper-app .

# Run the container on port 8000
docker run --rm -p 8000:8000 cheaper-app
```

Open [http://localhost:8000](http://localhost:8000) — you should see:

```
Scraping complete.
```

---
1 change: 1 addition & 0 deletions environment.yml
@@ -12,3 +12,4 @@ dependencies:
  - pip:
    - beautifulsoup4
    - lxml
    - gunicorn
29 changes: 29 additions & 0 deletions setup.py
@@ -0,0 +1,29 @@
from setuptools import setup, find_packages

setup(
    name='cheaper',
    version='0.1',
    packages=find_packages(exclude=["tests", "*.tests", "*.tests.*", "tests.*"]),
    include_package_data=True,
    install_requires=[
        "beautifulsoup4",
        "lxml",
        "flask",
        "pandas",
        "numpy",
        "requests",
        "gunicorn",
    ],
    entry_points={
        'console_scripts': [
            'cheaper=webscraper.main:main',
        ],
    },

    description='cheaper for now',
    classifiers=[
        'Programming Language :: Python :: 3',
        'Operating System :: OS Independent',
    ],
    python_requires='>=3.10',
)
15 changes: 15 additions & 0 deletions webscraper/ABC/Ebay_API.py
@@ -0,0 +1,15 @@
from abc import ABC, abstractmethod


class EbayApi(ABC):

    @abstractmethod
    def retrieve_access_token(self) -> str:
        """Retrieve the user access token for the sandbox environment.

        The token is a long string of letters, numbers, and symbols.
        """
        pass

    @abstractmethod
    def retrieve_ebay_response(self, httprequest: str, query: str) -> dict:
        """Retrieve a large JSON payload with category IDs, names, and parent category nodes."""
        pass
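
To illustrate how this interface is meant to be used, the sketch below adds a concrete subclass. The abstract class is repeated so the snippet runs on its own, with `self` parameters added, which is an assumption of this sketch. `FakeEbayApi` and its canned return values are hypothetical and make no real eBay calls.

```python
from abc import ABC, abstractmethod


class EbayApi(ABC):
    """Same interface as in the diff, with `self` added for instance methods."""

    @abstractmethod
    def retrieve_access_token(self) -> str: ...

    @abstractmethod
    def retrieve_ebay_response(self, httprequest: str, query: str) -> dict: ...


class FakeEbayApi(EbayApi):
    """Hypothetical stand-in that returns canned data instead of calling eBay."""

    def retrieve_access_token(self) -> str:
        return "fake-sandbox-token"

    def retrieve_ebay_response(self, httprequest: str, query: str) -> dict:
        # Echo the inputs so a test can see what would have been requested.
        return {"request": httprequest, "query": query, "items": []}


try:
    EbayApi()  # abstract classes cannot be instantiated directly
except TypeError as exc:
    print("cannot instantiate:", type(exc).__name__)

api = FakeEbayApi()
print(api.retrieve_access_token())
```

A stand-in like this lets the unittest discovery configured in `.vscode/settings.json` exercise code that depends on `EbayApi` without sandbox credentials.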
Binary file not shown.
Binary file not shown.