Search API

fab edited this page Aug 29, 2023 · 1 revision

Creating a robust API for checking domains against a blacklist requires a few essential considerations:

  1. Speed: As an affiliate network, you need to check quickly whether a given domain is on the blacklist, which calls for efficient storage and lookup mechanisms.
  2. Scalability: Your service should handle large numbers of requests, especially if it's being used by big affiliate networks.
  3. Security: Protect your API against misuse. Consider rate-limiting, access controls, and other measures to ensure only authorized users can make requests.

Let's start with a simple Flask-based API which relies on in-memory storage for speed:

1. Setting Up

Ensure you have the required packages:

pip install Flask Flask-Limiter requests

2. Implementing the API

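from flask import Flask, request, jsonify
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address
import requests

app = Flask(__name__)
# Keyword arguments keep this call valid across Flask-Limiter versions,
# which differ in the order of the positional parameters.
limiter = Limiter(key_func=get_remote_address, app=app, default_limits=["200 per day", "50 per hour"])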

BLACKLIST_URL = "https://get.domainsblacklists.com/blacklist.txt"
cached_blacklist = set()

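def update_blacklist():
    """Download the blacklist and replace the in-memory copy."""
    global cached_blacklist
    # A timeout keeps a stalled download from hanging startup indefinitely.
    response = requests.get(BLACKLIST_URL, timeout=10)
    if response.status_code == 200:
        # Strip whitespace and skip blank lines so lookups match exactly.
        cached_blacklist = {line.strip() for line in response.text.splitlines() if line.strip()}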

@app.route('/check-domain', methods=['POST'])
@limiter.limit("10 per minute")
def check_domain():
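    # silent=True returns None instead of raising on a missing or malformed JSON body.
    payload = request.get_json(silent=True)
    domain = payload.get('domain') if isinstance(payload, dict) else None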
    if not domain:
        return jsonify(error='Domain not provided'), 400

    if domain in cached_blacklist:
        return jsonify(status="blacklisted")
    else:
        return jsonify(status="safe")

if __name__ == '__main__':
    update_blacklist()  # Update the blacklist when the app starts
    app.run(debug=True)  # Development server only; do not enable debug mode in production
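
The lookup in `check_domain` is an exact string match, so input such as `Example.com`, `example.com.`, or a full URL would not match an entry stored as `example.com`. A minimal sketch of one way to normalize input before the lookup follows. The helper names `normalize_domain` and `is_blacklisted` are hypothetical, and the sketch assumes the blacklist file contains bare lowercase hostnames, which the source does not confirm.

```python
from urllib.parse import urlsplit


def normalize_domain(value: str) -> str:
    """Reduce user input to a bare lowercase hostname (hypothetical helper)."""
    value = value.strip().lower()
    # urlsplit only fills in .hostname when a "//" authority marker is present,
    # so prepend it to bare hostnames such as "example.com/path".
    if "://" not in value:
        value = "//" + value
    try:
        host = urlsplit(value).hostname or ""
    except ValueError:
        # Raised for malformed authorities such as an unclosed "[" bracket.
        return ""
    # Drop the trailing dot of a fully qualified name ("example.com.").
    return host.rstrip(".")


def is_blacklisted(value: str, blacklist: set) -> bool:
    """Exact-match lookup on the normalized hostname."""
    host = normalize_domain(value)
    return bool(host) and host in blacklist
```

In the view function, `is_blacklisted(domain, cached_blacklist)` could stand in for the direct `domain in cached_blacklist` test. Note that this still matches whole hostnames only; whether subdomains of a listed domain should also count is a policy decision the source leaves open.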

3. Considerations

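  • In-Memory Storage: The script above caches the blacklist in a Python set, so each lookup is a constant-time hash check. The trade-off is that every server process holds its own full copy, so memory use grows with both the size of the list and the number of processes. For very large blacklists or multi-process deployments, consider a shared data store (e.g., Redis).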

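  • Rate Limiting: We're using Flask-Limiter to limit /check-domain to 10 requests per minute for each client IP to prevent abuse. The default limits (200 per day, 50 per hour) apply to any route that does not set its own limit. Adjust these values based on your specific requirements and capacity.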

  • Auto-Updating the Blacklist: The above script loads the blacklist once when the app starts, so later changes to the list are not picked up until a restart. You might want to update the blacklist periodically. One way to do this is with a periodic background task, for example using Celery's beat scheduler.

  • Authentication: The provided API is open, meaning anyone can access it. In a production setting, you'll want to implement some form of authentication, such as API key-based authentication, to ensure only authorized users can access your service.
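
As a rough illustration of the API-key idea in the last point, the check itself can be written with the standard library alone. This is a sketch under stated assumptions, not a complete authentication scheme: the environment variable name `BLACKLIST_API_KEYS` and the helper names are hypothetical, and keys are assumed to be provisioned out of band.

```python
import hmac
import os


def load_api_keys(env_var: str = "BLACKLIST_API_KEYS") -> set:
    """Read a comma-separated key list from the environment (hypothetical variable name)."""
    raw = os.environ.get(env_var, "")
    return {key.strip() for key in raw.split(",") if key.strip()}


def key_is_valid(presented, valid_keys) -> bool:
    """Return True if the presented key matches any configured key."""
    if not presented:
        return False
    # compare_digest takes the same time wherever the inputs differ, which
    # avoids leaking how much of a key was guessed correctly through timing.
    matches = [
        hmac.compare_digest(presented.encode(), key.encode())
        for key in valid_keys
    ]
    return any(matches)
```

Inside `check_domain`, the key could be read with `request.headers.get("X-API-Key")` (the header name is an assumption) and the view could return a 401 response when `key_is_valid` is false. Flask-Limiter's `key_func` could then also be keyed on the API key rather than the client IP, so that limits apply per customer.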

Deploying this API on several servers behind a load balancer helps with availability and lets you scale out as traffic grows. If you expect a very high request rate, consider also running it in containers orchestrated by a platform like Kubernetes.
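
One deployment detail worth noting: when the app is served by a production WSGI server rather than run directly, the `if __name__ == '__main__'` block does not execute, so `update_blacklist()` would never be called and each worker process would start with an empty set. The sketch below shows one possible way to load the list at startup and refresh it on an interval using a plain daemon thread, which is a simpler stand-in for the Celery approach mentioned above and not a replacement for it. The class name `RefreshingBlacklist` and the `fetch` callable are hypothetical; `fetch` is assumed to wrap the download of `BLACKLIST_URL` and return an iterable of domains.

```python
import threading


class RefreshingBlacklist:
    """Holds a set of domains and reloads it on a fixed interval (illustrative)."""

    def __init__(self, fetch, interval_seconds: float = 3600.0):
        self._fetch = fetch  # hypothetical callable returning an iterable of domains
        self._interval = interval_seconds
        self._domains = frozenset()
        self._stop = threading.Event()

    def refresh(self) -> bool:
        """Reload once; keep the previous list if the fetch fails."""
        try:
            fresh = frozenset(self._fetch())
        except Exception:
            return False
        # Rebinding the attribute swaps the whole set in one step, so a
        # concurrent lookup sees either the old list or the new one.
        self._domains = fresh
        return True

    def start(self) -> None:
        """Load immediately, then keep refreshing in a daemon thread."""
        self.refresh()
        threading.Thread(target=self._loop, daemon=True).start()

    def _loop(self) -> None:
        # Event.wait returns False on timeout (time to refresh)
        # and True once stop() has been called.
        while not self._stop.wait(self._interval):
            self.refresh()

    def stop(self) -> None:
        self._stop.set()

    def __contains__(self, domain: str) -> bool:
        return domain in self._domains
```

With this shape, a module-level `blacklist = RefreshingBlacklist(fetch)` followed by `blacklist.start()` would run in every worker at import time, and the view could test `domain in blacklist`. Each process still keeps its own copy, so the memory consideration above continues to apply.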