This is a simple pipeline that reads forex data from a CSV file and streams it to a Kafka broker running on an EC2 instance. From Kafka, the data is written to an S3 bucket, where it is stored. A Glue crawler crawls the data in the bucket and creates a table in the Glue Data Catalog, and Athena is used to query the data.
I would have used an API to get live data, but I didn't want to pay for one, so I used a CSV file instead.
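For reference, here is a minimal sketch of the producer side in Python, assuming kafka-python and pandas are installed; the file name forex_data.csv is a stand-in for whatever CSV you use, while the topic test1 and the <EC2_ip_address> placeholder match the commands below:

import json
import pandas as pd
from kafka import KafkaProducer

# Update the IP address each time the EC2 instance restarts (see the notes below)
producer = KafkaProducer(
    bootstrap_servers="<EC2_ip_address>:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Send each CSV row to the topic as a JSON message
df = pd.read_csv("forex_data.csv")
for record in df.to_dict(orient="records"):
    producer.send("test1", value=record)

producer.flush()  # block until all messages are delivered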
Note:
- Each time you stop and start the EC2 instance, its public IP address changes, so you will need to update the IP address in the code
- You will have to edit the broker config:
sudo nano config/server.properties
and set advertised.listeners to the EC2 instance's public IP address, e.g. advertised.listeners=PLAINTEXT://<EC2_ip_address>:9092
1. Install Kafka on the EC2 instance (make sure the security group allows inbound traffic on the ports you use, e.g. 9092 for Kafka and 2181 for Zookeeper)
2. Open a new terminal to run the Zookeeper server
cd kafka_2.12-3.5.1
bin/zookeeper-server-start.sh config/zookeeper.properties
3. Open a new terminal to run the Kafka server
4. Allocate memory to the Kafka server
export KAFKA_HEAP_OPTS="-Xmx256M -Xms128M"
5. Start the Kafka server
cd kafka_2.12-3.5.1
bin/kafka-server-start.sh config/server.properties
6. Create a topic in another terminal
bin/kafka-topics.sh --create --topic test1 --bootstrap-server <EC2_ip_address>:9092 --replication-factor 1 --partitions 1
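Alternatively, if you would rather create the topic from Python than from the console script, kafka-python's admin client can do the same thing (pip install kafka-python):

from kafka.admin import KafkaAdminClient, NewTopic

# Creates test1 with the same settings as the console command above
admin = KafkaAdminClient(bootstrap_servers="<EC2_ip_address>:9092")
admin.create_topics([NewTopic(name="test1", num_partitions=1, replication_factor=1)])
admin.close()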
7. Start a producer
bin/kafka-console-producer.sh --topic test1 --bootstrap-server <EC2_ip_address>:9092
8. In a new terminal, start a consumer
bin/kafka-console-consumer.sh --topic test1 --bootstrap-server <EC2_ip_address>:9092
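On the S3 side of the pipeline, a consumer writes each message into the bucket, where the Glue crawler picks it up for Athena. A minimal sketch, assuming kafka-python and s3fs are installed and using a hypothetical bucket name forex-kafka-bucket (swap in your own):

import json
from kafka import KafkaConsumer
from s3fs import S3FileSystem

consumer = KafkaConsumer(
    "test1",
    bootstrap_servers="<EC2_ip_address>:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

s3 = S3FileSystem()  # picks up AWS credentials from the environment

# Write each message as its own JSON object; the crawler infers the table schema from these
for count, message in enumerate(consumer):
    with s3.open(f"s3://forex-kafka-bucket/forex_data_{count}.json", "w") as f:
        json.dump(message.value, f)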