headline_detector

Indonesian Headline Detection Python API

This Python library provides APIs for detecting headlines in text, especially in posts from social media platforms such as Twitter. Its models were developed and trained on a dataset of Twitter posts containing both headline and non-headline texts, and journalism professionals assisted with the dataset to ensure data quality.

$ pip install headline-detector

Available scenarios and their performance

Model	Scenario 1	Scenario 2	Scenario 3	Scenario 4	Scenario 5	Scenario 6
Fasttext	0.8766	0.8714	0.8793	0.8714	0.8714	0.8661
CNN	0.9081	0.9081	0.8950	0.8898	0.8950	0.8898
IndoBERTweet	0.9895	0.9921	0.9738	0.9580	0.9843	0.9685

All values are measured in accuracy.
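
For readers unfamiliar with the metric, accuracy is the fraction of texts whose predicted label matches the reference label. The sketch below is only an illustration: the `accuracy` helper and the four example labels are hypothetical and are not part of the library or its evaluation data.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the reference labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical example: four texts, three of them classified correctly.
example_accuracy = accuracy([0, 1, 1, 0], [0, 1, 0, 0])
print(example_accuracy)  # 0.75
```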

Model Throughput

Model	Throughput (approx. texts/second)
IndoBERTweet	≈1.3
CNN	≈281.60
Fasttext	≈2048.41

Tested on an Intel i7-6700K with 32 GB of RAM.
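
The README does not describe how throughput was measured, so the following is only a sketch of one way to obtain a comparable number on your own hardware. `measure_throughput` and `dummy_predict` are hypothetical names; in practice you would pass a real detector's `predict_text` method in place of the stand-in.

```python
import time


def measure_throughput(predict_fn, texts, repeats=3):
    """Return texts processed per second, using the fastest of `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        predict_fn(texts)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Guard against a zero elapsed time on very coarse clocks.
    return len(texts) / best if best > 0 else float("inf")


def dummy_predict(texts):
    # Stand-in for detector.predict_text; labels every text as non-headline.
    return [0 for _ in texts]


sample = ["nama kamu siapa?"] * 1000
throughput = measure_throughput(dummy_predict, sample)
print(f"{throughput:.2f} texts/second")
```

Taking the fastest run reduces noise from other processes; averaging the runs would be an equally reasonable choice.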

Usage

Each prediction is either 0 (non-headline) or 1 (headline).

from headline_detector import FasttextDetector, IndoBERTweetDetector, CNNDetector

detector = FasttextDetector.load_from_scenario(1)
data = detector.predict_text(
    [
        "nama kamu siapa?",
        "Kapolda Jatim Teddy Minahasa Dikabarkan Ditangkap Terkait Narkoba  https://t.co/LD9X6VFaUR",
    ]
)
print(data)  # output: [0, 1]

detector = CNNDetector.load_from_scenario(3)
data = detector.predict_text(
    [
        "nama kamu siapa?",
        "Kapolda Jatim Teddy Minahasa Dikabarkan Ditangkap Terkait Narkoba  https://t.co/LD9X6VFaUR",
    ]
)
print(data)  # output: [0, 1]

detector = IndoBERTweetDetector.load_from_scenario(5)
data = detector.predict_text(
    [
        "nama kamu siapa?",
        "Kapolda Jatim Teddy Minahasa Dikabarkan Ditangkap Terkait Narkoba  https://t.co/LD9X6VFaUR",
    ]
)
print(data)  # output: [0, 1]

# 0 is non-headline
# 1 is headline
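
Because the predictions come back in the same order as the input texts in the examples above, they can be paired with the texts to keep only the headlines. This is a minimal sketch: `keep_headlines` and `StubDetector` are hypothetical and exist only so the example runs without downloading a model. Any detector exposing `predict_text` as shown above could be passed instead.

```python
def keep_headlines(detector, texts):
    """Return only the texts that the detector labels as headlines (label 1)."""
    labels = detector.predict_text(texts)
    return [text for text, label in zip(texts, labels) if label == 1]


class StubDetector:
    """Hypothetical stand-in for a real detector, used only for illustration."""

    def predict_text(self, texts):
        # Toy rule, NOT the library's logic: treat texts with a link as headlines.
        return [1 if "https://" in text else 0 for text in texts]


texts = [
    "nama kamu siapa?",
    "Kapolda Jatim Teddy Minahasa Dikabarkan Ditangkap Terkait Narkoba  https://t.co/LD9X6VFaUR",
]
headlines = keep_headlines(StubDetector(), texts)
print(headlines)
```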

Paper

Coming soon.

Repository: kaenova/headline_detector
