mayarachew/CompsciencePapers

Compscience Papers

Description: A project that compares different classification and visualization techniques on a collection of scientific papers.

Concepts:

  • Text preprocessing (stopword removal, stemming, lemmatization)
  • Feature extraction (TF-IDF term weighting)
  • Dataset splitting
  • Classification (Support Vector Machine, Artificial Neural Network: Feed-forward Backpropagation Multilayer Perceptron, Naive Bayes, Random Forest, Nearest Neighbors Classifier, Decision Tree)
  • Visualization (Scatterplot)
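
The steps listed above can be sketched end to end on a toy corpus. This is a minimal illustration using only the Python standard library, not the project's actual code, which may well rely on a machine-learning library instead. The stopword list, the function names and the example sentences are all assumptions made for the sketch. Stemming and lemmatization are omitted, and a 1-nearest-neighbor rule with cosine similarity stands in for the classifiers named in the list.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not the project's list).
STOPWORDS = {"a", "an", "and", "by", "for", "from", "in", "of", "the", "to", "with"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def fit_idf(token_lists):
    """Smoothed inverse document frequency, computed on the training set only."""
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}

def tfidf_vector(tokens, idf):
    """L2-normalized sparse TF-IDF vector; terms unseen in training are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    weights = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}

def cosine(u, v):
    """Dot product of two L2-normalized sparse vectors."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(t, 0.0) for t, w in u.items())

def predict_1nn(vector, train_vectors, train_labels):
    """Return the label of the most similar training document."""
    best = max(range(len(train_vectors)), key=lambda i: cosine(vector, train_vectors[i]))
    return train_labels[best]

# Made-up sentences for illustration only; they are not taken from the dataset.
train = [
    ("IR", "ranking documents for query search engines"),
    ("IR", "indexing text documents for search"),
    ("CBR", "adapting stored cases to solve new problems"),
    ("CBR", "reasoning from stored cases"),
]
test = [
    ("IR", "query search over documents"),
    ("CBR", "reasoning with past cases"),
]

# Fit the TF-IDF weights on the training split, then vectorize both splits.
train_labels = [label for label, _ in train]
train_tokens = [tokenize(text) for _, text in train]
idf = fit_idf(train_tokens)
train_vectors = [tfidf_vector(tokens, idf) for tokens in train_tokens]

predictions = [
    predict_1nn(tfidf_vector(tokenize(text), idf), train_vectors, train_labels)
    for _, text in test
]
accuracy = sum(p == label for p, (label, _) in zip(predictions, test)) / len(test)
print(predictions, accuracy)
```

Fitting the IDF weights on the training split alone keeps information from the held-out documents out of the features, which is the usual reason to split before vectorizing.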

Dataset: Scientific papers

Description

A collection of 682 scientific papers, each assigned to one of five categories:

  1. Case-based Reasoning (CBR)
  2. Inductive Logic Programming (ILP)
  3. Information Retrieval (IR)
  4. Sonification (SON)
  5. Interactive Visualization (INT)

Each scientific paper is represented as plain text containing its title, authors, abstract, and references.

Source

http://vicg.icmc.usp.br/vicg/software

Files

cbr-ilp-ir-son-int.zip: compressed archive containing all the scientific papers;

compscience_papers.csv: CSV file in which each row is structured as [label, text of the scientific paper];

cbr-ilp-ir-son-int_cosine_gpt.data: derived TF-IDF representation of this text collection.
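
As a rough sketch of how the [label, text] layout of compscience_papers.csv could be read, the snippet below parses a two-column CSV stream with the standard `csv` module. The function name `read_labeled_papers`, the `has_header` flag and the sample rows are hypothetical. Whether the real file has a header row, and which encoding it uses, is not stated here, so both are left as assumptions to check.

```python
import csv
import io
from collections import Counter

def read_labeled_papers(handle, has_header=False):
    """Return (label, text) pairs from a two-column CSV stream."""
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip the header row, if the file has one
    # Rows with fewer than two columns (e.g. blank lines) are skipped.
    return [(row[0].strip(), row[1]) for row in reader if len(row) >= 2]

# In-memory stand-in for compscience_papers.csv; the rows are invented.
sample = io.StringIO(
    'CBR,"Title: An example paper, with a comma in its text"\n'
    'IR,"Title: Another example paper"\n'
    'IR,"Title: A third example"\n'
)
papers = read_labeled_papers(sample)
label_counts = Counter(label for label, _ in papers)
print(label_counts)
```

For the real file, the same function could be called on `open("compscience_papers.csv", newline="", encoding="utf-8")`, assuming UTF-8 encoding. Counting labels this way is a quick check that all five categories are present before splitting the data.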

References

[1] Paulovich, F. V., Nonato, L. G., Minghim, R., & Levkowitz, H., Least square projection: A fast high-precision multidimensional projection technique and its application to document mapping. IEEE Transactions on Visualization and Computer Graphics, v. 14, n. 3, p. 564-575, 2008.

[2] Levkowitz, H., Minghim, R., Nonato, L. G., & Paulovich, F. V., Visual mapping of text collections through a fast high precision projection technique. In: Tenth International Conference on Information Visualisation (IV'06). IEEE, 2006. p. 282-290.

[3] Eler, D. M., Paulovich, F. V., de Oliveira, M. C. F., & Minghim, R. Coordinated and multiple views for visualizing text collections. In: 2008 12th International Conference Information Visualisation. IEEE, 2008. p. 246-251.

[4] Paulovich, F. V., & Minghim, R., Multidimensional Data Mapping–Integrating Mining and Visualization.
