Refactor, update README.Rmd/.md
LazerLambda committed Apr 19, 2024
1 parent 755a225 commit ebee4d2
Showing 7 changed files with 23 additions and 27 deletions.
1 change: 1 addition & 0 deletions DESCRIPTION
@@ -25,3 +25,4 @@ Config/testthat/edition: 3
 RoxygenNote: 7.2.3
 VignetteBuilder: knitr
 Encoding: UTF-8
+Language: en-US
8 changes: 4 additions & 4 deletions R/bleu.R
@@ -2,7 +2,7 @@ library(checkmate)
 
 
 
-#' Validate Arguemnts
+#' Validate Arguments
 #'
 #' @param weights Weight vector for `bleu_corpus_ids` and `bleu_sentence_ids` functions
 #' @param smoothing Smoothing method for `bleu_corpus_ids` and `bleu_sentence_ids` functions
@@ -40,9 +40,9 @@ validate_references <- function(references, target) {
 #'
 #' `bleu_sentence_ids` computes the BLEU score for a single candidate sentence and a list of reference sentences.
 #' The sentences must be tokenized before so they are represented as integer vectors.
-#' Akin to sacreBLEU, the function allows the application of different smoothing methods.
+#' Akin to sacrebleu (Python), the function allows the application of different smoothing methods.
 #' Epsilon- and add-k-smoothing are available. Epsilon-smoothing is equivalent to 'floor'
-#' smoothing in the sacreBLEU implementation.
+#' smoothing in the sacrebleu implementation.
 #' The different smoothing techniques are described in Chen et al., 2014
 #' (https://aclanthology.org/W14-3346/).
 #'
@@ -128,7 +128,7 @@ bleu_corpus_ids <- function(references, candidates, n = 4, weights = NULL, smoot
 # Compute BLEU for a Corpus with Tokenization
 #
 #' This function applies tokenization based on the 'tok' library and computes the BLEU score.
-#' An already initializied tokenizer can be provided using the `tokenizer`argument or
+#' An already initialized tokenizer can be provided using the `tokenizer`argument or
 #' a valid huggingface identifier (string) can be passed. If the identifier is used only,
 #' the tokenizer is newly initialized on every call.
 #' @param references A list of a list of reference sentences (`list(list(c(1,2,...)), list(c(3,5,...)))`).
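The roxygen text for `bleu_sentence_ids` in R/bleu.R names epsilon ("floor") and add-k smoothing without showing what they do to a single n-gram precision. The following is a minimal C++ sketch of that idea on integer token ids, not the package's actual code: the names `Tokens`, `count_ngrams`, `clipped_precision`, `floor_smoothed` and `add_k_smoothed` are hypothetical, and the exact conventions (for example, which n-gram orders add-k applies to) are assumptions that may differ from sacReBLEU and sacrebleu.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical names for illustration; not taken from the package source.
using Tokens = std::vector<int>;
using NgramCounts = std::map<Tokens, long>;

// Count every n-gram of length n in a token-id sequence.
NgramCounts count_ngrams(const Tokens& s, std::size_t n) {
    NgramCounts counts;
    if (n == 0 || s.size() < n) return counts;
    for (std::size_t i = 0; i + n <= s.size(); ++i) {
        ++counts[Tokens(s.begin() + i, s.begin() + i + n)];
    }
    return counts;
}

struct Precision {
    long matches;  // candidate n-grams that also occur in a reference (clipped)
    long total;    // all n-grams in the candidate
};

// Modified n-gram precision: each candidate n-gram count is clipped to the
// maximum count of that n-gram in any single reference.
Precision clipped_precision(const std::vector<Tokens>& refs,
                            const Tokens& cand, std::size_t n) {
    NgramCounts max_ref;
    for (const Tokens& r : refs) {
        for (const auto& kv : count_ngrams(r, n)) {
            long& m = max_ref[kv.first];
            m = std::max(m, kv.second);
        }
    }
    Precision p{0, 0};
    for (const auto& kv : count_ngrams(cand, n)) {
        p.total += kv.second;
        auto it = max_ref.find(kv.first);
        if (it != max_ref.end()) p.matches += std::min(kv.second, it->second);
    }
    return p;
}

// Epsilon ("floor") smoothing: a zero match count is replaced by epsilon,
// so the logarithm of the precision stays finite.
double floor_smoothed(Precision p, double epsilon) {
    if (p.total == 0) return 0.0;
    double num = (p.matches == 0) ? epsilon : static_cast<double>(p.matches);
    return num / static_cast<double>(p.total);
}

// Add-k smoothing: k is added to both numerator and denominator.
// Assumption: applied here to whichever order the caller passes in.
double add_k_smoothed(Precision p, double k) {
    double denom = static_cast<double>(p.total) + k;
    if (denom <= 0.0) return 0.0;
    return (static_cast<double>(p.matches) + k) / denom;
}
```

With the reference `{1, 2, 3, 4}` and the candidate `{1, 2, 3, 5}`, the 4-gram precision has zero matches out of one candidate 4-gram, which is the case where the two smoothing variants produce different non-zero values.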
14 changes: 6 additions & 8 deletions README.Rmd
@@ -13,13 +13,13 @@ knitr::opts_chunk$set(
 )
 ```
 
-# sac :registered: eBLEU
+# sacReBLEU
 
 <!-- badges: start -->
 <!-- badges: end -->
 
-The goal of sacReBLEU is to provide a simple interface to the BLEU score, a metric for evaluating the quality of machine-translated text.
-This package is inspired by the NLTK and sacrebleu implementations of the BLEU score, and is based on a high-performance C++ implementation for the R programming language.
+The goal of sacReBLEU is to provide a simple interface to the BLEU score, a metric for evaluating the quality of generated text.
+This package is inspired by the NLTK and sacrebleu implementations of the BLEU score, and is implemented in C++ for the R programming language.
 
 ## Installation
 
@@ -32,12 +32,10 @@ devtools::install_github("LazerLambda/sacReBLEU")
 
 ## Example
 
-This is a basic example which shows you how to solve a common problem:
-
 ```{r example}
 library(sacReBLEU)
-ref_corpus <- list(c(1,2,3,4))
-cand_corpus <- c(1,2,3,5)
-bleu_standard <- bleu_sentence_ids(ref_corpus, cand_corpus)
+cand_corpus <- list("This is good", "This is not good")
+ref_corpus <- list(list("Perfect outcome!", "Excellent!"), list("Not sufficient.", "Horrible."))
+bleu_corpus <- bleu_corpus(ref_corpus, cand_corpus)
 ```
 
17 changes: 7 additions & 10 deletions README.md
@@ -1,16 +1,15 @@
 
 <!-- README.md is generated from README.Rmd. Please edit that file -->
 
-# sac :registered: eBLEU
+# sacReBLEU
 
 <!-- badges: start -->
 <!-- badges: end -->
 
 The goal of sacReBLEU is to provide a simple interface to the BLEU
-score, a metric for evaluating the quality of machine-translated text.
-This package is inspired by the NLTK and sacrebleu implementations of
-the BLEU score, and is based on a high-performance C++ implementation
-for the R programming language.
+score, a metric for evaluating the quality of generated text. This
+package is inspired by the NLTK and sacrebleu implementations of the
+BLEU score, and is implemented in C++ for the R programming language.
 
 ## Installation
 
@@ -24,11 +23,9 @@ devtools::install_github("LazerLambda/sacReBLEU")
 
 ## Example
 
-This is a basic example which shows you how to solve a common problem:
-
 ``` r
 library(sacReBLEU)
-ref_corpus <- list(c(1,2,3,4))
-cand_corpus <- c(1,2,3,5)
-bleu_standard <- bleu_sentence_ids(ref_corpus, cand_corpus)
+cand_corpus <- list("This is good", "This is not good")
+ref_corpus <- list(list("Perfect outcome!", "Excellent!"), list("Not sufficient.", "Horrible."))
+bleu_corpus <- bleu_corpus(ref_corpus, cand_corpus)
 ```
4 changes: 2 additions & 2 deletions man/bleu_corpus.Rd


4 changes: 2 additions & 2 deletions man/validate_arguments.Rd


2 changes: 1 addition & 1 deletion src/Fraction.h
@@ -9,4 +9,4 @@ class Fraction {
 long double get_value();
 };
 
-#endif
+#endif
