Commit: Documentation updates
zgornel committed Sep 3, 2019
1 parent 8ee05ed commit 115a7f0
Showing 4 changed files with 17 additions and 13 deletions.
docs/Project.toml (2 changes: 1 addition & 1 deletion)

```diff
@@ -2,4 +2,4 @@
 Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"

 [compat]
-Documenter = "~0.20"
+Documenter = "~0.23"
```
docs/make.jl (2 changes: 1 addition & 1 deletion)

```diff
@@ -9,7 +9,7 @@ push!(LOAD_PATH,"../src/")
 # Make documentation
 makedocs(
     modules = [StringAnalysis],
-    format = :html,
+    format = Documenter.HTML(),
     sitename = " ",
     authors = "Corneliu Cofaru, 0x0α Research",
     clean = true,
```
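For context, newer Documenter releases take a writer object such as `Documenter.HTML()` for the `format` keyword instead of the old `:html` symbol, which is what this change tracks. A minimal sketch of a complete `docs/make.jl` under Documenter `~0.23` follows; only the `modules`, `format`, `sitename`, `authors` and `clean` keywords come from the diff, everything else is illustrative:

```julia
# Illustrative sketch of docs/make.jl; the keyword arguments shown in the
# diff are taken from the commit, the imports are assumed.
using Documenter
using StringAnalysis

makedocs(
    modules = [StringAnalysis],
    format = Documenter.HTML(),  # writer object replaces the deprecated `format = :html`
    sitename = " ",
    authors = "Corneliu Cofaru, 0x0α Research",
    clean = true,
)
```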
docs/src/examples.md (16 changes: 10 additions & 6 deletions)

````diff
@@ -94,16 +94,20 @@ crps.lexicon
 update_inverse_index!(crps)
 crps.inverse_index
 ```
-The ngram complexity can be specified as well:
+It is possible to explicitly create the lexicon and inverse index:
 ```@repl index
-update_inverse_index!(crps, 2)
-crps.inverse_index
+update_inverse_index!(crps) # default ngram complexity is 1
+create_lexicon(Corpus([sd]))
+create_inverse_index(Corpus([sd]))
+```
+Ngram complexity can be specified as a second parameter:
+```@repl index
+create_lexicon(Corpus([sd]), 2)
 ```

 !!! note

-    From version `v0.3.9`, the lexicon and inverse index can be created with the `create_lexicon` and
-    `create_inverse_index` functions respectively. Both functions support specifying the ngram complexity.
+    The `create_lexicon` and `create_inverse_index` functions are available from `v0.3.9`.
+    Both functions support specifying the ngram complexity.

 ## Preprocessing
 The text preprocessing mainly consists of the `prepare` and `prepare!` functions and preprocessing flags which start mostly with `strip_` except for `stem_words`. The preprocessing function `prepare` works on `AbstractDocument`, `Corpus` and `AbstractString` types, returning new objects; `prepare!` works only on `AbstractDocument`s and `Corpus` as strings are immutable.
````
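The new `create_lexicon` and `create_inverse_index` calls documented above can be exercised roughly as follows; the `StringDocument` contents and variable names are assumptions standing in for the `sd` and `crps` objects built earlier in examples.md, and only the two function signatures shown in the diff are taken from the commit:

```julia
using StringAnalysis

# Build a one-document corpus, as in the examples.
sd = StringDocument("this is a sample text")
crps = Corpus([sd])

lex  = create_lexicon(crps)     # default ngram complexity is 1
lex2 = create_lexicon(crps, 2)  # ngram complexity given as a second parameter
inv  = create_inverse_index(crps)
```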
docs/src/index.md (10 changes: 5 additions & 5 deletions)

```diff
@@ -9,13 +9,13 @@ StringAnalysis is a package for working with strings and text. It is a hard-fork

 ## What is new?
 This package brings several changes over `TextAnalysis.jl`:
 - Added the [Okapi BM25](https://en.wikipedia.org/wiki/Okapi_BM25) statistic
-- Added dimensionality reduction with [sparse random projections](https://en.wikipedia.org/wiki/Random_projection)
+- Added dimensionality reduction with [sparse random projections (RP)](https://en.wikipedia.org/wiki/Random_projection)
 - Added co-occurence matrix
-- Improved latent semantic analysis
+- Improved latent semantic analysis (LSA)
 - Many objects are hashable and can be compared
 - Re-factored text preprocessing API
-- DTM and similar have documents as columns
+- DTM and similar have documents as columns (faster data representation model)
 - Parametrized many of the objects (`DocumentTermMatrix`, `AbstractDocument`s)
 - n-gram complexity support for DTMs, DTVs, DTV iterators, LSA, random projections, lexicon and inverse index
 - Element type specification for `each_dtv`, `each_hash_dtv`
 - Extended `DocumentMetadata` fields
 - Simpler API i.e. less exported methods
```
