- read.transactions gained parameter header to read files with column headers.
- Fixed PROTECT placement in C code discovered by rchk.
- S4 objects now use show instead of print.
- discretizeDF now understands the method "none" which skips discretization.
- discretizeDF now reports which column produces the problem.
- transactions: numeric columns are now discretized during coercion using discretizeDF (with a warning).
- The spurious warning for reaching maxlen in apriori is now removed (reported by Ryan J. Cole).
- Fixed matrix check in function dissimilarity.
- discretize now handles NAs in equal frequency (reported by yarik1988).
- interestMeasure: fixed error when an itemset/rules object of length 0 is provided.
- rules and itemsets gained a method for nitems.
- discretize: the default method is now "frequency" and categories was renamed breaks to be consistent with cut in R-base.
- Added interest measure "importance".
- Added method items for transactions.
- Added discretizeDF to apply discretization to all numeric columns in a data.frame.
- Fixed typo in inspect for tidLists (reported by Carlos Chavarria).
- Fixed bug in %in% for itemMatrix (reported by Henrique Lemos)
- Added "count" (absolute support) as an interest measure.
- itemLabels can now be assigned for rules and itemsets.
- Fixed bug in subset with signature itemMatrix, itemMatrix (reported by rwdvc).
- Fixed pointer punning warning.
- Improved speed for read.transactions with format = "single" significantly.
- Appearance for apriori now guesses the default parameter automatically and does some more checking, making the specification of templates easier.
- Fixed null pointer in error message code.
- head no longer results in an error for empty rule sets (bug reported by cornejom).
- apriori and eclat now return count (absolute support count) in the quality data.frame.
- Added %oin% to find transactions/itemsets that ONLY contain certain items.
- Improved PROTECT placement in C source code.
- itemMeasures for single rules/itemsets now returns a proper data.frame (reported by lordbitin).
- itemMeasures: Added missing parentheses in kappa calculation and fixed equation for least contradiction (reported by Feng Chen).
- apriori: maxtime = 0 disables the time limit.
- is.subset/is.superset now use fast and memory-efficient C code for sparse computation (contributed by Ian Johnson). sparse = TRUE is now the default. Note that the result is now a sparse matrix.
- Added interest measure maxConf.
- is.significant now supports the chi-squared test in addition to Fisher's exact test.
- interest measures Fisher's exact test and chi-squared (using significance = TRUE) can now produce p-values for substitutes (with complements = FALSE).
- Added function DATAFRAME for more control over coercion to data.frame (e.g., use separate columns for LHS and RHS of rules).
- sort now produces an error message for an unknown interest measure.
- abbreviate now works correctly for rules.
- Added registration code for native routines. This requires R 3.3.2.
- apriori now uses a time limit set in the parameter list with maxtime. The default is 5 seconds. Running out of time or reaching maxlen results in a warning. The warning for low absolute support was removed.
- is.redundant now also marks rules with the same confidence as redundant.
- plot for associations and transactions now produces a better error/warning message.
- improved argument check for %pin%. It now warns for multiple patterns (was an error) and gives an error for an empty pattern.
- inspect now consistently prints the index of rules/itemsets using brackets and starting from 1.
- Fixed is.redundant, which returned !is.redundant (reported by brisbia).
- Duplicate items when coercing from list to transactions are now removed with a warning.
- added tail method for associations.
- added/fixed encoding for read.transactions
- Mutual information is now calculated correctly (reported by ddessommes).
- The transaction class lost slot transactionInfo (we use the itemsetInfo slot now). Note that you may have to rebuild some transaction sets if you are using transactionInfo.
- interestMeasure: performance improvement for "improvement" measure.
- sort: speed up sort by always sorting NAs last.
- head: added method head for associations for getting the best rules according to an interest measure faster than sorting all the associations first.
- abbreviate is now an S4 generic with S4 methods.
- Fixed combining item matrices with 0 rows (reported by C. Buchta).
- Fixed itemLabel recoding in is.subset (reported by sjain777).
- NAMESPACE export for %in%
- is.redundant: fixed and performance improvement.
- Groceries: fixed typo in dataset.
- we now require R 3.2.0 so cbind in Matrix works.
- is.maximal is now also available for rules.
- added is.significant for rules (uses Fisher's exact test with correction).
- added is.redundant for rules.
- added support for multi-level analysis (aggregate).
- APparameter: confidence now shows NA for frequent itemsets.
- removed deprecated WRITE and SORT functions.
- subset extraction: added checks, now handles NAs and recycles for logical.
- read.transactions gained the arguments skip and quote. Some defaults for read and write have changed (quotes are now used and row names are omitted by default).
- itemMatrix: coercion from matrix now checks for a 0-1 matrix with a warning.
- APRIORI and ECLAT now report absolute minimum support.
- APRIORI: running out of memory while building rules now results in an error instead of a memory fault.
- aggregate now uses 'by' instead of 'itemLabels' to conform to aggregate in base.
- ruleInduction: bug fix for missing confidence values and better checking (by C. Buchta).
- Added many new interest measures.
- interestMeasure: the formal argument method is now called measure (method is now deprecated).
- Added Mushroom dataset.
- Moved abbreviate from arulesViz to arules.
- fixed undefined behavior for left shift in reclat.c (reported by B. Ripley)
- added support for weighted association rule mining (by C. Buchta):
  - transactions can store weights as a column called "weight" in transactionInfo.
  - support, itemFrequency and itemFrequencyPlot gained a parameter called weighted.
  - weclat extends eclat with transaction weights.
  - hits can be used to calculate weights from transactions.
- We are transitioning to internally use consistently data.frames with the correct number of rows for quality, itemInfo, transactionInfo and itemsetInfo. These data.frames possibly have 0 columns.
- arules now uses testthat (tests are in tests/testthat).
- sort can now sort by several columns (used to break ties) in quality. It also gained an order parameter to return a permutation vector (order) instead.
- inspect gained parameters setStart, setEnd, itemSep, ruleSep and linebreak to control output better.
- read.transactions now ignores empty items (e.g., caused by trailing commas and leading or trailing white spaces).
- labels no longer returns a list but consistent labels for objects (transactions, itemMatrix, rules, itemsets, and tidLists).
- tidLists now has an inspect method, gained coercion from "list", and now has a replacement method for dimnames().
- Coercion from itemMatrix to matrix now results in a logical matrix.
- fixed as(transactions, "data.frame"). The column names now have no prefix (except if transactionInfo contains an item called "items").
- transactions now has its own dimnames function which correctly returns transactionID from transactionInfo as rownames.
- replacement method for dimnames() now checks dimensions.
- item labels are now internally handled as character using stringsAsFactors = FALSE in data.frames and not AsIs with I(character).
- rules can now have no item in the RHS.
- fixed missing row labels for is.subset().
- More work on namespace.
- Fixed tests.
- itemUnion: fixed bug for large amounts of dense rules.
- crossTable gained arguments measure and sort.
- Fixed namespace imports for non-base default packages.
- dissimilarity method "pearson" is now set to 1 (max) for negative correlation. Also added phi correlation coefficient.
- discretize method "cluster" now accepts ..., which is passed on to k-means (e.g., for nstart).
- merge for itemMatrix now checks for conformity.
- as(..., "transactions"): binary attributes are now translated into items only if TRUE.
- Import drop0 from Matrix
- C code: fixed problem in error message generation in apriori and eclat (this fixes the trio library problem under Windows)
- C code: rapriori uses now STRING_ELT to be compatible with TERR (TIBCO)
- C code: removed some unused variables.
- Fixed dependency on XML and pmml
- the interest measure chi-squared now also reports p-values (with significance=TRUE)
- interestMeasure calculation now checks better for missing transactions
- interestMeasure now consistently returns NA if not defined for a certain rule
- discretize gained the parameter ordered.
- itemwise set operations itemUnion, itemSetdiff and itemIntersect added.
- validObject now checks rules more thoroughly
- aggregate removes duplicate items from the lhs
- is.superset/is.subset now make sure that the two arguments conform using recode (number and order of items)
- is.superset/is.subset now return a matrix with appropriate dimnames
- bug fix: fixed dimname bug in as(..., "dgCMatrix") for tidLists
- image: labels are now passed on correctly.
- tidLists has now c().
- bug fix: reuse is now passed on correctly in interestMeasures (bug reported by Ying Leung)
- direct coercion from and to dgCMatrix is no longer supported; use ngCMatrix instead
- coercion from ngCMatrix to itemMatrix and transactions is now possible
- C code: fixed misaligned address on 64-bit systems
- service release
- discretize now handles NAs correctly
- bug fix in is.subset
- transactions: coercion from data.frame now handles logical automatically.
- discretize replaces categorize and offers several additional methods
- Added read and write for PMML.
- 'WRITE' is now deprecated, use 'write' instead
- C code: Added a copy of the C subscript code from R for better performance and compatibility with arulesSequences
- Fixed vignette.
- Internal Changes for dimnames and subsetting
- Added PACKAGE argument to C calls.
- C code: Added C routine symbols to NAMESPACE for arulesSequences
- fixed memory problem in eclat with tidLists=TRUE
- added supportedTransactions()
- is.subset/is.superset can now return a sparse matrix
- added support to categorize continuous variables.
- minor fixes (removed factor in dimnames for itemMatrix, warning in WRITE)
- read.transactions now accepts column names to specify user and item columns (by F. Leisch)
- Initial stable release version
- Alpha and beta versions