R Implementation of Wordpiece Tokenization [R package wordpiece version 2.1.3]

Apply 'Wordpiece' (<doi:10.48550/arXiv.1609.08144>) tokenization to input text, given an appropriate vocabulary. The 'BERT' (<doi:10.48550/arXiv.1810.04805>) tokenization conventions are used by default.
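
To illustrate what tokenizing "given an appropriate vocabulary" involves, the sketch below splits a single word with the greedy longest-match-first strategy used in the BERT conventions, where non-initial pieces carry a "##" prefix. It is written in base R and is not the package's own implementation: the `toy_vocab` vector and the `tokenize_word()` helper are hypothetical names introduced here, and the word is assumed to be already lowercased and separated from punctuation.

```r
# Toy vocabulary (hypothetical; real vocabularies hold tens of thousands of entries).
toy_vocab <- c("[UNK]", "un", "##aff", "##able", "play", "##ing", "the")

# Greedy longest-match-first split of one word into vocabulary pieces.
# Illustrative helper only, not part of the wordpiece package.
tokenize_word <- function(word, vocab, unk_token = "[UNK]") {
  n <- nchar(word)
  pieces <- character(0)
  start <- 1L
  while (start <= n) {
    end <- n
    match <- NA_character_
    # Try the longest remaining substring first, then shrink from the right.
    while (end >= start) {
      candidate <- substr(word, start, end)
      # Non-initial pieces carry the "##" continuation prefix.
      if (start > 1L) candidate <- paste0("##", candidate)
      if (candidate %in% vocab) {
        match <- candidate
        break
      }
      end <- end - 1L
    }
    # If no piece matches at this position, the whole word becomes the unknown token.
    if (is.na(match)) return(unk_token)
    pieces <- c(pieces, match)
    start <- end + 1L
  }
  pieces
}

tokenize_word("unaffable", toy_vocab)  # "un" "##aff" "##able"
tokenize_word("playing", toy_vocab)    # "play" "##ing"
tokenize_word("xyz", toy_vocab)        # "[UNK]"
```

A word that cannot be fully covered by vocabulary pieces maps to the single unknown token rather than to a partial split, which is why the choice of vocabulary matters for the quality of the output.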

Version:	2.1.3
Depends:	R (≥ 3.3.0)
Imports:	dlr (≥ 1.0.0), fastmatch (≥ 1.1), memoise (≥ 2.0.0), piecemaker (≥ 1.0.0), rlang, stringi (≥ 1.0), wordpiece.data (≥ 1.0.2)
Suggests:	covr, knitr, rmarkdown, testthat (≥ 3.0.0)
Published:	2022-03-03
DOI:	10.32614/CRAN.package.wordpiece
Author:	Jonathan Bratt [aut, cre], Jon Harmon [aut], Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph]
Maintainer:	Jonathan Bratt <jonathan.bratt at macmillan.com>
BugReports:	https://github.com/macmillancontentscience/wordpiece/issues
License:	Apache License (≥ 2)
URL:	https://github.com/macmillancontentscience/wordpiece
NeedsCompilation:	no
Materials:	README, NEWS
CRAN checks:	wordpiece results

Reference manual:	wordpiece.html , wordpiece.pdf
Vignettes:	Using wordpiece (source, R code)

Package source:	wordpiece_2.1.3.tar.gz
Windows binaries:	r-devel: wordpiece_2.1.3.zip, r-release: wordpiece_2.1.3.zip, r-oldrel: wordpiece_2.1.3.zip
macOS binaries:	r-release (arm64): wordpiece_2.1.3.tgz, r-oldrel (arm64): wordpiece_2.1.3.tgz, r-release (x86_64): wordpiece_2.1.3.tgz, r-oldrel (x86_64): wordpiece_2.1.3.tgz
Old sources:	wordpiece archive

Reverse suggests:

Please use the canonical form https://CRAN.R-project.org/package=wordpiece to link to this page.