Generates sparse BM25 embeddings for keyword search
Public fields
vocab
Vocabulary
language
Language setting ("en" or "ml")
Methods
Method new()
Create a new SparseEmbedder
Arguments
language
Language behavior ("en" = ASCII-focused, "ml" = Unicode-aware)
Method fit()
Fit the embedder on a corpus
Usage
SparseEmbedder$fit(texts)
Arguments
texts
Character vector of texts
Method embed()
Embed texts to sparse vectors
Usage
SparseEmbedder$embed(texts)
Arguments
texts
Character vector of texts
Returns
Sparse matrix of BM25 scores
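To make the fit/embed split concrete, here is a minimal base-R sketch of BM25 scoring. It is not the package's implementation: the ASCII tokenizer (loosely matching the "en" setting), the helper names bm25_fit() and bm25_embed(), the parameters k1 = 1.2 and b = 0.75, and the IDF variant are all assumptions chosen for illustration. It also returns a dense matrix for simplicity, whereas embed() is documented to return a sparse matrix.

```r
# Illustrative BM25 sketch in base R. Tokenizer, k1, b, and the IDF formula
# are assumptions, not the package's confirmed behavior.

tokenize <- function(text) {
  # Lowercase, split on anything that is not an ASCII letter or digit
  tokens <- strsplit(tolower(text), "[^a-z0-9]+")[[1]]
  tokens[nzchar(tokens)]
}

# "Fit": collect the vocabulary, per-term IDF, and average document length
bm25_fit <- function(texts) {
  docs <- lapply(texts, tokenize)
  vocab <- sort(unique(unlist(docs)))
  # Document frequency: number of documents that contain each term
  df <- vapply(vocab, function(term) {
    sum(vapply(docs, function(d) term %in% d, logical(1)))
  }, numeric(1))
  n_docs <- length(docs)
  list(
    vocab = vocab,
    idf = log(1 + (n_docs - df + 0.5) / (df + 0.5)),
    avg_len = mean(lengths(docs))
  )
}

# "Embed": one row per text, one column per vocabulary term
bm25_embed <- function(model, texts, k1 = 1.2, b = 0.75) {
  scores <- matrix(0, nrow = length(texts), ncol = length(model$vocab),
                   dimnames = list(NULL, model$vocab))
  for (i in seq_along(texts)) {
    all_tokens <- tokenize(texts[i])
    doc_len <- length(all_tokens)
    # Terms never seen during fitting have no column and are dropped
    known <- all_tokens[all_tokens %in% model$vocab]
    if (length(known) == 0) next
    counts <- table(known)
    terms <- names(counts)
    tf <- as.numeric(counts)
    # Length normalization: longer-than-average documents are penalized
    norm <- k1 * (1 - b + b * doc_len / model$avg_len)
    scores[i, terms] <- model$idf[terms] * tf * (k1 + 1) / (tf + norm)
  }
  scores
}

corpus <- c("the cat sat", "the dog barked", "cat and dog")
model <- bm25_fit(corpus)
bm25_embed(model, corpus)
```

Most entries in the resulting matrix are zero because each text uses only a few vocabulary terms, which is why a sparse matrix is the natural return type.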
Method query_terms()
Get term scores for a query
Usage
SparseEmbedder$query_terms(query)
Arguments
query
Query text to score
Returns
Named vector of term scores
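The methods documented so far can be combined as in the following sketch. It assumes the package that exports SparseEmbedder is already attached (its name is not given here), that new() accepts language as a named argument, and that query_terms() takes a single query string. The example texts are made up for illustration.

```r
# Usage sketch. Assumes SparseEmbedder is available in the session; the
# guard makes the snippet a no-op otherwise.
texts <- c(
  "sparse vectors support keyword search",
  "dense vectors capture semantic similarity"
)

if (exists("SparseEmbedder")) {
  embedder <- SparseEmbedder$new(language = "en")
  embedder$fit(texts)                                    # fit on the corpus
  scores <- embedder$embed(texts)                        # sparse matrix of BM25 scores
  term_scores <- embedder$query_terms("keyword search")  # named vector of term scores
  print(dim(scores))
  print(term_scores)
}
```

Calling fit() before embed() or query_terms() is the assumed order here, since BM25 scores depend on corpus statistics.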
Method clone()
The objects of this class are cloneable with this method.
Usage
SparseEmbedder$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.