Negative Sampling
This library provides support for negative sampling.
Interface
To use this library, prefix the types and functions below with Negsample.
.
Types
type t
- The representation of the statistics computed for a document.
Methods
val dict : int array array * int -> t
val sampleContext : t * int -> (int * (int list))
val negSample : t * int * int -> (int list)
val vocabSize : t -> int
Method Overview
t dict((int array array) data, int vocabSize).
- Computes statistics on vocabulary frequency. Words as represented as
integers, from
0
tovocabSize - 1
. The input data is treated as an array of paragraphs, where a paragraph is a array of words.
- Computes statistics on vocabulary frequency. Words as represented as
integers, from
sampleContext (ns, numContext)
- Samples the context of a word. Randomly chooses a word (weighted by its frequency), and returns its context (the words appear immediately before and after it).
negSample (ns, pid, numNeg)
- Samples the words not in a specific paragraph, where the paragraph is given
by the ID
pid
.
- Samples the words not in a specific paragraph, where the paragraph is given
by the ID
vocabSize ns
- Returns the vocabulary size.