Function std.algorithm.setops.largestPartialIntersection
Given a range of sorted
forward ranges
, copies to ror
the elements that are common to most ranges, along with their number
of occurrences. All ranges in tgt
are assumed to be ror
sorted
by less
. Only the most frequent tgt.length
elements are returned.
Prototype
void largestPartialIntersection(alias less, RangeOfRanges, Range)( RangeOfRanges ror, Range tgt, SortOutput sorted = SortOutput.no );
Example
// Figure which number can be found in most arrays of the set of // arrays below. double[][] a = [ [ 1, 4, 7, 8 ], [ 1, 7 ], [ 1, 7, 8], [ 4 ], [ 7 ], ]; auto b = new Tuple!(double, uint)[1]; largestPartialIntersection(a, b); // First member is the item, second is the occurrence count assert(b[0] == tuple(7.0, 4u));
7.0
is the correct answer because it occurs in 4
out of the
5
inputs, more than any other number. The second member of the
resulting tuple is indeed 4
(recording the number of occurrences
of 7.0
). If more of the top-frequent numbers are needed, just
create a larger
range. In the example above, creating tgt
b
with length 2
yields tuple(1.0, 3u)
in the second position.
The function
is useful for
e.g. searching an inverted index for the documents most
likely to contain some terms of interest. The complexity of the search
is Ο(largestPartialIntersection
n * log(tgt.length)
), where n
is the sum of lengths of
all input ranges. This approach is faster than keeping an associative
array of the occurrences and then selecting its top items, and also
requires less memory (
builds its
result directly in largestPartialIntersection
and requires no extra memory).
tgt
Warning
Because
does not allocate
extra memory, it will leave largestPartialIntersection
modified. Namely, ror
assumes ownership of largestPartialIntersection
and
discretionarily swaps and advances elements of it. If you want ror
to preserve its contents after the call, you may want to pass a
duplicate to ror
(and perhaps cache the
duplicate in between calls).
largestPartialIntersection