# Blame: Akron, 5f9091c, 2017-03-24 20:37:35 +0100
package Krawfish::Corpus::Distribution;
use strict;
use warnings;

# TODO:
#   distr([1:3], 'author:Goethe', 'author:Schiller')
#
# Go through both queries and buffer them.
# Once the first buffer has a position >= 1 and the
# second query has a position >= 3, release both
# buffers in document order (aka do an or-query)
# in the requested ratio.
#
# In the worst case, this means that one of the
# queries will be completely buffered, while the other
# has only a few entries, rendering most of the buffered
# elements useless.
# However, the strategy described above means that
# a lot of elements may be missing, so it may be useful to
# buffer the query with the lowest frequency first and then
# go through the other one with mild skips.
#
# However, if skips are not available,
# this may be slow ...
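
# Sketch only, not the implementation: one possible reading of the
# TODO above, with plain sorted arrays of doc ids standing in for the
# query iterators. The name _distribute_sketch and its arguments are
# hypothetical. Assumes one integer >= 1 in $ratio per stream.
# Each released batch is in document order; order across batches is
# not guaranteed, and entries that cannot fill a whole batch are dropped.
sub _distribute_sketch {
  my ($ratio, @streams) = @_;   # e.g. [1, 3], \@goethe, \@schiller

  # Refuse input that does not match the assumptions above
  return unless @streams && @$ratio == @streams;
  return if grep { $_ < 1 } @$ratio;

  my @pos = (0) x @streams;     # read position per stream
  my @out;

  while (1) {
    my @batch;
    for my $i (0 .. $#streams) {
      my $need = $ratio->[$i];

      # Stop as soon as one stream cannot fill its share
      return @out if $pos[$i] + $need > @{ $streams[$i] };

      # Buffer this stream's share of the current batch
      push @batch, @{ $streams[$i] }[ $pos[$i] .. $pos[$i] + $need - 1 ];
      $pos[$i] += $need;
    }

    # Release the batch in document order (the "or-query" step)
    push @out, sort { $a <=> $b } @batch;
  }
}

# Usage of the sketch:
#   _distribute_sketch([1, 3], [2, 9, 14], [1, 3, 4, 7, 8, 10, 11]);
#   # returns (1, 2, 3, 4, 7, 8, 9, 10); doc 14 and doc 11 are dropped,
#   # because the second stream cannot fill a third batch.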

1;