http://blog.csdn.net/jiaomeng/article/details/1495500中这么介绍布隆过滤器
Bloom Filter是一种空间效率很高的随机数据结构,它利用位数组很简洁地表示一个集合,并能判断一个元素是否属于这个集合。Bloom Filter的这种高效是有一定代价的:在判断一个元素是否属于某个集合时,有可能会把不属于这个集合的元素误认为属于这个集合(false positive)。因此,Bloom Filter不适合那些“零错误”的应用场合。而在能容忍低错误率的应用场合下,Bloom Filter通过极少的错误换取了存储空间的极大节省。
http://www.360doc.com/content/14/0914/13/7703112_409376352.shtml中讨论了使用布隆过滤器实现java缓存的架构。开源爬虫webmagic,也实现了一个基于它的url队列的构件。u
guava中这样介绍它,A Bloom filter offers an approximate containment test with one-sided error: if it claims that an element is contained in it, this might be in error, but if it claims that an element is not contained in it, then this is definitely true.下面的这个例子使用的是guava-18.0.jar。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 package com; import java.nio.charset.Charset; import com.google.common.hash.BloomFilter; import com.google.common.hash.Funnels; public class Bloom { public static void main(String[] args) { BloomFilter<String> b = BloomFilter.create(Funnels.stringFunnel(Charset.forName( "utf-8" )), 1000 , 0.000001 ); b.put( "121" ); b.put( "122" ); b.put( "123" ); System.out.println(b.mightContain( "12321" )); BloomFilter<String> b1 = BloomFilter.create(Funnels.stringFunnel(Charset.forName( "utf-8" )), 1000 , 0.000001 ); b1.put( "aba" ); b1.put( "abb" ); b1.put( "abc" ); b1.putAll(b); System.out.println(b1.mightContain( "123" )); } }