On the k-Independence Required by Linear Probing and Minwise Independence
Abstract
We show that linear probing requires 5-independent hash functions for expected constant-time performance, matching an upper bound of [Pagh et al. STOC’07]. For (1 + ε)-approximate minwise independence, we show that \(\Omega(\lg \frac{1}{\varepsilon})\)-independent hash functions are required, matching an upper bound of [Indyk, SODA’99]. We also show that the multiply-shift scheme of Dietzfelbinger, most commonly used in practice, fails badly in both applications.
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Alon, N., Nussboim, A.: k-wise independent random graphs. In: Proc. 49th IEEE Symposium on Foundations of Computer Science (FOCS), pp. 813–822 (2008)
Black, J.R., Martel, C.U., Qi, H.: Graph and hashing algorithms for modern architectures: Design and performance. In: Proc. 2nd International Workshop on Algorithm Engineering (WAE), pp. 37–48 (1998)
Broder, A.Z., Charikar, M., Frieze, A.M., Mitzenmacher, M.: Min-wise independent permutations. Journal of Computer and System Sciences 60(3), 630–659 (2000); See also STOC 1998
Broder, A.Z., Glassman, S.C., Manasse, M.S., Zweig, G.: Syntactic clustering of the web. Computer Networks 29, 1157–1166 (1997)
Cohen, E.: Size-estimation framework with applications to transitive closure and reachability. Journal of Computer and System Sciences 55(3), 441–453 (1997); See also STOC 1994
Dietzfelbinger, M.: Universal hashing and k-wise independent random variables via integer arithmetic without primes. In: Proc. 13th Symposium on Theoretical Aspects of Computer Science (STACS), pp. 569–580 (1996)
Heileman, G.L., Luo, W.: How caching affects hashing. In: Proc. 7th Workshop on Algorithm Engineering and Experiments (ALENEX), pp. 141–154 (2005)
Indyk, P.: A small approximately min-wise independent family of hash functions. Journal of Algorithms 38(1), 84–90 (2001); See also SODA 1999
Knuth, D.E.: Notes on open addressing (1963) (Unpublished memorandum), http://citeseer.ist.psu.edu/knuth63notes.html
Pagh, A., Pagh, R., Ružić, M.: Linear probing with constant independence. SIAM Journal on Computing 39(3), 1107–1120 (2009); See also STOC 2007
Pagh, R., Rodler, F.F.: Cuckoo hashing. Journal of Algorithms 51(2), 122–144 (2004); See also ESA 2001
Pǎtraşcu, M., Thorup, M.: The power of simple tabulation-based hashing (2010) (manuscript )
Schmidt, J.P., Siegel, A.: The analysis of closed hashing under limited randomness. In: Proc. 22nd ACM Symposium on Theory of Computing (STOC), pp. 224–234 (1990)
Schmidt, J.P., Siegel, A., Srinivasan, A.: Chernoff-Hoeffding bounds for applications with limited independence. SIAM Journal on Discrete Mathematics 8(2), 223–250 (1995); See also SODA 1993
Siegel, A., Schmidt, J.P.: Closed hashing is computable and optimally randomizable with universal hash functions. Technical Report TR1995-687, Currant Institute (1995)
Thorup, M.: Even strongly universal hashing is pretty fast. In: Proc. 11th ACM/SIAM Symposium on Discrete Algorithms (SODA), pp. 496–497 (2000)
Thorup, M., Zhang, Y.: Tabulation based 4-universal hashing with applications to second moment estimation. In: Proc. 15th ACM/SIAM Symposium on Discrete Algorithms (SODA), pp. 615–624 (2004)
Thorup, M., Zhang, Y.: Tabulation based 5-universal hashing and linear probing. In: Proc. 12th Workshop on Algorithm Engineering and Experiments, ALENEX (2009)
Wegman, M.N., Carter, L.: New classes and applications of hash functions. Journal of Computer and System Sciences 22(3), 265–279 (1981); see also FOCS 1979
Author information
Authors and Affiliations
AT&T Labs,
Mihai Pǎtraşcu & Mikkel Thorup
Authors
- Mihai Pǎtraşcu
You can also search for this author in PubMed Google Scholar
- Mikkel Thorup
You can also search for this author in PubMed Google Scholar
Editor information
Editors and Affiliations
Oxford University Computing Laboratory, Wolfson Building, Parks Road, OX1 3QD, Oxford, UK
Samson Abramsky
Université de Bordeaux (LaBRI) & INRIA, 351, cours de la Libération, 33405, Talence Cedex, France
Cyril Gavoille
INRIA, Centre de Recherche Bordeaux – Sud-Ouest, 351 cours de la Libération, 33405, Talence Cedex, France
Claude Kirchner
Heinz Nixdorf Institute, University of Paderborn, Fürstenallee 11, 33102, Paderborn, Germany
Friedhelm Meyer auf der Heide
University of Patras and RACTI, 26500, Patras, Greece
Paul G. Spirakis
Rights and permissions
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pǎtraşcu, M., Thorup, M. (2010). On the k-Independence Required by Linear Probing and Minwise Independence. In: Abramsky, S., Gavoille, C., Kirchner, C., Meyer auf der Heide, F., Spirakis, P.G. (eds) Automata, Languages and Programming. ICALP 2010. Lecture Notes in Computer Science, vol 6198. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14165-2_60
Download citation
DOI: https://doi.org/10.1007/978-3-642-14165-2_60
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14164-5
Online ISBN: 978-3-642-14165-2
eBook Packages: Computer ScienceComputer Science (R0)