Softpanorama
(slightly skeptical) Open Source Software Educational Society

May the source be with you, but remember the KISS principle ;-)

Google   


Searching Algorithms

Old News

See Also Recommended Books Recommended Links Lecture Notes and E-books Donald Knuth TAOCP Volume3 Animations
Linear Search Binary Search Hashing Trees AVL trees Fibonacci search Humor Etc

Searching algorithms are closely related to the concept of dictionaries. Dictionaries are data structures that support search, insert, and delete operations. One of the most effective representations is a hash table. Typically, a simple function is applied to the key to determine its place in the dictionary. Another efficient search algorithms on sorted tables is binary search.

If the dictionary is not sorted then heuristic methods of dynamic reorganization of the dictionary are of great value. One of the simplest are cache-based methods: several recently used keys are stored a special data structure that permits fast search (for example is always sorted). For example keeping the last N recently found values at the top of the table (or list) dramatically improved performance. Other cache-based approach are also possible. In the simplest form the cache can be merged with the dictionary:

In the paper On self-organizing sequential search heuristics, Communications of the ACM, v.19 n.2, p.63-67, Feb. 1976 , R.L. Rivest presents a set of methods for dynamically reordering a sequential list containing N records in order to increase search efficiency. The method Ai (for i between 1 and N) performs the following operation each time that a record R has been successfully retrieved: Move R forward i positions in the list, or to the front of the list if it was in a position less than i. The method A1 is called the transposition method, and the method AN-1 is called the move-to-front method.  See also J. H. Hester , D. S. Hirschberg, Self-organizing linear search, ACM Computing Surveys (CSUR), v.17 n.3, p.295-311, Sept. 1985


Notes:
  • This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Some amount of grammar and spelling errors should be expected.
  • The site contain some broken links as it develops like a living tree... Please try to use Google, Open directory, etc. to find a replacement link (see HOWTO search the WEB for details). We would appreciate if you can mail us a correct link.
Google Search
Open directory

Research Index


Old News ;-)

[May 6, 2008] C Minimal Perfect Hashing Library 0.8 by Davi de Castro Reis

About: C Minimal Perfect Hashing Library is a portable LGPL library to create and to work with minimal perfect hashing functions. The library encapsulates the newest and more efficient algorithms available in the literature in an easy-to-use, production-quality, fast API. The library is designed to work with big entries that cannot fit in the main memory. It has been used successfully for constructing minimal perfect hashing functions for sets with billions of keys.

Changes: This version adds the internal memory bdz algorithm and utility functions to (de)serialize minimal perfect hash functions from mmap'ed memory regions. The new bdz algorithm for minimal perfect hashes requires 2.6 bits per key and is the fastest one currently available in the literature.

Joseph Culberson's Binary Search Tree Research Binary Search Trees

This research is all concerned with the development of an analysis and simulations of the effect of mixed deletions and insertions in binary search trees. Previously it was believed that the average depth of a node in a tree subjected to updates was decreased towards an optimal O(log n) when using the usual algorithms (See e.g. Knuth Vol. 3 [8] ) This was supported to some extent by a complete analysis for trees of only three nodes by Jonassen and Knuth [7] in 1978. Eppinger [6] performed some simulations showing that this was likely false, and conjectured, based on the experimental evidence, that the depth grew as O(log^3 n), but offered no analysis or explanation as to why this should occur. Essentially, this problem concerning one of the most widely used data structures had remained open for 20 years.

Both my MSc [1] and Ph.D. [2] focused on this problem. This research indicates that in fact the depth grows as O(n^{1/2}). A proof under certain simplifying assumptions was given in the theoretical analysis in Algorithmica [3], while a set of simulations was presented together with a less formal analysis of the more general case, in the Computer Journal [4] , intended for a wider audience. A complete analysis for the most general case is still open.

More recently P. Evans [5] has demonstrated that asymmetry may be even more dangerous in smaller doses! Algorithms with only occasional asymmetric moves tend to develop trees with larger skews, although the effects take much longer.

    Publications

  1. Joseph Culberson. Updating Binary Trees. MSc Thesis, University of Waterloo Department of Computer Science, 1984. (Available as Waterloo Research Report CS-84-08.) Abstract
  2. Joseph Culberson. The Effect of Asymmetric Deletions on Binary Search Trees. Ph. D. Thesis University of Waterloo Department of Computer Science, May 1986. (Available as Waterloo Research Report CS-86-15.) Abstract
  3. Joseph Culberson J. Ian Munro. Analysis of the standard deletion algorithms in exact fit domain binary search trees. Algorithmica, vol 6, 295-311, 1990. Abstract
  4. Joseph Culberson J. Ian Munro. Explaining the behavior of binary search trees under prolonged updates: A model and simulations. The Computer Journal, vol 32(1), 68-75, February 1989. Abstract
  5. Joseph C. Culberson and Patricia A. Evans Asymmetry in Binary Search Tree Update Algorithms Technical Report TR94-09

    Other References

  6. Jeffery L. Eppinger. An empirical study of insertion and deletion in binary trees. Communications of the ACM vol. 26, September 1983.
  7. Arne T. Jonassen and Donald E. Knuth A trivial algorithm whose analysis isn't. Journal of Computer and System Sciences, 16:301-322, 1978.
  8. D. E. Knuth Sorting and Searching Volume III, The Art of Computer Programming Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1973.

Joseph Culberson

Simulations of dynamic sequential search algorithms

In [3], R.L. Rivest presents a set of methods for dynamically reordering a sequential list containing N records in order to increase search efficiency. The method Ai (for i between 1 and N) performs the following operation each time that a record R has been successfully retrieved: Move R forward i positions in the list, or to the front of the list if it was in a position less than i. The method A1 is called the transposition method, and the method AN-1 is called the move-to-front method.

Recommended Links


In case of broken links please try to use Google search. If you find the page please notify us about new location
Google     

Hashing

Algorithms/data animations

Linear Search

NIST: Linear search

Linear search - encyclopedia article about Linear search.

Review of Linear Search

In the previous labs we have already dealt with linear search, when we talked about lists. Linear Search in Scheme is probably the simplest example of a storage/retrieval datastructure due to the number of primitive operations on lists that we can use. For instance, creation of the datastructure just requires defining a null list.

There are two types of linear search we will examine. The first is search in an unordered list; we have seen this already in the lab about cockatoos and gorillas. All of the storage retrieval operations are almost trivial: insertion is just a single cons of the element on the front of the list. Search involves recurring down the list, each time checking whether the front of the list is the element we need. Deletion is similar to search, but we cons the elements together as we travel down the list until we find the element, at which point we return the cdr of the list. We also did analysis on these algorithms and deduced that the runtime is O(n).

C++ Examples - Linear Search in a Range

On the Competitiveness of Linear Search

C++ Notes: Algorithms: Linear Search

CSC 108H - Lecture Notes
... Linear Search (Sequential). Start with the first name, and continue looking until
x is found. Then print corresponding phone number. ... Linear Search (on a Vector). ...
 

Linear Search: 1.1 Description/Definition
The simplest of these searches is the Linear Search. A Linear Search is simply searching
through a line of objects in order to find a particular item. ...
 

Linear Search
Linear Search. No JDK 1.3 support, applet not shown. 
www.cs.usask.ca/.../csconcepts/2002_7/static/ tutorial/introduction/linearsearchapp/linearsearch.html - 2k - Cached - Similar pages

 

Move to the front (self-organizing) modification

Linear Search  the Transpose Method

Here the heuristic is slightly different from the Move-To-Front method : once found, the transition is swapped with the immediately preceding one, performing an incremental bubble sort at each access.

Of course, these two techniques provide a sensitive speed-up only if the probabilities for each transition to be accessed are not uniform.

These are the seven representations that caught our attention at first but that set of containers will be broadened when a special need is to be satisfied.
What is obvious here is that each of these structures has advantages and drawbacks, whether on space complexity or on time complexity. All these constraints have to be taken in account in order to construct code adapted to particuliar situations.

Linear Search : the Move-To-Front Method

Self-organizing linear search
... Self-organizing linear search. Full text, pdf formatPdf (1.73 MB). ... ABSTRACT Algorithms
that modify the order of linear search lists are surveyed. ...
portal.acm.org/citation.cfm?id=5507 - Similar pages

Linear search in lists

Binary Search

NIST: binary search

Definition: Search a sorted array by repeatedly dividing the search interval in half. Begin with an interval covering the whole array. If the value of the search key is less than the item in the middle of the interval, narrow the interval to the lower half. Otherwise narrow it to the upper half. Repeatedly check until the value is found or the interval is empty.

Generalization (I am a kind of ...)
dichotomic search.

Aggregate parent (I am a part of or used in ...)
binary insertion sort, ideal merge, suffix array.

Aggregate child (... is a part of or used in me.)
divide and conquer.

See also linear search, interpolation search, Fibonaccian search, jump search.

Note: Run time is O(ln n). The search implementation in C uses 0-based indexing.

Author: PEB

Implementation

search (C) n is the highest index of the 0-based array; possible key indexes, k, are low < k <= high in the loop. (Scheme), Worst-case behavior annotated for real time (WOOP/ADA), including bibliography.

en.wikipedia.org/wiki/Binary_search

Topic #9 Binary search trees

Search Trees

Binary Search Trees

Binary Search Trees http://ciips.ee.uwa.edu.au/~morris/Year2/PLDS210/niemann/s_bin.htm

Binary Search Tree http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/binarySearchTree.htm

Trees

AVL trees

(see also Libraries -- there are a several libraries that implementat AVL trees)

Fibonacci search

(see The Fibonacci Numbers and Golden section in Nature for information about Fibinacci, who introduced the decimal number system into Europe and first suggested a problem that lead to Fibonacci numbers)


Copyright © 1996-2008 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author free time. Submit comments This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License(OPL). Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

Standard disclaimer: The statements, views and opinions presented on this web page are those of the author and are not endorsed by, nor do they necessarily reflect, the opinions of the author present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.

Last modified:  November 08, 2008