How does a Search Engine work?

10:33 AM, Posted by Mini, No Comment

When a user accesses a search engine, they are presented with a graphical form on which they specify what they are looking for. When they tell the search engine to start the search (by pressing “Enter” or clicking a specific button), the search engine invokes a program that queries its database (a collection of all the web pages it has access to).

The results are returned to the user as a list of possible URLs. Often, these are ranked by priority or success rate, with higher values meaning the page is more likely to contain the information you requested (what this really means is that the page contains more occurrences of the keywords you searched for than other documents do).
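The ranking idea described above can be sketched in a few lines of Python. This is only a minimal illustration that assumes the score is a plain count of keyword occurrences; the function name `rank_pages`, the sample pages and the example.com URLs are made up for the example, and real search engines use far more elaborate scoring.

```python
import re
from collections import Counter

def rank_pages(pages, query):
    """Rank pages by how often the query keywords occur in their text."""
    keywords = query.lower().split()
    scores = {}
    for url, text in pages.items():
        # Count every word on the page, ignoring case and punctuation.
        counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
        score = sum(counts[k] for k in keywords)
        if score > 0:
            scores[url] = score
    # Highest score first; ties are broken by URL for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical pages standing in for the search engine's database.
pages = {
    "http://example.com/a": "Search engines index web pages. A search engine ranks pages.",
    "http://example.com/b": "Cooking recipes for pasta.",
    "http://example.com/c": "How a search works.",
}
results = rank_pages(pages, "search engine")
for url, score in results:
    print(score, url)
```

Note that this naive count treats "engines" and "engine" as different words, which is one reason a higher score does not always mean a more useful page.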

How does a search engine know where the information is?

There are a number of ways a search engine can know where information is to be found. Firstly, a search engine can list information by keywords or page titles. These keywords or titles (subject categories) can either be submitted by users who provide information on the internet, or be extracted by accessing web pages and reading the page title and keywords from the header of each page. This keyword extraction relies on the appropriate HTML code being present in the header of the web page (it is called a meta-tag). The advantage is that it is quicker to index information and less traffic is involved (only headers are requested from websites; the entire web document is not read).
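To make the meta-tag approach concrete, here is a small sketch using Python's standard `html.parser` module. It assumes the page declares its keywords in a `<meta name="keywords">` tag; the class name `HeaderExtractor`, the helper `extract_header_info` and the sample page are invented for illustration, and the sketch parses a string rather than requesting anything from a website.

```python
from html.parser import HTMLParser

class HeaderExtractor(HTMLParser):
    """Collects the page title and the keywords meta-tag, nothing else."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.keywords = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            # Keywords are conventionally a comma-separated list.
            self.keywords = [k.strip().lower() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def extract_header_info(html):
    parser = HeaderExtractor()
    parser.feed(html)
    return parser.title.strip(), parser.keywords

# Hypothetical page used only for this example.
sample = """<html><head>
<title>How Search Engines Work</title>
<meta name="keywords" content="search engine, index, spider">
</head><body><p>Body text is ignored here.</p></body></html>"""

title, keywords = extract_header_info(sample)
print(title, keywords)
```

Because only the title and the keyword list are kept, the index stays small, but it is only as good as the keywords the page author chose to declare.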

The second method a search engine can use relies upon reading every page it knows about (usually pages are submitted for inclusion by web authors). This technique involves the use of programs called spiders or web robots, which request every page, extract all the words from its content, and store these words in a large database.
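A rough sketch of what such a spider does is shown below. To keep it runnable without network access, the pages come from a made-up dictionary (`fake_web`) and the `fetch` parameter is a stand-in for a real HTTP request; the "large database" is assumed here to be a simple in-memory mapping from each word to the set of URLs that contain it.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects the text between tags (a real spider would also skip scripts and styles)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def build_index(known_urls, fetch):
    """Request every known page and record which words appear on which page."""
    index = defaultdict(set)
    for url in known_urls:
        parser = TextExtractor()
        parser.feed(fetch(url))
        text = " ".join(parser.chunks)
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

# Hypothetical pages standing in for the web; fake_web.get plays the role of fetch.
fake_web = {
    "http://example.com/a": "<html><body><h1>Spiders</h1><p>Spiders read pages.</p></body></html>",
    "http://example.com/b": "<html><body><p>Robots read every page.</p></body></html>",
}
index = build_index(fake_web.keys(), fake_web.get)
print(sorted(index["read"]))
```

Even in this toy version the index holds an entry for every distinct word on every page, which hints at how quickly a content index grows.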

Not all search engines are the same. Some use keyword extraction via meta-tags, while others use keywords obtained by indexing page content. Content indexing is generally the better method because you are more likely to find specific information. However, it has a number of problems. One is the sheer size of the resulting database and the number of pages involved (which means a lot of traffic, and it might take two weeks to fully search all those pages). As the size of the Web continues to grow, this becomes more and more difficult.

Keeping the database up to date is a serious problem. It is common to find that pages returned by a search engine have since been moved or deleted.
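One way to picture the maintenance work is a periodic pass that drops pages which can no longer be reached. The sketch below assumes the index is a mapping from words to sets of URLs; `prune_stale_entries` and `is_still_available` are hypothetical names, and the availability check is faked with a set instead of a real request to each site.

```python
def prune_stale_entries(index, is_still_available):
    """Remove URLs that no longer resolve; drop words left with no pages."""
    pruned = {}
    for word, urls in index.items():
        live = {u for u in urls if is_still_available(u)}
        if live:
            pruned[word] = live
    return pruned

# Hypothetical index and a pretend availability check.
index = {
    "spiders": {"http://example.com/a"},
    "read": {"http://example.com/a", "http://example.com/b"},
}
live_pages = {"http://example.com/b"}  # page /a has been deleted
fresh = prune_stale_entries(index, lambda url: url in live_pages)
print(fresh)
```

In practice every check costs a request to the site, so the fresher the database is kept, the more traffic the search engine generates.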

Object Relational Database Management System and Object Oriented Database Management System

8:13 AM, Posted by Mini, No Comment

Object Relational Database Management System (ORDBMS)
  • ORDBMS provides support for complex data types and powerful query language support through SQL. It also provides good protection of data against programming errors.
  • One of the major assets here is SQL. Although SQL is not as powerful as a programming language, it is nonetheless essentially a fourth-generation language and thus provides excellent protection of data from programming errors.
  • The relational model has a very rich foundation for query optimization, which helps in reducing the time taken to execute a query.
  • These databases make querying as simple as in relational databases, even for complex data types and multimedia data.
  • Although SQL is the strength of these DBMSs, it is also one of their major weaknesses from the performance point of view in in-memory applications.
Object Oriented Database Management System (OODBMS)
  • OODBMS provides good support for complex data types. It offers a high degree of integration of the database with the programming language. It has very good performance but is not as powerful as relational systems.
  • It is based on object-oriented programming languages and is therefore very strong in programming; however, any data type error made by a programmer may affect many users.
  • These databases are still evolving in the area of query optimization, though they have reasonable systems in place.
  • Querying is possible but somewhat difficult to achieve.
  • Some applications that run primarily in RAM and require a large number of database accesses with high performance may find such a DBMS more suitable, because of the rich programming interface it provides. However, such applications may not support very strong query capabilities. A typical example is the databases required for CAD.
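The difference in style between the two approaches can be hinted at with a small Python sketch. It is only an analogy under stated assumptions: SQLite (via the standard `sqlite3` module) is a plain relational engine, not an ORDBMS, and is used here just to show declarative SQL querying; the `Part` and `Assembly` classes are ordinary in-memory objects standing in for the persistent objects an OODBMS would manage. The table, class names and data are invented for the example.

```python
import sqlite3
from dataclasses import dataclass, field

# Relational style: state WHAT is wanted in SQL and let the engine plan the access.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (id INTEGER PRIMARY KEY, name TEXT, assembly TEXT)")
conn.executemany(
    "INSERT INTO part (name, assembly) VALUES (?, ?)",
    [("bolt", "wheel"), ("rim", "wheel"), ("piston", "engine")],
)
sql_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM part WHERE assembly = ? ORDER BY name", ("wheel",)
    )
]
conn.close()

# Object style: follow references between in-memory objects in program code.
@dataclass
class Part:
    name: str

@dataclass
class Assembly:
    name: str
    parts: list = field(default_factory=list)

wheel = Assembly("wheel", [Part("bolt"), Part("rim")])
object_names = sorted(p.name for p in wheel.parts)

print(sql_names, object_names)
```

Both halves produce the same list of part names. The SQL half can answer new questions by changing only the query text, while the object half is fast to navigate in memory but needs new program code for each new kind of question, which mirrors the trade-off described in the lists above.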