
IT Search is no longer supported. You may continue to use this code free of charge if you wish. Thank you. Get the IT Search Engine code from this website.


General information:

IT Search is a powerful, customizable site indexing and search engine designed for both typical and large web sites, from as few as 3-15 files to more than 100,000 files and more than 1 GB of content. No database is needed to store the index or the documents. IT Search now also supports PDF files, along with TEXT, HTML, and PHP.

A unique architecture and an advanced hashed-index technique let IT Search look through this many documents in a moment, so you can offer search across the whole web site rather than only "news" or recent articles, as most sites do.

The IT Search Professional package adds the ability to sort search results either by relevance or date modified.

You get not only "simple text search" but logical search as well: with IT Search you can search for documents that INCLUDE, DO NOT INCLUDE, and POSSIBLY INCLUDE given words.
For example, you can find pages that INCLUDE the words "memory" and "fast", DO NOT INCLUDE the word "price", and POSSIBLY INCLUDE the word "customer" (documents that contain this word are displayed first).
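
A minimal sketch of how such a logical query could be evaluated is shown here. It is written in Python for illustration only and is not the IT Search code (which is written in Perl); the function name `logical_search` and the sample documents are assumptions made up for this example.

```python
def logical_search(docs, include=(), exclude=(), optional=()):
    """Return names of matching documents, best matches first.

    docs maps a document name to its text. This is an illustrative
    in-memory scan, not the indexed lookup IT Search performs.
    """
    results = []
    for name, text in docs.items():
        words = set(text.lower().split())
        # Every INCLUDE word must be present.
        if not all(w in words for w in include):
            continue
        # No DO NOT INCLUDE word may be present.
        if any(w in words for w in exclude):
            continue
        # POSSIBLY INCLUDE words only affect ordering.
        bonus = sum(1 for w in optional if w in words)
        results.append((bonus, name))
    # Documents with more optional words come first; ties sort by name.
    results.sort(key=lambda r: (-r[0], r[1]))
    return [name for _, name in results]


# Hypothetical sample documents.
docs = {
    "a.html": "fast memory for every customer",
    "b.html": "fast memory modules",
    "c.html": "fast memory at a low price",
    "d.html": "slow memory",
}
hits = logical_search(
    docs, include=["memory", "fast"], exclude=["price"], optional=["customer"]
)
print(hits)  # ['a.html', 'b.html']
```

Here "c.html" is dropped because it contains "price", "d.html" is dropped because it lacks "fast", and "a.html" is listed first because it contains the optional word "customer".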

IT Search is fast enough to find a single word in less than 0.5 seconds, several words in less than 2 seconds, and a word combination that involves more than 30% of the web site's pages in less than 8 seconds.

The IT Search engine serves keyword search queries and displays result pages in a standard format that includes a word count summary, the page title, a short description (you can customize how many words it contains), the file size, and two links to each document found: a standard one and one that opens in a new window.

IT Search output is highly customizable: you can use various skins for output pages, enable or disable the word count summary and the search time summary, and change the output text color and the number of documents shown.

It also provides a word count summary in the output, which helps make future searches more precise. For example, you can see how many pages contain the word "memory" and skip searching for that word if every page on the web site contains it.
Each search page has 'next' and/or 'previous' links, so if your search results span more than one page, you can look through ALL matches, not only the 'first 10'.
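
The 'next' and 'previous' behaviour can be sketched as follows. This is an illustrative Python example, not the IT Search implementation; the function name `paginate` and the default of 10 results per page are assumptions.

```python
def paginate(matches, page, per_page=10):
    """Return one page of matches plus the page numbers to link to."""
    # Ceiling division; an empty result list still yields one (empty) page.
    total_pages = max(1, -(-len(matches) // per_page))
    # Clamp out-of-range page numbers instead of failing.
    page = min(max(page, 1), total_pages)
    start = (page - 1) * per_page
    return {
        "items": matches[start:start + per_page],
        "previous": page - 1 if page > 1 else None,
        "next": page + 1 if page < total_pages else None,
    }


# 25 hypothetical matches give three pages of 10, 10 and 5 results.
matches = [f"doc{i}.html" for i in range(1, 26)]
first = paginate(matches, 1)
last = paginate(matches, 3)
print(first["previous"], first["next"])  # None 2
print(last["previous"], last["next"])    # 2 None
```

A page with no 'previous' value is the first page, and a page with no 'next' value is the last one, so every match stays reachable.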

LINK: IT Search Screenshots

 

Performance

IT Search performance is far beyond that of many existing search engines. It was tested on small and big web sites containing thousands of pages.
IT Search was highly optimized for low memory usage, fast searching and indexing, and adequate presentation of search output. Its input is also intuitive, so people get pages that exactly match the criteria they typed in.

IT Search indexes and searches entire files, not just the first NN KB of each file.

The index files are very small, so searching takes less disk time and space, letting you store more pages instead of index files and reduce hosting cost. Indexes are organized so that every search involves only 2 files, each less than 1 MB. This distributed architecture results in great performance and low disk usage.

Usage test results:

The following test shows how fast IT Search is.
IT Search was tested on a web site containing about 3 GB of data (3.177 GB), about 200K files (191,822 files), and about 8K folders (8,486 folders).
Hardware: Intel Pentium III Xeon 400 MHz CPU, 128 MB of memory, 20 GB UATA33 hard drives, NT operating system.

Average indexing time: 8.5 hours. (This can be lower if you use more memory for the file cache and update the index less often.)

Index files size: 107 MB.
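
To put these figures in proportion, the short Python calculation below derives the indexing throughput and the index-to-site size ratio from the numbers above. It assumes that 1 GB means 1024 MB here; the page does not say which convention was used.

```python
# Figures taken from the test results above.
files = 191_822
indexing_hours = 8.5
site_mb = 3.177 * 1024   # assumption: binary gigabytes
index_mb = 107

files_per_second = files / (indexing_hours * 3600)
index_ratio = index_mb / site_mb

print(f"{files_per_second:.1f} files/s")          # 6.3 files/s
print(f"index is {index_ratio:.1%} of the site")  # index is 3.3% of the site
```

In other words, the index took roughly one thirtieth of the space of the indexed documents in this test.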

Memory usage:
indexing: 75 MB (customized)
searching: 4 MB

Search time:*
1 uncommon word search : less than 0.5 second
1 common word search : less than 0.5 second
2 uncommon words search : 1 second
2 common words search : 2 seconds
complex uncommon words search (5 words): 1 second
complex common words search (5 words): 8 seconds

* An uncommon word is a word that exists in less than 5% of all files.
* A common word is a word that exists in more than 30% of all files.
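
The two thresholds above can be expressed as a small classification by document share. The Python sketch below is only an illustration of those definitions; the function name `classify`, the "in-between" label for words between 5% and 30%, and the sample documents are assumptions.

```python
def classify(word, docs):
    """Label a word by the share of documents that contain it."""
    containing = sum(1 for text in docs if word in text.lower().split())
    share = containing / len(docs)
    if share < 0.05:
        label = "uncommon"
    elif share > 0.30:
        label = "common"
    else:
        label = "in-between"  # the page defines no name for this range
    return label, containing


# 40 hypothetical documents: 20 contain "memory", 4 "cache", 1 "xeon".
docs = ["memory cache"] * 4 + ["memory"] * 16 + ["xeon"] + ["other"] * 19

print(classify("memory", docs))  # ('common', 20)
print(classify("cache", docs))   # ('in-between', 4)
print(classify("xeon", docs))    # ('uncommon', 1)
```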

 

Extended specifications

IT Search uses hashing to stripe the index data across many small files, which can be accessed quickly. No database is needed to store the index.
IT Search is a combination of an index script, which indexes your site files, and a search engine, which serves search queries.
Both scripts are written in Perl, which allows full customization.
(You need a Perl interpreter installed in order to run IT Search.)
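
The general idea of hashing words into many small index files can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the actual IT Search file layout or its Perl code: the bucket count, the JSON file format, the CRC32 hash, and the names `bucket_path`, `build_index`, and `lookup` are all made up for this example.

```python
import json
import os
import tempfile
import zlib

NUM_BUCKETS = 16  # hypothetical; IT Search lets you choose the number of index files


def bucket_path(index_dir, word):
    """Map a word to the one small index file that may contain it."""
    bucket = zlib.crc32(word.encode("utf-8")) % NUM_BUCKETS
    return os.path.join(index_dir, f"bucket_{bucket:03d}.json")


def build_index(index_dir, docs):
    """Indexer: write each word's document list into its bucket file."""
    buckets = {}
    for name, text in docs.items():
        for word in set(text.lower().split()):
            path = bucket_path(index_dir, word)
            buckets.setdefault(path, {}).setdefault(word, []).append(name)
    for path, postings in buckets.items():
        with open(path, "w", encoding="utf-8") as fh:
            json.dump(postings, fh)


def lookup(index_dir, word):
    """Searcher: open only the bucket file the word hashes to."""
    word = word.lower()
    path = bucket_path(index_dir, word)
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as fh:
        return sorted(json.load(fh).get(word, []))


# Hypothetical sample documents.
docs = {
    "a.html": "fast memory for every customer",
    "b.html": "fast memory modules",
    "c.html": "hard drive prices",
}
with tempfile.TemporaryDirectory() as index_dir:
    build_index(index_dir, docs)
    memory_hits = lookup(index_dir, "memory")
    missing_hits = lookup(index_dir, "database")

print(memory_hits)   # ['a.html', 'b.html']
print(missing_hits)  # []
```

Because the bucket is computed from the word itself, a lookup never has to scan the whole index, which is why the size of the site has little effect on how much data a single search reads.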


You can customize

indexer:

number of index files (affects file system load)

index update interval (affects indexing speed, file system load, and memory load)

path(s) to your documents directory(ies)

word processing rules (advanced)

search engine:

number of results per page

brief summary size

brief summary filters (advanced)

output text colors

output skins (skins are just plain HTML files that are inserted before and after each search page, which is very useful)

search form (a separate plain HTML file)

 

Other search engines

There are many search engines today, each offering its own features and implementation.

The most common approaches to text search are:

Simple text searchers, which open one HTML file after another, search each for the given string, and then present the output in a standard manner.
These searchers are the slowest and are suitable only for a small site or message board.

Searchers that use one big index file. Before a search can be run, they build the index file by opening each HTML document, extracting its words, and appending them to the index file along with the document path and title.
Such searchers are much faster, but only while the index file is smaller than the system's free memory. If it is not, they slow down significantly, trying to read hundreds of megabytes from your hard drive every time a search runs.

Searchers that use a database for storing, manipulating, and searching the index.
They are fast enough and are even capable of multi-phrase search. The main disadvantage is the database size, which is usually twice your site size. So if you have 3 GB of information, your index will take at least 6 GB.

Professional search engines designed for heavy-duty web sites. They have lots of features and a fast engine, and are usually licensed by the number of documents searched.
A typical example is Ultraseek Server ($4,995 for 10,000 documents).

 

IT Search Benefits

A powerful, fast, customizable search engine lets you implement search over your entire web site.

Users can search through thousands of documents in a single click.

Not only "simple text search" but also extended logical queries can be run, which helps people find the documents they want.

The IT Search Professional package can sort search results either by relevance or by date modified, showing highly relevant or new documents first.

Low disk space usage reduces hosting cost.

There is no need to pay for additional database installation and maintenance -- IT Search does not use external databases.

Thanks to its unique architecture and advanced techniques, it is fast enough even for big web sites containing thousands of documents.

Customizable file system and memory load during the indexing phase.

Multi-page search results.

Customizable output, including skins, text color, summarization, word count and more.

