Indexing and Searching Overview
dtSearch products offer instant searching across terabytes of text in a wide range of online and offline data types. Search time (including concurrent search time) is typically less than a second.
Building an Index. dtSearch products can instantly search terabytes of text because dtSearch builds a search index that stores each unique word and its location in the data.
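To make the idea of an index that stores each unique word and its location concrete, here is a minimal sketch of a generic inverted index. It is an illustration only, not dtSearch's actual implementation or API; the function names `build_index` and `search`, and the sample file names, are hypothetical.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each unique word to a list of (doc_id, word_position) pairs."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        # Lowercase so that lookups are case-insensitive in this sketch.
        for position, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word].append((doc_id, position))
    return index


def search(index, word):
    """Look the word up directly instead of scanning every document."""
    return index.get(word.lower(), [])


# Hypothetical sample data.
docs = {
    "a.txt": "The quick brown fox",
    "b.txt": "The lazy dog and the fox",
}
index = build_index(docs)
print(search(index, "fox"))  # [('a.txt', 3), ('b.txt', 5)]
```

The lookup cost depends on the number of hits for the word rather than on the total size of the text, which is why a search against an index can stay fast as the collection grows.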
Updating an Index. dtSearch can update your indexes by adding only new or updated items, removing deleted items, and compressing the index, without affecting searching.
Indexing Tip #1: Build an index. Unindexed searching is almost never more efficient. While indexing is much slower than searching, the time it takes to build an index and then search for multiple search terms (as is typical in forensics and e-discovery) is significantly less than the time it takes to run the same terms as separate unindexed searches. And once the index is in place, if you think of more search terms, additional search time is pretty much instantaneous.
Indexing Tip #2: Watch for encrypted files. After building an index, dtSearch’s “off the shelf” products, for example, create a log of encrypted files that dtSearch could not read. Review this log so you know which files you need to decrypt separately and run through dtSearch again.
Indexing Tip #3: Access emails directly as PST, OST, MSG, etc. files, instead of going through Outlook/MAPI. If you are not searching your own personal email collection (and sometimes even if you are searching your own emails and have a large collection), it is much more efficient to bypass the Outlook/MAPI “middleman” and directly access the data. And don’t forget fuzzy searching to sift through potential typographical errors in emails and attachments!
Indexing Tip #4: Update your indexes by telling dtSearch to add any new or changed documents, remove deleted documents, and compress the updated index. This type of update tends to be much less time-consuming than completely re-indexing. Even better, dtSearch can update its indexes automatically with no effect on ongoing concurrent searching.
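The general shape of such an update (add new or changed documents, remove deleted ones, then tidy the index) can be sketched as follows. This is a generic illustration under simplifying assumptions, not how dtSearch performs updates: the helper names `fingerprint` and `update_index` are hypothetical, change detection uses a content hash, and "compression" is reduced to dropping words that no longer have any hits.

```python
import hashlib
import re


def fingerprint(text):
    """Content hash used here to detect changed documents (an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def update_index(index, seen, docs):
    """Bring index in line with docs; return (added, changed, removed)."""
    added = [d for d in docs if d not in seen]
    changed = [d for d in docs if d in seen and seen[d] != fingerprint(docs[d])]
    removed = [d for d in seen if d not in docs]

    # Drop stale entries for changed and deleted documents.
    stale = set(changed) | set(removed)
    for word in list(index):
        index[word] = [(d, p) for d, p in index[word] if d not in stale]
        if not index[word]:
            del index[word]  # crude stand-in for compressing the index
    for d in removed:
        del seen[d]

    # Index only the new and changed documents, not the whole collection.
    for d in added + changed:
        for pos, word in enumerate(re.findall(r"\w+", docs[d].lower())):
            index.setdefault(word, []).append((d, pos))
        seen[d] = fingerprint(docs[d])
    return added, changed, removed


# Hypothetical sample data: first build, then an incremental update.
index, seen = {}, {}
update_index(index, seen, {"a.txt": "alpha beta", "b.txt": "beta gamma"})
result = update_index(index, seen, {"a.txt": "alpha delta", "c.txt": "gamma"})
print(result)  # (['c.txt'], ['a.txt'], ['b.txt'])
```

Only the documents that differ from the previous run are re-read, which is why this kind of update tends to be cheaper than a full rebuild.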
Indexing Tip #5: Check out general tips on optimizing indexing before you start a large index job. The following is just one example of the kind of thing you need to know.
While search options like fuzzy searching are adjustable at search time, if you build a case- and accent-sensitive index, the only way to change that setting is to rebuild the entire index. With case- and accent-sensitive indexing on, your index size will be much larger, as your index will store Frank, frank and FRANK as separate words, instead of the same word. Worse, with case- and accent-sensitive indexing on, a search for Frank Harvey would miss both frank harvey and FRANK HARVEY.
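The effect of case sensitivity on index size and on search results can be illustrated with a small generic sketch. It does not use dtSearch's API or index format; the names `build_word_index` and `find` and the sample text are hypothetical, and accent handling is left out for brevity.

```python
import re
from collections import defaultdict


def build_word_index(text, case_sensitive):
    """Map each word to its positions, folding case unless case_sensitive."""
    index = defaultdict(list)
    for pos, word in enumerate(re.findall(r"\w+", text)):
        key = word if case_sensitive else word.lower()
        index[key].append(pos)
    return index


def find(index, word, case_sensitive):
    """Look up a word using the same case rule the index was built with."""
    return index.get(word if case_sensitive else word.lower(), [])


# Hypothetical sample text.
text = "Frank called. FRANK emailed. frank replied."
sensitive = build_word_index(text, case_sensitive=True)
insensitive = build_word_index(text, case_sensitive=False)

print(len(sensitive), len(insensitive))    # 6 4
print(find(sensitive, "Frank", True))      # [0]
print(find(insensitive, "Frank", False))   # [0, 2, 4]
```

The case-sensitive index holds more distinct entries and finds only the exact capitalization, and because the keys are fixed when the index is built, switching behavior means building the index again.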