dtSearch Corp. has over two decades of experience in enterprise and developer text retrieval and document filters. The Smart Choice for Text Retrieval® since 1991, the dtSearch software product line offers “industrial-strength” (PC Magazine) performance in searching, as well as the ability to parse a wide variety of data formats.
Then and Now
The company started research and development in text retrieval in 1988. Incorporated in Virginia in 1991, dtSearch Corp. began marketing the first dtSearch product in the first quarter of that year. The original dtSearch release ran as a desktop-“only” application.
Today, the dtSearch product line can instantly search terabytes of text across a desktop, network, or Internet or Intranet site. dtSearch products also serve as tools for publishing large document collections, with instant text searching, to Web sites or portable media.
The first few releases of the dtSearch product line did not include OEM or developer API access. In 1995, dtSearch Corp. made available its initial developer version for OEM integration. Launched in 1995, the very first Windows product that embedded the dtSearch Engine had an installed base of over one million users.
Today, the dtSearch Engine for Win & .NET and the dtSearch Engine for Linux make available dtSearch document filters and instant searching for a wide range of Internet, Intranet and other commercial applications. The SDKs include native 64-bit and 32-bit C++, Java and .NET APIs.
The two functional components at the core of dtSearch products are: dtSearch’s proprietary document filters and general data support; and dtSearch’s full-text and fielded data searching. These two functional components work together for integrated searching and data display with highlighted hits. Or they can work separately. For example, some developers require the dtSearch document filters “only,” without the need for search functionality.
In addition to the dtSearch Engine, other dtSearch products include: dtSearch Web with Spider for quickly publishing instantly searchable data to an Internet or Intranet site; dtSearch Network with Spider for instantly searching across a network; dtSearch Desktop with Spider for desktop search; and dtSearch Publish for publishing searchable data to portable media.
Who Uses dtSearch
Fortune 100 companies and others with some of the most demanding document search needs in the world rely on dtSearch products. Typical enterprise use of the dtSearch product line includes general “office” document retrieval, searching through email repositories plus attachments, and database search.
dtSearch products can also search web data. High-traffic, content-rich public Internet sites deploy dtSearch products to search online technical documentation. dtSearch products can also run on internal access servers, providing secure Intranet searching.
In the legal and investigative areas, a large number of e-discovery providers and forensic investigators rely on dtSearch. (For these users, the dtSearch site includes a summary FAQ on indexing and searching features of common interest to forensics users.) Additionally, dtSearch products assist with legal research across statutes, regulations, and case law.
The financial and accounting industries similarly use dtSearch products. In fact, 3 out of 4 of the “Big 4” accounting firms have dtSearch multi-user or developer licenses. The dtSearch product line operates in the recruiting space for resume or CV searching. Increasingly, dtSearch products search medical records too.
US Government customers include defense, space and law enforcement agencies. 6 out of 7 of the Fortune 500 largest Aerospace and Defense industry companies are direct dtSearch Engine customers. Other US Government agencies (from tax agencies to court systems), as well as state and local government agencies, are also dtSearch customers.
International governmental organizations likewise employ dtSearch products. (Through its Unicode support, the dtSearch product line supports hundreds of international languages.) The product line has a strong international presence in the private sector too. dtSearch has distributors worldwide, including coverage on six continents.
As for OEM integration, some of the largest IT companies have embedded the dtSearch Engine in commercial applications. The dtSearch site includes hundreds of publicly available developer case studies covering a wide variety of market segments. The dtSearch site also has over a hundred press reviews from general information management publications such as Computerworld, Network World, eWeek and KMWorld, as well as reviews from more specialized programmer and other vertical market publications.
As noted above, the two functional components at the core of the dtSearch product line are: dtSearch’s proprietary document filters and general data support; and dtSearch’s full-text and fielded data searching.
Document filters and supported data types. dtSearch supports a wide range of online and offline data, including display of full-text and metadata in all formats with highlighted hits. In many cases, as specifically noted below, support also covers integrated images along with hit-highlighted text.
- Web-ready static and dynamic content: support covers integrated image and text support in HTML, XML/XSL, PDF, ASP.NET, PHP, SharePoint, etc.
- Other databases: support covers XML, Access, XBASE, CSV, etc.; dtSearch Engine APIs support SQL-type data along with the full-text of BLOB data.
- MS Office formats: support covers integrated browser-ready image and text support in Word (RTF/DOC/DOCX), PowerPoint (PPT/PPTX), Excel (XLS/XLSX), Access (MDB/ACCDB) and OneNote (ONE).
- PDF, other “Office” documents, compression formats: support covers PDF with integrated image and text support, OpenOffice, RAR, ZIP, GZIP/TAR, etc.
- Emails and attachments: support covers integrated browser-ready image and text support — plus support for attachments — in Outlook/Exchange (PST/OST/MSG) and Thunderbird (MBOX/EML).
- Recursively embedded objects: support covers recursively embedded objects and images in supported email types and MS Office formats. For example, the dtSearch document filters would support an email attachment consisting of a ZIP container including both a PDF and an Access database, where the latter also includes an embedded PowerPoint with embedded images.
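The idea of recursively embedded objects can be illustrated with a minimal, generic sketch. It is not the dtSearch API: it uses only Python's standard `zipfile` module, treats ZIP archives as the only container type, and the names `walk_container` and `make_zip` are hypothetical helpers introduced for illustration. The sketch descends into each container it finds and reports every leaf item together with the nested path that leads to it.

```python
import io
import zipfile


def walk_container(data, path=""):
    """Yield (nested_path, payload_bytes) for every leaf item inside `data`.

    Assumption for this sketch: ZIP is the only container format handled.
    Anything that is not a ZIP archive is treated as a leaf document.
    """
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        yield path, data
        return
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            child_path = f"{path}/{info.filename}" if path else info.filename
            # Recurse: the member may itself be another container.
            yield from walk_container(archive.read(info), child_path)


def make_zip(members):
    """Build an in-memory ZIP archive from a {name: bytes} mapping (demo helper)."""
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w") as archive:
        for name, payload in members.items():
            archive.writestr(name, payload)
    return out.getvalue()


if __name__ == "__main__":
    # A stand-in for an attachment that holds a document plus a nested container.
    inner = make_zip({"slides.txt": b"embedded slide text"})
    attachment = make_zip({"report.txt": b"quarterly report", "database.zip": inner})
    for nested_path, payload in walk_container(attachment, "attachment.zip"):
        print(nested_path, len(payload))
```

A real document filter would recognize many more container types (email stores, Office files, and so on), but the recursive structure would be similar: detect a container, enumerate its members, and repeat for each member.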
Developer APIs (covering Windows and Linux, native 64-bit and 32-bit, in C++, Java and .NET through current versions) make available to developers dtSearch’s text parsing, extraction, conversion and hit-highlighting capabilities.
Full-text and fielded data searching. Following are some key text search features.
- Terabyte indexer. dtSearch enterprise and developer products can index over a terabyte of text in a single index, spanning multiple directories, emails and attachments, online data and other databases. The products can create and search any number of indexes. Indexed search time is typically less than a second, even across terabytes of data.
- Concurrent, multithreaded searching. The dtSearch Engine supports fast, concurrent, multithreaded searching. A “success story” from Intel® describes dtSearch’s “perfect score” for high-volume web-based concurrent searching:
Using the Intel® Concurrency Checker, available through the Intel® Software Partner Program, [dtSearch Corp.] then tested a dtSearch Engine sample application to simulate high-volume concurrent searching of a single shared index, similar to what might occur on a high-traffic web site. The dtSearch Engine multi-threaded indexed search demo achieved 100 percent parallel time in the Intel Concurrency Checker test, indicating full optimization for multi-core hardware under that test scenario ... The relationship between Intel and dtSearch stretches back a number of years ... [and] generates synergies that deliver excellent performance and other benefits to end-customers, including internal customers at Intel.
- Federated search and the dtSearch Spider. dtSearch products offer federated searching across any number of directories, emails (with nested attachments), and databases. The dtSearch Spider adds local and remote, static and dynamic online content to a search. The Spider can index sites to any level of depth, with support for public and private or secure online content, including log-ins and forms-based authentication. dtSearch products support integrated relevancy ranking with highlighted hits across both online and offline data repositories.
- Faceted search and other data classification options. The dtSearch Engine supports categorization based on document full-text contents, internal document metadata, database content, or data attributes associated with documents during document indexing. The dtSearch Engine has APIs for other advanced data classification options as well, such as faceted search and full-text and/or fielded data positive and negative variable term weighting.
- 25+ search options and international language support. The dtSearch product line offers over 25 hit-highlighted search options, including special forensics search features. For international languages, dtSearch products support Unicode, including support for right-to-left languages, and special Chinese/Japanese/Korean character options.
- SDKs. The SDKs cover native 64-bit and 32-bit APIs in C++, Java and .NET through current versions. The dtSearch Spider is also available with dtSearch Web, and as a .NET API (including a native 64-bit version). The SDKs also make the dtSearch document filters available separately for developers who need data parsing, conversion and extraction functionality only, without searching.
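For readers unfamiliar with indexed and faceted searching, the following is a minimal, generic sketch of the concepts. It does not use or represent the dtSearch Engine API: the functions `build_index` and `search_with_facets`, the `text` and `type` fields, and the sample documents are all hypothetical. The sketch builds a small in-memory inverted index, looks up a term, and counts the matching documents by a metadata field, which is the basic shape of a facet.

```python
from collections import Counter, defaultdict


def build_index(documents):
    """Map each lowercased word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for word in doc["text"].lower().split():
            index[word].add(doc_id)
    return index


def search_with_facets(index, documents, term, facet_field):
    """Return the sorted ids of matching documents and a count per facet value."""
    hits = sorted(index.get(term.lower(), set()))
    facets = Counter(documents[doc_id][facet_field] for doc_id in hits)
    return hits, facets


if __name__ == "__main__":
    # Hypothetical sample data: full text plus one metadata attribute per document.
    documents = {
        1: {"text": "Merger agreement draft", "type": "docx"},
        2: {"text": "Email about the merger", "type": "email"},
        3: {"text": "Budget spreadsheet", "type": "xlsx"},
        4: {"text": "merger timeline", "type": "email"},
    }
    index = build_index(documents)
    hits, facets = search_with_facets(index, documents, "Merger", "type")
    print(hits, dict(facets))
```

Because the index is built once and only read at search time, lookups stay fast as the collection grows, and read-only lookups of this kind are also straightforward to run concurrently.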
dtSearch Corp. typically releases a new version of the dtSearch product line every 3 to 4 months. Because of the very large developer installed base currently using the dtSearch Engine, the company strives wherever possible to maintain backwards compatibility in the developer API.
Typically, a new release will provide support for new formats and operating systems, and of course add new product line features. The dtSearch website includes detailed release notes spanning many years.
Because of the many file format changes that the dtSearch product line must keep up with, as well as coverage for new operating systems and the like, dtSearch encourages customers to stay current on dtSearch releases. Customers can sign up for automatic notifications of new version releases, and the upgrades page has information on current downloads.
With each new release of the dtSearch product line, the company generally makes available a new beta version for download. The posted release notes contain information on the new beta features as well.