About dtSearch Corp.
dtSearch Corp. has over two decades of experience in enterprise and developer text retrieval and document filters. The Smart Choice for Text Retrieval® since 1991, the dtSearch software product line offers “industrial-strength” (PC Magazine) performance in searching, as well as the ability to parse a wide variety of data formats.
Then and Now
The company started research and development in text retrieval in 1988. Incorporated in Virginia in 1991, dtSearch Corp. began marketing the first dtSearch product in the first quarter of that year. The original dtSearch release ran as a desktop-“only” application.
Today, the dtSearch product line can instantly search terabytes of text across a desktop, network, or Internet or Intranet site. dtSearch products also serve as tools for publishing large document collections, with instant text searching, to Web sites or portable media.
The first few releases of the dtSearch product line did not include OEM or developer API access. In 1995, dtSearch Corp. made available its initial developer version for OEM integration. Launched in 1995, the very first Windows product that embedded the dtSearch Engine had an installed base of over one million users.
Today, the dtSearch Engine is available for multiple platforms and makes available dtSearch document filters and instant searching for a wide range of Internet, Intranet and other commercial applications. The SDKs include native 64-bit and 32-bit C++, Java and .NET APIs.
The two functional components at the core of dtSearch products are: dtSearch’s proprietary document filters and general data support; and dtSearch’s full-text and metadata searching. These two functional components work together for integrated searching and data display with highlighted hits. Or they can work separately. For example, some developers require the dtSearch document filters “only,” without the need for search functionality.
In addition to the dtSearch Engine, other dtSearch products include: dtSearch Web with Spider for quickly publishing instantly searchable data to an Internet or Intranet site; dtSearch Network with Spider for instantly searching across a network; dtSearch Desktop with Spider for desktop search; and dtSearch Publish for publishing searchable data to portable media.
Who Uses dtSearch
Fortune 100 companies and others with some of the most demanding document search needs in the world rely on dtSearch products. Typical enterprise use of the dtSearch product line includes general “office” document retrieval, searching through email repositories plus attachments, and database search.
dtSearch products can also search web data. High-traffic, content-rich public Internet sites deploy dtSearch products to search online technical documentation. dtSearch products can also run on internal access servers, providing secure Intranet searching.
In the legal and investigative areas, a large number of e-discovery providers and forensic investigators rely on dtSearch. (For these users, the dtSearch site includes a summary FAQ on indexing and searching features of common interest to forensics users.) Additionally, dtSearch products assist with legal research across statutes, regulations, and case law.
The financial and accounting industries similarly use dtSearch products. In fact, 3 out of 4 of the “Big 4” accounting firms have dtSearch multi-user or developer licenses. The dtSearch product line operates in the recruiting space for resume or CV searching. Increasingly, dtSearch products search medical records too.
US Government customers include defense, space and law enforcement agencies. 6 out of 7 of the Fortune 500 largest Aerospace and Defense industry companies are direct dtSearch Engine customers. Other US Government agencies (from tax agencies to court systems), as well as state and local government agencies, are also dtSearch customers.
International governmental organizations likewise employ dtSearch products. (Through its Unicode support, the dtSearch product line supports hundreds of international languages.) The product line has a strong international presence in the private sector too. dtSearch has distributors worldwide, including coverage on six continents.
As for OEM integration, some of the largest IT companies have embedded the dtSearch Engine in commercial applications. The dtSearch site includes hundreds of publicly available developer case studies covering a wide variety of market segments. The dtSearch site also has over a hundred press reviews from general information management publications such as Computerworld, Network World, eWeek and KMWorld, as well as reviews from more specialized programmer and other vertical market publications.
dtSearch products can index over a terabyte of text in a single index, spanning multiple directories, emails and attachments, online data and other databases. The products can create and search any number of indexes. Indexed search time is typically less than a second, even across terabytes of data.
Concurrent, Multithreaded Searching.
dtSearch developer products provide efficient multithreaded searching, with no limit on the number of concurrent search threads.
For online search, the products can run in a completely stateless manner, making it very easy to scale. A success story from Intel® describes dtSearch’s “perfect score” for high-volume web-based concurrent searching:
The dtSearch Engine multi-threaded indexed search demo achieved 100 percent parallel time in the Intel Concurrency Checker test, indicating full optimization for multi-core hardware ... The relationship between Intel and dtSearch stretches back a number of years ... [and] generates synergies that deliver excellent performance and other benefits to end-customers, including internal customers at Intel. (View a PDF of this Intel® Software Partner Program Success Story.)
Federated Searching and the dtSearch Spider.
dtSearch products provide federated search across any number of directories, emails (with nested attachments), and databases. The dtSearch Spider adds local and remote online content to a search. The Spider can index sites to any level of depth, with support for public and secure online content, including log-ins and forms-based authentication. dtSearch products provide integrated relevancy ranking with highlighted hits across both online and offline data.
Faceted Search and Other Data Classification Options.
The dtSearch Engine developer APIs support categorization based on document full-text contents, internal document metadata, database content, or data attributes associated with documents during document indexing. The dtSearch Engine also has APIs for other advanced data classification options, such as faceted search and full-text and/or metadata-based positive and negative variable term weighting.
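As a rough illustration of the two ideas named above, the following Python sketch scores documents with positive and negative term weights and then counts facet values over the hits. It is a generic, minimal example and does not use or represent the dtSearch Engine APIs; the sample documents, the `search` and `facet_counts` functions, and the weight values are all hypothetical.

```python
from collections import Counter

# Hypothetical sample data: each document has full text plus metadata fields.
documents = [
    {"id": 1, "text": "merger contract draft", "meta": {"type": "docx", "author": "lee"}},
    {"id": 2, "text": "contract invoice", "meta": {"type": "pdf", "author": "kim"}},
    {"id": 3, "text": "holiday party invoice", "meta": {"type": "pdf", "author": "lee"}},
]


def search(docs, weights):
    """Return (score, doc) pairs for docs containing any weighted term, best first."""
    hits = []
    for doc in docs:
        words = set(doc["text"].split())
        matched = [term for term in weights if term in words]
        if matched:
            # Positive weights raise a document's rank; negative weights lower it.
            hits.append((sum(weights[term] for term in matched), doc))
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return hits


def facet_counts(hits, field):
    """Count how many hits carry each value of one metadata field."""
    return Counter(doc["meta"][field] for _, doc in hits)


hits = search(documents, {"contract": 2.0, "invoice": 1.0, "party": -1.5})
ranked_ids = [doc["id"] for _, doc in hits]
type_facets = facet_counts(hits, "type")
```

Here `ranked_ids` gives the hit order after weighting, and `type_facets` gives the per-value counts a faceted interface might show next to the results.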
25+ Search Options and International Languages.
dtSearch products have over 25 search types, including forensics search options. For international languages, the products support Unicode, including right-to-left languages and special Chinese/Japanese/Korean character options.
The dtSearch Engine SDKs include native 64-bit and 32-bit APIs for .NET (through current versions), C++ and Java. For Azure, the dtSearch Engine can run online with searching via Microsoft’s RemoteApp, giving search the “look and feel” of a native application under Windows, Android, iOS or OS X.
Document Filters and Supported Data Types
dtSearch’s proprietary document filters support parsing, indexing, searching and display with highlighted hits and metadata across a broad range of data types.
- Web-ready content: supports integrated images and text in HTML, XML/XSL, PDF, ASP.NET, CMS, PHP, SharePoint, etc.
- Other databases: supports XML, Access, XBASE, CSV, etc.; dtSearch Engine APIs support NoSQL and SQL-type databases, along with the full-text of BLOB data.
- MS Office formats: supports integrated browser-ready images and text in Word (RTF/DOC/DOCX), PowerPoint (PPT/PPTX), Excel (XLS/XLSX), Access (MDB/ACCDB) and OneNote (ONE).
- Other “Office” formats, PDF, compression formats: supports other “Office” suite formats; compression formats like RAR, ZIP, GZIP and TAR; PDF, PDF Portfolio, and many encrypted PDFs.
- Emails and attachments: supports integrated browser-ready images, text and attachments in Outlook/Exchange (PST/OST/MSG) and Thunderbird (MBOX/EML).
- Recursively embedded objects: supports recursively embedded objects and images in supported email types and MS Office formats. For example, the dtSearch document filters would support an email attachment consisting of a ZIP container including both a PDF and an Access database, where the latter also includes an embedded PowerPoint with embedded images.
Document Filter APIs.
All developer APIs (C++, Java and .NET through current versions) make available to developers dtSearch’s text parsing, extraction, conversion and hit-highlighting.
- An “object extraction” API lets developers navigate through the structure of each embedded object as a hierarchy, and optionally extract each object, such as an image in an MS Word file embedded in an MS Access database, compressed and attached to an email.
- General dtSearch Engine licenses include the document filters along with dtSearch indexing and searching functionality.
- The document filters are also available for separate license for developers requiring text parsing, extraction and conversion “only,” without search.
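The idea of navigating recursively embedded objects as a hierarchy can be sketched in a few lines of Python. This is only a generic illustration using nested ZIP containers and the standard `zipfile` module; it is not the dtSearch “object extraction” API, and the `walk_container` and `make_zip` helpers and the sample file names are hypothetical.

```python
import io
import zipfile


def walk_container(data, path):
    """Yield (path, size) for every leaf object, descending into nested ZIPs."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            child_path = f"{path}/{info.filename}"
            child = archive.read(info)
            # Anything that itself parses as a ZIP is treated as a container.
            if zipfile.is_zipfile(io.BytesIO(child)):
                yield from walk_container(child, child_path)
            else:
                yield child_path, len(child)


def make_zip(members):
    """Build an in-memory ZIP from a {name: bytes} mapping (test helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, payload in members.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


# A container holding a plain file and a second, nested container.
inner = make_zip({"slides.txt": b"text of an embedded presentation"})
outer = make_zip({"report.txt": b"body text of the attached report",
                  "database.zip": inner})
found = list(walk_container(outer, "attachment.zip"))
```

Each entry in `found` pairs the full path through the container hierarchy with the size of the extracted object. A real document filter would also need to recognize many non-ZIP embedding mechanisms, which this sketch does not attempt.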
dtSearch Corp. typically releases a new version of the dtSearch product line every 3 to 4 months. Because of the very large developer installed base currently using the dtSearch Engine, the company strives wherever possible to maintain backwards compatibility in the developer API.
Typically, a new release will provide support for new formats and operating systems, and of course add new product line features. The dtSearch website includes detailed release notes spanning many years.
Because of the many file format changes that the dtSearch product line must keep up with, as well as coverage for new operating systems and the like, dtSearch encourages customers to stay current on dtSearch releases. Please sign up for automatic notifications of new version releases. And see upgrades for information on current downloads.
With each new release of the dtSearch product line, the company generally makes available a new beta version for download. The posted release notes contain information on the new beta features as well.