Features Map – Instantly Search Terabytes

Faceted Search and Data Classification

General Metadata Search Option

  • dtSearch products can index and search metadata in all supported file types and databases: MS Access, Word, PowerPoint, OneNote, other “Office” files, compression formats (RAR, ZIP, GZIP, TAR), XBASE, CSV, HTML, XML/XSL, PDF, PDF portfolio, ASP.NET, CMS, PHP, SharePoint, etc.
  • dtSearch products can also index and search metadata in emails such as Outlook/Exchange (PST/OST/MSG) and Thunderbird (MBOX/EML), including fields in both emails and attachments (even recursively embedded attachments).
  • Detection and support of metadata in all supported data types is automatic.
  • In addition to locating general search terms in metadata, dtSearch products can also limit searching to specific fields: (Author contains John Smith) and (Subject contains turbine generators).
  • For XML data, dtSearch supports hierarchical field structures, including both fields and attributes, enabling highly refined nested field queries.
  • Variable term weighting can also apply to fields: (description:5 contains (apple and pear)) or (author:2 contains smith).
  • Developers, see the Databases and Field Searching section of Selected Articles by Subject for additional options relating to relevancy ranking, sorting search results and metadata display in search results.
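
As a rough illustration of the field-limited and weighted query forms shown above, the following Python sketch assembles such request strings. The helper names `field_clause` and `all_of` are hypothetical and are not part of any dtSearch API; only the query text they produce follows the examples in the bullets above.

```python
def field_clause(field, terms, weight=None):
    """Build one field-limited clause, e.g. (Author contains John Smith).

    If a weight is given, it is appended to the field name as in
    (description:5 contains ...). Hypothetical helper, not a dtSearch call.
    """
    name = f"{field}:{weight}" if weight is not None else field
    return f"({name} contains {terms})"


def all_of(*clauses):
    """Join clauses with 'and' so that every clause must match."""
    return " and ".join(clauses)


query = all_of(
    field_clause("Author", "John Smith"),
    field_clause("Subject", "turbine generators"),
)
weighted = field_clause("description", "(apple and pear)", weight=5)

print(query)
print(weighted)
```

The resulting strings would then be passed to whatever search entry point the application uses; how that call looks depends on the dtSearch product and API binding in question.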

SQL, NoSQL, Disk Images, Data Streams and Other Non-File Data


The dtSearch product line can index SharePoint data in three ways:

Faceted Search

  • Faceted search is an option for classifying data at the user interface level. Using faceted search, the end-user can “drill down” through various topics to home in on the correct topical subset prior to performing a search.
  • The dtSearch Engine offers APIs to implement faceted search.
  • CodeProject also has two articles on faceted search and the dtSearch Engine.
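
To make the “drill down” idea concrete, here is a minimal, generic sketch of facet counting in plain Python. It does not use the dtSearch Engine API; the `DOCS` records, field names, and both function names are assumptions invented for illustration.

```python
from collections import Counter

# Hypothetical document records with metadata fields (illustration only).
DOCS = [
    {"name": "a.docx", "Author": "Smith", "Type": "Word"},
    {"name": "b.pdf", "Author": "Smith", "Type": "PDF"},
    {"name": "c.pdf", "Author": "Jones", "Type": "PDF"},
]


def facet_counts(docs, field):
    """Count how many documents carry each value of the given field."""
    return Counter(d[field] for d in docs if field in d)


def drill_down(docs, field, value):
    """Keep only the documents whose field equals the chosen facet value."""
    return [d for d in docs if d.get(field) == value]


# The user first picks Type = PDF, then sees Author facets for that subset.
pdfs = drill_down(DOCS, "Type", "PDF")
print(facet_counts(DOCS, "Type"))
print(facet_counts(pdfs, "Author"))
```

Each selection narrows the candidate set, and the counts shown for the remaining facets are recomputed over that narrower set before any full-text search runs.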

Additional API Options for Data Classification

Crossing the Full-Text Search / Fielded Data Divide from a Development Perspective
  • While faceted search typically operates at the user interface level, the dtSearch Engine also offers further data classification options that filter data before search results are delivered.
  • A wide range of API filters and objects can categorize documents and other data via full-text document contents, document metadata content, backend database content, or data attributes attached during document indexing.
  • For example, search filter objects work in conjunction with end-user query dialects to allow multiple users with different security classifications to search the same document collection, without having to maintain separate indexes corresponding to each classification level.
  • A single query can include an “exact phrase” full-text end-user search request; second-level Boolean search expressions (such as one or more field or metadata search criteria); and developer-added filtering expressions (such as a filtering expression to filter out documents that do not match a user’s security classification).
  • For more information on database indexing, adding metadata “on the fly” while indexing, API-driven sorting and relevancy-ranking options, and displaying a synopsis or other metadata in search results, see the Databases and Field Searching and the Displaying Search Results topics in Selected Articles by Subject.
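
The layered query described above (an end-user phrase, field criteria, and a developer-added security filter) might be composed along the following lines. This is a sketch under stated assumptions: `build_query` is a hypothetical helper, the `Classification` field name is invented, and the code only builds a request string in the field syntax shown earlier on this page. It does not call the dtSearch Engine or its search filter objects.

```python
def build_query(user_phrase, field_criteria, clearance_levels):
    """Combine three layers into one request string.

    user_phrase      -- exact phrase typed by the end-user
    field_criteria   -- list of (field, value) pairs for field-limited clauses
    clearance_levels -- classification values this user may see (assumed to be
                        stored in a hypothetical 'Classification' field)
    """
    # Layer 1: quote the user's text so it is treated as an exact phrase.
    parts = ['"' + user_phrase.replace('"', "") + '"']
    # Layer 2: second-level field or metadata criteria.
    parts += [f"({field} contains {value})" for field, value in field_criteria]
    # Layer 3: developer-added filter the end-user never sees or edits.
    allowed = " or ".join(clearance_levels)
    parts.append(f"(Classification contains ({allowed}))")
    return " and ".join(parts)


request = build_query(
    "turbine generators",
    [("Author", "John Smith")],
    ["public", "internal"],
)
print(request)
```

Because the security clause is appended by the application rather than typed by the user, users with different clearance levels can share one index while each sees only the documents matching their own levels.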