Products — Spider

 

dtSearch Spider is a feature of dtSearch Desktop, Network, Web and Engine that extends a “local” search to a remote Web site.
Operating through dtSearch Web, the Spider can expand the scope of the searchable database beyond a site's own data to content on a third-party Web site.
Operating through dtSearch Desktop and dtSearch Network, the Spider can expand the scope of the searchable database beyond a local or local area network database to Web site content.
The dtSearch Engine includes a new .NET Spider API, providing access to full dtSearch Spider functionality.
Spider Features
The Spider can index and search publicly available sites, secure HTTPS sites, and password-protected sites, including sites that use forms-based authentication.
A single search request can return fully integrated search results, spanning local and remote content, including:
 
hit-highlighted display of Web-ready file types such as HTML, PDF and XML, including display of images, formatting and links.
conversion of other file types ("Office," Unicode, ZIP, etc.) to HTML for browser display with highlighted hits.
support for dynamically generated content (ASP.NET, MS CMS, SharePoint, etc.) with highlighted hits.
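As a rough illustration of the "convert to HTML with highlighted hits" idea in the list above, the following Python sketch escapes plain text for browser display and wraps matching search terms in bold tags. It is not dtSearch code and does not use the dtSearch API; the function name `highlight_hits` and the choice of `<b>` as the highlight tag are assumptions made for this example.

```python
import html
import re


def highlight_hits(text, terms):
    """Return `text` as HTML-safe markup with whole-word hits wrapped in <b> tags.

    Hypothetical helper for illustration only; matching is case-insensitive.
    """
    if not terms:
        return html.escape(text)
    # Build one alternation pattern from the search terms, escaping regex metacharacters.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    parts = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the text between hits, then emit the hit inside the highlight tag.
        parts.append(html.escape(text[last:match.start()]))
        parts.append("<b>" + html.escape(match.group(0)) + "</b>")
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

Escaping each segment separately keeps characters such as `&` and `<` in the source text from being interpreted as markup, while the highlight tags themselves stay intact.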
The Spider can perform "vertical" searching of pages linked from a URL, as well as "horizontal" crawling of sites linked to a URL.
The Spider can limit indexed data by file size, file number, time on a Web site, etc.
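The crawl limits described above can be pictured with a small, generic breadth-first crawler. This Python sketch is not the dtSearch Spider and does not reflect its API; the `crawl` function, its parameter names, and the caller-supplied `fetch` callable are all hypothetical. It reads link depth as the "vertical" limit and the `follow_offsite` flag as the "horizontal" one, which is an interpretation rather than a documented definition.

```python
import time
from collections import deque
from urllib.parse import urlparse


def crawl(start_url, fetch, max_depth=2, max_files=100,
          max_file_bytes=1_000_000, max_seconds=60.0, follow_offsite=False):
    """Breadth-first crawl that returns the URLs selected for indexing.

    `fetch(url)` is a hypothetical callable returning (body_bytes, links).
    """
    deadline = time.monotonic() + max_seconds      # time budget for the whole crawl
    start_host = urlparse(start_url).netloc
    queue = deque([(start_url, 0)])
    seen = {start_url}
    indexed = []
    while queue and len(indexed) < max_files and time.monotonic() < deadline:
        url, depth = queue.popleft()
        body, links = fetch(url)
        if len(body) <= max_file_bytes:            # skip files over the size limit
            indexed.append(url)
        if depth >= max_depth:                     # depth limit: do not follow further links
            continue
        for link in links:
            if link in seen:
                continue
            # Off-site limit: stay on the starting host unless told otherwise.
            if not follow_offsite and urlparse(link).netloc != start_host:
                continue
            seen.add(link)
            queue.append((link, depth + 1))
    return indexed
```

Passing `fetch` in as a parameter keeps the sketch free of network code, so the limits can be exercised against an in-memory table of pages before being pointed at a real HTTP client.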
A Spider demo operating through dtSearch Web is also available. (The spidered www.dtsearch.com site is hosted on a completely different server from the dtSearch Web search demo.)
 
The dtSearch product line can instantly search terabytes of text across a desktop, network, Internet or Intranet site.
dtSearch products also serve as tools for publishing large document collections, with instant text searching, to Web sites or CD/DVDs.
over two dozen indexed, unindexed, fielded and full-text search options
highlights hits in HTML, XML and PDF, while displaying embedded links, formatting and images
converts other file types — word processor, database, spreadsheet, email and full-text of email attachments, ZIP, Unicode, etc. — to HTML for display with highlighted hits
built-in Spider adds a third-party or other Web site (public, secure content, password accessible, etc.) to your searchable database
Spider supports Web-based content (HTML, PDF, XML, etc.) as well as dynamically generated content (ASP.NET, MS CMS, SharePoint, etc.)
General supported file types
SQL and similar data sources