With the embedded dtSearch Engine, Odyssey Digital Forensics Keyword Searching System’s smart search crosses language and file format ‘barriers.’
Basis Technology developed its Odyssey Digital Forensics™ application to help investigators locate information in foreign-language documents that they might miss using conventional forensics and search tools. “Many analysts take it for granted that their search tools can be used to locate important keywords in all languages,” says Basis. “Since popular forensics tools do not include linguistic processing modules, this is an incorrect and potentially dangerous assumption. These tools may only be finding a small percentage of documents which contain the specific keywords.”
First, Odyssey enables experts and non-experts alike to capture data from hard disks while documenting the integrity and source of that data. From a captured disk image, Odyssey Digital Forensics then analyzes the file system to extract and recover files, and extracts the text from them. Odyssey then uses the Rosette® Linguistics Platform to preprocess the multilingual text with its text normalization functions.
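To make the idea of text normalization concrete, here is a minimal sketch using only the Python standard library. It is not the Rosette API, whose functions are not described here; the function name `normalize_text` and the choice of steps (Unicode compatibility folding, removal of Arabic vowel marks and tatweel, case folding) are assumptions made for illustration.

```python
import re
import unicodedata

# Arabic short-vowel marks (U+064B-U+0652), superscript alef (U+0670),
# and tatweel (U+0640), a purely typographic elongation character.
ARABIC_MARKS = re.compile("[\u0640\u064B-\u0652\u0670]")


def normalize_text(text: str) -> str:
    """Fold text into one canonical form before indexing (illustrative only)."""
    # NFKC folds compatibility variants, e.g. full-width Latin letters to ASCII.
    text = unicodedata.normalize("NFKC", text)
    # Remove optional Arabic marks so vocalized and plain spellings compare equal.
    text = ARABIC_MARKS.sub("", text)
    # Case-fold so that matching is case-insensitive where case exists.
    return text.casefold()
```

Applying the same function to both the extracted document text and the analyst's search terms is what lets variant spellings of one word meet at a single index entry.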
After text normalization, Odyssey invokes the dtSearch Text Retrieval Engine to build a search index. The analyst can then type search terms into a custom graphical interface designed to optimize retrieval of the international data, including hit highlighting of search terms in international languages. Odyssey supports Middle Eastern languages (Arabic, Persian), East Asian languages (Chinese, Korean, and Japanese), and content in a variety of European languages.
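The index-then-search flow described above can be pictured with a toy inverted index. This sketch does not use or imitate the dtSearch Engine API; the names `build_index`, `search`, and `highlight` are hypothetical, and the `\w+` tokenizer is a simplification that suits space-delimited text but does not segment Chinese or Japanese into words.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"\w+")


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each case-folded token to the set of document names containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for match in TOKEN.finditer(text):
            index[match.group().casefold()].add(doc_id)
    return index


def search(index: dict[str, set[str]], term: str) -> list[str]:
    """Return the sorted names of documents that contain the term."""
    return sorted(index.get(term.casefold(), set()))


def highlight(text: str, term: str) -> str:
    """Wrap each whole-word occurrence of the term in [[...]] markers."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"[[{m.group()}]]", text)


# Hypothetical sample documents, one English and one Arabic.
docs = {
    "a.txt": "Payment sent to Berlin",
    "b.txt": "تقرير عن برلين",
}
index = build_index(docs)
```

With this sample data, `search(index, "berlin")` finds `a.txt` and `search(index, "برلين")` finds `b.txt`, and `highlight` marks the hit inside the matching text.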
“With the embedded dtSearch Engine, Odyssey Digital Forensics Keyword Searching System’s smart search crosses language and file format ‘barriers,’” explains Basis. “Analysts need not know all the languages of the data to perform searches that quickly bring significant files to the fore.”
For example, “a Chinese document can be discovered whether it includes Simplified script used in Mainland China or Traditional script used in Taiwan. Arabic documents can be found with prefixes like 'al-' ignored on keywords, and European verbs can be matched in different conjugation patterns.” The screenshot below shows retrieval, with hit highlighting, of a Chinese document sample using Odyssey.
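Two of the matching behaviors quoted above, script folding for Chinese and ignoring the Arabic 'al-' prefix, can be approximated as follows. This is a rough sketch under stated assumptions, not how Odyssey or Rosette implement them: the three-character Traditional-to-Simplified table is hand-picked for the demonstration, all function names are hypothetical, and the prefix rule is deliberately naive. Verb conjugation matching is not covered.

```python
# Tiny illustrative Traditional -> Simplified table; a real system needs a full one.
TRAD_TO_SIMP = str.maketrans({"國": "国", "學": "学", "語": "语"})

# The Arabic definite article "al-" (alef followed by lam).
ARABIC_ARTICLE = "\u0627\u0644"


def fold_chinese(text: str) -> str:
    """Map the Traditional characters in the table to their Simplified forms."""
    return text.translate(TRAD_TO_SIMP)


def strip_article(token: str) -> str:
    """Drop a leading "al-" when at least two letters remain.

    Naive on purpose: it also strips words whose stem merely begins with
    these two letters, which a linguistic analyzer would avoid.
    """
    if token.startswith(ARABIC_ARTICLE) and len(token) > len(ARABIC_ARTICLE) + 1:
        return token[len(ARABIC_ARTICLE):]
    return token


def matches(keyword: str, token: str) -> bool:
    """Compare a keyword and a document token after folding both the same way."""
    def fold(s: str) -> str:
        return strip_article(fold_chinese(s))
    return fold(keyword) == fold(token)
```

Under these rules, a keyword typed as `中国` matches a document token `中國`, and `كتاب` matches `الكتاب`.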
