Extext.exe is a tool for extracting text from large binary files, such as undeleted data recovered from a hard disk. It converts the input data to a series of small Unicode text or HTML files containing extracted text. Extext assumes that the input data consists of fragments of files rather than a single complete document and so looks for sequences of data that appear to be text, Unicode text, or UTF-8 text.
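
The idea of looking for sequences that appear to be text can be illustrated with a minimal Python sketch. This is not Extext's actual algorithm: it only finds runs of printable ASCII bytes, and the function name find_ascii_runs and the min_len threshold are assumptions made for illustration.

```python
import re

def find_ascii_runs(data: bytes, min_len: int = 6):
    """Return (offset, text) pairs for runs of printable ASCII bytes.

    min_len is a hypothetical threshold: shorter runs are treated as
    random binary noise rather than text.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]

sample = b"\x00\x01Hello, world\xff\x02ab\x00"
for offset, text in find_ascii_runs(sample):
    print(f"{offset:08x} {text}")
```

A real extractor would also need to recognize Unicode and UTF-8 sequences, which this sketch ignores.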

Input Files
Use Add Files to select one or more binary files to process. Add folder
will add all files in a folder tree. You can also drag and drop files
onto the Extext dialog box. Each file can be up to 2 GB in size. There
is no limit on the total size of the input files.
Output folder for extracted text files
Extracted text files will be written to this folder. Each output file will
be named after the input file, with a number appended to the end.
Input chunk size (KB)
The input chunk size controls how many files will be created from each
input file. For example, if the input is a single 500 MB binary file,
and the input chunk size is 1024 KB (1 MB), then 500 output files will
be created, one for each megabyte of the input.
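
The arithmetic in this example can be checked with a short Python sketch. It assumes that a final partial chunk still produces its own output file, which the description above does not state; count_output_files is a hypothetical helper, not part of Extext.

```python
def count_output_files(input_size_bytes: int, chunk_size_kb: int) -> int:
    """Estimate how many chunks an input file is divided into.

    Assumption: a trailing partial chunk counts as one more file.
    """
    chunk_bytes = chunk_size_kb * 1024
    # Integer ceiling division avoids floating-point rounding on large files.
    return -(-input_size_bytes // chunk_bytes)

# 500 MB input with a 1024 KB (1 MB) chunk size
print(count_output_files(500 * 1024 * 1024, 1024))
```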
Type of output to create
Filtered text can be written either as Unicode Text files or as HTML files.
Both formats can hold Unicode data. HTML files include, in front of each
extracted sequence of text, an HTML comment identifying where in the input
file the data was found, and how it was stored in the original. Example:
<!-- @00072a5c Unicode--> New Zealand
This comment indicates that the Unicode text "New Zealand" was found at hexadecimal byte offset 72a5c in the original data. Because this information is stored in a comment, it is not visible when you open the HTML file in a browser, and it will not affect indexing or searching. To see the HTML comments, open the HTML file in a text editor such as Notepad.
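
If you need to map extracted text back to its position in the input, the comments can be read by a script. The following Python sketch assumes that every comment follows the shape of the example above (an @ sign, a hexadecimal offset, then a storage label); the regular expression and the parse_markers name are illustrative, and real output files may contain other markup that this does not handle.

```python
import re

# Matches comments shaped like: <!-- @00072a5c Unicode-->
COMMENT_RE = re.compile(r"<!--\s*@([0-9a-fA-F]+)\s+(.*?)\s*-->")

def parse_markers(html: str):
    """Return (byte_offset, storage_label, text) for each marker comment."""
    matches = list(COMMENT_RE.finditer(html))
    results = []
    for i, m in enumerate(matches):
        # The text for a marker runs until the next marker or end of input.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(html)
        text = html[m.end():end].strip()
        results.append((int(m.group(1), 16), m.group(2), text))
    return results

for offset, label, text in parse_markers("<!-- @00072a5c Unicode--> New Zealand"):
    print(hex(offset), label, text)
```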
The "No filtering" option lets you use Extext as a simple file splitter. It will break each file into smaller chunks according to the input chunk size, without modifying the data in any way.
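
For comparison, a simple file splitter of this kind can be sketched in a few lines of Python. This is an illustration of the idea rather than Extext's implementation, and the output naming used here (the input name followed by a dot and a four-digit number) is an assumption, since the exact naming format is not specified above.

```python
from pathlib import Path

def split_file(src, out_dir, chunk_size_kb=1024):
    """Split src into consecutive chunks without modifying the data."""
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_bytes = chunk_size_kb * 1024
    written = []
    with src.open("rb") as f:
        index = 1
        while True:
            chunk = f.read(chunk_bytes)
            if not chunk:
                break
            # Hypothetical naming scheme: input name plus a number.
            out_path = out_dir / f"{src.name}.{index:04d}"
            out_path.write_bytes(chunk)
            written.append(out_path)
            index += 1
    return written
```

Concatenating the returned files in order reproduces the original input byte for byte.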
Languages to include
Selecting languages to include in the filtering helps Extext to separate
valid Unicode text from random binary data. For example, if you select
"Arabic", Extext will look for sequences of Unicode characters
in the 0x0600-0x06ff range. Extext will use the language selections to
help it to find Unicode text, but it may still report some text in other
languages that appears to be present in the input data.
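
To illustrate how a character range can separate text from binary noise, the following Python sketch scans data for runs of 16-bit little-endian code units in the 0x0600-0x06ff range. It is a simplified assumption about how such filtering might work, not Extext's actual method: it only checks even byte offsets, does not allow spaces or punctuation inside a run, and find_utf16le_runs and min_chars are hypothetical names.

```python
def find_utf16le_runs(data: bytes, lo: int = 0x0600, hi: int = 0x06FF, min_chars: int = 4):
    """Return (offset, text) for runs of UTF-16LE code units within [lo, hi]."""
    runs = []
    start = None
    chars = []
    limit = len(data) - (len(data) % 2)  # ignore a trailing odd byte
    for off in range(0, limit, 2):
        code_unit = data[off] | (data[off + 1] << 8)
        if lo <= code_unit <= hi:
            if start is None:
                start = off
            chars.append(chr(code_unit))
        else:
            # Run ended: keep it only if it is long enough to look like text.
            if len(chars) >= min_chars:
                runs.append((start, "".join(chars)))
            start = None
            chars = []
    if len(chars) >= min_chars:
        runs.append((start, "".join(chars)))
    return runs
```

Requiring a minimum run length matters because random binary data will occasionally contain one or two code units that happen to fall in the selected range.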