FILEFLEX PROGRAMMER MANUAL


CHAPTER 16
Full-Text Search and Retrieval

While FileFlex character fields are fixed in width at the time of database design, and can store only up to 256 characters each, memo fields have no such restrictions. You can comfortably place descriptions, notes, reports, letters, and other long information into FileFlex memo fields.

Easy Full-Text Search

FileFlex provides a simple, yet powerful function that allows you to find text anywhere in a memo field. The DBFindMemo function searches the specified memo field in the current database for the string specified. This is something of a brute-force searching mechanism. FileFlex checks every record, scanning each field of each record for the matching search string:

DBFindMemo("NOTES","status on FileFlex") into dbResult

Building a Complex Indexed Full-Text Search

If your database is very large, you'll find that DBFindMemo will take some time to find your record. But you can combine a number of FileFlex features together to create a rather sophisticated fully-indexed full-text search.

Note: Unlike normal indexes, this mechanism doesn't dynamically update the full-text index unless the index is rebuilt. Therefore it's most appropriate for use with CD-ROM or unchanging information. There are two phases to performing a complex indexed full-text search:

Constructing a Full-Text Index

Before you can build your index, you need to think through the indexing architecture. The most important database is the file where your memo field is located. Let's call this the "memo database". When the user enters a search word, we want to move the record pointer in the memo database to the first matching record.

You'll need to create two additional files: a database file (let's describe it as the "word database") containing the words to be searched and an index into the word database file.

Define the word database with two fields: a fixed width field for each word in the memo file, and a fixed width field containing the physical record number of the record in the memo file containing the word.

Alternatively, the you can replace the record number with any other unique key that you can search for, but you'll need an extra index file if you use this technique.

Finally, you'll need a standard index file that uses the fixed word field as the indexed field and the word file as the data file. So, here's what you've got so far:

Now you have to write a utility routine that follows the following algorithm (remember that this is a routine that you as the developer will run, not one used by the end-user of your software):

For every record in the memo database...read the contents of the memo field into a variable

For every word in the variable...write the word and the current memo database record number to a new record in the word database

Make sure the index file is up-to-date with the data in the word database.

At this point, you've got everything you need to do an incredibly fast, indexed full-text search. Let's look at how it'll run in your application to retrieve a value.

Searching Using a Full-Text Index

You get the word to be searched via some form of user input. You then pass that word to a full-text seek routine that does the following:

You can, of course, write the routine to iterate through the list of matching words if you want to build a table of all the records that contain a given word.


Discuss this chapter on the FileFlex Boards.


Copyright © 1992-1998 by David Gewirtz, under license to Component Enterprises, Inc.
Contact: info@component-net.com

Casa de Bender