Indexing

Indexing is the process in which GEODI crawls all data and runs discovery. GEODI creates a brief summary of the data and uses this summary to answer searches, reports, and all other requests. GEODI does not need the original data again unless you want to open it in a viewer or run indexing again.

Continuous Discovery

Once you index data sources, the process repeats automatically for new data (rows, files, emails, etc.). You do not need to intervene; just tell GEODI the recheck period. You set this in the Project Wizard.

Index Storage

The index needs some storage. It is much smaller than the data, but the exact size cannot be predicted. As a rule of thumb, assume about 1/10 of the data size.
Options like sampling mode or similarity indexing affect the index size.
You should also reserve backup space for the index to keep the service uninterrupted.
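
The sizing guidance above can be turned into a rough planning calculation. The sketch below is only an illustration, not part of GEODI: the function name estimate_index_storage and its parameters are hypothetical, the 0.10 ratio is the 1/10 rule of thumb, and it assumes that one backup copy needs as much space as the index itself.

```python
def estimate_index_storage(data_size_gb, index_ratio=0.10, backup_copies=1):
    """Return rough storage figures (in GB) for planning purposes.

    index_ratio=0.10 is the 1/10 rule of thumb; the real ratio varies
    with the data and with options such as sampling mode or similarity
    indexing. backup_copies=1 assumes a single backup the same size as
    the index (an assumption, not a GEODI requirement).
    """
    if data_size_gb < 0 or index_ratio < 0 or backup_copies < 0:
        raise ValueError("sizes, ratio, and copy count must be non-negative")
    index_gb = data_size_gb * index_ratio
    backup_gb = index_gb * backup_copies
    return {
        "index_gb": index_gb,
        "backup_gb": backup_gb,
        "total_gb": index_gb + backup_gb,
    }


if __name__ == "__main__":
    # Example: planning figures for 2 TB (2000 GB) of source data.
    print(estimate_index_storage(2000))
```

For 2000 GB of source data, this gives roughly 200 GB for the index and the same again for one backup. Treat the result as a starting point and adjust it after the first full indexing run.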

Indexing Time

Indexing time depends heavily on the CPU, memory, disk, and other resources of the server that GEODI runs on.
The network and disk throughput of the data sources is also very important.
Options like OCR or FacePro greatly affect performance.
The GEODI indexing engine is multitasking and processes multiple documents simultaneously. The Indexing Speed parameter in the Settings tab controls this. If the server has no task other than GEODI, we suggest that you set it to maximum. GEODI will use as much of the CPU as possible but leave at least one core for user interaction.
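
The idea of using as many cores as possible while leaving at least one free can be illustrated with a small calculation. This is a generic sketch and not GEODI's actual scheduling logic: the function suggested_worker_count and its parameters are hypothetical, and how the Indexing Speed setting maps to a number of parallel tasks is not described here.

```python
import os


def suggested_worker_count(total_cores=None, reserved_cores=1):
    """Return how many parallel tasks fit while keeping some cores free.

    total_cores defaults to the number of cores reported by the OS.
    reserved_cores=1 mirrors the "leave at least one core" guidance.
    At least one worker is always returned, even on a single-core host.
    """
    if total_cores is None:
        total_cores = os.cpu_count() or 1
    return max(1, total_cores - reserved_cores)


if __name__ == "__main__":
    print("Parallel tasks on this machine:", suggested_worker_count())
```

On a dedicated 8-core server this yields 7 parallel tasks. On a server shared with other services, a larger reserved_cores value plays the same role as choosing a lower Indexing Speed.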

Options

Start with “Index all Content”. After that, you may use any of the other options. If you set up scheduled indexing and have periodic backups, you will not need the other options except for maintenance.

Monitoring Indexing