Indexing is the process in which GEODI crawls all data and runs discovery. GEODI creates a brief summary of the data and uses this summary to answer searches, reports, and all other requests. GEODI does not need the original data unless you want to open it in a viewer or start indexing.
Info |
---|
Continuous Discovery: Once you index data sources, the process repeats automatically for new data (rows, files, emails, etc.). You do not need to intervene; just tell GEODI the recheck period. This is done in the Project Wizard. |
Info |
---|
Index Storage
|
...
Info |
---|
Options: Start with “Index all Content”. After that, you may use any of the other options. If you set up scheduled indexing and have periodic backups, you will not need the other options except for maintenance. |
Info |
---|
Monitoring Indexing |
File Content Filtering
Any corpus contains various file types. Some may not be necessary for the project scope, some may be large enough to disrupt the network, and some may be unwanted altogether.
GEODI comes with default rules that skip some file types and apply some size limitations, collected from the best practices of many Discovery projects. This section documents the rules and how you can modify them.
Filters (%100) | Explanation |
---|---|
IgnoreRules | Ignore rules contain file extensions, directory names, and some patterns. The default list contains *.DLL, *.SYS, “program files”, and similar. Any file that matches an ignore rule is neither indexed nor logged. Settings are in:
If you need to override the defaults, please do so in %appdata% |
KnownFiles | These are the files GEODI has a reader for, such as PDF, DOCX, etc. The full list is in Supported Formats. These files are processed as expected unless an IgnoreRule or ProtectRule applies. An IgnoreRule makes the file type invisible. A ProtectRule may impose a size limitation. |
UnknownFiles | By default, unknown file types are ignored. You may override this setting in the Project Wizard advanced settings. If you use “only name and date”, all unknown extensions will be indexed. You may add any unwanted extensions to the ignore list, but this requires running discovery all over again. |
ProtectRules | These rules protect the system and network against files that are too large. Protect rules apply to known and unknown files. Content is grouped as local and far. There is no limitation for local content, which resides in local folders and network folders. Far means files from GDE, e-mail attachments, and files from web pages. For far content, any file greater than 100 MB and any compressed file greater than 500 MB is indexed by name only. You will know these files exist but not their content. Settings are in: <geodi>\Settings\Engine\ResourceBalancing If you need to override the defaults, please do so in %appdata% |
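
The way the four filters in the table combine can be sketched in a few lines of Python. This is only an illustration of the decision logic described above, not GEODI code or its settings format. The pattern list, the extension sets, the names `FileInfo` and `classify`, and the order in which the rules are checked are all assumptions made for the example. It also assumes that compressed far files use the 500 MB threshold instead of the 100 MB one.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

MB = 1024 * 1024

# Hypothetical values for illustration only; the real defaults live in GEODI's settings.
IGNORE_PATTERNS = ["*.dll", "*.sys", "*program files*"]
KNOWN_EXTENSIONS = {".pdf", ".docx", ".zip"}
COMPRESSED_EXTENSIONS = {".zip"}
FAR_FILE_LIMIT = 100 * MB        # far content, regular files
FAR_COMPRESSED_LIMIT = 500 * MB  # far content, compressed files


@dataclass
class FileInfo:
    path: str
    size: int      # size in bytes
    is_far: bool   # True for GDE, e-mail attachments, and web pages


def extension(path: str) -> str:
    """Return the lowercase extension of the last path component, or ''."""
    name = path.replace("\\", "/").rsplit("/", 1)[-1]
    dot = name.rfind(".")
    return name[dot:].lower() if dot > 0 else ""


def classify(f: FileInfo, unknown_as_name_and_date: bool = False) -> str:
    """Return 'ignored', 'name_only', 'name_and_date', or 'full'."""
    lowered = f.path.lower()

    # IgnoreRules: matching files are skipped entirely.
    if any(fnmatchcase(lowered, pattern) for pattern in IGNORE_PATTERNS):
        return "ignored"

    ext = extension(f.path)
    known = ext in KNOWN_EXTENSIONS

    # UnknownFiles: ignored unless the "only name and date" option is on.
    if not known and not unknown_as_name_and_date:
        return "ignored"

    # ProtectRules: size limits apply to far content only.
    if f.is_far:
        limit = FAR_COMPRESSED_LIMIT if ext in COMPRESSED_EXTENSIONS else FAR_FILE_LIMIT
        if f.size > limit:
            return "name_only"

    return "full" if known else "name_and_date"


if __name__ == "__main__":
    samples = [
        FileInfo(r"C:\Windows\System32\kernel32.dll", 1_000, False),
        FileInfo("/data/report.pdf", 200 * MB, False),
        FileInfo("mail/attachment.pdf", 200 * MB, True),
        FileInfo("mail/archive.zip", 600 * MB, True),
    ]
    for sample in samples:
        print(sample.path, "->", classify(sample))
```

In this sketch a 200 MB PDF in a local folder is still indexed in full, while the same file arriving as an e-mail attachment is recorded by name only, which mirrors the local and far distinction in the ProtectRules row.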