...

Monitoring Indexing

GEODI reports indexing progress as it runs. Note that the progress bar is not linear. GEODI cannot know how long documents that have not yet been indexed will take, so the progress bar is only an estimate based on the indexing time of previous documents.
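The sketch below illustrates why such an estimate can jump. It is only an illustration: it assumes the estimate is the average time per indexed document multiplied by the number of remaining documents. GEODI's actual formula is not described here, and the function name and timings are hypothetical.

```python
def estimate_remaining_seconds(durations, remaining_docs):
    """Estimate remaining time from the average duration of documents indexed so far.

    Hypothetical helper for illustration; not GEODI's actual calculation.
    """
    if not durations:
        return None  # nothing indexed yet, so there is no basis for an estimate
    average = sum(durations) / len(durations)
    return average * remaining_docs


# Three small files, then one large scanned PDF (made-up timings in seconds).
durations = [0.2, 0.2, 0.2]
before = estimate_remaining_seconds(durations, remaining_docs=7)

durations.append(5.4)  # the slow document raises the average
after = estimate_remaining_seconds(durations, remaining_docs=6)

print(f"estimate before the slow document: {before:.1f} s")
print(f"estimate after the slow document:  {after:.1f} s")
```

Although fewer documents remain after the slow one, the estimate grows, which is why the progress bar should be read as a rough guide only.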

Sampling

Sampling is available for both structured and unstructured data. Each data source asks you for its sampling values. Sampling saves a great deal of time in discovery projects. We suggest that you always use sampling for DB discovery. For unstructured data, sampling is also a good starting point. Start in sampled mode and see what the data contains: whether there are any unnecessary file types and whether there are any permission problems.
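As a rough illustration of the idea, the sketch below picks a fixed-size random subset from a larger collection. It assumes simple random sampling. How GEODI selects samples for each data source is not described here, and `sample_items` is a hypothetical name.

```python
import random


def sample_items(items, sample_size, seed=0):
    """Return up to sample_size items chosen at random (reproducible via seed).

    Hypothetical helper for illustration; not part of GEODI.
    """
    items = list(items)
    if sample_size >= len(items):
        return items  # the source is smaller than the sample, so take everything
    return random.Random(seed).sample(items, sample_size)


# Inspect 50 rows out of 1000 instead of scanning the whole table.
rows = [f"row-{i}" for i in range(1000)]
subset = sample_items(rows, sample_size=50)
print(len(subset))
```

A small sample like this is usually enough to see which data types are present before committing to a full scan.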

Filecontent Filtering

Any corpus contains a variety of file types. Some may be outside the project scope, some may be large enough to strain the network, and some may be unwanted altogether.
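The filtering rule described above can be sketched as a simple check on file extension and size. This is an illustration only: the extension list, the size limit, and the function name are hypothetical and do not correspond to GEODI settings.

```python
from pathlib import PurePath

# Hypothetical values chosen for illustration; adjust to the project scope.
EXCLUDED_EXTENSIONS = {".iso", ".mp4", ".exe"}
MAX_SIZE_BYTES = 100 * 1024 * 1024  # 100 MB


def should_index(path, size_bytes):
    """Return True if the file passes both the extension and the size filter."""
    extension = PurePath(path).suffix.lower()
    if extension in EXCLUDED_EXTENSIONS:
        return False  # file type is outside the project scope
    return size_bytes <= MAX_SIZE_BYTES  # skip files too large to transfer


print(should_index("report.pdf", 2_000_000))
print(should_index("backup.ISO", 2_000_000))
print(should_index("archive.pdf", 200 * 1024 * 1024))
```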

...

Indexing is slow

The GEODI discovery engine is among the fastest discovery engines available. Slow indexing may be caused by the machine, the settings, or the environment.

  1. Check the indexing speed setting and make sure it is set high.

  2. Check the engine errors. A source that throws too many errors may slow down indexing.

  3. Another task may be using too many resources.

  4. Too many recognizers may slow down indexing.

  5. A slow disk may slow down indexing. Consider splitting the index and placing part of it on a fast disk, such as an SSD.

  6. Use sampling mode if you need a quicker result.

Too much CPU is used

High CPU usage is expected for an engine like GEODI. However, GEODI never drives the machine into an unresponsive state: it always leaves one core free for other tasks.

  1. High CPU usage may be temporary. Wait and see whether it drops.

  2. If CPU usage stays consistently high, try decreasing the indexing speed.

  3. OCR and FacePro are CPU-intensive. If you are using these options, decrease the indexing speed or wait.
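For readers curious about what "leaving one core free" can look like in practice, here is a minimal sketch of sizing a worker pool. It assumes one worker per core minus one. How GEODI actually reserves a core is not documented here, and `worker_count` is a hypothetical name.

```python
import os


def worker_count(total_cores=None):
    """Return the number of workers that leaves one core free for other tasks.

    Hypothetical helper for illustration; not GEODI's implementation.
    """
    if total_cores is None:
        total_cores = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, total_cores - 1)  # always keep at least one worker


print(worker_count(8))  # on an 8-core machine, 7 workers
print(worker_count(1))  # a single-core machine still gets 1 worker
```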

...