GEODI Classifier is a set of tools for manual and automatic classification. This page is about configuring central management of the tools.

Classifier Module Configuration

Use “Activate Classification Tools” on the last page of the Project Wizard to open the management dialog. The classes, rules, header, footer, and watermark settings are all defined here. Ready-made project templates are available; we suggest starting from one so that you have ready-to-use settings.

(info) As a best practice, create a project for classification that is separate from the projects used for search and/or discovery. The classification project does not need to contain data.

[Image: image-20241111-113909.png]

Classes

In this tab, you define the class labels, their rules, and their appearance in plugins (Microsoft Office and others). Create the classes in order of importance, with the most important at the top.

  • ID: You must give a unique value for the class.

  • Name: The text of the class that you want to appear in the interfaces.

  • Category: You can create a classification tree by giving different categories. It is useful if you have many classes. By default, it can be empty.

  • Description: With the description, you write down what you need to know about the class. This text will be a guideline for the users of the plugins. Descriptions will be displayed in the classification interface of the users.

  • Query: Documents that match the query automatically take the corresponding class. GEODI Query Rules are valid. There are predefined queries (predefined:ClassPII, predefined:ClassSecret,...) that let you reuse the same queries in classes, panels, and other places. This is a good way to simplify configuration and management.

  • Tags: Tags are the key and value pairs written into MS Word, PDF, or ADS files. If you plan to switch to GEODI from another classification solution, GEODI adapts to the existing schema, so the transition will be seamless. Please check the GEODI Classifier Label/Tag Definitions page.

  • Header, Footer, and Watermark values are valid only for MS Office software. If the user chooses this class, the text will be embedded into the document.

    • You may use %User% to have a user name in the values.

    • You can use \r or \n to move to the next line for multiple lines of text. For example, “Personally\nIdentifiable\nInformation”.

  • Auto classification chooses the last class if the content does not match any other query.
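The ordering and fallback behaviour described above can be illustrated with a small sketch. It is not GEODI's implementation or API: the `ClassDef` record is hypothetical, a plain substring test stands in for a GEODI query, and "the first matching class from the top wins" is an assumption drawn from the advice to place the most important class first.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ClassDef:
    """Hypothetical stand-in for one row of the Classes tab."""
    id: str
    name: str
    matches: Callable[[str], bool]  # placeholder for a GEODI query


def auto_classify(text: str, classes: List[ClassDef]) -> ClassDef:
    """Return the first class, top to bottom, whose query matches.

    If none of the earlier queries match, the last class in the list
    is used as the fallback.
    """
    if not classes:
        raise ValueError("at least one class must be defined")
    for cls in classes[:-1]:
        if cls.matches(text):
            return cls
    return classes[-1]


# Illustrative class list, most important first (assumed ordering).
classes = [
    ClassDef("secret", "Secret", lambda t: "password" in t.lower()),
    ClassDef("pii", "Personal Data", lambda t: "passport" in t.lower()),
    ClassDef("public", "Public", lambda t: False),  # fallback class
]

print(auto_classify("Passport number attached", classes).name)
print(auto_classify("Lunch menu", classes).name)
```

In this sketch the last class never needs a query of its own, which mirrors the role of the final class as the default.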

Behaviour

Default classification rules are set here. Changes to these rules reach all clients automatically, typically in 10 minutes or a little more. You can override the rules by user, group, IP, or classification tool in the “Customize” tab.

...

[Image: image-20241111-113934.png]

  1. Auto Classify Behaviour: Determines how auto-classification works.

    1. Use as a suggestion → The user may accept or ignore the automatically determined class.

    2. Disabled → Automatic classification is off.

    3. Do not select the class under auto → Users cannot choose a class lower than the automatically determined one.

  2. Ask Classes on Save: MS Office add-ins open a dialog to choose a class when saving, closing, or printing. This option determines when the dialog opens.

    1. Show when necessary → If auto-classification is possible or the document already has a class, the dialog does not open.

    2. Always → The dialog opens after each change.

    3. Never - Manual Only → The dialog does not open automatically; the user can open it manually.

  3. Use OS Meta: Enables ADS for documents other than Microsoft Office and PDF. ADS labels are created only by the Shell/Desktop Classifier.

  4. Allow Class Lowering: Determines if a user can choose a lesser class for an already classified document. This is an important setting and you may override it by user, group, or IP.

  5. Classifying internal emails: When checked, e-mails whose sender and recipient domains are the same are also classified.

  6. Log Format: By default, all classification activities are logged on the server side. Here you choose the log format and medium, or deactivate logging. The logs are in the same location as other GEODI logs.

  7. Do not use images for Header/Footer: By default, headers and footers are used as images in Excel. If this option is selected, headers and footers will be used as text in Excel.

  8. Do not use images for Watermark: By default, the watermark is used as an image in Excel. If the 'Do not use images for Watermark' setting is selected, the watermark will be disabled in Excel.
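How “Do not select the class under auto” and “Allow Class Lowering” restrict a user's choice can be sketched as below. This is a minimal illustration, not GEODI code: the function and parameter names are hypothetical, and "lower" is assumed to mean "further down the class list", since classes are listed with the most important at the top.

```python
from typing import List, Optional


def rank(order: List[str], class_id: str) -> int:
    """Position in the class list; 0 is the most important class."""
    return order.index(class_id)


def is_choice_allowed(
    order: List[str],
    chosen: str,
    current: Optional[str] = None,
    auto: Optional[str] = None,
    allow_lowering: bool = False,
    block_under_auto: bool = False,
) -> bool:
    """Check a manual class choice against the two behaviour settings.

    current          -- class the document already has, if any
    auto             -- class suggested by auto-classification, if any
    allow_lowering   -- models the "Allow Class Lowering" option
    block_under_auto -- models "Do not select the class under auto"
    """
    chosen_rank = rank(order, chosen)
    # A larger rank means a less important (lower) class in this sketch.
    if current is not None and not allow_lowering:
        if chosen_rank > rank(order, current):
            return False
    if auto is not None and block_under_auto:
        if chosen_rank > rank(order, auto):
            return False
    return True


# Illustrative class IDs, most important first.
order = ["secret", "internal", "public"]

print(is_choice_allowed(order, "public", current="internal"))
print(is_choice_allowed(order, "public", current="internal", allow_lowering=True))
print(is_choice_allowed(order, "public", auto="internal", block_under_auto=True))
```

Keeping both checks in one function makes it easy to see that they are independent: lowering is judged against the existing class, while the auto rule is judged against the suggested class.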

Customize

With customization, the default rules can be changed based on User, Group, IP, or the application.

...

Available Classes: You may want some classes to be offered only to certain groups or departments. In this case, you can specify which classes are available for each customization.

Pop-Up Texts Settings

Adjust the terminology to your preferences.

...

Class Not Defined Icon: The icon shown for documents that are unclassified.

Setup

The software automatically generates the token and MSI parameters required for installation. Please refer to the client pages and OWA installation page for details.

...

Troubleshooting

...

Clients will be updated in about 10 minutes when the Classifier settings are changed in the GEODI interface.

...

If the Office add-ins or Desktop Classification are not active:

  • Check that the client installation is complete.

  • Check that the client has access to the GEODI server.

  • Check that the GEODI token is valid.

...

Installation can be performed in an environment without internet access or where GEODI cannot be reached. The Classifier plugin is activated automatically when a connection to GEODI is established.

...

ADS is used to classify file types other than MS Office and PDF. The list below shows whether the class of a classified file is preserved after each operation.

  • The file was renamed (Class Preserved).

  • The file extension was changed, for example txt → log or mp4 → avi (Class Preserved).

  • The file was copied to another computer without GEODI Classifier and checked (Class Preserved).

  • The file was copied over an RDP connection (Class Not Preserved).

  • The file was classified, uploaded to WeTransfer, and downloaded (Class Not Preserved).

  • The file was classified, compressed as rar/zip, and extracted (Class Not Preserved).

...

If the GEODI server cannot be reached, manual classification can still be used. All classes except automatic classification remain available, based on the last settings received from GEODI.

  • When connected to the GEODI server, the operations performed are logged.

  • In Shell classification, the automatic option is unavailable if the client is not connected to the server.

  • In the Office plugin, "automatic (offline)" is displayed if the server is not reachable.

  • Configuration changes (such as adding a class) and loss of the server connection are checked periodically (every 5 minutes), not instantly, so as not to slow down your operations. For this reason, changes are not reflected immediately in the Add-In and Shell interfaces.
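The periodic check and the offline fallback can be pictured with the sketch below. It only illustrates the idea of refreshing at most once per interval and keeping the last known settings when the server is unreachable. The `SettingsCache` class, the `fetch` callable, and the settings dictionary are hypothetical; they are not part of the GEODI client.

```python
import time
from typing import Callable, Optional


class SettingsCache:
    """Keeps the last settings received and refreshes them periodically."""

    def __init__(
        self,
        fetch: Callable[[], dict],
        interval_s: float = 300.0,  # 5 minutes, as described above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._interval_s = interval_s
        self._clock = clock
        self._last_check: Optional[float] = None
        self.settings: Optional[dict] = None
        self.online = False

    def get(self) -> Optional[dict]:
        """Return current settings, contacting the server only when due."""
        now = self._clock()
        due = self._last_check is None or now - self._last_check >= self._interval_s
        if due:
            self._last_check = now
            try:
                self.settings = self._fetch()
                self.online = True
            except OSError:
                # Server unreachable: keep the last known settings.
                self.online = False
        return self.settings


def fake_fetch() -> dict:
    """Stand-in for a real server call (hypothetical payload)."""
    return {"classes": ["secret", "internal", "public"]}


cache = SettingsCache(fake_fetch)
print(cache.get())
print(cache.online)
```

Passing the clock in as a parameter is a design choice for the sketch: it lets the interval logic be exercised without waiting five minutes.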

For Mail Merge, add the %AutoClass% statement to the document to be classified. The document is then classified automatically, and the class is not asked for when multiple e-mails are sent.

...

For example, if a mail merge is run from Word and %AutoClass% is written in the document, classification is automatic and the class is not asked for each e-mail.

...

Ignoring e-mail signatures

E-mail signatures contain the sender's PII. To ignore it during classification, either change the e-mail server settings or prepare a dictionary.

The e-mail server should add the signature after classification, or just before sending. For Exchange, the method is described at https://learn.microsoft.com/en-us/exchange/security-and-compliance/mail-flow-rules/disclaimers-signatures-footers-or-headers. Your e-mail server may provide different methods. This method solves the problem only for the first e-mail; in mail chains, signatures accumulate. The GEODI dictionary method works in all situations.

  1. You must have a list of senders that contains at least the values used in signatures (name, phone, e-mail, etc.).

  2. GEODI can use Excel or a table as a dictionary. Generate the signatures using an Excel formula or SQL. Please check the attached sample Excel. The Excel file must be updated manually.

  3. Add this dictionary to the discoveries list of the classification project.

  4. That is all.

(info) The method works only if the dictionary and the mail signatures match exactly. Before release, we suggest running a few tests with sample senders.

(info) This dictionary can also be used in e-mail discovery so that signatures in old e-mails are not recognized as PII.

(info) The sender’s actual PII in e-mail bodies, contracts, medical records, etc. will still be recognized.
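Generating the signature text for each sender is usually done with an Excel formula or SQL; the same idea is sketched here in Python for illustration only. The `Sender` fields and the `SIGNATURE_TEMPLATE` layout are hypothetical and must be replaced with the exact signature layout used in your organisation, because the dictionary works only on exact matches. Loading the resulting strings into the Excel file or table is not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sender:
    """One row of the senders list (field names are illustrative)."""
    name: str
    title: str
    phone: str
    email: str


# Hypothetical signature layout; it must mirror the real signature
# character for character, including line breaks and labels.
SIGNATURE_TEMPLATE = "{name}\n{title}\nPhone: {phone}\nE-mail: {email}"


def build_signature_entries(senders: List[Sender]) -> List[str]:
    """Build one dictionary entry per distinct signature, keeping order."""
    entries: List[str] = []
    seen = set()
    for s in senders:
        signature = SIGNATURE_TEMPLATE.format(
            name=s.name, title=s.title, phone=s.phone, email=s.email
        )
        if signature not in seen:
            seen.add(signature)
            entries.append(signature)
    return entries


# Made-up sample sender for demonstration.
senders = [
    Sender("Jane Doe", "Sales Manager", "+1 555 0100", "jane.doe@example.com"),
]

for entry in build_signature_entries(senders):
    print(entry)
    print("---")
```

Driving every entry from a single template keeps the entries consistent, which matters because any difference from the real signature prevents a match.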

Sample Excel

SampleMailSignatureIgnoreDictionary.xlsx