Differences

Here you can see the differences between the selected revision and the current version of the page.

--- en:architecture [07.09.2018 13:00]
mach@cesnet.cz
+++ en:architecture [10.09.2018 17:20]
mach@cesnet.cz
@@ Line 1: / Line 1: @@
 ====== Architecture ======
-The //Mentat// system has been designed as a distributed modular system with an emphasis on its easy extendability and scalability. The core of the system reflects the architecture of MTA system [[http://www.postfix.org/|Postfix]]. It consists of many simple modules/daemons, each of which responsible for performing a particular task. This approach enables smooth parallelization and extendability. All modules use the same core service framework, thus making implementing new modules an easy task.
+The //Mentat// system has been designed as a distributed modular system with an emphasis on easy extensibility and scalability. The core of the system reflects the architecture of the [[http://www.postfix.org/|Postfix]] MTA. It consists of many simple modules/daemons, each of which is responsible for performing a particular task. This approach enables smooth process-level parallelization and extensibility. All modules use the same core framework built on top of the [[https://alchemist.cesnet.cz/pyzenkit/doc/production/html/manual.html|PyZenKit]] framework, which makes implementing new modules an easy task.
-The original //Mentat’s// design presupposed features and tools enabling to collate and share security information. This function has, however, later been taken over by a twin project [[https://warden.cesnet.cz/en/index|Warden]] with slightly humbler ambitions and simpler but ultimately better design. At present, the [[https://warden.cesnet.cz/en/index|Warden]] system has profiled as a **single communication channel for sharing security information** and the //Mentat// system as **a tool for streamlined security information processing**. //Mentat’s// source codes still contain some remains of protocols and components for data sharing between remote nodes.
+Mentat itself does not have any network communication protocol for receiving events or messages from the outside (although nothing stops you from implementing your own module). Instead, it relies on the services of the [[https://warden.cesnet.cz/en/index|Warden]] system, which is a security information sharing platform.
 ===== Technical background =====
@@ Line 21: / Line 21: @@
 {{ ::mentat-architecture.png?nolink |Current state of the Mentat system architecture}}
-The //Mentat// system consists of tools allowing processing events both in real time and retrospectively over a particular period of time. At present, the following modules for real time processing are available:
+The implementation language is strictly [[https://www.python.org/|Python3]] with no attempts whatsoever to be
-  * **mentat-inspector.py**\\ This module enables the processing of [[https://idea.cesnet.cz/|IDEA]] messages based on the result of given filtering expression. There is a number of actions that can be performed on the message in case the filtering expression evaluates as ''true''.
+compatible with aging Python2. The system uses the [[https://www.postgresql.org/|PostgreSQL]] database as persistent data storage and [[https://idea.cesnet.cz/en/index|IDEA]] as its data model. IDEA is based on [[http://www.json.org/|JSON]] and was specifically designed to describe a wide range of different security events, with further extensibility in mind.
-  * **mentat-enricher.py**\\ This module enables the enrichment of incoming [[https://idea.cesnet.cz/|IDEA]] messages through the following sequence of tasks: [[https://idea.cesnet.cz/|IDEA]] notification validation, resolving target abuse’s contact (for the reporting purposes), detection of event’s specific type (to enable notification formatting) and geolocation resolving. Implementation of further operations is planned: hostname/ip resolving, passive DNS, …
-  * **mentat-storage.py**\\ This module enables to store incoming [[https://idea.cesnet.cz/|IDEA]] messages in a database ([[https://www.mongodb.org/|MongoDB]]).
+The //Mentat// system consists of tools for processing events both in real time and retrospectively over a particular period of time. At present, the following most important modules for real-time processing are available:
+  * **mentat-inspector.py**\\ This module processes [[https://idea.cesnet.cz/|IDEA]] messages based on the result of a given filtering expression. A number of actions can be performed on the message when the filtering expression evaluates as ''true''. The most common and useful use cases are message classification, verification, filtering and conditional branching of the processing. [[https://alchemist.cesnet.cz/mentat/doc/production/html/_doclib/bin_mentat-inspector.html|(more information)]]
+  * **mentat-enricher.py**\\ This module enriches incoming [[https://idea.cesnet.cz/|IDEA]] messages with additional information, such as the resolved abuse contact of the target (for reporting purposes), geolocation and ASN. Further enrichment operations are planned (hostname/IP resolving, passive DNS, …) and custom enrichment plugins are supported. [[https://alchemist.cesnet.cz/mentat/doc/production/html/_doclib/bin_mentat-enricher.html|(more information)]]
+  * **mentat-storage.py**\\ This module stores incoming [[https://idea.cesnet.cz/|IDEA]] messages in a database ([[https://www.postgresql.org/|PostgreSQL]]). [[https://alchemist.cesnet.cz/mentat/doc/production/html/_doclib/bin_mentat-storage.html|(more information)]]
-Most modules enabling retrospective event processing are based on regularly re-launched scripts (i.e. crons). At present, the following modules enabling retrospective event processing are available:
+Most modules enabling retrospective event processing are based on regularly re-launched scripts (i.e. **cron** jobs). At present, the following modules enabling retrospective event processing are available:
-  * **mentat-statistician.py**\\ This module enables statistical processing of events over a given self-defined period. At present, the feature is configured to five-minute intervals. For each of these intervals, it determines the frequency of events according to detector type, event type, IP address etc. These statistical reports are stored in a separate database and can later support an overview of system’s operation, provide underlying data for other statistical reports or for the creation of dictionaries for a web interface.
+  * **mentat-statistician.py**\\ This module enables statistical processing of events over a given self-defined period. At present, the feature is configured to five-minute intervals. For each of these intervals, it determines the frequency of events according to detector type, event type, IP address etc. These statistical reports are stored in a separate database and can later support an overview of system’s operation, provide underlying data for other statistical reports or for the creation of dictionaries for a web interface. [[https://alchemist.cesnet.cz/mentat/doc/production/html/_doclib/bin_mentat-statistician.html|(more information)]]
-  * **mentat-reporter-ng**\\ This module enables to distribute periodical event reports directly to end abuse contacts of responsible network administrators. More information about the reporter can be found at [[:cs:reporting|reporter’s website]].
+  * **mentat-reporter.py**\\ This module distributes periodical event reports directly to the abuse contacts of responsible network administrators. More information about the reporter as a service provided by [[https://www.cesnet.cz/?lang=en|CESNET, a.l.e.]] can be found on the official [[https://csirt.cesnet.cz/cs/services/mentat|Mentat service]] webpage. [[https://alchemist.cesnet.cz/mentat/doc/production/html/_doclib/bin_mentat-reporter.html|(more information)]]
-  * **mentat-briefer**\\ This module is similar to the above described reporter. It provides periodical summary reports on system’s statuses and reports sent.
+  * **mentat-informant.py**\\ This module is similar to the reporter described above. It provides periodical summary reports on the system's status and on the reports sent. It is most useful as a status overview for system administrators or for target abuse contacts. [[https://alchemist.cesnet.cz/mentat/doc/production/html/_doclib/bin_mentat-informant.html|(more information)]]
-  * **mentat-backup.py**\\ A configurable module enabling periodical database backups. At present, a full backup of system collections (users, groups …) is created once a day while [[https://idea.cesnet.cz/|IDEA]] message collection is backed up incrementally.
-  * **mentat-cleanup.py**\\ A configurable module enabling periodical database cleanups.
+Alongside these modules there is a large collection of utility and management scripts and
-  * **mentat-precache.py**\\ A configurable module enabling data caching, in particular of various dictionaries for web interface.
+tools that attempt to simplify repetitive, dull tasks for the system administrator. Some of the most useful ones are the following:
-  * **hawat-negistry**\\ A feature enabling data synchronisation between **Negistry** and Mentat’s system database. It synchronises abuse groups and address blocks assigned to them.
+  * **mentat-controller.py**\\ A script for controlling all configured daemons/modules on a given server. [[https://alchemist.cesnet.cz/mentat/doc/production/html/_doclib/bin_mentat-controller.html|(more information)]]
+  * **mentat-backup.py**\\ A configurable module enabling periodical database backups. At present, a full backup of system tables (users, groups …) is created once a day, while the [[https://idea.cesnet.cz/|IDEA]] event table is backed up incrementally. [[https://alchemist.cesnet.cz/mentat/doc/production/html/_doclib/bin_mentat-backup.html|(more information)]]
+  * **mentat-cleanup.py**\\ A configurable module enabling periodical database and filesystem cleanups. [[https://alchemist.cesnet.cz/mentat/doc/production/html/_doclib/bin_mentat-cleanup.html|(more information)]]
+The last important component of the system is a web user interface:
+  * **Hawat**\\ Customizable and easily extensible web user interface based on the [[http://flask.pocoo.org/docs/1.0/|Flask]] microframework. [[https://alchemist.cesnet.cz/mentat/doc/production/html/_doclib/hawat.html|(more information)]]
+===== Module architecture =====
+As mentioned above, all system modules, including continuously running daemons and periodically launched scripts, use a simple common framework called
+[[https://alchemist.cesnet.cz/pyzenkit/doc/production/html/manual.html|PyZenKit]], which ensures all common features:
+  * Application life-cycle management.
+  * Configuration loading, merging and validation.
+  * Daemonisation.
+  * Logging setup.
+  * Database abstraction layer.
+  * Abstract layer for working with IDEA messages.
+  * Statistical data processing.
+  * WHOIS queries, DNS resolving.
+==== Daemon module architecture ====
+All continuously running daemons operate as //pipes//, i.e. the message enters on one side, the daemon performs the relevant operations and the message reappears on the other side. To facilitate message exchange between individual daemons, as in the Postfix MTA, the message queues are implemented by means of simple files and directories on the filesystem.
+Internally, the daemon modules use an event-driven design. There is an infinite event loop; events are emitted from different parts of the application and placed into an event queue. A scheduler takes care of fetching the next event to be handled and is responsible for calling the appropriate chain of registered event handlers.
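The event loop described above can be illustrated with a minimal sketch. It is not taken from the Mentat or PyZenKit source code: the class ''EventLoop'', its methods and the event names are hypothetical, and the real daemons loop forever instead of stopping when the queue is empty.

```python
from collections import deque


class EventLoop:
    """Toy event loop (hypothetical, not the PyZenKit API)."""

    def __init__(self):
        self.handlers = {}    # event name -> ordered chain of handlers
        self.queue = deque()  # pending (event name, payload) pairs

    def register(self, event, handler):
        """Append a handler to the chain for the given event name."""
        self.handlers.setdefault(event, []).append(handler)

    def emit(self, event, payload=None):
        """Place an event into the queue; it is handled later by run()."""
        self.queue.append((event, payload))

    def run(self):
        """Dispatch queued events until the queue is empty."""
        while self.queue:
            event, payload = self.queue.popleft()
            for handler in self.handlers.get(event, []):
                # Each handler receives the previous handler's result.
                payload = handler(self, payload)
                if payload is None:  # returning None stops the chain
                    break


def demo():
    """Chain two handlers and let the second one emit a follow-up event."""
    seen = []
    loop = EventLoop()

    def parse(loop, raw):
        return raw.strip()

    def store(loop, message):
        seen.append(message)
        loop.emit("stored", message)
        return message

    def report(loop, message):
        seen.append("stored:" + message)
        return message

    loop.register("message", parse)
    loop.register("message", store)
    loop.register("stored", report)
    loop.emit("message", "  hello  ")
    loop.run()
    return seen
```

Handlers run in registration order, and a handler can emit further events, which is how work can be passed between different parts of one application.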
+To further improve code reusability, each daemon module is composed of one or more //components//, which are the actual workers in this scenario; the enveloping daemon module serves mostly as a container. There are different components for different smaller tasks, such as fetching a new message from the filesystem queue, parsing the message into an object, or filtering the message according to a given set of rules. These components can be chained together to achieve the desired final processing result.
+The following diagram describes the internal daemon module architecture with emphasis on the event-driven design and internal components:
+{{ ::mentat-daemon-architecture.png?nolink |Architecture of the Mentat daemon module}}
+So, when implementing a new daemon, one only needs to design and implement the actual processing; everything else is provided automatically, including the selection of a message from the queue and its subsequent upload into the queue of another daemon in the processing chain.
+==== Filesystem message queue protocol ====
+To facilitate message exchange between daemon modules, a very simple filesystem-based message exchange protocol (also known as the //filer protocol//) was designed and implemented. It is again inspired by the [[http://www.postfix.org/|Postfix MTA]].
+The protocol uses a designated filesystem directory with the following substructure for exchanging messages:
+  * **incoming**: input queue, only complete messages
+  * **pending**: daemon work directory, messages in progress
+  * **tmp**: work directory for other processes
+  * **errors**: messages that caused problems during processing
+The key requirement for everything to work properly is an **atomic move** operation at the filesystem level. On Linux this requirement is satisfied when the source and target directories of the move operation are on the same partition. Therefore, never put queue subdirectories on different partitions, and be cautious when enqueuing new messages. To be safe, use the following procedure to put a new message into the queue:
+  - create new file in **tmp** subdirectory and write/save its content
+  - filename is arbitrary, but must be unique within all subdirectories
+  - when done writing, move/rename the file to **incoming** subdirectory
+  - move operation **must be atomic**, so all queue subdirectories must be on same partition
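The enqueuing procedure above can be sketched in Python as follows. This is an illustration under stated assumptions, not code from Mentat: the function names ''init_queue'' and ''enqueue'', the UUID-based file name and the ''.idea'' suffix are all hypothetical.

```python
import os
import uuid

SUBDIRS = ("incoming", "pending", "tmp", "errors")


def init_queue(queue_dir):
    """Create the four queue subdirectories if they do not exist yet."""
    for name in SUBDIRS:
        os.makedirs(os.path.join(queue_dir, name), exist_ok=True)


def enqueue(queue_dir, content):
    """Write the message into tmp, then atomically rename it into incoming."""
    # Hypothetical naming scheme: a random UUID keeps the name unique
    # across all subdirectories.
    name = uuid.uuid4().hex + ".idea"
    tmp_path = os.path.join(queue_dir, "tmp", name)
    final_path = os.path.join(queue_dir, "incoming", name)

    with open(tmp_path, "w", encoding="utf-8") as handle:
        handle.write(content)
        handle.flush()
        os.fsync(handle.fileno())  # make sure the content is on disk first

    # os.rename is atomic on POSIX when both paths are on the same filesystem,
    # so a reader of incoming never sees a partially written message.
    os.rename(tmp_path, final_path)
    return final_path
```

Writing into **tmp** first means that a daemon scanning **incoming** only ever finds complete files.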
+The following diagram describes the process of placing a message into the queue and the process of fetching a message from the queue by the daemon module:
+{{ ::mentat-queue-protocol.png?nolink |Mentat message queue protocol}}
+Blue arrows are filesystem operations performed by the external process that is placing a new message into the queue. Following the procedure described above, the message is first placed into **tmp** and then atomically moved into **incoming**. Red arrows indicate filesystem operations performed by the daemon process itself. By atomically moving the message from **incoming** to **pending**, the daemon marks the message as //in progress// and may begin processing it. When done, the message may be moved to the queue of another daemon module, moved to **errors** in case of any problems, or even deleted permanently (in case the daemon module is the last one in the message processing chain).
+The atomic move operation from **incoming** to **pending** also serves another purpose, which is locking. When multiple instances of the same daemon module work with the same queue, the atomicity of the move operation ensures that each message is processed only once. All daemon modules are prepared for this eventuality and are not concerned when messages magically disappear from the queue.
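The locking behaviour can be sketched as follows. This is a hypothetical illustration rather than the actual daemon code: the function name ''claim_next'' is an assumption, and the sketch relies on message file names being unique across all queue subdirectories, as the protocol requires.

```python
import os


def claim_next(queue_dir):
    """Claim one message by moving it from incoming to pending.

    Returns the new path inside pending, or None when nothing could be claimed.
    """
    incoming_dir = os.path.join(queue_dir, "incoming")
    pending_dir = os.path.join(queue_dir, "pending")

    for name in sorted(os.listdir(incoming_dir)):
        source = os.path.join(incoming_dir, name)
        target = os.path.join(pending_dir, name)
        try:
            # The rename either succeeds for this process or fails because
            # the file is already gone; it cannot succeed for two processes.
            os.rename(source, target)
        except FileNotFoundError:
            # Another instance claimed this message first; try the next one.
            continue
        return target
    return None
```

A daemon instance that loses the race simply moves on to the next file, so no separate lock files are needed.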
+==== Web interface architecture ====
-The last important components of the system are administrative interfaces:
+The web interface for the Mentat system is called [[https://alchemist.cesnet.cz/mentat/doc/production/html/_doclib/hawat.html|Hawat]] and it is built on top of the great [[http://flask.pocoo.org/docs/1.0/|Flask]] microframework. However, the //micro// in the name means that, to make things more manageable and
-  * **hawat**\\ A web interface for the Mentat system. The interface enables in particular to search through the event database and sent reports, system statistics and overviews and to configure the entire system and the reporting algorithm in particular.
+easier, a suite of custom tools had to be implemented to enable better integration of interface components.
-  * **hawat-cli**\\ CLI interface for system administrators enabling the automation of certain acts relating to the administration of the Mentat system.
-  * **mentat-controler**\\ A script enabling to control particular deamons/components on a given engine.
-===== Current component architecture =====
+[[http://flask.pocoo.org/docs/1.0/|Flask]] already provides means for separating big applications into modules by the [[http://flask.pocoo.org/docs/1.0/blueprints/|blueprint]] mechanism. This is used very extensively and almost everything in the web interface is a pluggable blueprint.
-As mentioned above, all system features, including continuously running deamons or periodically launched scripts, use a simple implementation framework which ensures all common actions:
-  * Configuration loading and validation;
-  * Deamonisation;
-  * Log initialisation;
-  * Database abstract layer;
-  * Abstract layer for working with IDEA messages;
-  * Statistical data processing;
-  * WHOIS queries, DNS resolving;
-  * Formatting and report distribution.
-All continuously running deamons operate as ‘pipes’, i.e. the report enters on one side, the deamon performs relevant operations and the report reappears on the other side. To facilitate report exchange between individual deamons, alike in MTA Postfix, the file system and queues implemented by means of files and directories are used. Thus, all deamons alike use the predefined feature **Mentat::Processor** which ensures correct, easy and configurable configuration upload, log setting, deamonisaton, launches the processing using event service, correct ending at the end, etc. When implementing a new deamon, one only needs to configure the processing; everything else is provided for automatically, including the selection of a report from the queue and subsequent upload into the queue of another deamon in the processing chain.
