CHAPTER II
REVIEW OF THE LITERATURE
Overview
The purpose of this chapter is fourfold; thus it is divided into four sections. The first section presents a brief history of blackboard systems in order to provide the background necessary to understand the similarities and differences between the blackboard system and framework developed for this thesis and other blackboard systems and frameworks. (Frameworks may also be referred to as shells. This thesis uses the term framework exclusively.) Next, the second section describes those similarities and differences and what is unique about the blackboard framework and system that was designed for this thesis. The third and fourth sections follow a similar outline with respect to intrusion detection systems.
A Brief History of Blackboard Systems
As mentioned in the introduction, the AI community has developed and implemented many blackboard systems and frameworks. Others have also used the blackboard model to develop intelligent systems. Some recent examples include: MAPP: A Matrix Architecture for Process Planning (Hayes, Gaines, Faheem, and Castano 1997), a system used in mobile-robot navigation (Liscano, Manz, Struck, Fayek, and Tigli 1995), and a Microwave Generic Controller for NASA's Deep Space Network (Ramanna 1995).
The blackboard paradigm has been found useful in many different applications. Rather than attempt a comprehensive review of every existing blackboard system, the purpose of this section is to cover those systems that have shaped the way blackboard systems are designed and implemented today. Hence, this section focuses on the history of a few early blackboard systems and frameworks. Noted are when each system/framework was developed, its purpose, and what new concepts the system designers contributed to the field of blackboard system design.
The first three systems described, Hearsay-II, HASP/SIAP, and CRYSALIS, are blackboard systems designed to solve problems from a specific problem domain. After AI researchers realized the value of the blackboard paradigm, and the amount of effort it takes to build a blackboard system from scratch, they began to focus on constructing frameworks which can be used to build blackboard systems instead of implementing domain-specific blackboard systems. Thus, the next four systems discussed here are blackboard frameworks: AGE, Hearsay-III, BB1, and GBB. Finally, several blackboard frameworks that have been built since GBB are reviewed.
Hearsay-II
Hearsay-II, developed between 1971 and 1976, is the ancestor of all blackboard systems. It was the first implementation of a blackboard system. Hearsay-II's problem solving domain is understanding speech. It "recognizes connected speech in a 1000-word vocabulary with correct interpretations for 90% of test sentences" (Erman, Hayes-Roth, Lesser, and Reddy 1988). Hearsay-II consists of a globally available data structure (the blackboard), a group of problem solvers communicating via the blackboard (the knowledge sources), and a control system. By defining these as the major components of a blackboard system's architecture, Hearsay-II had a major impact on all blackboard systems to follow. In addition, the design of Hearsay-II influenced the design of subsequent blackboard systems in the following areas: blackboard contents, problem solving behavior, and KS anatomy. The following paragraphs discuss each of these in detail.
Blackboard Contents. Hearsay-II's global data structure contains the hypotheses proposed by its knowledge sources. These hypotheses are organized into levels (Erman, Hayes-Roth, Lesser, and Reddy 1988). As a result, most blackboard systems have a global data structure that holds hypotheses which are organized by level.
Problem Solving Behavior. Hearsay-II exhibits "opportunistic" problem-solving behavior. Some problem-solving systems work by searching back from possible solutions, through a set of hypotheses, toward evidence that supports a solution. This type of search is referred to as "backward reasoning" (Rich and Knight 1991), or "top-down reasoning" (Erman, Hayes-Roth, Lesser, and Reddy 1988). Other systems work by searching data for evidence to support low-level hypotheses, then searching these lowest-level hypotheses for support of higher-level hypotheses, and so on until, if possible, enough evidence is found to support some conclusion. This type of search is called "forward reasoning" (Rich and Knight 1991), or "bottom-up reasoning" (Erman, Hayes-Roth, Lesser, and Reddy 1988). Hearsay-II's knowledge sources look for opportunities to do either forward reasoning or backward reasoning, whatever it takes to solve the problem. The "system's ability to exploit selectively its best data and most promising [problem solving] methods" has been termed "opportunistic problem solving" (Erman, Hayes-Roth, Lesser, and Reddy 1988).
KS Structure. Each Hearsay-II KS consists of two major components: a precondition component and an action component. The action component is the part of the KS that generates and modifies hypotheses. One might assume that the precondition component is not active. This is not the case. See the following text that describes the precondition component:
The purpose of the precondition is to find a subset of hypotheses that are appropriate for action by the KS and to invoke the KS on that subset... For example, the precondition of the KS that generates word hypothesis based on syllables looks for new syllable hypothesis... To keep from having to fire the precondition continuously to search the blackboard, each precondition declares to the blackboard handler in a nonprocedural way the primitive kinds of blackboard changes in which it is interested. (Lesser and Erman, 1988).
Blackboard system designers have since divided their knowledge sources into at least two components, a precondition component and an action component, though some have created more complex designs. For example, a BB1 KS consists of five distinct components (Hayes-Roth and Hewett 1989).
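The precondition/action split described above can be illustrated with a minimal Python sketch. It is not Hearsay-II code: the class names, the `interests` field, and the example KS are hypothetical, and the action is invoked directly rather than being queued for a scheduler as it would be in Hearsay-II. The sketch only shows how a KS can declare the kinds of blackboard changes it cares about so that its precondition is not evaluated on every change.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Hypothesis:
    level: str   # e.g. "syllable" or "word"
    value: str


@dataclass
class KnowledgeSource:
    name: str
    interests: List[str]                        # levels whose changes wake the precondition
    precondition: Callable[[Hypothesis], bool]  # decides whether the action applies
    action: Callable[[Hypothesis], Hypothesis]  # generates a new hypothesis


@dataclass
class Blackboard:
    hypotheses: List[Hypothesis] = field(default_factory=list)
    subscribers: Dict[str, List[KnowledgeSource]] = field(default_factory=dict)

    def register(self, ks: KnowledgeSource) -> None:
        # The KS declares, once, the kinds of changes it is interested in.
        for level in ks.interests:
            self.subscribers.setdefault(level, []).append(ks)

    def post(self, hyp: Hypothesis) -> None:
        self.hypotheses.append(hyp)
        # Only preconditions that declared interest in this level are evaluated.
        for ks in self.subscribers.get(hyp.level, []):
            if ks.precondition(hyp):
                self.post(ks.action(hyp))


bb = Blackboard()
bb.register(KnowledgeSource(
    name="word-from-syllable",               # hypothetical KS
    interests=["syllable"],
    precondition=lambda h: len(h.value) > 0,
    action=lambda h: Hypothesis("word", h.value.upper()),
))
bb.post(Hypothesis("syllable", "cat"))
levels = [h.level for h in bb.hypotheses]
```

Posting a syllable hypothesis triggers the one registered KS, which adds a word hypothesis; posting at any other level would not evaluate its precondition at all.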
HASP/SIAP
The HASP/SIAP system was developed between 1973 and 1980. HASP/SIAP's problem domain is that of interpreting signals from a network of sonar sensors spread across the ocean floor. HASP/SIAP's designers were looking for a new problem solving methodology. When they came across the blackboard method, they decided to use it because it fit the task at hand so well (Nii, Feigenbaum, Anton, and Rockmore 1988). The HASP/SIAP design introduced the use of a "strong" control mechanism and a control knowledge source (CKS) to blackboard design. The use of a strong control mechanism and a CKS are discussed in the following paragraphs.
Weak Control vs. Strong Control. Before learning the specifics of the HASP/SIAP control system, one should understand the difference between weak and strong problem solving methods. In general, there are two ways one can solve a problem. One, referred to as the "weak" method, is to use a general rule that relates to the task at hand. Weak methods include search, generate-and-test, and hill climbing. (For a complete list see Rich & Knight page 63.) Searching for the exit of a maze by following the right wall of the maze is an example of solving a problem using a weak method. The other problem solving methodology, referred to as the "strong" method, consists of using knowledge specific to the problem. Using a map, with the route marked, and a compass to navigate through a maze is a strong problem solving method. Now it should be easy to understand the differences between the control systems of Hearsay-II and HASP/SIAP as described in the following paragraphs.
In contrast with Hearsay-II, which relies on weak methods (search and a heuristic scheduler), HASP/SIAP uses a problem solving approach that focuses on using "large amounts of situation-specific knowledge", a strong method (Nii 1988). Every knowledge source consists of a set of rules. Instead of having a specialized control component such as the Hearsay-II scheduler, the knowledge sources in the HASP system are arranged in a hierarchy as shown in Figure 3.
Figure 3. Hierarchical control systems. This figure depicts the HASP/SIAP hierarchical control system (Nii, Feigenbaum, Anton, and Rockmore).
Control in a hierarchical control system proceeds in a top-down fashion. In the HASP/SIAP system the strategy KS examines the state of the system and selects a mid-layer KS. The mid-layer KS then either selects a KS at the domain KS layer, or returns control to the strategy KS. If the domain KS finds contributions it can make, it makes changes to the blackboard. When finished, the domain KS returns control to the higher layer KS. HASP/SIAP's use of strong problem solving methods has had a great influence on blackboard system and framework design. Many systems/frameworks use rule-based domain knowledge sources and/or strong control. Next the control knowledge source concept is discussed in more detail.
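The top-down flow of control can be sketched in a few lines of Python. This is a simplified illustration, not the HASP/SIAP design: it uses a single mid-layer KS, plain functions in place of rule sets, and hypothetical blackboard slots named `signals`, `lines`, and `sources`.

```python
from typing import Callable, Dict, List

Board = Dict[str, List[str]]
DomainKS = Callable[[Board], bool]   # returns True if it changed the blackboard


def detect_lines(bb: Board) -> bool:
    """Hypothetical domain KS: derive a line hypothesis from a raw signal."""
    if bb["signals"] and not bb["lines"]:
        bb["lines"].append("line from " + bb["signals"][0])
        return True
    return False


def detect_sources(bb: Board) -> bool:
    """Hypothetical domain KS: derive a source hypothesis from a line."""
    if bb["lines"] and not bb["sources"]:
        bb["sources"].append("source from " + bb["lines"][0])
        return True
    return False


def mid_layer(bb: Board, domain_kss: List[DomainKS]) -> bool:
    """Select the first domain KS that can contribute, then return control."""
    for ks in domain_kss:
        if ks(bb):
            return True
    return False   # nothing to do: control goes back to the strategy KS


def strategy(bb: Board, max_cycles: int = 10) -> int:
    """Top layer: keep delegating while the lower layers make progress."""
    cycles = 0
    while cycles < max_cycles and mid_layer(bb, [detect_sources, detect_lines]):
        cycles += 1
    return cycles


board: Board = {"signals": ["s1"], "lines": [], "sources": []}
cycles = strategy(board)
```

Each cycle descends from the strategy layer to one domain KS and back up, which is the control pattern the paragraph describes; a real system would select among several mid-layer KSs using rules.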
Control Knowledge Sources. The beauty of the concept of a control knowledge source lies not with the fact that it can be used to implement a strong, hierarchical reasoning mechanism. In fact the control KS concept has been extensively used and extended in the BB1 framework which provides a weak control scheme (Johnson and Hayes-Roth 1989). The strength of the control KS concept lies in the fact it allows the blackboard system designer to use the same methods to solve the control problem as are used to solve the domain problem. For example, in the HASP/SIAP system each control KS, like each domain KS, consists of a set of rules (Nii 1986b). This methodology has the advantage of allowing the system architects to design, implement, and debug the control system using the same tools that they use to construct the domain knowledge sources.
CRYSALIS
Given a protein's electron density map (EDM) and any information about the protein that has been discovered by chemical means, CRYSALIS is designed to infer the atomic structure of the protein (Terry 1988). CRYSALIS is a direct descendant of the Hearsay-II system (Engelmore, Morgan, and Nii 1988). The major contribution of CRYSALIS to blackboard design is the concept of blackboard panels (Terry 1988).
Blackboard Panels. The CRYSALIS designers divided the blackboard into a data panel and an hypothesis panel. Panels permit "additional modularity in the hypothesis elements, the associated knowledge sources and the control. Problem solving within each panel can involve different types of data, different knowledge sources and different control schemes" (Terry 1988). Since CRYSALIS many blackboard systems have been designed with more than one panel/blackboard. The blackboard frameworks described next all allow for more than one panel.
AGE
The Attempt to GEneralize (AGE) project was the first blackboard framework. It "permits the user to build a restricted class of blackboard systems, in which changes to the blackboard are posted on event lists, attention is focused on a particular node [hypothesis] at any point in the problem solving (rather than on a particular KS), and scheduling is based on a set of priorities" (Engelmore and Morgan 1988). Blackboard systems constructed with AGE have architectures similar to the architectures of the Hearsay-II and CRYSALIS blackboard systems.
Hearsay-III
Hearsay-III is another early domain-independent framework for developing blackboard systems. "Hearsay-III can be viewed as an extension along some dimensions of the Hearsay-II architectural style, and a generalization of it along others" (Erman, London, and Fickas 1988). The architecture of the Hearsay-III framework is different from prior blackboard systems in that it is built on top of a relational database system, AP3. Following are several concepts introduced first in Hearsay-III that can be found in more recent blackboard systems.
Units. Blackboard systems prior to Hearsay-III defined the object being placed on the blackboard as an hypothesis. Hearsay-III defines the fundamental component of representation built by an application as a Unit. The Unit class can be decomposed to create distinct sub-classes. This ability to create sub-classes of the class Unit is used to segment the blackboard into two blackboards, the domain blackboard, and the scheduling blackboard (Erman, London, and Fickas 1988). Defining a generic data type, like a Unit, as the base class of the objects that are placed on the blackboard is important when designing a blackboard framework. It gives one greater freedom later in the design cycle because one can modify the contents of the unit without changing the method of access.
Segmented Blackboard. The designers of Hearsay-III view KS scheduling as a problem separate from the domain problem being solved, which explains why they have segmented the blackboard as described in the previous paragraph. They have also designed two types of knowledge sources: domain knowledge sources, which work on the problem at hand, and scheduling knowledge sources, which work on the problem of deciding which knowledge source to activate. "Scheduling KSs may respond to changes both on the domain blackboards and on the scheduling blackboard" (Erman, London, and Fickas 1988). This concept of separating the scheduling (i.e. control) problem from the domain problem is taken one step further in the next blackboard framework reviewed, BB1.
BB1
BB1 was invented in 1983 by Barbara Hayes-Roth at Stanford (Engelmore and Morgan 1988). BB1 is "available to anyone who wants it" (http://www-ksl.stanford.edu/projects/BB1/bb1.html). The BB1 design differs from the two prior blackboard generalizations (AGE and Hearsay-III) with the following important features.
A Separate Control Blackboard System. Hayes-Roth felt that the control "problem", i.e. determining the next action to be taken by the knowledge sources solving a problem, warrants a completely separate blackboard system. Why? Hayes-Roth gives the following explanation.
Adaptability in the control of one's own problem-solving behavior is the hallmark of human intelligence... [People] know something about how they solve [a] problem, how they have solved similar problems in the past, why they perform one problem-solving action rather than another, what problem-solving actions they are likely to perform in the future, and so forth. They use this knowledge to adapt their behavior to the demands of the problem-solving situation, to explain their behavior, to cope with new problems, to improve their approaches to familiar problems, and to transfer problem-solving knowledge to other people and to computers. Truly intelligent AI systems must do no less (Hayes-Roth and Hewett 1988).
Hence the control problem is as important as the domain problem being solved. BB1 provides its users with a uniform mechanism for coding both control and domain knowledge sources. BB1 has a single scheduler which invokes, schedules, and executes these knowledge sources. Unlike the schedulers of other blackboard systems, BB1's scheduler contains no control knowledge or heuristics. Instead, it uses the scheduling heuristics recorded on the current control plan of the control blackboard (Hayes-Roth and Hewett 1988). Treating the control task in the same manner as the domain task being solved by the blackboard system was, at the time it was developed, unique to the BB1 framework.
Uniform Data Representation. BB1 differs from other blackboard frameworks in that it contains no internal data structures. It puts everything, both domain and control knowledge, on the blackboard. And, both domain and control knowledge are represented with the same data type (Hayes-Roth and Hewett 1988). This uniform representation makes BB1 systems easy to design and debug.
Multiple Control Strategies. Early blackboard systems used data-directed control, preferring actions that are triggered by important events or states, while the strategic plan of the systems was built into the design of the system's KSs and the KSs' relationships. BB1's control blackboard allows for the explicit expression of the blackboard strategy. In addition to a data-directed control strategy, BB1 can support any combination of the following strategies: (1) a goal-directed strategy, one which prefers actions that lead to a certain state or event, (2) an action-directed strategy, one which selects the most "important" actions, or (3) a plan-directed strategy, one that selects actions which are in line with an explicitly stated plan (Hayes-Roth 1989a).
GBB/NetGBB
The Generic Blackboard Builder (GBB) development system was developed by a team of researchers, led by Daniel Corkill, at the University of Massachusetts during the mid 1980's. The team had two goals for the system: (1) to reduce the time required to implement a specific application, and (2) to increase the execution efficiency of the resulting implementation (Corkill, Gallagher, and Murray 1988). GBB and NetGBB are currently being supported and sold by Blackboard Technology. (See http://www.bbtech.com.) The GBB system designers contributed to the development of blackboard frameworks in the following areas.
Emphasis on Efficiency. GBB was unique at the time of its design in that a strong emphasis was made on efficient insertion and retrieval of blackboard objects. A pattern matching language was developed in order to meet this objective. Other means for achieving efficiency include (1) dividing the blackboard into different levels, or "spaces" as they are called in GBB, (2) allowing for ordered or enumerated spaces, and (3) using hashing or set tables to access the blackboard objects stored in a space (Corkill, Gallagher, and Murray 1988).
Nested Blackboards. GBB allows blackboards to be nested (Corkill, Gallagher, and Murray 1988). Nested blackboards prove useful when the problem being solved by the enclosing blackboard is so complex that it requires decomposition into problems each of which requires a blackboard for solution as well.
Networked Blackboards. GBB was one of the first blackboard frameworks designed to support networked blackboards. By augmenting GBB with NetGBB, one can design networks of blackboard systems working together to solve a problem in a manner analogous to the way the individual knowledge sources work together. (See http://www.bbtech.com.)
Post GBB
Other important concepts that have been explored in more recent blackboard frameworks, but have had less of an impact on the design of all blackboard systems, include: real-time systems (Hayes-Roth & Collinot 1994), asynchronous knowledge sources (Baum, Dodhiawala, and Jagannathan 1989), small embedded systems (Reynolds 1988), and implementation in a Microsoft Windows environment (Vranes, Lucin, Stevanovic, and Subasic 1994).
Relationship of the Intelligent Guard to Other Blackboard Systems
In this section the IDOG blackboard system and the Mike framework developed for this thesis are compared and contrasted with the architectures of the blackboard systems and frameworks discussed in the previous section. Compared and contrasted are the following features of the architectures: blackboard type, blackboard contents, reasoning method, KS structure, control mechanism, and the blackboard database.
Blackboard Type
There are two basic types of blackboard software: blackboard systems and blackboard frameworks. Blackboard systems are designed to solve problems in a particular domain. For example, the HASP/SIAP system was designed to interpret signals from a network of sonar sensors spread across the ocean floor (Nii, Feigenbaum, Anton, and Rockmore 1988). Blackboard frameworks, like Hearsay-III, AGE, BB1, and GBB, on the other hand, are domain independent and are designed to build blackboard systems. Mike is a blackboard framework designed to build blackboard systems. The IDOG, which was built using Mike, is a blackboard system designed to solve a specific problem.
Blackboard Contents
With the exception of BB1, Hearsay-III and GBB/NetGBB, the objects held by the blackboard of the systems described above are hypotheses. The blackboard framework described in this paper, Mike, defines the objects being placed on its blackboard as Messages. Mike's Messages are similar in concept to BB1's objects or Hearsay-III's Units in that they are not limited to representing a structure defined as a hypothesis by the designer. A Message contains text and has a unique identification key which allows fast lookup via hashing. Since the value of a Message's key does not describe the Message's proximity on the blackboard to another Message, Mike's blackboard space is similar to the GBB's unstructured spaces described by Corkill, Gallagher, and Murray (1988). Because the text in the Message can consist of any object, Mike's Messages offer the knowledge sources of the system a representationally complete communication medium.
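A minimal Python sketch of a keyed Message and an unstructured, hash-indexed blackboard follows. It is an illustration under stated assumptions, not Mike's implementation: the class names `Message` and `MessageBoard`, the integer key scheme, and the sample payloads are all hypothetical. A Python dictionary stands in for the hash lookup.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Message:
    key: int     # unique identification key (assumed here to be an integer)
    text: Any    # payload; may be plain text or any other object


class MessageBoard:
    """Unstructured blackboard: keys carry no notion of proximity or level."""

    def __init__(self) -> None:
        self._messages: Dict[int, Message] = {}   # dict gives hashed lookup
        self._next_key = 1

    def post(self, text: Any) -> int:
        key = self._next_key
        self._next_key += 1
        self._messages[key] = Message(key, text)
        return key

    def get(self, key: int) -> Optional[Message]:
        return self._messages.get(key)


board = MessageBoard()
k1 = board.post("login from host A")                        # plain text payload
k2 = board.post({"user": "alice", "failed_logins": 3})      # arbitrary object payload
```

Because the key says nothing about how one Message relates to another, any relationship between Messages has to be carried inside the payload itself.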
Reasoning Methodology
This section compares how the different systems reason. All of the blackboard systems discussed in the previous section perform both top-down and bottom-up reasoning opportunistically. They all generate more than one hypothesis, and form chains of hypotheses. The IDOG, as it sits today, is a bottom-up-only inference engine that looks for evidence to support a single hypothesis: that "someone is misusing another person's account." This is a result of the design of the IDOG, not the underlying framework. Mike can be used to build systems that reason in both directions and that generate multiple hypotheses. The next section compares and contrasts the anatomy of the IDOG's knowledge sources with those of other blackboard systems.
KS Structure
While it is common for a blackboard system's knowledge sources to consist of a pre-condition and an action component, each of the IDOG's knowledge sources is designed as a single procedural (as opposed to rule-based) action component. This design is a result of this author taking Nii's description of knowledge sources as being separate and independent (1986a) literally. Then, when the realization of the necessity of a control system came, it became apparent that the action component of Mike's knowledge sources needed the ability to state preconditions in the form of requests for messages from other knowledge sources. Hence the IDOG's knowledge sources are functionally similar to those of other blackboard systems except for the fact that they are asynchronous.
Knowledge sources built with Mike are inherently asynchronous. None of the blackboard systems reviewed above have asynchronous knowledge sources. Only the Erasmus framework and a few others provide support for asynchronous knowledge sources. An asynchronous KS is necessary when one needs a KS that can place data on the blackboard as it becomes available instead of only during the time the KS has been activated by the control system. Every KS built using Mike is asynchronous because each is an independent process running on the UNIX operating system. However, it is possible to synchronize the knowledge sources designed with Mike by using the UNIX signaling mechanism. To do so one would have every KS send a request to the control system to wake it, then sleep before it executes code that posts to the blackboard. This discussion of control leads to the next topic, how the IDOG's control mechanism compares with that of other blackboard systems.
Control Mechanism
Like Hearsay-II's scheduler and BB1's control blackboard system, the IDOG's control strategy falls into the category of "weak" control systems. Though, as with HASP/SIAP, there is a control knowledge source (CKS), the CKS does not use a set of rules to select the next domain KS to activate. Instead, the IDOG's control system follows a general rule: activate all KSs whose requests for data have been satisfied.
To review, the IDOG's knowledge sources run asynchronously until they need information, at which time they post a request for control to the blackboard and sleep. The control system consists of a single control KS. The sole function of the CKS is to wake other knowledge sources when the information they have requested becomes available. Like Hearsay-II, the IDOG's control system is data driven. In contrast with systems designed with BB1 or GBB, where the goals and plans of the system are explicitly stated, the IDOG's goals and plans are implicit in the design of its knowledge sources. This simple control system is adequate for the IDOG as it sits today because the IDOG is merely a forward-chaining inference system that looks for evidence to support a single hypothesis.
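The general rule "activate all KSs whose requests for data have been satisfied" can be sketched in Python as follows. This is a single-process simulation for illustration only: the class `ControlKS`, its methods, and the KS and topic names are hypothetical, and returning a list of names stands in for the waking of sleeping UNIX processes.

```python
from typing import Dict, Iterable, List, Set


class ControlKS:
    """Data-driven control: wake a KS once everything it asked for is available."""

    def __init__(self) -> None:
        self.available: Set[str] = set()        # topics already on the blackboard
        self.waiting: Dict[str, Set[str]] = {}  # KS name -> topics still missing

    def request(self, ks_name: str, topics: Iterable[str]) -> bool:
        """Record a request; return True if the KS need not sleep at all."""
        missing = set(topics) - self.available
        if not missing:
            return True
        self.waiting[ks_name] = missing
        return False

    def data_posted(self, topic: str) -> List[str]:
        """Note newly available data and return the KSs to wake."""
        self.available.add(topic)
        woken = []
        for ks_name in sorted(self.waiting):
            self.waiting[ks_name].discard(topic)
            if not self.waiting[ks_name]:
                woken.append(ks_name)
        for ks_name in woken:
            del self.waiting[ks_name]
        return woken


cks = ControlKS()
cks.request("login_ks", ["login_log"])                    # hypothetical KS and topic names
cks.request("profile_ks", ["login_log", "user_profile"])
first = cks.data_posted("login_log")
second = cks.data_posted("user_profile")
```

No rule set or rating function is involved: the only control knowledge is the bookkeeping of outstanding requests, which is what makes this a weak control scheme.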
Of course it is possible to design more complex control systems with Mike. For example, Mike allows one to create either a strong, hierarchical control system, or a weak heuristic scheduler. Mike also allows one to create multiple instances of a blackboard server. Thus, if one wants a separate control blackboard, similar to BB1's control blackboard, one could create two blackboard server instances and use one as the problem domain blackboard and the other as the control blackboard. There are many different ways to implement these blackboard databases. The next section discusses the similarities and differences between the IDOG's blackboard database and other blackboard systems.
Blackboard Database
On some blackboard systems there is a strong coupling between the structure of the objects stored on the blackboard and the structure of the blackboard database. For example, the Hearsay-II blackboard database can only hold Hearsay-II hypotheses. Blackboard systems at the other end of the spectrum have a very loose coupling between the objects stored on them and their structure. Hearsay-III provides an example of this type of blackboard. Mike's blackboard also falls into this category. It can hold any object that is a sub-class of the Message class. In addition, a Message object can act as a container for an object of any other class. An object on the blackboard could even be another blackboard system. Thus, as with GBB, it is possible to use Mike to build nested blackboards.
Another way to compare blackboard databases is in how they define the relationships between the objects being held. As an example, the Hearsay-II blackboard represents time and the representation level of the decoding process. "The possible hypothesis at a level form a search space for KSs operating at that level" (Erman, Hayes-Roth, Lesser, and Reddy 1988). Structuring the blackboard makes the retrieval of information at a given level of the blackboard more efficient for knowledge sources operating at that level. Blackboards built with the Mike framework are unstructured. Any relationship between the Messages on a blackboard must be contained within the Message. An unstructured blackboard has the advantage of giving one a system with the loosest coupling between all knowledge sources.
Blackboard databases also differ depending on whether the blackboard database may be accessed by asynchronous knowledge sources or not. A blackboard database that is being accessed by asynchronous knowledge sources will need to provide a locking mechanism to maintain data consistency. The blackboard database provided with Mike is always locked during access by a KS and so may be accessed by asynchronous knowledge sources.
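The need for locking can be illustrated with a short Python sketch in which several concurrent writers share one blackboard. It is an assumption-laden stand-in rather than Mike's mechanism: Python threads replace independent UNIX processes, `threading.Lock` replaces whatever locking the blackboard server uses, and the class and function names are hypothetical.

```python
import threading
from typing import Dict


class LockedBlackboard:
    """Every access takes the lock, so concurrent writers cannot interleave."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._messages: Dict[int, str] = {}
        self._next_key = 0

    def post(self, text: str) -> int:
        with self._lock:
            # Key allocation and insertion happen as one indivisible step.
            key = self._next_key
            self._next_key += 1
            self._messages[key] = text
            return key

    def size(self) -> int:
        with self._lock:
            return len(self._messages)


def knowledge_source(board: LockedBlackboard, name: str, count: int) -> None:
    """Hypothetical asynchronous KS that posts whenever it has data."""
    for i in range(count):
        board.post(f"{name}:{i}")


board = LockedBlackboard()
threads = [
    threading.Thread(target=knowledge_source, args=(board, f"ks{n}", 100))
    for n in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
total = board.size()
```

Without the lock, two writers could read the same next key and overwrite each other's message; with it, all 400 posts receive distinct keys.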
Finally, blackboard databases may be distributed via a network, or not. The distribution mechanism can be either proprietary, or via an open standard. The author chose an open standard, RPC, to make Mike portable.
Blackboard Systems and Frameworks In Brief
Following are several tables which illustrate the relationship between the blackboard framework and system developed for this thesis and those discussed above. Table 1 below recaps the similarities and differences between the IDOG and the other blackboard systems discussed in this paper.
| | HEARSAY-II | HASP/SIAP | CRYSALIS | IDOG |
| PROBLEM DOMAIN | Speech Recognition | Interpreting Sonar Signals | Interpreting Protein EDMs | Computer Security |
| BLACKBOARD DATABASE | | | | |
| KNOWLEDGE SOURCES | | | | |
| CONTROL MECHANISM | | | | |
Table 1.
Tables 2 and 3 summarize the similarities and differences between Mike and the frameworks discussed in the previous sections.
| | AGE | HEARSAY-III | BB1 |
| BLACKBOARD DATABASE | | | Uncoupled, Structured, Local, 4 or more Boards |
| KNOWLEDGE SOURCES | | | |
| CONTROL MECHANISM | | | |
Table 2.
| | GBB | MIKE |
| BLACKBOARD DATABASE | Uncoupled, Structured, Distributed, N boards | Uncoupled, Unstructured, Distributed, N boards |
| KNOWLEDGE SOURCES | | |
| CONTROL MECHANISM | | |
Table 3.
This completes the portion of the literature review covering blackboard systems and frameworks. The next section discusses intrusion detection systems (IDSs); then the final section of this chapter compares and contrasts the IDOG with other IDSs.