Java: Serializable (how to save instances)

Here is what we have for Serializable. The Java Beans information will come later.

Class Serialization.

Java provides a feature called object serialization that allows you to take any object that implements the Serializable interface and turn it into a sequence of bytes that can later be fully restored into the original object.

This mechanism can be used to preserve the information which a knowledge expert "loads" into the program.

"Objects are serialized with the ObjectOutputStream and they are deserialized with the ObjectInputStream. Both of these classes are part of the java.io package, and they function, in many ways, like DataOutputStream and DataInputStream because they define the same methods for writing and reading binary representations of Java primitive types to and from streams. What ObjectOutputStream and ObjectInputStream add, however, is the ability to write and read non-primitive object and array values to and from a stream." Java in a Nutshell

What Objects need to be serialized?

The basic idea is to save Instance information. Why Instances?

Classes are defined and "saved" in your .java files. Instances, however, disappear when you exit the application, so one might want to save Instance information.

What information might this be?

Well, it won't be methods, since methods are in Classes, not Instances. Instances hold Instance Variables and can change Class Variables. Thus, static variables can be a problem when saving an Instance (more on this later).

To get an idea of what serialization is doing, let's consider more about the nature of objects. From "Developing Java Beans" by Robert Englander:

"Most components maintain information that defines their appearance and behavior. This information is known as the state of the object. Some of this information is represented by the object's properties. For instance, the font or color properties of a visual component are usually considered to be part of that object's state. There may also be internal data used by an object that is not exposed as properties, but plays a part in defining the behavior of the object nevertheless.

...The state information of all the components, as well as the application or applet itself, must be saved on a persistent storage medium so that it can be used to recreate the overall application state at run-time. An important aspect of the application state is the definition of the components themselves: the persistent state of an application includes a description of the components being used, as well their collective state."

When an object is saved, all of its state is saved. This means that all handles and objects that the saved object refers to are saved.

In its simplest application, the programmer includes the phrase implements java.io.Serializable in the class definition, as shown in the following example from "Developing Java Beans."

In this example there are three data members. The first two, anInteger and aFloat, are of primitive data types and are therefore serializable. The third, aButton, is an instance of java.awt.Button, a subclass of java.awt.Component, which itself implements java.io.Serializable. Therefore the class SimpleExample can be serialized without doing anything more than declaring that it implements java.io.Serializable.
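
The listing itself is not reproduced in these notes. A minimal sketch that matches the description above (three data members and nothing but the implements clause) might look like this. It is a reconstruction from the text, not the book's verbatim code.

```java
import java.io.Serializable;

// Reconstructed from the description above; not the book's verbatim listing.
class SimpleExample implements Serializable {
    protected int anInteger;            // primitive type: serializable
    protected float aFloat;             // primitive type: serializable
    protected java.awt.Button aButton;  // java.awt.Component implements Serializable
}
```

Nothing else is needed: there are no methods to implement, and the default mechanism takes care of all three members.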

Only classes that implement the Serializable or Externalizable interface can be written to or read from an object stream. Serializable is a marker interface: it doesn't define any methods and serves only to specify whether an object is allowed to be serialized (look in the API ... it is empty!). The Externalizable interface (which extends Serializable) does define methods and is used by objects that want advanced control.
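
For comparison, here is a rough sketch of the Externalizable route. The class GridPoint and its fields are made up for illustration. With Externalizable the class writes and reads its own data in writeExternal() and readExternal(), and it must have a public no-argument constructor.

```java
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

// Hypothetical class illustrating Externalizable; the names are illustrative only.
class GridPoint implements Externalizable {
    int x;
    int y;

    // Required: deserialization calls this constructor, then readExternal().
    public GridPoint() {
    }

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(x);   // the class decides exactly what goes into the stream
        out.writeInt(y);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        x = in.readInt();  // must read in the same order as written
        y = in.readInt();
    }
}
```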

Below is the save() method in the example 8.1 in Java in a Nutshell 2. Note the creation of the ObjectOutputStream and the use of writeObject(). (See Java in a Nutshell, page 173)
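
The book's listing is not reproduced in these notes. As a stand-in, here is a sketch of what such a save() method (and a matching load()) might look like. The Notebook class, its entries field, and the file handling are assumptions for illustration, not the code from Java in a Nutshell.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical application object; not the listing from the book.
class Notebook implements Serializable {
    private static final long serialVersionUID = 1L;

    final List<String> entries = new ArrayList<>();

    // Writes this instance, and everything it refers to, to the given file.
    void save(File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(this);
        }
    }

    // Reads a previously saved Notebook back from the given file.
    static Notebook load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return (Notebook) in.readObject();
        }
    }
}
```

This is the kind of thing one might use to keep what a knowledge expert has entered: call save() before the application exits and load() when it starts again.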

However, this "ease" is not true in all cases. As we shall see, serialization is not so easily applied to classes with static or transient data members. Only data associated with a specific instance of a class is serialized, therefore static data, that is, data associated with a class as opposed to an instance, is not serialized automatically. To serialize data stored in a static variable one must provide class-specific serialization.

Similarly, some classes may define data members to use as scratch variables. Serializing these data members may be unnecessary. Some examples of transient data include runtime statistics or hash table mapping references. These data should be marked with the transient modifier to avoid serialization. Transient, by definition, is used to designate data members that the programmer does not want or need to be serialized. (See Java in a Nutshell, page 174)
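
A small sketch may make the difference concrete. The Session class and its fields are hypothetical. It has one ordinary instance variable, one static variable, and one transient variable, plus two helper methods that write to and read from a byte array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class; the names are illustrative only.
class Session implements Serializable {
    private static final long serialVersionUID = 1L;

    static int openSessions = 0;      // static: class data, never written to the stream
    String user;                      // instance variable: written
    transient long lastAccessMillis;  // transient: skipped, comes back as 0

    Session(String user) {
        this.user = user;
    }

    static byte[] toBytes(Session session) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(session);
        }
        return bytes.toByteArray();
    }

    static Session fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Session) in.readObject();
        }
    }
}
```

If you save a Session, change openSessions, and then restore the saved bytes, user comes back as it was, lastAccessMillis comes back as 0, and openSessions keeps whatever value the class holds at that moment.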

To serialize an object, you create some sort of OutputStream object and then wrap it inside an ObjectOutputStream object. At this point you only need to call writeObject() and your object is magically serialized and sent to the OutputStream. To reverse the process, you wrap an InputStream inside an ObjectInputStream and call readObject(). What comes back is, as usual, a handle to an upcast Object, so you must downcast to set things straight.
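
Those steps can be sketched as follows, using an in-memory byte array in place of a file. The RoundTrip class and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

// Hypothetical helper class; the names are illustrative only.
class RoundTrip {
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();      // some sort of OutputStream
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {  // wrapped in an ObjectOutputStream
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();  // comes back typed as Object
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<String> names = new ArrayList<>();
        names.add("first");
        Object restored = deserialize(serialize(names));
        @SuppressWarnings("unchecked")
        ArrayList<String> copy = (ArrayList<String>) restored;  // downcast to the real type
        System.out.println(copy);
    }
}
```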

Note: a class can define custom serialization and deserialization behavior for its objects by implementing writeObject() and readObject() methods. These methods are not defined by any interface. The methods must be declared private (rather surprising since they are called from outside of the class during serialization and deserialization.)

The following example illustrates the serialization process. It is not necessarily what you need to do unless you need to customize; it is what is done automatically for an instance's instance variables (IVs).
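
The original listing is not included in these notes. A small sketch of a class that customizes serialization might look like this. Temperature and its fields are invented for illustration; the private writeObject() and readObject() signatures are the ones the serialization mechanism looks for.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class; the names are illustrative only.
class Temperature implements Serializable {
    private static final long serialVersionUID = 1L;

    double celsius;               // saved by the default mechanism
    transient double fahrenheit;  // derived value: not saved, rebuilt on reading

    Temperature(double celsius) {
        this.celsius = celsius;
        this.fahrenheit = toFahrenheit(celsius);
    }

    static double toFahrenheit(double celsius) {
        return celsius * 9.0 / 5.0 + 32.0;
    }

    // Called by ObjectOutputStream; must be private with exactly this signature.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();  // writes the non-static, non-transient variables
    }

    // Called by ObjectInputStream; must be private with exactly this signature.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();                  // restores celsius
        fahrenheit = toFahrenheit(celsius);      // custom step: recompute the derived value
    }
}
```

A writeObject() that only calls defaultWriteObject() does exactly what happens automatically; it is spelled out here to show where custom code would go.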

What Objects need to implement Serializable? Component implements Serializable, so all AWT components can be serialized. If a Class is serializable, all of its subclasses are. Otherwise, you need to make your classes implement Serializable.

Since, in a given application, most of your classes probably do not inherit from a serializable class (other than the AWT stuff), you often need to do this explicitly.

This means that each Class whose Instances need to be saved should implement Serializable and, possibly, have custom readObject() and writeObject() methods. If a class does not implement these methods, the default serialization (the same behavior that defaultWriteObject() and defaultReadObject() provide) is used. In custom methods, call the default first. (The default methods may be called only from within a class's readObject() or writeObject() method. If one is called from anywhere else, a NotActiveException is thrown.) (What does readObject() probably do? Upon reading, it instantiates the object and sets its variables to the values saved.)
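
To see the NotActiveException rule in action, this sketch (NotActiveDemo is a hypothetical name) calls defaultWriteObject() directly on a stream rather than from inside a class's writeObject() method.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotActiveException;
import java.io.ObjectOutputStream;

// Hypothetical demo class; the name is illustrative only.
class NotActiveDemo {
    // Returns true if calling defaultWriteObject() outside writeObject() is rejected.
    static boolean defaultCallRejected() throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.defaultWriteObject();  // no object is currently being written
            return false;
        } catch (NotActiveException expected) {
            return true;
        }
    }
}
```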

If you write a save() method for the top Object, then as Java tries to serialize it, it accesses its variables, which are possibly objects that need to be serialized, and performs this recursively. From the specs: "The writeObject method serializes the specified object and traverses its references to other objects in the object graph recursively to create a complete serialized representation of the graph." As for subclasses (from Englander):

"When an object is serialized, the highest serializable class in its derivation hierarchy is located and serialized first. Then the hierarchy is walked, with each subclass being serialized in turn."

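One consequence of this walk through the hierarchy is worth a sketch. Classes above the highest serializable class are not written at all; when the object is restored, their no-argument constructor runs instead. Shape and Circle below are hypothetical classes for illustration.

```java
import java.io.Serializable;

// Hypothetical superclass that is NOT serializable.
class Shape {
    String label = "unnamed";

    // Needed so that deserialization can initialize the Shape part of a Circle.
    Shape() {
    }
}

// Hypothetical subclass: the highest serializable class in this hierarchy.
class Circle extends Shape implements Serializable {
    private static final long serialVersionUID = 1L;

    double radius;  // written and restored

    Circle(String label, double radius) {
        this.label = label;   // stored in Shape, so it is not written
        this.radius = radius;
    }
}
```

After a round trip, a Circle keeps its radius, but its label is whatever the Shape constructor sets, not the value it had when it was saved.
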
Another example: (This is Java in a Nutshell, page 175)

About Class Variables (static). The problem here arises when one wants to change static variables dynamically. Since these belong to the class and are not written with the instance, a restored instance sees whatever value the class currently holds (typically the original value from the class definition), not the value that was in effect when the instance was saved.

One needs to customize (as above) to restore this information. Beware that it is an instance that is trying to save this new class variable. Specifically, if object1 refers to a class (static) attribute of another class and object1 is to be serialized, then to save object1 accurately, the static attribute from the referenced class would also have to be saved, along with any state associated with that static attribute. However, if the referenced class is not serializable, then serializing object1 should throw a NotSerializableException.
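
As a sketch of what such customization might look like, the hypothetical Rule class below writes a class variable by hand after the default data and puts it back on reading. Note the caveat in the comments: restoring one instance changes the value seen by every instance.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class; the names are illustrative only.
class Rule implements Serializable {
    private static final long serialVersionUID = 1L;

    static String currentAuthor = "unknown";  // class variable, changed at run time
    String text;                              // instance variable

    Rule(String text) {
        this.text = text;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();        // instance variables first
        out.writeObject(currentAuthor);  // then the class variable, written by hand
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        // Caution: this overwrites the class variable for ALL instances of Rule.
        currentAuthor = (String) in.readObject();
    }
}
```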


Cautions:

While the model used for serialization is very simple, it has some drawbacks.

First, it's not as simple as marking serializable classes with the Serializable interface. It is possible for an object that can't be serialized to implement Serializable (either directly or by inheritance).

Ultimately, serialization has to do with the data members of the class, not the methods it contains; after all, Serializable is an empty interface and doesn't require you to implement any methods.

An object of a Serializable class can be serialized if, and only if, all of the members that are actually written refer to serializable values. Static and transient members do not count: by default, they are ignored when an object is serialized.

Generally speaking, classes that belong to the standard Java distribution are serializable unless serializing an object of that class would be a security risk. The problem is that there are many standard classes that would present security risks if serialized--for example, a FileInputStream can't be serialized, because when it is deserialized at a later time (and possibly on a different machine), you have an object that references some file handle that may no longer be meaningful, or that may point to a different file than it did originally.

You should make it a practice to check the class of any data members you add to a serializable class to make sure that those data members can be serialized also. Don't make any assumptions; just look it up in the documentation.

Stating that a class implements Serializable is essentially a promise that the class can be successfully saved and restored using the serialization mechanism. The problem is that any subclass of that class automatically implements Serializable via inheritance, even if it adds some non-serializable members. Java throws a NotSerializableException (from the java.io package) if you try to save or restore a non-serializable object.
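
A short sketch of this broken promise, with made-up class names: Account is serializable, and OnlineAccount inherits that claim but adds a member that is not.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical resource class that does not implement Serializable.
class NetworkHandle {
}

// Hypothetical serializable class.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;
    String owner = "nobody";
}

// Inherits Serializable, but adds a member that cannot be serialized.
class OnlineAccount extends Account {
    private static final long serialVersionUID = 1L;
    NetworkHandle handle = new NetworkHandle();
}

class BrokenPromiseDemo {
    // Returns true if the object can be written, false if the stream rejects it.
    static boolean canSerialize(Object obj) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }
}
```

Here canSerialize() succeeds for an Account and fails for an OnlineAccount. Marking the handle member transient would be one way to let the subclass keep the promise.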

When you are writing Beans (or any class that you may want to serialize), you have to think carefully about what the class contains, and you also have to think about how the class will be used.

You can redesign almost any class so that it is serializable, but this redesign may have implications for the interface between your class and the rest of the world. Ultimately, that's the trick with object serialization. It's not as simple as marking a few classes Serializable; it has real implications for how you write code.

There is obviously a lot more to Serializable. For further reference and code, try "Thinking in Java" (the online book that Anne put in this text for you) and "Java in a Nutshell", as well as "Developing Java Beans" by Robert Englander, published by O'Reilly (ISBN: 1-56592-289-1). The SUN web site to visit would be: here