(Text: In Core Java Volume 1, Chapter 12 "Streams and Files" section Object Streams)
Java provides a feature called object serialization that allows you to take any object that implements the Serializable interface and turn it into a sequence of bytes that can later be fully restored into the original object.
"Objects are serialized with the
ObjectOutputStream and they are deserialized
with the ObjectInputStream. Both of these
classes are part of the java.io package, and they
function, in many ways, like DataOutputStream
and DataInputStream because they define the
same methods for writing and reading binary
representations of Java primitive types to and
from streams. What ObjectOutputStream and
ObjectInputStream add, however, is the ability to
write and read non-primitive object and array
values to and from a stream." Java in a Nutshell
The basic idea is to save Instance information. Why Instances?
Classes are defined and "saved" in your .java source files. Instances, however, disappear when you exit the application, so one might want to save Instance information.
What information might this be?
Well, it won't be methods, since methods are defined in Classes (stored in the class files). Instance objects are different: they are created at run-time, hold Instance Variables, and can change Class Variables. Thus, static variables changed by an Instance can be a problem (more later).
To get an idea of what serialization is doing, let's consider the nature of objects. From "Developing Java Beans" by Robert Englander:
"Most components maintain information that defines their appearance and behavior. This information is known as the state of the object. Some of this information is represented by the object's properties. For instance, the font or color properties of a visual component are usually considered to be part of that object's state. There may also be internal data used by an object that is not exposed as properties, but plays a part in defining the behavior of the object nevertheless.
...The state information of all the components, as well as the application or applet itself, must be saved on a persistent storage medium so that it can be used to recreate the overall application state at run-time. An important aspect of the application state is the definition of the components themselves: the persistent state of an application includes a description of the components being used, as well their collective state."
Note that we use object streams to save objects. The writeObject/readObject methods handle only objects, not primitive numbers. To write and read numbers, you use methods such as writeInt/readInt or writeDouble/readDouble. (ObjectOutputStream implements the DataOutput interface and ObjectInputStream implements the DataInput interface.) (More later on why you would want to do this.)
Of course, numbers inside objects (IVs) are saved and restored automatically (class variables, being static, are a different story; discussion later).
When an object is saved, all of its state is saved. This means that all handles (references) held by the saved object, and the objects they refer to, are saved as well. In its simplest application the programmer includes the phrase implements java.io.Serializable in the class definition, as in the following example from "Developing Java Beans."
Note: implements java.io.Serializable could be written as implements Serializable, but then you need to import java.io.Serializable (or java.io.*).
In this example there are three data members.
The first two, anInteger and aFloat are primitive
data types and are therefore serializable. The
third, aButton is an instance of type
java.awt.Button, a subclass of
java.awt.Component which itself implements
java.io.Serializable. Therefore the class
SimpleExample can be serialized without doing
anything more than declaring that it implements
java.io.Serializable.
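The listing from the book is not reproduced here, so the following is only a minimal sketch of a class with the shape described above. The names anInteger and aFloat come from the text; the java.awt.Button member is replaced by a hypothetical Widget class (an assumption, chosen so the sketch runs without a display), and the roundTrip helper is likewise invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SimpleExampleDemo {

    // Hypothetical stand-in for java.awt.Button: a member whose own class
    // is Serializable is saved automatically along with its owner.
    static class Widget implements Serializable {
        String label;
        Widget(String label) { this.label = label; }
    }

    // Declaring the interface is all that is required; no methods are added.
    static class SimpleExample implements Serializable {
        int anInteger = 42;
        float aFloat = 1.5f;
        Widget aWidget = new Widget("OK");
    }

    // Serializes the object into a byte array and reads a copy back.
    static SimpleExample roundTrip(SimpleExample original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SimpleExample) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SimpleExample original = new SimpleExample();
        original.anInteger = 7;
        SimpleExample copy = roundTrip(original);
        System.out.println(copy.anInteger + " " + copy.aFloat + " " + copy.aWidget.label);
    }
}
```

The copy is a distinct object whose primitive members and Widget member carry the saved values.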
Only classes that implement the Serializable or
Externalizable interface can be written to or
read from an object stream. Serializable is a
marker interface - it doesn't define any methods
and serves only to specify whether an object is
allowed to be serialized (look in the API ... it is
empty!). Read what it says about the readObject and writeObject methods. The Externalizable interface (which
extends Serializable) does define methods and
is used by objects that want advanced control.
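To make the contrast with the empty Serializable interface concrete, here is a small sketch of an Externalizable class. The Point class and the roundTrip helper are hypothetical; the point is only that the class itself decides what is written and must read it back in the same order.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

public class ExternalizableDemo {

    // Hypothetical class that controls its own stream format.
    public static class Point implements Externalizable {
        int x;
        int y;

        // Externalizable classes need a public no-argument constructor;
        // it runs before readExternal() fills in the fields.
        public Point() { }

        Point(int x, int y) { this.x = x; this.y = y; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(x);
            out.writeInt(y);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            x = in.readInt();   // same order as written
            y = in.readInt();
        }
    }

    // Writes the point to a byte array and reads a copy back.
    static Point roundTrip(Point original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Point) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point copy = roundTrip(new Point(3, 4));
        System.out.println(copy.x + "," + copy.y);
    }
}
```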
Below is the save() method in Example 8.1 in Java in a Nutshell, 2nd edition. Note the creation of the ObjectOutputStream and the use of writeObject(). Since the class does not define its own writeObject method, save() has to instantiate the ObjectOutputStream explicitly. It saves only one object (specifically, lines).
Notice that it is just lines that is serialized.
(See Java in a Nutshell, page 173 http://www.ecst.csuchico.edu/~amk/foo/javanut2/ch08/ScribbleFrame.java)
Only data associated with a specific instance of a class is serialized, therefore static data, that is, data associated with a class as opposed to an instance, is not serialized automatically. To serialize data stored in a static variable one must provide class-specific serialization.
Similarly, some classes may define data members to use as scratch variables. Serializing these data members may be unnecessary. Some examples of transient data include runtime statistics or hash table mapping references. These data should be marked with the transient modifier to avoid serialization. Transient, by definition, is used to designate data members that the programmer does not want or need to be serialized. See Java in a Nutshell, page 174: mouse position, preferred size, file handles (machine specific (native code)).
When writing code, declaring something transient is a signal (to the programmer) that special serialization code may be needed later.
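The effect of static and transient on the default mechanism can be seen in a short sketch. The Session class, its fields, and the helper methods are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class TransientStaticDemo {

    static class Session implements Serializable {
        static int sessionsCreated = 0;    // class variable: never written
        String user;                       // instance variable: written
        transient long lastAccessMillis;   // scratch data: skipped

        Session(String user) {
            this.user = user;
            this.lastAccessMillis = 12345L;
            sessionsCreated++;
        }
    }

    static byte[] toBytes(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    static String demo() {
        Session.sessionsCreated = 0;
        byte[] saved = toBytes(new Session("alice"));  // sessionsCreated becomes 1
        Session.sessionsCreated = 99;                  // class state changes after the save
        Session copy = (Session) fromBytes(saved);
        // The transient field comes back as 0 and the static keeps its current value.
        return copy.user + " " + copy.lastAccessMillis + " " + Session.sessionsCreated;
    }

    public static void main(String[] args) {
        System.out.println(demo());  // prints "alice 0 99"
    }
}
```

Note that the constructor of a Serializable class is not run when an instance is read back, which is why sessionsCreated is not incremented by the restore.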
To serialize an object, you create some sort of
OutputStream object and then wrap it inside an
ObjectOutputStream object. At this point you
only need to call writeObject() and your object is
magically serialized and sent to the
OutputStream. To reverse the process, you wrap
an InputStream inside an ObjectInputStream and
call readObject(). What comes back is, as usual, a
handle to an upcast Object, so you must downcast
to set things straight.
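The wrap, write, read, and downcast steps just described can be sketched as follows. The file name prefix, the list contents, and the helper names are assumptions made for the example.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;

public class WriteReadDemo {

    // Wrap an OutputStream in an ObjectOutputStream and call writeObject().
    static void save(File file, Object value) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Wrap an InputStream in an ObjectInputStream and call readObject().
    // What comes back is typed as Object.
    static Object load(File file) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new FileInputStream(file))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    static ArrayList<String> roundTripLines() {
        File file;
        try {
            file = File.createTempFile("lines", ".ser");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        file.deleteOnExit();

        save(file, new ArrayList<>(Arrays.asList("first", "second")));

        // Downcast from Object to the type we know we wrote.
        @SuppressWarnings("unchecked")
        ArrayList<String> copy = (ArrayList<String>) load(file);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(roundTripLines());
    }
}
```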
If you need to dynamically query the type of the object, you can use the getClass method. Specifically, dk.getClass().getName() returns the name of the class that dk is an instance of; i.e., this asks the object for the name of its corresponding class object. (Hmmm, true, but what about syntax? I still need to know the type to declare the variable... too bad.) C++ can do this in one operation with dynamic_cast (which gives null for a pointer of the wrong type); Java can use the instanceof operator to check that the object is what I think it is (see Core Java, Ch5 Inheritance, Casting section).
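A small sketch of both checks, applied to a value that arrives typed as Object (as it does from readObject()). The describe method is a hypothetical helper.

```java
public class TypeCheckDemo {

    // Inspects a value of unknown type before downcasting it.
    static String describe(Object restored) {
        if (restored instanceof String) {
            String text = (String) restored;   // safe: the type was just checked
            return "String of length " + text.length();
        }
        // Otherwise fall back to asking the object for its class name.
        return restored.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(describe("abc"));
        System.out.println(describe(new java.util.Date()));
    }
}
```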
The following example illustrates the serialization process. It is not necessarily what you need to do yourself unless you need to customize; it is what is done automatically for an instance's IVs.
Note the order: when reading back, keep track of the number of objects, their order, and their types. In Java, remember that strings and arrays are objects and can, therefore, be written and restored with the writeObject/readObject methods. (Why do this by hand if they are objects? Why is it not automatic? It IS automatic if the object is serializable and it is an IV of an instance that is serializable, and it is not static or transient.)
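As an illustration of why order matters, this sketch writes a number, a string, and an array, then reads them back in exactly the same sequence. The values themselves are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class OrderDemo {

    static String demo() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeInt(3);                            // a primitive
                out.writeObject("Harry");                   // a String is an object
                out.writeObject(new double[] {1.0, 2.5});   // so is an array
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                // Read in the same order, with the matching method and cast.
                int count = in.readInt();
                String name = (String) in.readObject();
                double[] values = (double[]) in.readObject();
                return count + " " + name + " " + values.length + " " + values[1];
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());  // prints "3 Harry 2 2.5"
    }
}
```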
Another example: (This is Java in a Nutshell, page 175) (show use of transient data)
What Objects need to implement Serializable?
Component implements Serializable, so all AWT
components can be serialized. If you want to serialize
a Class of your own, you should check whether all of its superclasses are serializable as well.
If a superclass is not serializable, its fields are not written; when the object is read back,
they are reinitialized by that superclass's no-argument constructor (which must be accessible).
Otherwise you may be fooling yourself and others.
Specifically, you need to ensure that your own classes
implement Serializable properly.
Since in a given application one probably does not inherit most classes from serializable ones (other than the AWT stuff), you often need to do this explicitly.
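What goes wrong when a superclass is left out can be shown with a short sketch. Base and Derived are hypothetical classes; only Derived declares Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SuperclassDemo {

    // Not Serializable: its fields are not written to the stream.
    static class Base {
        String baseNote = "default";
        Base() { }   // run again when a Derived object is read back
    }

    static class Derived extends Base implements Serializable {
        String ownNote = "default";
    }

    static Derived roundTrip(Derived original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Derived) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    static String demo() {
        Derived original = new Derived();
        original.baseNote = "changed";
        original.ownNote = "changed";
        Derived copy = roundTrip(original);
        // The inherited field silently falls back to its initial value.
        return copy.baseNote + " " + copy.ownNote;
    }

    public static void main(String[] args) {
        System.out.println(demo());  // prints "default changed"
    }
}
```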
Why not serialize everything?
Back to serializing...
So, each Class whose Instances will need to be saved should implement Serializable and, possibly, have custom readObject() and writeObject() methods.
If a class does not implement these methods, the default serialization (what defaultWriteObject() and defaultReadObject() provide) will be used. In custom methods, call the default first.
(The default methods may be called only from a class's readObject/writeObject methods. If one is called from outside those methods, a NotActiveException is thrown.)
(What does readObject() probably do? It (1) reads the object type, (2) instantiates the (blank) object, and (3) sets its variables to the values saved.)
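A sketch of custom writeObject()/readObject() methods that call the default first and then handle one extra member by hand. LabeledPoint and Location are hypothetical classes; Location is deliberately not Serializable, so the field holding it is marked transient and written as two doubles.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class CustomSerializationDemo {

    // Hypothetical helper type that is NOT Serializable.
    static class Location {
        final double x;
        final double y;
        Location(double x, double y) { this.x = x; this.y = y; }
    }

    static class LabeledPoint implements Serializable {
        String label;
        transient Location position;   // skipped by the default mechanism

        LabeledPoint(String label, double x, double y) {
            this.label = label;
            this.position = new Location(x, y);
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();        // default first: writes label
            out.writeDouble(position.x);     // then the data handled by hand
            out.writeDouble(position.y);
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();          // default first: restores label
            double x = in.readDouble();
            double y = in.readDouble();
            position = new Location(x, y);
        }
    }

    static LabeledPoint roundTrip(LabeledPoint original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (LabeledPoint) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        LabeledPoint copy = roundTrip(new LabeledPoint("home", 3.0, 4.0));
        System.out.println(copy.label + " " + copy.position.x + " " + copy.position.y);
    }
}
```

The two methods are private; the serialization machinery finds them by reflection, so they are never called directly.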
If you write a save() method for the top Object, then as Java tries to serialize it, it accesses its variables (which are possibly objects that need to be serialized themselves) and performs this recursively. From the specs: "The writeObject method serializes the specified object and traverses its references to other objects in the object graph recursively to create a complete serialized representation of the graph."
As for subclasses (from Englander):
"When an object is serialized, the highest
serializable class in its derivation hierarchy is located and serialized
first. Then the hierarchy is walked, with each subclass being serialized in
turn."
Specifically, readObject and writeObject
methods only need to save and load their data
fields; they should not concern themselves with
superclass data or any other class information.
(except changes to static variables)
Example 1-4 in Core Java (example only to show how saved - poor OO use of main)
http://www.ecst.csuchico.edu/~amk/foo/CoreJava/v2ch1/ObjectFileTest.java On the SUNs at Chico (at least on Expert), the corejava package is at /opt/java/corejava,
this would need to be in your CLASSPATH for this code (and other code from
the CoreJava book) to run
Keep in mind that an object's variables may hold references to other objects, not separate copies of them.
Specifically, consider the example below. Two managers can share the same secretary. One does not want to save three copies of Harry. (One wants to maintain consistent data and not worry about editing numerous copies.)
Thus the term serialization...
Example 1-5 in Core Java V2 (show how saved hierarchically) http://www.ecst.csuchico.edu/~amk/foo/CoreJava/v2ch1/ObjectRefTest.java
Remember, an object contains references to its IV objects, not separate copies of those objects. We want the object layout on disk to be exactly like the object layout in memory. This is persistence. Java achieves persistence through serialization.
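The shared-secretary situation can be checked with a short sketch. This is not the Core Java listing; the Employee and Manager classes here are simplified stand-ins, and the manager names are made up. Both managers are written in one writeObject() call, so the shared Employee is stored once and the sharing survives the round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SharedReferenceDemo {

    static class Employee implements Serializable {
        String name;
        Employee(String name) { this.name = name; }
    }

    // Serializable is inherited from Employee.
    static class Manager extends Employee {
        Employee secretary;
        Manager(String name, Employee secretary) {
            super(name);
            this.secretary = secretary;
        }
    }

    static Manager[] roundTrip(Manager[] staff) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(staff);   // one call saves the whole object graph
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Manager[]) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    static boolean secretaryStillShared() {
        Employee harry = new Employee("Harry");
        Manager[] staff = { new Manager("Carl", harry), new Manager("Tony", harry) };
        Manager[] copy = roundTrip(staff);
        // Both restored managers point at the same restored Employee.
        return copy[0].secretary == copy[1].secretary;
    }

    public static void main(String[] args) {
        System.out.println(secretaryStillShared());  // prints "true"
    }
}
```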
In general:
About Class Variables (static).
The problem arises when one wants to change static variables dynamically. Since these belong to the class and are not written with the instance, a restored instance sees whatever value the class currently has (after a restart, the initial value) rather than the value in effect when the instance was saved.
One needs to customize (as above) to restore this information. Beware that it is an instance that is trying to save this new class variable. Specifically, if object1 refers to a class (static) attribute of another class and object1 is to be serialized, then to save object1 accurately the static attribute from the referenced class would also have to be saved, along with any state associated with that static attribute. However, if the referenced class is not serializable, then saving object1 should fail with a NotSerializableException.
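One way to customize, sketched under the assumption that an instance is allowed to carry the class variable along with it. The Ticket class and its fields are hypothetical. Note the trade-off: reading any saved Ticket overwrites the current value of the static.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class StaticStateDemo {

    static class Ticket implements Serializable {
        static int nextNumber = 1;   // class variable, changed at run-time
        int number;

        Ticket() { number = nextNumber++; }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            out.writeInt(nextNumber);   // save the class variable explicitly
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            nextNumber = in.readInt();  // restore it (overwrites the current value)
        }
    }

    static byte[] toBytes(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    static String demo() {
        Ticket.nextNumber = 1;
        Ticket first = new Ticket();    // number 1, nextNumber becomes 2
        new Ticket();                   // nextNumber becomes 3
        byte[] saved = toBytes(first);
        Ticket.nextNumber = 1;          // simulate a fresh start of the program
        Ticket copy = (Ticket) fromBytes(saved);
        return copy.number + " " + Ticket.nextNumber;
    }

    public static void main(String[] args) {
        System.out.println(demo());  // prints "1 3"
    }
}
```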
While the model used for serialization is very simple, it has some drawbacks.
First, it's not as simple as marking serializable classes with the Serializable interface. It is possible for an object that can't be serialized to implement Serializable (either directly or by inheritance).
Ultimately, serialization has to do with the data members of the class, not the methods it contains; after all, Serializable is an empty interface and doesn't require you to implement any methods.
A class is serializable if, and only if, it has only members that are serializable. By default, static and transient members are ignored when an object is serialized.
Generally speaking, classes that belong to the
standard Java distribution are serializable
unless serializing an object of that class would
be a security risk. The problem is that there are
many standard classes that would present
security risks if serialized--for example, a
FileInputStream can't be serialized,
because when it is deserialized at a later time
(and possibly on a different machine), you have
an object that references some file handle
that may no longer be meaningful, or that may
point to a different file than it did originally.
You should make it a practice to check the class
of any data members you add to a serializable
class to make sure that data members can be
serialized also. Don't make any assumptions;
just look it up in the documentation.
Stating that a class implements Serializable is
essentially a promise that the class can be
successfully saved and restored using the
serialization mechanism. The problem is that
any subclass of that class automatically
implements Serializable via inheritance, even if
it adds some non-serializable members. Java
throws a NotSerializableException (from the
java.io package) if you try to save or restore a
non-serializable object.
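The broken promise described above is easy to reproduce. In this sketch, Base, Handle, Broken, and Repaired are hypothetical classes: Broken inherits Serializable but adds a member whose class is not serializable, and Repaired marks that member transient.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class NotSerializableDemo {

    static class Base implements Serializable {
        String name = "base";
    }

    // Hypothetical machine-specific resource; not Serializable.
    static class Handle {
        int descriptor = 3;
    }

    // Still "implements Serializable" via inheritance, but cannot be saved.
    static class Broken extends Base {
        Handle handle = new Handle();
    }

    // Marking the member transient keeps it out of the stream.
    static class Repaired extends Base {
        transient Handle handle = new Handle();
    }

    // Returns "ok" or the name of the exception that was thrown.
    static String trySerialize(Object value) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return "ok";
        } catch (NotSerializableException e) {
            // The message names the offending class.
            return "NotSerializableException: " + e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(trySerialize(new Base()));
        System.out.println(trySerialize(new Broken()));
        System.out.println(trySerialize(new Repaired()));
    }
}
```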
I looked up JTable and Swing. The Swing components DO implement Serializable, but
look near the bottom of the class descriptions on their API pages (scroll up a little). There is a warning that "this class will not be compatible with future Swing releases."
So this is "short term" serialization. Hence it is better to leave them out of major projects.
When you are writing Beans (or any class that
you may want to serialize), you have to think
carefully about what the class contains, and you
also have to think about how the class will be
used.
You can redesign almost any class so that it is
serializable, but this redesign may have
implications for the interface between your class
and the rest of the world. Ultimately, that's the
trick with object serialization. It's not as
simple as marking a few classes Serializable; it
has real implications for how you write code.
"When an object is serialized, some information about its class must
obviously be serialized with it, so that the correct class file can be loaded
when the object is deserialized. This information about the class is
represented by the java.io.ObjectStreamClass class. It contains the
fully-qualified name of the class and a version number. The version number
is very important because an early version of a class may not be able to
deserialize a serialized instance created by a later version of the same
class." Java in a Nutshell, page 175
Core Java V2 (in the 1.2 Core Java text, this information is
in Volume 1) discusses what the files that are saved during
serialization actually look like. Note item (2) of the class description:
"the serial version unique ID, which is a fingerprint of the
data field types and method signatures".
Java gets this fingerprint by applying the Secure Hash Algorithm (SHA)
to the data of the class.
When the data fields or method signatures of a class change, so does this SHA fingerprint.
So the idea is that when you start serializing instances of a class, you
should identify the fingerprint of the current version of the class.
See Sun's Stream Unique Identifiers page to see everything that
goes into this fingerprint.
To get the SHA fingerprint, run the serialver tool that comes with the JDK on the class.
If you make larger changes that break serialization (2 and 3 above) compatibility, run serialver again to generate an updated version number.
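Besides the serialver tool, the version number of a class can be inspected from code with java.io.ObjectStreamClass. The sketch below uses two hypothetical classes: Person pins its version with a serialVersionUID field (the value 1L is an arbitrary choice for the example), while Unversioned leaves the number to be computed from the class definition.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class VersionDemo {

    // Declares its version explicitly, so compatible edits to the class
    // do not change the number stored in the stream.
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
    }

    // No declared version: the number is computed from the class definition.
    static class Unversioned implements Serializable {
        String name;
    }

    // Returns the serial version UID the serialization mechanism would use.
    static long uidOf(Class<?> type) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        if (descriptor == null) {
            throw new IllegalArgumentException(type.getName() + " is not serializable");
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Person:      " + uidOf(Person.class));
        System.out.println("Unversioned: " + uidOf(Unversioned.class));
    }
}
```

Declaring the field is the usual way to keep old saved instances readable after changes that do not break compatibility.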
"It is up to the class designer to implement additional code in the readObject method to fix version incompatibilities or to make sure the
methods are robust enough to handle null data." CoreJavaV2 pg.66
(null is the default for un-instantiated objects)
The idea: you have a class and you have serialized objects made from
this class. Now the class changes (you have a new version). What
happens when you try to load old instance information to a newly
created instance (from a newer class version)?
"Breaking" serialization produces exceptions like:
java.io.InvalidClassException: Person; local class incompatible:
stream classdesc serialVersionUID = -2832314155938395448,
local class serialVersionUID = 480295508009809219
Do we get it? For a final overview see Serialization part of Bean
tutorial
There is obviously a lot more to Serializable.
For further reference try "Thinking in Java" (the online book I have mentioned),
also "Java in a Nutshell", 2nd ed. for code and 3rd ed. in the I/O chapter
(scroll down to look at the method .GetField),
and "Developing Java Beans" by Robert Englander, published by
O'Reilly (ISBN: 1-56592-289-1). The Sun web
site to visit would be
http://java.sun.com/j2se/1.4.2/docs/guide/serialization and
specifically the specs at http://java.sun.com/j2se/1.4.2/docs/guide/serialization/spec/serialTOC.html