Data References
Data References are a convenient programming interface for passing lightweight references to data across the network. A Grid client or Service can create Data References and pass them over the network while leaving the data where the original Data Reference was created. If a Grid client or Grid node actually needs the data, it can de-reference the object, and the data is automatically downloaded from the original source.
You can use this abstraction to generalize Grid workflows. A Grid client can receive the results of a particular Service as a reference, and then send another request to the Grid with that reference. The GridServer Engine that services the request de-references the data object, loading it from the original Grid node that produced the data. This is analogous to passing pointers across the network.
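The workflow above can be illustrated with a small in-process model. This is only a sketch of the pattern, not the GridServer API: OriginNode, RemoteHandle, and their methods are hypothetical names invented for the illustration, and plain Java serialization stands in for whatever wire format the Grid actually uses.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the node that owns the data; not a GridServer class.
class OriginNode {
    private final Map<String, byte[]> held = new ConcurrentHashMap<>();

    // Keep the data locally and hand back only a small handle to it.
    RemoteHandle publish(String id, byte[] data) {
        held.put(id, data);
        return new RemoteHandle(id);
    }

    // Called when some holder of the handle finally needs the bytes.
    byte[] serve(String id) {
        return held.get(id);
    }
}

// Hypothetical lightweight reference: it carries an identifier, never the payload.
class RemoteHandle implements Serializable {
    private static final long serialVersionUID = 1L;
    final String id;

    RemoteHandle(String id) {
        this.id = id;
    }

    // De-reference: pull the data from the node that still holds it.
    byte[] dereference(OriginNode origin) {
        return origin.serve(id);
    }

    // Size of this handle when written with plain Java serialization.
    int wireSize() {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(this);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.size();
    }
}

class ReferenceWorkflowSketch {
    public static void main(String[] args) {
        OriginNode producer = new OriginNode();
        byte[] result = new byte[1_000_000]; // a large Service result
        RemoteHandle ref = producer.publish("result-1", result);

        // The client forwards only the handle with its next request.
        System.out.println("handle bytes on the wire: " + ref.wireSize());

        // The consumer loads the data from the producer only when it needs it.
        byte[] loaded = ref.dereference(producer);
        System.out.println("payload bytes loaded: " + loaded.length);
    }
}
```

In this model the payload never leaves the producer until dereference is called, which is the property that lets a client chain requests without moving the data itself.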
A DataReference is an object that you can pass interoperably when you use it as an argument, a return type, or a GridCache object. To work interoperably, the DataReference must be the actual object passed; it cannot be part of another object.
To create a DataReference, use a DataReferenceFactory. You cannot create Data References with a null source. To retrieve the actual data, use the fetch methods. The data is not cached after a fetch; each fetch retrieves the data again. After a reference expires, a fetch throws a FileNotFound exception. A fetch from a downed Client throws a Connect exception, although in some cases refused connections are due to other causes, such as socket backlog limitations.
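The fetch behavior described above can be modeled in a few lines. The following is a sketch under stated assumptions, not the DataReferenceFactory API: SketchReference, SketchReferenceFactory, the time-to-live parameter, and the injected clock are all hypothetical, and the choice of NullPointerException for a null source is an assumption of this example. The Connect exception case is not modeled.

```java
import java.io.FileNotFoundException;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical reference type; the real DataReference API may differ.
class SketchReference {
    private final Supplier<byte[]> source;
    private final long expiresAtMillis;
    private final LongSupplier clock;

    SketchReference(Supplier<byte[]> source, long expiresAtMillis, LongSupplier clock) {
        this.source = source;
        this.expiresAtMillis = expiresAtMillis;
        this.clock = clock;
    }

    // No caching: every call goes back to the source.
    byte[] fetch() throws FileNotFoundException {
        if (clock.getAsLong() >= expiresAtMillis) {
            throw new FileNotFoundException("reference expired");
        }
        return source.get();
    }
}

// Hypothetical factory; it refuses a null source up front.
class SketchReferenceFactory {
    static SketchReference create(Supplier<byte[]> source, long ttlMillis, LongSupplier clock) {
        Objects.requireNonNull(source, "source must not be null");
        return new SketchReference(source, clock.getAsLong() + ttlMillis, clock);
    }
}

class FetchSemanticsSketch {
    public static void main(String[] args) {
        AtomicInteger reads = new AtomicInteger();
        long[] now = {0L}; // manually advanced clock, so no sleeping is needed
        SketchReference ref = SketchReferenceFactory.create(
                () -> {
                    reads.incrementAndGet();
                    return new byte[] {1, 2, 3};
                },
                1_000L,
                () -> now[0]);
        try {
            ref.fetch();
            ref.fetch();
            System.out.println("source reads after two fetches: " + reads.get());
            now[0] = 2_000L; // move past the expiry time
            ref.fetch();
        } catch (FileNotFoundException e) {
            System.out.println("fetch after expiry failed: " + e.getMessage());
        }
    }
}
```

Because nothing is cached, code that needs the data more than once may prefer to fetch once and keep the result in a local variable.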
When an object is transmitted over the network, it is serialized on the sending side and then reconstituted completely in memory on the receiving side, so the larger the object, the more memory it occupies. When the data behind a DataReference is large, you must use DataReferenceOutputStream instead. Streaming the data can reduce the memory that would otherwise be necessary to reconstitute it in its entirety. As an added benefit, streaming also saves the time it would have taken to reconstruct the entire object.
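The memory argument for streaming can be seen with standard java.io classes alone. The sketch below does not use DataReferenceOutputStream, whose exact interface is not described here; StreamingSketch and ChecksumSink are hypothetical names, and the sink simply stands in for any receiver that can process data chunk by chunk instead of holding the whole object.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.CRC32;

// Hypothetical receiver that consumes bytes incrementally and stores none of them.
class ChecksumSink extends OutputStream {
    private final CRC32 crc = new CRC32();

    @Override
    public void write(int b) {
        crc.update(b);
    }

    @Override
    public void write(byte[] buf, int off, int len) {
        crc.update(buf, off, len);
    }

    long value() {
        return crc.getValue();
    }
}

class StreamingSketch {
    // Copy in fixed-size chunks so that only one chunk is buffered at a time.
    static long streamCopy(InputStream in, OutputStream out, int chunkSize) throws IOException {
        byte[] chunk = new byte[chunkSize];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // The source array only simulates the sender; the copy loop itself
        // never allocates more than the 8 KB chunk.
        byte[] payload = new byte[5_000_000];
        ChecksumSink sink = new ChecksumSink();
        long copied = streamCopy(new ByteArrayInputStream(payload), sink, 8 * 1024);
        System.out.println("bytes streamed: " + copied);
        System.out.println("checksum: " + Long.toHexString(sink.value()));
    }
}
```

The receiver's working memory here is bounded by the chunk size rather than by the size of the data, which is the saving the paragraph above describes.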