Disk Space Calculations

Disk Space Required

The disk space requirements for Spotfire Data Streams can be considerable, and fall into the following categories.

Installation Directory

Recent releases of Spotfire Data Streams have required approximately:

  • 2.0 GB on Windows

  • 1.7 GB on macOS

  • 2.6 GB on Linux

Installer Files

The installer files for Spotfire Data Streams require approximately the following amounts, plus several hundred megabytes of temporary space during installation. The installer files can be removed after successful installation.

  • 1.3 GB on Windows

  • 1.4 GB on macOS

  • 1.7 GB on Linux
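
The figures in the two lists above can be combined into a rough estimate of the space needed while the installer files are still on disk. The following is a minimal sketch; the 0.5 GB value for temporary space is an assumption standing in for "several hundred megabytes", and the function name is illustrative only.

```python
# Approximate figures in GB, copied from the two lists above.
INSTALL_DIR_GB = {"Windows": 2.0, "macOS": 1.7, "Linux": 2.6}
INSTALLER_GB = {"Windows": 1.3, "macOS": 1.4, "Linux": 1.7}

# Assumption: "several hundred megabytes" of temporary space is taken as 0.5 GB.
TEMP_SPACE_GB = 0.5


def peak_install_gb(platform: str) -> float:
    """Estimate the disk space needed while the installer files are still present."""
    total = INSTALL_DIR_GB[platform] + INSTALLER_GB[platform] + TEMP_SPACE_GB
    return round(total, 1)


for name in INSTALL_DIR_GB:
    print(f"{name}: about {peak_install_gb(name)} GB during installation")
```

After a successful installation, removing the installer files returns the footprint to roughly the installation directory figure alone.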

StreamBase Studio Workspace

Disk use for the StreamBase Studio workspace in which you develop applications depends on the number of projects you maintain and the volume of data they store. Expect to need about another gigabyte for a large number of projects, but see Node Directories, next.

Node Directories

Running a StreamBase EventFlow or LiveView fragment or application in the StreamBase Runtime creates a node directory that consumes disk space.

The node directory includes a memory-mapped file, ossm, that represents the memory potentially consumable by the node. On Windows, Linux, and recent macOS versions, this is a sparse file: its apparent size equals the full memory size configured for the node, but the disk space it actually occupies is only the portion of that memory the node has actually used.

However, macOS Sierra 10.12 and earlier versions did not support the APFS file system, and therefore did not support sparse files. On those systems the ossm files in node directories actually do consume the amount of disk space they appear to consume.
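
To see the difference between the apparent size of a sparse file and the disk space it occupies, you can compare two fields returned by the operating system. The sketch below is a generic POSIX illustration, not a StreamBase tool; it assumes Linux or macOS, where st_blocks is reported in 512-byte units, and the function name is hypothetical.

```python
import os


def file_sizes(path: str) -> tuple[int, int]:
    """Return (apparent_bytes, allocated_bytes) for a file.

    st_size is the size the file appears to have. st_blocks counts the
    512-byte units actually allocated on disk (Linux and macOS); the
    field is not available on Windows, so this sketch is POSIX-only.
    """
    info = os.stat(path)
    return info.st_size, info.st_blocks * 512
```

For a sparse file, the second value is typically much smaller than the first. On a file system without sparse file support, the two values are roughly equal.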

In general, node directories are temporary and are automatically removed when the node is removed. When a node fails, however, its node directory is preserved so that you can analyze the log files stored there.

Running fragments and applications from the command line creates one node directory per node, and you specify the location of the node directory at node installation time. This means you can specify a fast, local disk with plenty of room for these node directories. It is not recommended to use a network-attached location for node directories.

Running fragments and applications in Studio creates one node directory per node in the Studio workspace, in a folder named .nodes, which is hidden by default. As with command line node directories, Studio-launched node directories are temporary and are automatically removed at Studio exit time for successfully removed nodes. However, the node directories for failed nodes are preserved for analysis, which can increase the disk space consumed by your Studio workspace.

It is recommended to inspect any node directories left by failed nodes as soon as practical after the failure, and to create a snapshot zip file, so that the failed node directory can then be removed.
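
If you only need to keep a copy of a failed node directory before deleting it, a plain zip archive is one option. The following sketch uses only the Python standard library; it produces a generic archive, which is not necessarily equivalent to a snapshot created by the product's own tooling, and the function name and paths are hypothetical.

```python
import shutil
from pathlib import Path


def archive_and_remove(node_dir: str, archive_dir: str) -> Path:
    """Zip node_dir into archive_dir, then delete node_dir.

    Returns the path of the zip file that was written.
    """
    source = Path(node_dir).resolve()
    target_base = Path(archive_dir).resolve() / source.name
    target_base.parent.mkdir(parents=True, exist_ok=True)
    # make_archive appends ".zip" to the base name and returns the full path.
    zip_path = shutil.make_archive(str(target_base), "zip", root_dir=str(source))
    # Remove the original only after the archive has been written.
    shutil.rmtree(source)
    return Path(zip_path)
```

Choose an archive location outside the node directory itself, and make sure the node is no longer running before removing its directory.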

Local Maven Repository

StreamBase uses the Maven software build system, which automatically retrieves dependencies from a network location and stores them in a local disk repository. The local Maven repository is placed by default in the .m2/repository folder of your home directory.

On first use, Studio inspects the local repository and populates it with a base set of artifacts (usually JAR files) that are required for most EventFlow and LiveView development for the current release of StreamBase. When you load and run a project, Studio loads more artifacts to support the project, especially if the project uses any adapters. It does not take long for the local repository to take up over one gigabyte.

The more projects you load and run, the more artifacts may be downloaded into the local repository. Each StreamBase release has its own version of the JAR file artifacts, which are loaded in parallel with any existing artifacts, with any similar file names distinguished by release number.

Also remember that the Maven repository is a universal resource, used by any other programs you install that are based on Maven.

In short, prepare for your local Maven repository to grow to several gigabytes.
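
To check how large the repository has already become, you can total the file sizes under it. This is a minimal sketch that assumes the default ~/.m2/repository location described above; both function names are illustrative.

```python
import os


def directory_size_bytes(root: str) -> int:
    """Sum the apparent sizes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links so linked files are not counted twice.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def default_repository() -> str:
    """Return the default local repository location, ~/.m2/repository."""
    return os.path.join(os.path.expanduser("~"), ".m2", "repository")
```

For example, directory_size_bytes(default_repository()) / 1e9 gives the repository size in gigabytes, or 0.0 if the folder does not exist yet. The same function can be pointed at a Studio workspace's .nodes folder.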

It is strongly recommended to place your local Maven repository on a fast, local, solid state disk. If your organization configures your home directory to be on a network server, this can result in slower performance when configuring and building StreamBase and LiveView applications.

Specifying a Different Local Maven Repository

You may prefer to specify a non-default location for your local Maven repository in cases like the following:

  • Your organization places your home directory on a network drive and you want to move the repository to a fast local drive.

  • You are running out of disk space on your home directory's drive.

  • You want to keep different local repositories for different purposes while developing a StreamBase application, or to keep your StreamBase-related development separate from other Maven projects.

The simplest way to move your local Maven repository is with a single line in a ~/.m2/settings.xml file. For example:

<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0
   http://maven.apache.org/xsd/settings-1.0.0.xsd">

   <localRepository>${user.home}/tempus/fugit/repository</localRepository>

</settings>

The <localRepository> element of this settings.xml file moves the local repository to the tempus/fugit/repository subdirectory of your home directory. The <localRepository> element must specify an absolute path, or use the ${user.home} variable.

This method moves the local repository for all Maven purposes until changed in the settings.xml file.

Notice that we still use ~/.m2 to contain the settings.xml file itself. This standard location allows both Studio and command line mvn to locate your settings. You can leave the settings.xml file as the only contents of the ~/.m2 folder and still have the gigabytes of the repository folder placed elsewhere.

To identify the repository location that StreamBase Studio is currently using, open Window>Preferences (Windows) or StreamBase Studio 11.1>Preferences (Mac). Then open Maven>User Settings and look in the Local Repository field.
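
Outside Studio, you can approximate the same lookup by reading settings.xml directly. The sketch below makes several assumptions: it handles only the 1.0.0 settings namespace shown in the example above (or no namespace), expands only the ${user.home} variable, and ignores any overrides set elsewhere, such as in Studio preferences or on the mvn command line. The function name is hypothetical.

```python
import os
import xml.etree.ElementTree as ET

# Namespace used by the settings.xml example above.
SETTINGS_NS = "{http://maven.apache.org/SETTINGS/1.0.0}"


def local_repository(settings_path: str) -> str:
    """Return the repository path set in settings.xml, or the default."""
    home = os.path.expanduser("~")
    default = os.path.join(home, ".m2", "repository")
    if not os.path.isfile(settings_path):
        return default
    root = ET.parse(settings_path).getroot()
    # Try the namespaced element first, then a plain one.
    node = root.find(SETTINGS_NS + "localRepository")
    if node is None:
        node = root.find("localRepository")
    if node is None or not (node.text or "").strip():
        return default
    # Only the ${user.home} variable is expanded in this sketch.
    return node.text.strip().replace("${user.home}", home)
```

Calling local_repository(os.path.expanduser("~/.m2/settings.xml")) with the example file above returns the tempus/fugit/repository path under your home directory.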

Using a Maven Repository Manager

By default, Maven downloads publicly available artifacts from Maven Central, a large public repository operated by Sonatype for the Maven community. The service is very busy, and reaching it requires direct Internet access from your workstation to servers outside your organization's firewall.

Most organizations with more than a few developers using Maven will want to configure a Repository Manager, which is server software you install locally to act as a proxy server for the public Maven repositories. A Repository Manager is especially helpful for organizations with security requirements that restrict open access to the Internet.

See the Best Practice page on using a Repository Manager in the Apache Maven documentation for a list of Repository Manager software providers, both commercial and open source.

StreamBase Tools to Manage the Local Repository

As described above, StreamBase Studio populates the local repository automatically when Studio first starts in a new workspace, with continuous updates as needed.

However, there are conditions under which you may prefer a manual approach. StreamBase provides two ways to help you manage your local Maven repository.

Help > Synchronize StreamBase Maven Artifacts

This Studio menu option forces Studio to rerun its population of the local Maven repository with all StreamBase-related artifacts.

You can run this command after you have moved the local repository as described above, or to restore repository contents that were mistakenly deleted. This command runs a smart merge: it checks which files are present and which version they have before recopying, and recopies only the missing files.
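
The idea of recopying only missing files can be illustrated with a short sketch. This is not Studio's implementation, which is not described here; it is a generic example with a hypothetical function name. Because a Maven repository stores each artifact version under its own directory, comparing relative paths is enough to tell versions apart in this simplified form.

```python
import shutil
from pathlib import Path


def copy_missing(source_root: str, repo_root: str) -> list[str]:
    """Copy files from source_root into repo_root, skipping existing ones.

    Returns the relative paths that were copied.
    """
    src = Path(source_root)
    dst = Path(repo_root)
    copied = []
    for path in sorted(src.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(src)
        target = dst / relative
        if target.exists():
            continue  # already present, so leave it untouched
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)
        copied.append(relative.as_posix())
    return copied
```

Files already in the target are never overwritten, so running the function a second time copies nothing.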

The epdev Command

Use the epdev command to rerun Studio's local repository update in the same way as the Help>Synchronize StreamBase Maven Artifacts menu option. You can also use epdev to install local copies of dependencies that would normally be retrieved from Maven Central or from other remote locations, to prepare a computer for offline use while traveling.

See the epdev command reference for syntax and further options.