sbprofile

StreamBase Profiler — Runs the StreamBase profiler to produce formatted statistics about operators and queues in a running server.

SYNOPSIS

sbprofile {[-u sburi | -p port | -s statfile]} [-h] [OUTPUT OPTIONS] [FILTER OPTIONS]

DESCRIPTION

Use sbprofile to produce formatted statistical profile information about operators, queues, threads, and system information in a given run of StreamBase Server. sbprofile reads the raw system.statv2 statistics stream either from a live server process (-u or -p options), or from a saved raw statistics file (-s option), which can be compressed with gzip. When calculating operator description information (-d or -D options), sbprofile also examines the currently running application on the server.

The sbprofile command does not analyze and draw conclusions from the system statistics. Its purpose is to extract useful information from the system statistics stream and format it for manual or automated inspection. See Profiling for examples of output and for an overall discussion of StreamBase profiling.

By default, sbprofile with no arguments connects to the default server URI, sb://localhost:10000, and prints to stdout a set of CSV formatted statistics lines, one line for each active operator, queue and thread, plus a line of system information. sbprofile prints one set of statistics for every system.statv2 snapshot interval, which is once per second by default. You can change the snapshot interval using the <period-ms> parameter of the <sbmonitor> element in the server configuration file. (See <sbmonitor> for details.) The command extracts and formats statistics until it is stopped with Ctrl+C or until the server shuts down.

Note

The profiling system shows meaningful statistics only for operators and queues that have run for a while. It is common to have StreamBase applications with operators that use no measurable amount of CPU time when the application has not run very long or has not run with much data. For low latency applications running on fast CPUs, it could take hundreds of thousands of tuples to register statistics greater than zero.

Use only one of the -u, -p, or -s options. You can specify connecting to a live server at a different StreamBase URI than the default (-u), or to the localhost server on a non-default port (-p). Use -s to specify reading from a saved raw statistics file, which must already exist, and must have been generated with sbc -u sburi dequeue system.statv2 > statfile. Raw statistics input files compressed with gzip are automatically interpreted by sbprofile, independent of filename extension.

The name of the statistics stream in the system container changed to statv2 with release 7.0.0. For earlier releases, generate a statistics file from the statistics stream named stat, like this example: sbc -u sburi dequeue system.stat > statfile. Because the statistics stream was renamed, you cannot run a system monitoring client from an earlier release, including sbmonitor and StreamBase Manager, against StreamBase Server 7.0.0 or later. Do not use dequeued system.stat data from StreamBase 6.x servers with the 7.x sbprofile command, because the units for measuring CPU statistics changed.

Use the output options (-F, -o, -b, -c, and -i) to change the output format to a simple HTML table, to redirect the formatted output to a file (or to a bzip2 or gzip compressed file), or to change the reporting interval. Note that the -i option does not alter the system.statv2 sampling interval and does not accumulate statistics. It changes only how often formatted output lines are printed.

Use the filter options (-f, -Q, -t, -Y, -C, -S, -I, -O, -d, -D, and -z) to narrow the output report to operators or queues matching a regular expression, to operators only, or queues only. You can suppress thread or system information. You can restrict the output report for outlying cases by setting threshold values for various fields. You can combine filter options so that an output report contains entries only for statistics matching any of several conditions.

Do not confuse sbprofile's input and output files. If used, the input file is a collection of raw statistics emitted from a server's system.statv2 stream, optionally compressed with gzip by an administrator. If specified, the output file is a formatted CSV or HTML file, extracted from the input stream, optionally compressed with gzip by the sbprofile -c option or with bzip2 by the -b option. The output of sbprofile cannot be used as the input of another sbprofile command.
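The distinction between raw input and formatted output can be sketched as two separate steps. The function names and arguments below are hypothetical wrappers chosen for illustration; only the sbc and sbprofile invocations inside them come from this page.

```shell
# Hypothetical helper: capture RAW statistics from a live server.
#   $1 = StreamBase URI, $2 = raw statistics file to create
capture_raw_stats() {
  sbc -u "$1" dequeue system.statv2 > "$2"
}

# Hypothetical helper: turn a RAW statistics file into FORMATTED output.
#   $1 = raw statistics file (plain or gzip-compressed)
#   $2 = output base name; -c writes gzip output and appends .gz
format_raw_stats() {
  sbprofile -s "$1" -c "$2"
}
```

For example, capture_raw_stats sb://localhost:10000 statfile followed by format_raw_stats statfile profile.csv would leave the raw file untouched and write the formatted report to profile.csv.gz. The formatted file is not a valid argument for -s.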

In the default CSV output format, each emitted line begins with a field containing a single letter that distinguishes the output type. The values in the first field are one of the following:

O for operator statistics lines.
D for operator description lines.
Q for queue statistics lines.
T for thread statistics lines.
S for system information lines.

For the meanings of the other fields in each output line type, see the field tables in the Meanings of the Profiling Statistics Collected section of the Profiling page of the Administration Guide.
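Because the leading letter identifies each line type, standard text tools can separate or tally the CSV output. The following is a minimal sketch, assuming the first comma-separated field holds only the type letter with no surrounding whitespace; count_record_types is a hypothetical helper name, not part of StreamBase.

```shell
# Hypothetical helper: tally sbprofile CSV lines by their leading type letter.
# Reads the named files, or standard input when no file is given.
count_record_types() {
  # -F, splits on commas; $1 is the type letter (O, D, Q, T, or S).
  awk -F, '{ n[$1]++ } END { for (t in n) print t, n[t] }' "$@" | sort
}

# Example use against a saved raw statistics file:
#   sbprofile -s statfile | count_record_types
```

The same pattern extends to filtering, for example awk -F, '$1 == "Q"' to keep only queue statistics lines.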

Note

When a StreamBase application starts, the JVM associated with StreamBase Server interprets the bytecode at the start of execution, compiles it into native CPU instructions, and then optimizes it as the application is run. Thus, when profiling, statistics gathered at the start of the run do not reflect the optimization timing that will be automatically applied as the application continues to run. Therefore, the longer the application runs, the more accurate the profile.

OPTIONS

Specify at most one of -u, -p, or -s; if you specify none of them, sbprofile uses its default server URI as described for -u. The -h option overrides these three options.

-h, --help

Displays usage text.

-u sburi

Specifies the URI of the StreamBase Server instance to communicate with. See the sburi page of the Reference Guide (or see sburi(5) at the UNIX shell prompt) for a discussion of the URI format and its shortcuts. The URI can also be set using the STREAMBASE_SERVER environment variable. If none of -u, -p, or -s is specified, the command proceeds with standard defaults as if you entered -u sb://localhost:10000.

-p TCP-port

Specifies the port number on localhost for the StreamBase Server instance to communicate with. This is a shortcut alternative to specifying the full URI with -u in cases where the server is running on localhost on a port other than the default 10000.

Note

The -p option is not supported for applications that have StreamBase authentication enabled (because there is no way to specify a username and password) or in conjunction with the multiple URI syntax.

-s statfile

Specifies the path to a file containing StreamBase Server raw statistics for an application or container, to be opened for formatting and display. The raw statistics file must already exist and must have been collected with a command like the following: sbc -u sburi dequeue system.statv2 > statfile

(The name of the statistics stream in the system container changed to statv2 with release 7.0.0. For earlier releases, generate a statistics file from the statistics stream named stat, like this example: sbc -u sburi dequeue system.stat > statfile)

(The raw statistics input file created with sbc dequeue is NOT the same as a formatted output file created with sbprofile and its -o, -b, or -c options.)

-Jjvm-option

Specifies a system property setting or other JVM argument to be passed to the JVM that runs this sbprofile command. Use this option to specify temporary settings that affect only the current invocation of sbprofile. You must specify multiple -J options to specify multiple JVM arguments.

There must be no space after the -J. For example, specify -J-Xmx2G. Use the full option syntax for jvm-option that you would use at the Java command line, including the initial hyphen. For example, specify -J-Dstreambase.log-level=2 to increase the log level for this invocation of sbprofile.

Your jvm-option argument might require surrounding quotes, depending on the characters it contains and the shell you are using. However, do not use quotes to escape the spaces between separate JVM arguments; instead use separate -J options. For example: -J-Xms512M -J-Xmx2G

OUTPUT OPTIONS

-F csv | html, --format csv | html

Specifies the display output format. The default is CSV format, with one line per operator or queue for each snapshot interval (or for the snapshot interval at every nSecs, if used with -i). Specify HTML format to generate a simple HTML page with a table for each snapshot interval (one second by default), or a table for the snapshot interval at every nSecs, if used with -i.

-o filename, --outfile filename

Specifies the path to a filename to contain the CSV or HTML formatted output of the command. The default is to write to stdout.

-b filename, --bzip2-outfile filename

Specifies the path to a filename to contain the formatted output of the command, which will be compressed with bzip2. The .bz2 filename extension is automatically appended to the filename you specify. Profile data is compressed in line and written to the target file in that form. Since profile output files have a regular format, bzip2 compression provides about half the file size of gzip compression. See the --roll-size option for more information.

-c filename, --compressed-outfile filename

Specifies the path to a filename to contain the formatted output of the command, which will be compressed with gzip. The .gz filename extension is automatically appended to the filename you specify. Profile data is compressed in line and written to the target file in that form. See the --roll-size option for more information.

-i nSecs

Specifies emitting statistics every nSecs seconds instead of every snapshot interval (every second by default). This option does not accumulate statistics over the specified nSecs; instead it emits the statistics for the specified snapshot interval at each nSecs period, omitting statistics between nSecs values. For example, sbprofile -i 10 emits the statistics for the 10th second, then the 20th, then the 30th, and so on, discarding statistics for seconds 0 through 9, 11 through 19, 21 through 29, and so on. This option works best with the default CSV output format.
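The selection behavior of -i can be modeled with a one-line filter. This is only an illustration of the "keep every nSecs-th snapshot, drop the rest" rule, not sbprofile's implementation; every_nth is a hypothetical helper, and the sketch assumes one input line per snapshot, numbered from 1.

```shell
# Hypothetical helper: keep every n-th input line and discard the others,
# mirroring how -i selects snapshots without accumulating them.
every_nth() {
  awk -v n="$1" 'NR % n == 0'
}

# Model 35 one-second snapshots with -i 10: only 10, 20, and 30 survive.
seq 1 35 | every_nth 10
```

Nothing from the discarded snapshots is summed or averaged into the retained ones, which is why the statistics for the 10th second describe that second only.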

--roll-size M

When -o, -b, or -c is used to write an output file, close the file when it reaches approximately M megabytes and start writing to a new output file. The value M is approximate, as file sizes can vary considerably depending on the selected output options:

  • With the -o option, final file size can be slightly larger than specified. The extra amount is the size of the longest profile output line, minus 1.

  • With the -b or -c compression options, final file size will be much smaller than specified (a factor of 10 is not uncommon), depending on its data content and which method is used for compression.

--roll-time T[M | H | D]

When -o or -c is used to write an output file, close the file when T time units have elapsed and start writing to a new output file. The time units are expressed as M for minutes, H for hours, and D for days, with H the default.
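As a rough sketch of how the rolling options combine with an output option, the hypothetical wrapper below writes gzip-compressed output and rolls to a new file by size. The function name and its arguments are assumptions for illustration; this page does not describe how rolled files are named, so the sketch makes no assumption about that.

```shell
# Hypothetical helper: profile a live server into gzip-compressed output,
# starting a new output file at roughly the given size.
#   $1 = StreamBase URI, $2 = output base name, $3 = roll size in megabytes
profile_with_rolling() {
  sbprofile -u "$1" -c "$2" --roll-size "$3"
}
```

To roll by elapsed time instead, --roll-size "$3" could be swapped for something like --roll-time 12H, or --roll-time 12, since H is the default unit.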

FILTER OPTIONS

The options in this section are OR'ed together. This allows you to specify two or more independent filter options, in which case sbprofile prints output statistics for each match of any specified filter option.

-f regex, --format regex

Specifies filtering the formatted output by matching a regular expression against the operator or queue name in the first field of the CSV output. You can specify more than one -f option. Use this feature to restrict the output to a subset of operators or queues of interest. To match a substring of the operator or queue name, place dot-asterisk (.*) before and after your case-sensitive substring characters. For example:

sbprofile -s rawstatsfile.gz -f ".*Update.*"

-Q, --no-queues

By default, sbprofile shows formatted statistics for each operator, queue, system, and thread in the raw input statistics. Use -Q to suppress statistics for all queues.

-t, --no-threads

By default, sbprofile shows formatted statistics for each operator, queue, system, and thread in the raw input statistics. Use -t to suppress statistics for all threads.

-Y, --no-system

By default, sbprofile shows formatted statistics for each operator, queue, system, and thread in the raw input statistics. Use -Y to suppress system information.

-C T, --cpuThreshold T

Only include operators whose Time (ms) entry is greater than or equal to T milliseconds. (Has no filter effect on queue statistics.) Since most operators show zero or very low values in the Time (ms) column, start with low-numbered arguments such as 1 or 2 and increase them as required. Remember that the argument is in milliseconds, so -C 1 means: show operators with a CPU time of 0.001 seconds or more.

-S N, --sizeThreshold N

Only include operators whose Size entry is greater than or equal to N, or queues whose Queue Current Size entry is greater than or equal to N.

-I N, --inputThreshold N

Only include operators whose Operator In entry is greater than or equal to N. (Has no filter effect on queue statistics.)

-O N, --outputThreshold N

Only include operators whose Operator Out entry is greater than or equal to N. (Has no filter effect on queue statistics.)

-d, --description

For each operator that passes filtering, emit its type once.

-D, --properties

Same as -d, except include any operator properties.

-z, --printZeroOperators

Include all operators, even those that have accumulated zero processing time. Include all queues, even those that have never had a size greater than 0. Include all threads, even those that have never had CPU time charged to them.
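Since the filter options in this section are OR'ed together, two thresholds widen the report rather than narrow it. The sketch below, using a hypothetical wrapper name and arguments, asks for operators that meet either a CPU time threshold or an input threshold.

```shell
# Hypothetical helper: report operators that are outliers on EITHER measure.
#   $1 = StreamBase URI
#   $2 = CPU threshold in milliseconds (-C)
#   $3 = input threshold (-I)
profile_outliers() {
  sbprofile -u "$1" -C "$2" -I "$3"
}
```

With profile_outliers sb://localhost:10000 2 1000, an operator should appear if its Time (ms) entry is at least 2 or its Operator In entry is at least 1000. An operator does not need to satisfy both conditions.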

ENVIRONMENT

STREAMBASE_SERVER

Optional. Contains the URI for a StreamBase Server instance. Use this variable to set a default StreamBase URI for StreamBase commands that take the -u option. If set, commands use the URI in this variable, overriding their built-in default URI, which is sb://localhost:10000. If this variable is set, you must use the -u option to communicate with any server other than the one specified in this variable. See the sburi page in the Reference Guide for more on StreamBase URIs.
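The two ways of selecting a server can be compared side by side. This is a minimal sketch with hypothetical function names; it assumes a POSIX-style shell in which a variable assignment placed before a command is passed to that command's environment.

```shell
# Hypothetical helper: rely on STREAMBASE_SERVER for this one invocation.
#   $1 = StreamBase URI to use as the default
profile_env_default() {
  STREAMBASE_SERVER="$1" sbprofile
}

# Hypothetical helper: name the server explicitly with -u, which is needed
# to reach any server other than the one in STREAMBASE_SERVER.
#   $1 = StreamBase URI
profile_explicit() {
  sbprofile -u "$1"
}
```

Setting the variable once in a shell profile with export STREAMBASE_SERVER=... applies the same default to every StreamBase command that takes -u.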