What's new in SPEC SFS 2014 SP2?
The SPEC SFS 2014 SP2 benchmark is a minor revision update to the existing SPEC SFS 2014 SP1 benchmark. All published benchmark results remain valid.
The original SPEC SFS 2014 SP1 workloads remain unchanged in the SP2 release: DATABASE, SWBUILD, VDA, and VDI. These workloads carry forward into SP2, and results from SP1 and SP2 are directly comparable to each other.
The SP2 release introduces one new workload: EDA. The EDA workload is entirely new and is available only in the SPEC SFS 2014 SP2 release.
EDA
The EDA workload is representative of Electronic Design Automation environments. The SPEC SFS 2014 SP2 EDA workload is based on network traces collected from real environments, input from domain experts, and published documentation of the real-world EDA applications being simulated. It consists of two component workloads that provide two distinct types of behavior. This workload is also the first in the SPEC SFS 2014 suite to have data sets that are both compressible and dedupable; support for dedupable data sets is a new feature of the SPEC SFS 2014 SP2 benchmark release.
One component workload is EDA_FRONTEND, which presents a load consisting of large numbers of small files with an emphasis on metadata-intensive operations. These small files represent small components that will be aggregated into larger assemblies. Each process generates 100 ops/sec of load to ~1,200 files that are each 16 KiB in size, and ~1,200 that are empty files. The empty files are used for appends, existence checks, permission checks, and various other operations that may involve data or metadata operations but start with empty files. The empty files also help spread accesses across more of the name space and the physical space associated with metadata. There are typically a large number of these small files that are transient or used for conveying state information to the EDA application. For each increment in LOAD, the number of EDA_FRONTEND processes will increase by 3.
The other component workload is EDA_BACKEND, which represents the load of large assemblies being created. These types of operations are reads and writes to large assembly-type files. Each process generates 75 ops/sec of load to ~580 files that are 10 MiB in size, and ~580 that are empty files. For each increment in LOAD, the number of EDA_BACKEND processes will increase by 2.
The ratio of EDA_FRONTEND to EDA_BACKEND processes is maintained at 3 to 2 (3 EDA_FRONTEND processes for every 2 EDA_BACKEND processes) as the LOAD increments.
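As a rough illustration of how the offered load scales, the sketch below computes process counts and the aggregate requested op rate for a given LOAD value, using only the figures quoted above (3 EDA_FRONTEND processes at 100 ops/sec and 2 EDA_BACKEND processes at 75 ops/sec per LOAD increment). The function name and the assumption that each LOAD increment corresponds to one job set are illustrative and not taken from the benchmark code.

```python
def eda_offered_load(load):
    """Estimate EDA process counts and requested op rate for a LOAD value.

    Illustrative only: assumes each LOAD increment adds one job set of
    3 EDA_FRONTEND + 2 EDA_BACKEND processes, per the description above.
    """
    frontend_procs = 3 * load            # 3 EDA_FRONTEND processes per increment
    backend_procs = 2 * load             # 2 EDA_BACKEND processes per increment
    requested_ops = frontend_procs * 100 + backend_procs * 75  # ops/sec
    return frontend_procs, backend_procs, requested_ops

# Example: LOAD = 10 -> 30 EDA_FRONTEND + 20 EDA_BACKEND processes,
# requesting 30*100 + 20*75 = 4,500 ops/sec in aggregate.
print(eda_offered_load(10))
```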
The EDA workload produces the new JOB_SETS metric, and its results are publishable in the same way as those of the other workloads.
For more information about the EDA workload, see the technical session presented at SNIA’s Storage Developer Conference 2016, “Introducing the EDA Workload for the SPEC SFS® Benchmark,” available on YouTube: https://youtu.be/LaxXsrOeux4
SPEC SFS 2014 SP2 framework enhancements
- Improved scaling: expanded the maximum number of processes from 10,000 to 30,000 (system-wide total number of processes within the benchmark framework)
- Improved startup time: reduced the wall-clock time to get tests started
- Enhanced client synchronization capabilities to ensure the benchmark stays in sync regardless of network latency or requested load level
- Epoch-based time synchronization method added to enhance barrier synchronization
- Improved internal timer resolution from microseconds to nanoseconds
- Improved robustness: less susceptibility to transient network issues causing the test to fail

Enhanced workload definitions
- Added support for component workloads to have different numbers of directories, files, and sizes of files (used by the new EDA workload)
- Added support for dedupable data sets (used by the new EDA workload; a minimal sketch follows this list)
- Added new operation types available to workloads:
  - Op_rmdir(): removes a directory
  - Op_Unlink2(): unlinks a file that has a non-zero file size
- Option to use encrypted data sets
- New hotspot options and data layouts:
  - NO_OP_VALIDATE: skip validating that all the op types work
  - NO_SHARED_BUCKETS: only create files if they will be used by an active op type
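As a way to picture what a dedupable data set means here, the following minimal sketch builds a buffer in which roughly a chosen percentage of the bytes consist of repeated fixed-size granules, each granule pattern reused only a bounded number of times. It is not the benchmark's generator; the function and parameter names are invented, and they loosely mirror the Percent dedup, Dedup granule size, and Dedup gran rep limit attributes described in the table that follows.

```python
import os

def make_dedupable_buffer(total_size, pct_dedup, granule_size, rep_limit):
    """Illustrative only: the first pct_dedup percent of the buffer is filled
    with repeated granules (dedupable); the remainder is unique random data.
    Each repeated pattern is reused at most rep_limit times before a new
    pattern is generated, loosely mirroring 'Dedup gran rep limit'."""
    buf = bytearray()
    pattern = os.urandom(granule_size)
    reps = 0
    while len(buf) < total_size:
        filled_pct = len(buf) * 100 / max(total_size, 1)
        if filled_pct < pct_dedup:
            if reps >= rep_limit:               # start a new dedupable pattern
                pattern = os.urandom(granule_size)
                reps = 0
            buf += pattern                      # repeated, dedupable granule
            reps += 1
        else:
            buf += os.urandom(granule_size)     # unique, non-dedupable granule
    return bytes(buf[:total_size])

# Example: 1 MiB buffer, ~50% dedupable, 4 KiB granules, each pattern reused up to 8 times.
data = make_dedupable_buffer(1 << 20, 50, 4096, 8)
```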
New workload definition attributes in SP2:
Attribute | Description
---|---
Percent rmdir | Additional Op type available in SP2
Percent unlink2 | Additional Op type available in SP2; unlinks non-empty files (size > 0 bytes)
Percent dedup | Percentage of each file that is dedupable
Percent dedup_within | Percent of the dedupe region in a file that is dedupable only within that single file
Percent dedup_across | Percent of the dedupe region in a file that is dedupable only across files, not within the same file
Dedupe group count | How many unique groups of dedupable files to create; dedupe_across does not cross dedupe group boundaries
Percent per_spot | Determines how many hot spot regions will be created within each file. Example: 20% -> each file has 5 hot spot regions.
Min acc_per_spot | Minimum accesses within a hot spot region; access will stay within a single hot spot this many times before choosing another hot spot region. This sets the absolute low threshold for hot spot region accesses.
Acc mult_spot | How many 8 KiB chunks within a hot spot region must be accessed before choosing another hot spot region. This is the primary driver for setting affinity before choosing another hot spot region.
Percent affinity | Percent chance, when choosing a hot spot region to move to, that a different hot spot region will be selected
Spot shape | The access pattern used when selecting an 8 KiB chunk within the current hot spot region: either uniform random or geometric
Dedup granule size | Minimum size of a region, in bytes, that comprises a dedupable chunk
Dedup gran rep limit | Maximum number of identical dedupable chunks that will be used before switching to a new dedupable chunk pattern
Comp granule size | Minimum size of a region, in bytes, that comprises a compressible chunk
Cipher behavior | Enable or disable encryption of all data set patterns. Note that this effectively disables dedupe and compression savings.
Notification percent | Percent of write or metadata write ops that generate a file or directory change notification
LRU | Use the internal LRU algorithm for file descriptor caching. This is an enhancement to file descriptor caching and is recommended for new custom workloads.
Pattern version | Use the SP1 or SP2 data pattern layout for compression and dedup. The SP2 pattern layout is recommended for new custom workloads.
Init rate throttle | Data set creation rate throttle such that no proc will exceed this rate in MiB/s during the INIT phase
Init read flag | Enable or disable re-reading of existing files on all LOAD points after the first during the INIT phase. For SP1 workloads, this behavior is enabled. Disabling it is recommended for new custom workloads.
Rand dist behavior | Selection algorithm for regions within files: either 8 KiB chunks or hot spot regions. For 8 KiB chunks, the behavior can be uniform random or geometric; most SP1 workloads use uniform random, while DATABASE uses geometric. With SP2, a new mode allows a geometric access pattern across hot spot regions, so some hot spots are hotter than others. This must be set to geometric access across hot spot regions to enable hot spot functionality.
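To make the hot spot attributes above more concrete, here is a small, hypothetical model of the behavior they describe: a file is divided into hot spot regions (Percent per_spot), accesses stay inside the current region for a minimum number of operations (Min acc_per_spot), a different region is then chosen with a probability controlled by Percent affinity, and Spot shape governs how the 8 KiB chunk inside the current region is picked. None of the names or structure below come from the benchmark source; this is only a sketch of the described mechanism.

```python
import random

CHUNK = 8 * 1024  # accesses are modeled in 8 KiB chunks

def hot_spot_trace(n_ops, file_size, pct_per_spot, min_acc_per_spot,
                   pct_affinity, spot_shape="uniform"):
    """Hypothetical model of hot spot selection, not benchmark source code.

    pct_per_spot=20 -> 100/20 = 5 hot spot regions per file (the table's
    example). Accesses stay in the current region for at least
    min_acc_per_spot operations; after that, a region is re-chosen with
    pct_affinity percent probability. spot_shape picks the 8 KiB chunk
    inside the current region, either uniformly or with a geometric skew."""
    n_regions = 100 // pct_per_spot
    region_size = file_size // n_regions
    chunks_per_region = max(1, region_size // CHUNK)
    region, stay = 0, 0
    offsets = []
    for _ in range(n_ops):
        if spot_shape == "geometric":
            # geometric-like skew: low-numbered chunks in the region are hotter
            chunk = min(chunks_per_region - 1, int(random.expovariate(0.5)))
        else:
            chunk = random.randrange(chunks_per_region)   # uniform random chunk
        offsets.append(region * region_size + chunk * CHUNK)
        stay += 1
        if stay >= min_acc_per_spot and random.uniform(0, 100) < pct_affinity:
            region = random.randrange(n_regions)          # may land on a new region
            stay = 0
    return offsets

# Example: 1,000 accesses to a 10 MiB file, 5 regions (20%), stay >= 16 accesses, 50% affinity.
trace = hot_spot_trace(1000, 10 * 1024 * 1024, 20, 16, 50)
```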
Q & A
Q: Is this a major or minor update?
A: This is a minor update. All published results remain
comparable and all workloads are still available.
Q: Is this going to cost me money?
A: No. Minor updates are free for existing license holders.
Q: Have any of the configuration files changed?
A: Yes. The sfs_rc file, benchmarks.xml, and the format of the custom workload definitions have all been enhanced to support the new functionality as well as the new workloads. If you have any of these files from the previous version, you can easily move the data from the old version to the new one. These are all simple text files, and the older labels and values can be carried over to the new format, where the labels remain the same.
Q: Can I publish any of the workloads from SPEC SFS 2014 SP1 using SPEC SFS 2014 SP2?
A: Yes. DATABASE, SWBUILD, VDA, and VDI remain unchanged from SP1 to SP2.