100 Things #64: Using Statistics to Create Test Limits

The statistics feature in SoundCheck adds the ability to perform a variety of statistical measurements. SoundCheck’s statistics step can work with data, results, or both. Statistics allows users to take a set of data, such as the frequency responses of multiple devices, and automatically calculate the mean, maximum, minimum, best or worst fit to average, and more. This functionality is not confined to a sequence: the same features are available in offline statistics, which is a great way to perform statistics independently of a sequence for applications such as finding golden units in production testing.

Learn more about statistics and limits in SoundCheck

If you want to learn more about using statistics in SoundCheck, our four-part tutorial series is available to watch here. The series goes in depth on statistics data, results, processing capability, and offline capability.

Our three-part tutorial series on limits in SoundCheck is available to watch here. It covers the basics of limits functionality in SoundCheck, data, and advanced limit creation.

Video Script:

One question I often hear from customers is “I wrote a sequence to measure my devices. I have the frequency response, THD, sensitivity, but how do I know if this is good or bad?” We have statistics tools inside SoundCheck that can make this determination a lot easier. 

It’s important to remember that measurement targets are completely different depending on the device. For example, the acceptable level of distortion in a high-end pair of Bluetooth headphones would be completely different from that in a cheap USB headset made for online meetings. A great place to start is picking out five units that are subjectively “good” devices. We can measure these units in SoundCheck, use statistics to help us generate limits, then compare other devices to those limits.

Let’s look at a standard headphone test sequence. Right now it’s configured to run just one test, but by adding a statistics step to the end of the sequence, I can run this test as many times as I like and average the results across all of those different units.

The statistics step has many different features, but let’s look at Mean and Standard Deviation. Mean takes the average of the selected curve or value across every run. If I measure 5 devices and get their frequency responses, the mean is a running average of all 5 devices combined. We can use the mean as a reference curve, and compare each device to it.

Standard Deviation outputs plus and minus sigma curves, with the number of sigma defined in the editor. For example, if I want to make sure that all my devices fall within 3 sigma of my 5 reference devices, I set up my statistics step to output +/- 3 sigma, and after I run my 5 different units these upper and lower sigma curves are added to memory. I can then use these as the upper and lower limits in my test sequence, passing a device if it falls within this range and failing it if it falls outside.
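To make the arithmetic concrete, here is a minimal Python sketch of the same idea outside SoundCheck. It is not SoundCheck’s implementation: the function names, the sample data, and the choice of the sample standard deviation (dividing by n - 1) are all assumptions made for illustration.

```python
from statistics import mean, stdev

def sigma_limits(curves, n_sigma=3.0):
    """Return (mean, upper, lower) curves computed point by point across units."""
    columns = list(zip(*curves))            # one tuple of values per frequency point
    avg = [mean(col) for col in columns]
    sd = [stdev(col) for col in columns]    # sample standard deviation (assumption)
    upper = [m + n_sigma * s for m, s in zip(avg, sd)]
    lower = [m - n_sigma * s for m, s in zip(avg, sd)]
    return avg, upper, lower

def passes(curve, upper, lower):
    """A unit passes only if every point lies between the lower and upper limits."""
    return all(lo <= v <= hi for v, lo, hi in zip(curve, lower, upper))

# Hypothetical data: 5 "good" units, level in dB at 3 frequency points
reference_units = [
    [90.0, 94.0, 88.0],
    [91.0, 95.0, 89.0],
    [89.0, 93.0, 87.0],
    [90.5, 94.5, 88.5],
    [89.5, 93.5, 87.5],
]

avg, upper, lower = sigma_limits(reference_units, n_sigma=3.0)

good_unit = [91.0, 95.0, 89.0]   # stays inside the limits at every point
bad_unit = [93.0, 94.0, 88.0]    # too high at the first frequency point

print(passes(good_unit, upper, lower))
print(passes(bad_unit, upper, lower))
```

With only five reference units the sigma estimate is itself fairly uncertain, so limits built this way are best treated as a starting point to be refined as more units are measured.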

And one final note… If you already captured measurements but didn’t run statistics on them while the sequence was running, all of these same features are available in the offline statistics editor. Just open the curves from your good units in the memory list and you can run statistics directly through the offline menu.

With offline statistics, you can calculate the Best Fit to Average and Worst Fit to Average curves by finding which unit comes closest to, or furthest from, the average curve. Best Fit to Average can be used to find a reference or “Golden Unit”. This can be used as a sanity check when things go wrong on the production line, and for developing limit curves. Some manufacturers prefer this approach because the factory environment (for example, temperature and humidity) can vary from day to day and affect the devices’ measured performance.
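As a rough illustration of what “closest to the average” can mean, the Python sketch below ranks units by their RMS difference from the mean curve. The RMS metric, the function names, and the data are assumptions for this example; the script does not say which distance measure SoundCheck uses internally.

```python
from math import sqrt
from statistics import mean

def rms_distance(curve, reference):
    """Root-mean-square difference between a curve and a reference curve."""
    return sqrt(mean([(a - b) ** 2 for a, b in zip(curve, reference)]))

def fit_to_average(curves):
    """Return (best, worst, distances) for a dict of unit name -> curve."""
    avg = [mean(col) for col in zip(*curves.values())]
    distances = {name: rms_distance(c, avg) for name, c in curves.items()}
    best = min(distances, key=distances.get)    # closest to the average curve
    worst = max(distances, key=distances.get)   # furthest from the average curve
    return best, worst, distances

# Hypothetical data: level in dB at 3 frequency points; unit E is an outlier
units = {
    "A": [90.0, 94.0, 88.0],
    "B": [90.5, 94.5, 88.5],
    "C": [89.5, 93.5, 87.5],
    "D": [90.2, 94.1, 88.0],
    "E": [86.0, 90.0, 84.0],
}

best, worst, distances = fit_to_average(units)
print("Best fit to average:", best)
print("Worst fit to average:", worst)
```

Note that the outlier pulls the average toward itself, which is one reason to drop the worst-fit unit and recompute before choosing a golden unit.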

By measuring the golden unit before measuring newly manufactured devices, the limits can be updated relative to the golden unit under current conditions. Worst Fit to Average can be used to find outliers or bad units that you don’t want to use in your statistical calculations when developing limits. Once you find a Worst Fit to Average curve, simply unselect it and re-run your statistics on the remaining good units. 
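One simple way to picture limits that follow the golden unit is to store them as offsets from the golden unit’s curve and re-apply those offsets to each new golden-unit measurement. The Python sketch below shows that idea only; the point-by-point dB offset, the function names, and the numbers are assumptions, not a description of how SoundCheck handles this.

```python
def relative_offsets(golden_ref, upper, lower):
    """Express limit curves as point-by-point offsets from the golden unit."""
    up_off = [u - g for u, g in zip(upper, golden_ref)]
    lo_off = [l - g for l, g in zip(lower, golden_ref)]
    return up_off, lo_off

def limits_for_today(golden_today, up_off, lo_off):
    """Re-apply the stored offsets to today's golden-unit measurement."""
    upper = [g + o for g, o in zip(golden_today, up_off)]
    lower = [g + o for g, o in zip(golden_today, lo_off)]
    return upper, lower

# Hypothetical data: level in dB at 3 frequency points
golden_ref = [90.0, 94.0, 88.0]     # golden unit when the limits were developed
upper_ref = [92.0, 96.0, 90.0]
lower_ref = [88.0, 92.0, 86.0]

up_off, lo_off = relative_offsets(golden_ref, upper_ref, lower_ref)

golden_today = [89.5, 93.6, 87.8]   # same unit, measured under today's conditions
upper_today, lower_today = limits_for_today(golden_today, up_off, lo_off)
print(upper_today, lower_today)
```

Because the offsets stay fixed, a day-to-day shift in the golden unit’s measurement moves both limits by the same amount, so new devices are judged against the golden unit rather than against absolute levels.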

Do you use statistics to set pass/fail criteria? Let us know in the comments below.