Results-Driven Decisions, Faster: Accelerated Stress Testing as a Reliability Life Test [transcript]

We want to ensure our designs perform reliably, as expected and intended. With today’s high-reliability products and quick release to market, we probably don’t have enough time to just test our parts at normal use rates. It would take too long, because our products ARE so reliable. Or, we’ll miss our window of opportunity to get our product to market. There are reliability prediction methods we could use: some are standards-based and others use physics-of-failure methods. Outside of those, we also have a way to test our own products. Today, we talk about reliability life testing options, specifically accelerated stress testing. I’ll tell you how it all fits together and what types of things you can do with the results.

Hello and welcome to Quality During Design, the place to use quality thinking to create products others love, for less. My name is Dianna. I’m a senior-level quality professional and engineer with over 20 years of experience in manufacturing and design. Listen in and then join the conversation at QualityDuringDesign.com.

Before we start accelerating our testing, let’s slow down and get a broader perspective of reliability life testing.

We want our product to be reliable. Part of designing for reliability is understanding how our part is going to fail, what stresses are going to bring about that failure, and what the normal operation of our product is going to be. We want to test our product to identify design weaknesses, including its limits. We could perform qualitative life testing, which is the shake-and-bake or HALT testing, something I covered in an earlier podcast episode. We use this type of testing just to get failures and then design them out. But, with the design we’re working on now, we want to test it to failure and then analyze the results to get failure rate data. We want to do quantitative reliability life testing. We want to be able to generate normal use failures while measuring times or cycles to failure under a stress load. We want to better understand the failure, and then generate a model to predict how reliable our part will be in its use case.

Why would we want to spend resources generating a model like this? To test our product is going to require parts, test time, and personnel, so it’s going to cost us to do this. With quantitative reliability life testing we get reliability information about our product under normal use conditions, or that we can extrapolate to normal use conditions. We can use this data to calculate probabilities of failure (which could help us with risk assessments), estimate product returns or warranties, or help us compare design choices. It gives us a better understanding of our product so we can make informed decisions based on data.

Here’s one scenario: The reliability of our system needs to meet this reliability requirement: 99% reliability in system start-up to at least 300 rpm is required after 600 on-off cycles of operation with 95% confidence when operating in an environment with a temperature range of –15℃ to 40℃. How do we verify that? We can use quantitative reliability life testing. By the way, there’s a previous Quality During Design episode that explains reliability requirements and how we came up with that example.
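One common way to plan a pass/fail verification of a requirement like this is a zero-failure (success-run) demonstration test. That method isn't covered in this episode, so treat the sketch below as an illustrative assumption: every unit runs the full 600 on-off cycles across the temperature range, and every unit has to start up successfully. The function name is made up for the example.

```python
import math


def zero_failure_sample_size(reliability: float, confidence: float) -> int:
    """Units that must ALL pass to demonstrate `reliability` at `confidence`.

    Assumes a binomial pass/fail test with zero failures allowed.
    """
    if not (0.0 < reliability < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("reliability and confidence must be between 0 and 1")
    # Zero failures in n units: reliability**n <= 1 - confidence
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


# Requirement from the scenario: 99% reliability with 95% confidence.
n_units = zero_failure_sample_size(reliability=0.99, confidence=0.95)
print(f"Units needed with zero failures: {n_units}")
```

Under those assumptions we'd need 299 units, all passing. That's a lot of parts, and a pass/fail result doesn't tell us much about how the product fails, which is part of why testing to failure and modeling the results can be attractive.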

A different scenario is that our product is made up of several components, but there is one component in particular that, if it fails, the whole system is irreparable or dangerous to operate. Because of how it’s used and the failures that we’ve seen in the past, we’re concerned with the vibration it sees in use. We chose our critical part as a reliable part, independently. But how reliable is it going to be when it’s within the system? Do we need to place use limits on it? We could perform quantitative reliability life testing to test our system, understand how vibration is going to affect the failure rate, and calculate reliability at different vibration levels.

We can perform this reliability life testing under normal use conditions. But, remember that part of this type of testing is producing failures. If our design and the components we chose are very reliable, then the time or number of cycles that we’ll have to run our products…well, it might be a really long time before we start seeing failures.

This is where accelerated life testing is a good option. Accelerated life testing is reliability life testing, except that we’re accelerating the failures. The main purpose of accelerated life testing is to reduce the length (or time) that we’re testing. The failure modes are going to be the same, whether it is at normal stress levels or at higher stress levels. This is an important detail: no new failure modes are introduced. We’re only accelerating the test to get failures more quickly.

The way we accelerate life testing is by increasing the rate that we’re going to get failures. We can do this in three ways:

  • we can increase the number of products that we test (so we increase the number of failures within a given time),
  • we can compress the time to test by speeding up the cycle rate, to simulate longer use under normal conditions. An example of this is activating a switch multiple times per minute when it would really only see one or two activations per week, or
  • we can increase the stresses that generate failures. This last way, to increase stresses, is called accelerated stress testing. Stresses that are commonly used in accelerated stress testing are temperature, humidity, vibration, voltage, current, and radiation.
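To put rough numbers on the second option (compressing time), here's a small sketch using the switch example. The specific rates are assumptions for illustration only: two activations per week in normal use and six per minute on the test bench. The function names are hypothetical.

```python
MINUTES_PER_WEEK = 7 * 24 * 60
WEEKS_PER_YEAR = 52  # rounded; close enough for test planning


def compression_factor(use_cycles_per_week, bench_cycles_per_minute):
    """How many times faster the bench accumulates cycles than normal use."""
    return bench_cycles_per_minute * MINUTES_PER_WEEK / use_cycles_per_week


def bench_hours_for_use_years(use_years, use_cycles_per_week, bench_cycles_per_minute):
    """Bench time (hours) needed to apply the cycles seen in `use_years` of use."""
    total_cycles = use_years * WEEKS_PER_YEAR * use_cycles_per_week
    return total_cycles / bench_cycles_per_minute / 60


# Assumed rates: 2 activations/week in use, 6 activations/minute on the bench.
factor = compression_factor(use_cycles_per_week=2, bench_cycles_per_minute=6)
hours = bench_hours_for_use_years(10, use_cycles_per_week=2, bench_cycles_per_minute=6)
print(f"Compression factor: {factor:.0f}x")
print(f"Bench time for 10 years of use: {hours:.1f} hours")
```

This only holds if cycling faster doesn't change how the part fails. If rapid cycling heats the switch in a way normal use never would, we're no longer generating normal use failures.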

Where do we even get started? Following our vibration scenario, we’ve got our test objective: we want to quantify our product’s reliability. In other words, we want to be able to calculate probabilities of failures of our system. We know the type of test we want to perform is an accelerated life test because it will take too long to generate failures under normal conditions. We know our design is susceptible to vibration, so we’ll decide to plan for some accelerated stress testing. We’re going to expose our parts to vibration levels beyond what they will see in normal use conditions to accelerate the test and produce failures in a shorter time.

The hardest part about accelerated stress testing is understanding, choosing, and calculating the right stress and stress levels to test. For complex designs, we may need to focus on a dominant failure. To understand the stresses that cause that failure, we can look to experts, field data of similar products, or use physics of failure. We can also perform some preliminary testing to understand the stress factors that affect our product, including DOE (design of experiments). As far as the level of stress, a rule of thumb is that the stress levels we pick are higher than the spec limits but lower than the destruct limits. There is a risk in setting up the experiment incorrectly, but that’s why you talk with your Reliability Engineering friends. Here’s what’s going to happen when we decide to move forward with an accelerated stress test:

First, we’re going to design a test that applies stresses at levels that exceed the normal stresses that our product would see, with the goal to accelerate a certain failure mechanism.

Next, we’ll choose a way to use the accelerated results to predict normal use results. We do this by picking a model that lets us extrapolate from one stress level to a different stress level. This model relates a measure of stress to a measure of life (like time or cycles). Reliability Engineers may mention terms like acceleration model, life-stress relationship, or life characteristic; they’re referring to this model. Common examples of models are the Arrhenius model, the Eyring model, and the Power Law Model…and more, including ways to combine stress models for multiple stresses. If our estimated acceleration factor is on the order of 100 times the normal use, then practitioners warn that we’ll likely not get useful results from our accelerated life test (ref. “Accelerated Test Data Analysis”).
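As a rough sketch of what an acceleration model gives us, here are the textbook acceleration-factor forms for two of the models just named: Arrhenius (for temperature) and the inverse power law (for stresses like vibration or voltage). All the input values are assumptions chosen for illustration (the activation energy, the temperatures, the vibration levels, and the exponent); real values come from physics of failure, prior data, or testing at more than one stress level.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K
ACCELERATION_WARNING_LEVEL = 100     # the "order of 100 times" caution above


def arrhenius_af(activation_energy_ev, use_temp_c, stress_temp_c):
    """Arrhenius acceleration factor between a use and a stress temperature."""
    use_k = use_temp_c + 273.15
    stress_k = stress_temp_c + 273.15
    exponent = activation_energy_ev / BOLTZMANN_EV_PER_K * (1 / use_k - 1 / stress_k)
    return math.exp(exponent)


def inverse_power_law_af(use_stress, test_stress, exponent):
    """Inverse power law acceleration factor: (test stress / use stress) ** n."""
    return (test_stress / use_stress) ** exponent


# Hypothetical inputs, not from the episode.
af_temperature = arrhenius_af(activation_energy_ev=0.7, use_temp_c=40, stress_temp_c=85)
af_vibration = inverse_power_law_af(use_stress=2.0, test_stress=6.0, exponent=4)

for name, af in (("temperature", af_temperature), ("vibration", af_vibration)):
    note = "check: very aggressive" if af >= ACCELERATION_WARNING_LEVEL else "ok"
    print(f"{name}: acceleration factor ~{af:.0f} ({note})")
```

An acceleration factor of 81 means one cycle at the test level is treated as worth about 81 cycles at the use level, but only as long as the failure mechanism stays the same.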

Then, we’re going to perform the test and collect the data. We’ll end up with failures or suspensions (which is when the parts survive through all testing and don’t fail). We’ll have failure modes, the corresponding stress at the time of failure, and the time or cycle when that failure occurred. And, we’ll have a record of whatever other cumulative stress schedules our parts survived through the test.
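One way to keep that data organized is a simple record per unit. This is just a sketch: the field names, the units (g RMS for vibration), and the sample values are all hypothetical, not data from a real test.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UnitResult:
    unit_id: str
    stress_level: float                 # assumed unit: vibration in g RMS
    cycles: float                       # cycles at failure, or at end of test
    failed: bool                        # False means the unit is a suspension
    failure_mode: Optional[str] = None  # only set for failures


# Made-up example records.
results = [
    UnitResult("U01", 6.0, 410, True, "solder joint crack"),
    UnitResult("U02", 6.0, 780, True, "solder joint crack"),
    UnitResult("U03", 6.0, 950, True, "connector fretting"),
    UnitResult("U04", 6.0, 1500, False),  # survived to end of test
]

failures = [r for r in results if r.failed]
suspensions = [r for r in results if not r.failed]
modes = sorted({r.failure_mode for r in failures})

print(f"{len(failures)} failures, {len(suspensions)} suspensions")
print("Failure modes seen:", ", ".join(modes))
```

Keeping the failure mode with each record matters later: when we analyze one failure mode, units that failed some other way are treated as suspensions for that analysis.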

Finally, we’ll analyze our data. We’ll choose a probability distribution to fit our accelerated stress data. This distribution is likely going to be the Weibull, exponential, or lognormal distribution. And, we’ll use the acceleration model to translate the high-stress test results to normal use-stress levels. We’ll end up with a model of the reliability life of our product under normal use conditions for that failure mode.
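Here's a minimal sketch of that analysis step, using a Weibull fit by median rank regression and then scaling to use conditions. It makes several simplifying assumptions: every unit failed (no suspensions), the failure times and the acceleration factor are made up, and the Weibull shape is assumed to be the same at test and use stress because the failure mode is the same. A real analysis would usually rely on reliability software and handle suspensions, for example with maximum likelihood estimation.

```python
import math


def fit_weibull_median_rank(failure_times):
    """Estimate Weibull shape (beta) and scale (eta) from complete failure data.

    Uses Benard's median rank approximation and a least-squares line through
    ln(-ln(1 - F)) versus ln(t). Suspensions are NOT handled here.
    """
    times = sorted(failure_times)
    n = len(times)
    if n < 2:
        raise ValueError("need at least two failure times")
    xs = [math.log(t) for t in times]
    ys = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    beta = sxy / sxx                    # slope of the line is the shape
    intercept = y_mean - beta * x_mean  # intercept equals -beta * ln(eta)
    eta = math.exp(-intercept / beta)
    return beta, eta


# Hypothetical cycles to failure at the elevated stress level.
stress_failures = [410, 620, 780, 950, 1130, 1400]
ASSUMED_ACCELERATION_FACTOR = 81  # hypothetical, from an acceleration model

beta_stress, eta_stress = fit_weibull_median_rank(stress_failures)
eta_use = eta_stress * ASSUMED_ACCELERATION_FACTOR  # same shape assumed at use

print(f"At test stress: beta={beta_stress:.2f}, eta={eta_stress:.0f} cycles")
print(f"At use stress:  beta={beta_stress:.2f}, eta={eta_use:.0f} cycles")
```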

Having these results allows us to use statistics and reliability analyses to calculate reliability measures, like failure rate and probabilities of failure at certain stresses. We can use those measures for design choices, warranty decisions, risk management, and other design decisions like whether to perform preventive maintenance, or to do screening testing at manufacturing.
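As an example of those calculations, here's a sketch that takes a Weibull life model and computes reliability, probability of failure, and failure rate at a given number of cycles. The shape and scale values are hypothetical, standing in for whatever the analysis produced, and the results are point estimates without confidence bounds.

```python
import math


def weibull_reliability(t, beta, eta):
    """Probability of surviving past t (time or cycles)."""
    return math.exp(-((t / eta) ** beta))


def weibull_failure_rate(t, beta, eta):
    """Instantaneous failure rate (hazard) at t."""
    return (beta / eta) * (t / eta) ** (beta - 1)


# Hypothetical use-condition model: shape 2.0, scale 5000 cycles.
BETA, ETA = 2.0, 5000.0
cycles = 600

reliability = weibull_reliability(cycles, BETA, ETA)
prob_failure = 1.0 - reliability
rate = weibull_failure_rate(cycles, BETA, ETA)

print(f"Reliability at {cycles} cycles: {reliability:.4f}")
print(f"Probability of failure by {cycles} cycles: {prob_failure:.4f}")
print(f"Failure rate at {cycles} cycles: {rate:.2e} per cycle")
print("Meets a 99% target?", reliability >= 0.99)
```

With numbers like these in hand, comparing two design options or estimating warranty returns becomes a matter of changing the inputs and reading off the difference.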

Is this type of test an investment? Yes. It becomes easier to justify if we have a portfolio of similar products where we can reuse test methods and fixturing, or even be able to reuse the results. It’s also easier to justify if the stakes are high with product failure. There are a lot of independent test houses that have the equipment, fixturing, and know-how to be able to help design and perform accelerated stress testing. Having quantitative reliability life data takes much of the guesswork out of a lot of design decisions, and this is a case where up-front work is an investment for huge benefits later.

To recap: Reliability life testing is testing our products to failure in order to improve their reliability, either by identifying and eliminating failures or by modeling the system to calculate probabilities and make decisions. Accelerated life testing is a subset where we are forcing failures to occur more quickly. Accelerated stress testing is then a subset of THAT, where we’re increasing the stresses seen to produce the same failure modes more quickly. We reviewed 3 levels of topics today to talk about accelerated stress testing, which means there are many other options for reliability life testing.

What is today’s insight to action? If we’ve designed a product to be highly reliable and we need to verify the reliability requirements or want to develop a stress-life model for a failure mode, accelerated stress testing is an option. The benefit is reduced test time and more information for us to be able to make decisions about our product. There is a lot of planning and investment involved, but it can be done with long-term benefits in mind.

I’m adding a reference to “Reliasoft’s Accelerated Life Testing Reference” to this podcast blog. It’s an eTextbook, but there’s also a downloadable .pdf. If accelerated life testing is a topic you’re interested in, then I recommend that you download this reference.

There are earlier episodes of Quality During Design that expand upon some of the ideas we talked about today.

Episode 6 “HALT! Watch out for that weakest link” explores the purposes of HALT (highly accelerated life testing) as a qualitative accelerated life test, where we try to design out the weakest component. This episode also links to an independent test house with videos and pictures of the equipment and fixturing that can be used in accelerated life tests.

Episode 31 “5 Aspects of Good Reliability Goals and Requirements” explores how we can set reliability requirements that we can verify through reliability life testing methods.

Episode 36 “When to use DOE (Design of Experiments)” explains how it can be used to explore the factors that affect our design, which is a method that can be used to help identify the stresses we want to test for accelerated stress testing.

Episode 30 “Using Failure Rate Functions to Drive Early Design Decisions” reviews how we can use one of the outputs of a quantitative accelerated life test, the failure rate function.

Episode 10 “How to Handle Competing Failure Modes” talks about how to analyze reliability life data with different failure modes. Although our accelerated stress test should focus on one failure mode, treating the parts that don’t end up failing as suspensions is the carry-over idea to this episode.

Please visit this podcast blog and others at qualityduringdesign.com. Subscribe to the weekly newsletter to keep in touch. If you like this podcast or have a suggestion for an upcoming episode, let me know. You can find me at qualityduringdesign.com, on LinkedIn, or you could leave me a voicemail at 484-341-0238. This has been a production of Denney Enterprises. Thanks for listening!