
Last Updated 8 October, 2003
Probability of detection for magnetic rubber inspections of F-111 steel components
C A Harding and G R Hugo
Defence Science and Technology Organisation, 506 Lorimer Street, Fishermans Bend, Victoria, 3207, Australia.
Abstract: Magnetic rubber inspection is used extensively on F-111 aircraft to detect cracks in critical steel components. Scheduled inspection intervals are based on a durability and damage tolerance analysis, which requires as input data an assessment of the reliability of the technique. DSTO and the Royal Australian Air Force (RAAF) recently conducted an experimental program to determine the probability of detection for magnetic rubber inspection of F-111 components. This program involved a series of inspections performed on coupon specimens by RAAF technicians under conditions simulating those experienced during in-service inspections. A program of Monte Carlo simulations was used to demonstrate the validity of different statistical methods for analysis of the relatively small experimental data set obtained from the field trial.
Keywords: Magnetic rubber inspection, reliability, probability of detection
Introduction
Magnetic rubber inspection (MRI) is a nondestructive evaluation (NDE) technique used extensively on F-111 aircraft to detect cracks in D6ac steel components, including the wing-pivot fitting, the wing carry-through box and several other critical structures within the airframe. The magnetic rubber technique is a variation on magnetic particle methods, in which a liquid rubber containing suspended magnetic particles is poured into a dam surrounding the area to be inspected on a magnetised component. After the rubber sets, the cast is removed and examined for evidence of cracks or other discontinuities, which appear as dark lines on the surface of the cast. Inspections can be performed using either an applied magnetic field, which is maintained whilst the rubber sets (active field), or the residual field from magnetisation of the component prior to pouring the rubber.
A key feature of the inspections conducted on the F-111 D6ac steel structure is the need to reliably detect very small fatigue cracks, down to 0.010 inch (0.25 mm) in length. [Imperial units are conventionally used in connection with the F-111.] MRI is labour intensive but capable of detecting such small defects. Scheduled inspection intervals for nondestructive inspection of the F-111 are based on a durability and damage tolerance analysis (DADTA), which incorporates as input data an assessment of the reliability of the NDE technique. The DADTA assumes that the airframe contains pre-existing flaws at critical locations, and then models the growth of these flaws given a representative flight loading spectrum. The initial flaw size is assumed to be the smallest that can be reliably detected by the NDE technique. It is taken to be the minimum crack size (a_{90/95}) for which a 90% probability of detection (POD) has been demonstrated experimentally with 95% statistical confidence. Here, the 95% statistical confidence level accounts for the uncertainty inherent in determining the POD from a finite statistical sample. The DADTA modelling predicts the number of flight hours for the defect(s) to grow to a size that could cause failure of the component. The inspection interval for periodic NDE is then taken to be a fraction (typically one-half) of the number of flight hours for the assumed defect to grow to a critical size.
The withdrawal of all USAF F-111 aircraft from service circa 1998 left Australia as the only nation operating this aircraft type, with the expectation that the F-111 would remain in Royal Australian Air Force (RAAF) service for a further 20 years until its planned withdrawal date in 2020. As part of a coordinated package of research to support the RAAF as sole operator of the F-111, DSTO conducted a review of all available information concerning the reliability of magnetic rubber inspections.
In this review, insufficient documentary evidence could be located to satisfactorily demonstrate the required POD (Hugo & Scala 2001). Consequently, DSTO commenced an experimental program to determine the POD (including a_{90/95}) for RAAF magnetic rubber inspection of the F-111. It was anticipated that this information could allow inspection intervals for magnetic rubber inspections to be increased, thereby achieving significant savings on maintenance costs and reducing aircraft unavailability due to scheduled inspections. During the initial review, it was noted that the published methodologies for analysis of POD data were generally developed and demonstrated for the analysis of relatively large data sets. Since the experimental program for RAAF MRI would necessarily be a relatively small trial, a number of possible analysis algorithms were examined to assess their applicability for the modest quantity of data to be obtained.
Experimental Program
The experimental program involved simulated field inspections of a series of coupon specimens by RAAF technicians. Two specimen types were used (Figure 1): a 'bolthole' specimen representative of typical cracks occurring in boltholes, and a 'mousehole' specimen representing cracks in more general structure, including radii. The latter specimen was designed to be similar to the fuel-flow vent holes (mouseholes) within the F-111 wing-pivot fitting. Field inspections were conducted with the specimens inserted inside a scrap wing-pivot fitting in order to realistically simulate the effects on reliability of the restricted access typically encountered by RAAF technicians.
The coupon specimens were fabricated from D6ac steel, heat-treated to the same condition as components in the F-111 airframe. Fatigue cracks were generated in the specimens at DSTO using a 'DADTA2b' spectrum loading, representative of flight loading at a typical location in the lower plate of the wing-pivot fitting. Small corrosion pits (up to 50 µm in size) were electrochemically generated in the specimens prior to fatiguing to act as fatigue crack initiators. The use of corrosion pits as crack initiators was necessary in order to reduce the scatter in crack initiation times sufficiently to be able to successfully generate the very small fatigue cracks (down to 0.004" in length) required for the trial. Both bore and quadrant (corner) cracks with lengths ranging from 0.002" (0.05 mm) to 0.090" (2.3 mm) were successfully generated using this procedure. For the mousehole specimens, the mousehole shape was produced by electric discharge machining from an initial keyhole notch shape after generating the fatigue cracks, taking care not to remove the cracks during the machining process.
From a total of 103 specimens prepared, a set of 21 bolthole and 28 mousehole specimens was selected to achieve as uniform a distribution of crack sizes as possible. Uncracked specimens were used as placebos; some had been fatigued without cracking, whilst others were as machined. The trial included a total of 360 inspections on cracked holes and placebos, in roughly equal proportion, according to a randomised schedule of inspections. Six RAAF technicians of different experience levels participated in the trial and were drawn from the Base NDT section at 501 Wing RAAF Amberley. Trials were conducted at RAAF Amberley under similar levels of pressure due to workload as those encountered by technicians for on-aircraft inspections.
To prevent collusion, technicians reported their results by session number and by the station number of each of the four coupons located inside the wing-pivot fitting for each session. Two different methods were used to magnetise the specimens in the field trial. The mousehole specimens were inspected using an active field from a horseshoe magnet spanning the mousehole, whilst the bolthole specimens were inspected using the residual field following magnetisation using a central conductor inserted through the hole (applied current of 500 A for 5 s). Technician results were reported using defect codes adapted from those used in service. Technician reports were compared to the results of 'master' magnetic rubber inspections in order to determine a 'hit' or 'miss' result for each inspection of each confirmed crack. The master inspections were performed at DSTO with the specimens under an applied tensile load in a mechanical testing machine, which was found to give significantly clearer indications because the crack mouths were opened by the applied load. Master inspections were performed both before and after the field trials. A subset of specimens were broken open for fractographic examination in order to confirm
Statistical Analysis Methods
Methodologies for determining POD fall into two basic categories. Determination of POD as a function of crack size requires a series of inspections on specimens containing a range of crack lengths, a, to infer a curve, POD(a), which plots the POD as a function of crack size. Statistical methods may be further differentiated into curve fitting methods, which assume a functional form for POD(a), and other methods, which make no assumptions about the functional relationship between POD and crack size. The latter include the range interval method (RIM) and optimised probability method (OPM) (Berens & Hovey 1981, Bruce 1998). These methods can be applied to data sets of any size and have the advantage that the POD curves inferred by them cannot be compromised by an inappropriate choice of functional form for POD(a). However, the confidence limits derived from these methods are almost always more conservative than those obtained from curve fitting methods and may be very conservative when applied to small data sets.
Validation of Analysis Methods
The applicability of various statistical methods for relatively small data sets was assessed using a program of Monte Carlo simulations. Synthetic hit/miss results were generated for a random set of crack lengths according to an assumed true POD(a). The functional form used for the true POD(a) curve was a cumulative lognormal distribution, which is commonly used for analysis of POD data. Several analysis algorithms were applied to this synthetic data set. Curve fitting methods provide a 'best fit' maximum likelihood estimate (MLE) POD curve and a lower 95% confidence limit curve, as well as key parameters such as the a_{90/95} crack length (minimum crack length which gives 90% POD with 95% statistical confidence). The simulation procedure was repeated 1000 times to determine the distributions of the fitted POD curves and parameters and to assess the conservatism or non-conservatism of the fitted curves and parameters with respect to the assumed true POD curve.
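The simulation loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the log-uniform spread of crack lengths, the coarse grid-search fit and all function names are hypothetical choices, and the true curve parameters (a_{50} = 0.005", a_{90} = 0.011") are simply illustrative values. The sketch produces only the best-fit (MLE) curve; it does not compute the 95% confidence limits.

```python
import math
import random
import statistics
from statistics import NormalDist

STD_NORMAL = NormalDist()          # standard normal distribution
Z90 = STD_NORMAL.inv_cdf(0.90)     # normal quantile for 90% POD

# Assumed true POD curve (illustrative values): a50 = 0.005", a90 = 0.011"
A50_TRUE, A90_TRUE = 0.005, 0.011
MU_TRUE = math.log(A50_TRUE)
SIGMA_TRUE = (math.log(A90_TRUE) - MU_TRUE) / Z90


def pod_lognormal(a, mu, sigma):
    """Cumulative lognormal POD(a) = Phi((ln a - mu) / sigma)."""
    return STD_NORMAL.cdf((math.log(a) - mu) / sigma)


def simulate_hit_miss(rng, n, a_min=0.002, a_max=0.090):
    """Return n (crack_length, hit) pairs; lengths are log-uniform (an assumption)."""
    data = []
    for _ in range(n):
        a = math.exp(rng.uniform(math.log(a_min), math.log(a_max)))
        data.append((a, rng.random() < pod_lognormal(a, MU_TRUE, SIGMA_TRUE)))
    return data


def log_likelihood(data, mu, sigma):
    """Bernoulli log-likelihood of hit/miss data under a lognormal POD curve."""
    total = 0.0
    for a, hit in data:
        # Clip probabilities away from 0 and 1 so the logarithm stays finite
        p = min(max(pod_lognormal(a, mu, sigma), 1e-12), 1.0 - 1e-12)
        total += math.log(p if hit else 1.0 - p)
    return total


def linspace(lo, hi, n):
    return [lo + i * (hi - lo) / (n - 1) for i in range(n)]


# Hypothetical search grid for the two curve parameters
MU_GRID = linspace(math.log(0.002), math.log(0.030), 25)
SIGMA_GRID = linspace(0.1, 2.0, 20)


def fit_mle_grid(data):
    """Crude grid-search MLE; returns the (mu, sigma) grid point of highest likelihood."""
    return max(((m, s) for m in MU_GRID for s in SIGMA_GRID),
               key=lambda ms: log_likelihood(data, ms[0], ms[1]))


def run_simulations(n_sims, n_inspections, seed=0):
    """Repeat simulate-and-fit; return the fitted a90 value from each run."""
    rng = random.Random(seed)
    fitted_a90 = []
    for _ in range(n_sims):
        mu, sigma = fit_mle_grid(simulate_hit_miss(rng, n_inspections))
        fitted_a90.append(math.exp(mu + Z90 * sigma))
    return fitted_a90


if __name__ == "__main__":
    a90_values = run_simulations(n_sims=20, n_inspections=100)
    print("median fitted a90 (inch):", statistics.median(a90_values))
```

A grid search is used here only to keep the sketch dependency-free; a numerical optimiser would normally replace it, and the number of runs would be raised to the 1000 used in the study.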
Results and Discussion
Table 1 presents the results for 1000 simulations, each comprising 100 inspections, with the true POD curve chosen to give a_{50,true} = 0.005" (0.13 mm) and a_{90,true} = 0.011" (0.28 mm), where a_{50} and a_{90} denote the crack lengths corresponding to 50% and 90% POD. 'MLE method 1' and 'MLE method 2' denote two different formulations for determining the 95% confidence limit on the maximum likelihood estimation of POD. Method 1 was based on a general procedure described by Cheng & Iles (1983), using their parameter Q_{2} to define the confidence region. Method 2 implemented a much simpler closed-form solution proposed by Bullock, Forsyth & Fahr (1994). For a 95% confidence limit, there is only a 5% chance of obtaining a data set which gives a confidence limit that is non-conservative with respect to the true value. Thus, we would expect the 95% confidence limit to be non-conservative (a_{90/95} < a_{90,true}) at most 5% of the time, on average. [The 95% confidence limits derived by these methods are for the whole POD curve as a function of crack length; i.e. there is a 95% confidence that the confidence limit curve will be conservative with respect to the true POD at all points. The chance that any single point on the curve (e.g. a_{90/95}) will be non-conservative should be significantly less than 5%.] From Table 1, this expectation is easily satisfied for OPM and MLE method 1. However, for MLE method 2, the a_{90/95} value was non-conservative for 15.9% of the simulations. This result could not have occurred by chance (probability < 10^{-10}) and indicates a serious problem with this analysis method. Indeed, a review of the derivation of the key formulae in the paper from which the method was taken revealed an incorrect assumption which, when corrected, renders the method invalid for determining confidence limits on POD curves. Thus MLE method 2 should not be used for the analysis of POD data.
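The statement that a 15.9% non-conservative rate could not have occurred by chance can be checked with a binomial tail calculation. The sketch below assumes that the 1000 simulations are independent and that a valid 95% confidence limit is non-conservative with probability at most 0.05 in each; the count of 159 is simply 15.9% of 1000, and the function name is hypothetical.

```python
import math


def binomial_tail(n, k_min, p):
    """P(X >= k_min) for X ~ Binomial(n, p); each term is evaluated in log space."""
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for k in range(k_min, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                    + k * log_p + (n - k) * log_q)
        total += math.exp(log_term)   # very small terms underflow harmlessly to 0.0
    return total


if __name__ == "__main__":
    # 159 or more non-conservative results out of 1000 at a 5% nominal rate
    print(binomial_tail(1000, 159, 0.05))
```

Under these assumptions the tail probability falls many orders of magnitude below the 10^{-10} bound quoted in the text, so the quoted bound is itself conservative.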
The other results in Table 1 are consistent with expectations and show that OPM and MLE method 1 are both acceptable for determining POD curves from data sets as small as 100 inspections. The results are consistent with the curve fitting (MLE) method being much more efficient than OPM. For MLE method 1, the confidence limit a_{90/95} was on average 70% greater than the true value a_{90,true}, whilst for OPM, a_{90/95} was on average 2.3 times the true value. OPM was unable to determine an a_{90/95} confidence limit for 30% of simulations as the curve did not reach a POD of 90% within the range of crack sizes considered.
The results of the experimental trial, analysed using MLE method 1 and OPM, are shown in Figures 2 and 3 for inspections on mousehole and bolthole specimens respectively. The bolthole inspections have a somewhat higher POD at small crack lengths but a significantly poorer POD at large crack lengths. This is consistent with the proportion of cracks detected (hits) at each crack length in the field data. The bolthole data includes two very significant misses at 0.021" (0.53 mm) and 0.018" (0.46 mm). For the mousehole specimens, the 'best fit' or maximum likelihood estimate of the crack length at which the POD reaches 90% is a_{90} = 0.009" (0.23 mm), whilst the 95% confidence limit crack length for 90% POD is a_{90/95} = 0.012" (0.30 mm). By comparison, the more conservative OPM gives a_{90/95} = 0.028" (0.71 mm). For the bolthole specimens, the maximum likelihood estimate is a_{90} = 0.015" (0.38 mm). However, due to the relatively poorer POD and the limited quantity of inspection data, the 95% confidence limit does not reach a POD of 90% within the range of the field inspection data and an a_{90/95} value cannot be reported. The bolthole and mousehole inspections differ significantly in the magnetisation method (central conductor vs active field) and in the surface condition of the areas inspected.
The mouseholes were highly polished, consistent with the surface condition during RAAF inspections of mouseholes and stiffener runouts in the F-111 wing-pivot fitting. The surface in boltholes was less well polished. The POD results obtained for the mousehole specimens are thus considered to be applicable only to inspections conducted on highly polished surfaces using an active field. It is possible that the significant misses and the poorer POD for bolthole inspections are related to the fact that, for a central conductor inspection with no defects present, there is normally no sign of the magnetic field on the cast, as the field lines run circumferentially with no leakage field in the absence of a defect. By comparison, for the active field technique, the cast always shows a 'halo' at the edges, which provides post-inspection confirmation that the field was correctly applied. Thus, human error in applying the central-conductor procedure resulting in a lack of magnetisation could easily pass undetected, whereas for the active field technique inadequate magnetisation would easily be detected when inspecting the casts and the inspection would be repeated. This could explain the poorer reliability (lower POD) for the bolthole inspections at larger crack lengths, since inadequate magnetisation could cause a failure to detect cracks of any size. In spite of this, the best-estimate (MLE) a_{90} value is 0.015" (0.38 mm). This is still significantly better than could be achieved using other NDE techniques. It is likely that a usable lower confidence limit a_{90/95} value could be obtained by extending the field trial to obtain more data for the bolthole inspections.
The open circles plot the proportion of hits obtained for the field inspection data at each crack size, with their area being proportional to the total number of inspections performed at each crack size.
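The open-circle points described in the caption can be derived from raw hit/miss records by grouping on crack length. The following is a small sketch with a hypothetical function name; the records in the usage example are made-up values for illustration and are not data from the trial.

```python
from collections import defaultdict


def hit_proportions(inspections):
    """Group (crack_length, hit) records by crack length.

    Returns {crack_length: (proportion_of_hits, number_of_inspections)},
    ordered by increasing crack length.
    """
    tally = defaultdict(lambda: [0, 0])     # crack length -> [hits, inspections]
    for length, hit in inspections:
        tally[length][0] += int(hit)
        tally[length][1] += 1
    return {length: (hits / total, total)
            for length, (hits, total) in sorted(tally.items())}


if __name__ == "__main__":
    # Made-up records: (crack length in inches, detected?)
    records = [(0.010, True), (0.010, False), (0.021, False), (0.030, True)]
    for length, (proportion, count) in hit_proportions(records).items():
        # Marker area in a plot would be scaled in proportion to `count`
        print(length, proportion, count)
```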
Conclusions
The POD for magnetic rubber inspections of F-111 D6ac steel components has been determined as a function of crack length based on field trials completed by RAAF technicians at RAAF Amberley. It was found that inspections of boltholes using central conductor magnetisation were less reliable (lower POD) than the inspection of a mousehole geometry using an active field technique. The applicability of different statistical methods for analysis of relatively small data sets was examined using Monte Carlo simulations. A significant error was identified in one previously published analysis method. The simulations demonstrated that a variant of the standard MLE method (MIL-HDBK-1823 1999), utilising parameter Q_{2} of Cheng & Iles (1983) to define the confidence limit, is acceptable for determining POD curves from data sets as small as 100 inspections.
Acknowledgements
The authors gratefully acknowledge the valuable contributions made by staff in the RAAF Non-Destructive Testing Standard Laboratory, who contributed to the planning and supervision of the field trial, and the technicians and supervising officers from 501 Wing Base NDT section whose willing cooperation was vital to the trial's success.
References
