Interpretation of organic components from positive matrix factorization of aerosol mass spectrometric data
1Cooperative Inst. for Research in the Environmental Sciences (CIRES), Boulder, CO, USA
2Department of Chemistry and Biochemistry, University of Colorado, Boulder, CO, USA
3Aerodyne Research, Inc., Billerica, MA, USA
4Atmos. Sci. Res. Center, University at Albany, State University of New York, Albany, NY, USA
Abstract. The organic aerosol (OA) dataset collected with an Aerodyne Aerosol Mass Spectrometer (Q-AMS) during the Pittsburgh Air Quality Study in September 2002 was analyzed for components with Positive Matrix Factorization (PMF). Three components – hydrocarbon-like OA (HOA), a highly-oxygenated OA (OOA-I) that correlates well with sulfate, and a less-oxygenated, semi-volatile OA (OOA-II) that correlates well with nitrate and chloride – are identified and interpreted as primary combustion emissions, aged secondary organic aerosol (SOA), and semi-volatile, less aged SOA, respectively. The complexity of interpreting the PMF solutions of unit mass resolution (UMR) AMS data is illustrated by a detailed analysis of the solutions as a function of the number of components and the rotational state. A public database of AMS spectra has been created to aid this type of analysis. A sensitivity analysis with realistic synthetic data is also used to characterize the behavior of PMF for choosing the best number of factors, rotations of non-unique solutions, and the retrievability of more (or less) correlated factors. The ambient and synthetic data indicate that the variation of the PMF quality of fit parameter (Q, a normalized chi-squared metric) vs. the number of factors in the solution is useful for identifying the minimum number of factors, but more detailed analysis and interpretation are needed to choose the best number of factors. The maximum value of the rotational matrix is not useful for determining the best number of factors. In synthetic datasets, factors are "split" into two or more components when solving for more factors than were used in the input. Elements of the "splitting" behavior are observed in solutions of real datasets with several factors. Significant structure remains in the residual of the real dataset after physically-meaningful factors have been assigned, and an unrealistic number of factors would be required to explain the remaining variance.
This residual structure appears to be due to variability in the spectra of the components (especially OOA-II in this case), which is likely to be a key limit on the retrievability of components from AMS datasets using PMF and similar methods that must assume constant component mass spectra. Methods for characterizing and dealing with this variability are needed. Values of the rotational parameter (FPEAK) near zero appear to be most appropriate for these datasets. PMF factors must be interpreted carefully. Synthetic data indicate that PMF internal diagnostics and similarity to available source component spectra, even taken together, are not sufficient for identifying factors. It is critical to use correlations between factor time series and external measurement time series to support factor interpretations. Components with <5% of the mass or with high correlation (R>0.9) with other components are suspect and should be interpreted with care. Results from this study may be useful for interpreting the PMF analysis of data from other aerosol mass spectrometers.
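The two quantitative checks named in the abstract, the quality of fit parameter Q and the screening of suspect components, can be sketched in a few lines. This is a minimal illustration, not the analysis code used in the study. It assumes the usual bilinear PMF model X ≈ GF with an uncertainty matrix sigma, and it assumes the common approximation Q_exp ≈ mn − p(m + n) for the expected Q. It also assumes that "correlation with other components" refers to the factor time series; the abstract does not say whether time series or mass spectra are meant. All function and variable names are hypothetical.

```python
import numpy as np


def pmf_q(X, sigma, G, F):
    """Sum of squared, uncertainty-scaled residuals for the model X ~ G @ F.

    X, sigma: (m, n) data and uncertainty matrices (rows = time, columns = m/z).
    G: (m, p) factor time series; F: (p, n) factor mass spectra.
    """
    scaled_resid = (X - G @ F) / sigma
    return float(np.sum(scaled_resid ** 2))


def q_expected(shape, p):
    """Assumed approximation of the expected Q: m*n - p*(m + n)."""
    m, n = shape
    return m * n - p * (m + n)


def flag_suspect_factors(G, F, min_mass_frac=0.05, max_r=0.9):
    """Indices of factors with a small mass fraction or a highly correlated partner.

    Thresholds follow the abstract (<5% of the mass, R > 0.9). The correlation
    is computed between factor time series, which is an assumption made here.
    """
    # Mass attributed to factor k: (sum of its time series) * (sum of its spectrum).
    mass = G.sum(axis=0) * F.sum(axis=1)
    frac = mass / mass.sum()
    suspect = set(np.flatnonzero(frac < min_mass_frac).tolist())

    # Pearson R between every pair of factor time series (columns of G).
    R = np.corrcoef(G, rowvar=False)
    p = G.shape[1]
    for i in range(p):
        for j in range(i + 1, p):
            if R[i, j] > max_r:
                suspect.update((i, j))
    return sorted(suspect)
```

In this sketch, `pmf_q(X, sigma, G, F) / q_expected(X.shape, p)` would be evaluated for solutions with increasing p, and `flag_suspect_factors` only marks components for closer inspection. Neither step replaces comparison with external tracer time series.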