Statistical analysis of particulate matter data in Doha, Qatar

TAYLOR, Charles C., YOUSIF, Asil E. and MWITONDI, Kassim (2018). Statistical analysis of particulate matter data in Doha, Qatar. WIT Transactions on Ecology and the Environment, 230, 107-118. [Article]

Documents

Mwitondi Statistical analysis of particulate matter data in Doha.pdf - Published Version
Available under License All rights reserved.

Abstract

Pollution in Doha is measured using passive, active and automatic sampling. In this paper we consider automatically sampled data, in which various pollutants were collected continuously and analysed every hour. At each station the sample is analysed online and in real time, and the data is stored within the analyser or in a separate logger so that it can be downloaded remotely via a modem. The resulting accuracy enables pollution episodes to be analysed in detail and related to traffic flows, meteorology and other variables. Data has been collected hourly over more than 6 years at 3 different locations, with measurements available for various pollutants (for example, ozone, nitrogen oxides, sulphur dioxide, carbon monoxide, THC, methane and particulate matter: PM1.0, PM2.5 and PM10), as well as meteorological data such as humidity, temperature, and wind speed and direction. Despite much care in the data collection process, the resulting data has long stretches of missing values where the equipment malfunctioned, often as a result of more extreme conditions. Our analysis is twofold. Firstly, we consider ways to “clean” the data by imputing missing values, including values identified as outliers. Secondly, we consider prediction of each particulate (PM1.0, PM2.5 and PM10) 24 hours ahead, using current and previous pollution and meteorological data. For this we use vector autoregressive models, compare them with decision trees, and propose variable selection criteria which explicitly adapt to missing data. Our results show that the regression tree models, with no variable transformations, perform the best, and that attempts to impute missing values are hampered by non-random missingness.
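As an illustration only, the 24-hours-ahead setup described in the abstract can be pictured as pairing predictors observed at hour t (and a few earlier hours) with the target observed at hour t + 24. The sketch below is not the authors' code: the function name `make_lagged_pairs`, the choice of lags and the toy series are all assumptions made for the example. Missing hourly readings are represented as `None`; rows with a missing target are dropped, while missing predictors are kept so that a later model (for instance a regression tree that handles missing inputs) can decide what to do with them.

```python
from typing import List, Optional, Tuple

Reading = Optional[float]  # None marks a missing hourly value (assumed encoding)


def make_lagged_pairs(
    series: List[Reading],
    lags: Tuple[int, ...] = (0, 1),
    horizon: int = 24,
) -> List[Tuple[List[Reading], float]]:
    """Pair lagged predictors at hour t with the target at hour t + horizon.

    Rows whose target is missing are dropped. Missing predictors are kept
    as None so that a downstream model can decide how to handle them.
    """
    pairs = []
    max_lag = max(lags)
    for t in range(max_lag, len(series) - horizon):
        target = series[t + horizon]
        if target is None:
            continue  # no observed value to predict at this horizon
        features = [series[t - lag] for lag in lags]
        pairs.append((features, target))
    return pairs


# Toy hourly series (hypothetical values): 30 hours, with hour 5 missing.
series: List[Reading] = [float(h) for h in range(30)]
series[5] = None

pairs = make_lagged_pairs(series, lags=(0, 1), horizon=24)
for features, target in pairs:
    print(features, "->", target)
```

In this toy run the last row keeps its missing predictor rather than being discarded, which mirrors the abstract's point that long stretches of missing values have to be handled explicitly rather than silently removed.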

More Information

Official URL:

https://www.witpress.com/elibrary/wit-transactions...

Additional Information:

From Crossref via Jisc Publications Router.

Research Institute, Centre or Group - Does NOT include content added after October 2018:

Cultural Communication and Computing Research Institute > Communication and Computing Research Centre

Departments - Does NOT include content added after October 2018:

Faculty of Science, Technology and Arts > Department of Computing

Page Range:

107-118

Identifiers