Statistical analysis of particulate matter data in Doha, Qatar

TAYLOR, Charles C., YOUSIF, Asil E. and MWITONDI, Kassim (2018). Statistical analysis of particulate matter data in Doha, Qatar. WIT Transactions on Ecology and the Environment, 230, 107-118.

Mwitondi Statistical analysis of particulate matter data in Doha.pdf - Published Version
All rights reserved.

Official URL: https://www.witpress.com/elibrary/wit-transactions...
Link to published version: https://doi.org/10.2495/air180101

    Abstract

    Pollution in Doha is measured using passive, active and automatic sampling. In this paper we consider automatically sampled data, in which various pollutants were continually collected and analysed every hour. At each station the sample is analysed on-line and in real time, and the data is stored within the analyser or in a separate logger, so it can be downloaded remotely by modem. The accuracy produced enables pollution episodes to be analysed in detail and related to traffic flows, meteorology and other variables. Data has been collected hourly over more than 6 years at 3 different locations, with measurements available for various pollutants (for example, ozone, nitrogen oxides, sulphur dioxide, carbon monoxide, THC, methane and particulate matter: PM1.0, PM2.5 and PM10), as well as meteorological data such as humidity, temperature, and wind speed and direction. Despite much care in the data collection process, the resultant data has long stretches of missing values where the equipment malfunctioned, often as a result of more extreme conditions. Our analysis is twofold. First, we consider ways to “clean” the data by imputing missing values, including values identified as outliers. Second, we consider prediction of each particulate measure (PM1.0, PM2.5 and PM10) 24 hours ahead, using current and previous pollution and meteorological data. For this, we use vector autoregressive models, compare them with decision trees, and propose variable selection criteria which explicitly adapt to missing data. Our results show that the regression tree models, with no variable transformations, perform the best, and that attempts to impute missing values are hampered by non-random missingness.
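
    As a rough illustration of the 24-hours-ahead prediction setup described in the abstract, the sketch below pairs current and previous hourly readings with the reading 24 hours later, and fits a depth-1 regression tree (a "stump") by least squares. It is not the authors' implementation: the function names, the lag choice, the toy readings, and the rule of simply skipping any pair that touches a missing value are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Row = Tuple[List[float], float]
Stump = Tuple[int, float, float, float]


def make_supervised(
    series: Sequence[Optional[float]],
    horizon: int = 24,
    lags: Sequence[int] = (0, 1),
) -> List[Row]:
    """Pair lagged values at hour t with the value at hour t + horizon.

    None marks a missing hourly reading. Any pair that touches a missing
    value is skipped (an assumption of this sketch, not the paper's rule).
    """
    rows: List[Row] = []
    for t in range(max(lags), len(series) - horizon):
        features = [series[t - lag] for lag in lags]
        target = series[t + horizon]
        if target is None or any(f is None for f in features):
            continue
        rows.append(([float(f) for f in features], float(target)))
    return rows


def fit_stump(rows: List[Row]) -> Stump:
    """Fit a depth-1 regression tree by minimising the sum of squared errors.

    Returns (feature_index, threshold, left_mean, right_mean); the left
    branch holds rows whose feature value is <= threshold.
    """
    if not rows:
        raise ValueError("no complete rows to fit")
    best = None
    for j in range(len(rows[0][0])):
        values = sorted({x[j] for x, _ in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [y for x, y in rows if x[j] <= thr]
            right = [y for x, y in rows if x[j] > thr]
            lm = sum(left) / len(left)
            rm = sum(right) / len(right)
            sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, lm, rm)
    if best is None:
        raise ValueError("need at least two distinct feature values to split")
    return best[1], best[2], best[3], best[4]


def predict(stump: Stump, features: Sequence[float]) -> float:
    """Return the mean target of the branch that the features fall into."""
    j, thr, left_mean, right_mean = stump
    return left_mean if features[j] <= thr else right_mean


# Hypothetical hourly readings (None = missing); not data from the paper.
readings = [1.0, None, 3.0, 4.0, 5.0]
pairs = make_supervised(readings, horizon=2, lags=(0,))

# Hypothetical (features, target) rows with an obvious split.
toy_rows = [([1.0], 10.0), ([2.0], 10.0), ([8.0], 50.0), ([9.0], 50.0)]
stump = fit_stump(toy_rows)
```

    With hourly data, calling `make_supervised(series, horizon=24)` gives the 24-hours-ahead framing. A full regression tree would apply the same split search recursively within each branch, and meteorological variables would enter as further feature columns.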

    Item Type: Article
    Additional Information: ** From Crossref via Jisc Publications Router.
    Research Institute, Centre or Group - Does NOT include content added after October 2018: Cultural Communication and Computing Research Institute > Communication and Computing Research Centre
    Departments - Does NOT include content added after October 2018: Faculty of Science, Technology and Arts > Department of Computing
    Identification Number: https://doi.org/10.2495/air180101
    Page Range: 107-118
    SWORD Depositor: Margaret Boot
    Depositing User: Margaret Boot
    Date Deposited: 31 Oct 2018 12:09
    Last Modified: 16 Nov 2018 11:59
    URI: http://shura.shu.ac.uk/id/eprint/23018
