API¶
Code documentation of Pydoas API.
Data import¶
-
class
pydoas.dataimport.
ResultImportSetup
(base_dir=None, start=datetime.datetime(1900, 1, 1, 0, 0), stop=datetime.datetime(3000, 1, 1, 0, 0), meta_import_info='doasis', result_import_dict={}, default_dict={}, doas_fit_err_factors={}, dev_id='', lt_to_utc_offset=datetime.timedelta(0))[source]¶ Setup class for spectral result imports from text like files
-
__init__
(base_dir=None, start=datetime.datetime(1900, 1, 1, 0, 0), stop=datetime.datetime(3000, 1, 1, 0, 0), meta_import_info='doasis', result_import_dict={}, default_dict={}, doas_fit_err_factors={}, dev_id='', lt_to_utc_offset=datetime.timedelta(0))[source]¶ Parameters: - base_dir (str) – folder containing resultfiles
- start (datetime) – time stamp of first spectrum
- stop (datetime) – time stamp of last spectrum
- meta_import_info – Specify the result file format and columns for meta information (see also file import_info.txt or example script 2). Input can be str or dict. If a string is provided, it is assumed that the specs are defined in import_info.txt, i.e. can be imported (as dictionary) from this file (using get_import_info(), e.g. with arg = "doasis"). If a dictionary is provided, the information is set directly from that dictionary.
- result_import_dict (dict) –
specify file and header information for import. Keys define the abbreviations used after import. The value for each key is a list with 2 elements: the first specifies the UNIQUE string used to identify this species in the header of a given fit result file; the second is a list of arbitrary length containing the IDs of the fit scenarios from whose result files this species is to be extracted.
Example:
result_import_dict = {"so2" : ['SO2_Hermans', ['f01','f02']], "o3" : ['o3_burrows', ['f01']]}
Here, "so2" and "o3" are imported. The data column in the result files is found by the header string 'SO2_Hermans' or 'o3_burrows', respectively, and each species is imported from all fit scenario result files with the listed fit IDs (["f01", "f02"] for "so2", ["f01"] for "o3"). Fit IDs are UNIQUE substrings in the fit scenario file names. Exemplary file name:
D130909_S0628_i6_f19_r20_f01so2.dat
This (exemplary) filename convention is used for the example result files shipped with this package (see folder pydoas/data/doasis_resultfiles) which include fit result files from the software DOASIS.
The delimiter for retrieving info from these file names is "_". The first substring provides the date (day), the second the start time of this time series (HHMM), the 3rd, 4th and 5th the first and last fitted spectrum number and the number of the reference spectrum used for this time series, and the last the fit scenario (fit ID).
Each resultfile must therefore include a unique ID in the file name by which it can be identified.
- default_dict (dict) –
specify default species, e.g.:
dict_like = {"so2" : "f02", "o3" : "f01"}
- doas_fit_err_factors (dict) –
fit correction factors (i.e. factors by which the DOAS fit error is increased):
dict_like = {"f01" : 4.0, "f02" : 2.0}
- dev_id (str) – string ID for DOAS device (of minor importance)
- lt_to_utc_offset (timedelta) – specify time zone offset (will be added on data import if applicable).
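The file-name convention described for result_import_dict can be sketched with plain string splitting. This is a minimal illustration and not pydoas code: the helper parse_result_filename is hypothetical, and reading "D130909" as YYMMDD, "S0628" as HHMM, and "i", "f", "r" as prefixes of the first, last and reference spectrum numbers is an assumption based on the description above.

```python
from datetime import datetime


def parse_result_filename(name):
    """Split a result file name into its "_"-delimited parts (hypothetical helper)."""
    stem = name.rsplit(".", 1)[0]  # drop the file extension
    date_s, time_s, first, last, ref, fit = stem.split("_")
    return {
        # Assumption: "D130909" is YYMMDD and "S0628" is HHMM
        "start": datetime.strptime(date_s[1:] + time_s[1:], "%y%m%d%H%M"),
        "first_spec": int(first[1:]),  # "i6"  -> 6
        "last_spec": int(last[1:]),    # "f19" -> 19
        "ref_spec": int(ref[1:]),      # "r20" -> 20
        "fit_part": fit,               # contains the fit ID, e.g. "f01so2"
    }


info = parse_result_filename("D130909_S0628_i6_f19_r20_f01so2.dat")
```

The last substring is kept whole because only the fit ID needs to be a unique substring of it.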
-
start
¶ Start time-stamp of data
-
stop
¶ Stop time-stamp of data
-
base_path
¶ Old name of base_dir for versions <= 1.0.1
-
set_start_time
(dt)[source]¶ Set the current start time
Parameters: dt (datetime) – start time of dataset
-
set_stop_time
(dt)[source]¶ Set the current stop time
Parameters: dt (datetime) – stop time of dataset
-
set_defaults
(dict_like)[source]¶ Update default fit IDs for fitted species
Scheme:
dict_like = {"so2" : "f02", "o3" : "f01"}
-
set_fitcorr_factors
(dict_like)[source]¶ Set correction factors for uncertainty estimate from DOAS fit errors
Parameters: dict_like (dict) – dictionary specifying correction factors for DOAS fit errors (which are usually underestimated, see e.g. Gliss et al. 2015) for individual fit scenarios, e.g.:
dict_like = {"f01" : 4.0, "f02" : 2.0}
Default value is 3.0.
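How such factors might be applied can be sketched as follows. The function corrected_fit_error is hypothetical, and it is an assumption that the default of 3.0 is used for fit scenarios missing from the dictionary.

```python
DEFAULT_FACTOR = 3.0  # default value stated in the documentation


def corrected_fit_error(fit_err, fit_id, factors):
    """Scale a DOAS fit error by the factor of its fit scenario (hypothetical helper)."""
    # Assumption: scenarios not listed in `factors` fall back to the default
    return fit_err * factors.get(fit_id, DEFAULT_FACTOR)


factors = {"f01": 4.0, "f02": 2.0}
err_f01 = corrected_fit_error(2.0, "f01", factors)  # uses the listed factor
err_f03 = corrected_fit_error(2.0, "f03", factors)  # falls back to the default
```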
-
xs
¶ Returns list with xs names
-
get_fit_ids_species
(species_id)[source]¶ Find all fit scenarios which contain results of species
Parameters: species_id (str) – string ID of fitted species (e.g. SO2)
-
fit_ids
¶ Returns list with all fit ids
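Both lookups follow directly from the structure of result_import_dict. The sketch below works on a plain dictionary rather than on a ResultImportSetup object; the two function names are hypothetical and only illustrate the idea.

```python
result_import_dict = {
    "so2": ["SO2_Hermans", ["f01", "f02"]],
    "o3": ["o3_burrows", ["f01"]],
}


def fit_ids_for_species(import_dict, species_id):
    """Return the fit scenario IDs listed for one species."""
    return list(import_dict[species_id][1])


def all_fit_ids(import_dict):
    """Return every fit ID that appears for any species, sorted."""
    ids = set()
    for _header_str, fit_ids in import_dict.values():
        ids.update(fit_ids)
    return sorted(ids)


so2_ids = fit_ids_for_species(result_import_dict, "so2")
every_id = all_fit_ids(result_import_dict)
```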
-
access_type
¶ Return the current setting for data access type
-
HEADER_ACCESS_OPT
¶ Checks if current settings allow column identification from file header line
-
FIRST_DATA_ROW_INDEX
¶
-
-
class
pydoas.dataimport.
DataImport
(setup=None)[source]¶ A class providing reading routines of DOAS result files
Here, it is assumed that the results are stored in tab-delimited FitResultFiles, in which the columns correspond to the different variables (i.e. fit results, meta info, …) and the rows to the individual spectra.
-
load_result_type_info
()[source]¶ Load import information for result type specified in setup
The detailed import information is stored in the package data file import_info.txt. This file can also be used to create new file types.
-
base_dir
¶ Returns current basepath of resultfiles
-
start
¶ Returns start date and time of dataset
-
stop
¶ Returns stop date and time of dataset
-
time_str_format
¶ Returns datetime formatting info for string to datetime conversion
This information should be available in the resultfile type specification file (package data: data/import_info.txt)
-
fit_err_add_col
¶ Return current value for relative column of fit errors
-
find_valid_indices_header
(fileheader, dict)[source]¶ Find positions of species in header of result file
Parameters:
-
find_all_indices
(fileheader, fit_id)[source]¶ Find all relevant indices for a given result file (fit scenario)
Parameters: - fileheader (list) – list containing all header strings from the result file (not required if data access mode is from columns, see also HEADER_ACCESS_OPT() in ResultImportSetup)
- fit_id (str) – ID of fit scenario (required in order to find all fitted species supposed to be extracted, specified in self.setup.import_info)
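The header search can be illustrated with a small stand-alone function. This is a sketch under assumptions, not the library's implementation: find_species_columns and the header strings in the example are hypothetical, and the function works on a result_import_dict-style dictionary.

```python
def find_species_columns(fileheader, import_dict, fit_id):
    """Map species abbreviation -> column index for one fit scenario."""
    indices = {}
    for species, (header_str, fit_ids) in import_dict.items():
        if fit_id not in fit_ids:
            continue  # species is not extracted from this scenario
        matches = [i for i, col in enumerate(fileheader) if header_str in col]
        if len(matches) == 1:  # the search string must be unique in the header
            indices[species] = matches[0]
    return indices


import_dict = {
    "so2": ["SO2_Hermans", ["f01", "f02"]],
    "o3": ["o3_burrows", ["f01"]],
}
# Hypothetical header line of a result file
header = ["Time", "SO2_Hermans", "SO2 fit error", "o3_burrows", "O3 fit error"]

cols_f01 = find_species_columns(header, import_dict, "f01")
cols_f02 = find_species_columns(header, import_dict, "f02")
```

Species whose search string matches no column or more than one column are skipped here, which mirrors the requirement that the string be unique.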
-
load_results
()[source]¶ Load all results
The results are loaded as specified in self.import_setup for all valid files which were detected in get_all_files(), which writes self.file_paths
-
check_time_match
(data)[source]¶ Check if data is within time interval set by self.start and self.stop
Parameters: data (list) – data as read by read_text_file()
Returns: - bool, Match or no match
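One plausible reading of this check is an interval-overlap test. The sketch below assumes that a file counts as a match if its spectrum times overlap the [start, stop] interval at all; the function name times_overlap is hypothetical, and the library may apply a stricter criterion.

```python
from datetime import datetime


def times_overlap(first_time, last_time, start, stop):
    """True if [first_time, last_time] intersects [start, stop]."""
    return first_time <= stop and last_time >= start


start = datetime(2015, 3, 1, 10, 0)
stop = datetime(2015, 3, 1, 12, 0)

inside = times_overlap(datetime(2015, 3, 1, 10, 30),
                       datetime(2015, 3, 1, 11, 0), start, stop)
outside = times_overlap(datetime(2015, 3, 1, 13, 0),
                        datetime(2015, 3, 1, 14, 0), start, stop)
```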
-
first_file
¶ Get filepath of first file match in self.base_dir
This can for instance be read with read_text_file()
-
get_all_files
()[source]¶ Get all valid files based on current settings
Checks self.base_dir for files matching the specified file type and which include one of the required fit IDs in their name. Files matching these 2 criteria are opened and the spectrum times are read and checked. If they match the time interval specified by self.start and self.stop, the files are added to the dictionary self.file_paths, where the keys specify the individual fit scenario IDs.
Note
This function does not load data but only assigns the individual result files to the fit IDs. The data are then loaded by calling load_results()
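The assignment of files to fit IDs can be sketched as below. This is a simplified illustration that skips the time check; group_files_by_fit_id is a hypothetical helper, the ".dat" extension and the second file name are made up for the example, and searching for the fit ID only in the last "_"-delimited substring is an assumption.

```python
def group_files_by_fit_id(filenames, fit_ids, extension=".dat"):
    """Assign file names to the fit IDs found in their last name field."""
    file_paths = {fit_id: [] for fit_id in fit_ids}
    for name in sorted(filenames):
        if not name.endswith(extension):
            continue  # wrong file type
        fit_part = name[: -len(extension)].split("_")[-1]
        for fit_id in fit_ids:
            if fit_id in fit_part:
                file_paths[fit_id].append(name)
    return file_paths


names = [
    "D130909_S0628_i6_f19_r20_f01so2.dat",
    "D130909_S0628_i6_f19_r20_f02bro.dat",  # hypothetical second scenario
    "notes.txt",
]
file_paths = group_files_by_fit_id(names, ["f01", "f02"])
```

Restricting the search to the last field avoids a false match with the "f19" field, which denotes the last fitted spectrum rather than a fit scenario.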
-
Fit result analysis and plotting¶
-
class
pydoas.analysis.
DatasetDoasResults
(setup=None, init=1, **kwargs)[source]¶ A Dataset for DOAS fit results import and processing
-
__init__
(setup=None, init=1, **kwargs)[source]¶ Initialisation of object
Parameters: - setup (ResultImportSetup) – setup specifying all necessary import settings (please see documentation of ResultImportSetup for setup details)
- **kwargs – alternative way to set up self.setup (ResultImportSetup object), which is only used if the input parameter setup is invalid. Valid keyword arguments are the input parameters of the ResultImportSetup object.
-
load_input
(setup=None, **kwargs)[source]¶ Process input information
Writes self.setup based on setup
Parameters: - setup – is set if valid (i.e. if input is ResultImportSetup)
- **kwargs – keyword arguments for new ResultImportSetup (are used in case first parameter is invalid)
-
base_path
¶ Returns current basepath of resultfiles (from self.setup)
-
start
¶ Returns start date and time of dataset (from self.setup)
-
stop
¶ Returns stop date and time of dataset (from self.setup)
-
dev_id
¶ Returns device ID of dataset (from self.setup)
-
import_info
¶ Returns information about result import details
-
change_time_ival
(start, stop)[source]¶ Change the time interval for the considered dataset
Parameters: - start (datetime) – new start time
- stop (datetime) – new stop time
Note
Previously loaded results will be deleted
-
get_start_stop_mask
(fit, start=None, stop=None)[source]¶ Creates boolean mask for data access only in a certain time interval
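The idea of such a mask can be shown on a plain list of time stamps. This is a minimal sketch and not the library's code: start_stop_mask is hypothetical, the bounds are treated as inclusive, and None is taken to mean "no bound", whereas the library may instead fall back to the dataset's own start and stop.

```python
from datetime import datetime


def start_stop_mask(times, start=None, stop=None):
    """Return one boolean per time stamp: True if it lies in [start, stop]."""
    return [
        (start is None or t >= start) and (stop is None or t <= stop)
        for t in times
    ]


times = [datetime(2015, 1, 1, hour) for hour in (10, 11, 12)]
mask = start_stop_mask(times, start=datetime(2015, 1, 1, 11))
selected = [t for t, keep in zip(times, mask) if keep]
```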
-
get_meta_info
(fit, meta_id, start=None, stop=None)[source]¶ Get meta info array
Parameters: - meta_id (str) – string ID of meta information
- boolMask (array) – boolean mask for data retrieval
Note
Bool mask must have same length as the meta data array
-
get_results
(species_id, fit_id=None, start=None, stop=None)[source]¶ Get spectral results object
Parameters:
-
get_default_fit_id
(species_id)[source]¶ Get default fit scenario id for species
Parameters: species_id (str) – ID of species (e.g. “so2”)
-
set_default_fitscenarios
(default_dict)[source]¶ Update default fit scenarios for species
Parameters: default_dict (dict) – dictionary specifying new default fit scenarios, it could e.g. look like:
default_dict = {"so2" : "f01", "o3" : "f01", "bro" : "f03"}
-
scatter_plot
(species_id_xaxis, fit_id_xaxis, species_id_yaxis, fit_id_yaxis, lin_fit_opt=1, species_id_zaxis=None, fit_id_zaxis=None, start=None, stop=None, ax=None, **kwargs)[source]¶ Make a scatter plot of two species
Parameters: - species_id_xaxis (str) – string ID of x axis species (e.g. “so2”)
- fit_id_xaxis (str) – fit scenario ID of x axis species (e.g. “f01”)
- species_id_yaxis (str) – string ID of y axis species (e.g. “so2”)
- fit_id_yaxis (str) – fit scenario ID of y axis species (e.g. “f02”)
- species_id_zaxis (str) – string ID of z axis species (e.g. “o3”)
- fit_id_zaxis (str) – fit scenario ID of z axis species (e.g. “f01”)
:param bool linF
-
linear_regression
(x_data, y_data, mask=None, ax=None)[source]¶ Perform linear regression and return parameters
Parameters: - x_data (ndarray) – x data array
- y_data (ndarray) – y data array
- mask (ndarray) – mask specifying indices of input data supposed to be considered for regression (None)
- ax – matplotlib axes object (None), if provided, then the result is plotted into the axes
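For orientation, an ordinary least-squares line fit with an optional mask can be written in a few lines. This sketch uses only the standard library and is not the method's implementation: linear_fit is hypothetical, the mask is assumed to be boolean, and the return value (slope, intercept) is an assumption about what "parameters" means here.

```python
def linear_fit(x_data, y_data, mask=None):
    """Ordinary least-squares fit y = slope * x + intercept on masked data."""
    if mask is None:
        mask = [True] * len(x_data)
    pts = [(x, y) for x, y, keep in zip(x_data, y_data, mask) if keep]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# The last point is an outlier and is excluded through the mask
slope, intercept = linear_fit(
    [0.0, 1.0, 2.0, 3.0],
    [1.0, 3.0, 5.0, 100.0],
    mask=[True, True, True, False],
)
```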
-
-
class
pydoas.analysis.
DoasResults
(data, index=None, start_acq=[], stop_acq=[], fit_errs=None, species_id='', fit_id='', fit_errs_corr_fac=1.0)[source]¶ Data time series object inheriting from pandas.Series for handling and analysing DOAS fit results
Parameters: - data (arraylike) – DOAS fit results (column densities)
- index (arraylike) – Time stamps of data points
- fit_errs (arraylike) – DOAS fit errors
- species_id (string) – String specifying the fitted species
- fit_id (string) – Unique string specifying the fit scenario used
- fit_errs_corr_fac (int) – DOAS fit error correction factor
Todo
Finish magic methods, i.e. apply error propagation, think about time merging etc…
-
__init__
(data, index=None, start_acq=[], stop_acq=[], fit_errs=None, species_id='', fit_id='', fit_errs_corr_fac=1.0)[source]¶ x.__init__(…) initializes x; see help(type(x)) for signature
-
fit_errs
= None¶
-
fit_id
= None¶
-
fit_errs_corr_fac
= None¶
-
start_acq
= []¶
-
stop_acq
= []¶
-
start
¶ Start time of data
-
stop
¶ Stop time of data
-
species
¶ Return name of current species
-
merge_other
(other, itp_method='linear', dropna=True)[source]¶ Merge with other time series sampled on different grid
Note
This object will not be changed, instead, two new Series objects will be created and returned
Parameters: Returns: 2-element tuple containing
- this Series (merged)
- other Series (merged)
Return type:
-
get_data_above_detlim
()[source]¶ Get fit results exceeding the detection limit
The detection limit is determined as follows:
self.fit_errs_corr_fac*self.data_err
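The selection rule can be illustrated on plain lists. In this sketch above_detection_limit is a hypothetical stand-alone function, the detection limit is taken per data point as correction factor times fit error, and "exceeding" is read as strictly greater.

```python
def above_detection_limit(values, fit_errs, corr_fac=1.0):
    """Keep values that exceed corr_fac * fit error of the same data point."""
    return [v for v, err in zip(values, fit_errs) if v > corr_fac * err]


values = [1.0, 5.0, 2.0]
fit_errs = [1.0, 1.0, 1.0]
kept = above_detection_limit(values, fit_errs, corr_fac=2.0)
```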
-
plot
(date_fmt=None, **kwargs)[source]¶ Plot time series
Uses plotting utility of Series object (pandas)
Parameters: **kwargs – keyword arguments for pandas plot method
-
shift
(timedelta=datetime.timedelta(0))[source]¶ Shift time stamps of object
Parameters: timedelta (timedelta) – temporal shift
Returns: shifted DoasResults object
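Shifting time stamps amounts to adding a timedelta to every index entry, which is also how an offset such as lt_to_utc_offset would act. The sketch below uses a plain list instead of a DoasResults object; shift_times is a hypothetical helper that returns new time stamps and leaves the input untouched.

```python
from datetime import datetime, timedelta


def shift_times(times, delta=timedelta(0)):
    """Return new time stamps shifted by delta."""
    return [t + delta for t in times]


local_times = [datetime(2015, 3, 1, 10, 0), datetime(2015, 3, 1, 10, 5)]
# Example: convert local time to UTC for an assumed zone two hours ahead of UTC
utc_times = shift_times(local_times, timedelta(hours=-2))
```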
Supplemental / IO / Helpers¶
This module contains I/O routines for DOAS result files
-
pydoas.inout.
get_data_dirs
()[source]¶ Get directories containing example package data
Returns: list of package subfolders containing data files
-
pydoas.inout.
get_data_files
(which=u'doasis')[source]¶ Get all example result files from package data
-
pydoas.inout.
get_result_type_ids
()[source]¶ Read file import_info.txt and find all valid import types
-
pydoas.inout.
import_type_exists
(type_id)[source]¶ Checks if data import type exists in import_info.txt
Parameters: type_id (str) – string ID to be searched in import_info.txt
-
pydoas.inout.
get_import_info
(resulttype=u'doasis')[source]¶ Try to load DOAS result import specification for default type
Import specifications for a specified data type (see package data file “import_info.txt” for available types, use the instructions in this file to create your own import setup if necessary)
Parameters: resulttype (str) – name of result type (field “type” in “import_info.txt” file)