API
Code documentation of Pydoas API.
Data import
- class pydoas.dataimport.DataImport(setup=None)[source]
Bases: object
A class providing reading routines for DOAS result files
It is assumed that the results are stored in tab-delimited FitResultFiles, in which the columns correspond to the different variables (i.e. fit results, meta information, …) and the rows to the individual spectra.
- property base_dir
Returns current basepath of resultfiles
- check_time_match(data)[source]
Check if data is within time interval set by self.start and self.stop
- Parameters:
data (list) – data as read by read_text_file()
- Returns:
bool, match or no match
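A minimal sketch of such a time-interval check, assuming the spectrum times have already been parsed into datetime objects. The helper name `times_match_interval` and the rule that one time stamp inside the interval counts as a match are assumptions made for illustration; the actual pydoas check may differ.

```python
from datetime import datetime


def times_match_interval(times, start, stop):
    """Return True if at least one time stamp lies within [start, stop].

    Hypothetical helper, not part of pydoas.
    """
    return any(start <= t <= stop for t in times)


times = [datetime(2013, 9, 9, 6, 28), datetime(2013, 9, 9, 7, 15)]

# Interval overlapping the second time stamp
inside = times_match_interval(
    times, datetime(2013, 9, 9, 7, 0), datetime(2013, 9, 9, 8, 0)
)
# Interval on the following day
outside = times_match_interval(
    times, datetime(2013, 9, 10), datetime(2013, 9, 11)
)
```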
- find_all_indices(fileheader, fit_id)[source]
Find all relevant indices for a given result file (fit scenario)
- Parameters:
fileheader (list) – list containing all header strings from the result file (not required if the data access mode is from columns, see also HEADER_ACCESS_OPT() in ResultImportSetup)
fit_id (str) – ID of fit scenario (required in order to find all fitted species that are to be extracted, as specified in self.setup.import_info)
- find_valid_indices_header(fileheader, dict)[source]
Find positions of species in header of result file
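The header lookup can be pictured as follows. This is a sketch only: the helper `find_header_indices`, the substring matching rule and the example header are assumptions, chosen to reflect the requirement that each species is identified by a UNIQUE string in the file header.

```python
def find_header_indices(fileheader, species_dict):
    """Map each species ID to the column index of its header string.

    Hypothetical helper, not part of pydoas. A species is only accepted
    if its search string matches exactly one header column.
    """
    indices = {}
    for species_id, search_str in species_dict.items():
        matches = [i for i, col in enumerate(fileheader) if search_str in col]
        if len(matches) == 1:
            indices[species_id] = matches[0]
    return indices


# Invented header line for illustration
fileheader = ["StartDateAndTime", "SO2_Hermans", "o3_burrows"]
species_dict = {"so2": "SO2_Hermans", "o3": "o3_burrows", "bro": "bro_fleischmann"}

indices = find_header_indices(fileheader, species_dict)
```

Species whose search string is missing from the header (here the invented "bro" entry) are simply left out of the result.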
- property first_file
Get filepath of first file match in self.base_dir
This can for instance be read with read_text_file()
- property fit_err_add_col
Return current value for relative column of fit errors
- get_all_files()[source]
Get all valid files based on current settings
Checks self.base_dir for files matching the specified file type and which include one of the required fit IDs in their name. Files matching these two criteria are opened and the spectrum times are read and checked. If they match the time interval specified by self.start and self.stop, the files are added to the dictionary self.file_paths, where the keys specify the individual fit scenario IDs.
Note
This function does not load data but only assigns the individual result files to the fit IDs. The data can then be loaded by calling load_results()
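The assignment of files to fit IDs described above might look roughly like the sketch below. It works on a plain list of file names and leaves out the time check; the function name, the second file name and the default file type "dat" are assumptions for illustration.

```python
def assign_files_to_fit_ids(file_names, fit_ids, file_type="dat"):
    """Group file names by the fit ID contained in their name.

    Hypothetical helper, not part of pydoas. Only files with the given
    extension are considered; the time interval check is omitted here.
    """
    file_paths = {fit_id: [] for fit_id in fit_ids}
    for name in sorted(file_names):
        if not name.endswith("." + file_type):
            continue
        for fit_id in fit_ids:
            if fit_id in name:
                file_paths[fit_id].append(name)
    return file_paths


file_names = [
    "D130909_S0628_i6_f19_r20_f01so2.dat",
    "D130909_S0628_i6_f19_r20_f02bro.dat",  # invented second file
    "notes.txt",
]
file_paths = assign_files_to_fit_ids(file_names, ["f01", "f02"])
```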
- load_result_type_info()[source]
Load import information for result type specified in setup
The detailed import information is stored in the package data file import_info.txt. This file can also be used to create new file types.
- load_results()[source]
Load all results
The results are loaded as specified in self.import_setup for all valid files detected by get_all_files(), which writes self.file_paths
- read_text_file(p)[source]
Read text file using csv.reader and return data as list
- Parameters:
p (str) – file path
- Returns list:
data
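For orientation, reading a tab-delimited file with csv.reader into a list of rows can be sketched as follows. The function name `read_tab_delimited` and the file content are invented for this example; the tab delimiter follows the class description above.

```python
import csv
import os
import tempfile


def read_tab_delimited(path):
    """Read a tab-delimited text file and return its rows as a list of lists."""
    with open(path, newline="") as f:
        return list(csv.reader(f, delimiter="\t"))


# Write a small invented result file and read it back
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example_f01so2.dat")
    with open(path, "w", newline="") as f:
        f.write("Time\tSO2_Hermans\n06:28:00\t1.5e17\n")
    rows = read_tab_delimited(path)
```

All values come back as strings, so numeric columns and time stamps still need to be converted after reading.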
- property start
Returns start date and time of dataset
- property stop
Returns stop date and time of dataset
- property time_str_format
Returns datetime formatting info for string to datetime conversion
This information should be available in the resultfile type specification file (package data: data/import_info.txt)
- class pydoas.dataimport.ResultImportSetup(base_dir=None, start=datetime.datetime(1900, 1, 1, 0, 0), stop=datetime.datetime(3000, 1, 1, 0, 0), meta_import_info='doasis', result_import_dict={}, default_dict={}, doas_fit_err_factors={}, dev_id='')[source]
Bases: object
Setup class for spectral result imports from text-like files
- Parameters:
base_dir – folder containing resultfiles
start – time stamp of first spectrum
stop – time stamp of last spectrum
meta_import_info – specifies the result file format and columns for meta information (see also file import_info.txt or example script 2). Input can be str or dict. If a string is provided, it is assumed that the specs are defined in import_info.txt, i.e. can be imported (as dictionary) from this file (using get_import_info(), e.g. with arg = doasis). If a dictionary is provided, the information is set directly from the provided dictionary.
result_import_dict –
specifies file and header information for import. Keys define the abbreviations used after import. The value for each key is a list with 2 elements: the first specifies the UNIQUE string which is used to identify this species in the header of a given fit result file, the second is a list of arbitrary length containing the fit scenario IDs defining from which fit scenario result files this specific species is to be extracted.
Example:
result_import_dict = {"so2" : ['SO2_Hermans', ['f01','f02']], "o3" : ['o3_burrows', ['f01']]}
Here “so2” and “o3” are imported. The data column in the result files is found by the header string 'SO2_Hermans' / 'o3_burrows'. “so2” is imported from all fit scenario result files with fit IDs ["f01", "f02"], “o3” only from "f01" (UNIQUE substrings in FitScenario file names).
Example file name:
D130909_S0628_i6_f19_r20_f01so2.dat
This (exemplary) filename convention is used for the example result files shipped with this package (see folder pydoas/data/doasis_resultfiles), which include fit result files from the software DOASIS.
The delimiter for retrieving info from these file names is “_”. The first substring provides info about the date (day), the second about the start time of this time series (HH:MM), the 3rd, 4th and 5th about the first and last fitted spectrum number and the corresponding number of the reference spectrum used for this time series, and the last about the fit scenario (fit ID).
Each result file must therefore include a unique ID in the file name by which it can be identified.
default_dict –
specify default species, e.g.:
dict_like = {"so2" : "f02", "o3" : "f01"}
doas_fit_err_factors –
fit correction factors (i.e. factors by which the DOAS fit error is increased):
dict_like = {"f01" : 4.0, "f02" : 2.0}
dev_id – string ID for DOAS device (of minor importance)
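The file name convention described for result_import_dict can be taken apart with a simple split on “_”. The sketch below returns the raw substrings only and does not interpret their prefixes; the helper name and the dictionary keys are assumptions for illustration.

```python
import os


def split_result_file_name(file_name):
    """Split an example result file name into its "_"-separated parts.

    Hypothetical helper, not part of pydoas. Expects exactly six
    substrings, as in the example file name from the documentation.
    """
    stem, _ = os.path.splitext(file_name)
    day, start_time, first_spec, last_spec, ref_spec, fit_part = stem.split("_")
    return {
        "day": day,
        "start_time": start_time,
        "first_spectrum": first_spec,
        "last_spectrum": last_spec,
        "reference_spectrum": ref_spec,
        "fit_scenario": fit_part,
    }


info = split_result_file_name("D130909_S0628_i6_f19_r20_f01so2.dat")
```

The last substring ("f01so2") contains the fit ID "f01", which is the unique substring used to assign the file to a fit scenario.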
- property FIRST_DATA_ROW_INDEX
- property HEADER_ACCESS_OPT
Checks if current settings allow column identification from file header line
- __init__(base_dir=None, start=datetime.datetime(1900, 1, 1, 0, 0), stop=datetime.datetime(3000, 1, 1, 0, 0), meta_import_info='doasis', result_import_dict={}, default_dict={}, doas_fit_err_factors={}, dev_id='')[source]
- property access_type
Return the current setting for data access type
- property base_path
Old name of base_dir for versions <= 1.0.1
- property fit_ids
Returns list with all fit ids
- get_fit_ids()[source]
Get all fit ID abbreviations
Gets all fit IDs (i.e. keys of fit import dict self.import_info)
- get_fit_ids_species(species_id)[source]
Find all fit scenarios which contain results of species
- Parameters:
species_id (str) – string ID of fitted species (e.g. SO2)
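To illustrate the relation between species and fit scenarios, the sketch below looks up fit IDs directly in a dictionary with the result_import_dict layout from the class description. Both helper functions are hypothetical; pydoas itself works on self.import_info, whose internal layout is not shown here.

```python
result_import_dict = {
    "so2": ["SO2_Hermans", ["f01", "f02"]],
    "o3": ["o3_burrows", ["f01"]],
}


def fit_ids_for_species(import_dict, species_id):
    """Return the fit scenario IDs listed for one species (hypothetical helper)."""
    return list(import_dict[species_id][1])


def all_fit_ids(import_dict):
    """Return every fit scenario ID once, in order of first appearance."""
    ids = []
    for _header_str, fit_ids in import_dict.values():
        for fit_id in fit_ids:
            if fit_id not in ids:
                ids.append(fit_id)
    return ids


so2_fits = fit_ids_for_species(result_import_dict, "so2")
every_fit = all_fit_ids(result_import_dict)
```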
- set_defaults(dict_like)[source]
Update default fit IDs for fitted species
Scheme:
dict_like = {"so2" : "f02", "o3" : "f01"}
- set_fitcorr_factors(dict_like)[source]
Set correction factors for uncertainty estimate from DOAS fit errors
- Parameters:
dict_like (dict) –
dictionary specifying correction factors for DOAS fit errors (which are usually underestimated, see e.g. Gliss et al. 2015) for individual fit scenarios, e.g.:
dict_like = {"f01" : 4.0, "f02" : 2.0}
Default value is 3.0.
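How such correction factors might be applied is sketched below. The helper name and the choice to fall back to 3.0 for fit IDs missing from the dictionary are assumptions based on the default value stated above.

```python
DEFAULT_FACTOR = 3.0  # default value stated in the documentation


def corrected_fit_errors(fit_errs, fit_id, factors):
    """Scale DOAS fit errors by the correction factor of a fit scenario.

    Hypothetical helper, not part of pydoas. Fit IDs without an entry
    in `factors` are assumed to use DEFAULT_FACTOR.
    """
    factor = factors.get(fit_id, DEFAULT_FACTOR)
    return [factor * err for err in fit_errs]


factors = {"f01": 4.0, "f02": 2.0}
errs_f01 = corrected_fit_errors([1.0, 2.0], "f01", factors)
errs_f03 = corrected_fit_errors([1.0, 2.0], "f03", factors)
```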
- set_start_time(dt)[source]
Set the current start time
- Parameters:
dt (datetime) – start time of dataset
- set_stop_time(dt)[source]
Set the current stop time
- Parameters:
dt (datetime) – stop time of dataset
- property start
Start time-stamp of data
- property stop
Stop time-stamp of data
Fit result analysis and plotting
- class pydoas.analysis.DatasetDoasResults(setup=None, init=1, **kwargs)[source]
Bases: object
A Dataset for DOAS fit results import and processing
- setup
setup specifying all necessary import settings (please see documentation of ResultImportSetup for setup details)
- Type:
ResultImportSetup
- Parameters:
setup (ResultImportSetup) – setup specifying all necessary import settings (please see documentation of ResultImportSetup for setup details)
init (int) – if 1, the raw results will be loaded immediately
**kwargs – alternative way to set up setup (ResultImportSetup object), which is only used in case input parameter setup is None.
- change_time_ival(start, stop)[source]
Change the time interval for the considered dataset
Note
Previously loaded results will be deleted
- Parameters:
start (datetime) – new start time
stop (datetime) – new stop time
- Return type:
- get_default_fit_id(species_id)[source]
Get default fit scenario id for species
- Parameters:
species_id (str) – ID of species (e.g. “so2”)
- get_meta_info(fit, meta_id, start=None, stop=None)[source]
Get meta info array
- Parameters:
meta_id (str) – string ID of meta information
boolMask (array) – boolean mask for data retrieval
Note
Bool mask must have same length as the meta data array
- get_results(species_id, fit_id=None, start=None, stop=None)[source]
Get spectral results object
- Parameters:
- get_start_stop_mask(fit, start=None, stop=None)[source]
Creates boolean mask for data access only in a certain time interval
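A boolean start/stop mask of this kind can be sketched with plain Python lists. The function below is a hypothetical stand-in that takes the time stamps directly; treating both interval bounds as inclusive and a bound of None as "no limit" are assumptions.

```python
from datetime import datetime


def start_stop_mask(times, start=None, stop=None):
    """Return one boolean per time stamp: True if start <= t <= stop.

    Hypothetical helper, not part of pydoas. A bound of None is ignored.
    """
    return [
        (start is None or t >= start) and (stop is None or t <= stop)
        for t in times
    ]


times = [
    datetime(2013, 9, 9, 6, 30),
    datetime(2013, 9, 9, 7, 0),
    datetime(2013, 9, 9, 7, 30),
]
mask = start_stop_mask(times, start=datetime(2013, 9, 9, 6, 45))
selected = [t for t, keep in zip(times, mask) if keep]
```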
- property import_info
Returns information about result import details
- linear_regression(x_data, y_data, mask=None, ax=None)[source]
Perform linear regression and return parameters
- Parameters:
x_data (ndarray) – x data array
y_data (ndarray) – y data array
mask (ndarray) – mask specifying indices of input data supposed to be considered for regression (None)
ax – matplotlib axes object (None), if provided, then the result is plotted into the axes
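As a reference for what the regression computes, here is a minimal ordinary least-squares sketch in plain Python. It is not the pydoas implementation: the function name is invented, the mask is assumed to be a list of booleans, and plotting is left out.

```python
def linear_fit(x_data, y_data, mask=None):
    """Fit y = slope * x + intercept by ordinary least squares.

    Hypothetical helper, not part of pydoas. `mask` is assumed to be a
    sequence of booleans selecting the points that enter the fit.
    """
    if mask is None:
        mask = [True] * len(x_data)
    xs = [x for x, keep in zip(x_data, mask) if keep]
    ys = [y for y, keep in zip(y_data, mask) if keep]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# The last point is an outlier and is excluded by the mask
slope, intercept = linear_fit(
    [0.0, 1.0, 2.0, 3.0],
    [1.0, 3.0, 5.0, 100.0],
    mask=[True, True, True, False],
)
```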
- load_input(setup=None, **kwargs)[source]
Process input information
- Parameters:
setup (ResultImportSetup) – setup specifying all necessary import settings (please see documentation of ResultImportSetup for setup details)
**kwargs – alternative way to set up setup (ResultImportSetup object), which is only used in case input parameter setup is None.
- load_raw_results()[source]
Try to load all results as specified in the setup
This method will try to load all results as specified in the setup. If the import setup is not complete, an exception will be raised.
- Returns:
True if data is loaded, False otherwise
- Return type:
bool
- Raises:
AttributeError – If the import setup is not complete
- scatter_plot(species_id_xaxis, fit_id_xaxis, species_id_yaxis, fit_id_yaxis, lin_fit_opt=1, species_id_zaxis=None, fit_id_zaxis=None, start=None, stop=None, ax=None, **kwargs)[source]
Make a scatter plot of two species
- Parameters:
species_id_xaxis (str) – string ID of x axis species (e.g. “so2”)
fit_id_xaxis (str) – fit scenario ID of x axis species (e.g. “f01”)
species_id_yaxis (str) – string ID of y axis species (e.g. “so2”)
fit_id_yaxis (str) – fit scenario ID of y axis species (e.g. “f02”)
species_id_zaxis (str) – string ID of z axis species (e.g. “o3”)
fit_id_zaxis (str) – fit scenario ID of z axis species (e.g. “f01”)
start (datetime) – start time stamp for data retrieval
stop (datetime) – stop time stamp for data retrieval
ax – matplotlib axes object (None), if provided, then the result is plotted into the axes
kwargs – keyword arguments for matplotlib scatter plot (e.g. color, marker, edgecolor, etc.)
- class pydoas.analysis.DoasResults(data, index=None, start_acq=None, stop_acq=None, fit_errs=None, species_id=None, fit_id=None, fit_errs_corr_fac=1.0)[source]
Bases: Series
Data time series for handling and analysing DOAS fit results
- Parameters:
data (arraylike) – DOAS fit results (column densities)
index (arraylike) – Time stamps of data points
fit_errs (arraylike) – DOAS fit errors
species_id (string) – String specifying the fitted species
fit_id (string) – Unique string specifying the fit scenario used
fit_errs_corr_fac (int) – DOAS fit error correction factor
- __init__(data, index=None, start_acq=None, stop_acq=None, fit_errs=None, species_id=None, fit_id=None, fit_errs_corr_fac=1.0)[source]
- get_data_above_detlim()[source]
Get fit results exceeding the detection limit
The detection limit is determined as follows:
self.fit_errs_corr_fac*self.data_err
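The detection limit criterion can be written out as a short sketch on plain lists. The function name is hypothetical, and treating "exceeding" as a strict greater-than comparison is an assumption.

```python
def data_above_detlim(data, data_err, fit_errs_corr_fac=1.0):
    """Return the values that exceed fit_errs_corr_fac * data_err.

    Hypothetical helper, not part of pydoas; works on plain lists.
    """
    return [
        value
        for value, err in zip(data, data_err)
        if value > fit_errs_corr_fac * err
    ]


data = [5.0, 1.0, 9.0]
data_err = [1.0, 1.0, 2.0]

# Detection limits with factor 3.0 are 3.0, 3.0 and 6.0
above = data_above_detlim(data, data_err, fit_errs_corr_fac=3.0)
```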
- merge_other(other, itp_method='linear', dropna=True)[source]
Merge with other time series sampled on different grid
Note
This object will not be changed, instead, two new Series objects will be created and returned
- Parameters:
- Returns:
2-element tuple containing
this Series (merged)
other Series (merged)
- Return type:
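Merging two time series sampled on different grids can be pictured as follows. This sketch is written in plain Python rather than pandas and makes several assumptions: both series are interpolated linearly onto the union of their time stamps, points outside a series' sampled range count as missing, and dropna removes every time stamp where either series is missing. All function names are invented.

```python
from datetime import datetime


def interp_linear(times, values, t):
    """Linearly interpolate at t; return None outside the sampled range.

    Assumes `times` is sorted and strictly increasing.
    """
    if t < times[0] or t > times[-1]:
        return None
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        if t0 <= t <= t1:
            frac = (t - t0) / (t1 - t0)
            return values[i] + frac * (values[i + 1] - values[i])
    return values[0]  # series with a single sample and t == times[0]


def merge_series(times_a, values_a, times_b, values_b, dropna=True):
    """Return (grid, merged_a, merged_b) on the union of both time grids."""
    grid = sorted(set(times_a) | set(times_b))
    merged_a = [interp_linear(times_a, values_a, t) for t in grid]
    merged_b = [interp_linear(times_b, values_b, t) for t in grid]
    if dropna:
        keep = [a is not None and b is not None
                for a, b in zip(merged_a, merged_b)]
        grid = [t for t, k in zip(grid, keep) if k]
        merged_a = [a for a, k in zip(merged_a, keep) if k]
        merged_b = [b for b, k in zip(merged_b, keep) if k]
    return grid, merged_a, merged_b


def minute(m):
    """Build an example time stamp at 06:00 plus m minutes."""
    return datetime(2013, 9, 9, 6, m)


grid, merged_a, merged_b = merge_series(
    [minute(0), minute(2), minute(4)], [0.0, 2.0, 4.0],
    [minute(1), minute(3)], [10.0, 30.0],
)
```

As with the documented method, the inputs are left unchanged and new sequences are returned.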
- plot(date_fmt=None, **kwargs)[source]
Plot time series
Uses plotting utility of Series object (pandas)
- Parameters:
**kwargs – keyword arguments for pandas plot method
- shift(timedelta=datetime.timedelta(0))[source]
Shift time stamps of object
- Parameters:
timedelta (timedelta) – temporal shift
- Returns:
shifted DoasResults object
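The effect of such a shift on the time stamps can be sketched with plain datetime arithmetic. The helper below is hypothetical and operates on a list instead of a DoasResults object; like the documented method, it returns a new sequence rather than modifying its input.

```python
from datetime import datetime, timedelta


def shift_times(times, delta=timedelta(0)):
    """Return a new list with every time stamp moved by `delta`."""
    return [t + delta for t in times]


times = [datetime(2013, 9, 9, 6, 28), datetime(2013, 9, 9, 6, 29)]
shifted = shift_times(times, timedelta(seconds=30))
```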
- property species
Return name of current species
- property start
Start time of data
- property stop
Stop time of data
Supplemental / IO / Helpers
This module contains I/O routines for DOAS result files
- pydoas.inout.get_import_info(resulttype='doasis')[source]
Try to load DOAS result import specification for default type
Import specifications for a specified data type (see package data file “import_info.txt” for available types, use the instructions in this file to create your own import setup if necessary)
- Parameters:
resulttype (str) – name of result type (field “type” in “import_info.txt” file)
- pydoas.inout.get_result_type_ids()[source]
Read file import_info.txt and find all valid import types