I developed a Python module named GetNCEI, with several functions that make it easier to dig into Climate Data Online (CDO), managed by the United States National Climatic Data Center (NCDC).
GetNCEI uses API requests to access the CDO web services and helps narrow each request down to the most relevant data. It has functions to find specific datatypes (e.g. temperature, wind) or stations by keyword, and to find the stations nearest to a given coordinate point, which can also be visualized on a cartographic chart.
NCDC's Climate Data Online (CDO) offers web services that provide access to current data. This API is for developers looking to create their own scripts or programs that use the CDO database of weather and climate data. An access token is required to use the API, and each token will be limited to five requests per second and 10,000 requests per day.
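To stay under the five-requests-per-second limit, calls can be spaced out on the client side. The sketch below is one possible approach, not how GetNCEI itself handles it; the class name `MinIntervalLimiter` is hypothetical, and the 10,000-per-day cap is not tracked here.

```python
import time


class MinIntervalLimiter:
    """Space out calls so that at most `per_second` calls start each second."""

    def __init__(self, per_second=5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / per_second
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `wait()` on a shared limiter before each request keeps consecutive requests at least 0.2 seconds apart.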
CDO data can generally be described as follows:
- Data is organized into `datasets`, based on its `datatypes`, which describe the context of the data.
- `datacategories` are general types of data used to group similar datatypes.
- Data is recorded by `stations` (for most datasets). Stations may be grouped in `locations`.
- Locations are categorized by Location Categories, such as City, Country, etc.
- The `station` field records the origin of the data.
- The `datatype` field describes the context of the data.
- The `value` field is the actual value of the data, corresponding to its datatype.

From the explanation above, we can see that CDO has these kinds of data: datasets, datacategories, datatypes, locationcategories, locations, and stations.
To retrieve each kind of data, CDO provides several endpoints named after them. The endpoint is part of the URL used to access the online database: https://www.ncei.noaa.gov/cdo-web/api/v2/{endpoint}.
- `datasets`: gets the available datasets. The returned data has an 'id' field that can be used as `datasetid`.
- `datatypes`: gets the available datatypes. As with the datasets endpoint, the returned 'id' field can be used as `datatypeid`.
- `locationcategories`: gets the available location categories; its 'id' field can be used as `locationcategoryid`.
- `locations`: gets the available locations; its 'id' field can be used as `locationid`.
- `stations`: gets the available stations; its 'id' field can be used as `stationid`.

The recorded (raw) data itself is different from the kinds of data listed above. This brings us to the sixth endpoint:
- `data`: actually fetches the data.

The `id` fields above (such as `datasetid`, `datatypeid`, `stationid`) can be used to narrow down the data we want to request. They are passed as Additional Parameters to the `data` endpoint in the URL.
Moreover, the `id` fields can also be passed as Additional Parameters to each of the first five endpoints, to confirm what a given id supports. For example, passing `datasetid=GHCND` to the `stations` endpoint guarantees that the returned stations support the GHCND dataset. See https://www.ncdc.noaa.gov/cdo-web/webservices/v2#gettingStarted to identify which ids can be used as filter parameters for which endpoints.
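As a rough illustration of how an endpoint and its Additional Parameters combine into a request URL, here is a minimal sketch using only the standard library. The helper names `build_url` and `build_headers` are hypothetical and are not part of GetNCEI; the sketch assumes that list values are sent as repeated query keys and that the token is sent in a request header named `token`, as the CDO web services documentation describes.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.ncei.noaa.gov/cdo-web/api/v2"


def build_url(endpoint, params=None):
    """Return the request URL for an endpoint plus optional filter parameters."""
    url = f"{BASE_URL}/{endpoint}"
    if params:
        # doseq=True repeats a key once per list item, e.g. a=1&a=2
        url += "?" + urlencode(params, doseq=True)
    return url


def build_headers(token):
    """Return the request headers carrying the access token."""
    return {"token": token}


# Stations that support the GHCND dataset
print(build_url("stations", {"datasetid": "GHCND"}))
```

The resulting URL and headers could then be passed to any HTTP client.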
The flowchart below explains how the endpoints work together to get the recorded climate data (red font: endpoint access):
Further explanation:
The minimum requirement for GetNCEI is Python 3.9+ (recommended), with the following libraries installed:
GetNCEI provides the following general functionality:
GetNCEI Class Object | Function |
---|---|
FindDataIdNCEI | Helps users find pairs of datasetid-datatypeid based on a specific keyword. |
FindLocationInfoNCEI | Helps users find a locationid based on a specific keyword. Specifying a locationid in each request is highly recommended, especially when searching for stations, since it narrows the scope of the request instead of exploring every station in the database. As of this release of the documentation, there are around 140,000 stations available. Each request to the website is limited to 1000 rows, so retrieving all of those stations would take more than a hundred requests, which is time-consuming. |
FindStationInfoNCEI | Helps users find stations based on a specific keyword. As mentioned above, specifying a locationid with this function is highly recommended. |
FindNearestStationNCEI | Helps users find the nearest stations to a given coordinate (decimal degrees). It can return the n nearest stations to the target coordinate and visualize the location of each station. |
GetNCEI | Performs requests to the website to retrieve data. |
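To make the cost of an unfiltered station search concrete, the arithmetic can be sketched as below, using the figures quoted in this document (around 140,000 stations, 1000 rows per request, five requests per second). The function names are hypothetical.

```python
def requests_needed(total_rows, rows_per_request=1000):
    """Number of requests needed to page through total_rows rows."""
    return (total_rows + rows_per_request - 1) // rows_per_request


def min_seconds(n_requests, requests_per_second=5):
    """Lower bound on elapsed time under the per-second rate limit."""
    return n_requests / requests_per_second


n = requests_needed(140_000)
print(n, min_seconds(n))  # 140 28.0
```

This is only a lower bound; network latency per request adds to it, which is why narrowing the search with a locationid helps.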
Class Object: FindDataIdNCEI
Method | Description | Return | Note |
---|---|---|---|
get_matched_datasets() | Returns datasets containing datatypes that match the keyword | list[dict] | dict['id'] can be used as datasetid |
get_matched_datatypes() | Returns datatypes that match the keyword | list[dict] | dict['id'] can be used as the datatypeid parameter |
get_id_pairs() | Returns pairs of datasetid-[datatypeids]: the list of datatypeids that match the keyword, paired with the dataset in which these datatypes exist | dict[str, list] | Contains datasetid and datatypeid |
Class Object: FindLocationInfoNCEI
Method | Description | Return | Note |
---|---|---|---|
get_location_info() | Returns info for the location whose name equals the keyword | dict | dict['id'] can be used as locationid |
Class Object: FindStationInfoNCEI
Method | Description | Return | Note |
---|---|---|---|
get_station_info() | Returns the list of stations that contain the keyword | list[dict] | dict['id'] can be used as stationid. Passing locationid in the filter parameter is highly recommended |
Class Object: FindNearestStationNCEI
Method | Description | Return | Note |
---|---|---|---|
get_nearest_station() | Returns the nearest stations to the target coordinate | list[dict] | dict['id'] can be used as stationid. Passing locationid in the filter parameter is highly recommended |
show_location() | Shows the location of the nearest stations on a cartographic chart | plotly object | Each point also shows its distance to the target coordinate |
Class Object: GetNCEI
Method | Description | Return | Note |
---|---|---|---|
get_datasets() | Returns available datasets based on the filter parameter | list[dict] | Uses the datasets endpoint |
get_datacategories() | Returns available datacategories based on the filter parameter | list[dict] | Uses the datacategories endpoint |
get_datatypes() | Returns available datatypes based on the filter parameter | list[dict] | Uses the datatypes endpoint |
get_locationcategories() | Returns available locationcategories based on the filter parameter | list[dict] | Uses the locationcategories endpoint |
get_locations() | Returns available locations based on the filter parameter | list[dict] | Uses the locations endpoint |
get_stations() | Returns available stations based on the filter parameter | list[dict] | Uses the stations endpoint |
get_data() | Fetches the data recorded by stations, narrowed by the filter parameter | list[dict] | Uses the data endpoint |
Note: `list[dict]` has the form [{field1: value1, field2: value2, ...}, ..., {field1: value1, ...}], which can be handled by various processors, such as pandas.DataFrame().
How each class object and method works is shown in the flowchart below (Red: Class Object, Blue: Methods).
To see an example of how the GetNCEI module can be used, read this section.
Suppose we want to get the daily summary of temperature data, with the specification below:
We need to request the `data` endpoint of the NCEI website, narrowing the request down to data recorded only by the nearest stations and only for datatypes related to temperature. Objectives:
- Find the `locationid` of Jakarta, so that we can query the nearest station by passing the `locationid` parameter to narrow down our search for stations.
- Find the `stationid` of the nearest stations using FindNearestStationNCEI.get_nearest_station().
- Find the `datasetid` and `datatypeid` related to temperature that are recorded by our `stationid`, which covers the 3 nearest stations.
- Fetch the data by specifying `datasetid`, `startdate`, `enddate`, and a `filter` parameter that contains `stationid` and `datatypeid`.

`locationid`
First, we want to narrow down the list of stations available in Jakarta, Indonesia. For that we need the `locationid` of Jakarta (as a city) or Indonesia (as a country). Both will work, but it is best to try the most specific target first.
Find the location info of Jakarta City
To run `FindLocationInfoNCEI()`, we need to know the `locationcategoryid` for CITY. To find it, we can use `GetNCEI.get_locationcategories()`.
import getncei
token = '' #insert token
locationcategories = getncei.GetNCEI(token).get_locationcategories()
locationcategories[0]
{'name': 'City', 'id': 'CITY'}
Look at the `id` key. Based on the result above, to search for a city we should use `locationcategoryid = CITY`.
Next, we want to get the location info of Jakarta.
location = getncei.FindLocationInfoNCEI(token, 'Jakarta', 'CITY').get_location_info()
location
{'mindate': '1973-01-12', 'maxdate': '2022-06-26', 'name': 'Jakarta, ID', 'datacoverage': 1, 'id': 'CITY:ID000008'}
Look at the `id` key. Based on the result above, our `locationid='CITY:ID000008'`.
`stationid`
Next, we want to find our `stationid`.
First, we can list all stations available in Jakarta using `GetNCEI.get_stations()`, passing `locationid='CITY:ID000008'` as our filter, as follows:
filter = dict(
    locationid='CITY:ID000008'
)
jakarta_stations = getncei.GetNCEI(token).get_stations(filter)
Since it returns a list of dictionaries, we access the first record to see the returned fields.
jakarta_stations[0].keys()
dict_keys(['elevation', 'mindate', 'maxdate', 'latitude', 'name', 'datacoverage', 'id', 'elevationUnit', 'longitude'])
Let's inspect the `name` and `id` fields to make a clearer list of available stations:
{station['id']: station['name'] for station in jakarta_stations}
{'GHCND:ID000096745': 'JAKARTA OBSERVATORY, ID', 'GHCND:IDM00096739': 'BUDIARTO, ID', 'GHCND:IDM00096741': 'JAKARTA TANJUNG PRIOK', 'GHCND:IDM00096749': 'SOEKARNO HATTA INTERNATIONAL, ID', 'GHCND:IDM00096753': 'BOGOR DERMAGA, ID'}
Although we already have the `stationid` values, we can alternatively query the 3 nearest stations as follows:
filter = dict(
    locationid='CITY:ID000008'
)
coord = (-6.12, 106.75)
nearest_stations_jakarta = \
    getncei.FindNearestStationNCEI(token, coord, filter, 3)
{station['id']: station['name'] for station in nearest_stations_jakarta.get_nearest_station()}
{'GHCND:IDM00096749': 'SOEKARNO HATTA INTERNATIONAL, ID', 'GHCND:ID000096745': 'JAKARTA OBSERVATORY, ID', 'GHCND:IDM00096741': 'JAKARTA TANJUNG PRIOK'}
Looks like we already got our station of interest. Furthermore, we can see the location of our stations as below and hover the marker to reveal additional information:
nearest_stations_jakarta.show_location()
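How the nearest stations are ranked internally is not shown here. One common way to do it, sketched below as an assumption rather than as GetNCEI's actual method, is to sort stations by great-circle (haversine) distance using the `latitude` and `longitude` fields each station record carries. The function names and the sample coordinates are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(coord_a, coord_b):
    """Great-circle distance in km between two (lat, lon) points in decimal degrees."""
    lat1, lon1 = map(radians, coord_a)
    lat2, lon2 = map(radians, coord_b)
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_stations(stations, coord, n=1):
    """Return the n stations closest to coord, nearest first."""
    return sorted(
        stations,
        key=lambda s: haversine_km(coord, (s["latitude"], s["longitude"])),
    )[:n]


# Hypothetical station records; the coordinates are illustrative only.
stations = [
    {"id": "A", "latitude": -6.10, "longitude": 106.70},
    {"id": "B", "latitude": -6.50, "longitude": 106.80},
    {"id": "C", "latitude": -6.15, "longitude": 106.85},
]
print([s["id"] for s in nearest_stations(stations, (-6.12, 106.75), n=2)])
```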
To conclude, our `stationid` for the filter is: ['GHCND:IDM00096749', 'GHCND:ID000096745', 'GHCND:IDM00096741']
`datasetid` and `datatypeid`
Next, we want to know which temperature data is supported by our stations' records. To do this, we pass 'temperature' as the keyword and our `stationid` as a filter to the `FindDataIdNCEI()` object.
station_id_list = ['GHCND:IDM00096749', 'GHCND:ID000096745', 'GHCND:IDM00096741']
filter = dict(
    stationid=station_id_list
)
dataid = getncei.FindDataIdNCEI(token, 'temperature', filter)
Let's look at the datasetid-datatypeids pairs that match temperature:
dataid.get_id_pairs()
{'GHCND': ['TAVG', 'TMAX', 'TMIN'], 'GSOM': ['DT00', 'DT32', 'DX32', 'DX70', 'DX90', 'DYNT', 'DYXT', 'EMNT', 'EMXT', 'TAVG', 'TMAX', 'TMIN']}
The available datasetids are 'GHCND' and 'GSOM'. To see their descriptions:
dataid.get_matched_datasets()
[{'uid': 'gov.noaa.ncdc:C00861', 'mindate': '1763-01-01', 'maxdate': '2022-06-28', 'name': 'Daily Summaries', 'datacoverage': 1, 'id': 'GHCND'}, {'uid': 'gov.noaa.ncdc:C00946', 'mindate': '1763-01-01', 'maxdate': '2022-06-01', 'name': 'Global Summary of the Month', 'datacoverage': 1, 'id': 'GSOM'}]
It looks like 'GHCND' is the daily data we want. We also see that the available data runs through June 2022. We can dig further into the datatypes in 'GHCND' and the time range of the available data. The relevant datatypes are the first 3 of the matched datatypes.
dataid.get_matched_datatypes()[0:3]
[{'mindate': '1874-10-13', 'maxdate': '2022-06-28', 'name': 'Average Temperature.', 'datacoverage': 1, 'id': 'TAVG'}, {'mindate': '1763-01-01', 'maxdate': '2022-06-28', 'name': 'Maximum temperature', 'datacoverage': 1, 'id': 'TMAX'}, {'mindate': '1763-01-01', 'maxdate': '2022-06-28', 'name': 'Minimum temperature', 'datacoverage': 1, 'id': 'TMIN'}]
From the fields above, we can confirm that the 2021 data should be available for our datatypes. To summarize, we add the specification below to our request:
datasetid = 'GHCND'
datatypeid = ['TAVG', 'TMAX', 'TMIN']
startdate = '2021-01-01'
enddate = '2021-12-31'
Before actually fetching the raw data using `GetNCEI.get_data()`, we should confirm our parameters. From the previous steps, we can settle on the parameters below.
For the primary parameters:
datasetid = 'GHCND'
startdate = '2021-01-01'
enddate = '2021-12-31'
req_size = 'all'
For the optional `filter` parameters:
stationid = ['GHCND:IDM00096749', 'GHCND:ID000096745', 'GHCND:IDM00096741']
datatypeid = ['TAVG', 'TMAX', 'TMIN']
We are now ready to retrieve the data we want.
Now we fetch our data using the `data` endpoint:
datasetid = 'GHCND'
startdate = '2021-01-01'
enddate = '2021-12-31'
filter = dict(
    datatypeid=['TAVG', 'TMAX', 'TMIN'],
    stationid=['GHCND:IDM00096749', 'GHCND:ID000096745', 'GHCND:IDM00096741']
)
temperature_data = \
    getncei.GetNCEI(token).get_data(
        datasetid=datasetid,
        startdate=startdate,
        enddate=enddate,
        req_size='all',
        filter=filter
    )
temperature_data[0:3]
[{'date': '2021-01-01T00:00:00', 'datatype': 'TAVG', 'station': 'GHCND:ID000096745', 'attributes': 'H,,S,', 'value': 271}, {'date': '2021-01-01T00:00:00', 'datatype': 'TMAX', 'station': 'GHCND:ID000096745', 'attributes': ',,S,', 'value': 300}, {'date': '2021-01-01T00:00:00', 'datatype': 'TMIN', 'station': 'GHCND:ID000096745', 'attributes': ',,S,', 'value': 250}]
We have received the narrowed temperature data, which should match our specification (this still needs to be checked). The data fields are `date`, `datatype`, `station`, `attributes`, and `value`. Detailed information about each field can be found in the NCEI documentation: https://www1.ncdc.noaa.gov/pub/data/cdo/documentation/.
Please check the values carefully, as they may need further interpretation. In the example above, the temperature value reaches 271, which would be abnormally high for a daily temperature in degrees, so we need additional information about the data to interpret it. A likely explanation is that GHCND stores temperatures in tenths of degrees Celsius (271 would then be 27.1 °C), but this should be confirmed in the documentation.
In the next section, we process the data using a pandas DataFrame.
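If the values are indeed in tenths of degrees Celsius, as the GHCN-Daily documentation describes for TAVG, TMAX, and TMIN, the conversion is a division by ten. The sketch below assumes that unit and should be checked against the NCEI documentation before use; `to_celsius` is a hypothetical helper, not part of GetNCEI.

```python
def to_celsius(records):
    """Return copies of the records with 'value' converted from tenths of degC to degC."""
    return [{**rec, "value": rec["value"] / 10} for rec in records]


# Two records shaped like the output above (attributes omitted for brevity)
sample = [
    {"date": "2021-01-01T00:00:00", "datatype": "TAVG", "station": "GHCND:ID000096745", "value": 271},
    {"date": "2021-01-01T00:00:00", "datatype": "TMAX", "station": "GHCND:ID000096745", "value": 300},
]
print([rec["value"] for rec in to_celsius(sample)])  # [27.1, 30.0]
```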
`pandas.DataFrame`
Fortunately, the returned data consists of dictionaries with the data fields as keys and their records as values.
We can process this kind of data using `pandas.DataFrame`.
import pandas as pd
temperature_df = pd.DataFrame(temperature_data)
temperature_df.head(5)
 | date | datatype | station | attributes | value |
---|---|---|---|---|---|
0 | 2021-01-01T00:00:00 | TAVG | GHCND:ID000096745 | H,,S, | 271 |
1 | 2021-01-01T00:00:00 | TMAX | GHCND:ID000096745 | ,,S, | 300 |
2 | 2021-01-01T00:00:00 | TMIN | GHCND:ID000096745 | ,,S, | 250 |
3 | 2021-01-01T00:00:00 | TAVG | GHCND:IDM00096741 | H,,S, | 272 |
4 | 2021-01-01T00:00:00 | TMAX | GHCND:IDM00096741 | ,,S, | 298 |
Let's see the unique values of the `datatype` and `station` columns:
for column in temperature_df.columns[1:3]:
    print(f'unique "{column}": ',
          temperature_df[column].unique())
unique "datatype": ['TAVG' 'TMAX' 'TMIN']
unique "station": ['GHCND:ID000096745' 'GHCND:IDM00096741' 'GHCND:IDM00096749']
The returned datatypes and stations match our filter.
Let's look at our `date` column:
df = temperature_df
for station in df.station.unique():
    # counts all rows for the station (every datatype), not distinct dates
    date_count = len(df[df.station == station])
    print(f'{station}. Unique "date" count = ', date_count)
GHCND:ID000096745. Unique "date" count = 989
GHCND:IDM00096741. Unique "date" count = 980
GHCND:IDM00096749. Unique "date" count = 846
Note that for daily records, one year of data for 3 datatypes (TAVG, TMAX, TMIN) should contain roughly 1,095 records (3 × 365) per station. Every station returned fewer than that, so the data should be checked for gaps.
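To see where the missing records are, the counts can be broken down per station and datatype and compared with the 365 days expected for 2021. This is a minimal sketch using only the standard library on the `list[dict]` records; `coverage` is a hypothetical helper.

```python
from collections import Counter


def coverage(records, expected_days=365):
    """Map (station, datatype) to the fraction of expected daily records present."""
    counts = Counter((rec["station"], rec["datatype"]) for rec in records)
    return {key: n / expected_days for key, n in counts.items()}
```

Calling `coverage(temperature_data)` would give, for each station and datatype, a value of 1.0 when every day of the year is present and less than 1.0 when days are missing.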
The sections above show the process of retrieving data from the NCEI website and getting a general idea of the data we received.
For full documentation of each GetNCEI method, proceed to the next section.
FindDataIdNCEI(token, keyword, [filter])
Finds the datatypes that contain the keyword and reports which datasets they belong to.
Parameters
----------
token (str):
Token to access web services, obtained from https://www.ncdc.noaa.gov/cdo-web/token.
keyword (str or list[str]):
Specify the keyword to search in various available datatypes.
filter (dict[str, str | list[str]]), optional, default = {}:
Filter the datasets or datatypes that will be retrieved using a Dict, which the KEYS are the 'Additional Parameters' for the API request. Accepted {KEYS:VALUES} pairs are as explained below:
KEYS:
'locationid':
VALUE (str or list[str]) -> Accepts a valid locationid or a list of locationids. Matched datatypes that returned will be available for location(s) specified. Example: {'locationid': ['FIPS:37', 'CITY:ID000008'], ...}.
'stationid':
VALUE (str or list[str]) -> Accepts a valid stationid or a list of stationids. Matched datatypes that returned will be available for the station(s) specified. Example: {'stationid': ['GHCND:ID000096745', 'GHCND:IDM00096739'], ...}.
'startdate':
VALUE (str) -> Accepts a valid ISO formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Matched datasets returned will have data after the specified date. Parameter can be used independently of 'enddate'. Example: {'startdate': '1970-10-03', ...}.
'enddate':
VALUE (str) -> Accepts a valid ISO formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Matched datasets returned will have data before the specified date. Parameter can be used independently of 'startdate'. Example: {'enddate': '2012-09-10', ...}.
Example:
filter = {
    'stationid': 'GHCND:ID000096745'
}
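Because 'startdate' and 'enddate' must be ISO formatted, a filter can be checked locally before any request is sent. The sketch below is an illustration only: `check_date_range` is a hypothetical helper, not a GetNCEI function, and it relies on `datetime.fromisoformat` from the standard library.

```python
from datetime import datetime


def check_date_range(filter_params):
    """Parse 'startdate'/'enddate'; raise ValueError if malformed or out of order."""
    parsed = {}
    for key in ("startdate", "enddate"):
        if key in filter_params:
            # fromisoformat accepts both YYYY-MM-DD and YYYY-MM-DDThh:mm:ss
            parsed[key] = datetime.fromisoformat(filter_params[key])
    if len(parsed) == 2 and parsed["startdate"] > parsed["enddate"]:
        raise ValueError("startdate must not be later than enddate")
    return parsed


print(check_date_range({"startdate": "1970-10-03", "enddate": "2012-09-10"}))
```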
FindDataIdNCEI.get_matched_datasets()
Returns datasets that have any datatypes containing the specified keyword.
Parameter
---------
None
Returns
-------
list[dict]:
List of datasets that have any datatypes matching the specified keyword.
FindDataIdNCEI.get_matched_datatypes()
Returns datatypes that contain the specified keyword.
Parameters
----------
None
Returns
-------
list[dict]:
List of datatypes that contain the specified keyword.
FindDataIdNCEI.get_id_pairs()
Returns pairs of datasets and their matched datatypes.
Parameters
----------
None
Returns
-------
dict[str: list[str]]:
Dictionary of {matched_datasetid1: [matched_datatypeids], matched_datasetid2: [matched_datatypeids], ...}. Can be used as datasetid and datatypeid for get_data() method or other get_* method().
FindLocationInfoNCEI(token, target, locationcategoryid, [filter])
Finds a location with available data by searching for the target keyword. Matched locations can be narrowed with the filter parameter to verify that the location supports the specified features.
Parameters
----------
token (str):
Token to access web services, obtained from https://www.ncdc.noaa.gov/cdo-web/token.
target (str):
Specify the keyword to search in various available locations. Example: 'New York'.
locationcategoryid (str):
As a category which describes the scope of target keyword. Example: 'CITY', as a suited value if 'New York' was specified in target parameter.
filter (dict[str, str | list[str]]), optional, default = {}:
Filter the locations that will be retrieved using a Dict, whose KEYS are the 'Additional Parameters' for the API request. Accepted {KEYS:VALUES} pairs are as explained below:
KEYS:
'datasetid':
VALUE (str or list[str]) -> Accepts a valid datasetid or a list of datasetids. Locations returned will match with keyword and will be supported by dataset(s) specified. Example: {'datasetid': 'GHCND', ...}.
'datacategoryid':
VALUE (str or list[str]) -> Accepts a valid datacategoryid or a list of datacategoryids. Locations returned will match with keyword and will be associated with the data category(ies) specified. Example: {'datacategoryid': 'TEMP', ...}.
'startdate':
VALUE (str) -> Accepts a valid ISO formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Locations returned will match the keyword and will have data after the specified date. Parameter can be used independently of 'enddate'. Example: {'startdate': '1970-10-03', ...}.
'enddate':
VALUE (str) -> Accepts a valid ISO formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Locations returned will match the keyword and will have data before the specified date. Parameter can be used independently of 'startdate'. Example: {'enddate': '2012-09-10', ...}.
Example:
filter = {
    'datasetid': 'GHCND',
    'datacategoryid': 'TEMP'
}
FindLocationInfoNCEI.get_location_info()
Gets the location info, as a dict, that matches the target keyword.
Parameters
----------
None
Returns
-------
dict:
Dictionary that contains location info that matched with target keyword.
FindStationInfoNCEI(token, target, [filter])
Finds available stations that contain the specified target keyword.
Parameters
----------
token (str):
Token to access web services, obtained from https://www.ncdc.noaa.gov/cdo-web/token.
target (str):
Specify the keyword to search in various available stations. Example: 'Salt Lake' to find stations that contains this keyword in its description.
filter (dict):
Filter the station data that will be retrieved using a Dict, whose KEYS are the 'Additional Parameters' for the API request. Accepted {KEYS:VALUES} pairs are as explained below:
KEYS:
'datasetid':
VALUE (str or list[str]) -> Accepts a valid datasetid or a list of datasetids. Matched stations returned will be supported by dataset(s) specified. Example: {'datasetid': 'GHCND'}.
'locationid':
VALUE (str or list[str]) -> Accepts a valid locationid or a list of locationids. Matched stations returned will contain data for the location(s) specified. Example: {'locationid': ['FIPS:37', 'CITY:ID000008'], ...}.
'datacategoryid':
VALUE (str or list[str]) -> Accepts a valid datacategoryid or a list of datacategoryids. Matched stations returned will be associated with the data category(ies) specified. Example: {'datacategoryid': 'TEMP'}.
'datatypeid':
VALUE (str or list[str]) -> Accepts a valid datatypeid or a list of datatypeids. Matched stations returned will contain all of the available data type(s) specified. Example: {'datatypeid': ['TAVG', 'TMAX', 'TMIN'], ...}.
'extent':
VALUE (str) -> The desired geographical extent for search. Designed to take a parameter generated by Google Maps API V3 LatLngBounds.toUrlValue. Stations returned must be located within the extent specified. Example: {'extent': '47.5204,-122.2047,47.6139,-122.1065', ...}
'startdate':
VALUE (str) -> Accepts a valid ISO formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Matched stations returned will have data after the specified date. Parameter can be used independently of 'enddate'. Example: {'startdate': '1970-10-03', ...}.
'enddate':
VALUE (str) -> Accepts a valid ISO formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Matched stations returned will have data before the specified date. Parameter can be used independently of 'startdate'. Example: {'enddate': '2012-09-10', ...}.
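The 'extent' value follows the `LatLngBounds.toUrlValue` layout, which is the south-west corner followed by the north-east corner: `lat_lo,lng_lo,lat_hi,lng_hi`. A small sketch for building such a string from a bounding box is shown below; `make_extent` is a hypothetical helper, not part of GetNCEI.

```python
def make_extent(south, west, north, east, precision=4):
    """Format a bounding box as 'lat_lo,lng_lo,lat_hi,lng_hi'."""
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
    return ",".join(f"{v:.{precision}f}" for v in (south, west, north, east))


# Reproduces the example extent from the parameter description
print(make_extent(47.5204, -122.2047, 47.6139, -122.1065))
```

The result can be passed as `{'extent': make_extent(...)}` in the filter.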
FindStationInfoNCEI.get_station_info()
Returns all stations that contain target keyword in its description.
Parameters
----------
None
Returns
-------
list[dict]:
List of dictionaries of matched stations.
FindNearestStationNCEI(token, coord, [filter, station_nos=1])
Finds the stations nearest to the specified coordinate.
Parameters
----------
token (str):
Token to access web services, obtained from https://www.ncdc.noaa.gov/cdo-web/token.
coord (tuple):
Tuple of (lat, long) coordinates in decimal degrees. Latitude: northern hemisphere values > 0, southern hemisphere values < 0. Longitude: western hemisphere values < 0, eastern hemisphere values > 0.
filter (dict), optional, default = {}:
Filter the station data that will be retrieved using a Dict, which the KEYS are the 'Additional Parameters' for the API request. Accepted {KEYS:VALUES} pairs are as explained below:
KEYS:
'datasetid':
VALUE (str or list[str]) -> Accepts a valid datasetid or a list of datasetids. Nearest stations returned will be supported by dataset(s) specified. Example: {'datasetid': 'GHCND'}.
'locationid':
VALUE (str or list[str]) -> Accepts a valid locationid or a list of locationids. Nearest stations returned will contain data for the location(s) specified. Example: {'locationid': ['FIPS:37', 'CITY:ID000008'], ...}.
'datacategoryid':
VALUE (str or list[str]) -> Accepts a valid datacategoryid or a list of datacategoryids. Nearest stations returned will be associated with the data category(ies) specified. Example: {'datacategoryid': 'TEMP'}.
'datatypeid':
VALUE (str or list[str]) -> Accepts a valid datatypeid or a list of datatypeids. Nearest stations returned will contain all of the available data type(s) specified. Example: {'datatypeid': ['TAVG', 'TMAX', 'TMIN'], ...}.
'startdate':
VALUE (str) -> Accepts a valid ISO formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Nearest stations returned will have data after the specified date. Parameter can be used independently of 'enddate'. Example: {'startdate': '1970-10-03', ...}.
'enddate':
VALUE (str) -> Accepts a valid ISO formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Nearest stations returned will have data before the specified date. Parameter can be used independently of 'startdate'. Example: {'enddate': '2012-09-10', ...}.
station_nos (int), optional, default = 1:
Number of nearest stations to return.
FindNearestStationNCEI.get_nearest_station()
Returns info for the station(s) located nearest to the specified target coordinate.
Parameters
----------
None
Returns
-------
list[dict]:
Info fields of the nearest station(s), each stored as a dictionary.
FindNearestStationNCEI.show_location()
Plotting nearest station and target coordinate on a cartographic chart.
Parameters
----------
None
Returns
-------
Plotly.Figure object:
Plot of the stations located nearest to the target coordinate.
GetNCEI(token)
Gets the data by requesting several endpoints of the NCEI API URL.
Parameters
----------
token (str):
Token to access web services, obtained from https://www.ncdc.noaa.gov/cdo-web/token.
GetNCEI.get_datasets([filter, req_size])
Get the available datasets (using API request endpoint:'datasets'). All of the CDO data are in datasets. The containing dataset must be known before attempting to access its data.
Criteria of the datasets available is specified by the filter parameter, and number of maximum rows returned is specified by req_size parameter.
Parameters
----------
filter (dict[str, str | list[str]]), optional, default = {}:
Filter the data sets that will be retrieved using a Dict, which the KEYS are the 'Additional Parameters' for the API request. Accepted {KEYS:VALUES} pairs are as explained below:
KEYS:
'datatypeid':
VALUE (str or list[str]) -> Accepts a valid datatypeid or a list of datatypeids. Datasets returned will contain all of the data type(s) specified. Example: 'ACMH'.
'locationid':
VALUE (str or list[str]) -> Accepts a valid locationid or a list of locationids. Datasets returned will contain data for the location(s) specified. Example: {'locationid': ['FIPS:37', 'CITY:ID000008'], ...}.
'stationid':
VALUE (str or list[str]) -> Accepts a valid stationid or a list of stationids. Datasets returned will contain data for the station(s) specified. Example: {'stationid': 'GHCND:ID000096745', ...}.
'startdate':
VALUE (str) -> Accepts a valid ISO formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Datasets returned will have data after the specified date. Parameter can be used independently of 'enddate'. Example: {'startdate': '1970-10-03', ...}.
'enddate':
VALUE (str) -> Accepts a valid ISO formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Datasets returned will have data before the specified date. Parameter can be used independently of 'startdate'. Example: {'enddate': '2012-09-10', ...}.
'sortfield':
VALUE (str = one from any of 'id', 'name', 'mindate', 'maxdate', 'datacoverage') -> Sort the results by the specified field. Example: {'sortfield': 'name', ...}.
'sortorder'
VALUE (str = 'asc' or 'desc') -> Specifies whether sort is ascending or descending. Defaults to 'asc'. Example: {'sortorder': 'desc', ...}.
Example:
filter = {
    'stationid': 'GHCND:ID000096745'
}
req_size (int), Optional, default = None:
Determining maximum row size of the data that will be retrieved. If not specified, all of the available datasets will be retrieved.
Returns
-------
list[dict]
A list of dictionaries that contain dataset data, each with fields of the form {'field1': 'values1', 'field2': 'values2', ...}. The value of the 'id' field can be used as 'datasetid' in a filter for fetching data with the get_data() method or other get_* methods.
Raises
------
InputTypeError
If the input type of any parameter is not valid.
InputValueError
If the input value of req_size and filter keys are not valid.
JSONDecodeError
If there was an error with requesting API.
GetNCEI.get_datacategories([filter, req_size])
Get the available datacategories (using API request endpoint:'datacategories'). Data Categories represent groupings of data types.
Criteria of the datacategories available is specified by the filter parameter, and number of maximum rows returned is specified by req_size parameter.
Parameters
----------
filter (dict[str, str | list[str]]), optional, default = {}:
Filter the data categories that will be retrieved using a Dict, which the KEYS are the 'Additional Parameters' for the API request. Accepted {KEYS:VALUES} pairs are as explained below:
KEYS:
'datasetid':
VALUE (str or list[str]) -> Accepts a valid datasetid or a list of datasetids. Data categories returned will be supported by dataset(s) specified. Example: 'GHCND'.
'locationid':
VALUE (str or list[str]) -> Accepts a valid locationid or a list of locationids. Data categories returned will be applicable for the location(s) specified. Example: {'locationid': ['FIPS:37', 'CITY:ID000008'], ...}.
'stationid':
VALUE (str or list[str]) -> Accepts a valid stationid or a list of stationids. Data categories returned will be applicable for the station(s) specified. Example: {'stationid': 'GHCND:ID000096745', ...}.
'startdate':
VALUE (str) -> Accepts a valid ISO formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Data categories returned will have data after the specified date. Parameter can be used independently of 'enddate'. Example: {'startdate': '1970-10-03', ...}.
'enddate':
VALUE (str) -> Accepts a valid ISO formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Data categories returned will have data before the specified date. Parameter can be used independently of 'startdate'. Example: {'enddate': '2012-09-10', ...}.
'sortfield':
VALUE (str = one from any of 'id', 'name', 'mindate', 'maxdate', 'datacoverage') -> Sort the results by the specified field. Example: {'sortfield': 'name', ...}.
'sortorder'
VALUE (str = 'asc' or 'desc') -> Specifies whether sort is ascending or descending. Defaults to 'asc'. Example: {'sortorder': 'desc', ...}.
Example:
filter = {
    'datasetid': 'GHCND',
    'stationid': 'GHCND:ID000096745'
}
req_size (int), Optional, default = None:
Determining maximum row size of the data that will be retrieved. If not specified, all of the available datacategories data will be retrieved.
Returns
-------
list[dict]
A list of dictionaries that contain datacategory data, each with fields of the form {'field1': 'values1', 'field2': 'values2', ...}. The value of the 'id' field can be used as 'datacategoryid' in a filter for fetching data with the get_data() method or other get_* methods.
Raises
------
InputTypeError
If the type of any parameter is not valid.
InputValueError
If the value of req_size or of any filter key is not valid.
JSONDecodeError
If there was an error requesting the API.
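The accepted filter keys listed above can be checked before a request is sent. The following is a minimal sketch of such a pre-flight check, not GetNCEI's own validation: `check_filter` is a hypothetical helper, the key sets are copied from this section, and it raises Python's built-in `TypeError` and `ValueError` rather than the module's `InputTypeError` and `InputValueError`.

```python
# Keys documented above for get_datacategories. How GetNCEI validates them
# internally is not shown in this document, so this is only an illustration.
ALLOWED_KEYS = {
    "datasetid", "locationid", "stationid",
    "startdate", "enddate", "sortfield", "sortorder",
}
LIST_KEYS = {"datasetid", "locationid", "stationid"}  # accept str or list[str]
SORT_FIELDS = {"id", "name", "mindate", "maxdate", "datacoverage"}


def check_filter(filter_dict):
    """Hypothetical pre-flight check; raises built-in errors, not GetNCEI's."""
    if not isinstance(filter_dict, dict):
        raise TypeError("filter must be a dict")
    for key, value in filter_dict.items():
        if key not in ALLOWED_KEYS:
            raise ValueError(f"unsupported filter key: {key!r}")
        if isinstance(value, list):
            if key not in LIST_KEYS:
                raise TypeError(f"{key!r} does not accept a list")
            if not all(isinstance(item, str) for item in value):
                raise TypeError(f"{key!r} list must contain only str")
        elif not isinstance(value, str):
            raise TypeError(f"{key!r} must be str or list[str]")
    if filter_dict.get("sortfield", "id") not in SORT_FIELDS:
        raise ValueError("invalid sortfield")
    if filter_dict.get("sortorder", "asc") not in {"asc", "desc"}:
        raise ValueError("invalid sortorder")
    return filter_dict


# The example filter from this section passes the check unchanged.
checked = check_filter({"datasetid": "GHCND", "stationid": "GHCND:ID000096745"})
```

A misspelled key such as 'station' would raise `ValueError` here, before any of the daily request quota is spent.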
GetNCEI.get_datatypes([filter, req_size])
Get the available datatypes (using API request endpoint: 'datatypes'). A data type describes the type of data and acts as a label. If it is 64°F outside right now, then the data type is Air Temperature and the data is 64.
The filter parameter sets the criteria for the datatypes returned, and the req_size parameter sets the maximum number of rows returned.
Parameters
----------
filter (dict[str, str | list[str]]), optional, default = {}:
Filters the data types to retrieve, using a dict whose KEYS are the 'Additional Parameters' of the API request. The accepted {KEY: VALUE} pairs are explained below:
KEYS:
'datasetid':
VALUE (str or list[str]) -> Accepts a valid datasetid or a list of datasetids. Data types returned will be supported by dataset(s) specified. Example: 'GHCND'.
'locationid':
VALUE (str or list[str]) -> Accepts a valid locationid or a list of locationids. Data types returned will be applicable for the location(s) specified. Example: {'locationid': ['FIPS:37', 'CITY:ID000008'], ...}.
'stationid':
VALUE (str or list[str]) -> Accepts a valid stationid or a list of stationids. Data types returned will be applicable for the station(s) specified. Example: {'stationid': 'GHCND:ID000096745', ...}.
'datacategoryid':
VALUE (str or list[str]) -> Accepts a valid datacategoryid or a list of datacategoryids. Data types returned will be associated with the data category(ies) specified. Example: {'datacategoryid': 'TEMP'}.
'startdate':
VALUE (str) -> Accepts a valid ISO-formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Data types returned will have data after the specified date. This parameter can be used independently of 'enddate'. Example: {'startdate': '1970-10-03', ...}.
'enddate':
VALUE (str) -> Accepts a valid ISO-formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Data types returned will have data before the specified date. This parameter can be used independently of 'startdate'. Example: {'enddate': '2012-09-10', ...}.
'sortfield':
VALUE (str = one from any of 'id', 'name', 'mindate', 'maxdate', 'datacoverage') -> Sort the results by the specified field. Example: {'sortfield': 'name', ...}.
'sortorder':
VALUE (str = 'asc' or 'desc') -> Specifies whether sort is ascending or descending. Defaults to 'asc'. Example: {'sortorder': 'desc', ...}.
Example:
filter = {
'datasetid': 'GHCND',
'datacategoryid': 'TEMP',
'stationid': 'GHCND:ID000096745'
}
req_size (int), Optional, default = None:
Maximum number of rows to retrieve. If not specified, all of the available datatypes data will be retrieved.
Returns
-------
list[dict]
A list of dictionaries containing datatypes data, each with fields of the form {'field1': 'value1', 'field2': 'value2', ...}. The value of the 'id' field can be used as 'datatypeid' in a filter when fetching data with get_data() or another get_* method.
Raises
------
InputTypeError
If the type of any parameter is not valid.
InputValueError
If the value of req_size or of any filter key is not valid.
JSONDecodeError
If there was an error requesting the API.
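Since the 'id' of each returned record can be reused as a 'datatypeid' filter value, one call's result can feed the next call's filter. The sketch below shows that hand-off on a stand-in list; `ids_from` is a hypothetical helper and the sample records are made up for illustration, not real API output.

```python
def ids_from(records):
    """Collect the 'id' value of every record that has one."""
    return [record["id"] for record in records if "id" in record]


# Illustrative stand-in for the list[dict] that get_datatypes() returns.
# The 'name' values are invented for this example.
sample_datatypes = [
    {"id": "TMAX", "name": "example maximum temperature"},
    {"id": "TMIN", "name": "example minimum temperature"},
]

# Filter that could then be passed to get_stations() or get_data().
next_filter = {"datasetid": "GHCND", "datatypeid": ids_from(sample_datatypes)}
```

The same pattern would apply to the other get_* results, with the 'id' values used as 'datasetid', 'locationid', 'stationid' and so on.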
GetNCEI.get_locationcategories([filter, req_size])
Get the available locationcategories (using API request endpoint:'locationcategories'). Location categories are groupings of locations under an applicable label.
The filter parameter sets the criteria for the locationcategories returned, and the req_size parameter sets the maximum number of rows returned.
Parameters
----------
filter (dict[str, str | list[str]]), optional, default = {}:
Filters the location categories to retrieve, using a dict whose KEYS are the 'Additional Parameters' of the API request. The accepted {KEY: VALUE} pairs are explained below:
KEYS:
'datasetid':
VALUE (str or list[str]) -> Accepts a valid datasetid or a list of datasetids. Location categories returned will be supported by dataset(s) specified. Example: 'GHCND'.
'startdate':
VALUE (str) -> Accepts a valid ISO-formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Location categories returned will have data after the specified date. Example: {'startdate': '1970-10-03', ...}.
'enddate':
VALUE (str) -> Accepts a valid ISO-formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Location categories returned will have data before the specified date. This parameter can be used independently of 'startdate'. Example: {'enddate': '2012-09-10', ...}.
'sortfield':
VALUE (str = one from any of 'id', 'name', 'mindate', 'maxdate', 'datacoverage') -> Sort the results by the specified field. Example: {'sortfield': 'name', ...}.
'sortorder':
VALUE (str = 'asc' or 'desc') -> Specifies whether sort is ascending or descending. Defaults to 'asc'. Example: {'sortorder': 'desc', ...}.
Example:
filter = {
'datasetid': 'GHCND',
'startdate': '1970-10-03',
'sortfield': 'name'
}
req_size (int), Optional, default = None:
Maximum number of rows to retrieve. If not specified, all of the available locationcategories data will be retrieved.
Returns
-------
list[dict]
A list of dictionaries containing locationcategories data, each with fields of the form {'field1': 'value1', 'field2': 'value2', ...}. The value of the 'id' field can be used as 'locationcategoryid' in a filter when fetching data with get_data() or another get_* method.
Raises
------
InputTypeError
If the type of any parameter is not valid.
InputValueError
If the value of req_size or of any filter key is not valid.
JSONDecodeError
If there was an error requesting the API.
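The 'startdate' and 'enddate' values must match one of the two ISO layouts given above. The following sketch checks a value against both layouts with the standard library. `parse_iso` and `check_date_range` are hypothetical helpers, and whether GetNCEI performs the same check is not stated in this document.

```python
from datetime import datetime

# The two layouts named in the parameter descriptions.
ISO_FORMATS = ("%Y-%m-%d", "%Y-%m-%dT%H:%M:%S")


def parse_iso(value):
    """Parse YYYY-MM-DD or YYYY-MM-DDThh:mm:ss, else raise ValueError."""
    for fmt in ISO_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"not an ISO date or date time: {value!r}")


def check_date_range(startdate, enddate):
    """Parse both ends and make sure the range is not reversed."""
    start, end = parse_iso(startdate), parse_iso(enddate)
    if start > end:
        raise ValueError("startdate must not be after enddate")
    return start, end


start, end = check_date_range("1970-10-03", "2012-09-10")
```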
GetNCEI.get_locations([filter, req_size])
Get the available locations (using API request endpoint:'locations'). Locations can be a specific latitude/longitude point such as a station, or a label representing a bounding area such as a city.
The filter parameter sets the criteria for the locations returned, and the req_size parameter sets the maximum number of rows returned.
Parameters
----------
filter (dict[str, str | list[str]]), optional, default = {}:
Filters the locations to retrieve, using a dict whose KEYS are the 'Additional Parameters' of the API request. The accepted {KEY: VALUE} pairs are explained below:
KEYS:
'datasetid':
VALUE (str or list[str]) -> Accepts a valid datasetid or a list of datasetids. Locations returned will be supported by dataset(s) specified. Example: 'GHCND'.
'locationcategoryid':
VALUE (str or list[str]) -> Accepts a valid locationcategoryid or a list of locationcategoryids. Locations returned will be in the location category(ies) specified. Example: {'locationcategoryid': 'CITY', ...}.
'datacategoryid':
VALUE (str or list[str]) -> Accepts a valid datacategoryid or a list of datacategoryids. Locations returned will be associated with the data category(ies) specified. Example: {'datacategoryid': 'TEMP', ...}.
'startdate':
VALUE (str) -> Accepts a valid ISO-formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Locations returned will have data after the specified date. This parameter can be used independently of 'enddate'. Example: {'startdate': '1970-10-03', ...}.
'enddate':
VALUE (str) -> Accepts a valid ISO-formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Locations returned will have data before the specified date. This parameter can be used independently of 'startdate'. Example: {'enddate': '2012-09-10', ...}.
'sortfield':
VALUE (str = one from any of 'id', 'name', 'mindate', 'maxdate', 'datacoverage') -> Sort the results by the specified field. Example: {'sortfield': 'name', ...}.
'sortorder':
VALUE (str = 'asc' or 'desc') -> Specifies whether sort is ascending or descending. Defaults to 'asc'. Example: {'sortorder': 'desc', ...}.
Example:
filter = {
'datasetid': 'GHCND',
'locationcategoryid': 'CITY'
}
req_size (int), Optional, default = None:
Maximum number of rows to retrieve. If not specified, all of the available locations data will be retrieved.
Returns
-------
list[dict]
A list of dictionaries containing locations data, each with fields of the form {'field1': 'value1', 'field2': 'value2', ...}. The value of the 'id' field can be used as 'locationid' in a filter when fetching data with get_data() or another get_* method.
Raises
------
InputTypeError
If the type of any parameter is not valid.
InputValueError
If the value of req_size or of any filter key is not valid.
JSONDecodeError
If there was an error requesting the API.
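The filter keys are described as the 'Additional Parameters' of the API request, so a filter dict ultimately becomes a URL query string. The sketch below shows one plausible way to do that with the standard library, where a list value turns into a repeated parameter. `build_url` is a hypothetical helper, and the repeated-parameter encoding is an assumption about how the module serializes lists.

```python
from urllib.parse import urlencode

# Base URL as given in the introduction of this document.
BASE_URL = "https://www.ncei.noaa.gov/cdo-web/api/v2"


def build_url(endpoint, filter_dict=None):
    """Hypothetical helper: turn an endpoint plus filter dict into a URL."""
    # doseq=True expands a list value into one key=value pair per item.
    query = urlencode(filter_dict or {}, doseq=True)
    return f"{BASE_URL}/{endpoint}" + (f"?{query}" if query else "")


url = build_url("locations", {"datasetid": "GHCND", "locationcategoryid": "CITY"})
multi = build_url("stations", {"datatypeid": ["TMAX", "TMIN"]})
```

Note that `urlencode` percent-encodes characters such as the colon in 'CITY:ID000008', which is the usual behavior for query strings.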
GetNCEI.get_stations([filter, req_size])
Get the available stations (using API request endpoint: 'stations'). Stations are where the data comes from (for most datasets) and can be considered the smallest granule of location data. If the desired station is known, all of its data can be viewed quickly.
The filter parameter sets the criteria for the stations returned, and the req_size parameter sets the maximum number of rows returned.
Parameters
----------
filter (dict[str, str | list[str]]), optional, default = {}:
Filters the stations to retrieve, using a dict whose KEYS are the 'Additional Parameters' of the API request. The accepted {KEY: VALUE} pairs are explained below:
KEYS:
'datasetid':
VALUE (str or list[str]) -> Accepts a valid datasetid or a list of datasetids. Stations returned will be supported by dataset(s) specified. Example: 'GHCND'.
'locationid':
VALUE (str or list[str]) -> Accepts a valid locationid or a list of locationids. Stations returned will contain data for the location(s) specified. Example: {'locationid': ['FIPS:37', 'CITY:ID000008'], ...}.
'datacategoryid':
VALUE (str or list[str]) -> Accepts a valid datacategoryid or a list of datacategoryids. Stations returned will be associated with the data category(ies) specified. Example: {'datacategoryid': 'TEMP'}.
'datatypeid':
VALUE (str or list[str]) -> Accepts a valid datatypeid or a list of datatypeids. Stations returned will contain all of the available data type(s) specified. Example: {'datatypeid': ['TAVG', 'TMAX', 'TMIN'], ...}.
'extent':
VALUE (str) -> The desired geographical extent for search. Designed to take a parameter generated by Google Maps API V3 LatLngBounds.toUrlValue. Stations returned must be located within the extent specified. Example: {'extent': '47.5204,-122.2047,47.6139,-122.1065', ...}
'startdate':
VALUE (str) -> Accepts a valid ISO-formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Stations returned will have data after the specified date. This parameter can be used independently of 'enddate'. Example: {'startdate': '1970-10-03', ...}.
'enddate':
VALUE (str) -> Accepts a valid ISO-formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Stations returned will have data before the specified date. This parameter can be used independently of 'startdate'. Example: {'enddate': '2012-09-10', ...}.
'sortfield':
VALUE (str = one from any of 'id', 'name', 'mindate', 'maxdate', 'datacoverage') -> Sort the results by the specified field. Example: {'sortfield': 'name', ...}.
'sortorder':
VALUE (str = 'asc' or 'desc') -> Specifies whether sort is ascending or descending. Defaults to 'asc'. Example: {'sortorder': 'desc', ...}.
Example:
filter = {
'datasetid': 'GHCND',
'datatypeid': ['TMAX', 'TMIN'],
'locationid': 'CITY:ID000008'
}
req_size (int), Optional, default = None:
Maximum number of rows to retrieve. If not specified, all of the available stations data will be retrieved.
Returns
-------
list[dict]
A list of dictionaries containing stations data, each with fields of the form {'field1': 'value1', 'field2': 'value2', ...}. The value of the 'id' field can be used as 'stationid' in a filter when fetching data with get_data() or another get_* method.
Raises
------
InputTypeError
If the type of any parameter is not valid.
InputValueError
If the value of req_size or of any filter key is not valid.
JSONDecodeError
If there was an error requesting the API.
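The 'extent' value is a single comma-separated string. Judging from the example above, the order is south latitude, west longitude, north latitude, east longitude. This sketch builds such a string from numeric bounds. `make_extent` and `extent_around` are hypothetical helpers, and the corner ordering is inferred from the example rather than confirmed by this document.

```python
def make_extent(south, west, north, east):
    """Format a bounding box as 'south,west,north,east' (assumed order)."""
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must lie within [-180, 180]")
    return f"{south},{west},{north},{east}"


def extent_around(lat, lon, half_size_deg):
    """Square box of +/- half_size_deg degrees around a point."""
    return make_extent(
        round(lat - half_size_deg, 4),
        round(lon - half_size_deg, 4),
        round(lat + half_size_deg, 4),
        round(lon + half_size_deg, 4),
    )


# Reproduces the example string from the 'extent' description.
station_filter = {
    "datasetid": "GHCND",
    "extent": make_extent(47.5204, -122.2047, 47.6139, -122.1065),
}
```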
GetNCEI.get_data(datasetid, startdate, enddate, [req_size=1000, filter])
Fetch data (using API request endpoint: 'data') from a single datasetid.
The filter parameter sets the criteria for the data returned, and the req_size parameter sets the maximum number of rows returned.
Parameters
----------
datasetid (str), required:
The datasetid of the data to retrieve. Data returned will be from the datasetid specified. Example: 'GHCND'.
startdate (str), required:
Accepts a valid ISO-formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Data returned will be after the specified date. Annual and monthly data are limited to a ten-year range; all other data are limited to a one-year range. Example: '1970-10-03'.
enddate (str), required:
Accepts a valid ISO-formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss). Data returned will be before the specified date. Annual and monthly data are limited to a ten-year range; all other data are limited to a one-year range. Example: '2012-09-10'.
req_size (int or str = 'all'), Optional, default = 1000:
Maximum number of rows to retrieve. req_size = 'all' retrieves all of the available data.
filter (dict[str, str | list[str]]), optional, default = {}:
Filters the data to retrieve, using a dict whose KEYS are the 'Additional Parameters' of the API request. The accepted {KEY: VALUE} pairs are explained below:
KEYS:
'datatypeid':
VALUE (str or list[str]) -> Accepts a valid datatypeid or a list of datatypeids. Data returned will contain all of the available data type(s) specified. Example: {'datatypeid': ['TAVG', 'TMAX', 'TMIN'], ...}.
'locationid':
VALUE (str or list[str]) -> Accepts a valid locationid or a list of locationids. Data returned will contain data for the available location(s) specified. Example: {'locationid': ['FIPS:37', 'CITY:ID000008'], ...}.
'stationid':
VALUE (str or list[str]) -> Accepts a valid stationid or a list of stationids. Data returned will contain data for the available station(s) specified. Example: {'stationid': ['GHCND:ID000096745', 'GHCND:IDM00096739'], ...}.
'units':
VALUE (str = 'standard' or 'metric') -> Accepts the literal strings 'standard' or 'metric'. Data will be scaled and converted to the specified units. If no units are provided, no scaling or conversion takes place. Example: {'units': 'standard', ...}.
'sortfield':
VALUE (str = one from any of 'date', 'datatype', 'station', 'attribute', 'value') -> Sort the results by the specified field. Example: {'sortfield': 'value', ...}.
'sortorder':
VALUE (str = 'asc' or 'desc') -> Specifies whether sort is ascending or descending. Defaults to 'asc'. Example: {'sortorder': 'desc', ...}.
Example:
filter = {
'datatypeid': ['TMAX', 'TMIN'],
'stationid': 'GHCND:ID000096745'
}
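When req_size is larger than what one response can carry, the rows have to be collected over several requests. The sketch below only plans those requests as (offset, limit) pairs. `page_plan` is a hypothetical helper, and both the 1000-row cap per request and the zero-based offset are assumptions about the web service; GetNCEI may page differently.

```python
def page_plan(req_size, page_limit=1000):
    """Return (offset, limit) pairs that together cover req_size rows.

    page_limit=1000 is an assumed per-request cap, not taken from this document.
    """
    if isinstance(req_size, bool) or not isinstance(req_size, int):
        raise TypeError("req_size must be an int")
    if req_size <= 0 or page_limit <= 0:
        raise ValueError("req_size and page_limit must be positive")
    plan = []
    offset = 0
    while offset < req_size:
        limit = min(page_limit, req_size - offset)
        plan.append((offset, limit))
        offset += limit
    return plan


plan = page_plan(2500)
```

With the documented quota of five requests per second, a long plan would also need a short pause between requests.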
Returns
-------
list[dict]
A list of dictionaries, each with data fields of the form {'field1': 'value1', 'field2': 'value2', ...}.
Raises
------
InputTypeError
If the type of any parameter is not valid.
InputValueError
If the value of req_size or of any filter key is not valid.
JSONDecodeError
If there was an error requesting the API.
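Because data other than annual and monthly data is limited to a one-year range per request, a longer period has to be split into several startdate/enddate windows. The following sketch does the splitting with windows of at most 365 days. `split_date_range` is a hypothetical helper, and whether GetNCEI already splits ranges internally is not stated in this document.

```python
from datetime import date, timedelta


def split_date_range(startdate, enddate, max_days=365):
    """Split an inclusive ISO date range into windows of at most max_days."""
    start = date.fromisoformat(startdate)
    end = date.fromisoformat(enddate)
    if start > end:
        raise ValueError("startdate must not be after enddate")
    if max_days < 1:
        raise ValueError("max_days must be at least 1")
    step = timedelta(days=max_days - 1)  # window is inclusive of both ends
    windows = []
    while start <= end:
        window_end = min(start + step, end)
        windows.append((start.isoformat(), window_end.isoformat()))
        start = window_end + timedelta(days=1)
    return windows


windows = split_date_range("2021-01-01", "2022-06-30")
```

Each (startdate, enddate) pair could then be passed to a separate get_data() call and the returned lists concatenated.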