feat: 142 update usgs instant values endpoint#205
Conversation
krowvin
left a comment
There was a problem hiding this comment.
Really just curious on the plans for the api target version
We use cda v2 timeseries now but we do not call that cwms-cli load timeseries-v2
Not that it has to be consistent but I would think exposing api-v through arg is more futureproof?
|
|
||
|
|
||
| @usgs_group.command( | ||
| "timeseries-v2", |
There was a problem hiding this comment.
I think it could get a bit confusing to have multiple sub commands for timeseries.
What are your thoughts on having an api-source with it defaulting to v1 until that goes away? Then you add a warning in this version+ that says
"switch to using source --version=2 as soon as possible"
Something like cwms-cli usgs timeseries --version=2
They're almost certainly going to make a v3, v4, v5 and so on
That would make the command look like this
commands:
timeseries
timeseries-v2
timeseries-v3
timeseries-v4
timeseries-v5
...
timeseries-v349
or you could even be more explicit and say --api-version
Then we continue to default to v1, then when USGS sunsets or at some point we just remove v1 as the default and/or attempt backwards compat. Same inputs but to a different api endpoint if possible.
There was a problem hiding this comment.
Or maybe even just switch to the new USGS OGC API by default if there isn't a defined USGS method in the timeseries group and fall back to the legacy NWIS if there is a method defined w/ a warning.
Also, have a conversion script available to run to lookup up the OGC timeseries metadata and try and replace the old method with the new timeseries id in the cwms group.
Our local USGS changed a bunch of site configs on their end a couple weeks ago, so legacy nwis is not bringing data unless I specify the USGS method which is a pain to look up. If I use the new OGC, it just provides what is posted on the web. Not sure if that is rolling out to other USGS offices.
| help='Backfill version, save data to a different version of the timeseries (e.g. "Rev-USGS" instead of "Raw-USGS")', | ||
| ) | ||
| @requires(reqs.cwms, reqs.requests) | ||
| def getusgs_timeseries_v2( |
There was a problem hiding this comment.
then you'd put this inside the other getusgs_timeseries and have a switch for based on the version the enduser wants to target?
There was a problem hiding this comment.
Would it make sense to not require the backfill version tsid to be in the USGS group? That's how I'm using it now.
TS group has the full timeseries ID, for example, SABM5.Stage.Inst.~15Minutes.0.Raw-USGS but then I have a side script that will backfill to SABM5.Stage.Inst.~15Minutes.0.rev-USGS which isn't in the group. I use CCP to copy the Raw-USGS to rev-USGS. That way I can keep the record of the raw data and backfill the rev. That keeps the group 1/2 the size. And if folks want it in there USGS group it would still work without the CCP copy too.
Also, does it make sense to require the backfill argument to just be the cwms location rather than the full TSID? It's pulling all the parameters anyway. Is there a case were you would only want one parameter backfilled and not the others?
| return usgs_ts | ||
|
|
||
|
|
||
| def CWMS_writeData( |
There was a problem hiding this comment.
Opinion on this
Do not start a function/def with uppercase only do that with classes
Then we mix other case as well
PEP 8 would be lower_snake_case and would otherwise be conventionally cwms_write_data
See this in other variable names/ functions that mix camelCase and capitalization
| ) | ||
|
|
||
|
|
||
| _OGC_BASE_URL = "https://api.waterdata.usgs.gov/ogcapi/v0/collections/continuous" |
There was a problem hiding this comment.
I suggest we split the API domain/endpoints out from this and centralize them in the init.py so we will have one place to change them later on
|
|
||
| Produces the same DataFrame structure as getUSGS_ts() so CWMS_writeData() works unchanged. | ||
|
|
||
| TODO: fill in OGC response parsing once sample JSON is available. |
There was a problem hiding this comment.
Looks like we are heading towards a requests call?
We use dataretrieval in other places, do we know if there are plans to update it to v2? (or 1 since they use v0 now) - imo we keep the api target the same as theres and say v0 default, v1 for current to match?
(it's v2 for us sure)
https://github.com/DOI-USGS/dataretrieval-python
i.e.
from dataretrieval import waterdata
df, metadata = waterdata.get_continuous(
monitoring_location_id="USGS-01646500",
parameter_code="00065",
time="2024-10-01/2025-09-30",
)| USGS_data_method = pd.DataFrame() | ||
|
|
||
| if len(sites) > 0: | ||
| USGS_data = getUSGS_ts_ogc(sites, startDT, endDT) |
There was a problem hiding this comment.
I think both of these getUSGS_ts_ogccould throw an error if they do -h because of the NotImplementedError?
| ) | ||
| usgs_ts = usgs_ts[~usgs_ts["timeseries-id"].isin(missing_original_ts)] | ||
|
|
||
| except Exception as e: |
There was a problem hiding this comment.
Catching every error generically will fail, but then continue anyways. We may want to return here so we don't continue anyways if we had failures?
And/or handle explicit exceptions?
create new version for the new USGS api
add backfill version to allow users to change the version of the timeseries if doing a backfill
add tests