Environment:
ezesri 0.3.3
Python 3.12
geopandas / pandas current
Summary
When extract_layer(url) is called with a layer URL that responds successfully (HTTP 200) but with an Esri-style error JSON ({"error": {"code": ..., "message": ...}}), the function returns an empty pd.DataFrame with no warning. The caller has no way to distinguish "layer is a non-geometry table with 0 rows" from "the metadata endpoint returned an error" without re-fetching the URL themselves.
Reproduction
These three Wisconsin county GIS endpoints reproduce it today (the "Service not started" error is intermittent). Each returns HTTP 200 with an error body when its underlying service is down or its layer index is wrong:
import warnings

import ezesri
from urllib3.exceptions import InsecureRequestWarning

# Silence TLS verification warnings so the output below stays readable.
warnings.filterwarnings("ignore", category=InsecureRequestWarning)
# These are the kinds of responses the endpoints return for ?f=json:
# {"error": {"code": 404, "message": "Layer not found", ...}}
# {"error": {"code": 500, "message": "Service ... not started", ...}}
urls = [
    "https://arcgisweb.sccwi.gov/arcgis/rest/services/StCroixLFparcel/Parcels_wPA_LF/MapServer/3",  # 404 Layer not found (the county changed its layer ID)
    "https://public1.co.waupaca.wi.us/arcgis/rest/services/Public/LandRecords/MapServer/12",  # 500 Service not started
    "https://gis.woodcountywi.gov/gis/rest/services/LandRecordsViewer/FoundationalElements/MapServer/20",  # 500 Service not started
]
for url in urls:
    result = ezesri.extract_layer(url)
    print(type(result).__name__, len(result))
    # → DataFrame 0
Expected
One of:
Raise a clear exception, e.g. ezesri.LayerMetadataError(error_code, error_message).
Or return an empty GeoDataFrame consistently (currently you get a plain DataFrame, which makes downstream isinstance(result, GeoDataFrame) checks misleading).
Actual
Returns an empty pandas.DataFrame. Downstream code that expects a GeoDataFrame for a feature layer cannot distinguish this from a layer that legitimately has no geometry, leading to silent data loss in pipelines.
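Until the library handles this, a caller can guard against it by checking the layer metadata before extracting. The sketch below is one possible shape for such a guard, not something ezesri provides: `esri_error` and `extract_or_raise` are hypothetical names, and both the metadata fetcher and the extractor are passed in as callables (for example, a small function that requests the layer URL with `f=json`, and `ezesri.extract_layer`).

```python
def esri_error(payload):
    """Return the Esri error dict if payload is an error body, else None."""
    if isinstance(payload, dict) and isinstance(payload.get("error"), dict):
        return payload["error"]
    return None


def extract_or_raise(url, fetch_metadata, extract):
    """Check the layer metadata first, then delegate to the real extractor.

    fetch_metadata(url) is assumed to return the parsed ?f=json response;
    extract(url) is assumed to behave like ezesri.extract_layer.
    """
    err = esri_error(fetch_metadata(url))
    if err is not None:
        raise RuntimeError(
            f"{url}: Esri error code={err.get('code')} "
            f"message={err.get('message')}"
        )
    return extract(url)
```

This costs one extra metadata request per layer, which is the re-fetch the summary describes, so it is a stopgap rather than a fix.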
Root cause
In ezesri/extract.py, extract_layer does:
metadata = get_metadata(url)
if not metadata:
    return gpd.GeoDataFrame()
has_geometry = metadata.get("geometryType") is not None
When the server responds with {"error": {...}}, metadata is a non-empty dict (truthy), so the if not metadata guard doesn't fire. geometryType is absent, so has_geometry becomes False, and the function continues down the table (non-geometry) code path, eventually returning an empty pd.DataFrame.
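The two checks can be reproduced in isolation, without any network access. In this minimal sketch, `error_metadata` is a hand-written stand-in for what `get_metadata(url)` would return for an error response, modeled on the payloads shown in the reproduction:

```python
# Hypothetical stand-in for get_metadata(url) when the server answers
# HTTP 200 with an Esri error body.
error_metadata = {"error": {"code": 500, "message": "Service not started"}}

# The empty-metadata guard tests truthiness; a non-empty dict is truthy,
# so the guard does not fire.
guard_fires = not error_metadata

# There is no geometryType key, so the layer is treated as a table.
has_geometry = error_metadata.get("geometryType") is not None

print(guard_fires, has_geometry)
# → False False
```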
Suggested fix
Detect the error key right after the empty-metadata check, before geometryType is read:
metadata = get_metadata(url)
if not metadata:
    return gpd.GeoDataFrame()
if isinstance(metadata, dict) and "error" in metadata:
    err = metadata["error"]
    raise ValueError(
        "Esri layer metadata request failed: "
        f"code={err.get('code')} message={err.get('message')}"
    )
A dedicated exception class (e.g. EsriLayerError) would be even better so callers can except it specifically and decide whether to fall back to another source.
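A rough sketch of what that could look like follows. `EsriLayerError` and `raise_for_esri_error` are hypothetical names for illustration only; neither exists in ezesri today, and the attribute names are my own choice.

```python
class EsriLayerError(Exception):
    """Hypothetical exception for an Esri error payload returned with HTTP 200."""

    def __init__(self, code, message, url=None):
        self.code = code
        self.message = message
        self.url = url
        super().__init__(
            f"Esri layer metadata request failed: code={code} message={message}"
        )


def raise_for_esri_error(metadata, url=None):
    """Raise EsriLayerError if metadata is an Esri error body; otherwise do nothing."""
    if isinstance(metadata, dict) and isinstance(metadata.get("error"), dict):
        err = metadata["error"]
        raise EsriLayerError(err.get("code"), err.get("message"), url=url)


# Example: a caller can branch on the error code and choose a fallback.
try:
    raise_for_esri_error({"error": {"code": 404, "message": "Layer not found"}})
except EsriLayerError as exc:
    caught_code = exc.code  # 404 here; a pipeline might try another source
```

Keeping `code` and `message` as attributes lets a pipeline treat a 404 (layer ID changed) differently from a 500 (service temporarily down) without parsing the message string.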
Why this matters
I have a pipeline downloading from multiple county ArcGIS endpoints. The silent empty-DataFrame return caused corrupted files to be saved and propagated through downstream normalization steps, masking the real problem (county-side service outages and URL changes) for some time before we caught it. Raising would have made the upstream failure immediately diagnosable.
P.S. I appreciate this project, thanks for open sourcing it!