Assuming there is a raster file that contains multiple bands, is there a simple way to extract the pixel value for each band and store this information in a dataframe for later use in machine learning modeling?
-
What do you want the dataframe to look like? One row for each pixel and one column for each band value? – Bera, commented Apr 2 at 13:20
-
Exactly. The dataframe should have pixels as rows and bands as columns. For example, a raster of shape [3,2,2] has 3 bands and an extent of 2 pixels in x by 2 pixels in y, so 4 pixels in total with 3 band values each. This would be converted into a dataframe with dimensions [4x3]. – tds, commented Apr 2 at 13:40
1 Answer
One solution is to combine rioxarray (raster handling) with GeoPandas (dataframe handling): read the raster, then flatten each band into its own column.
import geopandas as gpd
import numpy as np
import rioxarray


def raster_to_gdf(r_path):
    # Read the raster as a DataArray shaped (band, y, x)
    raster = rioxarray.open_rasterio(r_path)
    # Create an empty gdf with one row per pixel (height * width)
    gdf = gpd.GeoDataFrame(index=np.arange(raster.shape[1] * raster.shape[2]))
    # Add each band's values, flattened row by row, as a column
    for band in range(raster.shape[0]):
        gdf[band] = raster[band].values.flatten()
    return gdf
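
If no geometry column is needed, the same reshaping can probably be done with NumPy and pandas alone. The following is only a minimal sketch, not part of the original answer: the function name bands_to_dataframe, the band_N column names, and the extra row/col columns are all assumptions made for illustration. It takes an array shaped (bands, rows, cols) and returns one row per pixel.

```python
import numpy as np
import pandas as pd


def bands_to_dataframe(array, band_names=None):
    """Reshape a (bands, rows, cols) array into a (pixels, bands) DataFrame.

    Hypothetical helper for illustration; names are not from the answer above.
    """
    array = np.asarray(array)
    if array.ndim != 3:
        raise ValueError("expected an array shaped (bands, rows, cols)")
    n_bands, n_rows, n_cols = array.shape
    if band_names is None:
        # Assumed naming scheme: band_1, band_2, ...
        band_names = [f"band_{i + 1}" for i in range(n_bands)]
    # Flatten each band row by row, then transpose so pixels become rows
    table = array.reshape(n_bands, n_rows * n_cols).T
    df = pd.DataFrame(table, columns=band_names)
    # Keep each pixel's grid position so predictions can be mapped back later
    rows, cols = np.indices((n_rows, n_cols))
    df.insert(0, "row", rows.ravel())
    df.insert(1, "col", cols.ravel())
    return df


# The 3-band, 2x2 raster from the comments: 4 pixels, 3 band columns
example = np.arange(12).reshape(3, 2, 2)
df = bands_to_dataframe(example)
print(df)
```

With a real file, the input array could come from rioxarray.open_rasterio(r_path).values, which is laid out as (band, y, x). The row and col columns are optional; drop them before model fitting if only the band values are wanted as features.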