Assuming there is a raster file that contains multiple bands, is there a simple way to extract the pixel value for each band and store this information in a dataframe for later use in machine learning modeling?

  • What do you want the dataframe to look like? One row for each pixel and one column for each band value?
    – Bera
    Commented Apr 2 at 13:20
  • Exactly. The dataframe should have pixels as rows and bands as columns. For example, a [3,2,2] raster (3 bands over a 2x2 extent) has 4 pixels in total, each with 3 band values, so it would be converted into a dataframe with dimensions [4x3].
    – tds
    Commented Apr 2 at 13:40

1 Answer

One solution is to combine rioxarray (raster handling) with GeoPandas (dataframe handling): read the raster, then flatten each band into its own column.

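import geopandas as gpd
import numpy as np
import rioxarray

def raster_to_gdf(r_path):
    # Read in the raster; the resulting array has shape (band, y, x)
    raster = rioxarray.open_rasterio(r_path)

    # Create an empty gdf with one row per pixel (height * width)
    gdf = gpd.GeoDataFrame(index=np.arange(raster.shape[1] * raster.shape[2]))

    # Add band values as columns, flattening each band in row-major order
    for band in range(raster.shape[0]):
        gdf[band] = raster[band].values.flatten()

    return gdf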
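
If no geometry column is needed, the same pixels-by-bands layout can probably be produced with a single NumPy reshape and a plain pandas DataFrame. The following is only a minimal sketch: `bands_to_dataframe` and the `band_1`, `band_2`, ... column names are hypothetical, and the synthetic array stands in for the `(band, y, x)` array that `raster.values` would give you.

```python
import numpy as np
import pandas as pd


def bands_to_dataframe(array, band_names=None):
    """Reshape a (bands, rows, cols) array into a (rows * cols, bands) DataFrame."""
    n_bands, n_rows, n_cols = array.shape
    if band_names is None:
        # Hypothetical default names; replace with real band descriptions if known
        band_names = [f"band_{i + 1}" for i in range(n_bands)]
    # Flatten each band in row-major order, then transpose so pixels become rows
    pixels = array.reshape(n_bands, n_rows * n_cols).T
    return pd.DataFrame(pixels, columns=band_names)


# Synthetic 3-band, 2x2 raster matching the [3,2,2] example from the comments
example = np.arange(12).reshape(3, 2, 2)
df = bands_to_dataframe(example)
print(df.shape)
```

The row index is the flattened (row-major) pixel position, so `np.unravel_index(df.index, (n_rows, n_cols))` should recover each pixel's row and column if predictions need to be mapped back onto the raster grid.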
