From 41aae8599156bed1c4c8424bed1d88bab0cd3284 Mon Sep 17 00:00:00 2001
From: Anix Lynch
Date: Tue, 10 Mar 2026 02:02:58 -0700
Subject: [PATCH] docs: add pandas DataFrame serialization guidance to encoder.md

---
 docs/en/docs/tutorial/encoder.md | 67 ++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)

diff --git a/docs/en/docs/tutorial/encoder.md b/docs/en/docs/tutorial/encoder.md
index c8f8bca8c9..5398f32e35 100644
--- a/docs/en/docs/tutorial/encoder.md
+++ b/docs/en/docs/tutorial/encoder.md
@@ -33,3 +33,70 @@ It doesn't return a large `str` containing the data in JSON format (as a string)
 `jsonable_encoder` is actually used by **FastAPI** internally to convert data. But it is useful in many other scenarios.
 
 ///
+
+## Working with Pandas DataFrames { #working-with-pandas-dataframes }
+
+**Pandas** is commonly used in FastAPI applications for data science, analytics APIs, and ML inference backends. Returning data from a Pandas DataFrame requires care: column values can be **NumPy scalars** (such as `numpy.int64`), and missing values are represented by `numpy.nan` or `pandas.NaT`, none of which map cleanly onto JSON.
+
+Depending on the Pandas version, `.to_dict(orient="records")` can return a list of dicts whose values are still NumPy scalars rather than standard Python `int` or `float`. Encoding such a value with the standard `json` module fails at runtime:
+
+```
+TypeError: Object of type int64 is not JSON serializable
+```
+
+`numpy.nan` and `pandas.NaT` have no JSON equivalent either. A strict encoder rejects `NaN` with an error, while a permissive one emits the bare token `NaN`, which is not valid JSON.
+
+### Safe Pattern: Use `jsonable_encoder` { #safe-pattern-use-jsonable-encoder }
+
+Replace missing values with `None` first, then pass the result of `.to_dict()` through `jsonable_encoder()`. At the time of writing, `jsonable_encoder` returns a `float` such as `nan` unchanged and does not recognize NumPy integer scalars, so the DataFrame is cast to `object` (which boxes values as Python objects) before encoding:
+
+```python
+from fastapi import FastAPI
+from fastapi.encoders import jsonable_encoder
+import pandas as pd
+
+app = FastAPI()
+
+
+@app.get("/encounters")
+def get_encounters():
+    df = pd.read_csv("healthcare_data.csv")
+
+    # NaN is a float, so jsonable_encoder alone would pass it through unchanged.
+    # Cast to object and replace missing values (NaN, NaT) with None (JSON null).
+    df = df.astype(object).where(df.notna(), None)
+    # jsonable_encoder then walks the records and encodes values such as datetimes.
+    return jsonable_encoder(df.to_dict(orient="records"))
+```
+
+### Alternative: Pandas Native JSON Serialization { #alternative-pandas-native-json-serialization }
+
+Pandas ships its own JSON serializer, which handles NumPy scalars, dates, and missing values directly. Serialize with `df.to_json()` and parse the result with `json.loads()` to get a list of plain Python dicts that FastAPI can return:
+
+```python
+import json
+from fastapi import FastAPI
+import pandas as pd
+
+app = FastAPI()
+
+
+@app.get("/encounters")
+def get_encounters():
+    df = pd.read_csv("healthcare_data.csv")
+
+    # to_json converts NumPy scalars, writes dates as ISO 8601, and emits null for NaN/NaT
+    return json.loads(df.to_json(orient="records", date_format="iso"))
+```
+
+/// tip
+
+Pass `date_format="iso"` to `df.to_json()` so that datetime columns are serialized as ISO 8601 strings (e.g., `"2024-01-15T00:00:00.000"`) instead of the default epoch milliseconds. This is close to how `jsonable_encoder` serializes Python `datetime` objects.
+
+///
+
+/// note
+
+Both approaches emit `null` for missing values (`NaN`, `NaT`), which is the standard JSON representation for missing data.
+
+///
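
The reason `NaN` needs explicit handling can be reproduced with the standard `json` module alone, without Pandas or FastAPI. The following is a minimal sketch, not part of the patch: `records_nan_to_none` is a hypothetical helper name, and the `records` list is a hand-written stand-in for the output of `df.to_dict(orient="records")`. It assumes that missing values arrive as Python `float` NaN.

```python
import json
import math


def records_nan_to_none(records):
    """Return a copy of `records` with float NaN values replaced by None."""
    return [
        {
            key: None if isinstance(value, float) and math.isnan(value) else value
            for key, value in row.items()
        }
        for row in records
    ]


# Hand-written stand-in for df.to_dict(orient="records") with one missing value.
records = [{"id": 1, "score": 0.5}, {"id": 2, "score": float("nan")}]

# Strict encoding (allow_nan=False) raises instead of emitting the invalid token NaN.
try:
    json.dumps(records, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = exc

# After the replacement, the same strict encoder succeeds and writes null.
cleaned = records_nan_to_none(records)
encoded = json.dumps(cleaned, allow_nan=False)
```

The helper only covers `float` NaN in flat records. Values such as `pandas.NaT` or NumPy scalars would still need the DataFrame-level handling discussed in the patch.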