When building Python applications, especially those involving external libraries like pandas or polars, it's easy to fall into the trap of writing your entire codebase around those libraries. But what happens if you need to switch libraries down the road?
A smart strategy is to use an API wrapper—a layer between your main code and the external library. This wrapper acts as a middleman, so your application only interacts with your own defined interface. If the library ever changes, you update the wrapper—not your whole codebase.
This approach is strongly recommended in Serious Python by Julien Danjou:
“No matter how useful an external library might be, be wary of letting it get its hooks into your actual source code... A better idea is to write your own API—a wrapper that encapsulates your external libraries and keeps them out of your source code.”
— Serious Python, Chapter 5
Let’s walk through a small example using pandas and polars, two popular Python libraries for working with tabular data.
We’ll start by building a simple wrapper around pandas that calculates the average of a column and counts the rows in a CSV file. Then we’ll swap in polars by changing only the wrapper, leaving the main code untouched.
Step 1: Define the Wrapper
# my_data_api.py (using pandas initially)
import pandas as pd

class DataAPI:
    def __init__(self, file_path):
        self.df = pd.read_csv(file_path)

    def get_column_average(self, column_name):
        return self.df[column_name].mean()

    def get_row_count(self):
        return len(self.df)
Step 2: Main Code Uses Only the Wrapper
# main.py
from my_data_api import DataAPI

def main():
    api = DataAPI("data.csv")
    print(f"Average: {api.get_column_average('score')}")
    print(f"Rows: {api.get_row_count()}")
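It can help to write down the contract that main.py now depends on. Here is a minimal sketch, assuming Python 3.8+ for typing.Protocol. The names TabularAPI, CsvDataAPI, and report are made up for illustration and are not part of the example above. The standard-library csv backend is only there to show that the two-method contract does not depend on pandas at all.

```python
import csv
import io
from typing import Protocol


class TabularAPI(Protocol):
    """Hypothetical name for the interface the main code relies on."""

    def get_column_average(self, column_name: str) -> float: ...

    def get_row_count(self) -> int: ...


class CsvDataAPI:
    """Illustrative backend built on the standard library only."""

    def __init__(self, source):
        # `source` is any iterable of text lines, e.g. an open file.
        self._rows = list(csv.DictReader(source))

    def get_column_average(self, column_name: str) -> float:
        values = [float(row[column_name]) for row in self._rows]
        if not values:
            raise ValueError("cannot average an empty column")
        return sum(values) / len(values)

    def get_row_count(self) -> int:
        return len(self._rows)


def report(api: TabularAPI) -> str:
    # Works with any object that provides the two methods above.
    return f"Average: {api.get_column_average('score')}, Rows: {api.get_row_count()}"


sample = io.StringIO("name,score\na,80\nb,90\nc,100\n")
print(report(CsvDataAPI(sample)))  # Average: 90.0, Rows: 3
```

Any class with these two methods satisfies the protocol, so a type checker can flag a backend that drifts away from the contract.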
Step 3: Switch the Wrapper from pandas to polars (all changes stay in the one wrapper file)
# my_data_api.py (now using polars)
import polars as pl

class DataAPI:
    def __init__(self, file_path):
        self.df = pl.read_csv(file_path)  # polars equivalent of pd.read_csv()

    def get_column_average(self, column_name):
        return self.df[column_name].mean()

    def get_row_count(self):
        return self.df.height  # polars equivalent of len(df)
✅ Done. We swapped libraries with zero changes to the main code:
# main.py
from my_data_api import DataAPI

def main():
    api = DataAPI("data.csv")
    print(f"Average: {api.get_column_average('score')}")
    print(f"Rows: {api.get_row_count()}")
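One caveat: errors can leak through a wrapper just like method calls can. As far as I know, asking pandas for a missing column raises KeyError, while polars raises its own ColumnNotFoundError, so main code that catches either one is still tied to the library. A rough sketch of one way around this is to have the wrapper raise its own exception. ColumnNotFound is a hypothetical name, and a plain dict stands in for the DataFrame so the sketch runs without either library installed.

```python
class ColumnNotFound(Exception):
    """Raised by the wrapper, whichever library sits underneath."""


class DataAPI:
    def __init__(self, columns):
        # Assumption: a dict of lists stands in for the real DataFrame.
        self._columns = columns

    def get_column_average(self, column_name):
        try:
            values = self._columns[column_name]
        except KeyError as exc:
            # In a real wrapper, catch the backend's own error type here
            # and re-raise it as the wrapper's exception.
            raise ColumnNotFound(column_name) from exc
        return sum(values) / len(values)


api = DataAPI({"score": [80, 90, 100]})
print(api.get_column_average("score"))  # 90.0

try:
    api.get_column_average("grade")
except ColumnNotFound as err:
    print(f"No such column: {err}")  # No such column: grade
```

With this in place, the main code only ever needs to know about ColumnNotFound.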
Now, imagine the main code directly used pandas everywhere:
# main.py (original version using pandas directly)
import pandas as pd

def main():
    df = pd.read_csv("data.csv")
    avg = df["score"].mean()
    rows = len(df)
    print(f"Average: {avg}")
    print(f"Rows: {rows}")
Now if we want to switch to polars, we have to:
Replace pd.read_csv() with pl.read_csv()
Decide whether to keep len(df) or switch to df.height (polars supports both)
Ensure every single usage of pandas syntax works with polars' APIs
Update all function calls and refactor logic across the codebase
Rewritten version with polars (without a wrapper):
# main.py (rewritten for polars)
import polars as pl

def main():
    df = pl.read_csv("data.csv")
    avg = df["score"].mean()
    rows = df.height
    print(f"Average: {avg}")
    print(f"Rows: {rows}")
Now imagine doing that in 10, 50, or 100 files across a large codebase. Yikes.
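To get a feel for how many files that would be in your own project, here is a rough sketch that checks whether a piece of source code imports a given library directly. It uses the standard-library ast module. The function name imports_module and the in-memory files dict are illustrative assumptions. In practice you would read the sources from disk.

```python
import ast


def imports_module(source: str, module: str) -> bool:
    """Return True if `source` imports `module` or one of its submodules."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            if any(alias.name.split(".")[0] == module for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0), where node.module may be None.
            if node.level == 0 and node.module and node.module.split(".")[0] == module:
                return True
    return False


# Hypothetical project: file name -> source text.
files = {
    "main.py": "from my_data_api import DataAPI\n",
    "my_data_api.py": "import pandas as pd\n",
    "report.py": "from pandas import DataFrame\n",
}

coupled = sorted(name for name, src in files.items() if imports_module(src, "pandas"))
print(coupled)  # ['my_data_api.py', 'report.py']
```

In a wrapped codebase, the ideal result is a list containing only the wrapper file.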
Using an API wrapper is a classic case of protecting yourself from future pain. It’s a little more work upfront, but a huge time-saver if:
A better library comes along
The original library is deprecated or falls behind
You want to add a testing layer, such as a fake backend for unit tests
Your code becomes more modular and easier to maintain, because library-specific calls live in one place.
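The testing benefit is easy to sketch. Because the main code only calls get_column_average() and get_row_count(), a test can pass in a stand-in object with canned answers, and no CSV file or data library is needed. FakeDataAPI and build_report are hypothetical names used only for this illustration.

```python
class FakeDataAPI:
    """Test double exposing the same two methods as DataAPI."""

    def __init__(self, average, row_count):
        self._average = average
        self._row_count = row_count

    def get_column_average(self, column_name):
        # Ignores the column name and returns the canned value.
        return self._average

    def get_row_count(self):
        return self._row_count


def build_report(api):
    """Hypothetical application code that only knows the wrapper's interface."""
    return [
        f"Average: {api.get_column_average('score')}",
        f"Rows: {api.get_row_count()}",
    ]


lines = build_report(FakeDataAPI(average=87.5, row_count=4))
assert lines == ["Average: 87.5", "Rows: 4"]
print("\n".join(lines))
```

The same build_report function would accept the pandas-backed or polars-backed DataAPI unchanged.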
Just like Serious Python suggests—treat external libraries like power tools. Use them carefully, and store them in your "tool shed" (wrapper) to protect the rest of your house (codebase).