This guide assumes you have pandas and openpyxl installed. If not, you can install pandas and openpyxl by running pip install pandas openpyxl
in your command line or terminal.
First, import the pandas library and use read_excel()
to load your Excel file into a DataFrame. Replace 'your_file.xlsx'
with the path to your Excel file.
import pandas as pd # Load the Excel file df = pd.read_excel('your_file.xlsx')
You can add a new column to the DataFrame by simply assigning values to it. Here are a few examples:
apply()
method.After adding the new column, you can save the modified DataFrame back to an Excel file using the to_excel()
function. Be sure to set the index
parameter to False
if you don't want to include the DataFrame's index as a separate column in the Excel file.
df.to_excel('modified_file.xlsx', index=False)
A new Excel file will be generated in your working directory.
- Handling multiple sheets: If your Excel file contains multiple sheets and you wish to modify a specific sheet, you can specify the sheet name in read_excel()
using the sheet_name
parameter.
df = pd.read_excel('your_file.xlsx', sheet_name='Sheet1')
- Installing openpyxl: Pandas relies on openpyxl
for reading and writing Excel files (.xlsx). If you haven't already, you might need to install it: pip install openpyxl
.
- Preserving formatting: Note that pandas does not preserve the original formatting of the Excel file. If maintaining the original formatting is crucial, consider using other libraries like openpyxl
directly for more fine-grained control over Excel file manipulations.