{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "Title: Iterating over a DataFrame\n", "Slug: Iterating_over_a_dataframe\n", "Summary: Iterating over a Pandas DataFrame with a generator\n", "Date: 2017-10-14 20:33 \n", "Category: Python \n", "Tags: Data Wrangling \n", "Authors: Guillaume Redoulès" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a sample dataframe" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Import modules\n", "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
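<div>\n",
 "<table border=\"1\" class=\"dataframe\">\n",
 "  <thead>\n",
 "    <tr style=\"text-align: right;\">\n",
 "      <th></th>\n",
 "      <th>fruit</th>\n",
 "      <th>color</th>\n",
 "      <th>kcal</th>\n",
 "    </tr>\n",
 "  </thead>\n",
 "  <tbody>\n",
 "    <tr>\n",
 "      <th>0</th>\n",
 "      <td>Banana</td>\n",
 "      <td>yellow</td>\n",
 "      <td>89</td>\n",
 "    </tr>\n",
 "    <tr>\n",
 "      <th>1</th>\n",
 "      <td>Orange</td>\n",
 "      <td>orange</td>\n",
 "      <td>47</td>\n",
 "    </tr>\n",
 "    <tr>\n",
 "      <th>2</th>\n",
 "      <td>Apple</td>\n",
 "      <td>red</td>\n",
 "      <td>52</td>\n",
 "    </tr>\n",
 "    <tr>\n",
 "      <th>3</th>\n",
 "      <td>lemon</td>\n",
 "      <td>yellow</td>\n",
 "      <td>15</td>\n",
 "    </tr>\n",
 "    <tr>\n",
 "      <th>4</th>\n",
 "      <td>lime</td>\n",
 "      <td>green</td>\n",
 "      <td>30</td>\n",
 "    </tr>\n",
 "    <tr>\n",
 "      <th>5</th>\n",
 "      <td>plum</td>\n",
 "      <td>purple</td>\n",
 "      <td>28</td>\n",
 "    </tr>\n",
 "  </tbody>\n",
 "</table>\n",
 "</div>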
" ], "text/plain": [
 "    fruit   color  kcal\n",
 "0  Banana  yellow    89\n",
 "1  Orange  orange    47\n",
 "2   Apple     red    52\n",
 "3   lemon  yellow    15\n",
 "4    lime   green    30\n",
 "5    plum  purple    28"
 ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [
 "# Example dataframe: one row per fruit\n",
 "raw_data = {'fruit': ['Banana', 'Orange', 'Apple', 'lemon', 'lime', 'plum'],\n",
 "            'color': ['yellow', 'orange', 'red', 'yellow', 'green', 'purple'],\n",
 "            'kcal': [89, 47, 52, 15, 30, 28]}\n",
 "\n",
 "df = pd.DataFrame(raw_data, columns=['fruit', 'color', 'kcal'])\n",
 "df"
 ] }, { "cell_type": "markdown", "metadata": {}, "source": [
 "### Using the iterrows method\n",
 "\n",
 "The `iterrows` method of a Pandas DataFrame returns a generator that yields an `(index, row)` pair for each row, where `row` is a Pandas Series. It can be used to loop over the rows of the DataFrame.\n"
 ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [
 "At line 0 there is a Banana which is yellow and contains 89 kcal\n",
 "At line 1 there is a Orange which is orange and contains 47 kcal\n",
 "At line 2 there is a Apple which is red and contains 52 kcal\n",
 "At line 3 there is a lemon which is yellow and contains 15 kcal\n",
 "At line 4 there is a lime which is green and contains 30 kcal\n",
 "At line 5 there is a plum which is purple and contains 28 kcal\n"
 ] } ], "source": [
 "# Each iteration yields the row label and the row itself as a Series\n",
 "for index, row in df.iterrows():\n",
 "    print(\"At line {0} there is a {1} which is {2} and contains {3} kcal\".format(index, row[\"fruit\"], row[\"color\"], row[\"kcal\"]))"
 ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.2" } }, "nbformat": 4, "nbformat_minor": 2 }