Autonomía digital y tecnológica

Código e ideas para una internet distribuida

Linkoteca. Pandas


Nothing beats the bar chart for fast data exploration and comparison of variable values between different groups, or building a story around how groups of data are composed.

The advantage of bar charts (or “bar plots”, “column charts”) over other chart types is that the human eye has evolved a refined ability to compare the length of objects, as opposed to angle or area.

Luckily for Python users, options for visualisation libraries are plentiful, and Pandas itself has tight integration with the Matplotlib visualisation library, allowing figures to be created directly from DataFrame and Series data objects. This blog post focuses on the use of the DataFrame.plot functions from the Pandas visualisation API.

NumPy (short for Numerical Python) is one of the top libraries equipped with useful resources to help data scientists turn Python into a powerful scientific analysis and modelling tool. The popular open source library is available under the BSD license. It is the foundational Python library for performing tasks in scientific computing. NumPy is part of a bigger Python-based ecosystem of open source tools called SciPy.

Pandas is another great library that can enhance your Python skills for data science. Just like NumPy, it belongs to the family of SciPy open source software and is available under the BSD free software license.

Matplotlib is also part of the SciPy core packages and offered under the BSD license. It is a popular Python scientific library used for producing simple and powerful visualizations. You can use the Python framework for data science for generating creative graphs, charts, histograms, and other shapes and figures—without worrying about writing many lines of code. For example, let’s see how the Matplotlib library can be used to create a simple bar chart.