Author: Dr. Tirthajyoti Sarkar
1
Ebook

Data Wrangling with Python. Creating actionable data from raw sources

Dr. Tirthajyoti Sarkar, Shubhadeep Roychowdhury

For data to be useful and meaningful, it must be curated and refined. Data Wrangling with Python teaches you the core ideas behind these processes and equips you with knowledge of the most popular tools and techniques in the domain.The book starts with the absolute basics of Python, focusing mainly on data structures. It then delves into the fundamental tools of data wrangling like NumPy and Pandas libraries. You'll explore useful insights into why you should stay away from traditional ways of data cleaning, as done in other languages, and take advantage of the specialized pre-built routines in Python. This combination of Python tips and tricks will also demonstrate how to use the same Python backend and extract/transform data from an array of sources including the Internet, large database vaults, and Excel financial tables. To help you prepare for more challenging scenarios, you'll cover how to handle missing or wrong data, and reformat it based on the requirements from the downstream analytics tool. The book will further help you grasp concepts through real-world examples and datasets.By the end of this book, you will be confident in using a diverse array of sources to extract, clean, transform, and format your data efficiently.

2
Ebook

The Data Wrangling Workshop. Create your own actionable insights using data from multiple raw sources - Second Edition

Brian Lipp, Shubhadeep Roychowdhury, Dr. Tirthajyoti Sarkar

While a huge amount of data is readily available to us, it is not useful in its raw form. For data to be meaningful, it must be curated and refined.If you’re a beginner, then The Data Wrangling Workshop will help to break down the process for you. You’ll start with the basics and build your knowledge, progressing from the core aspects behind data wrangling, to using the most popular tools and techniques.This book starts by showing you how to work with data structures using Python. Through examples and activities, you’ll understand why you should stay away from traditional methods of data cleaning used in other languages and take advantage of the specialized pre-built routines in Python. Later, you’ll learn how to use the same Python backend to extract and transform data from an array of sources, including the internet, large database vaults, and Excel financial tables. To help you prepare for more challenging scenarios, the book teaches you how to handle missing or incorrect data, and reformat it based on the requirements from your downstream analytics tool.By the end of this book, you will have developed a solid understanding of how to perform data wrangling with Python, and learned several techniques and best practices to extract, clean, transform, and format your data efficiently, from a diverse array of sources.