Analiza danych

129
Ebook

Data science od podstaw. Analiza danych w Pythonie. Wydanie II

Joel Grus

Analityka danych jest uważana za wyjątkowo obiecującą dziedzinę wiedzy. Rozwija się błyskawicznie i znajduje coraz to nowsze zastosowania. Profesjonaliści biegli w eksploracji danych i wydobywaniu z nich pożytecznych informacji mogą liczyć na interesującą pracę i bardzo atrakcyjne warunki zatrudnienia. Jednak aby zostać analitykiem danych, trzeba znać matematykę i statystykę, a także nauczyć się programowania. Umiejętności w zakresie uczenia maszynowego i uczenia głębokiego również są ważne. W przypadku tak specyficznej dziedziny, jaką jest nauka o danych, szczególnie istotne jest zdobycie gruntownych podstaw i dogłębne ich zrozumienie. W tym przewodniku opisano zagadnienia związane z podstawami nauki o danych. Wyjaśniono niezbędne elementy matematyki i statystyki. Przedstawiono także techniki budowy potrzebnych narzędzi i sposoby działania najistotniejszych algorytmów. Książka została skonstruowana tak, aby poszczególne implementacje były jak najbardziej przejrzyste i zrozumiałe. Zamieszczone tu przykłady napisano w Pythonie: jest to język dość łatwy do nauki, a pracę na danych ułatwia szereg przydatnych bibliotek Pythona. W drugim wydaniu znalazły się nowe tematy, takie jak uczenie głębokie, statystyka i przetwarzanie języka naturalnego, a także działania na ogromnych zbiorach danych. Zagadnienia te często pojawiają się w pracy współczesnego analityka danych. W książce między innymi: elementy algebry liniowej, statystyki i rachunku prawdopodobieństwa zbieranie, oczyszczanie i eksploracja danych algorytmy modeli analizy danych podstawy uczenia maszynowego systemy rekomendacji i przetwarzanie języka naturalnego analiza sieci społecznościowych i algorytm MapReduce Nauka o danych: bazuj na solidnych podstawach!

130
Ebook

Data Science with SQL Server Quick Start Guide. Integrate SQL Server with data science

Dejan Sarka

SQL Server only started to fully support data science with its two most recent editions. If you are a professional from both worlds, SQL Server and data science, and interested in using SQL Server and Machine Learning (ML) Services for your projects, then this is the ideal book for you.This book is the ideal introduction to data science with Microsoft SQL Server and In-Database ML Services. It covers all stages of a data science project, from businessand data understanding,through data overview, data preparation, modeling and using algorithms, model evaluation, and deployment.You will learn to use the engines and languages that come with SQL Server, including ML Services with R and Python languages and Transact-SQL. You will also learn how to choose which algorithm to use for which task, and learn the working of each algorithm.

131
Ebook

Data Visualization for Business Decisions. Transforming Data into Actionable Insights

Mercury Learning and Information, Andres Fortino

This workbook is for business analysts aiming to enhance their skills in creating data visuals, presentations, and report illustrations to support business decisions. It focuses on developing visualization and analytical skills through qualitative labs. Readers will analyze and describe chart improvements instead of directly modifying them. The course covers eighteen elements across six dimensions: Story, Signs, Purpose, Perception, Method, and Charts.The journey starts with labs and a case study, introducing the analysis tool. It then delves into each dimension, guiding readers through exercises to enhance their understanding and skills. A comprehensive RAIKS survey assesses progress before and after using the text. The workbook concludes with a capstone exercise to review and analyze the final results of the two studied charts.These skills are crucial for effective data communication in business. This workbook transitions readers from basic to advanced visualization techniques, blending theoretical insights with practical skills. Companion files with videos, sample files, and slides enhance learning, making this workbook an essential resource for mastering business data visualization.

132
Ebook

Data Visualization with D3 4.x Cookbook. Visualization Strategies for Tackling Dirty Data - Second Edition

Nick Zhu

Master D3.js and create amazing visualizations with the Data Visualization with D3 4.x Cookbook. Written by professional data engineer Nick Zhu, this D3.js cookbook features over 65 recipes. ? Solve real-world visualization problems using D3.js practical recipes ? Understand D3 fundamentals ? Includes illustrations, ready-to-go code samples and pre-built chart recipes

133
Ebook
134
Ebook

Data Wrangling on AWS. Clean and organize complex data for analysis

Navnit Shukla, Sankar M, Sampat Palani

Data wrangling is the process of cleaning, transforming, and organizing raw, messy, or unstructured data into a structured format. It involves processes such as data cleaning, data integration, data transformation, and data enrichment to ensure that the data is accurate, consistent, and suitable for analysis. Data Wrangling on AWS equips you with the knowledge to reap the full potential of AWS data wrangling tools.First, you’ll be introduced to data wrangling on AWS and will be familiarized with data wrangling services available in AWS. You’ll understand how to work with AWS Glue DataBrew, AWS data wrangler, and AWS Sagemaker. Next, you’ll discover other AWS services like Amazon S3, Redshift, Athena, and Quicksight. Additionally, you’ll explore advanced topics such as performing Pandas data operation with AWS data wrangler, optimizing ML data with AWS SageMaker, building the data warehouse with Glue DataBrew, along with security and monitoring aspects.By the end of this book, you’ll be well-equipped to perform data wrangling using AWS services.

135
Ebook

Data Wrangling Using Pandas, SQL, and Java. A Comprehensive Guide to Data Cleaning and Transformation

Mercury Learning and Information, Oswald Campesato

This book is designed for aspiring data scientists and those involved in data cleaning. It covers features of NumPy and Pandas, along with creating databases and tables in MySQL. It also addresses various data wrangling tasks using Python scripts and awk-based shell scripts. Companion files with code are available from the publisher.Understanding data cleaning and manipulation is vital for data scientists. This book provides a comprehensive introduction to essential tools and techniques. From Python basics to advanced data wrangling, it equips readers with the skills needed to manage and clean data effectively.The journey begins with an introduction to Python and progresses through working with data, Pandas, and SQL. It also covers Java, JSON, XML, and specific data cleaning tasks. The book culminates with detailed data wrangling techniques, ensuring readers gain practical, hands-on experience in data management.

136
Ebook

Data Wrangling with Python. Creating actionable data from raw sources

Dr. Tirthajyoti Sarkar, Shubhadeep Roychowdhury

For data to be useful and meaningful, it must be curated and refined. Data Wrangling with Python teaches you the core ideas behind these processes and equips you with knowledge of the most popular tools and techniques in the domain.The book starts with the absolute basics of Python, focusing mainly on data structures. It then delves into the fundamental tools of data wrangling like NumPy and Pandas libraries. You'll explore useful insights into why you should stay away from traditional ways of data cleaning, as done in other languages, and take advantage of the specialized pre-built routines in Python. This combination of Python tips and tricks will also demonstrate how to use the same Python backend and extract/transform data from an array of sources including the Internet, large database vaults, and Excel financial tables. To help you prepare for more challenging scenarios, you'll cover how to handle missing or wrong data, and reformat it based on the requirements from the downstream analytics tool. The book will further help you grasp concepts through real-world examples and datasets.By the end of this book, you will be confident in using a diverse array of sources to extract, clean, transform, and format your data efficiently.