E-books
7537
E-book

Data science: najseksowniejszy zawód XXI wieku w Polsce. Big data, sztuczna inteligencja i PowerPoint

Remigiusz Żulicki

Is artificial intelligence taking away our jobs? Are algorithms taking over the world? Does big data mean that we are under constant surveillance, and that vast quantities of data are replacing experts and scientists? Whatever we think about these questions, one thing is certain: there is a heterogeneous community of people who work on so-called "artificial intelligence" or so-called "big data" from the technical and methodological side. Their field of activity is called data science, and they are called data scientists. This publication is the first sociological monograph on data science and the first work in the social sciences to study data science as a social world in the sense of Adele E. Clarke. This approach makes it possible to look at data science, which the "Harvard Business Review" called "the sexiest job of the 21st century" a decade ago, both from the perspective of its participants and from a bird's-eye view, in relation to academia, business, law, the media, and politics.

7538
E-book

Data science od podstaw. Analiza danych w Pythonie

Joel Grus

Today's huge data sets contain answers to almost every question. At the same time, data science is a somewhat intimidating field. It sits somewhere between subtle hacking skills, solid knowledge of mathematics and statistics, and substantive knowledge of a given industry. What is more, the field is developing extremely dynamically. The effort invested in learning data science nevertheless pays off: a skilled data analyst can count on well-paid, inspiring, and very attractive work. With this book you will master the most important topics in mathematics and statistics, and you will also develop your hacking skills. This will give you the foundations for starting your adventure with data analysis. You will get thoroughly acquainted with the necessary tools and algorithms, which will help you better understand how they work. The examples that illustrate the topics are clear, well described, and easy to understand. While reading, you will get to know libraries that let you implement the techniques discussed when analyzing large data sets. You will quickly find that becoming a data analyst takes a bit of curiosity, plenty of willingness, lots of hard work, and... this book. Key topics: a practical introduction to Python; the basics of linear algebra, statistics, and probability in data analysis; the basics of machine learning; implementations of model algorithms, including the naive Bayes classifier, linear regression, logistic regression, decision trees, neural networks, clustering, and MapReduce; recommender systems and natural language processing mechanisms; working with social media and databases. Python: squeeze every drop of knowledge out of your data!

7539
E-book

Data science od podstaw. Analiza danych w Pythonie. Wydanie II

Joel Grus

Data analytics is considered an exceptionally promising field. It is developing at lightning speed and keeps finding new applications. Professionals skilled in exploring data and extracting useful information from it can count on interesting work and very attractive employment conditions. However, to become a data analyst you need to know mathematics and statistics and also learn to program. Skills in machine learning and deep learning matter as well. In a field as specific as data science, it is particularly important to acquire solid foundations and understand them deeply. This guide covers the fundamentals of data science. It explains the necessary elements of mathematics and statistics. It also presents techniques for building the tools you need and the workings of the most important algorithms. The book is structured so that each implementation is as clear and understandable as possible. The examples are written in Python: it is a fairly easy language to learn, and a range of useful Python libraries makes working with data easier. The second edition adds new topics, such as deep learning, statistics, and natural language processing, as well as working with huge data sets. These topics come up often in the work of a modern data analyst. The book covers, among other things: elements of linear algebra, statistics, and probability; collecting, cleaning, and exploring data; algorithms for data analysis models; the basics of machine learning; recommender systems and natural language processing; social network analysis and the MapReduce algorithm. Data science: build on solid foundations!

7540
E-book

Data Science. Programowanie, analiza i wizualizacja danych z wykorzystaniem języka R

Michael Freeman, Joel Ross

Turning raw data into ready-to-use knowledge requires the ability to analyze it, transform it, and sometimes visualize it. The reward for this effort is a better understanding of complex issues across many fields. What is more, knowing how to process data programmatically lets you quickly detect and describe patterns in data that are practically impossible to spot with other techniques. For many researchers, however, the need to write code is a barrier to taking advantage of these attractive possibilities. This is a handbook of R programming for data analysts, particularly useful for people with no experience in this area. It describes the necessary tools and technologies in detail. It includes guidance on installing and configuring software for writing, running, and managing code, on tracking project versions and changes, and on using other basic mechanisms. The individual steps of writing R code are explained thoroughly and accessibly. With this book you can move smoothly to concrete tasks and build the applications you need. Numerous examples and exercises make the material easier to understand, so you can quickly start analyzing your own data sets effectively. The book covers, among other things: setting up the working environment and getting started with programming in R; the basics of project management, version control, and generating documentation; data frames and the dplyr and tidyr packages; code for data visualization and the ggplot2 package; building applications and techniques for collaborating in teams of specialists. Simply R and data. Squeeze out every drop of knowledge!

7541
E-book

Data Science Projects with Python. A case study approach to gaining valuable insights from real data with machine learning - Second Edition

Stephen Klosterman

If data is the new oil, then machine learning is the drill. As companies gain access to ever-increasing quantities of raw data, the ability to deliver state-of-the-art predictive models that support business decision-making becomes more and more valuable. In this book, you’ll work on an end-to-end project based around a realistic data set and split up into bite-sized practical exercises. This creates a case-study approach that simulates the working conditions you’ll experience in real-world data science projects. You’ll learn how to use key Python packages, including pandas, Matplotlib, and scikit-learn, and master the process of data exploration and data processing, before moving on to fitting, evaluating, and tuning algorithms such as regularized logistic regression and random forest. Now in its second edition, this book will take you through the end-to-end process of exploring data and delivering machine learning models. Updated for 2021, this edition includes brand new content on XGBoost, SHAP values, algorithmic fairness, and the ethical concerns of deploying a model in the real world. By the end of this data science book, you’ll have the skills, understanding, and confidence to build your own machine learning models and gain insights from real data.

7542
E-book

Data Science Projects with Python. A case study approach to successful data science projects using Python, pandas, and scikit-learn

Stephen Klosterman

Data Science Projects with Python is designed to give you practical guidance on industry-standard data analysis and machine learning tools, by applying them to realistic data problems. You will learn how to use pandas and Matplotlib to critically examine datasets with summary statistics and graphs, and extract the insights you seek to derive. You will build your knowledge as you prepare data using the scikit-learn package and feed it to machine learning algorithms such as regularized logistic regression and random forest. You’ll discover how to tune algorithms to provide the most accurate predictions on new and unseen data. As you progress, you’ll gain insights into the working and output of these algorithms, building your understanding of both the predictive capabilities of the models and why they make these predictions. By the end of this book, you will have the necessary skills to confidently use machine learning algorithms to perform detailed data analysis and extract meaningful insights from unstructured data.

7543
E-book

Data Science Tools. Comprehensive Guide to Mastering Fundamental Data Science and Statistics Techniques

Mercury Learning and Information, Christopher Greco

This book introduces popular data science tools and guides readers on how to use them effectively. It covers data analysis using Microsoft Excel, KNIME, R, and OpenOffice, applying statistical concepts such as confidence intervals, the normal distribution, t-tests, linear regression, histograms, and geographic analysis with real data from Federal Government sources. The course begins with the basics, including importing data and conducting various statistical tests. It progresses to specific methods for each tool, ensuring a comprehensive understanding of data analysis. Capstone exercises provide hands-on experience, reinforcing the concepts learned throughout the book. Understanding these tools and concepts is crucial for effective data analysis. This book takes readers from the basics to advanced statistical methods, combining theoretical insights with practical applications. Companion files with source code and data sets enhance the learning experience, making this book an essential resource for mastering data analysis with popular software applications.

7544
E-book

Data Science with .NET and Polyglot Notebooks. Programmer's guide to data science using ML.NET, OpenAI, and Semantic Kernel

Matt Eland

As the fields of data science, machine learning, and artificial intelligence rapidly evolve, .NET developers are eager to leverage their expertise to dive into these exciting domains but are often unsure of how to do so. Data Science in .NET with Polyglot Notebooks is the practical guide you need to seamlessly bring your .NET skills into the world of analytics and AI. With Microsoft’s .NET platform now robustly supporting machine learning and AI tasks, the introduction of tools such as .NET Interactive kernels and Polyglot Notebooks has opened up a world of possibilities for .NET developers. This book empowers you to harness the full potential of these cutting-edge technologies, guiding you through hands-on experiments that illustrate key concepts and principles. Through a series of interactive notebooks, you’ll not only master technical processes but also discover how to integrate these new skills into your current role or pivot to exciting opportunities in the data science field. By the end of the book, you’ll have acquired the necessary knowledge and confidence to apply cutting-edge data science techniques and deliver impactful solutions within the .NET ecosystem.

7545
E-book

Data Science with SQL Server Quick Start Guide. Integrate SQL Server with data science

Dejan Sarka

SQL Server only started to fully support data science with its two most recent editions. If you are a professional from both worlds, SQL Server and data science, and interested in using SQL Server and Machine Learning (ML) Services for your projects, then this is the ideal book for you. This book is the ideal introduction to data science with Microsoft SQL Server and In-Database ML Services. It covers all stages of a data science project, from business and data understanding, through data overview, data preparation, modeling and using algorithms, model evaluation, and deployment. You will learn to use the engines and languages that come with SQL Server, including ML Services with the R and Python languages and Transact-SQL. You will also learn how to choose which algorithm to use for which task, and how each algorithm works.

7546
E-book

Data science, wyzwania i rozwiązania. Jak zostać ekspertem analizy danych

Daniel Vaughan

Learning and practicing data science is not among the easiest tasks. Education in this field usually focuses on programming and machine learning, yet a great data analyst has to know about many other things. They can learn these on the job, but that requires finding a mentor, which unfortunately is not always possible. "This handbook starts where most books end: with real decision-making processes based on conclusions drawn from data." (Brett Holleman, freelance data scientist) With this book you will absorb a range of techniques that will help you become a more productive data analyst. First you will get to know topics related to understanding data and the soft skills that turn out to be essential in the work of a good data scientist. Only then will you focus on the key aspects of machine learning. In this way you will gradually travel the path from an average candidate to an outstanding data science specialist. The skills described in this guide have been identified, catalogued, analyzed, and applied over many years to generate value and to train data scientists in a variety of companies and industries. From the book you will learn: how to make data-driven processes generate value; how to design useful metrics; how to win stakeholder buy-in; how to make sure that a machine learning algorithm is suitable for a given task; how to get data leakage under control. "Here is the missing handbook for achieving commercial success with data science!" (Adri Purkayastha, director of AI risk, BNP Paribas)

7547
E-book

Data Stewardship in Action. A roadmap to data value realization and measurable business outcomes

Pui Shing Lee, Dr. Toa Charm

In the competitive data-centric world, mastering data stewardship is not just a requirement; it's the key to organizational success. Unlock strategic excellence with Data Stewardship in Action, your guide to exploring the intricacies of data stewardship and its implementation for maximum efficiency. From business strategy to data strategy, and then to data stewardship, this book shows you how to strategically deploy your workforce, processes, and technology for efficient data processing. You’ll gain mastery over the fundamentals of data stewardship, from understanding the different roles and responsibilities to implementing best practices for data governance. You’ll elevate your data management skills by exploring the technologies and tools for effective data handling. As you progress through the chapters, you’ll realize that this book not only helps you develop the foundational skills to become a successful data steward but also introduces innovative approaches, including leveraging AI and GPT, for enhanced data stewardship. By the end of this book, you’ll be able to build a robust data governance framework by developing policies and procedures, establishing a dedicated data governance team, and creating a data governance roadmap that ensures your organization thrives in the dynamic landscape of data management.

7548
E-book

Data Storytelling with Google Looker Studio. A hands-on guide to using Looker Studio for building compelling and effective dashboards

Sireesha Pulipati, Nicholas Kelly

Presenting data visually makes it easier for organizations and individuals to interpret and analyze information. Looker Studio is an easy-to-use, collaborative tool that enables you to transform your data into engaging visualizations. This allows you to build and share dashboards that help monitor key performance indicators, identify patterns, and generate insights to ultimately drive decisions and actions. Data Storytelling with Looker Studio begins by laying out the foundational design principles and guidelines that are essential to creating accurate, effective, and compelling data visualizations. Next, you’ll delve into the features and capabilities of Looker Studio, from basic to advanced, and explore their application with examples. The subsequent chapters walk you through building dashboards with a structured three-stage process called the 3D approach, using real-world examples that’ll help you understand the various design and implementation considerations. This approach involves determining the objectives and needs of the dashboard, designing its key components and layout, and developing each element of the dashboard. By the end of this book, you will have a solid understanding of the storytelling approach and be able to create data stories of your own using Looker Studio.

7549
E-book

Data Structures and Algorithms with the C++ STL. A guide for modern C++ practitioners

John Farrier

While the Standard Template Library (STL) offers a rich set of tools for data structures and algorithms, navigating its intricacies can be daunting for intermediate C++ developers without expert guidance. This book offers a thorough exploration of the STL’s components, covering fundamental data structures, advanced algorithms, and concurrency features. Starting with an in-depth analysis of std::vector, this book highlights its pivotal role in the STL, progressing toward building your proficiency in utilizing vectors, managing memory, and leveraging iterators. The book then advances to the STL’s data structures, including sequence containers, associative containers, and unordered containers, simplifying the concepts of container adaptors and views to enhance your knowledge of modern STL programming. Shifting the focus to STL algorithms, you’ll get to grips with sorting, searching, and transformations and develop the skills to implement and modify algorithms with best practices. Advanced sections cover extending the STL with custom types and algorithms, as well as concurrency features, exception safety, and parallel algorithms. By the end of this book, you’ll have transformed into a proficient STL practitioner ready to tackle real-world challenges and build efficient and scalable C++ applications.

7550
E-book

Data Structures and Program Design Using C++. A Self-Teaching Introduction to Data Structures and C++

Mercury Learning and Information, D. Malhotra, N. Malhotra

This book introduces the fundamentals of data structures using C++ in a self-teaching format. It covers managing large amounts of information, SEO, and creating Internet/Web indexing services. Practical analogies with real-world applications help explain technical concepts. The book includes end-of-chapter exercises such as programming tasks, theoretical questions, and multiple-choice quizzes. The course starts with an introduction to data structures and the C++ language, progressing through arrays, linked lists, queues, searching and sorting, stacks, trees, multi-way search trees, hashing, files, and graphs. Each chapter builds on the previous one, ensuring a comprehensive understanding of data structures. Understanding these concepts is crucial for managing large databases and optimizing web services. This book guides readers from basic to advanced data structure techniques, blending theoretical knowledge with practical skills. Companion files with source code and data sets enhance the learning experience, making this book an essential resource for mastering data structures with C++.

7551
E-book

Data Structures and Program Design Using Java. A Self-Teaching Introduction to Data Structures and Java

Mercury Learning and Information, D. Malhotra, N. Malhotra

This book introduces the fundamentals of data structures using Java in a self-teaching format. It covers managing large databases, effective SEO, and creating web indexing services. Real-world analogies help explain technical concepts. Each chapter includes programming tasks, theoretical questions, and multiple-choice quizzes. The course begins with an introduction to data structures and Java, moving through arrays, linked lists, queues, searching and sorting, stacks, trees, multi-way search trees, hashing, files, and graphs. Each chapter builds on the previous one, ensuring a thorough understanding of data structures. Understanding these concepts is crucial for managing information and optimizing web services. This book guides readers from basic to advanced techniques, blending theory with practical skills. It is an essential resource for mastering data structures with Java, enhanced by end-of-chapter exercises and real-world examples.

7552
E-book

Data Structures and Program Design Using Python. A Self-Teaching Introduction to Data Structures and Python

Mercury Learning and Information, D. Malhotra, N. Malhotra

This book, part of the Pocket Primer series, introduces the basic concepts of data science using Python 3 and other applications. It offers a fast-paced introduction to data analytics, statistics, data visualization, linear algebra, and regular expressions. The book features numerous code samples using Python, NumPy, R, SQL, NoSQL, and Pandas. Companion files with source code and color figures are available. Understanding data science is crucial in today's data-driven world. This book provides a comprehensive introduction, covering key areas such as Python 3, data visualization, and statistical concepts. The practical code samples and hands-on approach make it ideal for beginners and those looking to enhance their skills. The journey begins with working with data, followed by an introduction to probability, statistics, and linear algebra. It then delves into Python, NumPy, Pandas, R, regular expressions, and SQL/NoSQL, concluding with data visualization techniques. This structured approach ensures a solid foundation in data science.

7553
E-book

Data Visualization: a successful design process

Andy Kirk

Do you want to create more attractive charts? Or do you have huge data sets and need to unearth the key insights in a visual manner? Data visualization is the representation and presentation of data, using proven design techniques to bring alive the patterns, stories and key insights locked away. Data Visualization: a Successful Design Process explores the unique fusion of art and science that is data visualization; a discipline for which instinct alone is insufficient for you to succeed in enabling audiences to discover key trends, insights and discoveries from your data. This book will equip you with the key techniques required to overcome contemporary data visualization challenges. You'll discover a proven design methodology that helps you develop invaluable knowledge and practical capabilities. You'll never again settle for a default Excel chart or resort to fancy-looking graphs. You will be able to work from the starting point of acquiring, preparing and familiarizing yourself with your data, right through to concept design. Choose your killer visual representation to engage and inform your audience. Data Visualization: a Successful Design Process will inspire you to relish any visualization project with greater confidence and bullish know-how, turning challenges into exciting design opportunities.

7554
E-book

Data Visualization for Business Decisions. Transforming Data into Actionable Insights

Mercury Learning and Information, Andres Fortino

This workbook is for business analysts aiming to enhance their skills in creating data visuals, presentations, and report illustrations to support business decisions. It focuses on developing visualization and analytical skills through qualitative labs. Readers will analyze and describe chart improvements instead of directly modifying them. The course covers eighteen elements across six dimensions: Story, Signs, Purpose, Perception, Method, and Charts. The journey starts with labs and a case study, introducing the analysis tool. It then delves into each dimension, guiding readers through exercises to enhance their understanding and skills. A comprehensive RAIKS survey assesses progress before and after using the text. The workbook concludes with a capstone exercise to review and analyze the final results of the two studied charts. These skills are crucial for effective data communication in business. This workbook transitions readers from basic to advanced visualization techniques, blending theoretical insights with practical skills. Companion files with videos, sample files, and slides enhance learning, making this workbook an essential resource for mastering business data visualization.

7555
E-book

Data Visualization: Representing Information on Modern Web

Simon Timms, Andy Kirk, Aendrew Rininsland, Swizec Teller

Do you want to create more attractive charts? Or do you have huge data sets and need to unearth the key insights in a visual manner? Data visualization is the representation and presentation of data, using proven design techniques to bring alive the patterns, stories, and key insights that are locked away. This learning path is divided into three modules. The first module will equip you with the key techniques required to overcome contemporary data visualization challenges. The second module, Social Data Visualization with HTML5 and JavaScript, teaches you how to leverage HTML5 techniques through JavaScript to build visualizations. The third module, Learning d3.js Data Visualization, will lead you to D3, which has emerged as one of the leading platforms for developing beautiful, interactive visualizations on the web. By the end of this course, you will have unlocked the mystery behind successful data visualizations. This Learning Path combines some of the best that Packt has to offer in one complete, curated package. It includes content from the following Packt products: Data Visualization: a successful design process by Andy Kirk; Social Data Visualization with HTML5 and JavaScript by Simon Timms; Learning d3.js Data Visualization, Second Edition by Ændrew Rininsland and Swizec Teller.

7556
E-book

Data Visualization with D3 4.x Cookbook. Visualization Strategies for Tackling Dirty Data - Second Edition

Nick Zhu

Master D3.js and create amazing visualizations with the Data Visualization with D3 4.x Cookbook. Written by professional data engineer Nick Zhu, this D3.js cookbook features over 65 recipes. Solve real-world visualization problems using practical D3.js recipes. Understand D3 fundamentals. Includes illustrations, ready-to-go code samples, and pre-built chart recipes.

7557
E-book
7558
E-book

Data Visualization with D3.js Cookbook. Turn your digital data into dynamic graphics with this exciting, leading-edge cookbook. Packed with recipes and practical guidance it will quickly make you a proficient user of the D3 JavaScript library

Nick Zhu

D3.js is a JavaScript library designed to display digital data in dynamic graphical form. It helps you bring data to life using HTML, SVG, and CSS. D3 allows great control over the final visual result, and it is the hottest and most powerful web-based data visualization technology on the market today. Data Visualization with D3.js Cookbook is packed with practical recipes to help you learn every aspect of data visualization with D3. Data Visualization with D3.js Cookbook is designed to provide you with all the guidance you need to get to grips with data visualization with D3. With this book, you will create breathtaking data visualizations with professional efficiency and precision with the help of practical recipes, illustrations, and code samples. Data Visualization with D3.js Cookbook starts off by touching upon data visualization and D3 basics before gradually taking you through a number of practical recipes covering a wide range of topics you need to know about D3. You will learn the fundamental concepts of data visualization, functional JavaScript, and D3 fundamentals including element selection, data binding, animation, and SVG generation. You will also learn how to leverage more advanced techniques such as custom interpolators, custom tweening, timers, the layout manager, force manipulation, and so on. This book also provides a number of pre-built chart recipes with ready-to-go sample code to help you bootstrap quickly.

7559
E-book

Data Wrangling on AWS. Clean and organize complex data for analysis

Navnit Shukla, Sankar M, Sampat Palani

Data wrangling is the process of cleaning, transforming, and organizing raw, messy, or unstructured data into a structured format. It involves processes such as data cleaning, data integration, data transformation, and data enrichment to ensure that the data is accurate, consistent, and suitable for analysis. Data Wrangling on AWS equips you with the knowledge to reap the full potential of AWS data wrangling tools. First, you’ll be introduced to data wrangling on AWS and will be familiarized with the data wrangling services available in AWS. You’ll understand how to work with AWS Glue DataBrew, AWS Data Wrangler, and AWS SageMaker. Next, you’ll discover other AWS services like Amazon S3, Redshift, Athena, and QuickSight. Additionally, you’ll explore advanced topics such as performing pandas data operations with AWS Data Wrangler, optimizing ML data with AWS SageMaker, and building the data warehouse with Glue DataBrew, along with security and monitoring aspects. By the end of this book, you’ll be well-equipped to perform data wrangling using AWS services.

7560
E-book

Data Wrangling Using Pandas, SQL, and Java. A Comprehensive Guide to Data Cleaning and Transformation

Mercury Learning and Information, Oswald Campesato

This book is designed for aspiring data scientists and those involved in data cleaning. It covers features of NumPy and Pandas, along with creating databases and tables in MySQL. It also addresses various data wrangling tasks using Python scripts and awk-based shell scripts. Companion files with code are available from the publisher. Understanding data cleaning and manipulation is vital for data scientists. This book provides a comprehensive introduction to essential tools and techniques. From Python basics to advanced data wrangling, it equips readers with the skills needed to manage and clean data effectively. The journey begins with an introduction to Python and progresses through working with data, Pandas, and SQL. It also covers Java, JSON, XML, and specific data cleaning tasks. The book culminates with detailed data wrangling techniques, ensuring readers gain practical, hands-on experience in data management.