Analiza danych

561
Wird geladen...
E-BOOK

Practical Data Quality. Learn practical, real-world strategies to transform the quality of data in your organization

Robert Hawker, Nicola Askham

Poor data quality can lead to increased costs, hinder revenue growth, compromise decision-making, and introduce risk into organizations. This leads to employees, customers, and suppliers finding every interaction with the organization frustrating.Practical Data Quality provides a comprehensive view of managing data quality within your organization, covering everything from business cases through to embedding improvements that you make to the organization permanently. Each chapter explains a key element of data quality management, from linking strategy and data together to profiling and designing business rules which reveal bad data. The book outlines a suite of tried-and-tested reports that highlight bad data and allow you to develop a plan to make corrections. Throughout the book, you’ll work with real-world examples and utilize re-usable templates to accelerate your initiatives.By the end of this book, you’ll have gained a clear understanding of every stage of a data quality initiative and be able to drive tangible results for your organization at pace.

562
Wird geladen...
E-BOOK

Practical Data Science Cookbook. Data pre-processing, analysis and visualization using R and Python - Second Edition

Prabhanjan Narayanachar Tattar, Bhushan Purushottam Joshi, Sean...

As increasing amounts of data are generated each year, the need to analyze and create value out of it is more important than ever. Companies that know what to do with their data and how to do it well will have a competitive advantage over companies that don’t. Because of this, there will be an increasing demand for people that possess both the analytical and technical abilities to extract valuable insights from data and create valuable solutions that put those insights to use. Starting with the basics, this book covers how to set up your numerical programming environment, introduces you to the data science pipeline, and guides you through several data projects in a step-by-step format. By sequentially working through the steps in each chapter, you will quickly familiarize yourself with the process and learn how to apply it to a variety of situations with examples using the two most popular programming languages for data analysis—R and Python.

563
Wird geladen...
E-BOOK

Practical Data Wrangling. Expert techniques for transforming your raw data into a valuable source for analytics

Allan Visochek

Around 80% of time in data analysis is spent on cleaning and preparing data for analysis. This is, however, an important task, and is a prerequisite to the rest of the data analysis workflow, including visualization, analysis and reporting. Python and R are considered a popular choice of tool for data analysis, and have packages that can be best used to manipulate different kinds of data, as per your requirements. This book will show you the different data wrangling techniques, and how you can leverage the power of Python and R packages to implement them.You’ll start by understanding the data wrangling process and get a solid foundation to work with different types of data. You’ll work with different data structures and acquire and parse data from various locations. You’ll also see how to reshape the layout of data and manipulate, summarize, and join data sets. Finally, we conclude with a quick primer on accessing and processing data from databases, conducting data exploration, and storing and retrieving data quickly using databases.The book includes practical examples on each of these points using simple and real-world data sets to give you an easier understanding. By the end of the book, you’ll have a thorough understanding of all the data wrangling concepts and how to implement them in the best possible way.

564
Wird geladen...
E-BOOK

Practical Deep Learning at Scale with MLflow. Bridge the gap between offline experimentation and online production

Yong Liu, Dr. Matei Zaharia

The book starts with an overview of the deep learning (DL) life cycle and the emerging Machine Learning Ops (MLOps) field, providing a clear picture of the four pillars of deep learning: data, model, code, and explainability and the role of MLflow in these areas.From there onward, it guides you step by step in understanding the concept of MLflow experiments and usage patterns, using MLflow as a unified framework to track DL data, code and pipelines, models, parameters, and metrics at scale. You’ll also tackle running DL pipelines in a distributed execution environment with reproducibility and provenance tracking, and tuning DL models through hyperparameter optimization (HPO) with Ray Tune, Optuna, and HyperBand. As you progress, you’ll learn how to build a multi-step DL inference pipeline with preprocessing and postprocessing steps, deploy a DL inference pipeline for production using Ray Serve and AWS SageMaker, and finally create a DL explanation as a service (EaaS) using the popular Shapley Additive Explanations (SHAP) toolbox.By the end of this book, you’ll have built the foundation and gained the hands-on experience you need to develop a DL pipeline solution from initial offline experimentation to final deployment and production, all within a reproducible and open source framework.

565
Wird geladen...
E-BOOK

Practical GIS. Learn novice to advanced topics such as QGIS, Spatial data analysis, and more

Gábor Farkas

The most commonly used GIS tools automate tasks that were historically done manually—compiling new maps by overlaying one on top of the other or physically cutting maps into pieces representing specific study areas, changing their projection, and getting meaningful results from the various layers by applying mathematical functions and operations. This book is an easy-to-follow guide to use the most matured open source GIS tools for these tasks.We’ll start by setting up the environment for the tools we use in the book. Then you will learn how to work with QGIS in order to generate useful spatial data. You will get to know the basics of queries, data management, and geoprocessing.After that, you will start to practice your knowledge on real-world examples. We will solve various types of geospatial analyses with various methods. We will start with basic GIS problems by imitating the work of an enthusiastic real estate agent, and continue with more advanced, but typical tasks by solving a decision problem. Finally, you will find out how to publish your data (and results) on the web. We will publish our data with QGIS Server and GeoServer, and create a basic web map with the API of the lightweight Leaflet web mapping library.

566
Wird geladen...
E-BOOK

Practical Machine Learning Cookbook. Supervised and unsupervised machine learning simplified

Atul Tripathi

Machine learning has become the new black. The challenge in today’s world is the explosion of data from existing legacy data and incoming new structured and unstructured data. The complexity of discovering, understanding, performing analysis, and predicting outcomes on the data using machine learning algorithms is a challenge. This cookbook will help solve everyday challenges you face as a data scientist. The application of various data science techniques and on multiple data sets based on real-world challenges you face will help you appreciate a variety of techniques used in various situations.The first half of the book provides recipes on fairly complex machine-learning systems, where you’ll learn to explore new areas of applications of machine learning and improve its efficiency. That includes recipes on classifications, neural networks, unsupervised and supervised learning, deep learning, reinforcement learning, and more.The second half of the book focuses on three different machine learning case studies, all based on real-world data, and offers solutions and solves specific machine-learning issues in each one.

567
Wird geladen...
E-BOOK

Practical Predictive Analytics. Analyse current and historical data to predict future trends using R, Spark, and more

Ralph Winters

This is the go-to book for anyone interested in the steps needed to develop predictive analytics solutions with examples from the world of marketing, healthcare, and retail. We'll get startedwith a brief history of predictive analytics and learn about different roles and functions people play within a predictive analytics project. Then, we will learn about various ways of installing R along with their pros and cons, combined with a step-by-step installation of RStudio,and a description of the best practices for organizing your projects.On completing the installation, we will begin to acquire the skills necessary to input, clean, and prepare your data for modeling. We will learn the six specific steps needed to implement andsuccessfully deploy a predictive model starting from asking the right questions through model development and ending with deploying your predictive model into production. We will learn whycollaboration is important and how agile iterative modeling cycles can increase your chances of developing and deploying the best successful model.We will continue your journey in the cloud by extending your skill set by learning about Databricks and SparkR, which allow you to develop predictive models on vast gigabytes of data.

568
Wird geladen...
E-BOOK

Practical Real-time Data Processing and Analytics. Distributed Computing and Event Processing using Apache Spark, Flink, Storm, and Kafka

Shilpi Saxena, Saurabh Gupta

With the rise of Big Data, there is an increasing need to process large amounts of data continuously, with a shorter turnaround time. Real-time data processing involves continuous input, processing and output of data, with the condition that the time required for processing is as short as possible.This book covers the majority of the existing and evolving open source technology stack for real-time processing and analytics. You will get to know about all the real-time solution aspects, from the source to the presentation to persistence. Through this practical book, you’ll be equipped with a clear understanding of how to solve challenges on your own.We’ll cover topics such as how to set up components, basic executions, integrations, advanced use cases, alerts, and monitoring. You’ll be exposed to the popular tools used in real-time processing today such as Apache Spark, Apache Flink, and Storm. Finally, you will put your knowledge to practical use by implementing all of the techniques in the form of a practical, real-world use case.By the end of this book, you will have a solid understanding of all the aspects of real-time data processing and analytics, and will know how to deploy the solutions in production environments in the best possible manner.