Analiza danych

121
Wird geladen...
E-BOOK

Data Exploration and Preparation with BigQuery. A practical guide to cleaning, transforming, and analyzing data for business insights

Mike Kahn

Data professionals encounter a multitude of challenges such as handling large volumes of data, dealing with data silos, and the lack of appropriate tools. Datasets often arrive in different conditions and formats, demanding considerable time from analysts, engineers, and scientists to process and uncover insights. The complexity of the data life cycle often hinders teams and organizations from extracting the desired value from their data assets. Data Exploration and Preparation with BigQuery offers a holistic solution to these challenges.The book begins with the basics of BigQuery while covering the fundamentals of data exploration and preparation. It then progresses to demonstrate how to use BigQuery for these tasks and explores the array of big data tools at your disposal within the Google Cloud ecosystem.The book doesn’t merely offer theoretical insights; it’s a hands-on companion that walks you through properly structuring your tables for query efficiency and ensures adherence to data preparation best practices. You’ll also learn when to use Dataflow, BigQuery, and Dataprep for ETL and ELT workflows. The book will skillfully guide you through various case studies, demonstrating how BigQuery can be used to solve real-world data problems.By the end of this book, you’ll have mastered the use of SQL to explore and prepare datasets in BigQuery, unlocking deeper insights from data.

122
Wird geladen...
E-BOOK

Data Forecasting and Segmentation Using Microsoft Excel. Perform data grouping, linear predictions, and time series machine learning statistics without using code

Fernando Roque

Data Forecasting and Segmentation Using Microsoft Excel guides you through basic statistics to test whether your data can be used to perform regression predictions and time series forecasts. The exercises covered in this book use real-life data from Kaggle, such as demand for seasonal air tickets and credit card fraud detection.You’ll learn how to apply the grouping K-means algorithm, which helps you find segments of your data that are impossible to see with other analyses, such as business intelligence (BI) and pivot analysis. By analyzing groups returned by K-means, you’ll be able to detect outliers that could indicate possible fraud or a bad function in network packets.By the end of this Microsoft Excel book, you’ll be able to use the classification algorithm to group data with different variables. You’ll also be able to train linear and time series models to perform predictions and forecasts based on past data.

123
Wird geladen...
E-BOOK

Data Governance Handbook. A practical approach to building trust in data

Wendy S. Batchelder

2.5 quintillion bytes! This is the amount of data being generated every single day across the globe. As this number continues to grow, understanding and managing data becomes more complex. Data professionals know that it’s their responsibility to navigate this complexity and ensure effective governance, empowering businesses with the right data, at the right time, and with the right controls.If you are a data professional, this book will equip you with valuable guidance to conquer data governance complexities with ease. Written by a three-time chief data officer in global Fortune 500 companies, the Data Governance Handbook is an exhaustive guide to understanding data governance, its key components, and how to successfully position solutions in a way that translates into tangible business outcomes.By the end, you’ll be able to successfully pitch and gain support for your data governance program, demonstrating tangible outcomes that resonate with key stakeholders.*Email sign-up and proof of purchase required

124
Wird geladen...
E-BOOK

Data Ingestion with Python Cookbook. A practical guide to ingesting, monitoring, and identifying errors in the data ingestion process

Gláucia Esppenchutz

Data Ingestion with Python Cookbook offers a practical approach to designing and implementing data ingestion pipelines. It presents real-world examples with the most widely recognized open source tools on the market to answer commonly asked questions and overcome challenges.You’ll be introduced to designing and working with or without data schemas, as well as creating monitored pipelines with Airflow and data observability principles, all while following industry best practices. The book also addresses challenges associated with reading different data sources and data formats. As you progress through the book, you’ll gain a broader understanding of error logging best practices, troubleshooting techniques, data orchestration, monitoring, and storing logs for further consultation.By the end of the book, you’ll have a fully automated set that enables you to start ingesting and monitoring your data pipeline effortlessly, facilitating seamless integration with subsequent stages of the ETL process.

125
Wird geladen...
E-BOOK

Data Lake for Enterprises. Lambda Architecture for building enterprise data systems

Vivek Mishra, Tomcy John, Pankaj Misra

The term Data Lake has recently emerged as a prominent term in the big data industry. Data scientists can make use of it in deriving meaningful insights that can be used by businesses to redefine or transform the way they operate. Lambda architecture is also emerging as one of the very eminent patterns in the big data landscape, as it not only helps to derive useful information from historical data but also correlates real-time data to enable business to take critical decisions. This book tries to bring these two important aspects — data lake and lambda architecture—together.This book is divided into three main sections. The first introduces you to the concept of data lakes, the importance of data lakes in enterprises, and getting you up-to-speed with the Lambda architecture. The second section delves into the principal components of building a data lake using the Lambda architecture. It introduces you to popular big data technologies such as Apache Hadoop, Spark, Sqoop, Flume, and ElasticSearch. The third section is a highly practical demonstration of putting it all together, and shows you how an enterprise data lake can be implemented, along with several real-world use-cases. It also shows you how other peripheral components can be added to the lake to make it more efficient.By the end of this book, you will be able to choose the right big data technologies using the lambda architectural patterns to build your enterprise data lake.

126
Wird geladen...
E-BOOK

Data Literacy With Python. A Comprehensive Guide to Understanding and Analyzing Data with Python

Mercury Learning and Information, Oswald Campesato

This book ushers readers into the world of data, emphasizing its importance in modern industries and how its management leads to insightful decision-making. Using Python 3, the book introduces foundational data tasks and progresses to advanced model training concepts. Detailed, step-by-step Python examples help readers master training models, starting with the kNN algorithm and moving to other classifiers with minimal code adjustments. Tools like Sweetviz, Skimpy, Matplotlib, and Seaborn are introduced for hands-on chart and graph rendering.The course begins with working with data, detecting outliers and anomalies, and cleaning datasets. It then introduces statistics and progresses to using Matplotlib and Seaborn for data visualization. Each chapter builds on the previous one, ensuring a comprehensive understanding of data management and analysis.These concepts are crucial for making data-driven decisions. This book transitions readers from basic data handling to advanced model training, blending theoretical knowledge with practical skills. Companion files with source code and data sets enhance the learning experience, making this book an invaluable resource for mastering data science with Python.

127
Wird geladen...
E-BOOK

Data Management Strategy at Microsoft. Best practices from a tech giant's decade-long data transformation journey

Aleksejs Plotnikovs

Microsoft pioneered data innovation and investment ahead of many in the industry, setting a remarkable standard for data maturity. Written by a data leader with over 15 years of experience following Microsoft’s data journey, this book delves into every crucial aspect of this journey, including change management, aligning with business needs, enhancing data value, and cultivating a data-driven culture.This book emphasizes that success in a data-driven enterprise goes beyond relying solely on modern technology and highlights the importance of prioritizing genuine business needs to propel necessary modernizations through change management practices. You’ll see how data-driven innovation does not solely reside within central IT engineering teams but also among the data's business owners who rely on data daily for their operational needs. This guide empower these professionals with clean, easily discoverable, and business-ready data, marking a significant breakthrough in how data is perceived and utilized throughout an enterprise. You’ll also discover advanced techniques to nurture the value of data as unique intellectual property, and differentiate your organization with the power of data.Its storytelling approach and summary of essential insights at the end of each chapter make this book invaluable for business and data leaders to advocate for crucial data investments.

128
Wird geladen...
E-BOOK

Data Modeling with Microsoft Excel. Model and analyze data using Power Pivot, DAX, and Cube functions

Bernard Obeng Boateng, Michael Olafusi

Microsoft Excel's BI solutions have evolved, offering users more flexibility and control over analyzing data directly in Excel. Features like PivotTables, Data Model, Power Query, and Power Pivot empower Excel users to efficiently get, transform, model, aggregate, and visualize data.Data Modeling with Microsoft Excel offers a practical way to demystify the use and application of these tools using real-world examples and simple illustrations.This book will introduce you to the world of data modeling in Excel, as well as definitions and best practices in data structuring for both normalized and denormalized data. The next set of chapters will take you through the useful features of Data Model and Power Pivot, helping you get to grips with the types of schemas (snowflake and star) and create relationships within multiple tables. You’ll also understand how to create powerful and flexible measures using DAX and Cube functions.By the end of this book, you’ll be able to apply the acquired knowledge in real-world scenarios and build an interactive dashboard that will help you make important decisions.Note: To access the supplemental material, subscribers should purchase a print copy of the book. The ebook can be accessed through the QR code or link provided inside the Print book. Proof of purchase is mandatory to access the ebook.