Publisher: K-i-s-publishing

1201
E-BOOK

Data Analytics Using Splunk 9.x. A practical guide to implementing Splunk's features for performing data analysis at scale

Dr. Nadine Shillingford

Splunk 9 improves on the existing Splunk tool to include important features such as federated search, observability, performance improvements, and dashboarding. This book helps you to make the best use of the impressive and new features to prepare a Splunk installation that can be employed in the data analysis process. Starting with an introduction to the different Splunk components, such as indexers, search heads, and forwarders, this Splunk book takes you through the step-by-step installation and configuration instructions for basic Splunk components using Amazon Web Services (AWS) instances. You’ll import the BOTS v1 dataset into a search head and begin exploring data using the Splunk Search Processing Language (SPL), covering various types of Splunk commands, lookups, and macros. After that, you’ll create tables, charts, and dashboards using Splunk’s new Dashboard Studio, and then advance to work with clustering, container management, data models, federated search, bucket merging, and more. By the end of the book, you’ll not only have learned everything about the latest features of Splunk 9 but also have a solid understanding of the performance tuning techniques in the latest version.

1202
E-BOOK

Data Augmentation with Python. Enhance deep learning accuracy with data augmentation methods for image, text, audio, and tabular data

Duc Haba

Data is paramount in AI projects, especially for deep learning and generative AI, as forecasting accuracy relies on input datasets being robust. Acquiring additional data through traditional methods can be challenging, expensive, and impractical, and data augmentation offers an economical option to extend the dataset. The book teaches you over 20 geometric, photometric, and random erasing augmentation methods using seven real-world datasets for image classification and segmentation. You’ll also review eight image augmentation open source libraries, write object-oriented programming (OOP) wrapper functions in Python Notebooks, view color image augmentation effects, analyze safe levels and biases, as well as explore fun facts and take on fun challenges. As you advance, you’ll discover over 20 character and word techniques for text augmentation using two real-world datasets and excerpts from four classic books. The chapter on advanced text augmentation uses machine learning to extend the text dataset, such as Transformer, Word2vec, BERT, GPT-2, and others. The chapters on audio and tabular data likewise feature real-world data, open source libraries, custom plots, and Python Notebooks, along with fun facts and challenges. By the end of this book, you will be proficient in image, text, audio, and tabular data augmentation techniques.

1203
E-BOOK

Data Center Virtualization Certification: VCP6.5-DCV Exam Guide. Everything you need to achieve 2V0-622 certification – with exam tips and exercises

Andrea Mauro, Paolo Valsecchi

This exam guide enables you to install, configure, and manage the vSphere 6.5 infrastructure in all its components: vCenter Server, ESXi hosts, and virtual machines, while helping you to prepare for the industry standard certification. This data center book will assist you in automating administration tasks and enhancing your environment’s capabilities. You will begin with an introduction to all aspects related to security, networking, and storage in vSphere 6.5. Next, you will learn about resource management and understand how to back up and restore the vSphere 6.5 infrastructure. As you advance, you will also cover troubleshooting, deployment, availability, and virtual machine management. This is followed by two mock tests that will test your knowledge and challenge your understanding of all the topics included in the exam. By the end of this book, you will not only have learned about virtualization and its techniques, but you’ll also be prepared to pass the VCP6.5-DCV (2V0-622) exam.

1204
E-BOOK

Data Cleaning and Exploration with Machine Learning. Get to grips with machine learning techniques to achieve sparkling-clean data quickly

Michael Walker

Many individuals who know how to run machine learning algorithms do not have a good sense of the statistical assumptions they make and how to match the properties of the data to the algorithm for the best results. As you start with this book, models are carefully chosen to help you grasp the underlying data, including in-feature importance and correlation, and the distribution of features and targets. The first two parts of the book introduce you to techniques for preparing data for ML algorithms, without being bashful about using some ML techniques for data cleaning, including anomaly detection and feature selection. The book then helps you apply that knowledge to a wide variety of ML tasks. You’ll gain an understanding of popular supervised and unsupervised algorithms, how to prepare data for them, and how to evaluate them. Next, you’ll build models and understand the relationships in your data, as well as perform cleaning and exploration tasks with that data. You’ll make quick progress in studying the distribution of variables, identifying anomalies, and examining bivariate relationships, as you focus more on the accuracy of predictions in this book. By the end of this book, you’ll be able to deal with complex data problems using unsupervised ML algorithms like principal component analysis and k-means clustering.

1205
E-BOOK

Data Cleaning with Power BI. The definitive guide to transforming dirty data into actionable insights

Gus Frazer

Microsoft Power BI offers a range of powerful data cleaning and preparation options through tools such as DAX, Power Query, and the M language. However, despite its user-friendly interface, mastering it can be challenging. Whether you're a seasoned analyst or a novice exploring the potential of Power BI, this comprehensive guide equips you with techniques to transform raw data into a reliable foundation for insightful analysis and visualization. This book serves as a comprehensive guide to data cleaning, starting with data quality, common data challenges, and best practices for handling data. You’ll learn how to import and clean data with Query Editor and transform data using the M query language. As you advance, you’ll explore Power BI’s data modeling capabilities for efficient cleaning and establishing relationships. Later chapters cover best practices for using Power Automate for data cleaning and task automation. Finally, you’ll discover how OpenAI and ChatGPT can make data cleaning in Power BI easier. By the end of the book, you will have a comprehensive understanding of data cleaning concepts, techniques, and how to use Power BI and its tools for effective data preparation.

1206
E-BOOK

Data Democratization with Domo. Bring together every component of your business to make better data-driven decisions using Domo

Jeff Burtenshaw

Domo is a power-packed business intelligence (BI) platform that empowers organizations to track, analyze, and activate data in record time at cloud scale and performance. Data Democratization with Domo begins with an overview of the Domo ecosystem. You’ll learn how to get data into the cloud with Domo data connectors and Workbench; profile datasets; use Magic ETL to transform data; work with in-memory data sculpting tools (Data Views and Beast Modes); create, edit, and link card visualizations; and create card drill paths using Domo Analyzer. Next, you’ll discover options to distribute content with real-time updates using Domo Embed and digital wallboards. As you advance, you’ll understand how to use alerts and webhooks to drive automated actions. You’ll also build and deploy a custom app to the Domo Appstore and find out how to code Python apps, use Jupyter Notebooks, and insert R custom models. Furthermore, you’ll learn how to use Auto ML to automatically evaluate dozens of models for the best fit using SageMaker and produce a predictive model as well as use Python and the Domo Command Line Interface tool to extend Domo. Finally, you’ll learn how to govern and secure the entire Domo platform. By the end of this book, you’ll have gained the skills you need to become a successful Domo master.

1207
E-BOOK

Data Engineering Best Practices. Architect robust and cost-effective data solutions in the cloud era

Richard J. Schiller, David Larochelle

Revolutionize your approach to data processing in the fast-paced business landscape with this essential guide to data engineering. Discover the power of scalable, efficient, and secure data solutions through expert guidance on data engineering principles and techniques. Written by two industry experts with over 60 years of combined experience, it offers deep insights into best practices, architecture, agile processes, and cloud-based pipelines. You’ll start by defining the challenges data engineers face and understand how this agile and future-proof comprehensive data solution architecture addresses them. As you explore the extensive toolkit, mastering the capabilities of various instruments, you’ll gain the knowledge needed for independent research. Covering everything you need, right from data engineering fundamentals, the guide uses real-world examples to illustrate potential solutions. It elevates your skills to architect scalable data systems, implement agile development processes, and design cloud-based data pipelines. The book further equips you with the knowledge to harness serverless computing and microservices to build resilient data applications. By the end, you'll be armed with the expertise to design and deliver high-performance data engineering solutions that are not only robust, efficient, and secure but also future-ready.

1208
E-BOOK

Data Engineering with Alteryx. Helping data engineers apply DataOps practices with Alteryx

Paul Houghton

Alteryx is a GUI-based development platform for data analytic applications. Data Engineering with Alteryx will help you leverage Alteryx’s code-free aspects which increase development speed while still enabling you to make the most of the code-based skills you have. This book will teach you the principles of DataOps and how they can be used with the Alteryx software stack. You’ll build data pipelines with Alteryx Designer and incorporate the error handling and data validation needed for reliable datasets. Next, you’ll take the data pipeline from raw data, transform it into a robust dataset, and publish it to Alteryx Server following a continuous integration process. By the end of this Alteryx book, you’ll be able to build systems for validating datasets, monitoring workflow performance, managing access, and promoting the use of your data sources.

1209
E-BOOK

Data Engineering with Apache Spark, Delta Lake, and Lakehouse. Create scalable pipelines that ingest, curate, and aggregate complex data in a timely and secure way

Manoj Kukreja

In the world of ever-changing data and schemas, it is important to build data pipelines that can auto-adjust to changes. This book will help you build scalable data platforms that managers, data scientists, and data analysts can rely on. Starting with an introduction to data engineering, along with its key concepts and architectures, this book will show you how to use Microsoft Azure Cloud services effectively for data engineering. You'll cover data lake design patterns and the different stages through which the data needs to flow in a typical data lake. Once you've explored the main features of Delta Lake to build data lakes with fast performance and governance in mind, you'll advance to implementing the lambda architecture using Delta Lake. Packed with practical examples and code snippets, this book takes you through real-world examples based on production scenarios faced by the author in his 10 years of experience working with big data. Finally, you'll cover data lake deployment strategies that play an important role in provisioning the cloud resources and deploying the data pipelines in a repeatable and continuous way. By the end of this data engineering book, you'll know how to effectively deal with ever-changing data and create scalable data pipelines to streamline data science, ML, and artificial intelligence (AI) tasks.

1210
E-BOOK

Data Engineering with AWS. Acquire the skills to design and build AWS-based data transformation pipelines like a pro - Second Edition

Gareth Eagar

This book, authored by a Senior Data Architect with 25 years of experience, helps you gain expertise in the AWS ecosystem for data engineering. This revised edition updates every chapter to cover the latest AWS services and features, provides a refreshed view on data governance, and introduces a new section on building modern data platforms. You will learn how to implement a data mesh, work with open-table formats such as Apache Iceberg, and apply DataOps practices for automation and observability. You will begin by exploring core concepts and essential AWS tools used by data engineers, along with modern data management approaches. You will then design and build data pipelines, review raw data sources, transform data, and understand how it is consumed by various stakeholders. The book also covers data governance, populating data marts and warehouses, and how a data lakehouse fits into the architecture. You will explore AWS tools for analysis, SQL queries, visualizations, and learn how AI and machine learning generate insights from data. Later chapters cover transactional data lakes, data meshes, and building a complete AWS data platform. By the end, you will be able to confidently implement data engineering pipelines on AWS. *Email sign-up and proof of purchase required

1211
E-BOOK

Data Engineering with AWS Cookbook. A recipe-based approach to help you tackle data engineering problems with AWS services

Trâm Ngoc Pham, Gonzalo Herreros González, Viquar...

Performing data engineering with Amazon Web Services (AWS) combines AWS's scalable infrastructure with robust data processing tools, enabling efficient data pipelines and analytics workflows. This comprehensive guide to AWS data engineering will teach you all you need to know about data lake management, pipeline orchestration, and serving layer construction. Through clear explanations and hands-on exercises, you’ll master essential AWS services such as Glue, EMR, Redshift, QuickSight, and Athena. Additionally, you’ll explore various data platform topics such as data governance, data quality, DevOps, CI/CD, planning and performing data migration, and creating Infrastructure as Code. As you progress, you will gain insights into how to enrich your platform and use various AWS cloud services such as AWS EventBridge, AWS DataZone, and AWS SCT and DMS to solve data platform challenges. Each recipe in this book is tailored to a daily challenge that a data engineer team faces while building a cloud platform. By the end of this book, you will be well-versed in AWS data engineering and have gained proficiency in key AWS services and data processing techniques. You will develop the necessary skills to tackle large-scale data challenges with confidence.

1212
E-BOOK

Data Engineering with AWS. Learn how to design and build cloud-based data transformation pipelines using AWS

Gareth Eagar

Written by a Senior Data Architect with over twenty-five years of experience in the business, Data Engineering with AWS is a book whose sole aim is to make you proficient in using the AWS ecosystem. Using a thorough and hands-on approach to data, this book will give aspiring and new data engineers a solid theoretical and practical foundation to succeed with AWS. As you progress, you’ll be taken through the services and the skills you need to architect and implement data pipelines on AWS. You'll begin by reviewing important data engineering concepts and some of the core AWS services that form a part of the data engineer's toolkit. You'll then architect a data pipeline, review raw data sources, transform the data, and learn how the transformed data is used by various data consumers. You’ll also learn about populating data marts and data warehouses along with how a data lakehouse fits into the picture. Later, you'll be introduced to AWS tools for analyzing data, including those for ad-hoc SQL queries and creating visualizations. In the final chapters, you'll understand how the power of machine learning and artificial intelligence can be used to draw new insights from data. By the end of this AWS book, you'll be able to carry out data engineering tasks and implement a data pipeline on AWS independently.

1213
E-BOOK

Data Engineering with Databricks Cookbook. Build effective data and AI solutions using Apache Spark, Databricks, and Delta Lake

Pulkit Chadha

Written by a Senior Solutions Architect at Databricks, Data Engineering with Databricks Cookbook will show you how to effectively use Apache Spark, Delta Lake, and Databricks for data engineering, starting with a comprehensive introduction to data ingestion and loading with Apache Spark. What makes this book unique is its recipe-based approach, which will help you put your knowledge to use straight away and tackle common problems. You’ll be introduced to various data manipulation and data transformation solutions that can be applied to data, find out how to manage and optimize Delta tables, and get to grips with ingesting and processing streaming data. The book will also show you how to resolve performance problems in Apache Spark apps and Delta Lake. Advanced recipes later in the book will teach you how to use Databricks to implement DataOps and DevOps practices, as well as how to orchestrate and schedule data pipelines using Databricks Workflows. You’ll also go through the full process of setting up and configuring Unity Catalog for data governance. By the end of this book, you’ll be well-versed in building reliable and scalable data pipelines using modern data engineering technologies.

1214
E-BOOK

Data Engineering with dbt. A practical guide to building a cloud-based, pragmatic, and dependable data platform with SQL

Roberto Zagni

dbt Cloud helps professional analytics engineers automate the application of powerful and proven patterns to transform data from ingestion to delivery, enabling real DataOps. This book begins by introducing you to dbt and its role in the data stack, along with how it uses simple SQL to build your data platform, helping you and your team work better together. You’ll find out how to leverage data modeling, data quality, master data management, and more to build a simple-to-understand and future-proof solution. As you advance, you’ll explore the modern data stack, understand how data-related careers are changing, and see how dbt enables this transition into the emerging role of an analytics engineer. The chapters help you build a sample project using the free version of dbt Cloud, Snowflake, and GitHub to create a professional DevOps setup with continuous integration, automated deployment, ELT run, scheduling, and monitoring, solving practical cases you encounter in your daily work. By the end of this dbt book, you’ll be able to build an end-to-end pragmatic data platform by ingesting data exported from your source systems, coding the needed transformations, including master data and the desired business rules, and building well-formed dimensional models or wide tables that’ll enable you to build reports with the BI tool of your choice.

1215
E-BOOK

Data Engineering with Google Cloud Platform. A guide to leveling up as a data engineer by building a scalable data platform with Google Cloud - Second Edition

Adi Wijaya, António Vilares

The second edition of Data Engineering with Google Cloud builds upon the success of the first edition by offering enhanced clarity and depth to data professionals navigating the intricate landscape of data engineering. Beyond its foundational lessons, this new edition delves into the essential realm of data governance within Google Cloud, providing you with invaluable insights into managing and optimizing data resources effectively. Written by a Data Strategic Cloud Engineer at Google, this book helps you stay ahead of the curve by guiding you through the latest technological advancements in the Google Cloud ecosystem. You’ll cover essential aspects, from exploring Cloud Composer 2 to the evolution of Airflow 2.5. Additionally, you’ll explore how to work with cutting-edge tools like Dataform, DLP, Dataplex, Dataproc Serverless, and Datastream to perform data governance on datasets. By the end of this book, you'll be equipped to navigate the ever-evolving world of data engineering on Google Cloud, from foundational principles to cutting-edge practices.

1216
E-BOOK

Data Engineering with Google Cloud Platform. A practical guide to operationalizing scalable data analytics systems on GCP

Adi Wijaya

With this book, you'll understand how the highly scalable Google Cloud Platform (GCP) enables data engineers to create end-to-end data pipelines right from storing and processing data and workflow orchestration to presenting data through visualization dashboards. Starting with a quick overview of the fundamental concepts of data engineering, you'll learn the various responsibilities of a data engineer and how GCP plays a vital role in fulfilling those responsibilities. As you progress through the chapters, you'll be able to leverage GCP products to build a sample data warehouse using Cloud Storage and BigQuery and a data lake using Dataproc. The book gradually takes you through operations such as data ingestion, data cleansing, transformation, and integrating data with other sources. You'll learn how to design IAM for data governance, deploy ML pipelines with Vertex AI, leverage pre-built GCP models as a service, and visualize data with Google Data Studio to build compelling reports. Finally, you'll find tips on how to boost your career as a data engineer, take the Professional Data Engineer certification exam, and get ready to become an expert in data engineering with GCP. By the end of this data engineering book, you'll have developed the skills to perform core data engineering tasks and build efficient ETL data pipelines with GCP.