Ebooks
7537
Ebook

Data Wrangling on AWS. Clean and organize complex data for analysis

Navnit Shukla, Sankar M, Sampat Palani

Data wrangling is the process of cleaning, transforming, and organizing raw, messy, or unstructured data into a structured format. It involves processes such as data cleaning, data integration, data transformation, and data enrichment to ensure that the data is accurate, consistent, and suitable for analysis. Data Wrangling on AWS equips you with the knowledge to reap the full potential of AWS data wrangling tools. First, you’ll be introduced to data wrangling on AWS and will be familiarized with data wrangling services available in AWS. You’ll understand how to work with AWS Glue DataBrew, AWS Data Wrangler, and AWS SageMaker. Next, you’ll discover other AWS services like Amazon S3, Redshift, Athena, and QuickSight. Additionally, you’ll explore advanced topics such as performing Pandas data operations with AWS Data Wrangler, optimizing ML data with AWS SageMaker, building the data warehouse with Glue DataBrew, along with security and monitoring aspects. By the end of this book, you’ll be well-equipped to perform data wrangling using AWS services.

7538
Ebook

Data Wrangling Using Pandas, SQL, and Java. A Comprehensive Guide to Data Cleaning and Transformation

Mercury Learning and Information, Oswald Campesato

This book is designed for aspiring data scientists and those involved in data cleaning. It covers features of NumPy and Pandas, along with creating databases and tables in MySQL. It also addresses various data wrangling tasks using Python scripts and awk-based shell scripts. Companion files with code are available from the publisher. Understanding data cleaning and manipulation is vital for data scientists. This book provides a comprehensive introduction to essential tools and techniques. From Python basics to advanced data wrangling, it equips readers with the skills needed to manage and clean data effectively. The journey begins with an introduction to Python and progresses through working with data, Pandas, and SQL. It also covers Java, JSON, XML, and specific data cleaning tasks. The book culminates with detailed data wrangling techniques, ensuring readers gain practical, hands-on experience in data management.

7539
Ebook

Data Wrangling with Python. Creating actionable data from raw sources

Dr. Tirthajyoti Sarkar, Shubhadeep Roychowdhury

For data to be useful and meaningful, it must be curated and refined. Data Wrangling with Python teaches you the core ideas behind these processes and equips you with knowledge of the most popular tools and techniques in the domain. The book starts with the absolute basics of Python, focusing mainly on data structures. It then delves into the fundamental tools of data wrangling like NumPy and Pandas libraries. You'll explore useful insights into why you should stay away from traditional ways of data cleaning, as done in other languages, and take advantage of the specialized pre-built routines in Python. This combination of Python tips and tricks will also demonstrate how to use the same Python backend and extract/transform data from an array of sources including the Internet, large database vaults, and Excel financial tables. To help you prepare for more challenging scenarios, you'll cover how to handle missing or wrong data, and reformat it based on the requirements from the downstream analytics tool. The book will further help you grasp concepts through real-world examples and datasets. By the end of this book, you will be confident in using a diverse array of sources to extract, clean, transform, and format your data efficiently.

7540
Ebook

Data Wrangling with R. Load, explore, transform and visualize data for modeling with tidyverse libraries

Gustavo R Santos

In this information era, where large volumes of data are being generated every day, companies want to get a better grip on it to perform more efficiently than before. This is where skillful data analysts and data scientists come into play, wrangling and exploring data to generate valuable business insights. In order to do that, you’ll need plenty of tools that enable you to extract the most useful knowledge from data. Data Wrangling with R will help you to gain a deep understanding of ways to wrangle and prepare datasets for exploration, analysis, and modeling. This data book enables you to get your data ready for more optimized analyses, develop your first data model, and perform effective data visualization. The book begins by teaching you how to load and explore datasets. Then, you’ll get to grips with the modern concepts and tools of data wrangling. As data wrangling and visualization are intrinsically connected, you’ll go over best practices to plot data and extract insights from it. The chapters are designed in a way to help you learn all about modeling, as you will go through the construction of a data science project from end to end, and become familiar with the built-in RStudio, including an application built with Shiny dashboards. By the end of this book, you’ll have learned how to create your first data model and build an application with Shiny in R.

7541
Ebook

Data Wrangling with SQL. A hands-on guide to manipulating, wrangling, and engineering data using SQL

Raghav Kandarpa, Shivangi Saxena

The amount of data generated continues to grow rapidly, making it increasingly important for businesses to be able to wrangle this data and understand it quickly and efficiently. Although data wrangling can be challenging, with the right tools and techniques you can efficiently handle enormous amounts of unstructured data. The book starts by introducing you to the basics of SQL, focusing on the core principles and techniques of data wrangling. You’ll then explore advanced SQL concepts like aggregate functions, window functions, CTEs, and subqueries that are very popular in the business world. The next set of chapters will walk you through different functions within a SQL query that cause delays in data transformation and help you figure out the difference between a good query and a bad one. You’ll also learn how data wrangling and data science go hand in hand. The book is filled with datasets and practical examples to help you understand the concepts thoroughly, along with best practices to guide you at every stage of data wrangling. By the end of this book, you’ll be equipped with essential techniques and best practices for data wrangling, and will have learned how to use clean and standardized data models to make informed decisions, helping businesses avoid costly mistakes.

7542
Ebook

Database Design and Modeling with Google Cloud. Learn database design and development to take your data to applications, analytics, and AI

Abirami Sukumaran, Priyanka Vergadia, Bagirathi Narayanan

In the age of lightning-speed delivery, customers want everything developed, built, and delivered at high speed and at scale. Knowledge, design, and choice of database are critical in that journey, but there is no one-size-fits-all solution. This book serves as a comprehensive and practical guide for data professionals who want to design and model their databases efficiently. The book begins by taking you through business, technical, and design considerations for databases. Next, it takes you on an immersive structured database deep dive for both transactional and analytical real-world use cases using Cloud SQL, Spanner, and BigQuery. As you progress, you’ll explore semi-structured and unstructured database considerations with practical applications using Firestore, Cloud Storage, and more. You’ll also find insights into operational considerations for databases and the database design journey for taking your data to AI with Vertex AI APIs and generative AI examples. By the end of this book, you will be well-versed in designing and modeling data and databases for your applications using Google Cloud.

7543
Ebook

Database Design and Modeling with PostgreSQL and MySQL. Build efficient and scalable databases for modern applications using open source databases

Alkin Tezuysal, Ibrar Ahmed, Peter Zaitsev

Database Design and Modeling with PostgreSQL and MySQL will equip you with the knowledge and skills you need to architect, build, and optimize efficient databases using two of the most popular open-source platforms. As you progress through the chapters, you'll gain a deep understanding of data modeling, normalization, and query optimization, supported by hands-on exercises and real-world case studies that will reinforce your learning. You'll explore topics like concurrency control, backup and recovery strategies, and seamless integration with web and mobile applications. These advanced topics will empower you to tackle complex database challenges confidently and effectively. Additionally, you’ll explore emerging trends, such as NoSQL databases and cloud-based solutions, ensuring you're well-versed in the latest developments shaping the database landscape. By embracing these cutting-edge technologies, you'll be prepared to adapt and innovate in today's ever-evolving digital world. By the end of this book, you’ll be able to understand the technologies that exist to design a modern and scalable database for developing web applications using MySQL and PostgreSQL open-source databases.

7544
Ebook

Database Security. Master the Art of Protecting Your Data with Cutting-Edge Techniques

Mercury Learning and Information, Christopher Diaz

This book provides a comprehensive guide to resolving database security issues during design, implementation, and production phases. It emphasizes specific measures and controls unique to database security, beyond general information security. Topics include account credential management, data access management, and techniques like database normalization, referential integrity, transactions, locks, and check constraints. The importance of database security lies in protecting sensitive data from unauthorized access and ensuring data integrity. This book is designed for professionals, workshops, and self-learners, offering hands-on demonstrations with major Database Management Systems (MySQL, Oracle, and Microsoft SQL Server) across various computing platforms (Linux/UNIX, macOS, Windows). Starting with an introduction to information, data, and database security, the book covers database design, management, administration, user accounts, privileges, roles, and security controls for confidentiality. It also delves into transactions and data integrity with concurrent access. Each chapter includes questions and projects to reinforce learning and comprehension.

7545
Ebook

Databricks Certified Associate Developer for Apache Spark Using Python. The ultimate guide to getting certified in Apache Spark using practical examples with Python

Saba Shah, Rod Waltermann

Spark has become a de facto standard for big data processing. Migrating data processing to Spark saves resources, streamlines your business focus, and modernizes workloads, creating new business opportunities through Spark’s advanced capabilities. Written by a senior solutions architect at Databricks, with experience in leading data science and data engineering teams in Fortune 500s as well as startups, this book is your exhaustive guide to achieving the Databricks Certified Associate Developer for Apache Spark certification on your first attempt. You’ll explore the core components of Apache Spark, its architecture, and its optimization, while familiarizing yourself with the Spark DataFrame API and its components needed for data manipulation. You’ll also find out what Spark streaming is and why it’s important for modern data stacks, before learning about machine learning in Spark and its different use cases. What’s more, you’ll discover sample questions at the end of each section along with two mock exams to help you prepare for the certification exam. By the end of this book, you’ll know what to expect in the exam and gain enough understanding of Spark and its tools to pass the exam. You’ll also be able to apply this knowledge in a real-world setting and take your skillset to the next level.

7546
Ebook

Databricks ML in Action. Learn how Databricks supports the entire ML lifecycle end to end from data ingestion to the model deployment

Stephanie Rivera, Anastasia Prokaieva, Amanda Baker, Hayley Horn

Discover what makes the Databricks Data Intelligence Platform the go-to choice for top-tier machine learning solutions. Written by a team of industry experts at Databricks with decades of combined experience in big data, machine learning, and data science, Databricks ML in Action presents cloud-agnostic, end-to-end examples with hands-on illustrations of executing data science, machine learning, and generative AI projects on the Databricks Platform. You’ll develop expertise in Databricks' managed MLflow, Vector Search, AutoML, Unity Catalog, and Model Serving as you learn to apply them practically in everyday workflows. This Databricks book not only offers detailed code explanations but also facilitates seamless code importation for practical use. You’ll discover how to leverage the open-source Databricks platform to enhance learning, boost skills, and elevate productivity with supplemental resources. By the end of this book, you'll have mastered the use of Databricks for data science, machine learning, and generative AI, enabling you to deliver outstanding data products.

7547
Ebook

Data-Centric Applications with Vaadin 8. Develop and maintain high-quality web applications using Vaadin

Alejandro Duarte

Vaadin is an open-source Java framework used to build modern user interfaces. Vaadin 8 simplifies application development and improves user experience. The book begins with an overview of the architecture of Vaadin applications and the way you can organize your code in modules. Then it moves to more advanced topics such as internationalization, authentication, authorization, and database connectivity. The book also teaches you how to implement CRUD views, how to generate printable reports, and how to manage data with lazy loading. By the end of this book, you will be able to architect, implement, and deploy stunning Vaadin applications, and have the knowledge to master web development with Vaadin.

7548
Ebook

Data-Centric Machine Learning with Python. The ultimate guide to engineering and deploying high-quality models based on good data

Jonas Christensen, Nakul Bajaj, Manmohan Gosada, Kirk D. Borne

In the rapidly advancing data-driven world where data quality is pivotal to the success of machine learning and artificial intelligence projects, this critically timed guide provides a rare, end-to-end overview of data-centric machine learning (DCML), along with hands-on applications of technical and non-technical approaches to generating deeper and more accurate datasets. This book will help you understand what data-centric ML/AI is and how it can help you to realize the potential of ‘small data’. Delving into the building blocks of data-centric ML/AI, you’ll explore the human aspects of data labeling, tackle ambiguity in labeling, and understand the role of synthetic data. From strategies to improve data collection to techniques for refining and augmenting datasets, you’ll learn everything you need to elevate your data-centric practices. Through applied examples and insights for overcoming challenges, you’ll get a roadmap for implementing data-centric ML/AI in diverse applications in Python. By the end of this book, you’ll have developed a profound understanding of data-centric ML/AI and the proficiency to seamlessly integrate common data-centric approaches in the model development lifecycle to unlock the full potential of your machine learning projects by prioritizing data quality and reliability.

7549
Ebook

Datadog Cloud Monitoring Quick Start Guide. Proactively create dashboards, write scripts, manage alerts, and monitor containers using Datadog

Thomas Kurian Theakanath

Datadog is an essential cloud monitoring and operational analytics tool that enables the monitoring of servers, virtual machines, containers, databases, third-party tools, and application services. IT and DevOps teams can easily leverage Datadog to monitor infrastructure and cloud services, and this book will show you how. The book starts by describing basic monitoring concepts and types of monitoring that are rolled out in a large-scale IT production engineering environment. Moving on, the book covers how standard monitoring features are implemented on the Datadog platform and how they can be rolled out in a real-world production environment. As you advance, you'll discover how Datadog is integrated with popular software components that are used to build cloud platforms. The book also provides details on how to use monitoring standards such as Java Management Extensions (JMX) and StatsD to extend the Datadog platform. Finally, you'll get to grips with monitoring fundamentals, learn how monitoring can be rolled out using Datadog proactively, and find out how to extend and customize the Datadog platform. By the end of this Datadog book, you will have gained the skills needed to monitor your cloud infrastructure and the software applications running on it using Datadog.

7550
Ebook
7551
Ebook

Daughters of Destiny

L. Frank Baum

Danger, intrigue, and adventure await you in one of L. Frank Baum's rarest works! Baum published the novel under the pen name Schuyler Staunton, one of his several pseudonyms (Baum arrived at the name by adding one letter to the name of his late maternal uncle, Schuyler Stanton). Daughters of Destiny unfolds in the Middle Eastern country of Baluchistan and is an exciting page-turner from start to finish. Conflict occurs when the American Construction Syndicate wants to build a railroad across a city in Pakistan, as part of their plans for global development. The company appoints a commission, headed by Colonel Piedmont Moore, to obtain the right of way from the Baluchi ruler. What follows is a complex but tightly woven plot that involves subterfuge and conspiracy, poisonings and attempted assassinations, sword fights and a pursuit in the desert, a scheming femme fatale, disguises and false identities: all the ingredients of melodrama.

7552
Ebook

Dave Dashaway, Air Champion. Or Wizard-Work in the Clouds

Roy Rockwood

Never was there a more clever young aviator than Dave Dashaway, and all up-to-date youths will surely wish to hear about him. In this, the last volume of the Dave Dashaway adventure series, Dave, with the assistance of his loyal chum Hiram Dobbs, makes several daring trips, and then enters a contest for a big prize. They are preparing for a new aerial contest, but competition is fierce, and dirty! An old enemy lurks in the shadows, sending spies and saboteurs. An aviation tale thrilling in the extreme. Add some new friends and a diamond thief into the mix, and you're in for another exciting Dave Dashaway adventure! Written by Weldon J. Cobb under the Stratemeyer Syndicate pseudonym Roy Rockwood. Highly entertaining literature written for young readers.