Big data
Developing Kaggle Notebooks. Pave your way to becoming a Kaggle Notebooks Grandmaster
Gabriel Preda, D. Sculley, Anthony Goldbloom
Developing Kaggle Notebooks introduces you to data analysis, with a focus on using Kaggle Notebooks to simultaneously achieve mastery in this field and rise to the top of the Kaggle Notebooks tier. The book is structured as a seven-step data analysis journey, exploring the features available in Kaggle Notebooks alongside various data analysis techniques. For each topic, we provide one or more notebooks, developing reusable analysis components through Kaggle's Utility Scripts feature, introduced progressively, initially as part of a notebook, and later extracted for use across future notebooks to enhance code reusability on Kaggle. It aims to make the notebooks' code more structured, easy to maintain, and readable. Although the focus of this book is on data analytics, some examples will guide you in preparing a complete machine learning pipeline using Kaggle Notebooks. Starting from initial data ingestion and data quality assessment, you'll move on to preliminary data analysis, advanced data exploration, feature qualification to build a model baseline, and feature engineering. You'll also delve into hyperparameter tuning to iteratively refine your model and prepare for submission in Kaggle competitions. Additionally, the book touches on developing notebooks that leverage the power of generative AI using Kaggle Models.
Bryon Kataoka, James Brennan, Ashish Aggarwal
IBM API Connect enables organizations to drive digital innovation using its scalable and robust API management capabilities across multi-cloud and hybrid environments. With API Connect's security, flexibility, and high performance, you'll be able to meet the needs of your enterprise and clients by extending your API footprint. This book provides a complete roadmap to create, manage, govern, and publish your APIs. You'll start by learning about API Connect components, such as API managers, developer portals, gateways, and analytics subsystems, as well as the management capabilities provided by CLI commands. You’ll then develop APIs using OpenAPI and discover how you can enhance them with logic policies. The book shows you how to modernize SOAP and FHIR REST services as secure APIs with authentication, OAuth2/OpenID, and JWT, and demonstrates how API Connect provides safeguards for GraphQL APIs and helps you publish APIs that are easy to discover and well documented. As you advance, the book guides you in generating unit tests that supplement DevOps pipelines using Git and Jenkins for improved agility, and concludes with best practices for implementing API governance and customizing API Connect components. By the end of this book, you'll have learned how to transform your business by speeding up the time-to-market of your products and increasing the ROI for your enterprise.
Srikumar Nair
Microsoft Dataverse for Teams is a built-in, low-code data platform for Teams that enables everyone to easily build and deploy apps, flows, and intelligent chatbots using Power Apps, Power Automate, and Power Virtual Agents (PVA) embedded in Microsoft Teams. Without learning any coding language, you will be able to build apps with step-by-step explanations for setting up Teams, creating tables to store data, and leveraging the data for your digital solutions. With the techniques covered in the book, you’ll be able to develop your first app with Dataverse for Teams within an hour! You’ll then learn how to automate repetitive tasks or build alerts using Power Automate and Power Virtual Agents. As you get to grips with building these digital solutions, you’ll also be able to understand when to consider upgrading from Dataverse for Teams to Dataverse, along with its advanced features. Finally, you’ll explore features for administration and governance and understand the licensing requirements of Microsoft Dataverse for Teams and Power Apps. Having acquired the skills to build and deploy an enterprise-grade digital solution, by the end of the book, you will have become a qualified citizen developer and be ready to lead a digital revolution in your organization.
Distributed Data Systems with Azure Databricks. Create, deploy, and manage enterprise data pipelines
Alan Bernardo Palacio
Microsoft Azure Databricks helps you to harness the power of distributed computing and apply it to create robust data pipelines, along with training and deploying machine learning and deep learning models. Databricks' advanced features enable developers to process, transform, and explore data. Distributed Data Systems with Azure Databricks will help you to put your knowledge of Databricks to work to create big data pipelines. The book provides a hands-on approach to implementing Azure Databricks and its associated methodologies that will make you productive in no time. Complete with detailed explanations of essential concepts, practical examples, and self-assessment questions, you’ll begin with a quick introduction to Databricks core functionalities, before performing distributed model training and inference using TensorFlow and Spark MLlib. As you advance, you’ll explore MLflow Model Serving on Azure Databricks and implement distributed training pipelines using HorovodRunner in Databricks. Finally, you’ll discover how to transform, use, and obtain insights from massive amounts of data to train predictive models and create entire fully working data pipelines. By the end of this MS Azure book, you’ll have gained a solid understanding of how to work with Databricks to create and manage an entire big data pipeline.
Guanhua Wang
Reducing time cost in machine learning leads to a shorter waiting time for model training and a faster model updating cycle. Distributed machine learning enables machine learning practitioners to shorten model training and inference time by orders of magnitude. With the help of this practical guide, you'll be able to put your Python development knowledge to work to get up and running with the implementation of distributed machine learning, including multi-node machine learning systems, in no time. You'll begin by exploring how distributed systems work in the machine learning area and how distributed machine learning is applied to state-of-the-art deep learning models. As you advance, you'll see how to use distributed systems to enhance machine learning model training and serving speed. You'll also get to grips with applying data parallel and model parallel approaches before optimizing the in-parallel model training and serving pipeline in local clusters or cloud environments. By the end of this book, you'll have gained the knowledge and skills needed to build and deploy an efficient data processing pipeline for machine learning model training and inference in a distributed manner.
Dodaj mocy Power BI! Jak za pomocą kodu w Pythonie i R pobierać, przekształcać i wizualizować dane (Add Power to Power BI! How to Ingest, Transform, and Visualize Data with Python and R Code)
Luca Zavarella, Francesca Lazzeri
An important task for data engineers is building machine learning models. Business intelligence tools such as Power BI are used for this. Power BI's capabilities are impressive, and they can be extended further. One of the more interesting ways to enrich a Power BI data model and its visualizations is to apply complex algorithms implemented in Python and R. This lets you not only create compelling data visualizations but also use them to extract business-critical information. This book shows you how. You will start by preparing the Power BI environment to run Python and R scripts. Next, you will import data from unsupported objects and transform it using regular expressions and complex algorithms. You will learn to call external APIs and to use advanced techniques to perform in-depth analyses and extract valuable information with statistical and machine learning tools, as well as with linear optimization and other algorithms. You will also get to know the main statistical features of datasets and methods for creating various charts that make relationships between variables easier to understand. Highlights: complex data transformations in Power BI using Python and R scripts; data anonymization and pseudonymization; working with large datasets; outliers and missing values in multivariate and time-series data; creating complex data visualizations. Unleash the full power of Power BI!
Don't Fear the Spreadsheet. A Beginner's Guide to Overcoming Excel's Frustrations
MrExcel's Holy Macro! Books, Tyler Nash, Bill...
This book is written in an easy-to-follow question-and-answer format, specifically designed for complete Excel beginners. Focusing on the extreme basics of using spreadsheets, it avoids overwhelming readers with advanced topics and instead builds a foundational understanding. Readers will quickly gain a passable knowledge of the program, addressing common fears and frustrations through clear explanations and practical examples. The guide answers hundreds of everyday questions, such as "Can I delete data without changing formatting?" and "How do I use text-wrapping?" as well as slightly more advanced queries like "What is a Macro, and how do I create one?" It empowers users by breaking down intimidating concepts into manageable steps, making Excel approachable and useful for even the most inexperienced users. The focus is on helping readers become comfortable with essential tasks, from merging cells and formatting text to understanding formulas and navigating the interface. Aimed at the 40 percent of Excel users who have never entered a formula, this book demystifies the program's tools and functions, transforming confusion into confidence. By the end, readers will feel equipped to use Excel effectively for personal and professional tasks, overcoming barriers to productivity.
Andrew Jones, Kevin Hu
Despite the passage of time and the evolution of technology and architecture, the challenges we face in building data platforms persist. Our data often remains unreliable, lacks trust, and fails to deliver the promised value. With Driving Data Quality with Data Contracts, you’ll discover the potential of data contracts to transform how you build your data platforms, finally overcoming these enduring problems. You’ll learn how establishing contracts as the interface allows you to explicitly assign responsibility and accountability of the data to those who know it best (the data generators) and give them the autonomy to generate and manage data as required. The book will show you how data contracts ensure that consumers get quality data with clearly defined expectations, enabling them to build on that data with confidence to deliver valuable analytics, performant ML models, and trusted data-driven products. By the end of this book, you’ll have gained a comprehensive understanding of how data contracts can revolutionize your organization’s data culture and provide a competitive advantage by unlocking the real value within your data.
Dziennikarstwo danych i data storytelling (Data Journalism and Data Storytelling)
Łukasz Żyła
Without data, you are just another person with an opinion... Data journalism is flourishing today. That is because much of our lives has moved to the internet, and the internet means... data. Megabytes, gigabytes, terabytes of data. The mission of today's journalist is to present that data to the public reliably and, at the same time, beautifully: in a way that is understandable and easy to absorb. Before data can be presented beautifully, however, it has to be found. Where should you look? How do you obtain it? How do you tell a story with data? The author answers these questions in this book. You will not read about "pretty charts" here, because contrary to appearances they are not the essence of data journalism and data storytelling. Instead, you will learn where to find the sources of the information you need and how to process and analyze it. You will also find guidance on creating good visualizations with simple applications available for free online and on crafting data stories that engage audiences. Finally, you will move up a level and learn to present data using programming code. Who? What? How? Where? When? Every journalist who wants to do the job reliably has to find answers to these basic questions. At the same time, amid a flood of information and data from sources whose verification is just as time-consuming, everyone practicing this beautiful profession increasingly resembles the mythical Sisyphus. Wading through gigabytes of information, processing it, and producing material that explains reality to the audience is today an undertaking that demands enormous effort and carries even greater risk. The cascading decline of trust in public and private institutions that we have seen for years also affects the media, which are on the one hand exposed to a range of business, political, and social pressures and on the other struggle with constant financial problems.
What is worth knowing is that good journalism, quality journalism, requires authors to move freely through the space of the internet and data, and to learn the basics of operating in that space. So if we want to have at least a glimmer of hope of doing the job well, it is worth reaching for Łukasz Żyła's book. In this profession I was always told that you learn it only in practice and certainly not at university. That is still true, although times in which media outlets are literally sprouting at every turn and drawing in ever younger aspiring journalists call for an information capsule, a kind of mine detector, so that the beginning journalist can take those first steps in relative safety. Dziennikarstwo danych i data storytelling (Data Journalism and Data Storytelling) is also a book for people experienced in the profession. The reason is obvious: technology has changed journalism, and in the rush of the force of nature that it is, it is easy to fall into a safe and therefore deceptive routine, and then we are one step away from a serious mistake. Thanks to Łukasz Żyła's book, the digital reefs that litter the web will be easier to avoid. Bartosz Kurek, former Polsat journalist, currently public affairs manager at Philip Morris. "What do you actually do over there?" is a question I often hear when I say I work in the data department of „Wyborcza”. Some respond knowingly: "Ah, so you analyze the newspaper's sales figures?" Others change the subject, expecting me to bombard them with boring stories about filling tables with numbers. Interestingly, journalists also ask what exactly our work looks like. Now, instead of going into detail, I will be able to begin my answer with the words: "There is this book, it's worth reading…", because Łukasz explains in a very accessible way what it is all about. And I think that whatever area of journalism you work in, you will find something in it for yourself.
The parts on working with officials, access to information, and storytelling should be absorbed by everyone who will work in the profession. Those on data processing will be picked up by the more ambitious, or perhaps simply the more far-sighted, because many people can write, but the ability to write combined with skill in analysis, programming, or visualization turns a journalist into a person for special assignments. While reading this book, I often regretted that nothing like it existed when I was starting my adventure with data. Thanks to it, I can see how much I still have to learn in this field. Dominik Uhlig, head of BIQdata.pl, the data department of „Gazeta Wyborcza”
Effective Amazon Machine Learning. Expert web services for machine learning on cloud
Alexis Perrier
Predictive analytics is a complex domain requiring coding skills, an understanding of the mathematical concepts underpinning machine learning algorithms, and the ability to create compelling data visualizations. Building on AWS's simplification of machine learning, this book will help you bring predictive analytics projects to fruition in three easy steps: data preparation, model tuning, and model selection. This book will introduce you to the Amazon Machine Learning platform and will implement core data science concepts such as classification, regression, regularization, overfitting, model selection, and evaluation. Furthermore, you will learn to leverage the Amazon Web Services (AWS) ecosystem for extended access to data sources, implement real-time predictions, and run Amazon Machine Learning projects via the command line and the Python SDK. Towards the end of the book, you will also learn how to apply these services to other problems, such as text mining, and to more complex datasets.
Effective Business Intelligence with QuickSight. Boost your business IQ with Amazon QuickSight
Rajesh Nadipalli
Amazon QuickSight is the next-generation Business Intelligence (BI) cloud service that can help you build interactive visualizations on top of various data sources hosted on Amazon Cloud Infrastructure. QuickSight delivers responsive insights into big data and enables organizations to quickly democratize data visualizations and scale to hundreds of users at a fraction of the cost when compared to traditional BI tools. This book begins with an introduction to Amazon QuickSight, feature differentiators from traditional BI tools, and how it fits in the overall AWS big data ecosystem. With practical examples, you will find tips and techniques to load your data to AWS, prepare it, and finally visualize it using QuickSight. You will learn how to build interactive charts, reports, dashboards, and stories using QuickSight and share with others using just your browser and mobile app. The book also provides a blueprint to build a real-life big data project on top of AWS Data Lake Solution and demonstrates how to build a modern data lake on the cloud with governance, data catalog, and analysis. It reviews the current product shortcomings, features in the roadmap, and how to provide feedback to AWS. Grow your profits, improve your products, and beat your competitors.
Ankur Jain
Organizations are moving their applications, data, and processes to the cloud to reduce application costs, effort, and maintenance. However, adopting new technology poses challenges for developers, solutions architects, and designers due to a lack of knowledge and appropriate practical training resources. This book helps you get to grips with Oracle Visual Builder (VB) and enables you to quickly develop web and mobile applications and deploy them to production without hassle. This book will provide you with a solid understanding of VB so that you can adopt it at a faster pace and start building applications right away. After working with real-time examples to learn about VB, you'll discover how to design, develop, and deploy web and mobile applications quickly. You'll cover all the VB components in-depth, including web and mobile application development, business objects, and service connections. In order to use all these components, you'll also explore best practices, security, and recommendations, which are well explained within the chapters. Finally, this book will help you gain the knowledge you need to enhance the performance of an application before deploying it to production. By the end of this book, you will be able to work independently and deploy your VB applications efficiently and with confidence.
Ekstrakcja danych w Pythonie. Teoria i praktyka (Data Extraction in Python. Theory and Practice)
Piotr Rybka
Data: load, process, analyze. Data extraction is the process of obtaining information from various sources, usually so that it can then be transformed and analyzed further. The ability to acquire, merge, filter, and process data in various ways is useful not only to professional analysts. Being able to navigate the world of data is also highly sought after among people working in IT departments and in managerial positions. Whoever has the data has the knowledge and gains an edge over the competition! If you want to explore the theory of data extraction and gain practical skills for working with data in Python, this textbook should be required reading. With this book you will, among other things: master the basic concepts needed when working with datasets; understand the specifics of binary and text files; learn what text encoding involves; get to know regular expressions; find out which data interchange formats are available in Python; learn to search markup documents; and become familiar with schemas for data interchange formats.
Elasticsearch Indexing. How to Improve User's Search Experience
Huseyin Akdogan
Beginning with an overview of the way ElasticSearch stores data, you’ll begin to extend your knowledge to tackle indexing and mapping, and learn how to configure ElasticSearch to meet your users’ needs. You’ll then find out how to use analysis and analyzers for greater intelligence in how you organize and pull up search results – to guarantee that every search query is met with the relevant results! You’ll explore the anatomy of an ElasticSearch cluster, and learn how to set up configurations that give you optimum availability as well as scalability. Once you’ve learned how these elements work, you’ll find real-world solutions to help you improve indexing performance, as well as tips and guidance on safety so you can back up and restore data. Once you’ve learned each component outlined throughout, you will be confident that you can help to deliver an improved search experience – exactly what modern users demand and expect.
Elasticsearch 5.x Cookbook. Distributed Search and Analytics - Third Edition
Alberto Paro
Elasticsearch is a Lucene-based distributed search server that allows users to index and search unstructured content with petabytes of data. This book is your one-stop guide to master the complete Elasticsearch ecosystem. We’ll guide you through comprehensive recipes on what’s new in Elasticsearch 5.x, showing you how to create complex queries and analytics, and perform index mapping, aggregation, and scripting. Further on, you will explore the modules of Cluster and Node monitoring and see ways to back up and restore a snapshot of an index. You will understand how to install Kibana to monitor a cluster and also how to extend Kibana with plugins. Finally, you will also see how you can integrate your Java, Scala, Python, and Big Data applications such as Apache Spark and Pig with Elasticsearch, and add enhanced functionalities with custom plugins. By the end of this book, you will have an in-depth knowledge of the implementation of the Elasticsearch architecture and will be able to manage data efficiently and effectively with Elasticsearch.
Anurag Srivastava, Douglas Miller
Elasticsearch is one of the most popular tools for distributed search and analytics. This Elasticsearch book highlights the latest features of Elasticsearch 7 and helps you understand how you can use them to build your own search applications with ease. Starting with an introduction to the Elastic Stack, this book will help you quickly get up to speed with using Elasticsearch. You'll learn how to install, configure, manage, secure, and deploy Elasticsearch clusters, as well as how to use your deployment to develop powerful search and analytics solutions. As you progress, you'll also understand how to troubleshoot any issues that you may encounter along the way. Finally, the book will help you explore the inner workings of Elasticsearch and gain insights into queries, analyzers, mappings, and aggregations as you learn to work with search results. By the end of this book, you'll have a basic understanding of how to build and deploy effective search and analytics solutions using Elasticsearch.
Bharvi Dixit
With constantly evolving and growing datasets, organizations need to find actionable insights for their business. ElasticSearch, which is the world's most advanced search and analytics engine, brings the ability to make massive amounts of data usable in a matter of milliseconds. It not only gives you the power to build blazing fast search solutions over a massive amount of data, but can also serve as a NoSQL data store. This guide will take you on a tour to quickly become a competent developer with a solid understanding of the ElasticSearch core concepts. Starting from the beginning, this book will cover these core concepts, setting up ElasticSearch and various plugins, working with analyzers, and creating mappings. This book provides complete coverage of working with ElasticSearch using Python and performing CRUD operations and aggregation-based analytics, handling document relationships in the NoSQL world, working with geospatial data, and taking data backups. Finally, we’ll show you how to set up and scale ElasticSearch clusters in production environments, along with some best practices.
Marek Rogozinski, Rafal Kuc
ElasticSearch is a very fast and scalable open source search engine, designed with distribution and cloud in mind, complete with all the goodies that Apache Lucene has to offer. ElasticSearch’s schema-free architecture allows developers to index and search unstructured content, making it perfectly suited for both small projects and large big data warehouses, even those with petabytes of unstructured data. This book will guide you through the world of the most commonly used ElasticSearch server functionalities. You’ll start off by getting an understanding of the basics of ElasticSearch and its data indexing functionality. Next, you will see the querying capabilities of ElasticSearch, followed by a thorough explanation of scoring and search relevance. After this, you will explore the aggregation and data analysis capabilities of ElasticSearch and will learn how cluster administration and scaling can be used to boost your application performance. You’ll find out how to use the friendly REST APIs and how to tune ElasticSearch to make the most of it. By the end of this book, you will be able to create amazing search solutions as per your project’s specifications.
Emotional Intelligence for IT Professionals. The must-have guide for a successful career in IT
Emília M. Ludovino
This book will help you discover your emotional quotient (EQ) through practices and techniques that are used by the most successful IT people in the world. It will make you familiar with the core skills of Emotional Intelligence, such as understanding the role that emotions play in life, especially in the workplace. You will learn to identify the factors that make your behavior consistent, not just to other employees, but to yourself. This includes recognizing, harnessing, predicting, fostering, valuing, soothing, increasing, decreasing, managing, shifting, influencing, or turning around emotions and integrating accurate emotional information into decision-making, reasoning, problem solving, and so on, because emotions run business in a way that spreadsheets and logic cannot. When a deadline looms, you’ll know the steps you need to take to keep calm and composed. You’ll find out how to meet the deadline and not get bogged down by stress. We’ll explain these factors and techniques through real-life situations faced by IT employees, and you’ll learn from the choices that they made. This book will give you a detailed analysis of the events and the behavioral patterns of the employees during that time. This will help you improve your own EQ to the extent that you don’t just survive, but thrive in a competitive IT industry.
Nicolae Tarla
Power Virtual Agents is a set of technologies released under the Power Platform umbrella by Microsoft. It allows non-developers to create solutions to automate customer interactions and provide services using a conversational interface, thus relieving the pressure on front-line staff providing this kind of support. Empowering Organizations with Power Virtual Agents is a guide to building chatbots that can be deployed to handle front desk services without having to write code. The book takes a scenario-based approach to implementing bot services and automation to serve employees in the organization and external customers. You will uncover the features available in Power Virtual Agents for creating bots that can be integrated into an organization’s public site as well as specific web pages. Next, you will understand how to build bots and integrate them within the Teams environment for internal users. As you progress, you will explore complete examples for implementing automated agents (bots) that can be deployed on sites for interacting with external customers. By the end of this Power Virtual Agents chatbot book, you will have implemented several scenarios to serve external client requests for information, created scenarios to help internal users retrieve relevant information, and processed these in an automated conversational manner.
Emmanuel Raj
Engineering MLOps presents comprehensive insights into MLOps coupled with real-world examples in Azure to help you to write programs, train robust and scalable ML models, and build ML pipelines to train and deploy models securely in production. The book begins by familiarizing you with the MLOps workflow so you can start writing programs to train ML models. You’ll then move on to explore options for serializing and packaging ML models post-training to deploy them to facilitate machine learning inference, model interoperability, and end-to-end model traceability. You’ll learn how to build ML pipelines, continuous integration and continuous delivery (CI/CD) pipelines, and monitoring pipelines to systematically build, deploy, monitor, and govern ML solutions for businesses and industries. Finally, you’ll apply the knowledge you’ve gained to build real-world projects. By the end of this ML book, you'll have a 360-degree view of MLOps and be ready to implement MLOps in your organization.
Matt Benatan, Jochem Gietema, Marian Schneider
Deep learning has an increasingly significant impact on our lives, from suggesting content to playing a key role in mission- and safety-critical applications. As the influence of these algorithms grows, so does the concern for the safety and robustness of the systems which rely on them. Simply put, typical deep learning methods do not know when they don’t know. The field of Bayesian Deep Learning contains a range of methods for approximate Bayesian inference with deep networks. These methods help to improve the robustness of deep learning systems as they tell us how confident they are in their predictions, allowing us to take more care in how we incorporate model predictions within our applications. Through this book, you will be introduced to the rapidly growing field of uncertainty-aware deep learning, developing an understanding of the importance of uncertainty estimation in robust machine learning systems. You will learn about a variety of popular Bayesian Deep Learning methods, and how to implement these through practical Python examples covering a range of application scenarios. By the end of the book, you will have a good understanding of Bayesian Deep Learning and its advantages, and you will be able to develop Bayesian Deep Learning models for safer, more robust deep learning systems.
Dipayan Sarkar, Vijayalakshmi Natarajan
Ensemble modeling is an approach used to improve the performance of machine learning models. It combines two or more similar or dissimilar machine learning algorithms to deliver superior predictive performance. This book will help you to implement popular machine learning algorithms to cover different paradigms of ensemble machine learning such as boosting, bagging, and stacking. The Ensemble Machine Learning Cookbook will start by getting you acquainted with the basics of ensemble techniques and exploratory data analysis. You'll then learn to implement tasks related to statistical and machine learning algorithms to understand the ensemble of multiple heterogeneous algorithms. It will also ensure that you don't miss out on key topics, such as resampling methods. As you progress, you’ll get a better understanding of bagging, boosting, stacking, and working with the Random Forest algorithm using real-world examples. The book will highlight how these ensemble methods use multiple models to improve machine learning results, as compared to a single model. In the concluding chapters, you'll delve into advanced ensemble models using neural networks, natural language processing, and more. You’ll also be able to implement models for tasks such as fraud detection, text categorization, and sentiment analysis. By the end of this book, you'll be able to harness ensemble techniques and the working mechanisms of machine learning algorithms to build intelligent models using individual recipes.
Ryan Doan
The rapid advancements in large language models (LLMs) bring significant challenges in deployment, maintenance, and scalability. This Essential Guide to LLMOps provides practical solutions and strategies to overcome these challenges, ensuring seamless integration and the optimization of LLMs in real-world applications. This book takes you through the historical background, core concepts, and essential tools for data analysis, model development, deployment, maintenance, and governance. You’ll learn how to streamline workflows, enhance efficiency in LLMOps processes, employ LLMOps tools for precise model fine-tuning, and address the critical aspects of model review and governance. You’ll also get to grips with the practices and performance considerations that are necessary for the responsible development and deployment of LLMs. The book equips you with insights into model inference, scalability, and continuous improvement, and shows you how to implement these in real-world applications. By the end of this book, you’ll have learned the nuances of LLMOps, including effective deployment strategies, scalability solutions, and continuous improvement techniques, equipping you to stay ahead in the dynamic world of AI.