Big data
Valery Manokhin, Agus Sudjianto
In the rapidly evolving landscape of machine learning, the ability to accurately quantify uncertainty is pivotal. Practical Guide to Applied Conformal Prediction in Python addresses this need by offering an in-depth exploration of Conformal Prediction, a cutting-edge framework set to revolutionize uncertainty management in various ML applications. Embark on a comprehensive journey through Conformal Prediction, exploring its fundamentals and practical applications in binary classification, regression, time series forecasting, imbalanced data, computer vision, and NLP. Each chapter delves into specific aspects, offering hands-on insights and best practices for enhancing prediction reliability. The book concludes with a focus on multi-class classification nuances, providing expert-level proficiency to seamlessly integrate Conformal Prediction into diverse industries. Practical examples in Python using real-world datasets reinforce intuitive explanations, ensuring you acquire a robust understanding of this modern framework for uncertainty quantification. This guide is a beacon for mastering Conformal Prediction in Python, providing a blend of theory and practical application. It serves as a comprehensive toolkit to enhance machine learning skills, catering to professionals from data scientists to ML engineers.
Practical Machine Learning Cookbook. Supervised and unsupervised machine learning simplified
Atul Tripathi
Machine learning has become the new black. The challenge in today’s world is the explosion of data from existing legacy data and incoming new structured and unstructured data. The complexity of discovering, understanding, performing analysis, and predicting outcomes on the data using machine learning algorithms is a challenge. This cookbook will help solve everyday challenges you face as a data scientist. Applying various data science techniques to multiple data sets based on real-world challenges will help you appreciate the variety of techniques used in different situations. The first half of the book provides recipes on fairly complex machine-learning systems, where you’ll learn to explore new areas of applications of machine learning and improve its efficiency. That includes recipes on classifications, neural networks, unsupervised and supervised learning, deep learning, reinforcement learning, and more. The second half of the book focuses on three different machine learning case studies, all based on real-world data, and offers solutions to specific machine-learning issues in each one.
Brindha Priyadarshini Jeyaraman, Ludvig Renbo Olsen, Monicah...
With huge amounts of data being generated every moment, businesses need applications that apply complex mathematical calculations to data repeatedly and at speed. With machine learning techniques and R, you can easily develop these kinds of applications in an efficient way. Practical Machine Learning with R begins by helping you grasp the basics of machine learning methods, while also highlighting how and why they work. You will understand how to get these algorithms to work in practice, rather than focusing on mathematical derivations. As you progress from one chapter to another, you will gain hands-on experience of building a machine learning solution in R. Next, using R packages such as rpart, random forest, and multiple imputation by chained equations (MICE), you will learn to implement algorithms including neural net classifier, decision trees, and linear and non-linear regression. As you progress through the book, you’ll delve into various machine learning techniques for both supervised and unsupervised learning approaches. In addition to this, you’ll gain insights into partitioning the datasets and mechanisms to evaluate the results from each model and be able to compare them. By the end of this book, you will have gained expertise in solving your business problems, starting by forming a good problem statement, selecting the most appropriate model to solve your problem, and then ensuring that you do not overtrain it.
Ralph Winters
This is the go-to book for anyone interested in the steps needed to develop predictive analytics solutions with examples from the world of marketing, healthcare, and retail. We'll get started with a brief history of predictive analytics and learn about the different roles and functions people play within a predictive analytics project. Then, we will learn about various ways of installing R along with their pros and cons, combined with a step-by-step installation of RStudio and a description of the best practices for organizing your projects. On completing the installation, we will begin to acquire the skills necessary to input, clean, and prepare your data for modeling. We will learn the six specific steps needed to implement and successfully deploy a predictive model, starting from asking the right questions, through model development, and ending with deploying your predictive model into production. We will learn why collaboration is important and how agile iterative modeling cycles can increase your chances of developing and deploying a successful model. We will continue the journey in the cloud, extending your skill set with Databricks and SparkR, which allow you to develop predictive models on many gigabytes of data.
Shilpi Saxena, Saurabh Gupta
With the rise of Big Data, there is an increasing need to process large amounts of data continuously, with a shorter turnaround time. Real-time data processing involves continuous input, processing and output of data, with the condition that the time required for processing is as short as possible. This book covers the majority of the existing and evolving open source technology stack for real-time processing and analytics. You will get to know about all the real-time solution aspects, from the source to the presentation to persistence. Through this practical book, you’ll be equipped with a clear understanding of how to solve challenges on your own. We’ll cover topics such as how to set up components, basic executions, integrations, advanced use cases, alerts, and monitoring. You’ll be exposed to the popular tools used in real-time processing today such as Apache Spark, Apache Flink, and Storm. Finally, you will put your knowledge to practical use by implementing all of the techniques in the form of a practical, real-world use case. By the end of this book, you will have a solid understanding of all the aspects of real-time data processing and analytics, and will know how to deploy the solutions in production environments in the best possible manner.
Pethuru Raj Chelliah, Shreyash Naithani, Shailender Singh
Site reliability engineering (SRE) is being touted as the most competent paradigm in establishing and ensuring next-generation high-quality software solutions. This book starts by introducing you to the SRE paradigm and covers the need for highly reliable IT platforms and infrastructures. As you make your way through the next set of chapters, you will learn to develop microservices using Spring Boot and make use of RESTful frameworks. You will also learn about GitHub for deployment, containerization, and Docker containers. Practical Site Reliability Engineering teaches you to set up and sustain containerized cloud environments, and also covers architectural and design patterns and reliability implementation techniques such as reactive programming, and languages such as Ballerina and Rust. In the concluding chapters, you will become well-versed in service mesh solutions such as Istio and Linkerd, and understand service resilience test practices, API gateways, and edge/fog computing. By the end of this book, you will have gained experience working with SRE concepts and be able to deliver highly reliable apps and services.
Valentina Costa-Gazcón
Threat hunting (TH) provides cybersecurity analysts and enterprises with the opportunity to proactively defend themselves by getting ahead of threats before they can cause major damage to their business. This book is not only an introduction for those who don’t know much about the cyber threat intelligence (CTI) and TH world, but also a guide for those with more advanced knowledge of other cybersecurity fields who are looking to implement a TH program from scratch. You will start by exploring what threat intelligence is and how it can be used to detect and prevent cyber threats. As you progress, you’ll learn how to collect data, along with understanding it by developing data models. The book will also show you how to set up an environment for TH using open source tools. Later, you will focus on how to plan a hunt with practical examples, before going on to explore the MITRE ATT&CK framework. By the end of this book, you’ll have the skills you need to be able to carry out effective hunts in your own environment.
Avishek Pal, PKS Prakash
Time Series Analysis allows us to analyze data that is generated over a period of time and has sequential interdependencies between the observations. This book describes special mathematical tricks and techniques geared towards exploring the internal structures of time series data and generating powerful descriptive and predictive insights. The book is also full of real-life examples of time series and their analyses using cutting-edge solutions developed in Python. It starts with descriptive analysis to create insightful visualizations of internal structures such as trend, seasonality, and autocorrelation. Next, the statistical methods for dealing with autocorrelation and non-stationary time series are described. This is followed by exponential smoothing to produce meaningful insights from noisy time series data. At this point, the focus shifts towards predictive analysis, introducing autoregressive models such as ARMA and ARIMA for time series forecasting. Later, powerful deep learning methods are presented for developing accurate forecasting models for complex time series, even when little domain knowledge is available. All the topics are illustrated with real-life problem scenarios and their solutions by best-practice implementations in Python. The book concludes with an appendix that briefly discusses programming and solving data science problems using Python.
Practical Machine Learning in R (Praktyczne uczenie maszynowe w języku R)
Fred Nwanganga, Mike Chapple
AN INTRODUCTION TO MACHINE LEARNING USING THE INTUITIVE R LANGUAGE. Machine learning and data analysis play an increasingly important role in creating added value. Machine learning makes it possible to find relationships hidden in data, contributing new ideas and knowledge that would be hard to obtain without this advanced technique. Practical Machine Learning in R is an introductory preparation for working with large data sets in R, a language that is easy to understand and was designed specifically for statistical analysis. Even readers with no programming experience can benefit from this book, learning how practical applications of machine learning let data analysts obtain strategic business information, produce solid forecasts, and make better decisions. Unlike other books on the subject, Practical Machine Learning in R offers both a theoretical and a technical introduction to machine learning. The examples and exercises use the R programming language and the latest data analysis tools, which lets you get started without delving too deeply into advanced mathematics. With this book, machine learning techniques, from logistic regression to association rules and cluster analysis, are within reach. It is the only publication that combines an intuitive introduction to machine learning with step-by-step descriptions of technical applications. Practical Machine Learning in R shows how to: grasp the concepts behind different types of machine learning, discover patterns hidden in large data sets, write and run R scripts in RStudio, use R together with the Tidyverse packages to manage and visualize data, apply basic statistical techniques such as logistic regression and the naive Bayes classifier, and evaluate and improve machine learning models.
DR FRED NWANGANGA is a professor of Business Analytics at the Mendoza College of Business at the University of Notre Dame, Indiana, USA. He has more than 15 years of experience serving as a technical leader. DR MIKE CHAPPLE is a professor of Technology, Analytics, and Operations at the Mendoza College of Business. Mike is the author of more than 25 popular books and serves as academic director of the master's program in business analytics.
Practical Unsupervised Learning Using Python (Praktyczne uczenie nienadzorowane przy użyciu języka Python)
Ankur A. Patel
Many industry experts consider unsupervised learning the next frontier in artificial intelligence, one that may hold the key to general artificial intelligence. Because most of the world's data is unlabeled, conventional supervised learning cannot be applied to it. Unsupervised learning, by contrast, can be applied to unlabeled data sets to discover meaningful patterns buried deep in the data, patterns that may be nearly impossible for humans to uncover. Author Ankur Patel shows how to apply unsupervised learning using two simple Python frameworks: Scikit-learn and TensorFlow (with Keras). With the included code and hands-on examples, data scientists will be able to identify hard-to-find patterns in data and uncover deeper business insight, detect anomalies, perform automatic feature selection, and generate synthetic data sets. All you need is programming knowledge and some machine learning experience to get started with: comparing the strengths and weaknesses of different approaches to machine learning (supervised, unsupervised, and reinforcement learning); setting up and managing machine learning projects; building an anomaly detection system to catch credit card fraud; clustering users into distinct and homogeneous groups; performing semi-supervised learning; developing movie recommender systems using restricted Boltzmann machines; and generating synthetic images using generative adversarial networks. "Researchers, engineers, and students will appreciate this book, full of practical unsupervised learning techniques and written in plain language with straightforward Python examples that can be implemented quickly and effectively." (Sarah Nagy, Lead Data Scientist at Edison)
Ankur A. Patel is vice president of data science at 7Park Data, a company backed by the investment firm Vista Equity Partners. At 7Park Data, Ankur and his data science team use alternative data to build data products for hedge funds and corporations and develop machine learning services for enterprise clients.
Sinan Ozdemir
Principles of Data Science bridges mathematics, programming, and business analysis, empowering you to confidently pose and address complex data questions and construct effective machine learning pipelines. This book will equip you with the tools to transform abstract concepts and raw statistics into actionable insights. Starting with cleaning and preparation, you’ll explore effective data mining strategies and techniques before moving on to building a holistic picture of how every piece of the data science puzzle fits together. Throughout the book, you’ll discover statistical models with which you can control and navigate even the densest or the sparsest of datasets and learn how to create powerful visualizations that communicate the stories hidden in your data. With a focus on application, this edition covers advanced transfer learning and pre-trained models for NLP and vision tasks. You’ll get to grips with advanced techniques for mitigating algorithmic bias in data as well as models and addressing model and data drift. Finally, you’ll explore medium-level data governance, including data provenance, privacy, and deletion request handling. By the end of this data science book, you'll have learned the fundamentals of computational mathematics and statistics, all while navigating the intricacies of modern ML and large pre-trained models like GPT and BERT.
Principles of Data Science. Mathematical techniques and theory to succeed in data-driven industries
Sinan Ozdemir
Need to turn your skills at programming into effective data science skills? Principles of Data Science is created to help you join the dots between mathematics, programming, and business analysis. With this book, you’ll feel confident about asking, and answering, complex and sophisticated questions of your data to move from abstract and raw statistics to actionable ideas. With a unique approach that bridges the gap between mathematics and computer science, this book takes you through the entire data science pipeline. Beginning with cleaning and preparing data, and effective data mining strategies and techniques, you’ll move on to build a comprehensive picture of how every piece of the data science puzzle fits together. Learn the fundamentals of computational mathematics and statistics, as well as some pseudocode being used today by data scientists and analysts. You’ll get to grips with machine learning, discover the statistical models that help you take control and navigate even the densest datasets, and find out how to create powerful visualizations that communicate what your data means.
Sinan Ozdemir, Sunil Kakade, Marco Tibaldeschi
Need to turn programming skills into effective data science skills? This book helps you connect mathematics, programming, and business analysis. You’ll feel confident asking, and answering, complex, sophisticated questions of your data, turning abstract and raw statistics into actionable ideas. Going through the data science pipeline, you'll clean and prepare data and learn effective data mining strategies and techniques to gain a comprehensive view of how the data science puzzle fits together. You’ll learn the fundamentals of computational mathematics and statistics and the pseudocode used by data scientists and analysts. You’ll get to grips with machine learning, discover statistical models that help you control and navigate even the densest datasets, and learn to create powerful visualizations that communicate what your data means.
Principles of Strategic Data Science. Creating value from data, big and small
Peter Prevos
Mathematics and computer science form an integral part of data science, and understanding them is crucial for efficiently managing data. This book is designed to take you through the entire data science pipeline and help you join the dots between mathematics, programming, and business analysis. You’ll start by learning what data science is and how organizations can use it to revolutionize the way they use their data. The book then covers the criteria for the soundness of data products and demonstrates how to effectively visualize information. As you progress, you’ll discover the strategic aspects of data science by exploring the five-phase framework that enables you to enhance the value you extract from data. Toward the concluding chapters, you’ll understand the role of a data science manager in helping an organization take the data-driven approach. By the end of this book, you’ll have a good understanding of data science and how it can enable you to extract value from your data.
Srinivasa Rao Aravilli, Sam Hamilton
In an era of evolving privacy regulations, compliance is mandatory for every enterprise. Machine learning engineers face the dual challenge of analyzing vast amounts of data for insights while protecting sensitive information. This book addresses the complexities arising from large data volumes and the scarcity of in-depth privacy-preserving machine learning expertise, and covers a comprehensive range of topics from data privacy and machine learning privacy threats to real-world privacy-preserving cases. As you progress, you’ll be guided through developing anti-money laundering solutions using federated learning and differential privacy. Dedicated sections explore data in-memory attacks and strategies for safeguarding data and ML models. You’ll also explore the imperative nature of confidential computation and privacy-preserving machine learning benchmarks, as well as frontier research in the field. Upon completion, you’ll possess a thorough understanding of privacy-preserving machine learning, equipping you to effectively shield data from real-world threats and attacks.
MrExcel's Holy Macro! Books, Eduardo N Sanchez
This practical guide dives deep into the world of PowerPoint VBA. Starting with basic operations, the book moves through advanced automation techniques, covering themes, layouts, and animations. Readers will learn to craft professional-grade presentations, create interactive quizzes, and integrate PowerPoint with other Office apps, while mastering VBA essentials. This book provides a comprehensive introduction to PowerPoint VBA, enabling readers to apply programming concepts to automate repetitive tasks, enhance presentations, and build creative visual effects. With real-world projects and practical code examples, the book helps readers gain confidence in customizing PowerPoint features. Whether you're creating professional reports, interactive presentations, or seamless Office integrations, this guide ensures you stay ahead with advanced VBA techniques, enabling you to transform your PowerPoint workflow.