Big data - Ebooks - BIBLIO Library | BIBLIO ebookpoint

241

EBOOK

Digital Transformation with Dataverse for Teams. Become a citizen developer and lead the digital transformation wave with Microsoft Teams and Power Platform

Srikumar Nair

Microsoft Dataverse for Teams is a built-in, low-code data platform for Teams that enables everyone to easily build and deploy apps, flows, and intelligent chatbots using Power Apps, Power Automate, and Power Virtual Agents (PVA) embedded in Microsoft Teams. Without learning any coding language, you will be able to build apps with step-by-step explanations for setting up Teams, creating tables to store data, and leveraging that data for your digital solutions. With the techniques covered in the book, you’ll be able to develop your first app with Dataverse for Teams within an hour! You’ll then learn how to automate repetitive tasks or build alerts using Power Automate and Power Virtual Agents. As you get to grips with building these digital solutions, you’ll also be able to understand when to consider upgrading from Dataverse for Teams to Dataverse, along with its advanced features. Finally, you’ll explore features for administration and governance and understand the licensing requirements of Microsoft Dataverse for Teams and Power Apps. Having acquired the skills to build and deploy an enterprise-grade digital solution, by the end of the book, you will have become a qualified citizen developer and be ready to lead a digital revolution in your organization.

242

EBOOK

Distributed Data Systems with Azure Databricks. Create, deploy, and manage enterprise data pipelines

Alan Bernardo Palacio

Microsoft Azure Databricks helps you to harness the power of distributed computing and apply it to create robust data pipelines, along with training and deploying machine learning and deep learning models. Databricks' advanced features enable developers to process, transform, and explore data. Distributed Data Systems with Azure Databricks will help you to put your knowledge of Databricks to work to create big data pipelines. The book provides a hands-on approach to implementing Azure Databricks and its associated methodologies that will make you productive in no time. Complete with detailed explanations of essential concepts, practical examples, and self-assessment questions, you’ll begin with a quick introduction to Databricks' core functionalities, before performing distributed model training and inference using TensorFlow and Spark MLlib. As you advance, you’ll explore MLflow Model Serving on Azure Databricks and implement distributed training pipelines using HorovodRunner in Databricks. Finally, you’ll discover how to transform, use, and obtain insights from massive amounts of data to train predictive models and create complete, fully working data pipelines. By the end of this MS Azure book, you’ll have gained a solid understanding of how to work with Databricks to create and manage an entire big data pipeline.

243

EBOOK

Distributed Machine Learning with Python. Accelerating model training and serving with distributed systems

Guanhua Wang

Reducing time cost in machine learning leads to a shorter waiting time for model training and a faster model updating cycle. Distributed machine learning enables machine learning practitioners to shorten model training and inference time by orders of magnitude. With the help of this practical guide, you'll be able to put your Python development knowledge to work to get up and running with the implementation of distributed machine learning, including multi-node machine learning systems, in no time. You'll begin by exploring how distributed systems work in the machine learning area and how distributed machine learning is applied to state-of-the-art deep learning models. As you advance, you'll see how to use distributed systems to enhance machine learning model training and serving speed. You'll also get to grips with applying data parallel and model parallel approaches before optimizing the in-parallel model training and serving pipeline in local clusters or cloud environments. By the end of this book, you'll have gained the knowledge and skills needed to build and deploy an efficient data processing pipeline for machine learning model training and inference in a distributed manner.

244

EBOOK

Dodaj mocy Power BI! Jak za pomocą kodu w Pythonie i R pobierać, przekształcać i wizualizować dane [Add Power to Power BI! How to Ingest, Transform, and Visualize Data with Python and R Code]

Luca Zavarella, Francesca Lazzeri

An important task for data engineers is building machine learning models. Business intelligence tools such as Power BI are used for this purpose. Power BI's capabilities are impressive, and they can be extended further. One of the more interesting ways to enrich the Power BI data model and its visualizations is to apply complex algorithms implemented in Python and R. This lets you not only create compelling data visualizations but also use them to extract business-critical information. This book shows you how. You will start by preparing the Power BI environment to run Python and R scripts. Next, you will import data from unsupported objects and transform it using regular expressions and complex algorithms. You will learn to call external APIs and use advanced techniques to perform in-depth analyses and extract valuable information with statistical and machine learning tools, as well as by applying linear optimization and other algorithms. You will also become familiar with the main statistical characteristics of datasets and with methods for creating various charts that make relationships between variables easier to understand. Highlights: complex data transformation in Power BI using Python and R scripts; data anonymization and pseudonymization; working with large datasets; outliers and missing values in multivariate data and time series; creating complex data visualizations. Unleash the mighty power of Power BI!

245

EBOOK

Don't Fear the Spreadsheet. A Beginner's Guide to Overcoming Excel's Frustrations

MrExcel's Holy Macro! Books, Tyler Nash, Bill...

This book is written in an easy-to-follow question-and-answer format, specifically designed for complete Excel beginners. Focusing on the extreme basics of using spreadsheets, it avoids overwhelming readers with advanced topics and instead builds a foundational understanding. Readers will quickly gain a passable knowledge of the program, addressing common fears and frustrations through clear explanations and practical examples. The guide answers hundreds of everyday questions, such as "Can I delete data without changing formatting?" and "How do I use text-wrapping?", as well as slightly more advanced queries like "What is a Macro, and how do I create one?" It empowers users by breaking down intimidating concepts into manageable steps, making Excel approachable and useful for even the most inexperienced users. The focus is on helping readers become comfortable with essential tasks, from merging cells and formatting text to understanding formulas and navigating the interface. Aimed at the 40 percent of Excel users who have never entered a formula, this book demystifies the program's tools and functions, transforming confusion into confidence. By the end, readers will feel equipped to use Excel effectively for personal and professional tasks, overcoming barriers to productivity.

246

EBOOK

Driving Data Quality with Data Contracts. A comprehensive guide to building reliable, trusted, and effective data platforms

Andrew Jones, Kevin Hu

Despite the passage of time and the evolution of technology and architecture, the challenges we face in building data platforms persist. Our data often remains unreliable, lacks trust, and fails to deliver the promised value. With Driving Data Quality with Data Contracts, you’ll discover the potential of data contracts to transform how you build your data platforms, finally overcoming these enduring problems. You’ll learn how establishing contracts as the interface allows you to explicitly assign responsibility and accountability for the data to those who know it best (the data generators) and give them the autonomy to generate and manage data as required. The book will show you how data contracts ensure that consumers get quality data with clearly defined expectations, enabling them to build on that data with confidence to deliver valuable analytics, performant ML models, and trusted data-driven products. By the end of this book, you’ll have gained a comprehensive understanding of how data contracts can revolutionize your organization’s data culture and provide a competitive advantage by unlocking the real value within your data.

247

EBOOK

Dziennikarstwo danych i data storytelling [Data Journalism and Data Storytelling]

Łukasz Żyła

Without data, you are just another person with an opinion... Data journalism is flourishing today. That is because much of our lives has moved to the internet, and the internet is... data. Megabytes, gigabytes, terabytes of data. The mission of the modern journalist is to present it to society reliably and, at the same time, beautifully, that is, in a way that is understandable and easy to absorb. Before the data can be beautifully arranged, however, it has to be found. Where to look? How to obtain it? How to tell a story with data? The author answers such questions in this book. You will not read about "pretty charts" here, because contrary to appearances they are not the essence of data journalism and data storytelling. Instead, you will learn where to find the sources of the information you need and how to process and analyze it. You will also find tips on how to create good visualizations using simple applications available for free online and how to craft data stories that engage audiences. Finally, you will move up a level and learn to present data using programming code.
Who? What? How? Where? When? Every journalist who wants to do the job properly must find answers to these basic questions. At the same time, amid a flood of information and of data from sources whose verification is equally time-consuming, everyone practicing this beautiful profession increasingly resembles the mythical Sisyphus. Working through gigabytes of information, processing it, and creating material that explains reality to the audience is today an undertaking that carries enormous effort and even greater risk. The cascading decline in trust in public and private institutions, which we have been witnessing for years, also affects the media, which on the one hand are exposed to a range of business, political, and social pressures and on the other struggle with constant financial problems. It is worth knowing that good journalism, quality journalism, requires its authors to move freely in the space of the internet and data and to learn the basics of operating in that space. That is why, if we want to have at least a glimmer of hope of doing the job well, it is worth picking up Łukasz Żyła's book. In the profession I was always told that one learns this trade only in practice and certainly not at university. That is still the case, although times in which media outlets are literally sprouting at every turn and drawing in ever younger aspiring journalists call for an information pill, a kind of mine detector, thanks to which a beginning journalist can take those first steps relatively safely. Dziennikarstwo danych i data storytelling is also a book for those experienced in the profession. The reason is obvious: technology has changed journalism, and in the rush of the force of nature that journalism is, it is easy to fall into a safe and therefore illusory routine, and then we are one step away from a serious mistake. Thanks to Łukasz Żyła's book, the digital reefs that litter the web will be easier to steer around. Bartosz Kurek, former Polsat journalist, currently public affairs manager at Philip Morris
"What do you actually do there?" is a frequent question when I say that I work in the data department of "Wyborcza". Some reply knowingly: "Aaah, so you analyze the newspaper's sales figures?" Others change the subject, expecting me to swamp them with boring stories about filling tables with numbers. Interestingly, journalists also ask what exactly our work looks like. Now, instead of going into detail, I will be able to start my answer with the words: "There is this book, it's worth reading…", because Łukasz explains in a very accessible way what it is all about. And I think that whatever area of journalism you work in, you will find something in it for yourself. The parts on working with officials, access to information, and storytelling should be absorbed by everyone who will work in the profession. The parts on processing data will appeal to the more ambitious, or perhaps simply the more far-sighted, because many people can write, but the ability to write combined with skill in analysis, programming, or visualization makes a journalist a person for special assignments. When I was reading this book, I often regretted that nothing like it existed when I was starting my adventure with data. Thanks to it, I can see how much I still have to learn in this field. Dominik Uhlig, head of BIQdata.pl, the data department of "Gazeta Wyborcza"

248

EBOOK

Effective Amazon Machine Learning. Expert web services for machine learning on cloud

Alexis Perrier

Predictive analytics is a complex domain requiring coding skills, an understanding of the mathematical concepts underpinning machine learning algorithms, and the ability to create compelling data visualizations. Following AWS's approach to simplifying machine learning, this book will help you bring predictive analytics projects to fruition in three easy steps: data preparation, model tuning, and model selection. This book will introduce you to the Amazon Machine Learning platform and will implement core data science concepts such as classification, regression, regularization, overfitting, model selection, and evaluation. Furthermore, you will learn to leverage the Amazon Web Services (AWS) ecosystem for extended access to data sources, implement real-time predictions, and run Amazon Machine Learning projects via the command line and the Python SDK. Towards the end of the book, you will also learn how to apply these services to other problems, such as text mining, and to more complex datasets.